WorldWideScience

Sample records for robust speaker identification

  1. Influence of binary mask estimation errors on robust speaker identification

    DEFF Research Database (Denmark)

    May, Tobias

    2017-01-01

    Missing-data strategies have been developed to improve the noise-robustness of automatic speech recognition systems in adverse acoustic conditions. This is achieved by classifying time-frequency (T-F) units into reliable and unreliable components, as indicated by a so-called binary mask. Different...... approaches have been proposed to handle unreliable feature components, each with distinct advantages. The direct masking (DM) approach attenuates unreliable T-F units in the spectral domain, which allows the extraction of conventionally used mel-frequency cepstral coefficients (MFCCs). Instead of attenuating....... Since each of these approaches utilizes the knowledge about reliable and unreliable feature components in a different way, they will respond differently to estimation errors in the binary mask. The goal of this study was to identify the most effective strategy to exploit knowledge about reliable...

  2. Multi-Frame Rate Based Multiple-Model Training for Robust Speaker Identification of Disguised Voice

    DEFF Research Database (Denmark)

    Prasad, Swati; Tan, Zheng-Hua; Prasad, Ramjee

    2013-01-01

    Speaker identification systems are prone to attack when voice disguise is adopted by the user. To address this issue,our paper studies the effect of using different frame rates on the accuracy of the speaker identification system for disguised voice.In addition, a multi-frame rate based multiple......-model training method is proposed. The experimental results show the superior performance of the proposed method compared to the commonly used single frame rate method for three types of disguised voice taken from the CHAINS corpus....

  3. Robustness-related issues in speaker recognition

    CERN Document Server

    Zheng, Thomas Fang

    2017-01-01

    This book presents an overview of speaker recognition technologies with an emphasis on dealing with robustness issues. Firstly, the book gives an overview of speaker recognition, such as the basic system framework, categories under different criteria, performance evaluation and its development history. Secondly, with regard to robustness issues, the book presents three categories, including environment-related issues, speaker-related issues and application-oriented issues. For each category, the book describes the current hot topics, existing technologies, and potential research focuses in the future. The book is a useful reference book and self-learning guide for early researchers working in the field of robust speech recognition.

  4. Robust speaker recognition in noisy environments

    CERN Document Server

    Rao, K Sreenivasa

    2014-01-01

    This book discusses speaker recognition methods to deal with realistic variable noisy environments. The text covers authentication systems for; robust noisy background environments, functions in real time and incorporated in mobile devices. The book focuses on different approaches to enhance the accuracy of speaker recognition in presence of varying background environments. The authors examine: (a) Feature compensation using multiple background models, (b) Feature mapping using data-driven stochastic models, (c) Design of super vector- based GMM-SVM framework for robust speaker recognition, (d) Total variability modeling (i-vectors) in a discriminative framework and (e) Boosting method to fuse evidences from multiple SVM models.

  5. Limited data speaker identification

    Indian Academy of Sciences (India)

    recognition can be either identification or verification depending on the task objective. .... like Bayesian formalism, voting method and Dempster-Shafer (D–S) theory ..... self-organizing map (SOM) (Kohonen 1990), learning vector quantization ...

  6. Cost-Sensitive Learning for Emotion Robust Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Dongdong Li

    2014-01-01

    Full Text Available In the field of information security, voice is one of the most important parts in biometrics. Especially, with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances in the pitch envelop level, which can enhance the robustness in emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% identification rate over the traditional speaker recognition is achieved.

  7. Cost-sensitive learning for emotion robust speaker recognition.

    Science.gov (United States)

    Li, Dongdong; Yang, Yingchun; Dai, Weihui

    2014-01-01

    In the field of information security, voice is one of the most important parts in biometrics. Especially, with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances in the pitch envelop level, which can enhance the robustness in emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% identification rate over the traditional speaker recognition is achieved.

  8. Robust Digital Speech Watermarking For Online Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Mohammad Ali Nematollahi

    2015-01-01

    Full Text Available A robust and blind digital speech watermarking technique has been proposed for online speaker recognition systems based on Discrete Wavelet Packet Transform (DWPT and multiplication to embed the watermark in the amplitudes of the wavelet’s subbands. In order to minimize the degradation effect of the watermark, these subbands are selected where less speaker-specific information was available (500 Hz–3500 Hz and 6000 Hz–7000 Hz. Experimental results on Texas Instruments Massachusetts Institute of Technology (TIMIT, Massachusetts Institute of Technology (MIT, and Mobile Biometry (MOBIO show that the degradation for speaker verification and identification is 1.16% and 2.52%, respectively. Furthermore, the proposed watermark technique can provide enough robustness against different signal processing attacks.

  9. Pitch Correlogram Clustering for Fast Speaker Identification

    Directory of Open Access Journals (Sweden)

    Nitin Jhanwar

    2004-12-01

    Full Text Available Gaussian mixture models (GMMs are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the database is partitioned into clusters based on some proximity criteria and only a single-cluster GMM is run in every test, have been suggested in literature to speed up the identification process. However, most clustering algorithms used have shown limited success, apparently because the clustering and GMM feature spaces used are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram that captures frame-to-frame pitch variations of a speaker rather than short-time spectral characteristics like cepstral coefficient, spectral slopes, and so forth. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction of error in overall speaker identification.

  10. LEARNING VECTOR QUANTIZATION FOR ADAPTED GAUSSIAN MIXTURE MODELS IN AUTOMATIC SPEAKER IDENTIFICATION

    Directory of Open Access Journals (Sweden)

    IMEN TRABELSI

    2017-05-01

    Full Text Available Speaker Identification (SI aims at automatically identifying an individual by extracting and processing information from his/her voice. Speaker voice is a robust a biometric modality that has a strong impact in several application areas. In this study, a new combination learning scheme has been proposed based on Gaussian mixture model-universal background model (GMM-UBM and Learning vector quantization (LVQ for automatic text-independent speaker identification. Features vectors, constituted by the Mel Frequency Cepstral Coefficients (MFCC extracted from the speech signal are used to train the New England subset of the TIMIT database. The best results obtained (90% for gender- independent speaker identification, 97 % for male speakers and 93% for female speakers for test data using 36 MFCC features.

  11. Developing a Speaker Identification System for the DARPA RATS Project

    DEFF Research Database (Denmark)

    Plchot, O; Matsoukas, S; Matejka, P

    2013-01-01

    This paper describes the speaker identification (SID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state of the art detection capabilities on audio from highly degraded communication channels. ...... such as CFCCs out-perform MFCC front-ends on noisy audio, and (c) fusion of multiple systems provides 24% relative improvement in EER compared to the single best system when using a novel SVM-based fusion algorithm that uses side information such as gender, language, and channel id....

  12. Robust Speaker Authentication Based on Combined Speech and Voiceprint Recognition

    Science.gov (United States)

    Malcangi, Mario

    2009-08-01

    Personal authentication is becoming increasingly important in many applications that have to protect proprietary data. Passwords and personal identification numbers (PINs) prove not to be robust enough to ensure that unauthorized people do not use them. Biometric authentication technology may offer a secure, convenient, accurate solution but sometimes fails due to its intrinsically fuzzy nature. This research aims to demonstrate that combining two basic speech processing methods, voiceprint identification and speech recognition, can provide a very high degree of robustness, especially if fuzzy decision logic is used.

  13. FPGA Implementation for GMM-Based Speaker Identification

    Directory of Open Access Journals (Sweden)

    Phaklen EhKan

    2011-01-01

    Full Text Available In today's society, highly accurate personal identification systems are required. Passwords or pin numbers can be forgotten or forged and are no longer considered to offer a high level of security. The use of biological features, biometrics, is becoming widely accepted as the next level for security systems. Biometric-based speaker identification is a method of identifying persons from their voice. Speaker-specific characteristics exist in speech signals due to different speakers having different resonances of the vocal tract. These differences can be exploited by extracting feature vectors such as Mel-Frequency Cepstral Coefficients (MFCCs from the speech signal. A well-known statistical modelling process, the Gaussian Mixture Model (GMM, then models the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes the hardware implementation for classification of a text-independent GMM-based speaker identification system. The aim was to produce a system that can perform simultaneous identification of large numbers of voice streams in real time. This has important potential applications in security and in automated call centre applications. A speedup factor of ninety was achieved compared to a software implementation on a standard PC.

  14. Noise Reduction with Microphone Arrays for Speaker Identification

    Energy Technology Data Exchange (ETDEWEB)

    Cohen, Z

    2011-12-22

    Reducing acoustic noise in audio recordings is an ongoing problem that plagues many applications. This noise is hard to reduce because of interfering sources and non-stationary behavior of the overall background noise. Many single channel noise reduction algorithms exist but are limited in that the more the noise is reduced; the more the signal of interest is distorted due to the fact that the signal and noise overlap in frequency. Specifically acoustic background noise causes problems in the area of speaker identification. Recording a speaker in the presence of acoustic noise ultimately limits the performance and confidence of speaker identification algorithms. In situations where it is impossible to control the environment where the speech sample is taken, noise reduction filtering algorithms need to be developed to clean the recorded speech of background noise. Because single channel noise reduction algorithms would distort the speech signal, the overall challenge of this project was to see if spatial information provided by microphone arrays could be exploited to aid in speaker identification. The goals are: (1) Test the feasibility of using microphone arrays to reduce background noise in speech recordings; (2) Characterize and compare different multichannel noise reduction algorithms; (3) Provide recommendations for using these multichannel algorithms; and (4) Ultimately answer the question - Can the use of microphone arrays aid in speaker identification?

  15. Using Avatars for Improving Speaker Identification in Captioning

    Science.gov (United States)

    Vy, Quoc V.; Fels, Deborah I.

    Captioning is the main method for accessing television and film content by people who are deaf or hard-of-hearing. One major difficulty consistently identified by the community is that of knowing who is speaking particularly for an off screen narrator. A captioning system was created using a participatory design method to improve speaker identification. The final prototype contained avatars and a coloured border for identifying specific speakers. Evaluation results were very positive; however participants also wanted to customize various components such as caption and avatar location.

  16. Text-Independent Speaker Identification Using the Histogram Transform Model

    DEFF Research Database (Denmark)

    Ma, Zhanyu; Yu, Hong; Tan, Zheng-Hua

    2016-01-01

    In this paper, we propose a novel probabilistic method for the task of text-independent speaker identification (SI). In order to capture the dynamic information during SI, we design a super-MFCCs features by cascading three neighboring Mel-frequency Cepstral coefficients (MFCCs) frames together....... These super-MFCC vectors are utilized for probabilistic model training such that the speaker’s characteristics can be sufficiently captured. The probability density function (PDF) of the aforementioned super-MFCCs features is estimated by the recently proposed histogram transform (HT) method. To recedes...

  17. Signal-to-Signal Ratio Independent Speaker Identification for Co-channel Speech Signals

    DEFF Research Database (Denmark)

    Saeidi, Rahim; Mowlaee, Pejman; Kinnunen, Tomi

    2010-01-01

    In this paper, we consider speaker identification for the co-channel scenario in which speech mixture from speakers is recorded by one microphone only. The goal is to identify both of the speakers from their mixed signal. High recognition accuracies have already been reported when an accurately...

  18. Speaker gender identification based on majority vote classifiers

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications namely in automatic speech recognition, interactive voice response systems and audio browsing systems. Gender identification systems performance is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching optimum tuning of one classifier parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models including decision tree, discriminant analysis, nave Bayes, support vector machine and k-nearest neighbor was experimented. The three best perming classifiers among the five ones will contribute by majority voting between their scores. Experimentations were performed on three different datasets spoken in three languages: English, German and Arabic in order to validate language independency of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.

  19. A Joint Approach for Single-Channel Speaker Identification and Speech Separation

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Christensen, Mads Græsbøll

    2012-01-01

    ) accuracy, here, we report the objective and subjective results as well. The results show that the proposed system performs as well as the best of the state-of-the-art in terms of perceived quality while its performance in terms of speaker identification and automatic speech recognition results......In this paper, we present a novel system for joint speaker identification and speech separation. For speaker identification a single-channel speaker identification algorithm is proposed which provides an estimate of signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose...... a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained speaker codebooks. In evaluating the proposed system, we start from...

  20. Visual speaker gender affects vowel identification in Danish

    DEFF Research Database (Denmark)

    Larsen, Charlotte; Tøndering, John

    2013-01-01

    The experiment examined the effect of visual speaker gender on the vowel perception of 20 native Danish-speaking subjects. Auditory stimuli consisting of a continuum between /muːlə/ ‘muzzle’ and /moːlə/ ‘pier’ generated using TANDEM-STRAIGHT matched with video clips of a female and a male speaker...

  1. Joint Single-Channel Speech Separation and Speaker Identification

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Tan, Zheng-Hua

    2010-01-01

    In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identiſcation block. The improvement is accomplished by incorporat......In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identiſcation block. The improvement is accomplished...... enhances the quality of the separated output signals. To assess the improvements, the results are reported in terms of PESQ for both target and masked signals....

  2. Integrated Robust Open-Set Speaker Identification System (IROSIS)

    Science.gov (United States)

    2012-05-01

    the exact joint estimation, but the deviation is small enough. The references [16] and [18] introduce a “ Gauss - Seidel -like iterative algorithm... iteration for a given number of times or until convergence . The Baum-Welch statistics are re-calculated in every iteration . 3.3.1.2 MAP adaptation of GMMs...have almost converged , and in the subsequent iterations it is mostly the magnitude that gets adjusted. When the initial values are two small, it would

  3. Speaker identification for the improvement of the security communication between law enforcement units

    Science.gov (United States)

    Tovarek, Jaromir; Partila, Pavol

    2017-05-01

    This article discusses the speaker identification for the improvement of the security communication between law enforcement units. The main task of this research was to develop the text-independent speaker identification system which can be used for real-time recognition. This system is designed for identification in the open set. It means that the unknown speaker can be anyone. Communication itself is secured, but we have to check the authorization of the communication parties. We have to decide if the unknown speaker is the authorized for the given action. The calls are recorded by IP telephony server and then these recordings are evaluate using classification If the system evaluates that the speaker is not authorized, it sends a warning message to the administrator. This message can detect, for example a stolen phone or other unusual situation. The administrator then performs the appropriate actions. Our novel proposal system uses multilayer neural network for classification and it consists of three layers (input layer, hidden layer, and output layer). A number of neurons in input layer corresponds with the length of speech features. Output layer then represents classified speakers. Artificial Neural Network classifies speech signal frame by frame, but the final decision is done over the complete record. This rule substantially increases accuracy of the classification. Input data for the neural network are a thirteen Mel-frequency cepstral coefficients, which describe the behavior of the vocal tract. These parameters are the most used for speaker recognition. Parameters for training, testing and validation were extracted from recordings of authorized users. Recording conditions for training data correspond with the real traffic of the system (sampling frequency, bit rate). The main benefit of the research is the system developed for text-independent speaker identification which is applied to secure communication between law enforcement units.

  4. Speaker Recognition

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther

    2005-01-01

    Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...

  5. Gender Identification of the Speaker Using VQ Method

    Directory of Open Access Journals (Sweden)

    Vasif V. Nabiyev

    2009-11-01

    Full Text Available Speaking is the easiest and natural form of communication between people. Intensive studies are made in order to provide this communication via computers between people. The systems using voice biometric technology are attracting attention especially in the angle of cost and usage. When compared with the other biometic systems the application is much more practical. For example by using a microphone placed in the environment voice record can be obtained even without notifying the user and the system can be applied. Moreover the remote access facility is one of the other advantages of voice biometry. In this study, it is aimed to automatically determine the gender of the speaker through the speech waves which include personal information. If the speaker gender can be determined while composing models according to the gender information, the success of voice recognition systems can be increased in an important degree. Generally all the speaker recognition systems are composed of two parts which are feature extraction and matching. Feature extraction is the procedure in which the least information presenting the speech and the speaker is determined through voice signal. There are different features used in voice applications such as LPC, MFCC and PLP. In this study as a feature vector MFCC is used. Feature mathcing is the procedure in which the features derived from unknown speakers and known speaker group are compared. According to the text used in comparison the system is devided to two parts that are text dependent and text independent. While the same text is used in text dependent systems, different texts are used in indepentent text systems. Nowadays, DTW and HMM are text dependent, VQ and GMM are text indepentent matching methods. In this study due to the high success ratio and simple application features VQ approach is used.In this study a system which determines the speaker gender automatically and text independent is proposed. The proposed

  6. The Role of Speaker Identification in Korean University Students' Attitudes towards Five Varieties of English

    Science.gov (United States)

    Yook, Cheongmin; Lindemann, Stephanie

    2013-01-01

    This study investigates how the attitudes of 60 Korean university students towards five varieties of English are affected by the identification of the speaker's nationality and ethnicity. The study employed both a verbal guise technique and questions eliciting overt beliefs and preferences related to learning English. While the majority of the…

  7. Performance of svm, k-nn and nbc classifiers for text-independent speaker identification with and without modelling through merging models

    Directory of Open Access Journals (Sweden)

    Yussouf Nahayo

    2016-04-01

    Full Text Available This paper proposes some methods of robust text-independent speaker identification based on Gaussian Mixture Model (GMM. We implemented a combination of GMM model with a set of classifiers such as Support Vector Machine (SVM, K-Nearest Neighbour (K-NN, and Naive Bayes Classifier (NBC. In order to improve the identification rate, we developed a combination of hybrid systems by using validation technique. The experiments were performed on the dialect DR1 of the TIMIT corpus. The results have showed a better performance for the developed technique compared to the individual techniques.

  8. Feature Fusion Based Audio-Visual Speaker Identification Using Hidden Markov Model under Different Lighting Variations

    Directory of Open Access Journals (Sweden)

    Md. Rabiul Islam

    2014-01-01

    Full Text Available The aim of the paper is to propose a feature fusion based Audio-Visual Speaker Identification (AVSI system with varied conditions of illumination environments. Among the different fusion strategies, feature level fusion has been used for the proposed AVSI system where Hidden Markov Model (HMM is used for learning and classification. Since the feature set contains richer information about the raw biometric data than any other levels, integration at feature level is expected to provide better authentication results. In this paper, both Mel Frequency Cepstral Coefficients (MFCCs and Linear Prediction Cepstral Coefficients (LPCCs are combined to get the audio feature vectors and Active Shape Model (ASM based appearance and shape facial features are concatenated to take the visual feature vectors. These combined audio and visual features are used for the feature-fusion. To reduce the dimension of the audio and visual feature vectors, Principal Component Analysis (PCA method is used. The VALID audio-visual database is used to measure the performance of the proposed system where four different illumination levels of lighting conditions are considered. Experimental results focus on the significance of the proposed audio-visual speaker identification system with various combinations of audio and visual features.

  9. Using Closed-Set Speaker Identification Score Confidence to Enhance Audio-Based Collaborative Filtering for Multiple Users

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Kristoffersen, Miklas Strøm

    2018-01-01

    In this paper, we utilize a closed-set speaker-identification approach to convey the ratings needed for collaborative filtering-based recommendation. Instead of explicitly providing a rating for a given program, users use a speech interface to dictate the desired rating after watching a movie. Due...... to the inaccuracies that may be imposed by a state-of-the-art speaker identification system, it is possible to mistake a user for another user in the household, especially when the users exhibit similar or identical age and gender demographics. This leads to the undesirable effect of injecting unwanted ratings...... into the collaborative rating matrix, and when the users have different tastes, can result in the recommendation of undesirable items. We therefore propose a simple confidence-based heuristic that utilizes the log-likelihood scores from the speaker identification front-end. The algorithm limits the degree to which...

  10. Robust structural identification via polyhedral template matching

    DEFF Research Database (Denmark)

    Larsen, Peter Mahler; Schmidt, Søren; Schiøtz, Jakob

    2016-01-01

    Successful scientific applications of large-scale molecular dynamics often rely on automated methods for identifying the local crystalline structure of condensed phases. Many existing methods for structural identification, such as common neighbour analysis, rely on interatomic distances (or thres...... is made available under a Free and Open Source Software license....

  11. Identification and robust control of an experimental servo motor.

    Science.gov (United States)

    Adam, E J; Guestrin, E D

    2002-04-01

    In this work, the design of a robust controller for an experimental laboratory-scale position control system based on a dc motor drive as well as the corresponding identification and robust stability analysis are presented. In order to carry out the robust design procedure, first, a classic closed-loop identification technique is applied and then, the parametrization by internal model control is used. The model uncertainty is evaluated under both parametric and global representation. For the latter case, an interesting discussion about the conservativeness of this description is presented by means of a comparison between the uncertainty disk and the critical perturbation radius approaches. Finally, conclusions about the performance of the experimental system with the robust controller are discussed using comparative graphics of the controlled variable and the Nyquist stability margin as a robustness measurement.

  12. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment

    Science.gov (United States)

    2015-10-01

    Dallas Erik Jonsson School of Engineering & Computer Science EC32 P.O. Box 830688 Richardson, Texas 75083-0688 8. PERFORMING ORGANIZATION REPORT...87 4.3 Whisper Based Processing for ASR ………………………………………….…. 92 5.0 Task 5: SPEAKER STATE ASSESSMENT/ ENVIROMENTAL SNIFFING (SSA/ENVS...Dec. 7-10, 2014 [3] S. Amuda, H. Boril, A. Sangwan, J.H.L. Hansen, T.S. Ibiyemi, “ Engineering analysis and recognition of Nigerian English: An

  13. A robust firearm identification algorithm of forensic ballistics specimens

    Science.gov (United States)

    Chuan, Z. L.; Jemain, A. A.; Liong, C.-Y.; Ghani, N. A. M.; Tan, L. K.

    2017-09-01

    There are several inherent difficulties in the existing firearm identification algorithms, include requiring the physical interpretation and time consuming. Therefore, the aim of this study is to propose a robust algorithm for a firearm identification based on extracting a set of informative features from the segmented region of interest (ROI) using the simulated noisy center-firing pin impression images. The proposed algorithm comprises Laplacian sharpening filter, clustering-based threshold selection, unweighted least square estimator, and segment a square ROI from the noisy images. A total of 250 simulated noisy images collected from five different pistols of the same make, model and caliber are used to evaluate the robustness of the proposed algorithm. This study found that the proposed algorithm is able to perform the identical task on the noisy images with noise levels as high as 70%, while maintaining a firearm identification accuracy rate of over 90%.

  14. Set-Membership Identification for Robust Control Design

    Science.gov (United States)

    1993-04-28

    Clauifica lion) ( U) Set-Memnbership Identification for Robust Control Design ___________________ 1. PERSONAL A UTHOR(SI Dr. Robert L. Kosul. Final Report...Shalom, E.Tse "Caution, probing, and the value of information in the control of un- certain systems", Annals of Economic and Social Measurement, 5/3, pp...knowing a bound on I the impulse response is quantitative. A similar clasoitication can be made regarding signal charateristics . Knowing that a signal is

  15. A robust star identification algorithm with star shortlisting

    Science.gov (United States)

    Mehta, Deval Samirbhai; Chen, Shoushun; Low, Kay Soon

    2018-05-01

    A star tracker provides the most accurate attitude solution in terms of arc seconds compared to the other existing attitude sensors. When no prior attitude information is available, it operates in "Lost-In-Space (LIS)" mode. Star pattern recognition, also known as star identification algorithm, forms the most crucial part of a star tracker in the LIS mode. Recognition reliability and speed are the two most important parameters of a star pattern recognition technique. In this paper, a novel star identification algorithm with star ID shortlisting is proposed. Firstly, the star IDs are shortlisted based on worst-case patch mismatch, and later stars are identified in the image by an initial match confirmed with a running sequential angular match technique. The proposed idea is tested on 16,200 simulated star images having magnitude uncertainty, noise stars, positional deviation, and varying size of the field of view. The proposed idea is also benchmarked with the state-of-the-art star pattern recognition techniques. Finally, the real-time performance of the proposed technique is tested on the 3104 real star images captured by a star tracker SST-20S currently mounted on a satellite. The proposed technique can achieve an identification accuracy of 98% and takes only 8.2 ms for identification on real images. Simulation and real-time results depict that the proposed technique is highly robust and achieves a high speed of identification suitable for actual space applications.

  16. Forensic speaker identification through comparative analysis of the formant frequencies of the vowels in the Macedonian language

    International Nuclear Information System (INIS)

    Pop-Dimitrijoska, V.; Apostolovska, G

    2012-01-01

    The main objective of this study is forensic speaker identification from an incriminated recording. The identification was made through a comparative analysis between first three formants F 1 , F 2 and F 3 of the voice samples from the questioned and suspects’ recordings. The measurements were made with the PRAAT software, for each of the five vowels in the Macedonian language: a, e, i, o and u, which were isolated from the recordings. Used methodology of recording examinations employed in this research showed positive identification of the questioned voice. The forensic audio analysis still doesn't have its place in legal and the crime fighting systems in Macedonia. This is a sufficient reason to put a bigger accent on the research of this issue in the future that will contribute in solving many criminal cases which until now, because of the type of generally accepted evidence, were not resolved. (Author)

  17. "Feminism Lite?" Feminist Identification, Speaker Appearance, and Perceptions of Feminist and Antifeminist Messengers

    Science.gov (United States)

    Bullock, Heather E.; Fernald, Julian L.

    2003-01-01

    Drawing on a communications model of persuasion (Hovland, Janis, & Kelley, 1953), this study examined the effect of target appearance on feminists' and nonfeminists' perceptions of a speaker delivering a feminist or an antifeminist message. One hundred three college women watched one of four videotaped speeches that varied by content (profeminist…

  18. [On the use of the spectral speech characteristics for the determination of biometric parameters of the vocal tract in forensic medical identification of the speaker's personality].

    Science.gov (United States)

    Kaganov, A Sh

    2014-01-01

    The objective of the present study was to elucidate the relationship between the spectral speech characteristics and the biometric parameters of the speaker's vocal tract. The secondary objective was to consider the theoretical basis behind the medico-criminalistic personality identification from the biometric parameters of the speaker's vocal tract. The article is based on the results of real forensic medical investigations and the literature data.

  19. An automatic speech recognition system with speaker-independent identification support

    Science.gov (United States)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, Raspberry PI. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.

  20. Speaker Authentication

    CERN Document Server

    Li, Qi (Peter)

    2012-01-01

    This book focuses on use of voice as a biometric measure for personal authentication. In particular, "Speaker Recognition" covers two approaches in speaker authentication: speaker verification (SV) and verbal information verification (VIV). The SV approach attempts to verify a speaker’s identity based on his/her voice characteristics while the VIV approach validates a speaker’s identity through verification of the content of his/her utterance(s). SV and VIV can be combined for new applications. This is still a new research topic with significant potential applications. The book provides with a broad overview of the recent advances in speaker authentication while giving enough attention to advanced and useful algorithms and techniques. It also provides a step by step introduction to the current state of the speaker authentication technology, from the fundamental concepts to advanced algorithms. We will also present major design methodologies and share our experience in developing real and successful speake...

  1. Evaluation of a speaker identification system with and without fusion using three databases in the presence of noise and handset effects

    Science.gov (United States)

    S. Al-Kaltakchi, Musab T.; Woo, Wai L.; Dlay, Satnam; Chambers, Jonathon A.

    2017-12-01

    In this study, a speaker identification system is considered consisting of a feature extraction stage which utilizes both power normalized cepstral coefficients (PNCCs) and Mel frequency cepstral coefficients (MFCC). Normalization is applied by employing cepstral mean and variance normalization (CMVN) and feature warping (FW), together with acoustic modeling using a Gaussian mixture model-universal background model (GMM-UBM). The main contributions are comprehensive evaluations of the effect of both additive white Gaussian noise (AWGN) and non-stationary noise (NSN) (with and without a G.712 type handset) upon identification performance. In particular, three NSN types with varying signal to noise ratios (SNRs) were tested corresponding to street traffic, a bus interior, and a crowded talking environment. The performance evaluation also considered the effect of late fusion techniques based on score fusion, namely, mean, maximum, and linear weighted sum fusion. The databases employed were TIMIT, SITW, and NIST 2008; and 120 speakers were selected from each database to yield 3600 speech utterances. As recommendations from the study, mean fusion is found to yield overall best performance in terms of speaker identification accuracy (SIA) with noisy speech, whereas linear weighted sum fusion is overall best for original database recordings.

  2. Similar speaker recognition using nonlinear analysis

    International Nuclear Information System (INIS)

    Seo, J.P.; Kim, M.S.; Baek, I.C.; Kwon, Y.H.; Lee, K.S.; Chang, S.W.; Yang, S.I.

    2004-01-01

    Speech features of the conventional speaker identification system, are usually obtained by linear methods in spectral space. However, these methods have the drawback that speakers with similar voices cannot be distinguished, because the characteristics of their voices are also similar in spectral space. To overcome the difficulty in linear methods, we propose to use the correlation exponent in the nonlinear space as a new feature vector for speaker identification among persons with similar voices. We show that our proposed method surprisingly reduces the error rate of speaker identification system to speakers with similar voices

  3. Robust model identification applied to type 1diabetes

    DEFF Research Database (Denmark)

    Finan, Daniel Aaron; Jørgensen, John Bagterp; Poulsen, Niels Kjølstad

    2010-01-01

    In many realistic applications, process noise is known to be neither white nor normally distributed. When identifying models in these cases, it may be more effective to minimize a different penalty function than the standard sum of squared errors (as in a least-squares identification method). Thi...

  4. Limited data speaker identification

    Indian Academy of Sciences (India)

    This work demonstrates the following: multiple frame size and rate (MFSR) analysis provides improvement in the analysis stage, combination of mel frequency cepstral coefficients (MFCC), its temporal derivatives ( Δ , Δ Δ ) , linear prediction residual (LPR) and linear prediction residual phase (LPRP) features provides ...

  5. Robust uncertainty evaluation for system identification on distributed wireless platforms

    Science.gov (United States)

    Crinière, Antoine; Döhler, Michael; Le Cam, Vincent; Mevel, Laurent

    2016-04-01

    Health monitoring of civil structures by system identification procedures from automatic control is now accepted as a valid approach. These methods provide frequencies and modeshapes from the structure over time. For a continuous monitoring the excitation of a structure is usually ambient, thus unknown and assumed to be noise. Hence, all estimates from the vibration measurements are realizations of random variables with inherent uncertainty due to (unknown) process and measurement noise and finite data length. The underlying algorithms are usually running under Matlab under the assumption of large memory pool and considerable computational power. Even under these premises, computational and memory usage are heavy and not realistic for being embedded in on-site sensor platforms such as the PEGASE platform. Moreover, the current push for distributed wireless systems calls for algorithmic adaptation for lowering data exchanges and maximizing local processing. Finally, the recent breakthrough in system identification allows us to process both frequency information and its related uncertainty together from one and only one data sequence, at the expense of computational and memory explosion that require even more careful attention than before. The current approach will focus on presenting a system identification procedure called multi-setup subspace identification that allows to process both frequencies and their related variances from a set of interconnected wireless systems with all computation running locally within the limited memory pool of each system before being merged on a host supervisor. Careful attention will be given to data exchanges and I/O satisfying OGC standards, as well as minimizing memory footprints and maximizing computational efficiency. Those systems are built in a way of autonomous operations on field and could be later included in a wide distributed architecture such as the Cloud2SM project. The usefulness of these strategies is illustrated on

  6. Robust

    DEFF Research Database (Denmark)

    2017-01-01

    Robust – Reflections on Resilient Architecture’, is a scientific publication following the conference of the same name in November of 2017. Researches and PhD-Fellows, associated with the Masters programme: Cultural Heritage, Transformation and Restoration (Transformation), at The Royal Danish...

  7. A constrained robust least squares approach for contaminant release history identification

    Science.gov (United States)

    Sun, Alexander Y.; Painter, Scott L.; Wittmeyer, Gordon W.

    2006-04-01

    Contaminant source identification is an important type of inverse problem in groundwater modeling and is subject to both data and model uncertainty. Model uncertainty was rarely considered in the previous studies. In this work, a robust framework for solving contaminant source recovery problems is introduced. The contaminant source identification problem is first cast into one of solving uncertain linear equations, where the response matrix is constructed using a superposition technique. The formulation presented here is general and is applicable to any porous media flow and transport solvers. The robust least squares (RLS) estimator, which originated in the field of robust identification, directly accounts for errors arising from model uncertainty and has been shown to significantly reduce the sensitivity of the optimal solution to perturbations in model and data. In this work, a new variant of RLS, the constrained robust least squares (CRLS), is formulated for solving uncertain linear equations. CRLS allows for additional constraints, such as nonnegativity, to be imposed. The performance of CRLS is demonstrated through one- and two-dimensional test problems. When the system is ill-conditioned and uncertain, it is found that CRLS gave much better performance than its classical counterpart, the nonnegative least squares. The source identification framework developed in this work thus constitutes a reliable tool for recovering source release histories in real applications.

  8. System Identification and Resonant Control of Thermoacoustic Engines for Robust Solar Power

    Directory of Open Access Journals (Sweden)

    Boe-Shong Hong

    2015-05-01

    Full Text Available It was found that thermoacoustic solar-power generators with resonant control are more powerful than passive ones. To continue the work, this paper focuses on the synthesis of robustly resonant controllers that guarantee single-mode resonance not only in steady states, but also in transient states when modelling uncertainties happen and working temperature temporally varies. Here the control synthesis is based on the loop shifting and the frequency-domain identification in advance thereof. Frequency-domain identification is performed to modify the mathematical modelling and to identify the most powerful mode, so that the DSP-based feedback controller can online pitch the engine to the most powerful resonant-frequency robustly and accurately. Moreover, this paper develops two control tools, the higher-order van-der-Pol oscillator and the principle of Dynamical Equilibrium, to assist in system identification and feedback synthesis, respectively.

  9. Speaker segmentation and clustering

    OpenAIRE

    Kotti, M; Moschou, V; Kotropoulos, C

    2008-01-01

    07.08.13 KB. Ok to add the accepted version to Spiral, Elsevier says ok whlile mandate not enforced. This survey focuses on two challenging speech processing topics, namely: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker...

  10. A Parametric Learning and Identification Based Robust Iterative Learning Control for Time Varying Delay Systems

    Directory of Open Access Journals (Sweden)

    Lun Zhai

    2014-01-01

    Full Text Available A parametric learning based robust iterative learning control (ILC scheme is applied to the time varying delay multiple-input and multiple-output (MIMO linear systems. The convergence conditions are derived by using the H∞ and linear matrix inequality (LMI approaches, and the convergence speed is analyzed as well. A practical identification strategy is applied to optimize the learning laws and to improve the robustness and performance of the control system. Numerical simulations are illustrated to validate the above concepts.

  11. Hybrid Speaker Recognition Using Universal Acoustic Model

    Science.gov (United States)

    Nishimura, Jun; Kuroda, Tadahiro

    We propose a novel speaker recognition approach using a speaker-independent universal acoustic model (UAM) for sensornet applications. In sensornet applications such as “Business Microscope”, interactions among knowledge workers in an organization can be visualized by sensing face-to-face communication using wearable sensor nodes. In conventional studies, speakers are detected by comparing energy of input speech signals among the nodes. However, there are often synchronization errors among the nodes which degrade the speaker recognition performance. By focusing on property of the speaker's acoustic channel, UAM can provide robustness against the synchronization error. The overall speaker recognition accuracy is improved by combining UAM with the energy-based approach. For 0.1s speech inputs and 4 subjects, speaker recognition accuracy of 94% is achieved at the synchronization error less than 100ms.

  12. Robust stator resistance identification of an IM drive using model reference adaptive system

    International Nuclear Information System (INIS)

    Madadi Kojabadi, Hossein; Abarzadeh, Mostafa; Aghaei Farouji, Said

    2013-01-01

    Highlights: ► We estimate the stator resistance and rotor speed of the IM. ► We proposed a new quantity to estimate the speed and stator resistance of IM. ► The proposed algorithm is robust to rotor resistance variations. ► We estimate the IM speed and stator resistance simultaneously to avoid speed error. - Abstract: Model reference adaptive system (MRAS) based robust stator resistance estimator for sensorless induction motor (IM) drive is proposed. The MRAS is formed with a semi-active power quantity. The proposed identification method can be achieved with on-line tuning of the stator resistance with robustness against rotor resistance variations. Stable and efficient estimation of IM speed at low region will be guaranteed by simultaneous identification of IM speed and stator resistance. The stability of proposed stator resistance estimator is checked through Popov’s hyperstability theorem. Simulation and experimental results are given to highlight the feasibility, the simplicity, and the robustness of the proposed method.

  13. Multimodal Speaker Diarization.

    Science.gov (United States)

    Noulas, A; Englebienne, G; Krose, B J A

    2012-01-01

    We present a novel probabilistic framework that fuses information coming from the audio and video modality to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that is an extension of a factorial Hidden Markov Model (fHMM) and models the people appearing in an audiovisual recording as multimodal entities that generate observations in the audio stream, the video stream, and the joint audiovisual space. The framework is very robust to different contexts, makes no assumptions about the location of the recording equipment, and does not require labeled training data as it acquires the model parameters using the Expectation Maximization (EM) algorithm. We apply the proposed model to two meeting videos and a news broadcast video, all of which come from publicly available data sets. The results acquired in speaker diarization are in favor of the proposed multimodal framework, which outperforms the single modality analysis results and improves over the state-of-the-art audio-based speaker diarization.

  14. Robust Switching Control and Subspace Identification for Flutter of Flexible Wing

    Directory of Open Access Journals (Sweden)

    Yizhe Wang

    2018-01-01

    Full Text Available Active flutter suppression and subspace identification for a flexible wing model using micro fiber composite actuator were experimentally studied in a low speed wind tunnel. NACA0006 thin airfoil model was used for the experimental object to verify the performance of identification algorithm and designed controller. The equation of the fluid, vibration, and piezoelectric coupled motion was theoretically analyzed and experimentally identified under the open-loop and closed-loop condition by subspace method for controller design. A robust pole placement algorithm in terms of linear matrix inequality that accommodates the model uncertainty caused by identification deviation and flow speed variation was utilized to stabilize the divergent aeroelastic system. For further enlarging the flutter envelope, additional controllers were designed subject to the models beyond the flutter speed. Wind speed was measured online as the decision parameter of switching between the controllers. To ensure the stability of arbitrary switching, Common Lyapunov function method was applied to design the robust pole placement controllers for different models to ensure that the closed-loop system shared a common Lyapunov function. Wind tunnel result showed that the designed controllers could stabilize the time varying aeroelastic system over a wide range under arbitrary switching.

  15. An integer optimization algorithm for robust identification of non-linear gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Chemmangattuvalappil Nishanth

    2012-09-01

    Full Text Available Abstract Background Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, evidence of robust algorithms directly exploiting basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formulism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subjected to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters

  16. Robust synchronization of delayed neural networks based on adaptive control and parameters identification

    International Nuclear Information System (INIS)

    Zhou Jin; Chen Tianping; Xiang Lan

    2006-01-01

    This paper investigates synchronization dynamics of delayed neural networks with all the parameters unknown. By combining the adaptive control and linear feedback with the updated law, some simple yet generic criteria for determining the robust synchronization based on the parameters identification of uncertain chaotic delayed neural networks are derived by using the invariance principle of functional differential equations. It is shown that the approaches developed here further extend the ideas and techniques presented in recent literature, and they are also simple to implement in practice. Furthermore, the theoretical results are applied to a typical chaotic delayed Hopfied neural networks, and numerical simulation also demonstrate the effectiveness and feasibility of the proposed technique

  17. On-orbit real-time robust cooperative target identification in complex background

    Directory of Open Access Journals (Sweden)

    Wen Zhuoman

    2015-10-01

    Full Text Available Cooperative target identification is the prerequisite for the relative position and orientation measurement between the space robot arm and the to-be-arrested object. We propose an on-orbit real-time robust algorithm for cooperative target identification in complex background using the features of circle and lines. It first extracts only the interested edges in the target image using an adaptive threshold and refines them to about single-pixel-width with improved non-maximum suppression. Adapting a novel tracking approach, edge segments changing smoothly in tangential directions are obtained. With a small amount of calculation, large numbers of invalid edges are removed. From the few remained edges, valid circular arcs are extracted and reassembled to obtain circles according to a reliable criterion. Finally, the target is identified if there are certain numbers of straight lines whose relative positions with the circle match the known target pattern. Experiments demonstrate that the proposed algorithm accurately identifies the cooperative target within the range of 0.3–1.5 m under complex background at the speed of 8 frames per second, regardless of lighting condition and target attitude. The proposed algorithm is very suitable for real-time visual measurement of space robot arm because of its robustness and small memory requirement.

  18. Robust identification of noncoding RNA from transcriptomes requires phylogenetically-informed sampling.

    Directory of Open Access Journals (Sweden)

    Stinus Lindgreen

    2014-10-01

    Full Text Available Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, are improving noncoding RNA identification and expression quantification. However, a major challenge is to robustly distinguish functional outputs from transcriptional noise. To establish whether annotation of existing transcriptome data has effectively captured all functional outputs, we analysed over 400 publicly available RNA-seq datasets spanning 37 different Archaea and Bacteria. Using comparative tools, we identify close to a thousand highly-expressed candidate noncoding RNAs. However, our analyses reveal that capacity to identify noncoding RNA outputs is strongly dependent on phylogenetic sampling. Surprisingly, and in stark contrast to protein-coding genes, the phylogenetic window for effective use of comparative methods is perversely narrow: aggregating public datasets only produced one phylogenetic cluster where these tools could be used to robustly separate unannotated noncoding RNAs from a null hypothesis of transcriptional noise. Our results show that for the full potential of transcriptomics data to be realized, a change in experimental design is paramount: effective transcriptomics requires phylogeny-aware sampling.

  19. A Robust Identification of the Protein Standard Bands in Two-Dimensional Electrophoresis Gel Images

    Directory of Open Access Journals (Sweden)

    Serackis Artūras

    2017-12-01

    Full Text Available The aim of the investigation presented in this paper was to develop a software-based assistant for the protein analysis workflow. The prior characterization of the unknown protein in two-dimensional electrophoresis gel images is performed according to the molecular weight and isoelectric point of each protein spot estimated from the gel image before further sequence analysis by mass spectrometry. The paper presents a method for automatic and robust identification of the protein standard band in a two-dimensional gel image. In addition, the method introduces the identification of the positions of the markers, prepared by using pre-selected proteins with known molecular mass. The robustness of the method was achieved by using special validation rules in the proposed original algorithms. In addition, a self-organizing map-based decision support algorithm is proposed, which takes Gabor coefficients as image features and searches for the differences in preselected vertical image bars. The experimental investigation proved the good performance of the new algorithms included into the proposed method. The detection of the protein standard markers works without modification of algorithm parameters on two-dimensional gel images obtained by using different staining and destaining procedures, which results in different average levels of intensity in the images.

  20. Robust Identification of Developmentally Active Endothelial Enhancers in Zebrafish Using FANS-Assisted ATAC-Seq.

    Science.gov (United States)

    Quillien, Aurelie; Abdalla, Mary; Yu, Jun; Ou, Jianhong; Zhu, Lihua Julie; Lawson, Nathan D

    2017-07-18

    Identification of tissue-specific and developmentally active enhancers provides insights into mechanisms that control gene expression during embryogenesis. However, robust detection of these regulatory elements remains challenging, especially in vertebrate genomes. Here, we apply fluorescent-activated nuclei sorting (FANS) followed by Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) to identify developmentally active endothelial enhancers in the zebrafish genome. ATAC-seq of nuclei from Tg(fli1a:egfp) y1 transgenic embryos revealed expected patterns of nucleosomal positioning at transcriptional start sites throughout the genome and association with active histone modifications. Comparison of ATAC-seq from GFP-positive and -negative nuclei identified more than 5,000 open elements specific to endothelial cells. These elements flanked genes functionally important for vascular development and that displayed endothelial-specific gene expression. Importantly, a majority of tested elements drove endothelial gene expression in zebrafish embryos. Thus, FANS-assisted ATAC-seq using transgenic zebrafish embryos provides a robust approach for genome-wide identification of active tissue-specific enhancer elements. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  1. Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statistic.

    Science.gov (United States)

    Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B; Chen, Li; Wang, Yue; Clarke, Robert

    2012-08-01

    Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive 'noise' in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. xuan@vt.edu Supplementary data are available at Bioinformatics online.

  2. GPU-accelerated automatic identification of robust beam setups for proton and carbon-ion radiotherapy

    International Nuclear Information System (INIS)

    Ammazzalorso, F; Jelen, U; Bednarz, T

    2014-01-01

    We demonstrate acceleration on graphic processing units (GPU) of automatic identification of robust particle therapy beam setups, minimizing negative dosimetric effects of Bragg peak displacement caused by treatment-time patient positioning errors. Our particle therapy research toolkit, RobuR, was extended with OpenCL support and used to implement calculation on GPU of the Port Homogeneity Index, a metric scoring irradiation port robustness through analysis of tissue density patterns prior to dose optimization and computation. Results were benchmarked against an independent native CPU implementation. Numerical results were in agreement between the GPU implementation and native CPU implementation. For 10 skull base cases, the GPU-accelerated implementation was employed to select beam setups for proton and carbon ion treatment plans, which proved to be dosimetrically robust, when recomputed in presence of various simulated positioning errors. From the point of view of performance, average running time on the GPU decreased by at least one order of magnitude compared to the CPU, rendering the GPU-accelerated analysis a feasible step in a clinical treatment planning interactive session. In conclusion, selection of robust particle therapy beam setups can be effectively accelerated on a GPU and become an unintrusive part of the particle therapy treatment planning workflow. Additionally, the speed gain opens new usage scenarios, like interactive analysis manipulation (e.g. constraining of some setup) and re-execution. Finally, through OpenCL portable parallelism, the new implementation is suitable also for CPU-only use, taking advantage of multiple cores, and can potentially exploit types of accelerators other than GPUs.

  3. GPU-accelerated automatic identification of robust beam setups for proton and carbon-ion radiotherapy

    Science.gov (United States)

    Ammazzalorso, F.; Bednarz, T.; Jelen, U.

    2014-03-01

    We demonstrate acceleration on graphic processing units (GPU) of automatic identification of robust particle therapy beam setups, minimizing negative dosimetric effects of Bragg peak displacement caused by treatment-time patient positioning errors. Our particle therapy research toolkit, RobuR, was extended with OpenCL support and used to implement calculation on GPU of the Port Homogeneity Index, a metric scoring irradiation port robustness through analysis of tissue density patterns prior to dose optimization and computation. Results were benchmarked against an independent native CPU implementation. Numerical results were in agreement between the GPU implementation and native CPU implementation. For 10 skull base cases, the GPU-accelerated implementation was employed to select beam setups for proton and carbon ion treatment plans, which proved to be dosimetrically robust, when recomputed in presence of various simulated positioning errors. From the point of view of performance, average running time on the GPU decreased by at least one order of magnitude compared to the CPU, rendering the GPU-accelerated analysis a feasible step in a clinical treatment planning interactive session. In conclusion, selection of robust particle therapy beam setups can be effectively accelerated on a GPU and become an unintrusive part of the particle therapy treatment planning workflow. Additionally, the speed gain opens new usage scenarios, like interactive analysis manipulation (e.g. constraining of some setup) and re-execution. Finally, through OpenCL portable parallelism, the new implementation is suitable also for CPU-only use, taking advantage of multiple cores, and can potentially exploit types of accelerators other than GPUs.

  4. Robust volcano plot: identification of differential metabolites in the presence of outliers.

    Science.gov (United States)

    Kumar, Nishith; Hoque, Md Aminul; Sugimoto, Masahiro

    2018-04-11

    The identification of differential metabolites in metabolomics is still a big challenge and plays a prominent role in metabolomics data analyses. Metabolomics datasets often contain outliers because of analytical, experimental, and biological ambiguity, but the currently available differential metabolite identification techniques are sensitive to outliers. We propose a kernel weight based outlier-robust volcano plot for identifying differential metabolites from noisy metabolomics datasets. Two numerical experiments are used to evaluate the performance of the proposed technique against nine existing techniques, including the t-test and the Kruskal-Wallis test. Artificially generated data with outliers reveal that the proposed method results in a lower misclassification error rate and a greater area under the receiver operating characteristic curve compared with existing methods. An experimentally measured breast cancer dataset to which outliers were artificially added reveals that our proposed method produces only two non-overlapping differential metabolites whereas the other nine methods produced between seven and 57 non-overlapping differential metabolites. Our data analyses show that the performance of the proposed differential metabolite identification technique is better than that of existing methods. Thus, the proposed method can contribute to analysis of metabolomics data with outliers. The R package and user manual of the proposed method are available at https://github.com/nishithkumarpaul/Rvolcano .

  5. Performance-Driven Robust Identification and Control of Uncertain Dynamical Systems

    Energy Technology Data Exchange (ETDEWEB)

    Basar, Tamer

    2001-10-29

    The grant DEFG02-97ER13939 from the Department of Energy has supported our research program on robust identification and control of uncertain dynamical systems, initially for the three-year period June 15, 1997-June 14, 2000, which was then extended on a no-cost basis for another year until June 14, 2001. This final report provides an overview of our research conducted during this period, along with a complete list of publications supported by the Grant. Within the scope of this project, we have studied fundamental issues that arise in modeling, identification, filtering, control, stabilization, control-based model reduction, decomposition and aggregation, and optimization of uncertain systems. The mathematical framework we have worked in has allowed the system dynamics to be only partially known (with the uncertainties being of both parametric or structural nature), and further the dynamics to be perturbed by unknown dynamic disturbances. Our research over these four years has generated a substantial body of new knowledge, and has led to new major developments in theory, applications, and computational algorithms. These have all been documented in various journal articles and book chapters, and have been presented at leading conferences, as to be described. A brief description of the results we have obtained within the scope of this project can be found in Section 3. To set the stage for the material of that section, we first provide in the next section (Section 2) a brief description of the issues that arise in the control of uncertain systems, and introduce several criteria under which optimality will lead to robustness and stability. Section 4 contains a list of references cited in these two sections. A list of our publications supported by the DOE Grant (covering the period June 15, 1997-June 14, 2001) comprises Section 5 of the report.

  6. Working with Speakers.

    Science.gov (United States)

    Pestel, Ann

    1989-01-01

    The author discusses working with speakers from business and industry to present career information at the secondary level. Advice for speakers is presented, as well as tips for program coordinators. (CH)

  7. Robust identification and localization of intramedullary nail holes for distal locking using CBCT: a simulation study.

    Science.gov (United States)

    Kamarianakis, Z; Buliev, I; Pallikarakis, N

    2011-05-01

    Closed intramedullary nailing is a common technique for treatment of femur and tibia fractures. The most challenging step in this procedure is the precise placement of the lateral screws that stabilize the fragmented bone. The present work concerns the development and the evaluation of a method to accurately identify in the 3D space the axes of the nail hole canals. A limited number of projection images are acquired around the leg with the help of a C-arm. On two of them, the locking hole entries are interactively selected and a rough localization of the hole axes is performed. Perpendicularly to one of them, cone-beam computed tomography (CBCT) reconstructions are produced. The accurate identification and localization of the hole axes are done by an identification of the centers of the nail holes on the tomograms and a further 3D linear regression through principal component analysis (PCA). Various feature-based approaches (RANSAC, least-square fitting, Hough transform) have been compared for best matching the contours and the centers of the holes on the tomograms. The robustness of the suggested method was investigated using simulations. Programming is done in Matlab and C++. Results obtained on synthetic data confirm very good localization accuracy - mean translational error of 0.14 mm (std=0.08 mm) and mean angular error of 0.84° (std=0.35°) at no radiation excess. Successful localization can be further used to guide a surgeon or a robot for correct drilling the bone along the nail openings. Copyright © 2010 IPEM. Published by Elsevier Ltd. All rights reserved.

  8. A Robust Iris Identification System Based on Wavelet Packet Decomposition and Local Comparisons of the Extracted Signatures

    Directory of Open Access Journals (Sweden)

    Rossant Florence

    2010-01-01

    Full Text Available Abstract This paper presents a complete iris identification system including three main stages: iris segmentation, signature extraction, and signature comparison. An accurate and robust pupil and iris segmentation process, taking into account eyelid occlusions, is first detailed and evaluated. Then, an original wavelet-packet-based signature extraction method and a novel identification approach, based on the fusion of local distance measures, are proposed. Performance measurements validating the proposed iris signature and demonstrating the benefit of our local-based signature comparison are provided. Moreover, an exhaustive evaluation of robustness, with regards to the acquisition conditions, attests the high performances and the reliability of our system. Tests have been conducted on two different databases, the well-known CASIA database (V3 and our ISEP database. Finally, a comparison of the performances of our system with the published ones is given and discussed.

  9. Using timing information in speaker verification

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2005-11-01

    Full Text Available This paper presents an analysis of temporal information as a feature for use in speaker verification systems. The relevance of temporal information in a speaker’s utterances is investigated, both with regard to improving the robustness of modern...

  10. Forensic speaker recognition

    NARCIS (Netherlands)

    Meuwly, Didier

    2013-01-01

    The aim of forensic speaker recognition is to establish links between individuals and criminal activities, through audio speech recordings. This field is multidisciplinary, combining predominantly phonetics, linguistics, speech signal processing, and forensic statistics. On these bases, expert-based

  11. Optimization of multilayer neural network parameters for speaker recognition

    Science.gov (United States)

    Tovarek, Jaromir; Partila, Pavol; Rozhon, Jan; Voznak, Miroslav; Skapa, Jan; Uhrin, Dominik; Chmelikova, Zdenka

    2016-05-01

    This article discusses the impact of multilayer neural network parameters for speaker identification. The main task of speaker identification is to find a specific person in the known set of speakers. It means that the voice of an unknown speaker (wanted person) belongs to a group of reference speakers from the voice database. One of the requests was to develop the text-independent system, which means to classify wanted person regardless of content and language. Multilayer neural network has been used for speaker identification in this research. Artificial neural network (ANN) needs to set parameters like activation function of neurons, steepness of activation functions, learning rate, the maximum number of iterations and a number of neurons in the hidden and output layers. ANN accuracy and validation time are directly influenced by the parameter settings. Different roles require different settings. Identification accuracy and ANN validation time were evaluated with the same input data but different parameter settings. The goal was to find parameters for the neural network with the highest precision and shortest validation time. Input data of neural networks are a Mel-frequency cepstral coefficients (MFCC). These parameters describe the properties of the vocal tract. Audio samples were recorded for all speakers in a laboratory environment. Training, testing and validation data set were split into 70, 15 and 15 %. The result of the research described in this article is different parameter setting for the multilayer neural network for four speakers.

  12. Identification and robust water level control of horizontal steam generators using quantitative feedback theory

    International Nuclear Information System (INIS)

    Safarzadeh, O.; Khaki-Sedigh, A.; Shirani, A.S.

    2011-01-01

    Highlights: → A robust water level controller for steam generators (SGs) is designed based on the Quantitative Feedback Theory. → To design the controller, fairly accurate linear models are identified for the SG. → The designed controller is verified using a developed novel global locally linear neuro-fuzzy model of the SG. → Both of the linear and nonlinear models are based on the SG mathematical thermal-hydraulic model developed using the simulation computer code. → The proposed method is easy to apply and guarantees desired closed loop performance. - Abstract: In this paper, a robust water level control system for the horizontal steam generator (SG) using the quantitative feedback theory (QFT) method is presented. To design a robust QFT controller for the nonlinear uncertain SG, control oriented linear models are identified. Then, the nonlinear system is modeled as an uncertain linear time invariant (LTI) system. The robust designed controller is applied to the nonlinear plant model. This nonlinear model is based on a locally linear neuro-fuzzy (LLNF) model. This model is trained using the locally linear model tree (LOLIMOT) algorithm. Finally, simulation results are employed to show the effectiveness of the designed QFT level controller. It is shown that it will ensure the entire designer's water level closed loop specifications.

  13. Identification and robust water level control of horizontal steam generators using quantitative feedback theory

    Energy Technology Data Exchange (ETDEWEB)

    Safarzadeh, O., E-mail: O_Safarzadeh@sbu.ac.ir [Shahid Beheshti University, P.O. Box: 19839-63113, Tehran (Iran, Islamic Republic of); Khaki-Sedigh, A. [K. N. Toosi University of Technology, Tehran (Iran, Islamic Republic of); Shirani, A.S. [Shahid Beheshti University, P.O. Box: 19839-63113, Tehran (Iran, Islamic Republic of)

    2011-09-15

    Highlights: {yields} A robust water level controller for steam generators (SGs) is designed based on the Quantitative Feedback Theory. {yields} To design the controller, fairly accurate linear models are identified for the SG. {yields} The designed controller is verified using a developed novel global locally linear neuro-fuzzy model of the SG. {yields} Both of the linear and nonlinear models are based on the SG mathematical thermal-hydraulic model developed using the simulation computer code. {yields} The proposed method is easy to apply and guarantees desired closed loop performance. - Abstract: In this paper, a robust water level control system for the horizontal steam generator (SG) using the quantitative feedback theory (QFT) method is presented. To design a robust QFT controller for the nonlinear uncertain SG, control oriented linear models are identified. Then, the nonlinear system is modeled as an uncertain linear time invariant (LTI) system. The robust designed controller is applied to the nonlinear plant model. This nonlinear model is based on a locally linear neuro-fuzzy (LLNF) model. This model is trained using the locally linear model tree (LOLIMOT) algorithm. Finally, simulation results are employed to show the effectiveness of the designed QFT level controller. It is shown that it will ensure the entire designer's water level closed loop specifications.

  14. Learning speaker-specific characteristics with a deep neural architecture.

    Science.gov (United States)

    Chen, Ke; Salman, Ahmad

    2011-11-01

    Speech signals convey various yet mixed information ranging from linguistic to speaker-specific information. However, most of acoustic representations characterize all different kinds of information as whole, which could hinder either a speech or a speaker recognition (SR) system from producing a better performance. In this paper, we propose a novel deep neural architecture (DNA) especially for learning speaker-specific characteristics from mel-frequency cepstral coefficients, an acoustic representation commonly used in both speech recognition and SR, which results in a speaker-specific overcomplete representation. In order to learn intrinsic speaker-specific characteristics, we come up with an objective function consisting of contrastive losses in terms of speaker similarity/dissimilarity and data reconstruction losses used as regularization to normalize the interference of non-speaker-related information. Moreover, we employ a hybrid learning strategy for learning parameters of the deep neural networks: i.e., local yet greedy layerwise unsupervised pretraining for initialization and global supervised learning for the ultimate discriminative goal. With four Linguistic Data Consortium (LDC) benchmarks and two non-English corpora, we demonstrate that our overcomplete representation is robust in characterizing various speakers, no matter whether their utterances have been used in training our DNA, and highly insensitive to text and languages spoken. Extensive comparative studies suggest that our approach yields favorite results in speaker verification and segmentation. Finally, we discuss several issues concerning our proposed approach.

  15. Automatic Speaker Recognition for Mobile Forensic Applications

    Directory of Open Access Journals (Sweden)

    Mohammed Algabri

    2017-01-01

    Full Text Available Presently, lawyers, law enforcement agencies, and judges in courts use speech and other biometric features to recognize suspects. In general, speaker recognition is used for discriminating people based on their voices. The process of determining, if a suspected speaker is the source of trace, is called forensic speaker recognition. In such applications, the voice samples are most probably noisy, the recording sessions might mismatch each other, the sessions might not contain sufficient recording for recognition purposes, and the suspect voices are recorded through mobile channel. The identification of a person through his voice within a forensic quality context is challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used for obtaining experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its usage in forensic experimentations. Mel-Frequency Cepstral Coefficients are used for feature extraction and the Gaussian mixture model-universal background model is used for speaker modeling. Our approach has shown low equal error rates (EER, within noisy environments and with very short test samples.

  16. Multimodal Speaker Diarization

    NARCIS (Netherlands)

    Noulas, A.; Englebienne, G.; Kröse, B.J.A.

    2012-01-01

    We present a novel probabilistic framework that fuses information coming from the audio and video modality to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that is an extension of a factorial Hidden Markov Model (fHMM) and models the people appearing in an

  17. Are subject-specific musculoskeletal models robust to the uncertainties in parameter identification?

    Directory of Open Access Journals (Sweden)

    Giordano Valente

    Full Text Available Subject-specific musculoskeletal modeling can be applied to study musculoskeletal disorders, allowing inclusion of personalized anatomy and properties. Independent of the tools used for model creation, there are unavoidable uncertainties associated with parameter identification, whose effect on model predictions is still not fully understood. The aim of the present study was to analyze the sensitivity of subject-specific model predictions (i.e., joint angles, joint moments, muscle and joint contact forces during walking to the uncertainties in the identification of body landmark positions, maximum muscle tension and musculotendon geometry. To this aim, we created an MRI-based musculoskeletal model of the lower limbs, defined as a 7-segment, 10-degree-of-freedom articulated linkage, actuated by 84 musculotendon units. We then performed a Monte-Carlo probabilistic analysis perturbing model parameters according to their uncertainty, and solving a typical inverse dynamics and static optimization problem using 500 models that included the different sets of perturbed variable values. Model creation and gait simulations were performed by using freely available software that we developed to standardize the process of model creation, integrate with OpenSim and create probabilistic simulations of movement. The uncertainties in input variables had a moderate effect on model predictions, as muscle and joint contact forces showed maximum standard deviation of 0.3 times body-weight and maximum range of 2.1 times body-weight. In addition, the output variables significantly correlated with few input variables (up to 7 out of 312 across the gait cycle, including the geometry definition of larger muscles and the maximum muscle tension in limited gait portions. Although we found subject-specific models not markedly sensitive to parameter identification, researchers should be aware of the model precision in relation to the intended application. In fact, force

  18. Control system design for electrical stimulation in upper limb rehabilitation modelling, identification and robust performance

    CERN Document Server

    Freeman, Chris

    2016-01-01

    This book presents a comprehensive framework for model-based electrical stimulation (ES) controller design, covering the whole process needed to develop a system for helping people with physical impairments perform functional upper limb tasks such as eating, grasping and manipulating objects. The book first demonstrates procedures for modelling and identifying biomechanical models of the response of ES, covering a wide variety of aspects including mechanical support structures, kinematics, electrode placement, tasks, and sensor locations. It then goes on to demonstrate how complex functional activities of daily living can be captured in the form of optimisation problems, and extends ES control design to address this case. It then lays out a design methodology, stability conditions, and robust performance criteria that enable control schemes to be developed systematically and transparently, ensuring that they can operate effectively in the presence of realistic modelling uncertainty, physiological variation an...

  19. Identification of a robust gene signature that predicts breast cancer outcome in independent data sets

    International Nuclear Information System (INIS)

    Korkola, James E; Waldman, Frederic M; Blaveri, Ekaterina; DeVries, Sandy; Moore, Dan H II; Hwang, E Shelley; Chen, Yunn-Yi; Estep, Anne LH; Chew, Karen L; Jensen, Ronald H

    2007-01-01

    Breast cancer is a heterogeneous disease, presenting with a wide range of histologic, clinical, and genetic features. Microarray technology has shown promise in predicting outcome in these patients. We profiled 162 breast tumors using expression microarrays to stratify tumors based on gene expression. A subset of 55 tumors with extensive follow-up was used to identify gene sets that predicted outcome. The predictive gene set was further tested in previously published data sets. We used different statistical methods to identify three gene sets associated with disease free survival. A fourth gene set, consisting of 21 genes in common to all three sets, also had the ability to predict patient outcome. To validate the predictive utility of this derived gene set, it was tested in two published data sets from other groups. This gene set resulted in significant separation of patients on the basis of survival in these data sets, correctly predicting outcome in 62–65% of patients. By comparing outcome prediction within subgroups based on ER status, grade, and nodal status, we found that our gene set was most effective in predicting outcome in ER positive and node negative tumors. This robust gene selection with extensive validation has identified a predictive gene set that may have clinical utility for outcome prediction in breast cancer patients

  20. S/HIC: Robust Identification of Soft and Hard Sweeps Using Machine Learning.

    Directory of Open Access Journals (Sweden)

    Daniel R Schrider

    2016-03-01

    Full Text Available Detecting the targets of adaptive natural selection from whole genome sequencing data is a central problem for population genetics. However, to date most methods have shown sub-optimal performance under realistic demographic scenarios. Moreover, over the past decade there has been a renewed interest in determining the importance of selection from standing variation in adaptation of natural populations, yet very few methods for inferring this model of adaptation at the genome scale have been introduced. Here we introduce a new method, S/HIC, which uses supervised machine learning to precisely infer the location of both hard and soft selective sweeps. We show that S/HIC has unrivaled accuracy for detecting sweeps under demographic histories that are relevant to human populations, and distinguishing sweeps from linked as well as neutrally evolving regions. Moreover, we show that S/HIC is uniquely robust among its competitors to model misspecification. Thus, even if the true demographic model of a population differs catastrophically from that specified by the user, S/HIC still retains impressive discriminatory power. Finally, we apply S/HIC to the case of resequencing data from human chromosome 18 in a European population sample, and demonstrate that we can reliably recover selective sweeps that have been identified earlier using less specific and sensitive methods.

  1. Robust cell tracking in epithelial tissues through identification of maximum common subgraphs.

    Science.gov (United States)

    Kursawe, Jochen; Bardenet, Rémi; Zartman, Jeremiah J; Baker, Ruth E; Fletcher, Alexander G

    2016-11-01

    Tracking of cells in live-imaging microscopy videos of epithelial sheets is a powerful tool for investigating fundamental processes in embryonic development. Characterizing cell growth, proliferation, intercalation and apoptosis in epithelia helps us to understand how morphogenetic processes such as tissue invagination and extension are locally regulated and controlled. Accurate cell tracking requires correctly resolving cells entering or leaving the field of view between frames, cell neighbour exchanges, cell removals and cell divisions. However, current tracking methods for epithelial sheets are not robust to large morphogenetic deformations and require significant manual interventions. Here, we present a novel algorithm for epithelial cell tracking, exploiting the graph-theoretic concept of a 'maximum common subgraph' to track cells between frames of a video. Our algorithm does not require the adjustment of tissue-specific parameters, and scales in sub-quadratic time with tissue size. It does not rely on precise positional information, permitting large cell movements between frames and enabling tracking in datasets acquired at low temporal resolution due to experimental constraints such as phototoxicity. To demonstrate the method, we perform tracking on the Drosophila embryonic epidermis and compare cell-cell rearrangements to previous studies in other tissues. Our implementation is open source and generally applicable to epithelial tissues. © 2016 The Authors.

  2. Efficient and robust identification of cortical targets in concurrent TMS-fMRI experiments

    Science.gov (United States)

    Yau, Jeffrey M.; Hua, Jun; Liao, Diana A.; Desmond, John E.

    2014-01-01

    Transcranial magnetic stimulation (TMS) can be delivered during fMRI scans to evoke BOLD responses in distributed brain networks. While concurrent TMS-fMRI offers a potentially powerful tool for non-invasively investigating functional human neuroanatomy, the technique is currently limited by the lack of methods to rapidly and precisely localize targeted brain regions – a reliable procedure is necessary for validly relating stimulation targets to BOLD activation patterns, especially for cortical targets outside of motor and visual regions. Here we describe a convenient and practical method for visualizing coil position (in the scanner) and identifying the cortical location of TMS targets without requiring any calibration or any particular coil-mounting device. We quantified the precision and reliability of the target position estimates by testing the marker processing procedure on data from 9 scan sessions: Rigorous testing of the localization procedure revealed minimal variability in coil and target position estimates. We validated the marker processing procedure in concurrent TMS-fMRI experiments characterizing motor network connectivity. Together, these results indicate that our efficient method accurately and reliably identifies TMS targets in the MR scanner, which can be useful during scan sessions for optimizing coil placement and also for post-scan outlier identification. Notably, this method can be used generally to identify the position and orientation of MR-compatible hardware placed near the head in the MR scanner. PMID:23507384

  3. Model reduction and frequency residuals for a robust estimation of nonlinearities in subspace identification

    Science.gov (United States)

    De Filippis, G.; Noël, J. P.; Kerschen, G.; Soria, L.; Stephan, C.

    2017-09-01

    The introduction of the frequency-domain nonlinear subspace identification (FNSI) method in 2013 constitutes one in a series of recent attempts toward developing a realistic, first-generation framework applicable to complex structures. If this method showed promising capabilities when applied to academic structures, it is still confronted with a number of limitations which needs to be addressed. In particular, the removal of nonphysical poles in the identified nonlinear models is a distinct challenge. In the present paper, it is proposed as a first contribution to operate directly on the identified state-space matrices to carry out spurious pole removal. A modal-space decomposition of the state and output matrices is examined to discriminate genuine from numerical poles, prior to estimating the extended input and feedthrough matrices. The final state-space model thus contains physical information only and naturally leads to nonlinear coefficients free of spurious variations. Besides spurious variations due to nonphysical poles, vibration modes lying outside the frequency band of interest may also produce drifts of the nonlinear coefficients. The second contribution of the paper is to include residual terms, accounting for the existence of these modes. The proposed improved FNSI methodology is validated numerically and experimentally using a full-scale structure, the Morane-Saulnier Paris aircraft.

  4. Multi-stage robust scheme for citrus identification from high resolution airborne images

    Science.gov (United States)

    Amorós-López, Julia; Izquierdo Verdiguier, Emma; Gómez-Chova, Luis; Muñoz-Marí, Jordi; Zoilo Rodríguez-Barreiro, Jorge; Camps-Valls, Gustavo; Calpe-Maravilla, Javier

    2008-10-01

    Identification of land cover types is one of the most critical activities in remote sensing. Nowadays, managing land resources by using remote sensing techniques is becoming a common procedure to speed up the process while reducing costs. However, data analysis procedures should satisfy the accuracy figures demanded by institutions and governments for further administrative actions. This paper presents a methodological scheme to update the citrus Geographical Information Systems (GIS) of the Comunidad Valenciana autonomous region, Spain). The proposed approach introduces a multi-stage automatic scheme to reduce visual photointerpretation and ground validation tasks. First, an object-oriented feature extraction process is carried out for each cadastral parcel from very high spatial resolution (VHR) images (0.5m) acquired in the visible and near infrared. Next, several automatic classifiers (decision trees, multilayer perceptron, and support vector machines) are trained and combined to improve the final accuracy of the results. The proposed strategy fulfills the high accuracy demanded by policy makers by means of combining automatic classification methods with visual photointerpretation available resources. A level of confidence based on the agreement between classifiers allows us an effective management by fixing the quantity of parcels to be reviewed. The proposed methodology can be applied to similar problems and applications.

  5. PhasePApy: A robust pure Python package for automatic identification of seismic phases

    Science.gov (United States)

    Chen, Chen; Holland, Austin

    2016-01-01

    We developed a Python phase identification package: the PhasePApy for earthquake data processing and near‐real‐time monitoring. The package takes advantage of the growing number of Python libraries including Obspy. All the data formats supported by Obspy can be supported within the PhasePApy. The PhasePApy has two subpackages: the PhasePicker and the Associator, aiming to identify phase arrival onsets and associate them to phase types, respectively. The PhasePicker and the Associator can work jointly or separately. Three autopickers are implemented in the PhasePicker subpackage: the frequency‐band picker, the Akaike information criteria function derivative picker, and the kurtosis picker. All three autopickers identify picks with the same processing methods but different characteristic functions. The PhasePicker triggers the pick with a dynamic threshold and can declare a pick with false‐pick filtering. Also, the PhasePicker identifies a pick polarity and uncertainty for further seismological analysis, such as focal mechanism determination. Two associators are included in the Associator subpackage: the 1D Associator and 3D Associator, which assign phase types to picks that can best fit potential earthquakes by minimizing root mean square (rms) residuals of the misfits in distance and time, respectively. The Associator processes multiple picks from all channels at a seismic station and aggregates them to increase computational efficiencies. Both associators use travel‐time look up tables to determine the best estimation of the earthquake location and evaluate the phase type for picks. The PhasePApy package has been used extensively for local and regional earthquakes and can work for active source experiments as well.

  6. The native-language benefit for talker identification is robust in 7.5-month-old infants.

    Science.gov (United States)

    Fecher, Natalie; Johnson, Elizabeth K

    2018-04-26

    Adults recognize talkers better when the talkers speak a familiar language than when they speak an unfamiliar language. This language familiarity effect (LFE) demonstrates the inseparable nature of linguistic and indexical information in adult spoken language processing. Relatively little is known about children's integration of linguistic and indexical information in speech. For example, to date, only one study has explored the LFE in infants. Here, we sought to better understand the maturation of speech processing abilities in infants by replicating this earlier study using a more stringent experimental design (eliminating a potential voice-language confound), a different test population (English- rather than Dutch-learning infants), and a new language pairing (English vs. Polish rather than Dutch vs. Italian or Japanese). Furthermore, we explored the language exposure conditions required for infants to develop an LFE for a formerly unfamiliar language. We hypothesized based on previous studies (including the perceptual narrowing literature) that infants might develop an LFE more readily than would adults. Although our findings replicate those of the earlier study-demonstrating that the LFE is robust in 7.5-month-olds-we found no evidence that infants need less language exposure than do adults to develop an LFE. We concluded that both infants and adults need extensive (potentially live) exposure to an unfamiliar language before talker identification in that language improves. Moreover, our study suggests that the LFE is likely rooted in early emerging phonology rather than shared lexical knowledge and that infants already closely resemble adults in their processing of linguistic and indexical information. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  7. Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification

    DEFF Research Database (Denmark)

    Delgado, Hector; Todisco, Massimiliano; Sahidullah, Md

    2016-01-01

    Many authentication applications involving automatic speaker verification (ASV) demand robust performance using short-duration, fixed or prompted text utterances. Text constraints not only reduce the phone-mismatch between enrollment and test utterances, which generally leads to improved performa...

  8. The speaker's formant.

    Science.gov (United States)

    Bele, Irene Velsvik

    2006-12-01

    The current study concerns speaking voice quality in two groups of professional voice users, teachers (n = 35) and actors (n = 36), representing trained and untrained voices. The voice quality of text reading at two intensity levels was acoustically analyzed. The central concept was the speaker's formant (SPF), related to the perceptual characteristics "better normal voice quality" (BNQ) and "worse normal voice quality" (WNQ). The purpose of the current study was to get closer to the origin of the phenomenon of the SPF, and to discover the differences in spectral and formant characteristics between the two professional groups and the two voice quality groups. The acoustic analyses were long-term average spectrum (LTAS) and spectrographical measurements of formant frequencies. At very high intensities, the spectral slope was rather quandrangular without a clear SPF peak. The trained voices had a higher energy level in the SPF region compared with the untrained, significantly so in loud phonation. The SPF seemed to be related to both sufficiently strong overtones and a glottal setting, allowing for a lowering of F4 and a closeness of F3 and F4. However, the existence of SPF also in LTAS of the WNQ voices implies that more research is warranted concerning the formation of SPF, and concerning the acoustic correlates of the BNQ voices.

  9. Do Speakers and Listeners Observe the Gricean Maxim of Quantity?

    Science.gov (United States)

    Engelhardt, Paul E.; Bailey, Karl G. D.; Ferreira, Fernanda

    2006-01-01

    The Gricean Maxim of Quantity is believed to govern linguistic performance. Speakers are assumed to provide as much information as required for referent identification and no more, and listeners are believed to expect unambiguous but concise descriptions. In three experiments we examined the extent to which naive participants are sensitive to the…

  10. Arctic Visiting Speakers Series (AVS)

    Science.gov (United States)

    Fox, S. E.; Griswold, J.

    2011-12-01

    The Arctic Visiting Speakers (AVS) Series funds researchers and other arctic experts to travel and share their knowledge in communities where they might not otherwise connect. Speakers cover a wide range of arctic research topics and can address a variety of audiences including K-12 students, graduate and undergraduate students, and the general public. Host applications are accepted on an on-going basis, depending on funding availability. Applications need to be submitted at least 1 month prior to the expected tour dates. Interested hosts can choose speakers from an online Speakers Bureau or invite a speaker of their choice. Preference is given to individuals and organizations to host speakers that reach a broad audience and the general public. AVS tours are encouraged to span several days, allowing ample time for interactions with faculty, students, local media, and community members. Applications for both domestic and international visits will be considered. Applications for international visits should involve participation of more than one host organization and must include either a US-based speaker or a US-based organization. This is a small but important program that educates the public about Arctic issues. There have been 27 tours since 2007 that have impacted communities across the globe including: Gatineau, Quebec Canada; St. Petersburg, Russia; Piscataway, New Jersey; Cordova, Alaska; Nuuk, Greenland; Elizabethtown, Pennsylvania; Oslo, Norway; Inari, Finland; Borgarnes, Iceland; San Francisco, California and Wolcott, Vermont to name a few. Tours have included lectures to K-12 schools, college and university students, tribal organizations, Boy Scout troops, science center and museum patrons, and the general public. There are approximately 300 attendees enjoying each AVS tour, roughly 4100 people have been reached since 2007. The expectations for each tour are extremely manageable. Hosts must submit a schedule of events and a tour summary to be posted online

  11. Improving Speaker Recognition by Biometric Voice Deconstruction

    Directory of Open Access Journals (Sweden)

    Luis Miguel eMazaira-Fernández

    2015-09-01

    Full Text Available Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g. YouTube to broadcast its message. In this new scenario, classical identification methods (such fingerprints or face recognition have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. Through the present paper, a new methodology to characterize speakers will be shown. This methodology is benefiting from the advances achieved during the last years in understanding and modelling voice production. The paper hypothesizes that a gender dependent characterization of speakers combined with the use of a new set of biometric parameters extracted from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract gender-dependent extended biometric parameters are given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.

  12. Shhh… I Need Quiet! Children's Understanding of American, British, and Japanese-accented English Speakers.

    Science.gov (United States)

    Bent, Tessa; Holt, Rachael Frush

    2018-02-01

    Children's ability to understand speakers with a wide range of dialects and accents is essential for efficient language development and communication in a global society. Here, the impact of regional dialect and foreign-accent variability on children's speech understanding was evaluated in both quiet and noisy conditions. Five- to seven-year-old children ( n = 90) and adults ( n = 96) repeated sentences produced by three speakers with different accents-American English, British English, and Japanese-accented English-in quiet or noisy conditions. Adults had no difficulty understanding any speaker in quiet conditions. Their performance declined for the nonnative speaker with a moderate amount of noise; their performance only substantially declined for the British English speaker (i.e., below 93% correct) when their understanding of the American English speaker was also impeded. In contrast, although children showed accurate word recognition for the American and British English speakers in quiet conditions, they had difficulty understanding the nonnative speaker even under ideal listening conditions. With a moderate amount of noise, their perception of British English speech declined substantially and their ability to understand the nonnative speaker was particularly poor. These results suggest that although school-aged children can understand unfamiliar native dialects under ideal listening conditions, their ability to recognize words in these dialects may be highly susceptible to the influence of environmental degradation. Fully adult-like word identification for speakers with unfamiliar accents and dialects may exhibit a protracted developmental trajectory.

  13. Identification of a robust subpathway-based signature for acute myeloid leukemia prognosis using an miRNA integrated strategy.

    Science.gov (United States)

    Chang, Huijuan; Gao, Qiuying; Ding, Wei; Qing, Xueqin

    2018-01-01

    Acute myeloid leukemia (AML) is a heterogeneous disease, and survival signatures are urgently needed to better monitor treatment. MiRNAs displayed vital regulatory roles on target genes, which was necessary involved in the complex disease. We therefore examined the expression levels of miRNAs and genes to identify robust signatures for survival benefit analyses. First, we reconstructed subpathway graphs by embedding miRNA components that were derived from low-throughput miRNA-gene interactions. Then, we randomly divided the data sets from The Cancer Genome Atlas (TCGA) into training and testing sets, and further formed 100 subsets based on the training set. Using each subset, we identified survival-related miRNAs and genes, and identified survival subpathways based on the reconstructed subpathway graphs. After statistical analyses of these survival subpathways, the most robust subpathways with the top three ranks were identified, and risk scores were calculated based on these robust subpathways for AML patient prognoses. Among these robust subpathways, three representative subpathways, path: 05200_10 from Pathways in cancer, path: 04110_20 from Cell cycle, and path: 04510_8 from Focal adhesion, were significantly associated with patient survival in the TCGA training and testing sets based on subpathway risk scores. In conclusion, we performed integrated analyses of miRNAs and genes to identify robust prognostic subpathways, and calculated subpathway risk scores to characterize AML patient survival.

  14. Segmentation of the Speaker's Face Region with Audiovisual Correlation

    Science.gov (United States)

    Liu, Yuyu; Sato, Yoichi

    The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against the changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on the probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to resolve a globally optimum extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background by expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.

  15. Speaker's voice as a memory cue.

    Science.gov (United States)

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

    Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggest that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect

  16. Utilising Tree-Based Ensemble Learning for Speaker Segmentation

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Tan, Zheng-Hua; Christensen, Mads Græsbøll

    2014-01-01

    In audio and speech processing, accurate detection of the changing points between multiple speakers in speech segments is an important stage for several applications such as speaker identification and tracking. Bayesian Information Criteria (BIC)-based approaches are the most traditionally used...... for a certain condition, the model becomes biased to the data used for training limiting the model’s generalisation ability. In this paper, we propose a BIC-based tuning-free approach for speaker segmentation through the use of ensemble-based learning. A forest of segmentation trees is constructed in which each...... tree is trained using a sampled version of the speech segment. During the tree construction process, a set of randomly selected points in the input sequence is examined as potential segmentation points. The point that yields the highest ΔBIC is chosen and the same process is repeated for the resultant...

  17. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    Science.gov (United States)

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  18. Combining metal oxide affinity chromatography (MOAC and selective mass spectrometry for robust identification of in vivo protein phosphorylation sites

    Directory of Open Access Journals (Sweden)

    Weckwerth Wolfram

    2005-11-01

    Full Text Available Abstract Background Protein phosphorylation is accepted as a major regulatory pathway in plants. More than 1000 protein kinases are predicted in the Arabidopsis proteome, however, only a few studies look systematically for in vivo protein phosphorylation sites. Owing to the low stoichiometry and low abundance of phosphorylated proteins, phosphorylation site identification using mass spectrometry imposes difficulties. Moreover, the often observed poor quality of mass spectra derived from phosphopeptides results frequently in uncertain database hits. Thus, several lines of evidence have to be combined for a precise phosphorylation site identification strategy. Results Here, a strategy is presented that combines enrichment of phosphoproteins using a technique termed metaloxide affinity chromatography (MOAC and selective ion trap mass spectrometry. The complete approach involves (i enrichment of proteins with low phosphorylation stoichiometry out of complex mixtures using MOAC, (ii gel separation and detection of phosphorylation using specific fluorescence staining (confirmation of enrichment, (iii identification of phosphoprotein candidates out of the SDS-PAGE using liquid chromatography coupled to mass spectrometry, and (iv identification of phosphorylation sites of these enriched proteins using automatic detection of H3PO4 neutral loss peaks and data-dependent MS3-fragmentation of the corresponding MS2-fragment. The utility of this approach is demonstrated by the identification of phosphorylation sites in Arabidopsis thaliana seed proteins. Regulatory importance of the identified sites is indicated by conservation of the detected sites in gene families such as ribosomal proteins and sterol dehydrogenases. To demonstrate further the wide applicability of MOAC, phosphoproteins were enriched from Chlamydomonas reinhardtii cell cultures. Conclusion A novel phosphoprotein enrichment procedure MOAC was applied to seed proteins of A. thaliana and to

  19. Identification of selective inhibitors of RET and comparison with current clinical candidates through development and validation of a robust screening cascade [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Amanda J. Watson

    2016-05-01

    Full Text Available RET (REarranged during Transfection is a receptor tyrosine kinase, which plays pivotal roles in regulating cell survival, differentiation, proliferation, migration and chemotaxis. Activation of RET is a mechanism of oncogenesis in medullary thyroid carcinomas where both germline and sporadic activating somatic mutations are prevalent.   At present, there are no known specific RET inhibitors in clinical development, although many potent inhibitors of RET have been opportunistically identified through selectivity profiling of compounds initially designed to target other tyrosine kinases. Vandetanib and cabozantinib, both multi-kinase inhibitors with RET activity, are approved for use in medullary thyroid carcinoma, but additional pharmacological activities, most notably inhibition of vascular endothelial growth factor - VEGFR2 (KDR, lead to dose-limiting toxicity. The recent identification of RET fusions present in ~1% of lung adenocarcinoma patients has renewed interest in the identification and development of more selective RET inhibitors lacking the toxicities associated with the current treatments.   In an earlier publication [Newton et al, 2016; 1] we reported the discovery of a series of 2-substituted phenol quinazolines as potent and selective RET kinase inhibitors. Here we describe the development of the robust screening cascade which allowed the identification and advancement of this chemical series.  Furthermore we have profiled a panel of RET-active clinical compounds both to validate the cascade and to confirm that none display a RET-selective target profile.

  20. The 2016 NIST Speaker Recognition Evaluation

    Science.gov (United States)

    2017-08-20

    impact on system performance. Index Terms: NIST evaluation, NIST SRE, speaker detection, speaker recognition, speaker verification 1. Introduction NIST... self -reported. Second, there were two training conditions in SRE16, namely fixed and open. In the fixed training condition, par- ticipants were only

  1. On the Use of Complementary Spectral Features for Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Sridhar Krishnan

    2007-12-01

    Full Text Available The most popular features for speaker recognition are Mel frequency cepstral coefficients (MFCCs and linear prediction cepstral coefficients (LPCCs. These features are used extensively because they characterize the vocal tract configuration which is known to be highly speaker-dependent. In this work, several features are introduced that can characterize the vocal system in order to complement the traditional features and produce better speaker recognition models. The spectral centroid (SC, spectral bandwidth (SBW, spectral band energy (SBE, spectral crest factor (SCF, spectral flatness measure (SFM, Shannon entropy (SE, and Renyi entropy (RE were utilized for this purpose. This work demonstrates that these features are robust in noisy conditions by simulating some common distortions that are found in the speakers' environment and a typical telephone channel. Babble noise, additive white Gaussian noise (AWGN, and a bandpass channel with 1 dB of ripple were used to simulate these noisy conditions. The results show significant improvements in classification performance for all noise conditions when these features were used to complement the MFCC and ΔMFCC features. In particular, the SC and SCF improved performance in almost all noise conditions within the examined SNR range (10–40 dB. For example, in cases where there was only one source of distortion, classification improvements of up to 8% and 10% were achieved under babble noise and AWGN, respectively, using the SCF feature.

  2. A more robust model of the biodiesel reaction, allowing identification of process conditions for significantly enhanced rate and water tolerance.

    Science.gov (United States)

    Eze, Valentine C; Phan, Anh N; Harvey, Adam P

    2014-03-01

    A more robust kinetic model of base-catalysed transesterification than the conventional reaction scheme has been developed. All the relevant reactions in the base-catalysed transesterification of rapeseed oil (RSO) to fatty acid methyl ester (FAME) were investigated experimentally, and validated numerically in a model implemented using MATLAB. It was found that including the saponification of RSO and FAME side reactions and hydroxide-methoxide equilibrium data explained various effects that are not captured by simpler conventional models. Both the experiment and modelling showed that the "biodiesel reaction" can reach the desired level of conversion (>95%) in less than 2min. Given the right set of conditions, the transesterification can reach over 95% conversion, before the saponification losses become significant. This means that the reaction must be performed in a reactor exhibiting good mixing and good control of residence time, and the reaction mixture must be quenched rapidly as it leaves the reactor. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Identification of robust statistical downscaling methods based on a comprehensive suite of performance metrics for South Korea

    Science.gov (United States)

    Eum, H. I.; Cannon, A. J.

    2015-12-01

    Climate models are a key provider to investigate impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch of spatial resolution between GCMs and regional applications, in particular a region characterized by complex terrain such as Korean peninsula. Therefore, a downscaling procedure is an essential to assess regional impacts of climate change. Numerous statistical downscaling methods have been used mainly due to the computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. By split sampling scheme, all methods are calibrated with observational station data for 19 years from 1973 to 1991 are and tested for the recent 19 years from 1992 to 2010. To assess skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure an ability of reproducing temporal correlation, distribution, spatial correlation, and extreme events. In addition, we employ Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR and all methods lead to large improvements in representing all performance metrics. According to seasonal performance metrics evaluated, when TOPSIS is applied, MACA is identified as the most reliable and robust method for all variables and seasons. Note that such result is derived from CFSR output which is recognized as near perfect climate data in climate studies. Therefore, the

  4. Improving Speaker Recognition by Biometric Voice Deconstruction

    Science.gov (United States)

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245

  5. Analysis of human scream and its impact on text-independent speaker verification.

    Science.gov (United States)

    Hansen, John H L; Nandwana, Mahesh Kumar; Shokouhi, Navid

    2017-04-01

    Scream is defined as sustained, high-energy vocalizations that lack phonological structure. Lack of phonological structure is how scream is identified from other forms of loud vocalization, such as "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and Lombard effect contribute to degraded performance in automatic speech systems (i.e., speech recognition, speaker identification, diarization, etc.). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study considers a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture models-universal background model framework is unreliable when evaluated with screams.

  6. Recognition of speaker-dependent continuous speech with KEAL

    Science.gov (United States)

    Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.

    1989-04-01

    A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance, is recognized by means of the followng procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms, against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.

  7. Perception of English palatal codas by Korean speakers of English

    Science.gov (United States)

    Yeon, Sang-Hee

    2003-04-01

    This study aimed at looking at perception of English palatal codas by Korean speakers of English to determine if perception problems are the source of production problems. In particular, first, this study looked at the possible first language effect on the perception of English palatal codas. Second, a possible perceptual source of vowel epenthesis after English palatal codas was investigated. In addition, individual factors, such as length of residence, TOEFL score, gender and academic status, were compared to determine if those affected the varying degree of the perception accuracy. Eleven adult Korean speakers of English as well as three native speakers of English participated in the study. Three sets of a perception test including identification of minimally different English pseudo- or real words were carried out. The results showed that, first, the Korean speakers perceived the English codas significantly worse than the Americans. Second, the study supported the idea that Koreans perceived an extra /i/ after the final affricates due to final release. Finally, none of the individual factors explained the varying degree of the perceptional accuracy. In particular, TOEFL scores and the perception test scores did not have any statistically significant association.

  8. Identification of dust outbreaks on infrared MSG-SEVIRI data by using a Robust Satellite Technique (RST)

    Science.gov (United States)

    Sannazzaro, Filomena; Filizzola, Carolina; Marchese, Francesco; Corrado, Rosita; Paciello, Rossana; Mazzeo, Giuseppe; Pergola, Nicola; Tramutoli, Valerio

    2014-01-01

    Dust storms are meteorological phenomena of great interest for scientific community because of their potential impact on climate changes, for the risk that may pose to human health and due to other issues as desertification processes and reduction of the agricultural production. Satellite remote sensing, thanks to global coverage, high frequency of observation and low cost data, may highly contribute in monitoring these phenomena, provided that proper detection methods are used. In this work, the known Robust Satellite Techniques (RST) multitemporal approach, used for studying and monitoring several natural/environmental hazards, is tested on some important dust events affecting Mediterranean region in May 2004 and Arabian Peninsula in February 2008. To perform this study, data provided by the Spinning Enhanced Visible and Infrared Imager (SEVIRI) have been processed, comparing the generated dust maps to some independent satellite-based aerosol products. Outcomes of this work show that the RST technique can be profitably used for detecting dust outbreaks from space, providing information also about areas characterized by a different probability of dust presence. They encourage further improvements of this technique in view of its possible implementation in the framework of operational warning systems.

  9. Hardware-efficient robust biometric identification from 0.58 second template and 12 features of limb (Lead I) ECG signal using logistic regression classifier.

    Science.gov (United States)

    Sahadat, Md Nazmus; Jacobs, Eddie L; Morshed, Bashir I

    2014-01-01

    The electrocardiogram (ECG), widely known as a cardiac diagnostic signal, has recently been proposed for biometric identification of individuals; however reliability and reproducibility are of research interest. In this paper, we propose a template matching technique with 12 features using logistic regression classifier that achieved high reliability and identification accuracy. Non-invasive ECG signals were captured using our custom-built ambulatory EEG/ECG embedded device (NeuroMonitor). ECG data were collected from healthy subjects (10), between 25-35 years, for 10 seconds per trial. The number of trials from each subject was 10. From each trial, only 0.58 seconds of Lead I ECG data were used as template. Hardware-efficient fiducial point detection technique was implemented for feature extraction. To obtain repeated random sub-sampling validation, data were randomly separated into training and testing sets at a ratio of 80:20. Test data were used to find the classification accuracy. ECG template data with 12 extracted features provided the best performance in terms of accuracy (up to 100%) and processing complexity (computation time of 1.2ms). This work shows that a single limb (Lead I) ECG can robustly identify an individual quickly and reliably with minimal contact and data processing using the proposed algorithm.

  10. ROMANCE: A new software tool to improve data robustness and feature identification in CE-MS metabolomics.

    Science.gov (United States)

    González-Ruiz, Víctor; Gagnebin, Yoric; Drouin, Nicolas; Codesido, Santiago; Rudaz, Serge; Schappler, Julie

    2018-05-01

    The use of capillary electrophoresis coupled to mass spectrometry (CE-MS) in metabolomics remains an oddity compared to the widely adopted use of liquid chromatography. This technique is traditionally regarded as lacking the reproducibility to adequately identify metabolites by their migration times. The major reason is the variability of the velocity of the background electrolyte, mainly coming from shifts in the magnitude of the electroosmotic flow and from the suction caused by electrospray interfaces. The use of the effective electrophoretic mobility is one solution to overcome this issue as it is a characteristic feature of each compound. To date, such an approach has not been applied to metabolomics due to the complexity and size of CE-MS data obtained in such studies. In this paper, ROMANCE (RObust Metabolomic Analysis with Normalized CE) is introduced as a new software for CE-MS-based metabolomics. It allows the automated conversion of batches of CE-MS files with minimal user intervention. ROMANCE converts the x-axis of each MS file from the time into the effective mobility scale and the resulting files are already pseudo-aligned, present normalized peak areas and improved reproducibility, and can eventually follow existing metabolomic workflows. The software was developed in Scala, so it is multi-platform and computationally-efficient. It is available for download under a CC license. In this work, the versatility of ROMANCE was demonstrated by using data obtained in the same and in different laboratories, as well as its application to the analysis of human plasma samples. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. What makes a charismatic speaker?

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Voße, Jana; Brem, Alexander

    2016-01-01

    The former Apple CEO Steve Jobs was one of the most charismatic speakers of the past decades. However, there is, as yet, no detailed quantitative profile of his way of speaking. We used state-of-the-art computer techniques to acoustically analyze his speech behavior and relate it to reference...... samples. Our paper provides the first-ever acoustic profile of Steve Jobs, based on about 4000 syllables and 12,000 individual speech sounds from his two most outstanding and well-known product presentations: the introductions of the iPhone 4 and the iPad 2. Our results show that Steve Jobs stands out...

  12. Speaker Segmentation and Clustering Using Gender Information

    Science.gov (United States)

    2006-02-01

    used in the first stages of segmentation forder information in the clustering of the opposite-gender speaker diarization of news broadcasts. files, the...AFRL-HE-WP-TP-2006-0026 AIR FORCE RESEARCH LABORATORY Speaker Segmentation and Clustering Using Gender Information Brian M. Ore General Dynamics...COVERED (From - To) February 2006 ProceedinLgs 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER Speaker Segmentation and Clustering Using Gender Information 5b

  13. Supervised and Unsupervised Speaker Adaptation in the NIST 2005 Speaker Recognition Evaluation

    National Research Council Canada - National Science Library

    Hansen, Eric G; Slyh, Raymond E; Anderson, Timothy R

    2006-01-01

    Starting in 2004, the annual NIST Speaker Recognition Evaluation (SRE) has added an optional unsupervised speaker adaptation track where test files are processed sequentially and one may update the target model...

  14. Speaker's presentations. Energy supply security

    International Nuclear Information System (INIS)

    Pierret, Ch.

    2000-01-01

    This document is a collection of most of the papers used by the speakers of the European Seminar on Energy Supply Security organised in Paris (at the French Ministry of Economy, Finance and Industry) on 24 November 2000 by the General Direction of Energy and Raw Materials, in co-operation with the European Commission and the French Planning Office. About 250 attendees were present, including a lot of high level Civil Servants from the 15 European State members, and their questions have allowed to create a rich debate. It took place five days before the publication, on 29 November 2000, by the European Commission, of the Green Paper 'Towards a European Strategy for the Security of Energy Supply'. This French initiative, which took place within the framework of the European Presidency of the European Union, during the second half-year 2000. will bring a first impetus to the brainstorming launched by the Commission. (author)

  15. Speaker-specific variability of phoneme durations

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2007-11-01

    Full Text Available The durations of phonemes varies for different speakers. To this end, the correlations between phonemes across different speakers are studied and a novel approach to predict unknown phoneme durations from the values of known phoneme durations for a...

  16. A New Database for Speaker Recognition

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2005-01-01

    In this paper we discuss properties of speech databases used for speaker recognition research and evaluation, and we characterize some popular standard databases. The paper presents a new database called ELSDSR dedicated to speaker recognition applications. The main characteristics of this database...

  17. Identifying the nonlinear mechanical behaviour of micro-speakers from their quasi-linear electrical response

    Science.gov (United States)

    Zilletti, Michele; Marker, Arthur; Elliott, Stephen John; Holland, Keith

    2017-05-01

    In this study model identification of the nonlinear dynamics of a micro-speaker is carried out by purely electrical measurements, avoiding any explicit vibration measurements. It is shown that a dynamic model of the micro-speaker, which takes into account the nonlinear damping characteristic of the device, can be identified by measuring the response between the voltage input and the current flowing into the coil. An analytical formulation of the quasi-linear model of the micro-speaker is first derived and an optimisation method is then used to identify a polynomial function which describes the mechanical damping behaviour of the micro-speaker. The analytical results of the quasi-linear model are compared with numerical results. This study potentially opens up the possibility of efficiently implementing nonlinear echo cancellers.

  18. System Identification and Robust Control

    DEFF Research Database (Denmark)

    Tøffner-Clausen, S.

    a thorough introduction to the m framework. A central results is that if performance is measured in terms of the inifity-norm and model uncertainty is bounded in the same manner, then, using m it is possible to pose one necessary and sufficient condition for block-structured norm-bounded perturbations which...... permitted by m is definitely much more flexible than those used in H inifity. Unfortunately m synthesis is a very difficult mathematical problem which is only well developed for purely complex perturbation sets. In order to develop our main result we will unfortunately need to synthesize m controllers....... The given examples thus represent a major part of the work behind this thesis. They consequently serve not just as illustrations but introduce many new ideas and should be interesting in their own right....

  19. Utterance Verification for Text-Dependent Speaker Recognition

    DEFF Research Database (Denmark)

    Kinnunen, Tomi; Sahidullah, Md; Kukanov, Ivan

    2016-01-01

    Text-dependent automatic speaker verification naturally calls for the simultaneous verification of speaker identity and spoken content. These two tasks can be achieved with automatic speaker verification (ASV) and utterance verification (UV) technologies. While both have been addressed previously...

  20. Apology Strategy in English By Native Speaker

    Directory of Open Access Journals (Sweden)

    Mezia Kemala Sari

    2016-05-01

    Full Text Available This research discussed apology strategies in English by native speaker. This descriptive study was presented within the framework of Pragmatics based on the forms of strategies due to the coding manual as found in CCSARP (Cross-Cultural Speech Acts Realization Project.The goals of this study were to describe the apology strategies in English by native speaker and identify the influencing factors of it. Data were collected through the use of the questionnaire in the form of Discourse Completion Test, which was distributed to 30 native speakers. Data were classified based on the degree of familiarity and the social distance between speaker and hearer and then the data of native will be separated and classified by the type of strategies in coding manual. The results of this study are the pattern of apology strategies of native speaker brief with the pattern that potentially occurs IFID plus Offer of repair plus Taking on responsibility. While Alerters, Explanation and Downgrading appear with less number of percentage. Then, the factors that influence the apology utterance by native speakers are the social situation, the degree of familiarity and degree of the offence which more complicated the mistake tend to produce the most complex utterances by the speaker.

  1. Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus

    Directory of Open Access Journals (Sweden)

    Patterson Eric K

    2002-01-01

    Full Text Available Strides in computer technology and the search for deeper, more powerful techniques in signal processing have brought multimodal research to the forefront in recent years. Audio-visual speech processing has become an important part of this research because it holds great potential for overcoming certain problems of traditional audio-only methods. Difficulties, due to background noise and multiple speakers in an application environment, are significantly reduced by the additional information provided by visual features. This paper presents information on a new audio-visual database, a feature study on moving speakers, and on baseline results for the whole speaker group. Although a few databases have been collected in this area, none has emerged as a standard for comparison. Also, efforts to date have often been limited, focusing on cropped video or stationary speakers. This paper seeks to introduce a challenging audio-visual database that is flexible and fairly comprehensive, yet easily available to researchers on one DVD. The Clemson University Audio-Visual Experiments (CUAVE database is a speaker-independent corpus of both connected and continuous digit strings totaling over 7000 utterances. It contains a wide variety of speakers and is designed to meet several goals discussed in this paper. One of these goals is to allow testing of adverse conditions such as moving talkers and speaker pairs. A feature study of connected digit strings is also discussed. It compares stationary and moving talkers in a speaker-independent grouping. An image-processing-based contour technique, an image transform method, and a deformable template scheme are used in this comparison to obtain visual features. This paper also presents methods and results in an attempt to make these techniques more robust to speaker movement. Finally, initial baseline speaker-independent results are included using all speakers, and conclusions as well as suggested areas of research are

  2. Identification for Control

    DEFF Research Database (Denmark)

    Tøffner-Clausen, S.

    1995-01-01

    Identification of model error bounds for robust control design has recently achieved much attention.......Identification of model error bounds for robust control design has recently achieved much attention....

  3. On the optimization of a mixed speaker array in an enclosed space using the virtual-speaker weighting method

    Science.gov (United States)

    Peng, Bo; Zheng, Sifa; Liao, Xiangning; Lian, Xiaomin

    2018-03-01

    In order to achieve sound field reproduction in a wide frequency band, multiple-type speakers are used. The reproduction accuracy is not only affected by the signals sent to the speakers, but also depends on the position and the number of each type of speaker. The method of optimizing a mixed speaker array is investigated in this paper. A virtual-speaker weighting method is proposed to optimize both the position and the number of each type of speaker. In this method, a virtual-speaker model is proposed to quantify the increment of controllability of the speaker array when the speaker number increases. While optimizing a mixed speaker array, the gain of the virtual-speaker transfer function is used to determine the priority orders of the candidate speaker positions, which optimizes the position of each type of speaker. Then the relative gain of the virtual-speaker transfer function is used to determine whether the speakers are redundant, which optimizes the number of each type of speaker. Finally the virtual-speaker weighting method is verified by reproduction experiments of the interior sound field in a passenger car. The results validate that the optimum mixed speaker array can be obtained using the proposed method.

  4. Data requirements for speaker independent acoustic models

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2008-11-01

    Full Text Available When developing speech recognition systems in resource-constrained environments, careful design of the training corpus can play an important role in compensating for data scarcity. One of the factors to consider relates to the speaker composition...

  5. English Language Schooling, Linguistic Realities, and the Native Speaker of English in Hong Kong

    Science.gov (United States)

    Hansen Edwards, Jette G.

    2018-01-01

    The study employs a case study approach to examine the impact of educational backgrounds on nine Hong Kong tertiary students' English and Cantonese language practices and identifications as native speakers of English and Cantonese. The study employed both survey and interview data to probe the participants' English and Cantonese language use at…

  6. Role of Speaker Cues in Attention Inference

    OpenAIRE

    Jin Joo Lee; Cynthia Breazeal; David DeSteno

    2017-01-01

    Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in at...

  7. Speakers' choice of frame in binary choice

    Directory of Open Access Journals (Sweden)

    Marc van Buiten

    2009-02-01

    Full Text Available A distinction is proposed between extit{recommending for} preferred choice options and extit{recommending against} non-preferred choice options. In binary choice, both recommendation modes are logically, though not psychologically, equivalent. We report empirical evidence showing that speakers recommending for preferred options predominantly select positive frames, which are less common when speakers recommend against non-preferred options. In addition, option attractiveness is shown to affect speakers' choice of frame, and adoption of recommendation mode. The results are interpreted in terms of three compatibility effects, (i extit{recommendation mode---valence framing compatibility}: speakers' preference for positive framing is enhanced under extit{recommending for} and diminished under extit{recommending against} instructions, (ii extit{option attractiveness---valence framing compatibility}: speakers' preference for positive framing is more pronounced for attractive than for unattractive options, and (iii extit{recommendation mode---option attractiveness compatibility}: speakers are more likely to adopt a extit{recommending for} approach for attractive than for unattractive binary choice pairs.

  8. Audiovisual perceptual learning with multiple speakers.

    Science.gov (United States)

    Mitchel, Aaron D; Gerfen, Chip; Weiss, Daniel J

    2016-05-01

    One challenge for speech perception is between-speaker variability in the acoustic parameters of speech. For example, the same phoneme (e.g. the vowel in "cat") may have substantially different acoustic properties when produced by two different speakers and yet the listener must be able to interpret these disparate stimuli as equivalent. Perceptual tuning, the use of contextual information to adjust phonemic representations, may be one mechanism that helps listeners overcome obstacles they face due to this variability during speech perception. Here we test whether visual contextual cues to speaker identity may facilitate the formation and maintenance of distributional representations for individual speakers, allowing listeners to adjust phoneme boundaries in a speaker-specific manner. We familiarized participants to an audiovisual continuum between /aba/ and /ada/. During familiarization, the "b-face" mouthed /aba/ when an ambiguous token was played, while the "D-face" mouthed /ada/. At test, the same ambiguous token was more likely to be identified as /aba/ when paired with a stilled image of the "b-face" than with an image of the "D-face." This was not the case in the control condition when the two faces were paired equally with the ambiguous token. Together, these results suggest that listeners may form speaker-specific phonemic representations using facial identity cues.

  9. Role of Speaker Cues in Attention Inference

    Directory of Open Access Journals (Sweden)

    Jin Joo Lee

    2017-10-01

    Full Text Available Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in attention inference, we conduct investigations into real-world interactions of children (5–6 years old storytelling with their peers. Through in-depth analysis of human–human interaction data, we first identify nonverbal speaker cues (i.e., backchannel-inviting cues and listener responses (i.e., backchannel feedback. We then demonstrate how speaker cues can modify the interpretation of attention-related backchannels as well as serve as a means to regulate the responsiveness of listeners. We discuss the design implications of our findings toward our primary goal of developing attention recognition models for storytelling robots, and we argue that social robots can proactively use speaker cues to form more accurate inferences about the attentive state of their human partners.

  10. Robust Scientists

    DEFF Research Database (Denmark)

    Gorm Hansen, Birgitte

    their core i nterests, 2) developing a selfsupply of industry interests by becoming entrepreneurs and thus creating their own compliant industry partner and 3) balancing resources within a larger collective of researchers, thus countering changes in the influx of funding caused by shifts in political...... knowledge", Danish research policy seems to have helped develop politically and economically "robust scientists". Scientific robustness is acquired by way of three strategies: 1) tasting and discriminating between resources so as to avoid funding that erodes academic profiles and push scientists away from...

  11. Incorporating Pass-Phrase Dependent Background Models for Text-Dependent Speaker verification

    DEFF Research Database (Denmark)

    Sarkar, Achintya Kumar; Tan, Zheng-Hua

    2018-01-01

    -dependent. We show that the proposed method significantly reduces the error rates of text-dependent speaker verification for the non-target types: target-wrong and impostor-wrong while it maintains comparable TD-SV performance when impostors speak a correct utterance with respect to the conventional system......In this paper, we propose pass-phrase dependent background models (PBMs) for text-dependent (TD) speaker verification (SV) to integrate the pass-phrase identification process into the conventional TD-SV system, where a PBM is derived from a text-independent background model through adaptation using...... the utterances of a particular pass-phrase. During training, pass-phrase specific target speaker models are derived from the particular PBM using the training data for the respective target model. While testing, the best PBM is first selected for the test utterance in the maximum likelihood (ML) sense...

  12. Robust Recognition of Loud and Lombard speech in the Fighter Cockpit Environment

    Science.gov (United States)

    1988-08-01

    the latter as inter-speaker variability. According to Zue [Z85j, inter-speaker variabilities can be attributed to sociolinguistic background, dialect...34 Journal of the Acoustical Society of America , Vol 50, 1971. [At74I B. S. Atal, "Linear prediction for speaker identification," Journal of the Acoustical...Society of America , Vol 55, 1974. [B771 B. Beek, E. P. Neuberg, and D. C. Hodge, "An Assessment of the Technology of Automatic Speech Recognition for

  13. Gricean Semantics and Vague Speaker-Meaning

    OpenAIRE

    Schiffer, Stephen

    2017-01-01

    Presentations of Gricean semantics, including Stephen Neale’s in “Silent Reference,” totally ignore vagueness, even though virtually every utterance is vague. I ask how Gricean semantics might be adjusted to accommodate vague speaker-meaning. My answer is that it can’t accommodate it: the Gricean program collapses in the face of vague speaker-meaning. The Gricean might, however, fi nd some solace in knowing that every other extant meta-semantic and semantic program is in the same boat.

  14. A Novel Approach in Text-Independent Speaker Recognition in Noisy Environment

    Directory of Open Access Journals (Sweden)

    Nona Heydari Esfahani

    2014-10-01

    Full Text Available In this paper, robust text-independent speaker recognition is taken into consideration. The proposed method performs on manual silence-removed utterances that are segmented into smaller speech units containing few phones and at least one vowel. The segments are basic units for long-term feature extraction. Sub-band entropy is directly extracted in each segment. A robust vowel detection method is then applied on each segment to separate a high energy vowel that is used as unit for pitch frequency and formant extraction. By applying a clustering technique, extracted short-term features namely MFCC coefficients are combined with long term features. Experiments using MLP classifier show that the average speaker accuracy recognition rate is 97.33% for clean speech and 61.33% in noisy environment for -2db SNR, that shows improvement compared to other conventional methods.

  15. Speaker Prediction based on Head Orientations

    NARCIS (Netherlands)

    Rienks, R.J.; Poppe, Ronald Walter; van Otterlo, M.; Poel, Mannes; Poel, M.; Nijholt, A.; Nijholt, Antinus

    2005-01-01

    To gain insight into gaze behavior in meetings, this paper compares the results from a Naive Bayes classifier, Neural Networks and humans on speaker prediction in four-person meetings given solely the azimuth head angles. The Naive Bayes classifier scored 69.4% correctly, Neural Networks 62.3% and

  16. Study of audio speakers containing ferrofluid

    Energy Technology Data Exchange (ETDEWEB)

    Rosensweig, R E [34 Gloucester Road, Summit, NJ 07901 (United States); Hirota, Y; Tsuda, S [Ferrotec, 1-4-14 Kyobashi, chuo-Ku, Tokyo 104-0031 (Japan); Raj, K [Ferrotec, 33 Constitution Drive, Bedford, NH 03110 (United States)

    2008-05-21

    This work validates a method for increasing the radial restoring force on the voice coil in audio speakers containing ferrofluid. In addition, a study is made of factors influencing splash loss of the ferrofluid due to shock. Ferrohydrodynamic analysis is employed throughout to model behavior, and predictions are compared to experimental data.

  17. B Anand | Speakers | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    However, the mechanism by which this protospacer fragment gets integrated in a directional fashion into the leader proximal end is elusive. The speakers group identified that the leader region abutting the first CRISPR repeat localizes Integration Host Factor (IHF) and Cas1-2 complex in Escherichia coli. IHF binding to the ...

  18. Speaker recognition through NLP and CWT modeling.

    Energy Technology Data Exchange (ETDEWEB)

    Brown-VanHoozer, A.; Kercel, S. W.; Tucker, R. W.

    1999-06-23

    The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time duration (<30 seconds), and with better than 90% accuracy. Much previous research in speaker recognition has led to algorithms that produced encouraging preliminary results, but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the ''huge population'' problem by seeking two completely different kinds of characterizing features. These features are extracted using the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal and non-verbal information, and assimilates the information into useful patterns. These patterns are based on specific cues demonstrated by each individual, and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (or verbal predicates cues, e.g., see, sound, feel, etc.) while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space, and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on using vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is there are a limited number of vowels phonemes, and at least one of them usually appears in even the shortest of speech segments. Using the fast, CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms as well as the formant frequency are evident in the CWT output. More significantly, the CWT reveals significant

  19. SU-F-R-31: Identification of Robust Normal Lung CT Texture Features for the Prediction of Radiation-Induced Lung Disease

    Energy Technology Data Exchange (ETDEWEB)

    Choi, W; Riyahi, S; Lu, W [University of Maryland School of Medicine, Baltimore, MD (United States)

    2016-06-15

    Purpose: Normal lung CT texture features have been used for the prediction of radiation-induced lung disease (radiation pneumonitis and radiation fibrosis). For these features to be clinically useful, they need to be relatively invariant (robust) to tumor size and not correlated with normal lung volume. Methods: The free-breathing CTs of 14 lung SBRT patients were studied. Different sizes of GTVs were simulated with spheres placed at the upper lobe and lower lobe respectively in the normal lung (contralateral to tumor). 27 texture features (9 from intensity histogram, 8 from grey-level co-occurrence matrix [GLCM] and 10 from grey-level run-length matrix [GLRM]) were extracted from [normal lung-GTV]. To measure the variability of a feature F, the relative difference D=|Fref -Fsim|/Fref*100% was calculated, where Fref was for the entire normal lung and Fsim was for [normal lung-GTV]. A feature was considered as robust if the largest non-outlier (Q3+1.5*IQR) D was less than 5%, and considered as not correlated with normal lung volume when their Pearson correlation was lower than 0.50. Results: Only 11 features were robust. All first-order intensity-histogram features (mean, max, etc.) were robust, while most higher-order features (skewness, kurtosis, etc.) were unrobust. Only two of the GLCM and four of the GLRM features were robust. Larger GTV resulted greater feature variation, this was particularly true for unrobust features. All robust features were not correlated with normal lung volume while three unrobust features showed high correlation. Excessive variations were observed in two low grey-level run features and were later identified to be from one patient with local lung diseases (atelectasis) in the normal lung. There was no dependence on GTV location. Conclusion: We identified 11 robust normal lung CT texture features that can be further examined for the prediction of radiation-induced lung disease. Interestingly, low grey-level run features identified normal

  20. When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.

    Science.gov (United States)

    Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola

    2017-11-01

    Speech sound acoustic properties vary largely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences, in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Unsupervised Speaker Change Detection for Broadcast News Segmentation

    DEFF Research Database (Denmark)

    Jørgensen, Kasper Winther; Mølgaard, Lasse Lohilahti; Hansen, Lars Kai

    2006-01-01

    This paper presents a speaker change detection system for news broadcast segmentation based on a vector quantization (VQ) approach. The system does not make any assumption about the number of speakers or speaker identity. The system uses mel frequency cepstral coefficients and change detection...

  2. Accent Attribution in Speakers with Foreign Accent Syndrome

    Science.gov (United States)

    Verhoeven, Jo; De Pauw, Guy; Pettinato, Michele; Hirson, Allen; Van Borsel, John; Marien, Peter

    2013-01-01

    Purpose: The main aim of this experiment was to investigate the perception of Foreign Accent Syndrome in comparison to speakers with an authentic foreign accent. Method: Three groups of listeners attributed accents to conversational speech samples of 5 FAS speakers which were embedded amongst those of 5 speakers with a real foreign accent and 5…

  3. Young Children's Sensitivity to Speaker Gender When Learning from Others

    Science.gov (United States)

    Ma, Lili; Woolley, Jacqueline D.

    2013-01-01

    This research explores whether young children are sensitive to speaker gender when learning novel information from others. Four- and 6-year-olds ("N" = 144) chose between conflicting statements from a male versus a female speaker (Studies 1 and 3) or decided which speaker (male or female) they would ask (Study 2) when learning about the functions…

  4. Speaker Reliability Guides Children's Inductive Inferences about Novel Properties

    Science.gov (United States)

    Kim, Sunae; Kalish, Charles W.; Harris, Paul L.

    2012-01-01

    Prior work shows that children can make inductive inferences about objects based on their labels rather than their appearance (Gelman, 2003). A separate line of research shows that children's trust in a speaker's label is selective. Children accept labels from a reliable speaker over an unreliable speaker (e.g., Koenig & Harris, 2005). In the…

  5. Vocal caricatures reveal signatures of speaker identity

    Science.gov (United States)

    López, Sabrina; Riera, Pablo; Assaneo, María Florencia; Eguía, Manuel; Sigman, Mariano; Trevisan, Marcos A.

    2013-12-01

    What are the features that impersonators select to elicit a speaker's identity? We built a voice database of public figures (targets) and imitations produced by professional impersonators. They produced one imitation based on their memory of the target (caricature) and another one after listening to the target audio (replica). A set of naive participants then judged identity and similarity of pairs of voices. Identity was better evoked by the caricatures and replicas were perceived to be closer to the targets in terms of voice similarity. We used this data to map relevant acoustic dimensions for each task. Our results indicate that speaker identity is mainly associated with vocal tract features, while perception of voice similarity is related to vocal folds parameters. We therefore show the way in which acoustic caricatures emphasize identity features at the cost of loosing similarity, which allows drawing an analogy with caricatures in the visual space.

  6. Methods of Speakers\\' Effects on the Audience

    Directory of Open Access Journals (Sweden)

    فریبا حسینی

    2010-09-01

    Full Text Available Methods of Speakers' Effects on the Audience    Nasrollah Shameli *   Fariba Hosayni **     Abstract   This article is focused on four issues. The first issue is related to the speaker's external appearance including the beauty of face, the power of his voice, moves and signals by hand, the stick and eyebrow as well as the height. Such characteristics could have an important effect on the audience. The second issue is related to internal features of the speaker. These include the ethics of the preacher , his/her piety and intention on the speakers based on their personalities, habits and emotions, knowledge and culture, and speed of learning. The third issue is concerned with the appearance of the lecture. Words should be clear enough as well as being mixed with Quranic verses, poetry and proverbs. The final issue is related to the content. It is argued that the subject of the talk should be in accordance with the level of understanding of listeners as well as being new and interesting for them.   3 - A phenomenon rhetoric: It was noted in this section How to give words and phrases so that these words and phrases are clear, correct, mixed in parables, governance and Quranic verses, and appropriate their meaning.   4 - the content of Oratory : It was noted in this section to the topic of Oratory and say that the Oratory should be the theme commensurate with the minds of audiences and also should mean that agree with the case may be, then I say: that the rhetoric if the theme was innovative and new is affecting more and more on the audience.     Key words : Oratory , Preacher , Audience, Influence of speech     * Associate Professor, Department of Arabic Language and Literature, University of Isfahan E-mail: Dr-Nasrolla Shameli@Yahoo.com   * * M.A. in Arabic Language and Literature from Isfahan University E-mail: faribahosayni@yahoo.com

  7. Physiological responses at short distances from a parametric speaker

    Directory of Open Access Journals (Sweden)

    Lee Soomin

    2012-06-01

    Full Text Available Abstract In recent years, parametric speakers have been used in various circumstances. In our previous studies, we verified that the physiological burden of the sound of parametric speaker set at 2.6 m from the subjects was lower than that of the general speaker. However, nothing has yet been demonstrated about the effects of the sound of a parametric speaker at the shorter distance between parametric speakers the human body. Therefore, we studied this effect on physiological functions and task performance. Nine male subjects participated in this study. They completed three consecutive sessions: a 20-minute quiet period as a baseline, a 30-minute mental task period with general speakers or parametric speakers, and a 20-minute recovery period. We measured electrocardiogram (ECG photoplethysmogram (PTG, electroencephalogram (EEG, systolic and diastolic blood pressure. Four experiments, one with a speaker condition (general speaker and parametric speaker, the other with a distance condition (0.3 m and 1.0 m, were conducted respectively at the same time of day on separate days. To examine the effects of the speaker and distance, three-way repeated measures ANOVA (speaker factor x distance factor x time factor were conducted. In conclusion, we found that the physiological responses were not significantly different between the speaker condition and the distance condition. Meanwhile, it was shown that the physiological burdens increased with progress in time independently of speaker condition and distance condition. In summary, the effects of the parametric speaker at the 2.6 m distance were not obtained at the distance of 1 m or less.

  8. English Speakers Attend More Strongly than Spanish Speakers to Manner of Motion when Classifying Novel Objects and Events

    Science.gov (United States)

    Kersten, Alan W.; Meissner, Christian A.; Lechuga, Julia; Schwartz, Bennett L.; Albrechtsen, Justin S.; Iglesias, Adam

    2010-01-01

    Three experiments provide evidence that the conceptualization of moving objects and events is influenced by one's native language, consistent with linguistic relativity theory. Monolingual English speakers and bilingual Spanish/English speakers tested in an English-speaking context performed better than monolingual Spanish speakers and bilingual…

  9. Grammatical Planning Units during Real-Time Sentence Production in Speakers with Agrammatic Aphasia and Healthy Speakers

    Science.gov (United States)

    Lee, Jiyeon; Yoshida, Masaya; Thompson, Cynthia K.

    2015-01-01

    Purpose: Grammatical encoding (GE) is impaired in agrammatic aphasia; however, the nature of such deficits remains unclear. We examined grammatical planning units during real-time sentence production in speakers with agrammatic aphasia and control speakers, testing two competing models of GE. We queried whether speakers with agrammatic aphasia…

  10. Student perceptions of native and non-native speaker language instructors: A comparison of ESL and Spanish

    Directory of Open Access Journals (Sweden)

    Laura Callahan

    2006-12-01

    Full Text Available The question of the native vs. non-native speaker status of second and foreign language instructors has been investigated chiefly from the perspective of the teacher. Anecdotal evidence suggests that students have strong opinions on the relative qualities of instruction by native and non-native speakers. Most research focuses on students of English as a foreign or second language. This paper reports on data gathered through a questionnaire administered to 55 university students: 31 students of Spanish as FL and 24 students of English as SL. Qualitative results show what strengths students believe each type of instructor has, and quantitative results confirm that any gap students may perceive between the abilities of native and non-native instructors is not so wide as one might expect based on popular notions of the issue. ESL students showed a stronger preference for native-speaker instructors overall, and were at variance with the SFL students' ratings of native-speaker instructors' performance on a number of aspects. There was a significant correlation in both groups between having a family member who is a native speaker of the target language and student preference for and self-identification with a native speaker as instructor. (English text

  11. A system of automatic speaker recognition on a minicomputer

    International Nuclear Information System (INIS)

    El Chafei, Cherif

    1978-01-01

    This study describes a system of automatic speaker recognition using the pitch of the voice. The pre-treatment consists in the extraction of the speakers' discriminating characteristics taken from the pitch. The programme of recognition gives, firstly, a preselection and then calculates the distance between the speaker's characteristics to be recognized and those of the speakers already recorded. An experience of recognition has been realized. It has been undertaken with 15 speakers and included 566 tests spread over an intermittent period of four months. The discriminating characteristics used offer several interesting qualities. The algorithms concerning the measure of the characteristics on one hand, the speakers' classification on the other hand, are simple. The results obtained in real time with a minicomputer are satisfactory. Furthermore they probably could be improved if we considered other speaker's discriminating characteristics but this was unfortunately not in our possibilities. (author) [fr

  12. The Crane Robust Control

    Directory of Open Access Journals (Sweden)

    Marek Hicar

    2004-01-01

    Full Text Available The article is about a control design for complete structure of the crane: crab, bridge and crane uplift.The most important unknown parameters for simulations are burden weight and length of hanging rope. We will use robustcontrol for crab and bridge control to ensure adaptivity for burden weight and rope length. Robust control will be designed for current control of the crab and bridge, necessary is to know the range of unknown parameters. Whole robust will be splitto subintervals and after correct identification of unknown parameters the most suitable robust controllers will be chosen.The most important condition at the crab and bridge motion is avoiding from burden swinging in the final position. Crab and bridge drive is designed by asynchronous motor fed from frequency converter. We will use crane uplift with burden weightobserver in combination for uplift, crab and bridge drive with cooperation of their parameters: burden weight, rope length and crab and bridge position. Controllers are designed by state control method. We will use preferably a disturbance observerwhich will identify burden weight as a disturbance. The system will be working in both modes at empty hook as well asat maximum load: burden uplifting and dropping down.

  13. Robust matching for voice recognition

    Science.gov (United States)

    Higgins, Alan; Bahler, L.; Porter, J.; Blais, P.

    1994-10-01

    This paper describes an automated method of comparing a voice sample of an unknown individual with samples from known speakers in order to establish or verify the individual's identity. The method is based on a statistical pattern matching approach that employs a simple training procedure, requires no human intervention (transcription, work or phonetic marketing, etc.), and makes no assumptions regarding the expected form of the statistical distributions of the observations. The content of the speech material (vocabulary, grammar, etc.) is not assumed to be constrained in any way. An algorithm is described which incorporates frame pruning and channel equalization processes designed to achieve robust performance with reasonable computational resources. An experimental implementation demonstrating the feasibility of the concept is described.

  14. Speaker-dependent Dictionary-based Speech Enhancement for Text-Dependent Speaker Verification

    DEFF Research Database (Denmark)

    Thomsen, Nicolai Bæk; Thomsen, Dennis Alexander Lehmann; Tan, Zheng-Hua

    2016-01-01

    not perform well in this setting. In this work we compare the performance of different noise reduction methods under different noise conditions in terms of speaker verification when the text is known and the system is trained on clean data (mis-matched conditions). We furthermore propose a new approach based......The problem of text-dependent speaker verification under noisy conditions is becoming ever more relevant, due to increased usage for authentication in real-world applications. Classical methods for noise reduction such as spectral subtraction and Wiener filtering introduce distortion and do...... on dictionary-based noise reduction and compare it to the baseline methods....

  15. Speaker diarization system using HXLPS and deep neural network

    Directory of Open Access Journals (Sweden)

    V. Subba Ramaiah

    2018-03-01

    Full Text Available In general, speaker diarization is defined as the process of segmenting the input speech signal and grouped the homogenous regions with regard to the speaker identity. The main idea behind this system is that it is able to discriminate the speaker signal by assigning the label of the each speaker signal. Due to rapid growth of broadcasting and meeting, the speaker diarization is burdensome to enhance the readability of the speech transcription. In order to solve this issue, Holoentropy with the eXtended Linear Prediction using autocorrelation Snapshot (HXLPS and deep neural network (DNN is proposed for the speaker diarization system. The HXLPS extraction method is newly developed by incorporating the Holoentropy with the XLPS. Once we attain the features, the speech and non-speech signals are detected by the Voice Activity Detection (VAD method. Then, i-vector representation of every segmented signal is obtained using Universal Background Model (UBM model. Consequently, DNN is utilized to assign the label for the speaker signal which is then clustered according to the speaker label. The performance is analysed using the evaluation metrics, such as tracking distance, false alarm rate and diarization error rate. The outcome of the proposed method ensures the better diarization performance by achieving the lower DER of 1.36% based on lambda value and DER of 2.23% depends on the frame length. Keywords: Speaker diarization, HXLPS feature extraction, Voice activity detection, Deep neural network, Speaker clustering, Diarization Error Rate (DER

  16. Communication Interface for Mexican Spanish Dysarthric Speakers

    Directory of Open Access Journals (Sweden)

    Gladys Bonilla-Enriquez

    2012-03-01

    Full Text Available La disartria es una discapacidad motora del habla caracterizada por debilidad o poca coordinación de los músculos del habla. Esta condición puede ser causada por un infarto, parálisis cerebral, o por una lesión severa en el cerebro. Para mexicanos con esta condición hay muy pocas, si es que hay alguna, tecnologías de asistencia para mejorar sus habilidades sociales de interacción. En este artículo presentamos nuestros avances hacia el desarrollo de una interfazde comunicación para hablantes con disartria cuya lengua materna sea el español mexicano. La metodología propuesta depende de (1 diseño especial de un corpus de entrenamiento con voz normal y recursos limitados, (2 adaptación de usuario estándar, y (3 control de la perplejidad del modelo de lenguaje para lograr alta precisión en el Reconocimiento Automático del Habla (RAH. La interfaz permite al usuario y terapéuta el realizar actividades como adaptación dinámica de usuario, adaptación de vocabulario, y síntesis de texto a voz. Pruebas en vivo fueron realizadas con un usuario con disartria leve, logrando precisiones de 93%-95% para habla espontánea.Dysarthria is a motor speech disorder due to weakness or poor coordination of the speechmuscles. This condition can be caused by a stroke, cerebral palsy, or by a traumatic braininjury. For Mexican people with this condition there are few, if any, assistive technologies to improve their social interaction skills. In this paper we present our advances towards the development of a communication interface for dysarthric speakers whose native language is Mexican Spanish. We propose a methodology that relies on (1 special design of a training normal-speech corpus with limited resources, (2 standard speaker adaptation, and (3 control of language model perplexity, to achieve high Automatic Speech Recognition (ASR accuracy. The interface allows the user and therapist to perform tasks such as dynamic speaker adaptation, vocabulary

  17. Identification of myogenic regulatory genes in the muscle transcriptome of beltfish (Trichiurus lepturus: A major commercial marine fish species with robust swimming ability

    Directory of Open Access Journals (Sweden)

    Hui Zhang

    2016-06-01

    Full Text Available The beltfish (Trichiurus lepturus is considered as one of the most economically important marine fish in East Asia. It is a top predator with a robust swimming ability that is a good model to study muscle physiology in fish. In the present study, we used Illumina sequencing technology (NextSeq500 to sequence, assemble and annotate the muscle transcriptome of juvenile beltfish. A total of 57,509,280 clean reads (deposited in NCBI SRA database with accession number of SRX1674471 were obtained from RNA sequencing and 26,811 unigenes (with N50 of 1033 bp were obtained after de novo assembling with Trinity software. BLASTX against NR, GO, KEGG and eggNOG databases show 100%, 49%, 31% and 96% annotation rate, respectively. By mining beltfish muscle transcriptome, several key genes which play essential role on regulating myogenesis, including pax3, pax7, myf5, myoD, mrf4/myf6, myogenin and myostatin were identified with a low expression level. The muscle transcriptome of beltfish can provide some insight into the understanding of genome-wide transcriptome profile of teleost muscle tissue and give useful information to study myogenesis in juvenile/adult fish.

  18. Real Time Recognition Of Speakers From Internet Audio Stream

    Directory of Open Access Journals (Sweden)

    Weychan Radoslaw

    2015-09-01

    Full Text Available In this paper we present an automatic speaker recognition technique with the use of the Internet radio lossy (encoded speech signal streams. We show an influence of the audio encoder (e.g., bitrate on the speaker model quality. The model of each speaker was calculated with the use of the Gaussian mixture model (GMM approach. Both the speaker recognition and the further analysis were realized with the use of short utterances to facilitate real time processing. The neighborhoods of the speaker models were analyzed with the use of the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator, acquired from the Polish radio Internet services. The presented software was developed with the MATLAB environment.

  19. Effect of lisping on audience evaluation of male speakers.

    Science.gov (United States)

    Mowrer, D E; Wahl, P; Doolan, S J

    1978-05-01

    The social consequences of adult listeners' first impression of lisping were evaluated in two studies. Five adult speakers were rated by adult listeners with regard to speaking ability, intelligence, education, masculinity, and friendship. Results from both studies indicate that listeners rate adult speakers who demonstrate frontal lisping lower than nonlispers in all five categories investigated. Efforts to correct frontal lisping are justifiable on the basis of the poor impression lisping speakers make on the listener.

  20. The Speaker Gender Gap at Critical Care Conferences.

    Science.gov (United States)

    Mehta, Sangeeta; Rose, Louise; Cook, Deborah; Herridge, Margaret; Owais, Sawayra; Metaxa, Victoria

    2018-06-01

    To review women's participation as faculty at five critical care conferences over 7 years. Retrospective analysis of five scientific programs to identify the proportion of females and each speaker's profession based on conference conveners, program documents, or internet research. Three international (European Society of Intensive Care Medicine, International Symposium on Intensive Care and Emergency Medicine, Society of Critical Care Medicine) and two national (Critical Care Canada Forum, U.K. Intensive Care Society State of the Art Meeting) annual critical care conferences held between 2010 and 2016. Female faculty speakers. None. Male speakers outnumbered female speakers at all five conferences, in all 7 years. Overall, women represented 5-31% of speakers, and female physicians represented 5-26% of speakers. Nursing and allied health professional faculty represented 0-25% of speakers; in general, more than 50% of allied health professionals were women. Over the 7 years, Society of Critical Care Medicine had the highest representation of female (27% overall) and nursing/allied health professional (16-25%) speakers; notably, male physicians substantially outnumbered female physicians in all years (62-70% vs 10-19%, respectively). Women's representation on conference program committees ranged from 0% to 40%, with Society of Critical Care Medicine having the highest representation of women (26-40%). The female proportions of speakers, physician speakers, and program committee members increased significantly over time at the Society of Critical Care Medicine and U.K. Intensive Care Society State of the Art Meeting conferences (p gap at critical care conferences, with male faculty outnumbering female faculty. This gap is more marked among physician speakers than those speakers representing nursing and allied health professionals. Several organizational strategies can address this gender gap.

  1. On the improvement of speaker diarization by detecting overlapped speech

    OpenAIRE

    Hernando Pericás, Francisco Javier; Hernando Pericás, Francisco Javier

    2010-01-01

    Simultaneous speech in meeting environment is responsible for a certain amount of errors caused by standard speaker diarization systems. We are presenting an overlap detection system for far-field data based on spectral and spatial features, where the spatial features obtained on different microphone pairs are fused by means of principal component analysis. Detected overlap segments are applied for speaker diarization in order to increase the purity of speaker clusters an...

  2. IDENTIFICATION OF A ROBUST LICHEN INDEX FOR THE DECONVOLUTION OF LICHEN AND ROCK MIXTURES USING PATTERN SEARCH ALGORITHM (CASE STUDY: GREENLAND

    Directory of Open Access Journals (Sweden)

    S. Salehi

    2016-06-01

    Full Text Available Lichens are the dominant autotrophs of polar and subpolar ecosystems commonly encrust the rock outcrops. Spectral mixing of lichens and bare rock can shift diagnostic spectral features of materials of interest thus leading to misinterpretation and false positives if mapping is done based on perfect spectral matching methodologies. Therefore, the ability to distinguish the lichen coverage from rock and decomposing a mixed pixel into a collection of pure reflectance spectra, can improve the applicability of hyperspectral methods for mineral exploration. The objective of this study is to propose a robust lichen index that can be used to estimate lichen coverage, regardless of the mineral composition of the underlying rocks. The performance of three index structures of ratio, normalized ratio and subtraction have been investigated using synthetic linear mixtures of pure rock and lichen spectra with prescribed mixing ratios. Laboratory spectroscopic data are obtained from lichen covered samples collected from Karrat, Liverpool Land, and Sisimiut regions in Greenland. The spectra are then resampled to Hyperspectral Mapper (HyMAP resolution, in order to further investigate the functionality of the indices for the airborne platform. In both resolutions, a Pattern Search (PS algorithm is used to identify the optimal band wavelengths and bandwidths for the lichen index. The results of our band optimization procedure revealed that the ratio between R894-1246 and R1110 explains most of the variability in the hyperspectral data at the original laboratory resolution (R2=0.769. However, the normalized index incorporating R1106-1121 and R904-1251 yields the best results for the HyMAP resolution (R2=0.765.

  3. Noise-robust speech triage.

    Science.gov (United States)

    Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav

    2018-04-01

    A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).

  4. Speaker-dependent Multipitch Tracking Using Deep Neural Networks

    Science.gov (United States)

    2015-01-01

    sentences spoken by each of 34 speakers (18 male, 16 female). Two male and two female speakers (No. 1, 2, 18, 20, same as [30]), denoted as MA1, MA2 ...Engineering Technical Report #12, 2015 Speaker Pairs MA1- MA2 MA1-FE1 MA1-FE2 MA2 -FE1 MA2 -FE2 FE1-FE2 E T ot al 0 10 20 30 40 50 60 70 80 Jin and Wang Hu and...Pitch 1 Estimated Pitch 2 (d) Figure 6: Multipitch tracking results on a test mixture (pbbv6n and priv3n) for the MA1- MA2 speaker pair. (a) Groundtruth

  5. Comparison of Diarization Tools for Building Speaker Database

    Directory of Open Access Journals (Sweden)

    Eva Kiktova

    2015-01-01

    Full Text Available This paper compares open source diarization toolkits (LIUM, DiarTK, ALIZE-Lia_Ral, which were designed for extraction of speaker identity from audio records without any prior information about the analysed data. The comparative study of used diarization tools was performed for three different types of analysed data (broadcast news - BN and TV shows. Corresponding values of achieved DER measure are presented here. The automatic speaker diarization system developed by LIUM was able to identified speech segments belonging to speakers at very good level. Its segmentation outputs can be used to build a speaker database.

  6. Speaker Clustering for a Mixture of Singing and Reading (Preprint)

    Science.gov (United States)

    2012-03-01

    diarization [2, 3] which answers the ques- tion of ”who spoke when?” is a combination of speaker segmentation and clustering. Although it is possible to...focuses on speaker clustering, the techniques developed here can be applied to speaker diarization . For the remainder of this paper, the term ”speech...and retrieval,” Proceedings of the IEEE, vol. 88, 2000. [2] S. Tranter and D. Reynolds, “An overview of automatic speaker diarization systems,” IEEE

  7. Robust Multimodal Dictionary Learning

    Science.gov (United States)

    Cao, Tian; Jojic, Vladimir; Modla, Shannon; Powell, Debbie; Czymmek, Kirk; Niethammer, Marc

    2014-01-01

    We propose a robust multimodal dictionary learning method for multimodal images. Joint dictionary learning for both modalities may be impaired by lack of correspondence between image modalities in training data, for example due to areas of low quality in one of the modalities. Dictionaries learned with such non-corresponding data will induce uncertainty about image representation. In this paper, we propose a probabilistic model that accounts for image areas that are poorly corresponding between the image modalities. We cast the problem of learning a dictionary in presence of problematic image patches as a likelihood maximization problem and solve it with a variant of the EM algorithm. Our algorithm iterates identification of poorly corresponding patches and re-finements of the dictionary. We tested our method on synthetic and real data. We show improvements in image prediction quality and alignment accuracy when using the method for multimodal image registration. PMID:24505674

  8. A Text-Independent Speaker Authentication System for Mobile Devices

    Directory of Open Access Journals (Sweden)

    Florentin Thullier

    2017-09-01

    Full Text Available This paper presents a text independent speaker authentication method adapted to mobile devices. Special attention was placed on delivering a fully operational application, which admits a sufficient reliability level and an efficient functioning. To this end, we have excluded the need for any network communication. Hence, we opted for the completion of both the training and the identification processes directly on the mobile device through the extraction of linear prediction cepstral coefficients and the naive Bayes algorithm as the classifier. Furthermore, the authentication decision is enhanced to overcome misidentification through access privileges that the user should attribute to each application beforehand. To evaluate the proposed authentication system, eleven participants were involved in the experiment, conducted in quiet and noisy environments. Public speech corpora were also employed to compare this implementation to existing methods. Results were efficient regarding mobile resources’ consumption. The overall classification performance obtained was accurate with a small number of samples. Then, it appeared that our authentication system might be used as a first security layer, but also as part of a multilayer authentication, or as a fall-back mechanism.

  9. Affective processing in bilingual speakers: disembodied cognition?

    Science.gov (United States)

    Pavlenko, Aneta

    2012-01-01

    A recent study by Keysar, Hayakawa, and An (2012) suggests that "thinking in a foreign language" may reduce decision biases because a foreign language provides a greater emotional distance than a native tongue. The possibility of such "disembodied" cognition is of great interest for theories of affect and cognition and for many other areas of psychological theory and practice, from clinical and forensic psychology to marketing, but first this claim needs to be properly evaluated. The purpose of this review is to examine the findings of clinical, introspective, cognitive, psychophysiological, and neuroimaging studies of affective processing in bilingual speakers in order to identify converging patterns of results, to evaluate the claim about "disembodied cognition," and to outline directions for future inquiry. The findings to date reveal two interrelated processing effects. First-language (L1) advantage refers to increased automaticity of affective processing in the L1 and heightened electrodermal reactivity to L1 emotion-laden words. Second-language (L2) advantage refers to decreased automaticity of affective processing in the L2, which reduces interference effects and lowers electrodermal reactivity to negative emotional stimuli. The differences in L1 and L2 affective processing suggest that in some bilingual speakers, in particular late bilinguals and foreign language users, respective languages may be differentially embodied, with the later learned language processed semantically but not affectively. This difference accounts for the reduction of framing biases in L2 processing in the study by Keysar et al. (2012). The follow-up discussion identifies the limits of the findings to date in terms of participant populations, levels of processing, and types of stimuli, puts forth alternative explanations of the documented effects, and articulates predictions to be tested in future research.

  10. Differences in Sickness Allowance Receipt between Swedish Speakers and Finnish Speakers in Finland

    Directory of Open Access Journals (Sweden)

    Kaarina S. Reini

    2017-12-01

    Full Text Available Previous research has documented lower disability retirement and mortality rates of Swedish speakers as compared with Finnish speakers in Finland. This paper is the first to compare the two language groups with regard to the receipt of sickness allowance, which is an objective health measure that reflects a less severe poor health condition. Register-based data covering the years 1988-2011 are used. We estimate logistic regression models with generalized estimating equations to account for repeated observations at the individual level. We find that Swedish-speaking men have approximately 30 percent lower odds of receiving sickness allowance than Finnish-speaking men, whereas the difference in women is about 15 percent. In correspondence with previous research on all-cause mortality at working ages, we find no language-group difference in sickness allowance receipt in the socially most successful subgroup of the population.

  11. The Status of Native Speaker Intuitions in a Polylectal Grammar.

    Science.gov (United States)

    Debose, Charles E.

    A study of one speaker's intuitions about and performance in Black English is presented with relation to Saussure's "langue-parole" dichotomy. Native speakers of a language have intuitions about the static synchronic entities although the data of their speaking is variable and panchronic. These entities are in a diglossic relationship to each…

  12. Speaker and Observer Perceptions of Physical Tension during Stuttering.

    Science.gov (United States)

    Tichenor, Seth; Leslie, Paula; Shaiman, Susan; Yaruss, J Scott

    2017-01-01

    Speech-language pathologists routinely assess physical tension during evaluation of those who stutter. If speakers experience tension that is not visible to clinicians, then judgments of severity may be inaccurate. This study addressed this potential discrepancy by comparing judgments of tension by people who stutter and expert clinicians to determine if clinicians could accurately identify the speakers' experience of physical tension. Ten adults who stutter were audio-video recorded in two speaking samples. Two board-certified specialists in fluency evaluated the samples using the Stuttering Severity Instrument-4 and a checklist adapted for this study. Speakers rated their tension using the same forms, and then discussed their experiences in a qualitative interview so that themes related to physical tension could be identified. The degree of tension reported by speakers was higher than that observed by specialists. Tension in parts of the body that were less visible to the observer (chest, abdomen, throat) was reported more by speakers than by specialists. The thematic analysis revealed that speakers' experience of tension changes over time and that these changes may be related to speakers' acceptance of stuttering. The lack of agreement between speaker and specialist perceptions of tension suggests that using self-reports is a necessary component for supporting the accurate diagnosis of tension in stuttering. © 2018 S. Karger AG, Basel.

  13. Request Strategies in Everyday Interactions of Persian and English Speakers

    Directory of Open Access Journals (Sweden)

    Shiler Yazdanfar

    2016-12-01

    Full Text Available Cross-cultural studies of speech acts in different linguistic contexts might have interesting implications for language researchers and practitioners. Drawing on the Speech Act Theory, the present study aimed at conducting a comparative study of request speech act in Persian and English. Specifically, the study endeavored to explore the request strategies used in daily interactions of Persian and English speakers based on directness level and supportive moves. To this end, English and Persian TV series were observed and requestive utterances were transcribed. The utterances were then categorized based on Blum-Kulka and Olshtain’s Cross-Cultural Study of Speech Act Realization Pattern (CCSARP for directness level and internal and external mitigation devises. According to the results, although speakers of both languages opted for the direct level as their most frequently used strategy in their daily interactions, the English speakers used more conventionally indirect strategies than the Persian speakers did, and the Persian speakers used more non-conventionally indirect strategies than the English speakers did. Furthermore, the analyzed data revealed the fact that American English speakers use more mitigation devices in their daily interactions with friends and family members than Persian speakers.

  14. Dysprosody and Stimulus Effects in Cantonese Speakers with Parkinson's Disease

    Science.gov (United States)

    Ma, Joan K.-Y.; Whitehill, Tara; Cheung, Katherine S.-K.

    2010-01-01

    Background: Dysprosody is a common feature in speakers with hypokinetic dysarthria. However, speech prosody varies across different types of speech materials. This raises the question of what is the most appropriate speech material for the evaluation of dysprosody. Aims: To characterize the prosodic impairment in Cantonese speakers with…

  15. Teaching Portuguese to Spanish Speakers: A Case for Trilingualism

    Science.gov (United States)

    Carvalho, Ana M.; Freire, Juliana Luna; da Silva, Antonio J. B.

    2010-01-01

    Portuguese is the sixth-most-spoken native language in the world, with approximately 240,000,000 speakers. Within the United States, there is a growing demand for K-12 language programs to engage the community of Portuguese heritage speakers. According to the 2000 U.S. census, 85,000 school-age children speak Portuguese at home. As a result, more…

  16. Profiles of an Acquisition Generation: Nontraditional Heritage Speakers of Spanish

    Science.gov (United States)

    DeFeo, Dayna Jean

    2018-01-01

    Though definitions vary, the literature on heritage speakers of Spanish identifies two primary attributes: a linguistic and cultural connection to the language. This article profiles four Anglo college students who grew up in bilingual or Spanish-dominant communities in the Southwest who self-identified as Spanish heritage speakers, citing…

  17. Progress in the AMIDA speaker diarization system for meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Konečný, M.

    2008-01-01

    In this paper we describe the AMIDA speaker dizarization system as it was submitted to the NIST Rich Transcription evaluation 2007 for conference room data. This is done in the context of the history of this system and other speaker diarization systems. One of the goals of our system is to have as

  18. A hybrid generative-discriminative approach to speaker diarization

    NARCIS (Netherlands)

    Noulas, A.K.; van Kasteren, T.; Kröse, B.J.A.

    2008-01-01

    In this paper we present a sound probabilistic approach to speaker diarization. We use a hybrid framework where a distribution over the number of speakers at each point of a multimodal stream is estimated with a discriminative model. The output of this process is used as input in a generative model

  19. Guest Speakers in School-Based Sexuality Education

    Science.gov (United States)

    McRee, Annie-Laurie; Madsen, Nikki; Eisenberg, Marla E.

    2014-01-01

    This study, using data from a statewide survey (n = 332), examined teachers' practices regarding the inclusion of guest speakers to cover sexuality content. More than half of teachers (58%) included guest speakers. In multivariate analyses, teachers who taught high school, had professional preparation in health education, or who received…

  20. Internal request modification by first and second language speakers ...

    African Journals Online (AJOL)

    This study focuses on the question of whether Luganda English speakers would negatively transfer into their English speech the use of syntactic and lexical down graders resulting in pragmatic failure. Data were collected from Luganda and Luganda English speakers by means of a Discourse Completion Test (DCT) ...

  1. (En)countering native-speakerism global perspectives

    CERN Document Server

    Holliday, Adrian; Swan, Anne

    2015-01-01

    The book addresses the issue of native-speakerism, an ideology based on the assumption that 'native speakers' of English have a special claim to the language itself, through critical qualitative studies of the lived experiences of practising teachers and students in a range of scenarios.

  2. Who spoke when? Audio-based speaker location estimation for diarization

    NARCIS (Netherlands)

    Dadvar, M.

    2011-01-01

    Speaker diarization is the process which detects active speakers and groups those speech signals which has been uttered by the same speaker. Generally we can find two main applications for speaker diarization. Automatic Speech Recognition systems make use of the speaker homogeneous clusters to adapt

  3. The Communication of Public Speaking Anxiety: Perceptions of Asian and American Speakers.

    Science.gov (United States)

    Martini, Marianne; And Others

    1992-01-01

    Finds that U.S. audiences perceive Asian speakers to have more speech anxiety than U.S. speakers, even though Asian speakers do not self-report higher anxiety levels. Confirms that speech state anxiety is not communicated effectively between speakers and audiences for Asian or U.S. speakers. (SR)

  4. Data-Model Relationship in Text-Independent Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Stapert Robert

    2005-01-01

    Full Text Available Text-independent speaker recognition systems such as those based on Gaussian mixture models (GMMs do not include time sequence information (TSI within the model itself. The level of importance of TSI in speaker recognition is an interesting question and one addressed in this paper. Recent works has shown that the utilisation of higher-level information such as idiolect, pronunciation, and prosodics can be useful in reducing speaker recognition error rates. In accordance with these developments, the aim of this paper is to show that as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. This paper presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM, embeds dynamic time warping (DTW into a GMM framework. Results are presented on the 2000-speaker SpeechDat Welsh database which show improved speaker recognition performance with the SMM.

  5. Speakers of different languages process the visual world differently.

    Science.gov (United States)

    Chabal, Sarah; Marian, Viorica

    2015-06-01

    Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct language input, showing that linguistic information is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. (c) 2015 APA, all rights reserved).

  6. Mitali Chatterjee | Speakers | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    ... diagnosis, identification of factors contributing to antimonial unresponsiveness and evaluation of newer chemotherapeutic modalities for better management of ... The monocytes/macrophages displayed polarization towards an alternatively activated phenotype with a consistent increase in expression of classical M2 ...

  7. Linear array of photodiodes to track a human speaker for video recording

    International Nuclear Information System (INIS)

    DeTone, D; Neal, H; Lougheed, R

    2012-01-01

    Communication and collaboration using stored digital media has garnered more interest by many areas of business, government and education in recent years. This is due primarily to improvements in the quality of cameras and speed of computers. An advantage of digital media is that it can serve as an effective alternative when physical interaction is not possible. Video recordings that allow for viewers to discern a presenter's facial features, lips and hand motions are more effective than videos that do not. To attain this, one must maintain a video capture in which the speaker occupies a significant portion of the captured pixels. However, camera operators are costly, and often do an imperfect job of tracking presenters in unrehearsed situations. This creates motivation for a robust, automated system that directs a video camera to follow a presenter as he or she walks anywhere in the front of a lecture hall or large conference room. Such a system is presented. The system consists of a commercial, off-the-shelf pan/tilt/zoom (PTZ) color video camera, a necklace of infrared LEDs and a linear photodiode array detector. Electronic output from the photodiode array is processed to generate the location of the LED necklace, which is worn by a human speaker. The computer controls the video camera movements to record video of the speaker. The speaker's vertical position and depth are assumed to remain relatively constant– the video camera is sent only panning (horizontal) movement commands. The LED necklace is flashed at 70Hz at a 50% duty cycle to provide noise-filtering capability. The benefit to using a photodiode array versus a standard video camera is its higher frame rate (4kHz vs. 60Hz). The higher frame rate allows for the filtering of infrared noise such as sunlight and indoor lighting–a capability absent from other tracking technologies. The system has been tested in a large lecture hall and is shown to be effective.

  8. Linear array of photodiodes to track a human speaker for video recording

    Science.gov (United States)

    DeTone, D.; Neal, H.; Lougheed, R.

    2012-12-01

    Communication and collaboration using stored digital media has garnered more interest by many areas of business, government and education in recent years. This is due primarily to improvements in the quality of cameras and speed of computers. An advantage of digital media is that it can serve as an effective alternative when physical interaction is not possible. Video recordings that allow for viewers to discern a presenter's facial features, lips and hand motions are more effective than videos that do not. To attain this, one must maintain a video capture in which the speaker occupies a significant portion of the captured pixels. However, camera operators are costly, and often do an imperfect job of tracking presenters in unrehearsed situations. This creates motivation for a robust, automated system that directs a video camera to follow a presenter as he or she walks anywhere in the front of a lecture hall or large conference room. Such a system is presented. The system consists of a commercial, off-the-shelf pan/tilt/zoom (PTZ) color video camera, a necklace of infrared LEDs and a linear photodiode array detector. Electronic output from the photodiode array is processed to generate the location of the LED necklace, which is worn by a human speaker. The computer controls the video camera movements to record video of the speaker. The speaker's vertical position and depth are assumed to remain relatively constant- the video camera is sent only panning (horizontal) movement commands. The LED necklace is flashed at 70Hz at a 50% duty cycle to provide noise-filtering capability. The benefit to using a photodiode array versus a standard video camera is its higher frame rate (4kHz vs. 60Hz). The higher frame rate allows for the filtering of infrared noise such as sunlight and indoor lighting-a capability absent from other tracking technologies. The system has been tested in a large lecture hall and is shown to be effective.

  9. ROBUST LOCALISATION OF MULTIPLE SPEAKERS EXPLOITING HEAD MOVEMENTS AND MULTI-CONDITIONAL TRAINING OF BINAURAL CUES

    DEFF Research Database (Denmark)

    May, Tobias; Ma, Ning; Brown, Guy

    2015-01-01

    differences (ITDs) and interaural level dif- ferences (ILDs) in the front and rear hemifield, a machine hearing system is presented which combines supervised learning of binaural cues using multi-conditional training (MCT) with a head movement strategy. A systematic evaluation showed that this approach...

  10. The invisible minority: revisiting the debate on foreign-accented speakers and upward mobility in the workplace.

    Science.gov (United States)

    Akomolafe, Soji

    2013-01-01

    Of some of the major types of discrimination, the one that gets the least attention is national origin discrimination and in particular, accent discrimination, especially when it comes to upward mobility in the workplace. Yet, unlike other forms of discrimination, accent discrimination is rarely a subject of any robust public debate. This paper is a modest attempt to help establish a framework for understanding the relative neglect to which the discourse on accent discrimination has been subjected vis-a-vis the overall national debate on diversity. Hopefully, in the process, it will stimulate a more robust conversation on the plight of foreign-accented speakers.

  11. Studies on inter-speaker variability in speech and its application in ...

    Indian Academy of Sciences (India)

    tic representation of vowel realizations by different speakers. ... in regional background, education level and gender of speaker. A more ...... formal maps such as bilinear transform and its generalizations for speaker normalization. Since.

  12. Methods for robustness programming

    NARCIS (Netherlands)

    Olieman, N.J.

    2008-01-01

    Robustness of an object is defined as the probability that an object will have properties as required. Robustness Programming (RP) is a mathematical approach for Robustness estimation and Robustness optimisation. An example in the context of designing a food product, is finding the best composition

  13. Robustness in laying hens

    NARCIS (Netherlands)

    Star, L.

    2008-01-01

    The aim of the project ‘The genetics of robustness in laying hens’ was to investigate nature and regulation of robustness in laying hens under sub-optimal conditions and the possibility to increase robustness by using animal breeding without loss of production. At the start of the project, a robust

  14. Human and automatic speaker recognition over telecommunication channels

    CERN Document Server

    Fernández Gallardo, Laura

    2016-01-01

    This work addresses the evaluation of the human and the automatic speaker recognition performances under different channel distortions caused by bandwidth limitation, codecs, and electro-acoustic user interfaces, among other impairments. Its main contribution is the demonstration of the benefits of communication channels of extended bandwidth, together with an insight into how speaker-specific characteristics of speech are preserved through different transmissions. It provides sufficient motivation for considering speaker recognition as a criterion for the migration from narrowband to enhanced bandwidths, such as wideband and super-wideband.

  15. Robust System Identification and Control Design

    National Research Council Canada - National Science Library

    Zhou, Kemin

    2001-01-01

    ..., some advanced nonlinear control techniques including bifurcation stabilization and compressor stabilization techniques, model reduction techniques, fault detection and fault tolerant control methods...

  16. Fundamental frequency characteristics of Jordanian Arabic speakers.

    Science.gov (United States)

    Natour, Yaser S; Wingate, Judith M

    2009-09-01

    This study is the first in a series of investigations designed to test the acoustic characteristics of the normal Arabic voice. The subjects were three hundred normal Jordanian Arabic speakers (100 adult males, 100 adult females, and 100 children). The subjects produced a sustained phonation of the vowel /a:/ and stated their complete names (i.e. first, second, third and surname) using a carrier phrase. The samples were analyzed using the Multi Dimensional Voice Program (MDVP). Fundamental frequency (F0) from the /a:/ and speaking fundamental frequency (SF0) from the sentence were analyzed. Results revealed a significant difference of both F0 and SF0 values among adult Jordanian Arabic-speaking males (F0=131.34Hz +/- 18.65, SF0=137.45 +/- 18.93), females (F0=231.13Hz +/- 20.86, SF0=230.84 +/- 16.50) and children (F0=270.93Hz +/- 20.01, SF0=278.04 +/- 32.07). Comparison with other ethnicities indicated that F0 values of adult Jordanian Arabic-speaking males and females are generally consistent with adult Caucasian and African-American values. However, for Jordanian Arabic-speaking children, a higher trend in F0 values was present than their Western counterparts. SF0 values for adult Jordanian Arabic-speaking males are generally consistent with the adult Caucasian male SF0 values. However, SF0 values of adult Jordanian-speaking females and children were relatively higher than the reported Western values. It is recommended that speech-language pathologists in Arabic-speaking countries, Jordan in specific, utilize the new data provided (F0 and SF0) when evaluating and/or treating Arabic-speaking patients. Due to its cross-linguistic variability, SF0 emerged as a preferred measurement when conducting cross-cultural comparisons of voice features.

  17. Word level language identification in online multilingual communication

    NARCIS (Netherlands)

    Nguyen, Dong-Phuong; Dogruoz, A. Seza

    2013-01-01

    Multilingual speakers switch between languages in online and spoken communication. Analyses of large scale multilingual data require automatic language identification at the word level. For our experiments with multilingual online discussions, we first tag the language of individual words using

  18. a sociophonetic study of young nigerian english speakers

    African Journals Online (AJOL)

    Oladipupo

    between male and female speakers in boundary consonant deletion, (F(1, .... speech perception (Foulkes 2006, Clopper & Pisoni, 2005, Thomas 2002). ... in Nigeria, and had had the privilege of travelling to Europe and the Americas for the.

  19. Forensic Speaker Recognition Law Enforcement and Counter-Terrorism

    CERN Document Server

    Patil, Hemant

    2012-01-01

    Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...

  20. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    nonparametric estimate of a multivariate density function,” The Annals of Math- ematical Statistics , vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  1. Electrophysiology of subject-verb agreement mediated by speakers' gender.

    Science.gov (United States)

    Hanulíková, Adriana; Carreiras, Manuel

    2015-01-01

    An important property of speech is that it explicitly conveys features of a speaker's identity such as age or gender. This event-related potential (ERP) study examined the effects of social information provided by a speaker's gender, i.e., the conceptual representation of gender, on subject-verb agreement. Despite numerous studies on agreement, little is known about syntactic computations generated by speaker characteristics extracted from the acoustic signal. Slovak is well suited to investigate this issue because it is a morphologically rich language in which agreement involves features for number, case, and gender. Grammaticality of a sentence can be evaluated by checking a speaker's gender as conveyed by his/her voice. We examined how conceptual information about speaker gender, which is not syntactic but rather social and pragmatic in nature, is interpreted for the computation of agreement patterns. ERP responses to verbs disagreeing with the speaker's gender (e.g., a sentence including a masculine verbal inflection spoken by a female person 'the neighbors were upset because I (∗)stoleMASC plums') elicited a larger early posterior negativity compared to correct sentences. When the agreement was purely syntactic and did not depend on the speaker's gender, a disagreement between a formally marked subject and the verb inflection (e.g., the womanFEM (∗)stoleMASC plums) resulted in a larger P600 preceded by a larger anterior negativity compared to the control sentences. This result is in line with proposals according to which the recruitment of non-syntactic information such as the gender of the speaker results in N400-like effects, while formally marked syntactic features lead to structural integration as reflected in a LAN/P600 complex.

  2. Understanding speaker attitudes from prosody by adults with Parkinson's disease.

    Science.gov (United States)

    Monetta, Laura; Cheang, Henry S; Pell, Marc D

    2008-09-01

    The ability to interpret vocal (prosodic) cues during social interactions can be disrupted by Parkinson's disease, with notable effects on how emotions are understood from speech. This study investigated whether PD patients who have emotional prosody deficits exhibit further difficulties decoding the attitude of a speaker from prosody. Vocally inflected but semantically nonsensical 'pseudo-utterances' were presented to listener groups with and without PD in two separate rating tasks. Task I required participants to rate how confident a speaker sounded from their voice and Task 2 required listeners to rate how polite the speaker sounded for a comparable set of pseudo-utterances. The results showed that PD patients were significantly less able than HC participants to use prosodic cues to differentiate intended levels of speaker confidence in speech, although the patients could accurately detect the politelimpolite attitude of the speaker from prosody in most cases. Our data suggest that many PD patients fail to use vocal cues to effectively infer a speaker's emotions as well as certain attitudes in speech such as confidence, consistent with the idea that the basal ganglia play a role in the meaningful processing of prosodic sequences in spoken language (Pell & Leonard, 2003).

  3. A fundamental residue pitch perception bias for tone language speakers

    Science.gov (United States)

    Petitti, Elizabeth

    A complex tone composed of only higher-order harmonics typically elicits a pitch percept equivalent to the tone's missing fundamental frequency (f0). When judging the direction of residue pitch change between two such tones, however, listeners may have completely opposite perceptual experiences depending on whether they are biased to perceive changes based on the overall spectrum or the missing f0 (harmonic spacing). Individual differences in residue pitch change judgments are reliable and have been associated with musical experience and functional neuroanatomy. Tone languages put greater pitch processing demands on their speakers than non-tone languages, and we investigated whether these lifelong differences in linguistic pitch processing affect listeners' bias for residue pitch. We asked native tone language speakers and native English speakers to perform a pitch judgment task for two tones with missing fundamental frequencies. Given tone pairs with ambiguous pitch changes, listeners were asked to judge the direction of pitch change, where the direction of their response indicated whether they attended to the overall spectrum (exhibiting a spectral bias) or the missing f0 (exhibiting a fundamental bias). We found that tone language speakers are significantly more likely to perceive pitch changes based on the missing f0 than English speakers. These results suggest that tone-language speakers' privileged experience with linguistic pitch fundamentally tunes their basic auditory processing.

  4. The effect of L1 prosodic backgrounds of Cantonese and Japanese speakers on the perception of Mandarin tones after training

    Science.gov (United States)

    So, Connie K.

    2005-04-01

    The present study investigated to what extent ones' L1 prosodic backgrounds affect their learning of a new tonal system. The question as to whether native speakers of a tone language perform differently from those of a pitch accent language will be addressed. Twenty native speakers of Hong Kong Cantonese (a tone language) and Japanese (a pitch accent language) were assigned to two groups. All of them had had no prior knowledge of Mandarin, and had never received any form of musical training before they participated in the study. Their performance of the identification of Mandarin tones before and after a short-term training was compared. Analysis of listeners' tonal confusions in the pretest, posttest, and generalization tests revealed that both Cantonese and Japanese listeners had more confusion for two contrastive tone pairs: Tone 1-Tone 4, and Tone 2-Tone 3. Moreover, Cantonese speakers consistently had greater difficulty than Japanese speakers in distinguishing the tones in each pair. These imply that listeners L1 prosodic backgrounds are at work during the process of learning a new tonal system. The findings will be further discussed in terms of the Perceptual Assimilation Model (Best, 1995). [Work supported by SSHRC.

  5. Perceptual Robust Design

    DEFF Research Database (Denmark)

    Pedersen, Søren Nygaard

    The research presented in this PhD thesis has focused on a perceptual approach to robust design. The results of the research and the original contribution to knowledge is a preliminary framework for understanding, positioning, and applying perceptual robust design. Product quality is a topic...... been presented. Therefore, this study set out to contribute to the understanding and application of perceptual robust design. To achieve this, a state-of-the-art and current practice review was performed. From the review two main research problems were identified. Firstly, a lack of tools...... for perceptual robustness was found to overlap with the optimum for functional robustness and at most approximately 2.2% out of the 14.74% could be ascribed solely to the perceptual robustness optimisation. In conclusion, the thesis have offered a new perspective on robust design by merging robust design...

  6. Brain Plasticity in Speech Training in Native English Speakers Learning Mandarin Tones

    Science.gov (United States)

    Heinzen, Christina Carolyn

    The current study employed behavioral and event-related potential (ERP) measures to investigate brain plasticity associated with second-language (L2) phonetic learning based on an adaptive computer training program. The program utilized the acoustic characteristics of Infant-Directed Speech (IDS) to train monolingual American English-speaking listeners to perceive Mandarin lexical tones. Behavioral identification and discrimination tasks were conducted using naturally recorded speech, carefully controlled synthetic speech, and non-speech control stimuli. The ERP experiments were conducted with selected synthetic speech stimuli in a passive listening oddball paradigm. Identical pre- and post- tests were administered on nine adult listeners, who completed two-to-three hours of perceptual training. The perceptual training sessions used pair-wise lexical tone identification, and progressed through seven levels of difficulty for each tone pair. The levels of difficulty included progression in speaker variability from one to four speakers and progression through four levels of acoustic exaggeration of duration, pitch range, and pitch contour. Behavioral results for the natural speech stimuli revealed significant training-induced improvement in identification of Tones 1, 3, and 4. Improvements in identification of Tone 4 generalized to novel stimuli as well. Additionally, comparison between discrimination of across-category and within-category stimulus pairs taken from a synthetic continuum revealed a training-induced shift toward more native-like categorical perception of the Mandarin lexical tones. Analysis of the Mismatch Negativity (MMN) responses in the ERP data revealed increased amplitude and decreased latency for pre-attentive processing of across-category discrimination as a result of training. There were also laterality changes in the MMN responses to the non-speech control stimuli, which could reflect reallocation of brain resources in processing pitch patterns

  7. Robustness Evaluation of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; čizmar, D.

    2010-01-01

    The present paper outlines results from working group 3 (WG3) in the EU COST Action E55 – ‘Modelling of the performance of timber structures’. The objectives of the project are related to the three main research activities: the identification and modelling of relevant load and environmental...... exposure scenarios, the improvement of knowledge concerning the behaviour of timber structural elements and the development of a generic framework for the assessment of the life-cycle vulnerability and robustness of timber structures....

  8. An introduction to application-independent evaluation of speaker recognition systems

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Brümmer, N.

    2007-01-01

    In the evaluation of speaker recognition systems - an important part of speaker classification [1], the trade-off between missed speakers and false alarms has always been an important diagnostic tool. NIST has defined the task of speaker detection with the associated Detection Cost Function (DCF) to

  9. Defining robustness protocols: a method to include and evaluate robustness in clinical plans

    International Nuclear Information System (INIS)

    McGowan, S E; Albertini, F; Lomax, A J; Thomas, S J

    2015-01-01

    We aim to define a site-specific robustness protocol to be used during the clinical plan evaluation process. Plan robustness of 16 skull base IMPT plans to systematic range and random set-up errors have been retrospectively and systematically analysed. This was determined by calculating the error-bar dose distribution (ebDD) for all the plans and by defining some metrics used to define protocols aiding the plan assessment. Additionally, an example of how to clinically use the defined robustness database is given whereby a plan with sub-optimal brainstem robustness was identified. The advantage of using different beam arrangements to improve the plan robustness was analysed. Using the ebDD it was found range errors had a smaller effect on dose distribution than the corresponding set-up error in a single fraction, and that organs at risk were most robust to the range errors, whereas the target was more robust to set-up errors. A database was created to aid planners in terms of plan robustness aims in these volumes. This resulted in the definition of site-specific robustness protocols. The use of robustness constraints allowed for the identification of a specific patient that may have benefited from a treatment of greater individuality. A new beam arrangement showed to be preferential when balancing conformality and robustness for this case. The ebDD and error-bar volume histogram proved effective in analysing plan robustness. The process of retrospective analysis could be used to establish site-specific robustness planning protocols in proton therapy. These protocols allow the planner to determine plans that, although delivering a dosimetrically adequate dose distribution, have resulted in sub-optimal robustness to these uncertainties. For these cases the use of different beam start conditions may improve the plan robustness to set-up and range uncertainties. (paper)

  10. Defining robustness protocols: a method to include and evaluate robustness in clinical plans

    Science.gov (United States)

    McGowan, S. E.; Albertini, F.; Thomas, S. J.; Lomax, A. J.

    2015-04-01

    We aim to define a site-specific robustness protocol to be used during the clinical plan evaluation process. Plan robustness of 16 skull base IMPT plans to systematic range and random set-up errors have been retrospectively and systematically analysed. This was determined by calculating the error-bar dose distribution (ebDD) for all the plans and by defining some metrics used to define protocols aiding the plan assessment. Additionally, an example of how to clinically use the defined robustness database is given whereby a plan with sub-optimal brainstem robustness was identified. The advantage of using different beam arrangements to improve the plan robustness was analysed. Using the ebDD it was found range errors had a smaller effect on dose distribution than the corresponding set-up error in a single fraction, and that organs at risk were most robust to the range errors, whereas the target was more robust to set-up errors. A database was created to aid planners in terms of plan robustness aims in these volumes. This resulted in the definition of site-specific robustness protocols. The use of robustness constraints allowed for the identification of a specific patient that may have benefited from a treatment of greater individuality. A new beam arrangement showed to be preferential when balancing conformality and robustness for this case. The ebDD and error-bar volume histogram proved effective in analysing plan robustness. The process of retrospective analysis could be used to establish site-specific robustness planning protocols in proton therapy. These protocols allow the planner to determine plans that, although delivering a dosimetrically adequate dose distribution, have resulted in sub-optimal robustness to these uncertainties. For these cases the use of different beam start conditions may improve the plan robustness to set-up and range uncertainties.

  11. A fully robust PARAFAC method for analyzing fluorescence data

    DEFF Research Database (Denmark)

    Engelen, Sanne; Frosch, Stina; Jørgensen, Bo

    2009-01-01

    and Rayleigh scatter. Recently, a robust PARAFAC method that circumvents the harmful effects of outlying samples has been developed. For removing the scatter effects on the final PARAFAC model, different techniques exist. Newly, an automated scatter identification tool has been constructed. However......, there still exists no robust method for handling fluorescence data encountering both outlying EEM landscapes and scatter. In this paper, we present an iterative algorithm where the robust PARAFAC method and the scatter identification tool are alternately performed. A fully automated robust PARAFAC method...

  12. Robustness of Structural Systems

    DEFF Research Database (Denmark)

    Canisius, T.D.G.; Sørensen, John Dalsgaard; Baker, J.W.

    2007-01-01

    The importance of robustness as a property of structural systems has been recognised following several structural failures, such as that at Ronan Point in 1968,where the consequenceswere deemed unacceptable relative to the initiating damage. A variety of research efforts in the past decades have...... attempted to quantify aspects of robustness such as redundancy and identify design principles that can improve robustness. This paper outlines the progress of recent work by the Joint Committee on Structural Safety (JCSS) to develop comprehensive guidance on assessing and providing robustness in structural...... systems. Guidance is provided regarding the assessment of robustness in a framework that considers potential hazards to the system, vulnerability of system components, and failure consequences. Several proposed methods for quantifying robustness are reviewed, and guidelines for robust design...

  13. Robust multivariate analysis

    CERN Document Server

    J Olive, David

    2017-01-01

    This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given.  The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory.   The robust techniques  are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis.  A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...

  14. Performance of wavelet analysis and neural networks for pathological voices identification

    Science.gov (United States)

    Salhi, Lotfi; Talbi, Mourad; Abid, Sabeur; Cherif, Adnane

    2011-09-01

    Within the medical environment, diverse techniques exist to assess the state of the voice of the patient. The inspection technique is inconvenient for a number of reasons, such as its high cost, the duration of the inspection, and above all, the fact that it is an invasive technique. This study focuses on a robust, rapid and accurate system for automatic identification of pathological voices. This system employs non-invasive, non-expensive and fully automated method based on hybrid approach: wavelet transform analysis and neural network classifier. First, we present the results obtained in our previous study while using classic feature parameters. These results allow visual identification of pathological voices. Second, quantified parameters drifting from the wavelet analysis are proposed to characterise the speech sample. On the other hand, a system of multilayer neural networks (MNNs) has been developed which carries out the automatic detection of pathological voices. The developed method was evaluated using voice database composed of recorded voice samples (continuous speech) from normophonic or dysphonic speakers. The dysphonic speakers were patients of a National Hospital 'RABTA' of Tunis Tunisia and a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for discrimination of normal and pathological voices using the proposed parameters and neural network classifier. We also compared the average classification rate based on the MNN, Gaussian mixture model and support vector machines.

  15. Fluency profile: comparison between Brazilian and European Portuguese speakers.

    Science.gov (United States)

    Castro, Blenda Stephanie Alves e; Martins-Reis, Vanessa de Oliveira; Baptista, Ana Catarina; Celeste, Letícia Correa

    2014-01-01

    The purpose of the study was to compare the speech fluency of Brazilian Portuguese speakers with that of European Portuguese speakers. The study participants were 76 individuals of any ethnicity or skin color aged 18-29 years. Of the participants, 38 lived in Brazil and 38 in Portugal. Speech samples from all participants were obtained and analyzed according to the variables of typology and frequency of speech disruptions and speech rate. Descriptive and inferential statistical analyses were performed to assess the association between the fluency profile and linguistic variant variables. We found that the speech rate of European Portuguese speakers was higher than the speech rate of Brazilian Portuguese speakers in words per minute (p=0.004). The qualitative distribution of the typology of common dysfluencies (pPortuguese speakers is not available, speech therapists in Portugal can use the same speech fluency assessment as has been used in Brazil to establish a diagnosis of stuttering, especially in regard to typical and stuttering dysfluencies, with care taken when evaluating the speech rate.

  16. THE HUMOROUS SPEAKER: THE CONSTRUCTION OF ETHOS IN COMEDY

    Directory of Open Access Journals (Sweden)

    Maria Flávia Figueiredo

    2016-07-01

    Full Text Available The rhetoric is guided by three dimensions: logos, pathos and ethos. Logos is the speech itself, pathos are the passions that the speaker, through logos, awakens in his audience, and ethos is the image that the speaker creates of himself, also through logos, in front of an audience. The rhetorical genres are three: deliberative (which drives the audience or the judge to think about future events, characterizing them as convenient or harmful, judiciary (the audience thinks about past events in order to classify them as fair or unfair and epidictic (the audience will judge any fact occurred, or even the character of a person as beautiful or not. According to Figueiredo (2014 and based on Eggs (2005, we advocate that ethos is not a mark left by the speaker only in rhetorical genres, but in any textual genre, once the result of human production, the simplest choices in textual construction, are able to reproduce something that is closely linked to speaker, thus, demarcating hir/her ethos. To verify this assumption, we selected a display of a video of the comedian Danilo Gentili, which will be examined in the light of Rhetoric and Textual Linguistics. So, our objective is to find, in the stand-up comedy genre, marks left by the speaker in the speech that characterizes his/her ethos. The analysis results show that ethos, discursive genre and communicational purpose amalgamate in an indissoluble complex in which the success of one of them interdepends on how the other was built.

  17. Direct Speaker Gaze Promotes Trust in Truth-Ambiguous Statements.

    Science.gov (United States)

    Kreysa, Helene; Kessler, Luise; Schweinberger, Stefan R

    2016-01-01

    A speaker's gaze behaviour can provide perceivers with a multitude of cues which are relevant for communication, thus constituting an important non-verbal interaction channel. The present study investigated whether direct eye gaze of a speaker affects the likelihood of listeners believing truth-ambiguous statements. Participants were presented with videos in which a speaker produced such statements with either direct or averted gaze. The statements were selected through a rating study to ensure that participants were unlikely to know a-priori whether they were true or not (e.g., "sniffer dogs cannot smell the difference between identical twins"). Participants indicated in a forced-choice task whether or not they believed each statement. We found that participants were more likely to believe statements by a speaker looking at them directly, compared to a speaker with averted gaze. Moreover, when participants disagreed with a statement, they were slower to do so when the statement was uttered with direct (compared to averted) gaze, suggesting that the process of rejecting a statement as untrue may be inhibited when that statement is accompanied by direct gaze.

  18. Direct Speaker Gaze Promotes Trust in Truth-Ambiguous Statements.

    Directory of Open Access Journals (Sweden)

    Helene Kreysa

    Full Text Available A speaker's gaze behaviour can provide perceivers with a multitude of cues which are relevant for communication, thus constituting an important non-verbal interaction channel. The present study investigated whether direct eye gaze of a speaker affects the likelihood of listeners believing truth-ambiguous statements. Participants were presented with videos in which a speaker produced such statements with either direct or averted gaze. The statements were selected through a rating study to ensure that participants were unlikely to know a-priori whether they were true or not (e.g., "sniffer dogs cannot smell the difference between identical twins". Participants indicated in a forced-choice task whether or not they believed each statement. We found that participants were more likely to believe statements by a speaker looking at them directly, compared to a speaker with averted gaze. Moreover, when participants disagreed with a statement, they were slower to do so when the statement was uttered with direct (compared to averted gaze, suggesting that the process of rejecting a statement as untrue may be inhibited when that statement is accompanied by direct gaze.

  19. Quantile Acoustic Vectors vs. MFCC Applied to Speaker Verification

    Directory of Open Access Journals (Sweden)

    Mayorga-Ortiz Pedro

    2014-02-01

    Full Text Available In this paper we describe speaker and command recognition related experiments, through quantile vectors and Gaussian Mixture Modelling (GMM. Over the past several years GMM and MFCC have become two of the dominant approaches for modelling speaker and speech recognition applications. However, memory and computational costs are important drawbacks, because autonomous systems suffer processing and power consumption constraints; thus, having a good trade-off between accuracy and computational requirements is mandatory. We decided to explore another approach (quantile vectors in several tasks and a comparison with MFCC was made. Quantile acoustic vectors are proposed for speaker verification and command recognition tasks and the results showed very good recognition efficiency. This method offered a good trade-off between computation times, characteristics vector complexity and overall achieved efficiency.

  20. Analysis of Feature Extraction Methods for Speaker Dependent Speech Recognition

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2017-02-01

    Full Text Available Speech recognition is about what is being said, irrespective of who is saying. Speech recognition is a growing field. Major progress is taking place on the technology of automatic speech recognition (ASR. Still, there are lots of barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent etc. Speech recognition rate mainly depends on the selection of features and feature extraction methods. This paper outlines the feature extraction techniques for speaker dependent speech recognition for isolated words. A brief survey of different feature extraction techniques like Mel-Frequency Cepstral Coefficients (MFCC, Linear Predictive Coding Coefficients (LPCC, Perceptual Linear Prediction (PLP, Relative Spectra Perceptual linear Predictive (RASTA-PLP analysis are presented and evaluation is done. Speech recognition has various applications from daily use to commercial use. We have made a speaker dependent system and this system can be useful in many areas like controlling a patient vehicle using simple commands.

  1. Robustness of Structures

    DEFF Research Database (Denmark)

    Faber, Michael Havbro; Vrouwenvelder, A.C.W.M.; Sørensen, John Dalsgaard

    2011-01-01

    In 2005, the Joint Committee on Structural Safety (JCSS) together with Working Commission (WC) 1 of the International Association of Bridge and Structural Engineering (IABSE) organized a workshop on robustness of structures. Two important decisions resulted from this workshop, namely...... ‘COST TU0601: Robustness of Structures’ was initiated in February 2007, aiming to provide a platform for exchanging and promoting research in the area of structural robustness and to provide a basic framework, together with methods, strategies and guidelines enhancing robustness of structures...... the development of a joint European project on structural robustness under the COST (European Cooperation in Science and Technology) programme and the decision to develop a more elaborate document on structural robustness in collaboration between experts from the JCSS and the IABSE. Accordingly, a project titled...

  2. Robust Growth Determinants

    OpenAIRE

    Doppelhofer, Gernot; Weeks, Melvyn

    2011-01-01

    This paper investigates the robustness of determinants of economic growth in the presence of model uncertainty, parameter heterogeneity and outliers. The robust model averaging approach introduced in the paper uses a flexible and parsi- monious mixture modeling that allows for fat-tailed errors compared to the normal benchmark case. Applying robust model averaging to growth determinants, the paper finds that eight out of eighteen variables found to be significantly related to economic growth ...

  3. Robust Programming by Example

    OpenAIRE

    Bishop , Matt; Elliott , Chip

    2011-01-01

    Part 2: WISE 7; International audience; Robust programming lies at the heart of the type of coding called “secure programming”. Yet it is rarely taught in academia. More commonly, the focus is on how to avoid creating well-known vulnerabilities. While important, that misses the point: a well-structured, robust program should anticipate where problems might arise and compensate for them. This paper discusses one view of robust programming and gives an example of how it may be taught.

  4. Wavelet packet transform-based robust video watermarking technique

    Indian Academy of Sciences (India)

    If any conflict happens to the copyright identification and authentication, ... the present work is concentrated on the robust digital video watermarking. .... the wavelet decomposition, resulting in a new family of orthonormal bases for function ...

  5. "Necesita una vacuna": what Spanish-speakers want in text-message immunization reminders.

    Science.gov (United States)

    Ahlers-Schmidt, Carolyn R; Chesser, Amy; Brannon, Jennifer; Lopez, Venessa; Shah-Haque, Sapna; Williams, Katherine; Hart, Traci

    2013-08-01

    Appointment reminders help parents deal with complex immunization schedules. Preferred content of text-message reminders has been identified for English-speakers. Spanish-speaking parents of children under three years old were recruited to develop Spanish text-message immunization reminders. Structured interviews included questions about demographic characteristics, use of technology, and willingness to receive text reminders. Each participant was assigned to one user-centered design (UCD) test: card sort, needs analysis or comprehension testing. Respondents (N=54) were female (70%) and averaged 27 years of age (SD=7). A card sort of 20 immunization-related statements resulted in identification of seven pieces of critical information, which were compiled into eight example texts. These texts were ranked in the needs assessment and the top two were assessed for comprehension. All participants were able to understand the content and describe intention to act. Utilizing UCD testing, Spanish-speakers identified short, specific text content that differed from preferred content of English-speaking parents.

  6. Robust procedures in chemometrics

    DEFF Research Database (Denmark)

    Kotwa, Ewelina

    properties of the analysed data. The broad theoretical background of robust procedures was given as a very useful supplement to the classical methods, and a new tool, based on robust PCA, aiming at identifying Rayleigh and Raman scatters in excitation-mission (EEM) data was developed. The results show...

  7. Online Matchmaking: It's Not Just for Dating Sites Anymore! Connecting the Climate Voices Science Speakers Network to Educators

    Science.gov (United States)

    Wegner, Kristin; Herrin, Sara; Schmidt, Cynthia

    2015-01-01

    Scientists play an integral role in the development of climate literacy skills - for both teachers and students alike. By partnering with local scientists, teachers can gain valuable insights into the science practices highlighted by the Next Generation Science Standards (NGSS), as well as a deeper understanding of cutting-edge scientific discoveries and local impacts of climate change. For students, connecting to local scientists can provide a relevant connection to climate science and STEM skills. Over the past two years, the Climate Voices Science Speakers Network (climatevoices.org) has grown to a robust network of nearly 400 climate science speakers across the United States. Formal and informal educators, K-12 students, and community groups connect with our speakers through our interactive map-based website and invite them to meet through face-to-face and virtual presentations, such as webinars and podcasts. But creating a common language between scientists and educators requires coaching on both sides. In this presentation, we will present the "nitty-gritty" of setting up scientist-educator collaborations, as well as the challenges and opportunities that arise from these partnerships. We will share the impact of these collaborations through case studies, including anecdotal feedback and metrics.

  8. The Space-Time Topography of English Speakers

    Science.gov (United States)

    Duman, Steve

    2016-01-01

    English speakers talk and think about Time in terms of physical space. The past is behind us, and the future is in front of us. In this way, we "map" space onto Time. This dissertation addresses the specificity of this physical space, or its topography. Inspired by languages like Yupno (Nunez, et al., 2012) and Bamileke-Dschang (Hyman,…

  9. Within the School and the Community--A Speaker's Bureau.

    Science.gov (United States)

    McClintock, Joy H.

    Student interest prompted the formation of a Speaker's Bureau in Seminole Senior High School, Seminole, Florida. First, students compiled a list of community contacts, including civic clubs, churches, retirement villages, newspaper offices, and the County School Administration media center. A letter of introduction was composed and speaking…

  10. Gesturing by Speakers with Aphasia: How Does It Compare?

    Science.gov (United States)

    Mol, Lisette; Krahmer, Emiel; van de Sandt-Koenderman, Mieke

    2013-01-01

    Purpose: To study the independence of gesture and verbal language production. The authors assessed whether gesture can be semantically compensatory in cases of verbal language impairment and whether speakers with aphasia and control participants use similar depiction techniques in gesture. Method: The informativeness of gesture was assessed in 3…

  11. Sentence comprehension in Swahili-English bilingual agrammatic speakers

    NARCIS (Netherlands)

    Abuom, Tom O.; Shah, Emmah; Bastiaanse, Roelien

    For this study, sentence comprehension was tested in Swahili-English bilingual agrammatic speakers. The sentences were controlled for four factors: (1) order of the arguments (base vs. derived); (2) embedding (declarative vs. relative sentences); (3) overt use of the relative pronoun "who"; (4)

  12. Schizophrenia among Sesotho speakers in South Africa | Mosotho ...

    African Journals Online (AJOL)

    Results: Core symptoms of schizophrenia among Sesotho speakers do not differ significantly from other cultures. However, the content of psychological symptoms such as delusions and hallucinations is strongly affected by cultural variables. Somatic symptoms such as headaches, palpitations, dizziness and excessive ...

  13. Native Speakers' Perception of Non-Native English Speech

    Science.gov (United States)

    Jaber, Maysa; Hussein, Riyad F.

    2011-01-01

    This study is aimed at investigating the rating and intelligibility of different non-native varieties of English, namely French English, Japanese English and Jordanian English by native English speakers and their attitudes towards these foreign accents. To achieve the goals of this study, the researchers used a web-based questionnaire which…

  14. Openings and Closings in Telephone Conversations between Native Spanish Speakers.

    Science.gov (United States)

    Coronel-Molina, Serafin M.

    1998-01-01

    A study analyzed the opening and closing sequences of 11 dyads of native Spanish-speakers in natural telephone conversations conducted in Spanish. The objective was to determine how closely Hispanic cultural patterns of conduct for telephone conversations follow the sequences outlined in previous research. It is concluded that Spanish…

  15. Why reference to the past is difficult for agrammatic speakers

    NARCIS (Netherlands)

    Bastiaanse, Roelien

    Many studies have shown that verb inflections are difficult to produce for agrammatic aphasic speakers: they are frequently omitted and substituted. The present article gives an overview of our search to understanding why this is the case. The hypothesis is that grammatical morphology referring to

  16. Speaker Recognition from Emotional Speech Using I-vector Approach

    Directory of Open Access Journals (Sweden)

    MACKOVÁ Lenka

    2014-05-01

    Full Text Available In recent years the concept of i-vectors become very popular and successful in the field of the speaker verification. The basic principle of i-vectors is that each utterance is represented by fixed-length feature vector of low-dimension. In the literature for purpose of speaker verification various recordings obtained from telephones or microphones were used. The aim of this experiment was to perform speaker verification using speaker model trained with emotional recordings on i-vector basis. The Mel Frequency Cepstral Coefficients (MFCC, log energy, their deltas and acceleration coefficients were used in process of features extraction. As the classification methods of the verification system Mahalanobis distance metric in combination with Eigen Factor Radial normalization was used and in the second approach Cosine Distance Scoring (CSS metric with Within-class Covariance Normalization as a channel compensation was employed. This verification system used emotional recordings of male subjects from freely available German emotional database (Emo-DB.

  17. Teaching the Native English Speaker How to Teach English

    Science.gov (United States)

    Odhuu, Kelli

    2014-01-01

    This article speaks to teachers who have been paired with native speakers (NSs) who have never taught before, and the feelings of frustration, discouragement, and nervousness on the teacher's behalf that can occur as a result. In order to effectively tackle this situation, teachers need to work together with the NSs. Teachers in this scenario…

  18. Google Home: smart speaker as environmental control unit.

    Science.gov (United States)

    Noda, Kenichiro

    2017-08-23

    Environmental Control Units (ECU) are devices or a system that allows a person to control appliances in their home or work environment. Such system can be utilized by clients with physical and/or functional disability to enhance their ability to control their environment, to promote independence and improve their quality of life. Over the last several years, there have been an emergence of several inexpensive, commercially-available, voice activated smart speakers into the market such as Google Home and Amazon Echo. These smart speakers are equipped with far field microphone that supports voice recognition, and allows for complete hand-free operation for various purposes, including for playing music, for information retrieval, and most importantly, for environmental control. Clients with disability could utilize these features to turn the unit into a simple ECU that is completely voice activated and wirelessly connected to appliances. Smart speakers, with their ease of setup, low cost and versatility, may be a more affordable and accessible alternative to the traditional ECU. Implications for Rehabilitation Environmental Control Units (ECU) enable independence for physically and functionally disabled clients, and reduce burden and frequency of demands on carers. Traditional ECU can be costly and may require clients to learn specialized skills to use. Smart speakers have the potential to be used as a new-age ECU by overcoming these barriers, and can be used by a wider range of clients.

  19. Evidential Uses in the Spanish of Quechua Speakers in Peru.

    Science.gov (United States)

    Escobar, Anna Maria

    1994-01-01

    Analysis of recordings of spontaneous speech of native speakers of Quechua speaking Spanish as a second language reveals that, using verbal morphological resources of Spanish, they have grammaticalized an epistemic marking system resembling that of Quechua. Sources of this process in both Quechua and Spanish are analyzed. (MSE)

  20. Bilingual and Monolingual Children Prefer Native-Accented Speakers

    Directory of Open Access Journals (Sweden)

    Andre L. eSouza

    2013-12-01

    Full Text Available Adults and young children prefer to affiliate with some individuals rather than others. Studies have shown that monolingual children show in-group biases for individuals who speak their native language without a foreign accent (Kinzler, Dupoux, & Spelke, 2007. Some studies have suggested that bilingual children are less influenced than monolinguals by language variety when attributing personality traits to different speakers (Anisfeld & Lambert, 1964, which could indicate that bilinguals have fewer in-group biases and perhaps greater social flexibility. However, no previous studies have compared monolingual and bilingual children’s reactions to speakers with unfamiliar foreign accents. In the present study, we investigated the social preferences of 5-year-old English and French monolinguals and English-French bilinguals. Contrary to our predictions, both monolingual and bilingual preschoolers preferred to be friends with native-accented speakers over speakers who spoke their dominant language with an unfamiliar foreign accent. This result suggests that both monolingual and bilingual children have strong preferences for in-group members who use a familiar language variety, and that bilingualism does not lead to generalized social flexibility.

  1. Umesh V Waghmare | Speakers | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    Umesh V Waghmare. Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur P.O., Bangalore 560 064, ... These ideas apply quite well to dynamical structure of a crystal, as described by the dispersion of its phonons or vibrational waves. The speakers group has shown an interesting ...

  2. An evidence-based rehabilitation program for tracheoesophageal speakers

    NARCIS (Netherlands)

    Jongmans, P.; Rossum, M.; As-Brooks, C.; Hilgers, F.; Pols, L.; Hilgers, F.J.M.; Pols, L.C.W.; van Rossum, M.; van den Brekel, M.W.M.

    2008-01-01

    Objectives: to develop an evidence-based therapy program aimed at improving tracheoesophageal speech intelligibility. The therapy program is based on particular problems found for TE speakers in a previous study as performed by the authors. Patients/Materials and Methods: 9 male laryngectomized

  3. The Blame Game: Performance Analysis of Speaker Diarization System Components

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; Wooters, Chuck

    2007-01-01

    In this paper we discuss the performance analysis of a speaker diarization system similar to the system that was submitted by ICSI at the NIST RT06s evaluation benchmark. The analysis that is based on a series of oracle experiments, provides a good understanding of the performance of each system

  4. Bilingual and monolingual children prefer native-accented speakers.

    Science.gov (United States)

    Souza, André L; Byers-Heinlein, Krista; Poulin-Dubois, Diane

    2013-01-01

    Adults and young children prefer to affiliate with some individuals rather than others. Studies have shown that monolingual children show in-group biases for individuals who speak their native language without a foreign accent (Kinzler et al., 2007). Some studies have suggested that bilingual children are less influenced than monolinguals by language variety when attributing personality traits to different speakers (Anisfeld and Lambert, 1964), which could indicate that bilinguals have fewer in-group biases and perhaps greater social flexibility. However, no previous studies have compared monolingual and bilingual children's reactions to speakers with unfamiliar foreign accents. In the present study, we investigated the social preferences of 5-year-old English and French monolinguals and English-French bilinguals. Contrary to our predictions, both monolingual and bilingual preschoolers preferred to be friends with native-accented speakers over speakers who spoke their dominant language with an unfamiliar foreign accent. This result suggests that both monolingual and bilingual children have strong preferences for in-group members who use a familiar language variety, and that bilingualism does not lead to generalized social flexibility.

  5. Evaluating acoustic speaker normalization algorithms: evidence from longitudinal child data.

    Science.gov (United States)

    Kohn, Mary Elizabeth; Farrington, Charlie

    2012-03-01

    Speaker vowel formant normalization, a technique that controls for variation introduced by physical differences between speakers, is necessary in variationist studies to compare speakers of different ages, genders, and physiological makeup in order to understand non-physiological variation patterns within populations. Many algorithms have been established to reduce variation introduced into vocalic data from physiological sources. The lack of real-time studies tracking the effectiveness of these normalization algorithms from childhood through adolescence inhibits exploration of child participation in vowel shifts. This analysis compares normalization techniques applied to data collected from ten African American children across five time points. Linear regressions compare the reduction in variation attributable to age and gender for each speaker for the vowels BEET, BAT, BOT, BUT, and BOAR. A normalization technique is successful if it maintains variation attributable to a reference sociolinguistic variable, while reducing variation attributable to age. Results indicate that normalization techniques which rely on both a measure of central tendency and range of the vowel space perform best at reducing variation attributable to age, although some variation attributable to age persists after normalization for some sections of the vowel space. © 2012 Acoustical Society of America

  6. Reading and Vocabulary Recommendations for Spanish for Native Speakers Materials.

    Science.gov (United States)

    Spencer, Laura Gutierrez

    1995-01-01

    Focuses on the need for appropriate materials to address the needs of native speakers of Spanish who study Spanish in American universities and high schools. The most important factors influencing the selection of readings should include the practical nature of themes for reading and vocabulary development, level of difficulty, and variety in…

  7. Application of Native Speaker Models for Identifying Deviations in Rhetorical Moves in Non-Native Speaker Manuscripts

    Directory of Open Access Journals (Sweden)

    Assef Khalili

    2016-06-01

    Full Text Available Introduction: Explicit teaching of generic conventions of a text genre, usually extracted from native-speaker (NS manuscripts, has long been emphasized in the teaching of Academic Writing inEnglish for Specific Purposes (henceforthESP classes, both in theory and practice. While consciousness-raising about rhetorical structure can be instrumental to non-native speakers(NNS, it has to be admitted that most works done in the field of ESP have tended to focus almost exclusively on native-speaker (NS productions, giving scant attention to non-native speaker (NNS manuscripts. That is, having outlined established norms for good writing on the basis of NS productions, few have been inclined to provide a descriptive account of NNS attempts at trying to produce a research article (RA in English. That is what we have tried to do in the present research. Methods: We randomly selected 20 RAs in dentistry and used two well-established models for results and discussion sections to try to describe the move structure of these articles and show the points of divergence from the established norms. Results: The results pointed to significant divergences that could seriously compromise the quality of an RA. Conclusion: It is believed that the insights gained on the deviations in NNS manuscripts could prove very useful in designing syllabi for ESP classes.

  8. Robustness Beamforming Algorithms

    Directory of Open Access Journals (Sweden)

    Sajad Dehghani

    2014-04-01

    Full Text Available Adaptive beamforming methods are known to degrade in the presence of steering vector and covariance matrix uncertinity. In this paper, a new approach is presented to robust adaptive minimum variance distortionless response beamforming make robust against both uncertainties in steering vector and covariance matrix. This method minimize a optimization problem that contains a quadratic objective function and a quadratic constraint. The optimization problem is nonconvex but is converted to a convex optimization problem in this paper. It is solved by the interior-point method and optimum weight vector to robust beamforming is achieved.

  9. Comparative Analysys of Speech Parameters for the Design of Speaker Verification Systems

    National Research Council Canada - National Science Library

    Souza, A

    2001-01-01

    Speaker verification systems are basically composed of three stages: feature extraction, feature processing and comparison of the modified features from speaker voice and from the voice that should be...

  10. Key-note speaker: Predictors of weight loss after preventive Health consultations

    DEFF Research Database (Denmark)

    Lous, Jørgen; Freund, Kirsten S

    2018-01-01

    Invited key-note speaker ved conferencen: Preventive Medicine and Public Health Conference 2018, July 16-17, London.......Invited key-note speaker ved conferencen: Preventive Medicine and Public Health Conference 2018, July 16-17, London....

  11. Robustness Metrics: Consolidating the multiple approaches to quantify Robustness

    DEFF Research Database (Denmark)

    Göhler, Simon Moritz; Eifler, Tobias; Howard, Thomas J.

    2016-01-01

    robustness metrics; 3) Functional expectancy and dispersion robustness metrics; and 4) Probability of conformance robustness metrics. The goal was to give a comprehensive overview of robustness metrics and guidance to scholars and practitioners to understand the different types of robustness metrics...

  12. Robustness of Structures

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2008-01-01

    This paper describes the background of the robustness requirements implemented in the Danish Code of Practice for Safety of Structures and in the Danish National Annex to the Eurocode 0, see (DS-INF 146, 2003), (DS 409, 2006), (EN 1990 DK NA, 2007) and (Sørensen and Christensen, 2006). More...... frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure combined with increased requirements to efficiency in design and execution followed by increased risk of human errors has made the need of requirements to robustness of new structures essential....... According to Danish design rules robustness shall be documented for all structures in high consequence class. The design procedure to document sufficient robustness consists of: 1) Review of loads and possible failure modes / scenarios and determination of acceptable collapse extent; 2) Review...

  13. Robustness of structures

    DEFF Research Database (Denmark)

    Vrouwenvelder, T.; Sørensen, John Dalsgaard

    2009-01-01

    After the collapse of the World Trade Centre towers in 2001 and a number of collapses of structural systems in the beginning of the century, robustness of structural systems has gained renewed interest. Despite many significant theoretical, methodical and technological advances, structural...... of robustness for structural design such requirements are not substantiated in more detail, nor have the engineering profession been able to agree on an interpretation of robustness which facilitates for its uantification. A European COST action TU 601 on ‘Robustness of structures' has started in 2007...... by a group of members of the CSS. This paper describes the ongoing work in this action, with emphasis on the development of a theoretical and risk based quantification and optimization procedure on the one side and a practical pre-normative guideline on the other....

  14. Robust Approaches to Forecasting

    OpenAIRE

    Jennifer Castle; David Hendry; Michael P. Clements

    2014-01-01

    We investigate alternative robust approaches to forecasting, using a new class of robust devices, contrasted with equilibrium correction models. Their forecasting properties are derived facing a range of likely empirical problems at the forecast origin, including measurement errors, implulses, omitted variables, unanticipated location shifts and incorrectly included variables that experience a shift. We derive the resulting forecast biases and error variances, and indicate when the methods ar...

  15. Robustness - theoretical framework

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard; Rizzuto, Enrico; Faber, Michael H.

    2010-01-01

    More frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure combined with increased requirements to efficiency in design and execution followed by increased risk of human errors has made the need of requirements to robustness of new struct...... of this fact sheet is to describe a theoretical and risk based framework to form the basis for quantification of robustness and for pre-normative guidelines....

  16. Qualitative Robustness in Estimation

    Directory of Open Access Journals (Sweden)

    Mohammed Nasser

    2012-07-01

    Full Text Available Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Times New Roman","serif";} Qualitative robustness, influence function, and breakdown point are three main concepts to judge an estimator from the viewpoint of robust estimation. It is important as well as interesting to study relation among them. This article attempts to present the concept of qualitative robustness as forwarded by first proponents and its later development. It illustrates intricacies of qualitative robustness and its relation with consistency, and also tries to remove commonly believed misunderstandings about relation between influence function and qualitative robustness citing some examples from literature and providing a new counter-example. At the end it places a useful finite and a simulated version of   qualitative robustness index (QRI. In order to assess the performance of the proposed measures, we have compared fifteen estimators of correlation coefficient using simulated as well as real data sets.

  17. Race in Conflict with Heritage: "Black" Heritage Language Speaker of Japanese

    Science.gov (United States)

    Doerr, Neriko Musha; Kumagai, Yuri

    2014-01-01

    "Heritage language speaker" is a relatively new term to denote minority language speakers who grew up in a household where the language was used or those who have a family, ancestral, or racial connection to the minority language. In research on heritage language speakers, overlap between these 2 definitions is often assumed--that is,…

  18. Defining "Native Speaker" in Multilingual Settings: English as a Native Language in Asia

    Science.gov (United States)

    Hansen Edwards, Jette G.

    2017-01-01

    The current study examines how and why speakers of English from multilingual contexts in Asia are identifying as native speakers of English. Eighteen participants from different contexts in Asia, including Singapore, Malaysia, India, Taiwan, and The Philippines, who self-identified as native speakers of English participated in hour-long interviews…

  19. Children's Understanding That Utterances Emanate from Minds: Using Speaker Belief To Aid Interpretation.

    Science.gov (United States)

    Mitchell, Peter; Robinson, Elizabeth J.; Thompson, Doreen E.

    1999-01-01

    Three experiments examined 3- to 6-year olds' ability to use a speaker's utterance based on false belief to identify which of several referents was intended. Found that many 4- to 5-year olds performed correctly only when it was unnecessary to consider the speaker's belief. When the speaker gave an ambiguous utterance, many 3- to 6-year olds…

  20. The (TNO) Speaker Diarization System for NIST Rich Transcription Evaluation 2005 for meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van

    2005-01-01

    Abstract. The TNO speaker speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since for the NIST Rich Transcription speaker dizarization evaluation measure correct speech detection appears to be essential, we have developed a speech activity detector (SAD) as

  1. The TNO speaker diarization system for NIST RT05s meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van

    2006-01-01

    The TNO speaker speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since for the NIST Rich Transcription speaker dizarization evaluation measure correct speech detection appears to be essential, we have developed a speech activity detector (SAD) as well.

  2. An acoustic analysis of English vowels produced by speakers of seven different native-language backgrounds

    NARCIS (Netherlands)

    Heuven, van V.J.J.P.; Gooskens, C.

    2017-01-01

    We measured F1, F2 and duration of ten English monophthongs produced by American native speakers and by Danish, Norwegian, Swedish, Dutch, Hungarian and Chinese L2 speakers. We hypothesized that (i) L2 speakers would approximate the English vowels more closely as the phonological distance between

  3. Willing Learners yet Unwilling Speakers in ESL Classrooms

    Directory of Open Access Journals (Sweden)

    Zuraidah Ali

    2007-12-01

    Full Text Available To some of us, speech production in ESL has become so natural and integral that we seem to take it for granted. We often do not even remember how we struggled through the initial process of mastering English. Unfortunately, to students who are still learning English, they seem to face myriad problems that make them appear unwilling or reluctant ESL speakers. This study will investigate this phenomenon which is very common in the ESL classroom. Setting its background on related research findings on this matter, a qualitative study was conducted among foreign students enrolled in the Intensive English Programme (IEP at Institute of Liberal Studies (IKAL, University Tenaga Nasional (UNITEN. The results will show and discuss an extent of truth behind this perplexing phenomenon: willing learners, yet unwilling speakers of ESL, in our effort to provide supportive learning cultures in second language acquisition (SLA to this group of students.

  4. Comparing headphone and speaker effects on simulated driving.

    Science.gov (United States)

    Nelson, T M; Nilsson, T H

    1990-12-01

    Twelve persons drove for three hours in an automobile simulator while listening to music at sound level 63dB over stereo headphones during one session and from a dashboard speaker during another session. They were required to steer a mountain highway, maintain a certain indicated speed, shift gears, and respond to occasional hazards. Steering and speed control were dependent on visual cues. The need to shift and the hazards were indicated by sound and vibration effects. With the headphones, the driver's average reaction time for the most complex task presented--shifting gears--was about one-third second longer than with the speaker. The use of headphones did not delay the development of subjective fatigue.

  5. Variation among heritage speakers: Sequential vs. simultaneous bilinguals

    Directory of Open Access Journals (Sweden)

    Teresa Lee

    2013-08-01

    Full Text Available This study examines the differences in the grammatical knowledge of two types of heritage speakers of Korean. Early simultaneous bilinguals are exposed to both English and the heritage language from birth, whereas early sequential bilinguals are exposed to the heritage language first and then to English upon schooling. A listening comprehension task involving relative clauses was conducted with 51 beginning-level Korean heritage speakers. The results showed that the early sequential bilinguals exhibited much more accurate knowledge than the early simultaneous bilinguals, who lacked rudimentary knowledge of Korean relative clauses. Drawing on the findings of adult and child Korean L1 data on the acquisition of relative clauses, the performance of each group is discussed with respect to attrition and incomplete acquisition of the heritage language.

  6. Facial Expression Generation from Speaker's Emotional States in Daily Conversation

    Science.gov (United States)

    Mori, Hiroki; Ohshima, Koh

    A framework for generating facial expressions from emotional states in daily conversation is described. It provides a mapping between emotional states and facial expressions, where the former is represented by vectors with psychologically-defined abstract dimensions, and the latter is coded by the Facial Action Coding System. In order to obtain the mapping, parallel data with rated emotional states and facial expressions were collected for utterances of a female speaker, and a neural network was trained with the data. The effectiveness of proposed method is verified by a subjective evaluation test. As the result, the Mean Opinion Score with respect to the suitability of generated facial expression was 3.86 for the speaker, which was close to that of hand-made facial expressions.

  7. English exposed common mistakes made by Chinese speakers

    CERN Document Server

    Hart, Steve

    2017-01-01

    Having analysed the most common English errors made in over 600 academic papers written by Chinese undergraduates, postgraduates, and researchers, Steve Hart has written an essential, practical guide specifically for the native Chinese speaker on how to write good academic English. English Exposed: Common Mistakes Made by Chinese Speakers is divided into three main sections. The first section examines errors made with verbs, nouns, prepositions, and other grammatical classes of words. The second section focuses on problems of word choice. In addition to helping the reader find the right word, it provides instruction for selecting the right style too. The third section covers a variety of other areas essential for the academic writer, such as using punctuation, adding appropriate references, referring to tables and figures, and selecting among various English date and time phrases. Using English Exposed will allow a writer to produce material where content and ideas-not language mistakes-speak the loudest.

  8. Teaching English to speakers of other languages an introduction

    CERN Document Server

    Nunan, David

    2015-01-01

    David Nunan's dynamic learner-centered teaching style has informed and inspired countless TESOL educators around the world. In this fresh, straightforward introduction to teaching English to speakers of other languages he presents teaching techniques and procedures along with the underlying theory and principles. Complex theories and research studies are explained in a clear and comprehensible, yet non-trivial, manner without trivializing them. Practical examples of how to develop teaching materials and tasks from sound principles provide rich illustrations of theoretical constructs.

  9. Neural bases of congenital amusia in tonal language speakers.

    Science.gov (United States)

    Zhang, Caicai; Peng, Gang; Shao, Jing; Wang, William S-Y

    2017-03-01

    Congenital amusia is a lifelong neurodevelopmental disorder of fine-grained pitch processing. In this fMRI study, we examined the neural bases of congenial amusia in speakers of a tonal language - Cantonese. Previous studies on non-tonal language speakers suggest that the neural deficits of congenital amusia lie in the music-selective neural circuitry in the right inferior frontal gyrus (IFG). However, it is unclear whether this finding can generalize to congenital amusics in tonal languages. Tonal language experience has been reported to shape the neural processing of pitch, which raises the question of how tonal language experience affects the neural bases of congenital amusia. To investigate this question, we examined the neural circuitries sub-serving the processing of relative pitch interval in pitch-matched Cantonese level tone and musical stimuli in 11 Cantonese-speaking amusics and 11 musically intact controls. Cantonese-speaking amusics exhibited abnormal brain activities in a widely distributed neural network during the processing of lexical tone and musical stimuli. Whereas the controls exhibited significant activation in the right superior temporal gyrus (STG) in the lexical tone condition and in the cerebellum regardless of the lexical tone and music conditions, no activation was found in the amusics in those regions, which likely reflects a dysfunctional neural mechanism of relative pitch processing in the amusics. Furthermore, the amusics showed abnormally strong activation of the right middle frontal gyrus and precuneus when the pitch stimuli were repeated, which presumably reflect deficits of attending to repeated pitch stimuli or encoding them into working memory. No significant group difference was found in the right IFG in either the whole-brain analysis or region-of-interest analysis. These findings imply that the neural deficits in tonal language speakers might differ from those in non-tonal language speakers, and overlap partly with the

  10. MTGAN: Speaker Verification through Multitasking Triplet Generative Adversarial Networks

    OpenAIRE

    Ding, Wenhao; He, Liang

    2018-01-01

    In this paper, we propose an enhanced triplet method that improves the encoding process of embeddings by jointly utilizing generative adversarial mechanism and multitasking optimization. We extend our triplet encoder with Generative Adversarial Networks (GANs) and softmax loss function. GAN is introduced for increasing the generality and diversity of samples, while softmax is for reinforcing features about speakers. For simplification, we term our method Multitasking Triplet Generative Advers...

  11. Speech rate normalization used to improve speaker verification

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2006-11-01

    Full Text Available the normalized durations is then compared with the EER using unnormalized durations, and also with the EER when duration information is not employed. 2. Proposed phoneme duration modeling 2.1. Choosing parametric models Since the duration of a phoneme... the known transcription and the speaker-specific acoustic model described above. Only one pronunciation per word was allowed, thus resulting in 49 triphones. To decide which parametric model to use for the duration density func- tions of the triphones...

  12. Speaker Recognition for Mobile User Authentication: An Android Solution

    OpenAIRE

    Brunet , Kevin; Taam , Karim; Cherrier , Estelle; Faye , Ndiaga; Rosenberger , Christophe

    2013-01-01

    National audience; This paper deals with a biometric solution for authentication on mobile devices. Among the possible biometric modalities, speaker recognition seems the most natural choice for a mobile phone. This work lies in the continuation of our previous work \\cite{Biosig2012}, where we evaluated a candidate algorithm in terms of performance and time processing. The proposed solution is implemented here as an Android application. Its performances are evaluated both on a public database...

  13. Processing advantage for emotional words in bilingual speakers.

    Science.gov (United States)

    Ponari, Marta; Rodríguez-Cuadrado, Sara; Vinson, David; Fox, Neil; Costa, Albert; Vigliocco, Gabriella

    2015-10-01

    Effects of emotion on word processing are well established in monolingual speakers. However, studies that have assessed whether affective features of words undergo the same processing in a native and nonnative language have provided mixed results: Studies that have found differences between native language (L1) and second language (L2) processing attributed the difference to the fact that L2 learned late in life would not be processed affectively, because affective associations are established during childhood. Other studies suggest that adult learners show similar effects of emotional features in L1 and L2. Differences in affective processing of L2 words can be linked to age and context of learning, proficiency, language dominance, and degree of similarity between L2 and L1. Here, in a lexical decision task on tightly matched negative, positive, and neutral words, highly proficient English speakers from typologically different L1s showed the same facilitation in processing emotionally valenced words as native English speakers, regardless of their L1, the age of English acquisition, or the frequency and context of English use. (c) 2015 APA, all rights reserved).

  14. Robustness in econometrics

    CERN Document Server

    Sriboonchitta, Songsak; Huynh, Van-Nam

    2017-01-01

    This book presents recent research on robustness in econometrics. Robust data processing techniques – i.e., techniques that yield results minimally affected by outliers – and their applications to real-life economic and financial situations are the main focus of this book. The book also discusses applications of more traditional statistical techniques to econometric problems. Econometrics is a branch of economics that uses mathematical (especially statistical) methods to analyze economic systems, to forecast economic and financial dynamics, and to develop strategies for achieving desirable economic performance. In day-by-day data, we often encounter outliers that do not reflect the long-term economic trends, e.g., unexpected and abrupt fluctuations. As such, it is important to develop robust data processing techniques that can accommodate these fluctuations.

  15. Robust Manufacturing Control

    CERN Document Server

    2013-01-01

    This contributed volume collects research papers, presented at the CIRP Sponsored Conference Robust Manufacturing Control: Innovative and Interdisciplinary Approaches for Global Networks (RoMaC 2012, Jacobs University, Bremen, Germany, June 18th-20th 2012). These research papers present the latest developments and new ideas focusing on robust manufacturing control for global networks. Today, Global Production Networks (i.e. the nexus of interconnected material and information flows through which products and services are manufactured, assembled and distributed) are confronted with and expected to adapt to: sudden and unpredictable large-scale changes of important parameters which are occurring more and more frequently, event propagation in networks with high degree of interconnectivity which leads to unforeseen fluctuations, and non-equilibrium states which increasingly characterize daily business. These multi-scale changes deeply influence logistic target achievement and call for robust planning and control ...

  16. Iambic-Trochaic Law Effects among Native Speakers of Spanish and English

    Directory of Open Access Journals (Sweden)

    Megan Crowhurst

    2016-10-01

    Full Text Available The Iambic-Trochaic Law (Bolton, 1894; Hayes, 1995; Woodrow, 1909 asserts that listeners associate greater intensity with group beginnings (a loud-first preference and greater duration with group endings (a long-last preference. Hayes (1987; 1995 posits a natural connection between the prominences referred to in the ITL and the locations of stressed syllables in feet. However, not all lengthening in final positions originates with stressed syllables, and greater duration may also be associated with stress in nonfinal (trochaic positions. The research described here challenged the notion that presumptive long-last effects necessarily reflect stress-related duration patterns, and investigated the general hypothesis that the robustness of long-last effects should vary depending on the strength of the association between final positions and increased duration, whatever its source. Two ITL studies were conducted in which native speakers of Spanish and of English grouped streams of rhythmically alternating syllables in which vowel intensity and/or duration levels were varied. These languages were chosen because while they are prosodically similar, increased duration on constituent-final syllables is both more common and more salient in English than Spanish. Outcomes revealed robust loud-first effects in both language groups. Long-last effects were significantly weaker in the Spanish group when vowel duration was varied singly. However, long-last effects were present and comparable in both language groups when intensity and duration were covaried. Intensity was a more robust predictor of responses than duration. A primary conclusion was that whether or not humans’ rhythmic grouping preferences have an innate component, duration-based grouping preferences, at least, and the magnitude of intensity-based effects are shaped by listeners’ backgrounds.

  17. Inferring speaker attributes in adductor spasmodic dysphonia: ratings from unfamiliar listeners.

    Science.gov (United States)

    Isetti, Derek; Xuereb, Linnea; Eadie, Tanya L

    2014-05-01

    To determine whether unfamiliar listeners' perceptions of speakers with adductor spasmodic dysphonia (ADSD) differ from control speakers on the parameters of relative age, confidence, tearfulness, and vocal effort and are related to speaker-rated vocal effort or voice-specific quality of life. Twenty speakers with ADSD (including 6 speakers with ADSD plus tremor) and 20 age- and sex-matched controls provided speech recordings, completed a voice-specific quality-of-life instrument (Voice Handicap Index; Jacobson et al., 1997), and rated their own vocal effort. Twenty listeners evaluated speech samples for relative age, confidence, tearfulness, and vocal effort using rating scales. Listeners judged speakers with ADSD as sounding significantly older, less confident, more tearful, and more effortful than control speakers (p < .01). Increased vocal effort was strongly associated with decreased speaker confidence (rs = .88-.89) and sounding more tearful (rs = .83-.85). Self-rated speaker effort was moderately related (rs = .45-.52) to listener impressions. Listeners' perceptions of confidence and tearfulness were also moderately associated with higher Voice Handicap Index scores (rs = .65-.70). Unfamiliar listeners judge speakers with ADSD more negatively than control speakers, with judgments extending beyond typical clinical measures. The results have implications for counseling and understanding the psychosocial effects of ADSD.

  18. Consistency between verbal and non-verbal affective cues: a clue to speaker credibility.

    Science.gov (United States)

    Gillis, Randall L; Nilsen, Elizabeth S

    2017-06-01

    Listeners are exposed to inconsistencies in communication; for example, when speakers' words (i.e. verbal) are discrepant with their demonstrated emotions (i.e. non-verbal). Such inconsistencies introduce ambiguity, which may render a speaker to be a less credible source of information. Two experiments examined whether children make credibility discriminations based on the consistency of speakers' affect cues. In Experiment 1, school-age children (7- to 8-year-olds) preferred to solicit information from consistent speakers (e.g. those who provided a negative statement with negative affect), over novel speakers, to a greater extent than they preferred to solicit information from inconsistent speakers (e.g. those who provided a negative statement with positive affect) over novel speakers. Preschoolers (4- to 5-year-olds) did not demonstrate this preference. Experiment 2 showed that school-age children's ratings of speakers were influenced by speakers' affect consistency when the attribute being judged was related to information acquisition (speakers' believability, "weird" speech), but not general characteristics (speakers' friendliness, likeability). Together, findings suggest that school-age children are sensitive to, and use, the congruency of affect cues to determine whether individuals are credible sources of information.

  19. Robust plasmonic substrates

    DEFF Research Database (Denmark)

    Kostiučenko, Oksana; Fiutowski, Jacek; Tamulevicius, Tomas

    2014-01-01

    Robustness is a key issue for the applications of plasmonic substrates such as tip-enhanced Raman spectroscopy, surface-enhanced spectroscopies, enhanced optical biosensing, optical and optoelectronic plasmonic nanosensors and others. A novel approach for the fabrication of robust plasmonic...... substrates is presented, which relies on the coverage of gold nanostructures with diamond-like carbon (DLC) thin films of thicknesses 25, 55 and 105 nm. DLC thin films were grown by direct hydrocarbon ion beam deposition. In order to find the optimum balance between optical and mechanical properties...

  20. Robust Self Tuning Controllers

    DEFF Research Database (Denmark)

    Poulsen, Niels Kjølstad

    1985-01-01

    The present thesis concerns robustness properties of adaptive controllers. It is addressed to methods for robustifying self tuning controllers with respect to abrupt changes in the plant parameters. In the thesis an algorithm for estimating abruptly changing parameters is presented. The estimator...... has several operation modes and a detector for controlling the mode. A special self tuning controller has been developed to regulate plant with changing time delay.......The present thesis concerns robustness properties of adaptive controllers. It is addressed to methods for robustifying self tuning controllers with respect to abrupt changes in the plant parameters. In the thesis an algorithm for estimating abruptly changing parameters is presented. The estimator...

  1. An analysis of topics and vocabulary in Chinese oral narratives by normal speakers and speakers with fluent aphasia.

    Science.gov (United States)

    Law, Sam-Po; Kong, Anthony Pak-Hin; Lai, Christy

    2018-01-01

    This study analysed the topic and vocabulary of Chinese speakers based on language samples of personal recounts in a large spoken Chinese database recently made available in the public domain, i.e. Cantonese AphasiaBank ( http://www.speech.hku.hk/caphbank/search/ ). The goal of the analysis is to offer clinicians a rich source for selecting ecologically valid training materials for rehabilitating Chinese-speaking people with aphasia (PWA) in the design and planning of culturally and linguistically appropriate treatments. Discourse production of 65 Chinese-speaking PWA of fluent types (henceforth, PWFA) and their non-aphasic controls narrating an important event in their life were extracted from Cantonese AphasiaBank. Analyses of topics and vocabularies in terms of part-of-speech, word frequency, lexical semantics, and diversity were conducted. There was significant overlap in topics between the two groups. While the vocabulary was larger for controls than that of PWFA as expected, they were similar in distribution across parts-of-speech, frequency of occurrence, and the ratio of concrete to abstract items in major open word classes. Moreover, proportionately more different verbs than nouns were employed at the individual level for both speaker groups. The findings provide important implications for guiding directions of aphasia rehabilitation not only of fluent but also non-fluent Chinese aphasic speakers.

  2. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas eKriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  3. The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

    Directory of Open Access Journals (Sweden)

    Fang Liu

    Full Text Available Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.

  4. The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

    Science.gov (United States)

    Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

    2012-01-01

    Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.

  5. Robust surgery loading

    NARCIS (Netherlands)

    Hans, Elias W.; Wullink, Gerhard; van Houdenhoven, Mark; Kazemier, Geert

    2008-01-01

    We consider the robust surgery loading problem for a hospital’s operating theatre department, which concerns assigning surgeries and sufficient planned slack to operating room days. The objective is to maximize capacity utilization and minimize the risk of overtime, and thus cancelled patients. This

  6. Robustness Envelopes of Networks

    NARCIS (Netherlands)

    Trajanovski, S.; Martín-Hernández, J.; Winterbach, W.; Van Mieghem, P.

    2013-01-01

    We study the robustness of networks under node removal, considering random node failure, as well as targeted node attacks based on network centrality measures. Whilst both of these have been studied in the literature, existing approaches tend to study random failure in terms of average-case

  7. Coronal View Ultrasound Imaging of Movement in Different Segments of the Tongue during Paced Recital: Findings from Four Normal Speakers and a Speaker with Partial Glossectomy

    Science.gov (United States)

    Bressmann, Tim; Flowers, Heather; Wong, Willy; Irish, Jonathan C.

    2010-01-01

    The goal of this study was to quantitatively describe aspects of coronal tongue movement in different anatomical regions of the tongue. Four normal speakers and a speaker with partial glossectomy read four repetitions of a metronome-paced poem. Their tongue movement was recorded in four coronal planes using two-dimensional B-mode ultrasound…

  8. A Study on Metadiscoursive Interaction in the MA Theses of the Native Speakers of English and the Turkish Speakers of English

    Science.gov (United States)

    Köroglu, Zehra; Tüm, Gülden

    2017-01-01

    This study has been conducted to evaluate the TM usage in the MA theses written by the native speakers (NSs) of English and the Turkish speakers (TSs) of English. The purpose is to compare the TM usage in the introduction, results and discussion, and conclusion sections by both groups' randomly selected MA theses in the field of ELT between the…

  9. Noise-robust cortical tracking of attended speech in real-world acoustic scenes

    DEFF Research Database (Denmark)

    Fuglsang, Søren; Dau, Torsten; Hjortkjær, Jens

    2017-01-01

    Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener...... is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by a noise......-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and number of interfering talkers, listeners selectively attended to the speech stream...

  10. "I May Be a Native Speaker but I'm Not Monolingual": Reimagining "All" Teachers' Linguistic Identities in TESOL

    Science.gov (United States)

    Ellis, Elizabeth M.

    2016-01-01

    Teacher linguistic identity has so far mainly been researched in terms of whether a teacher identifies (or is identified by others) as a native speaker (NEST) or nonnative speaker (NNEST) (Moussu & Llurda, 2008; Reis, 2011). Native speakers are presumed to be monolingual, and nonnative speakers, although by definition bilingual, tend to be…

  11. Speaker emotion recognition: from classical classifiers to deep neural networks

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2018-04-01

    Speaker emotion recognition is considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved when considering the speech affective state. In this paper, a twofold approach for speech emotion classification is proposed. At the first side, a relevant set of features is adopted, and then at the second one, numerous supervised training techniques, involving classic methods as well as deep learning, are experimented. Experimental results indicate that deep architecture can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the SAVEE Dataset Surrey Audio-Visual Expressed Emotion.

  12. Speaker comfort and increase of voice level in lecture rooms

    DEFF Research Database (Denmark)

    Brunskog, Jonas; Gade, Anders Christian; Bellester, G P

    2008-01-01

    Teachers often suffer health problems or tension related to their voice. These problems may be related to there working environment, including room acoustics of the lecture rooms which forces them to stress their voices. The present paper describes a first effort in finding relationships between...... were also measured in the rooms and subjective impressions from about 20 persons who had experience talking in these rooms were collected as well. Analysis of the data revealed significant differences in the sound power produced by the speaker in the different rooms. It was also found...

  13. A robust classic.

    Science.gov (United States)

    Kutzner, Florian; Vogel, Tobias; Freytag, Peter; Fiedler, Klaus

    2011-01-01

    In the present research, we argue for the robustness of illusory correlations (ICs, Hamilton & Gifford, 1976) regarding two boundary conditions suggested in previous research. First, we argue that ICs are maintained under extended experience. Using simulations, we derive conflicting predictions. Whereas noise-based accounts predict ICs to be maintained (Fielder, 2000; Smith, 1991), a prominent account based on discrepancy-reducing feedback learning predicts ICs to disappear (Van Rooy et al., 2003). An experiment involving 320 observations with majority and minority members supports the claim that ICs are maintained. Second, we show that actively using the stereotype to make predictions that are met with reward and punishment does not eliminate the bias. In addition, participants' operant reactions afford a novel online measure of ICs. In sum, our findings highlight the robustness of ICs that can be explained as a result of unbiased but noisy learning.

  14. Do children go for the nice guys? The influence of speaker benevolence and certainty on selective word learning.

    Science.gov (United States)

    Bergstra, Myrthe; DE Mulder, Hannah N M; Coopmans, Peter

    2018-04-06

    This study investigated how speaker certainty (a rational cue) and speaker benevolence (an emotional cue) influence children's willingness to learn words in a selective learning paradigm. In two experiments four- to six-year-olds learnt novel labels from two speakers and, after a week, their memory for these labels was reassessed. Results demonstrated that children retained the label-object pairings for at least a week. Furthermore, children preferred to learn from certain over uncertain speakers, but they had no significant preference for nice over nasty speakers. When the cues were combined, children followed certain speakers, even if they were nasty. However, children did prefer to learn from nice and certain speakers over nasty and certain speakers. These results suggest that rational cues regarding a speaker's linguistic competence trump emotional cues regarding a speaker's affective status in word learning. However, emotional cues were found to have a subtle influence on this process.

  15. Robust Airline Schedules

    OpenAIRE

    Eggenberg, Niklaus; Salani, Matteo; Bierlaire, Michel

    2010-01-01

    Due to economic pressure industries, when planning, tend to focus on optimizing the expected profit or the yield. The consequence of highly optimized solutions is an increased sensitivity to uncertainty. This generates additional "operational" costs, incurred by possible modifications of the original plan to be performed when reality does not reflect what was expected in the planning phase. The modern research trend focuses on "robustness" of solutions instead of yield or profit. Although ro...

  16. The Halo surrounding native English speaker teachers in Indonesia

    Directory of Open Access Journals (Sweden)

    Angga Kramadibrata

    2016-01-01

    Full Text Available The Native Speaker Fallacy, a commonly held belief that Native English Speaker Teachers (NESTs are inherently better than Non-NESTs, has long been questioned by ELT researchers. However, this belief still stands strong in the general public. This research looks to understand how much a teacher’s nativeness affects a student’s attitude towards them, as well as the underlying reasons for their attitudes. Sixty seven respondents in two groups were asked to watch an animated teaching video, after which they completed a questionnaire that used Likert-scales to assess comprehensibility, clarity of explanation, engagement, and preference. The videos for both groups were identical apart from the narrator; one spoke in British English, while the other, Indian English. In addition, they were also visually identified as Caucasian and Asian, respectively. The video was controlled for speed of delivery. The quantitative data were then triangulated using qualitative data collected through open questions in the questionnaire as well as from a semi-structured interview conducted with 10 respondents. The data show that there is a significant implicit preference for NEST teachers in the video, as well as in respondent’s actual classes. However, when asked explicitly, respondents didn’t rank nativeness as a very important quality in English teachers. This discrepancy between implicit and explicit attitudes might be due to a subconscious cognitive bias, namely the Halo Effect, in which humans tend to make unjustified presumptions about a person based on known but irrelevant information.

  17. Does verbatim sentence recall underestimate the language competence of near-native speakers?

    Directory of Open Access Journals (Sweden)

    Judith eSchweppe

    2015-02-01

    Full Text Available Verbatim sentence recall is widely used to test the language competence of native and non-native speakers since it involves comprehension and production of connected speech. However, we assume that, to maintain surface information, sentence recall relies particularly on attentional resources, which differentially affects native and non-native speakers. Since even in near-natives language processing is less automatized than in native speakers, processing a sentence in a foreign language plus retaining its surface may result in a cognitive overload. We contrasted sentence recall performance of German native speakers with that of highly proficient non-natives. Non-natives recalled the sentences significantly poorer than the natives, but performed equally well on a cloze test. This implies that sentence recall underestimates the language competence of good non-native speakers in mixed groups with native speakers. The findings also suggest that theories of sentence recall need to consider both its linguistic and its attentional aspects.

  18. Impact of Cyrillic on Native English Speakers' Phono-lexical Acquisition of Russian.

    Science.gov (United States)

    Showalter, Catherine E

    2018-03-01

    We investigated the influence of grapheme familiarity and native language grapheme-phoneme correspondences during second language lexical learning. Native English speakers learned Russian-like words via auditory presentations containing only familiar first language phones, pictured meanings, and exposure to either Cyrillic orthographic forms (Orthography condition) or the sequence (No Orthography condition). Orthography participants saw three types of written forms: familiar-congruent (e.g., -[kom]), familiar-incongruent (e.g., -[rɑt]), and unfamiliar (e.g., -[fil]). At test, participants determined whether pictures and words matched according to what they saw during word learning. All participants performed near ceiling in all stimulus conditions, except for Orthography participants on words containing incongruent grapheme-phoneme correspondences. These results suggest that first language grapheme-phoneme correspondences can cause interference during second language phono-lexical acquisition. In addition, these results suggest that orthographic input effects are robust enough to interfere even when the input does not contain novel phones.

  19. A simple optical method for measuring the vibration amplitude of a speaker

    OpenAIRE

    UEDA, Masahiro; YAMAGUCHI, Toshihiko; KAKIUCHI, Hiroki; SUGA, Hiroshi

    1999-01-01

    A simple optical method has been proposed for measuring the vibration amplitude of a speaker vibrating with a frequency of approximately 10 kHz. The method is based on a multiple reflection between a vibrating speaker plane and a mirror parallel to that speaker plane. The multiple reflection can magnify a dispersion of the laser beam caused by the vibration, and easily make a measurement of the amplitude. The measuring sensitivity ranges between sub-microns and 1 mm. A preliminary experim...

  20. A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

    Directory of Open Access Journals (Sweden)

    Buddhamas eKriengwatana

    2015-08-01

    Full Text Available Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still not fully understood. In this study we compare the ability of humans and zebra finches to categorize vowels despite speaker variation in speech in order to test the hypothesis that accommodating speaker and gender differences in isolated vowels can be achieved without prior experience with speaker-related variability. Using a behavioural Go/No-go task and identical stimuli, we compared Australian English adults’ (naïve to Dutch and zebra finches’ (naïve to human speech ability to categorize /ɪ/ and /ɛ/ vowels of an novel Dutch speaker after learning to discriminate those vowels from only one other speaker. Experiment 1 and 2 presented vowels of two speakers interspersed or blocked, respectively. Results demonstrate that categorization of vowels is possible without prior exposure to speaker-related variability in speech for zebra finches, and in non-native vowel categories for humans. Therefore, this study is the first to provide evidence for what might be a species-shared auditory bias that may supersede speaker-related information during vowel categorization. It additionally provides behavioural evidence contradicting a prior hypothesis that accommodation of speaker differences is achieved via the use of formant ratios. Therefore, investigations of alternative accounts of vowel normalization that incorporate the possibility of an auditory bias for disregarding inter-speaker variability are warranted.

  1. Sensitivity to phonological context in L2 spelling: evidence from Russian ESL speakers

    DEFF Research Database (Denmark)

    Dich, Nadya

    2010-01-01

    The study attempts to investigate factors underlying the development of spellers’ sensitivity to phonological context in English. Native English speakers and Russian speakers of English as a second language (ESL) were tested on their ability to use information about the coda to predict the spelling...... on the information about the coda when spelling vowels in nonwords. In both native and non-native speakers, context sensitivity was predicted by English word spelling; in Russian ESL speakers this relationship was mediated by English proficiency. L1 spelling proficiency did not facilitate L2 context sensitivity...

  2. Effects of Language Background on Gaze Behavior: A Crosslinguistic Comparison Between Korean and German Speakers

    Science.gov (United States)

    Goller, Florian; Lee, Donghoon; Ansorge, Ulrich; Choi, Soonja

    2017-01-01

    Languages differ in how they categorize spatial relations: While German differentiates between containment (in) and support (auf) with distinct spatial words—(a) den Kuli IN die Kappe stecken (”put pen in cap”); (b) die Kappe AUF den Kuli stecken (”put cap on pen”)—Korean uses a single spatial word (kkita) collapsing (a) and (b) into one semantic category, particularly when the spatial enclosure is tight-fit. Korean uses a different word (i.e., netha) for loose-fits (e.g., apple in bowl). We tested whether these differences influence the attention of the speaker. In a crosslinguistic study, we compared native German speakers with native Korean speakers. Participants rated the similarity of two successive video clips of several scenes where two objects were joined or nested (either in a tight or loose manner). The rating data show that Korean speakers base their rating of similarity more on tight- versus loose-fit, whereas German speakers base their rating more on containment versus support (in vs. auf). Throughout the experiment, we also measured the participants’ eye movements. Korean speakers looked equally long at the moving Figure object and at the stationary Ground object, whereas German speakers were more biased to look at the Ground object. Additionally, Korean speakers also looked more at the region where the two objects touched than did German speakers. We discuss our data in the light of crosslinguistic semantics and the extent of their influence on spatial cognition and perception. PMID:29362644

  3. A Comparison of Coverbal Gesture Use in Oral Discourse Among Speakers With Fluent and Nonfluent Aphasia

    Science.gov (United States)

    Law, Sam-Po; Chak, Gigi Wan-Chi

    2017-01-01

    Purpose Coverbal gesture use, which is affected by the presence and degree of aphasia, can be culturally specific. The purpose of this study was to compare gesture use among Cantonese-speaking individuals: 23 neurologically healthy speakers, 23 speakers with fluent aphasia, and 21 speakers with nonfluent aphasia. Method Multimedia data of discourse samples from these speakers were extracted from the Cantonese AphasiaBank. Gestures were independently annotated on their forms and functions to determine how gesturing rate and distribution of gestures differed across speaker groups. A multiple regression was conducted to determine the most predictive variable(s) for gesture-to-word ratio. Results Although speakers with nonfluent aphasia gestured most frequently, the rate of gesture use in counterparts with fluent aphasia did not differ significantly from controls. Different patterns of gesture functions in the 3 speaker groups revealed that gesture plays a minor role in lexical retrieval whereas its role in enhancing communication dominates among the speakers with aphasia. The percentages of complete sentences and dysfluency strongly predicted the gesturing rate in aphasia. Conclusions The current results supported the sketch model of language–gesture association. The relationship between gesture production and linguistic abilities and clinical implications for gesture-based language intervention for speakers with aphasia are also discussed. PMID:28609510

  4. Non-English speakers attend gastroenterology clinic appointments at higher rates than English speakers in a vulnerable patient population

    Science.gov (United States)

    Sewell, Justin L.; Kushel, Margot B.; Inadomi, John M.; Yee, Hal F.

    2009-01-01

    Goals We sought to identify factors associated with gastroenterology clinic attendance in an urban safety net healthcare system. Background Missed clinic appointments reduce the efficiency and availability of healthcare, but subspecialty clinic attendance among patients with established healthcare access has not been studied. Study We performed an observational study using secondary data from administrative sources to study patients referred to, and scheduled for an appointment in, the adult gastroenterology clinic serving the safety net healthcare system of San Francisco, California. Our dependent variable was whether subjects attended or missed a scheduled appointment. Analysis included multivariable logistic regression and classification tree analysis. 1,833 patients were referred and scheduled for an appointment between 05/2005 and 08/2006. Prisoners were excluded. All patients had a primary care provider. Results 683 patients (37.3%) missed their appointment; 1,150 (62.7%) attended. Language was highly associated with attendance in the logistic regression; non-English speakers were less likely than English speakers to miss an appointment (adjusted odds ratio 0.42 [0.28,0.63] for Spanish, 0.56 [0.38,0.82] for Asian language, p gastroenterology clinic appointment, not speaking English was most strongly associated with higher attendance rates. Patient related factors associated with not speaking English likely influence subspecialty clinic attendance rates, and these factors may differ from those affecting general healthcare access. PMID:19169147

  5. Robust efficient video fingerprinting

    Science.gov (United States)

    Puri, Manika; Lubin, Jeffrey

    2009-02-01

    We have developed a video fingerprinting system with robustness and efficiency as the primary and secondary design criteria. In extensive testing, the system has shown robustness to cropping, letter-boxing, sub-titling, blur, drastic compression, frame rate changes, size changes and color changes, as well as to the geometric distortions often associated with camcorder capture in cinema settings. Efficiency is afforded by a novel two-stage detection process in which a fast matching process first computes a number of likely candidates, which are then passed to a second slower process that computes the overall best match with minimal false alarm probability. One key component of the algorithm is a maximally stable volume computation - a three-dimensional generalization of maximally stable extremal regions - that provides a content-centric coordinate system for subsequent hash function computation, independent of any affine transformation or extensive cropping. Other key features include an efficient bin-based polling strategy for initial candidate selection, and a final SIFT feature-based computation for final verification. We describe the algorithm and its performance, and then discuss additional modifications that can provide further improvement to efficiency and accuracy.

  6. Acoustic cues to perception of word stress by English, Mandarin, and Russian speakers.

    Science.gov (United States)

    Chrabaszcz, Anna; Winn, Matthew; Lin, Candise Y; Idsardi, William J

    2014-08-01

    This study investigated how listeners' native language affects their weighting of acoustic cues (such as vowel quality, pitch, duration, and intensity) in the perception of contrastive word stress. Native speakers (N = 45) of typologically diverse languages (English, Russian, and Mandarin) performed a stress identification task on nonce disyllabic words with fully crossed combinations of each of the 4 cues in both syllables. The results revealed that although the vowel quality cue was the strongest cue for all groups of listeners, pitch was the second strongest cue for the English and the Mandarin listeners but was virtually disregarded by the Russian listeners. Duration and intensity cues were used by the Russian listeners to a significantly greater extent compared with the English and Mandarin participants. Compared with when cues were noncontrastive across syllables, cues were stronger when they were in the iambic contour than when they were in the trochaic contour. Although both English and Russian are stress languages and Mandarin is a tonal language, stress perception performance of the Mandarin listeners but not of the Russian listeners is more similar to that of the native English listeners, both in terms of weighting of the acoustic cues and the cues' relative strength in different word positions. The findings suggest that tuning of second-language prosodic perceptions is not entirely predictable by prosodic similarities across languages.

  7. Pitch perception and production in congenital amusia: Evidence from Cantonese speakers.

    Science.gov (United States)

    Liu, Fang; Chan, Alice H D; Ciocca, Valter; Roquet, Catherine; Peretz, Isabelle; Wong, Patrick C M

    2016-07-01

    This study investigated pitch perception and production in speech and music in individuals with congenital amusia (a disorder of musical pitch processing) who are native speakers of Cantonese, a tone language with a highly complex tonal system. Sixteen Cantonese-speaking congenital amusics and 16 controls performed a set of lexical tone perception, production, singing, and psychophysical pitch threshold tasks. Their tone production accuracy and singing proficiency were subsequently judged by independent listeners, and subjected to acoustic analyses. Relative to controls, amusics showed impaired discrimination of lexical tones in both speech and non-speech conditions. They also received lower ratings for singing proficiency, producing larger pitch interval deviations and making more pitch interval errors compared to controls. Demonstrating higher pitch direction identification thresholds than controls for both speech syllables and piano tones, amusics nevertheless produced native lexical tones with comparable pitch trajectories and intelligibility as controls. Significant correlations were found between pitch threshold and lexical tone perception, music perception and production, but not between lexical tone perception and production for amusics. These findings provide further evidence that congenital amusia is a domain-general language-independent pitch-processing deficit that is associated with severely impaired music perception and production, mildly impaired speech perception, and largely intact speech production.

  8. Aerodynamic Characteristics of Syllable and Sentence Productions in Normal Speakers.

    Science.gov (United States)

    Thiel, Cedric; Yang, Jin; Crawley, Brianna; Krishna, Priya; Murry, Thomas

    2018-01-08

    Aerodynamic measures of subglottic air pressure (Ps) and airflow rate (AFR) are used to select behavioral voice therapy versus surgical treatment for voice disorders. However, these measures are usually taken during a series of syllables, which differs from conversational speech. Repeated syllables do not share the variation found in even simple sentences, and patients may use their best rather than typical voice unless specifically instructed otherwise. This study examined the potential differences in estimated Ps and AFR in syllable and sentence production and their effects on a measure of vocal efficiency in normal speakers. Prospective study. Measures of estimated Ps, AFR, and aerodynamic vocal efficiency (AVE) were obtained from 19 female and four male speakers ages 22-44 years with no history of voice disorders. Subjects repeated a series of /pa/ syllables and a sentence at comfortable effort level into a face mask with a pressure-sensing tube between the lips. AVE varies as a function of the speech material in normal subjects. Ps measures were significantly higher for the sentence-production samples than for the syllable-production samples. AFR was higher during sentence production than syllable production, but the difference was not statistically significant. AVE values were significantly higher for syllable versus sentence productions. The results suggest that subjects increase Ps and AFR in sentence compared with syllable production. Speaking task is a critical factor when considering measures of AVE, and this preliminary study provides a basis for further aerodynamic studies of patient populations. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    Science.gov (United States)

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the art diarization algorithms.

  10. Robust automated knowledge capture.

    Energy Technology Data Exchange (ETDEWEB)

    Stevens-Adams, Susan Marie; Abbott, Robert G.; Forsythe, James Chris; Trumbo, Michael Christopher Stefan; Haass, Michael Joseph; Hendrickson, Stacey M. Langfitt

    2011-10-01

    This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project has developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task and experience related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high consequent national security applications, and in particular, the individual characteristics that underlie adaptive thinking.

  11. Passion, Robustness and Perseverance

    DEFF Research Database (Denmark)

    Lim, Miguel Antonio; Lund, Rebecca

    2016-01-01

    Evaluation and merit in the measured university are increasingly based on taken-for-granted assumptions about the “ideal academic”. We suggest that the scholar now needs to show that she is passionate about her work and that she gains pleasure from pursuing her craft. We suggest that passion...... and pleasure achieve an exalted status as something compulsory. The scholar ought to feel passionate about her work and signal that she takes pleasure also in the difficult moments. Passion has become a signal of robustness and perseverance in a job market characterised by funding shortages, increased pressure...... way to demonstrate their potential and, crucially, their passion for their work. Drawing on the literature on technologies of governance, we reflect on what is captured and what is left out by these two evaluation instruments. We suggest that bibliometric analysis at the individual level is deeply...

  12. Robust Optical Flow Estimation

    Directory of Open Access Journals (Sweden)

    Javier Sánchez Pérez

    2013-10-01

    Full Text Available n this work, we describe an implementation of the variational method proposed by Brox etal. in 2004, which yields accurate optical flows with low running times. It has several benefitswith respect to the method of Horn and Schunck: it is more robust to the presence of outliers,produces piecewise-smooth flow fields and can cope with constant brightness changes. Thismethod relies on the brightness and gradient constancy assumptions, using the information ofthe image intensities and the image gradients to find correspondences. It also generalizes theuse of continuous L1 functionals, which help mitigate the effect of outliers and create a TotalVariation (TV regularization. Additionally, it introduces a simple temporal regularizationscheme that enforces a continuous temporal coherence of the flow fields.

  13. Robust snapshot interferometric spectropolarimetry.

    Science.gov (United States)

    Kim, Daesuk; Seo, Yoonho; Yoon, Yonghee; Dembele, Vamara; Yoon, Jae Woong; Lee, Kyu Jin; Magnusson, Robert

    2016-05-15

    This Letter describes a Stokes vector measurement method based on a snapshot interferometric common-path spectropolarimeter. The proposed scheme, which employs an interferometric polarization-modulation module, can extract the spectral polarimetric parameters Ψ(k) and Δ(k) of a transmissive anisotropic object by which an accurate Stokes vector can be calculated in the spectral domain. It is inherently strongly robust to the object 3D pose variation, since it is designed distinctly so that the measured object can be placed outside of the interferometric module. Experiments are conducted to verify the feasibility of the proposed system. The proposed snapshot scheme enables us to extract the spectral Stokes vector of a transmissive anisotropic object within tens of msec with high accuracy.

  14. International Conference on Robust Statistics

    CERN Document Server

    Filzmoser, Peter; Gather, Ursula; Rousseeuw, Peter

    2003-01-01

    Aspects of Robust Statistics are important in many areas. Based on the International Conference on Robust Statistics 2001 (ICORS 2001) in Vorau, Austria, this volume discusses future directions of the discipline, bringing together leading scientists, experienced researchers and practitioners, as well as younger researchers. The papers cover a multitude of different aspects of Robust Statistics. For instance, the fundamental problem of data summary (weights of evidence) is considered and its robustness properties are studied. Further theoretical subjects include e.g.: robust methods for skewness, time series, longitudinal data, multivariate methods, and tests. Some papers deal with computational aspects and algorithms. Finally, the aspects of application and programming tools complete the volume.

  15. Robustness Analyses of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Hald, Frederik

    2013-01-01

    The robustness of structural systems has obtained a renewed interest arising from a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure. In order to minimise the likelihood of such disproportionate structural failures, many mo...... with respect to robustness of timber structures and will discuss the consequences of such robustness issues related to the future development of timber structures.......The robustness of structural systems has obtained a renewed interest arising from a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure. In order to minimise the likelihood of such disproportionate structural failures, many...... modern building codes consider the need for the robustness of structures and provide strategies and methods to obtain robustness. Therefore, a structural engineer may take necessary steps to design robust structures that are insensitive to accidental circumstances. The present paper summaries issues...

  16. Psychophysical Boundary for Categorization of Voiced-Voiceless Stop Consonants in Native Japanese Speakers

    Science.gov (United States)

    Tamura, Shunsuke; Ito, Kazuhito; Hirose, Nobuyuki; Mori, Shuji

    2018-01-01

    Purpose: The purpose of this study was to investigate the psychophysical boundary used for categorization of voiced-voiceless stop consonants in native Japanese speakers. Method: Twelve native Japanese speakers participated in the experiment. The stimuli were synthetic stop consonant-vowel stimuli varying in voice onset time (VOT) with…

  17. White Native English Speakers Needed: The Rhetorical Construction of Privilege in Online Teacher Recruitment Spaces

    Science.gov (United States)

    Ruecker, Todd; Ives, Lindsey

    2015-01-01

    Over the past few decades, scholars have paid increasing attention to the role of native speakerism in the field of TESOL. Several recent studies have exposed instances of native speakerism in TESOL recruitment discourses published through a variety of media, but none have focused specifically on professional websites advertising programs in…

  18. Within-category variance and lexical tone discrimination in native and non-native speakers

    NARCIS (Netherlands)

    Hoffmann, C.W.G.; Sadakata, M.; Chen, A.; Desain, P.W.M.; McQueen, J.M.; Gussenhove, C.; Chen, Y.; Dediu, D.

    2014-01-01

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical

  19. Content-specific coordination of listeners' to speakers' EEG during communication.

    Science.gov (United States)

    Kuhlen, Anna K; Allefeld, Carsten; Haynes, John-Dylan

    2012-01-01

    Cognitive neuroscience has recently begun to extend its focus from the isolated individual mind to two or more individuals coordinating with each other. In this study we uncover a coordination of neural activity between the ongoing electroencephalogram (EEG) of two people-a person speaking and a person listening. The EEG of one set of twelve participants ("speakers") was recorded while they were narrating short stories. The EEG of another set of twelve participants ("listeners") was recorded while watching audiovisual recordings of these stories. Specifically, listeners watched the superimposed videos of two speakers simultaneously and were instructed to attend either to one or the other speaker. This allowed us to isolate neural coordination due to processing the communicated content from the effects of sensory input. We find several neural signatures of communication: First, the EEG is more similar among listeners attending to the same speaker than among listeners attending to different speakers, indicating that listeners' EEG reflects content-specific information. Secondly, listeners' EEG activity correlates with the attended speakers' EEG, peaking at a time delay of about 12.5 s. This correlation takes place not only between homologous, but also between non-homologous brain areas in speakers and listeners. A semantic analysis of the stories suggests that listeners coordinate with speakers at the level of complex semantic representations, so-called "situation models". With this study we link a coordination of neural activity between individuals directly to verbally communicated information.

  20. Are Cantonese-Speakers Really Descriptivists? Revisiting Cross-Cultural Semantics

    Science.gov (United States)

    Lam, Barry

    2010-01-01

    In an article in "Cognition" [Machery, E., Mallon, R., Nichols, S., & Stich, S. (2004). "Semantics cross-cultural style." "Cognition, 92", B1-B12] present data which purports to show that East Asian Cantonese-speakers tend to have descriptivist intuitions about the referents of proper names, while Western English-speakers tend to have…

  1. Automaticity and stability of adaptation to a foreign-accented speaker

    NARCIS (Netherlands)

    Witteman, M.J.; Bardhan, N.P.; Weber, A.C.; McQueen, J.M.

    2015-01-01

    In three cross-modal priming experiments we asked whether adaptation to a foreign-accented speaker is automatic, and whether adaptation can be seen after a long delay between initial exposure and test. Dutch listeners were exposed to a Hebrew-accented Dutch speaker with two types of Dutch words:

  2. Congenital Amusia in Speakers of a Tone Language: Association with Lexical Tone Agnosia

    Science.gov (United States)

    Nan, Yun; Sun, Yanan; Peretz, Isabelle

    2010-01-01

    Congenital amusia is a neurogenetic disorder that affects the processing of musical pitch in speakers of non-tonal languages like English and French. We assessed whether this musical disorder exists among speakers of Mandarin Chinese who use pitch to alter the meaning of words. Using the Montreal Battery of Evaluation of Amusia, we tested 117…

  3. 7 CFR 247.13 - Provisions for non-English or limited-English speakers.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 4 2010-01-01 2010-01-01 false Provisions for non-English or limited-English speakers... § 247.13 Provisions for non-English or limited-English speakers. (a) What must State and local agencies do to ensure that non-English or limited-English speaking persons are aware of their rights and...

  4. The Sound of Voice: Voice-Based Categorization of Speakers' Sexual Orientation within and across Languages.

    Directory of Open Access Journals (Sweden)

    Simone Sulpizio

    Full Text Available Empirical research had initially shown that English listeners are able to identify the speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language-dependency and to non-native speakers (language-specificity, has been questioned recently. Consequently, we address these open issues in 5 experiments: First, we tested whether Italian and German listeners are able to correctly identify sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly. However, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects the listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity.

  5. Comprehending non-native speakers: theory and evidence for adjustment in manner of processing.

    Science.gov (United States)

    Lev-Ari, Shiri

    2014-01-01

    Non-native speakers have lower linguistic competence than native speakers, which renders their language less reliable in conveying their intentions. We suggest that expectations of lower competence lead listeners to adapt their manner of processing when they listen to non-native speakers. We propose that listeners use cognitive resources to adjust by increasing their reliance on top-down processes and extracting less information from the language of the non-native speaker. An eye-tracking study supports our proposal by showing that when following instructions by a non-native speaker, listeners make more contextually-induced interpretations. Those with relatively high working memory also increase their reliance on context to anticipate the speaker's upcoming reference, and are less likely to notice lexical errors in the non-native speech, indicating that they take less information from the speaker's language. These results contribute to our understanding of the flexibility in language processing and have implications for interactions between native and non-native speakers.

  6. Popular Public Discourse at Speakers' Corner: Negotiating Cultural Identities in Interaction

    DEFF Research Database (Denmark)

    McIlvenny, Paul

    1996-01-01

    , religious and general topical 'soap-box' oration. However, audiences are not passive receivers of rhetorical messages. They are active negotiators of interpretations and alignments that may conflict with the speaker's and other audience members' orientations to prior talk. Speakers' Corner is a space...

  7. The Acquisition of Clitic Pronouns in the Spanish Interlanguage of Peruvian Quechua Speakers.

    Science.gov (United States)

    Klee, Carol A.

    1989-01-01

    Analysis of four adult Quechua speakers' acquisition of clitic pronouns in Spanish revealed that educational attainment and amount of contact with monolingual Spanish speakers were positively related to native-like norms of competence in the use of object pronouns in Spanish. (CB)

  8. Intelligibility of Standard German and Low German to Speakers of Dutch

    NARCIS (Netherlands)

    Gooskens, C.S.; Kürschner, Sebastian; van Bezooijen, R.

    2011-01-01

    This paper reports on the intelligibility of spoken Low German and Standard German for speakers of Dutch. Two aspects are considered. First, the relative potential for intelligibility of the Low German variety of Bremen and the High German variety of Modern Standard German for speakers of Dutch is

  9. Pragmatic Instruction May Not Be Necessary among Heritage Speakers of Spanish: A Study on Requests

    Science.gov (United States)

    Barros García, María J.; Bachelor, Jeremy W.

    2018-01-01

    This paper studies the pragmatic competence of U.S. heritage speakers of Spanish in an attempt to determine (a) the degree of pragmatic transfer from English to Spanish experienced by heritage speakers when producing different types of requests in Spanish; and (b) how to best teach pragmatics to students of Spanish as a Heritage Language (SHL).…

  10. Bridging Gaps in Common Ground: Speakers Design Their Gestures for Their Listeners

    Science.gov (United States)

    Hilliard, Caitlin; Cook, Susan Wagner

    2016-01-01

    Communication is shaped both by what we are trying to say and by whom we are saying it to. We examined whether and how shared information influences the gestures speakers produce along with their speech. Unlike prior work examining effects of common ground on speech and gesture, we examined a situation in which some speakers have the same amount…

  11. Classifications of Vocalic Segments from Articulatory Kinematics: Healthy Controls and Speakers with Dysarthria

    Science.gov (United States)

    Yunusova, Yana; Weismer, Gary G.; Lindstrom, Mary J.

    2011-01-01

    Purpose: In this study, the authors classified vocalic segments produced by control speakers (C) and speakers with dysarthria due to amyotrophic lateral sclerosis (ALS) or Parkinson's disease (PD); classification was based on movement measures. The researchers asked the following questions: (a) Can vowels be classified on the basis of selected…

  12. Articulatory Movements during Vowels in Speakers with Dysarthria and Healthy Controls

    Science.gov (United States)

    Yunusova, Yana; Weismer, Gary; Westbury, John R.; Lindstrom, Mary J.

    2008-01-01

    Purpose: This study compared movement characteristics of markers attached to the jaw, lower lip, tongue blade, and dorsum during production of selected English vowels by normal speakers and speakers with dysarthria due to amyotrophic lateral sclerosis (ALS) or Parkinson disease (PD). The study asked the following questions: (a) Are movement…

  13. Speaker Input Variability Does Not Explain Why Larger Populations Have Simpler Languages.

    Science.gov (United States)

    Atkinson, Mark; Kirby, Simon; Smith, Kenny

    2015-01-01

    A learner's linguistic input is more variable if it comes from a greater number of speakers. Higher speaker input variability has been shown to facilitate the acquisition of phonemic boundaries, since data drawn from multiple speakers provides more information about the distribution of phonemes in a speech community. It has also been proposed that speaker input variability may have a systematic influence on individual-level learning of morphology, which can in turn influence the group-level characteristics of a language. Languages spoken by larger groups of people have less complex morphology than those spoken in smaller communities. While a mechanism by which the number of speakers could have such an effect is yet to be convincingly identified, differences in speaker input variability, which is thought to be larger in larger groups, may provide an explanation. By hindering the acquisition, and hence faithful cross-generational transfer, of complex morphology, higher speaker input variability may result in structural simplification. We assess this claim in two experiments which investigate the effect of such variability on language learning, considering its influence on a learner's ability to segment a continuous speech stream and acquire a morphologically complex miniature language. We ultimately find no evidence to support the proposal that speaker input variability influences language learning and so cannot support the hypothesis that it explains how population size determines the structural properties of language.

  14. A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

    NARCIS (Netherlands)

    Kriengwatana, B.; Escudero, P.; Kerkhoven, A.H.; ten Cate, C.

    2015-01-01

    Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still

  15. Speaker detection for conversational robots using synchrony between audio and video

    NARCIS (Netherlands)

    Noulas, A.; Englebienne, G.; Terwijn, B.; Kröse, B.; Hanheide, M.; Zender, H.

    2010-01-01

    This paper compares different methods for detecting the speaking person when multiple persons are interacting with a robot. We evaluate the state-of-the-art speaker detection methods on the iCat robot. These methods use the synchrony between audio and video to locate the most probable speaker. We

  16. Use of the BAT with a Cantonese-Putonghua Speaker with Aphasia

    Science.gov (United States)

    Kong, Anthony Pak-Hin; Weekes, Brendan Stuart

    2011-01-01

    The aim of this article is to illustrate the use of the Bilingual Aphasia Test (BAT) with a Cantonese-Putonghua speaker. We describe G, who is a relatively young Chinese bilingual speaker with aphasia. G's communication abilities in his L2, Putonghua, were impaired following brain damage. This impairment caused specific difficulties in…

  17. During Threaded Discussions Are Non-Native English Speakers Always at a Disadvantage?

    Science.gov (United States)

    Shafer Willner, Lynn

    2014-01-01

    When participating in threaded discussions, under what conditions might non¬native speakers of English (NNSE) be at a comparative disadvantage to their classmates who are native speakers of English (NSE)? This study compares the threaded discussion perspectives of closely-matched NNSE and NSE adult students having different levels of threaded…

  18. Promoting Communities of Practice among Non-Native Speakers of English in Online Discussions

    Science.gov (United States)

    Kim, Hoe Kyeung

    2011-01-01

    An online discussion involving text-based computer-mediated communication has great potential for promoting equal participation among non-native speakers of English. Several studies claimed that online discussions could enhance the academic participation of non-native speakers of English. However, there is little research around participation…

  19. The AMI speaker diarization system for NIST RT06s meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Huijbregts, Marijn

    2006-01-01

    We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Erorr Tradeoff analysis commonly used in speaker

  20. The AMI speaker diarization system for NIST RT06s meeting data

    NARCIS (Netherlands)

    van Leeuwen, David A.; Huijbregts, M.A.H.

    2007-01-01

    We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Erorr Tradeoff analysis commonly used in speaker detection

  1. Omission of definite and indefinite articles in the spontaneous speech of agrammatic speakers with Broca's aphasia

    NARCIS (Netherlands)

    Havik, E.; Bastiaanse, Y.R.M.

    2004-01-01

    Background: Cross-linguistic investigation of agrammatic speech in speakers of different languages allows us to tests theoretical accounts of the nature of agrammatism. A significant feature of the speech of many agrammatic speakers is a problem with article production. Mansson and Ahlsen (2001)

  2. Learning foreign labels from a foreign speaker: the role of (limited) exposure to a second language.

    Science.gov (United States)

    Akhtar, Nameera; Menjivar, Jennifer; Hoicka, Elena; Sabbagh, Mark A

    2012-11-01

    Three- and four-year-olds (N = 144) were introduced to novel labels by an English speaker and a foreign speaker (of Nordish, a made-up language), and were asked to endorse one of the speaker's labels. Monolingual English-speaking children were compared to bilingual children and English-speaking children who were regularly exposed to a language other than English. All children tended to endorse the English speaker's labels when asked 'What do you call this?', but when asked 'What do you call this in Nordish?', children with exposure to a second language were more likely to endorse the foreign label than monolingual and bilingual children. The findings suggest that, at this age, exposure to, but not necessarily immersion in, more than one language may promote the ability to learn foreign words from a foreign speaker.

  3. Outlier robustness for wind turbine extrapolated extreme loads

    DEFF Research Database (Denmark)

    Natarajan, Anand; Verelst, David Robert

    2012-01-01

    . Stochastic identification of numerical artifacts in simulated loads is demonstrated using the method of principal component analysis. The extrapolation methodology is made robust to outliers through a weighted loads approach, whereby the eigenvalues of the correlation matrix obtained using the loads with its...

  4. Speaker and Accent Variation Are Handled Differently: Evidence in Native and Non-Native Listeners

    Science.gov (United States)

    Kriengwatana, Buddhamas; Terry, Josephine; Chládková, Kateřina; Escudero, Paola

    2016-01-01

    Listeners are able to cope with between-speaker variability in speech that stems from anatomical sources (i.e. individual and sex differences in vocal tract size) and sociolinguistic sources (i.e. accents). We hypothesized that listeners adapt to these two types of variation differently because prior work indicates that adapting to speaker/sex variability may occur pre-lexically while adapting to accent variability may require learning from attention to explicit cues (i.e. feedback). In Experiment 1, we tested our hypothesis by training native Dutch listeners and Australian-English (AusE) listeners without any experience with Dutch or Flemish to discriminate between the Dutch vowels /I/ and /ε/ from a single speaker. We then tested their ability to classify /I/ and /ε/ vowels of a novel Dutch speaker (i.e. speaker or sex change only), or vowels of a novel Flemish speaker (i.e. speaker or sex change plus accent change). We found that both Dutch and AusE listeners could successfully categorize vowels if the change involved a speaker/sex change, but not if the change involved an accent change. When AusE listeners were given feedback on their categorization responses to the novel speaker in Experiment 2, they were able to successfully categorize vowels involving an accent change. These results suggest that adapting to accents may be a two-step process, whereby the first step involves adapting to speaker differences at a pre-lexical level, and the second step involves adapting to accent differences at a contextual level, where listeners have access to word meaning or are given feedback that allows them to appropriately adjust their perceptual category boundaries. PMID:27309889

  5. Speaker Introductions at Internal Medicine Grand Rounds: Forms of Address Reveal Gender Bias.

    Science.gov (United States)

    Files, Julia A; Mayer, Anita P; Ko, Marcia G; Friedrich, Patricia; Jenkins, Marjorie; Bryan, Michael J; Vegunta, Suneela; Wittich, Christopher M; Lyle, Melissa A; Melikian, Ryan; Duston, Trevor; Chang, Yu-Hui H; Hayes, Sharonne N

    2017-05-01

    Gender bias has been identified as one of the drivers of gender disparity in academic medicine. Bias may be reinforced by gender subordinating language or differential use of formality in forms of address. Professional titles may influence the perceived expertise and authority of the referenced individual. The objective of this study is to examine how professional titles were used in the same and mixed-gender speaker introductions at Internal Medicine Grand Rounds (IMGR). A retrospective observational study of video-archived speaker introductions at consecutive IMGR was conducted at two different locations (Arizona, Minnesota) of an academic medical center. Introducers and speakers at IMGR were physician and scientist peers holding MD, PhD, or MD/PhD degrees. The primary outcome was whether or not a speaker's professional title was used during the first form of address during speaker introductions at IMGR. As secondary outcomes, we evaluated whether or not the speakers professional title was used in any form of address during the introduction. Three hundred twenty-one forms of address were analyzed. Female introducers were more likely to use professional titles when introducing any speaker during the first form of address compared with male introducers (96.2% [102/106] vs. 65.6% [141/215]; p form of address 97.8% (45/46) compared with male dyads who utilized a formal title 72.4% (110/152) of the time (p = 0.007). In mixed-gender dyads, where the introducer was female and speaker male, formal titles were used 95.0% (57/60) of the time. Male introducers of female speakers utilized professional titles 49.2% (31/63) of the time (p addressed by professional title than were men introduced by men. Differential formality in speaker introductions may amplify isolation, marginalization, and professional discomfiture expressed by women faculty in academic medicine.

  6. Dynamics robustness of cascading systems.

    Directory of Open Access Journals (Sweden)

    Jonathan T Young

    2017-03-01

    Full Text Available A most important property of biochemical systems is robustness. Static robustness, e.g., homeostasis, is the insensitivity of a state against perturbations, whereas dynamics robustness, e.g., homeorhesis, is the insensitivity of a dynamic process. In contrast to the extensively studied static robustness, dynamics robustness, i.e., how a system creates an invariant temporal profile against perturbations, is little explored despite transient dynamics being crucial for cellular fates and are reported to be robust experimentally. For example, the duration of a stimulus elicits different phenotypic responses, and signaling networks process and encode temporal information. Hence, robustness in time courses will be necessary for functional biochemical networks. Based on dynamical systems theory, we uncovered a general mechanism to achieve dynamics robustness. Using a three-stage linear signaling cascade as an example, we found that the temporal profiles and response duration post-stimulus is robust to perturbations against certain parameters. Then analyzing the linearized model, we elucidated the criteria of when signaling cascades will display dynamics robustness. We found that changes in the upstream modules are masked in the cascade, and that the response duration is mainly controlled by the rate-limiting module and organization of the cascade's kinetics. Specifically, we found two necessary conditions for dynamics robustness in signaling cascades: 1 Constraint on the rate-limiting process: The phosphatase activity in the perturbed module is not the slowest. 2 Constraints on the initial conditions: The kinase activity needs to be fast enough such that each module is saturated even with fast phosphatase activity and upstream changes are attenuated. We discussed the relevance of such robustness to several biological examples and the validity of the above conditions therein. Given the applicability of dynamics robustness to a variety of systems, it

  7. Robust continuous clustering.

    Science.gov (United States)

    Shah, Sohil Atul; Koltun, Vladlen

    2017-09-12

    Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank.

  8. Robust Trust in Expert Testimony

    Directory of Open Access Journals (Sweden)

    Christian Dahlman

    2015-05-01

    Full Text Available The standard of proof in criminal trials should require that the evidence presented by the prosecution is robust. This requirement of robustness says that it must be unlikely that additional information would change the probability that the defendant is guilty. Robustness is difficult for a judge to estimate, as it requires the judge to assess the possible effect of information that the he or she does not have. This article is concerned with expert witnesses and proposes a method for reviewing the robustness of expert testimony. According to the proposed method, the robustness of expert testimony is estimated with regard to competence, motivation, external strength, internal strength and relevance. The danger of trusting non-robust expert testimony is illustrated with an analysis of the Thomas Quick Case, a Swedish legal scandal where a patient at a mental institution was wrongfully convicted for eight murders.

  9. Acquired dyslexia in Serbian speakers with Broca's and Wernicke's aphasia.

    Science.gov (United States)

    Vuković, Mile; Vuković, Irena; Miller, Nick

    2016-01-01

    This study examined patterns of acquired dyslexia in Serbian aphasic speakers, comparing profiles of groups with Broca's versus Wernicke's aphasia. The study also looked at the relationship of reading and auditory comprehension and between reading comprehension and reading aloud in these groups. Participants were 20 people with Broca's and 20 with Wernicke's aphasia. They were asked to read aloud and to understand written material from the Serbian adaptation of the Boston Diagnostic Aphasia Examination. A Serbian Word Reading Aloud Test was also used. The people with Broca's aphasia achieved better results in reading aloud and in reading comprehension than those with Wernicke's aphasia. Those with Wernicke's aphasia showed significantly more semantic errors than those with Broca's aphasia who had significantly more morphological and phonological errors. From the data we inferred that lesion sites accorded with previous work on networks associated with Broca's and Wernicke's aphasia and with a posterior-anterior axis for reading processes centred on (left) parietal-temporal-frontal lobes. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Speech spectrum's correlation with speakers' Eysenck personality traits.

    Directory of Open Access Journals (Sweden)

    Chao Hu

    Full Text Available The current study explored the correlation between speakers' Eysenck personality traits and speech spectrum parameters. Forty-six subjects completed the Eysenck Personality Questionnaire. They were instructed to verbally answer the questions shown on a computer screen and their responses were recorded by the computer. Spectrum parameters of /sh/ and /i/ were analyzed by Praat voice software. Formant frequencies of the consonant /sh/ in lying responses were significantly lower than that in truthful responses, whereas no difference existed on the vowel /i/ speech spectrum. The second formant bandwidth of the consonant /sh/ speech spectrum was significantly correlated with the personality traits of Psychoticism, Extraversion, and Neuroticism, and the correlation differed between truthful and lying responses, whereas the first formant frequency of the vowel /i/ speech spectrum was negatively correlated with Neuroticism in both response types. The results suggest that personality characteristics may be conveyed through the human voice, although the extent to which these effects are due to physiological differences in the organs associated with speech or to a general Pygmalion effect is yet unknown.

  11. An acoustic investigation of Arabic vowels pronounced by Malay speakers

    Directory of Open Access Journals (Sweden)

    Ali Abd Almisreb

    2016-04-01

    Full Text Available In Malaysia, Arabic language is spoken, and commonly used among the Malays. Malays use Arabic in their daily life, such as during performing worship. Hence, in this paper, some of the Arabic vowels attributes are investigated, analyzed and initial findings are presented based on tokens articulated by Malay speakers as we can consider the spoken Arabic by Malays as one of the Arabic dialects. It is known that in Arabic language there are 28 consonants and 6 main vowels. Firstly, the duration, variability, and overlapping attributes are highlighted based on syllables of Consonant–Vowel with each syllable representing every Arabic consonant with the corresponding vowels. Next, the dispersion of each vowel is examined to be compared with each other along with the variability among vowels that may cause overlapping between vowels in the vowel-space. Results showed that the vowel overlapping occurred between short vowels and their long counterpart vowels. Furthermore, an investigation of the Arabic vowel duration is addressed as well, and duration analysis for all the vowels is discussed, followed by the analysis for each vowel separately. In addition, a comparison between long and short vowels is presented as well as comparison between high and low vowel is carried out.

  12. Speakers' assumptions about the lexical flexibility of idioms.

    Science.gov (United States)

    Gibbs, R W; Nayak, N P; Bolton, J L; Keppel, M E

    1989-01-01

    In three experiments, we examined why some idioms can be lexically altered and still retain their figurative meanings (e.g., John buttoned his lips about Mary can be changed into John fastened his lips about Mary and still mean "John didn't say anything about Mary"), whereas other idioms cannot be lexically altered without losing their figurative meanings (e.g., John kicked the bucket, meaning "John died," loses its idiomatic meaning when changed into John kicked the pail). Our hypothesis was that the lexical flexibility of idioms is determined by speakers' assumptions about the ways in which parts of idioms contribute to their figurative interpretations as a whole. The results of the three experiments indicated that idioms whose individual semantic components contribute to their overall figurative meanings (e.g., go out on a limb) were judged as less disrupted by changes in their lexical items (e.g., go out on a branch) than were nondecomposable idioms (e.g., kick the bucket) when their individual words were altered (e.g., punt the pail). These findings lend support to the idea that both the syntactic productivity and the lexical makeup of idioms are matters of degree, depending on the idioms' compositional properties. This conclusion suggests that idioms do not form a unique class of linguistic items, but share many of the properties of more literal language.

  13. Listening to a non-native speaker: Adaptation and generalization

    Science.gov (United States)

    Clarke, Constance M.

    2004-05-01

    Non-native speech can cause perceptual difficulty for the native listener, but experience can moderate this difficulty. This study explored the perceptual benefit of a brief (approximately 1 min) exposure to foreign-accented speech using a cross-modal word matching paradigm. Processing speed was tracked by recording reaction times (RTs) to visual probe words following English sentences produced by a Spanish-accented speaker. In experiment 1, RTs decreased significantly over 16 accented utterances and by the end were equal to RTs to a native voice. In experiment 2, adaptation to one Spanish-accented voice improved perceptual efficiency for a new Spanish-accented voice, indicating that abstract properties of accented speech are learned during adaptation. The control group in Experiment 2 also adapted to the accented voice during the test block, suggesting adaptation can occur within two to four sentences. The results emphasize the flexibility of the human speech processing system and the need for a mechanism to explain this adaptation in models of spoken word recognition. [Research supported by an NSF Graduate Research Fellowship and the University of Arizona Cognitive Science Program.] a)Currently at SUNY at Buffalo, Dept. of Psych., Park Hall, Buffalo, NY 14260, cclarke2@buffalo.edu

  14. Phonological processing in Mandarin speakers with congenital amusia.

    Science.gov (United States)

    Wang, Xiao; Peng, Gang

    2014-12-01

    Although there is an emerging consensus that both musical and linguistic pitch processing can be problematic for individuals with a developmental disorder termed congenital amusia, the nature of such a pitch-processing deficit, especially that demonstrated in a speech setting, remains unclear. Therefore, this study tested the performance of native Mandarin speakers, both with and without amusia, on discrimination and imitation tasks for Cantonese level tones, aiming to shed light on this issue. Results suggest that the impact of the phonological deficit, coupled with that of the domain-general pitch deficit, could provide a more comprehensive interpretation of Mandarin amusics' speech impairment. Specifically, when there was a high demand for pitch sensitivity, as in fine-grained pitch discriminations, the operation of the pitch-processing deficit played the more predominant role in modulating amusics' speech performance. But when the demand was low, as in discriminating naturally produced Cantonese level tones, the impact of the phonological deficit was more pronounced compared to that of the pitch-processing deficit. However, despite their perceptual deficits, Mandarin amusics' imitation abilities were comparable to controls'. Such selective impairment in tonal perception suggests that the phonological deficit more severely implicates amusics' input pathways.

  15. A Method to Integrate GMM, SVM and DTW for Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2014-01-01

    Full Text Available This paper develops an effective and efficient scheme to integrate Gaussian mixture model (GMM, support vector machine (SVM, and dynamic time wrapping (DTW for automatic speaker recognition. GMM and SVM are two popular classifiers for speaker recognition applications. DTW is a fast and simple template matching method, and it is frequently seen in applications of speech recognition. In this work, DTW does not play a role to perform speech recognition, and it will be employed to be a verifier for verification of valid speakers. The proposed combination scheme of GMM, SVM and DTW, called SVMGMM-DTW, for speaker recognition in this study is a two-phase verification process task including GMM-SVM verification of the first phase and DTW verification of the second phase. By providing a double check to verify the identity of a speaker, it will be difficult for imposters to try to pass the security protection; therefore, the safety degree of speaker recognition systems will be largely increased. A series of experiments designed on door access control applications demonstrated that the superiority of the developed SVMGMM-DTW on speaker recognition accuracy.

  16. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization

    Directory of Open Access Journals (Sweden)

    Umit H. Yapanel

    2008-08-01

    Full Text Available A proven method for achieving effective automatic speech recognition (ASR due to speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed which require limited computing resources for real-time performance. The most popular speaker normalization technique is vocal-tract length normalization (VTLN, despite the fact that it is computationally expensive. In this study, we propose a novel online VTLN algorithm entitled built-in speaker normalization (BISN, where normalization is performed on-the-fly within a newly proposed PMVDR acoustic front end. The novel algorithm aspect is that in conventional frontend processing with PMVDR and VTLN, two separating warping phases are needed; while in the proposed BISN method only one single speaker dependent warp is used to achieve both the PMVDR perceptual warp and VTLN warp simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces computational requirements, thereby offering advantages for real-time ASR systems. Evaluations are performed for (i an in-car extended digit recognition task, where an on-the-fly BISN implementation reduces the relative word error rate (WER by 24%, and (ii for a diverse noisy speech task (SPINE 2, where the relative WER improvement was 9%, both relative to the baseline speaker normalization method.

  17. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization

    Directory of Open Access Journals (Sweden)

    Yapanel UmitH

    2008-01-01

    Full Text Available A proven method for achieving effective automatic speech recognition (ASR due to speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed which require limited computing resources for real-time performance. The most popular speaker normalization technique is vocal-tract length normalization (VTLN, despite the fact that it is computationally expensive. In this study, we propose a novel online VTLN algorithm entitled built-in speaker normalization (BISN, where normalization is performed on-the-fly within a newly proposed PMVDR acoustic front end. The novel algorithm aspect is that in conventional frontend processing with PMVDR and VTLN, two separating warping phases are needed; while in the proposed BISN method only one single speaker dependent warp is used to achieve both the PMVDR perceptual warp and VTLN warp simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces computational requirements, thereby offering advantages for real-time ASR systems. Evaluations are performed for (i an in-car extended digit recognition task, where an on-the-fly BISN implementation reduces the relative word error rate (WER by 24%, and (ii for a diverse noisy speech task (SPINE 2, where the relative WER improvement was 9%, both relative to the baseline speaker normalization method.

  18. Infants' Selectively Pay Attention to the Information They Receive from a Native Speaker of Their Language.

    Science.gov (United States)

    Marno, Hanna; Guellai, Bahia; Vidal, Yamil; Franzoi, Julia; Nespor, Marina; Mehler, Jacques

    2016-01-01

    From the first moments of their life, infants show a preference for their native language, as well as toward speakers with whom they share the same language. This preference appears to have broad consequences in various domains later on, supporting group affiliations and collaborative actions in children. Here, we propose that infants' preference for native speakers of their language also serves a further purpose, specifically allowing them to efficiently acquire culture specific knowledge via social learning. By selectively attending to informants who are native speakers of their language and who probably also share the same cultural background with the infant, young learners can maximize the possibility to acquire cultural knowledge. To test whether infants would preferably attend the information they receive from a speaker of their native language, we familiarized 12-month-old infants with a native and a foreign speaker, and then presented them with movies where each of the speakers silently gazed toward unfamiliar objects. At test, infants' looking behavior to the two objects alone was measured. Results revealed that infants preferred to look longer at the object presented by the native speaker. Strikingly, the effect was replicated also with 5-month-old infants, indicating an early development of such preference. These findings provide evidence that young infants pay more attention to the information presented by a person with whom they share the same language. This selectivity can serve as a basis for efficient social learning by influencing how infants' allocate attention between potential sources of information in their environment.

  19. Switches to English during French Service Encounters: Relationships with L2 French Speakers' Willingness to Communicate and Motivation

    Science.gov (United States)

    McNaughton, Stephanie; McDonough, Kim

    2015-01-01

    This exploratory study investigated second language (L2) French speakers' service encounters in the multilingual setting of Montreal, specifically whether switches to English during French service encounters were related to L2 speakers' willingness to communicate or motivation. Over a two-week period, 17 French L2 speakers in Montreal submitted…

  20. Cattle identification based in biometric features of the muzzle

    OpenAIRE

    Monteiro, Marta; Cadavez, Vasco; Monteiro, Fernando C.

    2015-01-01

    Cattle identification has been a serious problem for breeding association. Muzzle pattern or nose print has the same characteristic with the human fingerprint which is the most popular biometric marker. The identification accuracy and the processing time are two key challenges of any cattle identification methodology. This paper presents a robust and fast cattle identification scheme from muzzle images using Speed-up Robust Features matching. The matching refinement technique based on the mat...

  1. Design and implementation of robust controllers for a gait trainer.

    Science.gov (United States)

    Wang, F C; Yu, C H; Chou, T Y

    2009-08-01

    This paper applies robust algorithms to control an active gait trainer for children with walking disabilities. Compared with traditional rehabilitation procedures, in which two or three trainers are required to assist the patient, a motor-driven mechanism was constructed to improve the efficiency of the procedures. First, a six-bar mechanism was designed and constructed to mimic the trajectory of children's ankles in walking. Second, system identification techniques were applied to obtain system transfer functions at different operating points by experiments. Third, robust control algorithms were used to design Hinfinity robust controllers for the system. Finally, the designed controllers were implemented to verify experimentally the system performance. From the results, the proposed robust control strategies are shown to be effective.

  2. Designing, Modeling, Constructing, and Testing a Flat Panel Speaker and Sound Diffuser for a Simulator

    Science.gov (United States)

    Dillon, Christina

    2013-01-01

    The goal of this project was to design, model, build, and test a flat panel speaker and frame for a spherical dome structure being made into a simulator. The simulator will be a test bed for evaluating an immersive environment for human interfaces. This project focused on the loud speakers and a sound diffuser for the dome. The rest of the team worked on an Ambisonics 3D sound system, video projection system, and multi-direction treadmill to create the most realistic scene possible. The main programs utilized in this project, were Pro-E and COMSOL. Pro-E was used for creating detailed figures for the fabrication of a frame that held a flat panel loud speaker. The loud speaker was made from a thin sheet of Plexiglas and 4 acoustic exciters. COMSOL, a multiphysics finite analysis simulator, was used to model and evaluate all stages of the loud speaker, frame, and sound diffuser. Acoustical testing measurements were utilized to create polar plots from the working prototype which were then compared to the COMSOL simulations to select the optimal design for the dome. The final goal of the project was to install the flat panel loud speaker design in addition to a sound diffuser on to the wall of the dome. After running tests in COMSOL on various speaker configurations, including a warped Plexiglas version, the optimal speaker design included a flat piece of Plexiglas with a rounded frame to match the curvature of the dome. Eight of these loud speakers will be mounted into an inch and a half of high performance acoustic insulation, or Thinsulate, that will cover the inside of the dome. The following technical paper discusses these projects and explains the engineering processes used, knowledge gained, and the projected future goals of this project

  3. Popular Public Discourse at Speakers' Corner: Negotiating Cultural Identities in Interaction

    DEFF Research Database (Denmark)

    McIlvenny, Paul

    1996-01-01

    In this paper I examine how cultural identities are actively negotiated in popular debate at a multicultural public setting in London. Speakers at Speakers' Corner manage the local construction of group affiliation, audience response and argument in and through talk, within the context of ethnic...... in which participant 'citizens' in the public sphere can actively struggle over cultural representation and identities. Using transcribed examples of video data recorded at Speakers' Corner my paper will examine how cultural identity is invoked in the management of active participation. Audiences...... and their affiliations are regulated and made accountable through the routines of membership categorisation and the policing of cultural identities and their imaginary borders....

  4. Robust and distributed hypothesis testing

    CERN Document Server

    Gül, Gökhan

    2017-01-01

    This book generalizes and extends the available theory in robust and decentralized hypothesis testing. In particular, it presents a robust test for modeling errors which is independent from the assumptions that a sufficiently large number of samples is available, and that the distance is the KL-divergence. Here, the distance can be chosen from a much general model, which includes the KL-divergence as a very special case. This is then extended by various means. A minimax robust test that is robust against both outliers as well as modeling errors is presented. Minimax robustness properties of the given tests are also explicitly proven for fixed sample size and sequential probability ratio tests. The theory of robust detection is extended to robust estimation and the theory of robust distributed detection is extended to classes of distributions, which are not necessarily stochastically bounded. It is shown that the quantization functions for the decision rules can also be chosen as non-monotone. Finally, the boo...

  5. Robustness of IPTV business models

    NARCIS (Netherlands)

    Bouwman, H.; Zhengjia, M.; Duin, P. van der; Limonard, S.

    2008-01-01

    The final stage in the STOF method is an evaluation of the robustness of the design, for which the method provides some guidelines. For many innovative services, the future holds numerous uncertainties, which makes evaluating the robustness of a business model a difficult task. In this chapter, we

  6. Robustness Evaluation of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard

    2009-01-01

    Robustness of structural systems has obtained a renewed interest due to a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure.......Robustness of structural systems has obtained a renewed interest due to a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure....

  7. Robust statistical methods with R

    CERN Document Server

    Jureckova, Jana

    2005-01-01

    Robust statistical methods were developed to supplement the classical procedures when the data violate classical assumptions. They are ideally suited to applied research across a broad spectrum of study, yet most books on the subject are narrowly focused, overly theoretical, or simply outdated. Robust Statistical Methods with R provides a systematic treatment of robust procedures with an emphasis on practical application.The authors work from underlying mathematical tools to implementation, paying special attention to the computational aspects. They cover the whole range of robust methods, including differentiable statistical functions, distance of measures, influence functions, and asymptotic distributions, in a rigorous yet approachable manner. Highlighting hands-on problem solving, many examples and computational algorithms using the R software supplement the discussion. The book examines the characteristics of robustness, estimators of real parameter, large sample properties, and goodness-of-fit tests. It...

  8. Speaker gaze increases information coupling between infant and adult brains.

    Science.gov (United States)

    Leong, Victoria; Byrne, Elizabeth; Clackson, Kaili; Georgieva, Stanimira; Lam, Sarah; Wass, Sam

    2017-12-12

    When infants and adults communicate, they exchange social signals of availability and communicative intention such as eye gaze. Previous research indicates that when communication is successful, close temporal dependencies arise between adult speakers' and listeners' neural activity. However, it is not known whether similar neural contingencies exist within adult-infant dyads. Here, we used dual-electroencephalography to assess whether direct gaze increases neural coupling between adults and infants during screen-based and live interactions. In experiment 1 ( n = 17), infants viewed videos of an adult who was singing nursery rhymes with ( i ) direct gaze (looking forward), ( ii ) indirect gaze (head and eyes averted by 20°), or ( iii ) direct-oblique gaze (head averted but eyes orientated forward). In experiment 2 ( n = 19), infants viewed the same adult in a live context, singing with direct or indirect gaze. Gaze-related changes in adult-infant neural network connectivity were measured using partial directed coherence. Across both experiments, the adult had a significant (Granger) causal influence on infants' neural activity, which was stronger during direct and direct-oblique gaze relative to indirect gaze. During live interactions, infants also influenced the adult more during direct than indirect gaze. Further, infants vocalized more frequently during live direct gaze, and individual infants who vocalized longer also elicited stronger synchronization from the adult. These results demonstrate that direct gaze strengthens bidirectional adult-infant neural connectivity during communication. Thus, ostensive social signals could act to bring brains into mutual temporal alignment, creating a joint-networked state that is structured to facilitate information transfer during early communication and learning. Copyright © 2017 the Author(s). Published by PNAS.

  9. The native-speaker fever in English language teaching (ELT: Pitting pedagogical competence against historical origin

    Directory of Open Access Journals (Sweden)

    Anchimbe, Eric A.

    2006-01-01

    Full Text Available This paper discusses English language teaching (ELT around the world, and argues that as a profession, it should emphasise pedagogical competence rather than native-speaker requirement in the recruitment of teachers in English as a foreign language (EFL and English as a second language (ESL contexts. It establishes that being a native speaker does not make one automatically a competent speaker or, of that matter, a competent teacher of the language. It observes that on many grounds, including physical, sociocultural, technological and economic changes in the world as well as the status of English as official and national language in many post-colonial regions, the distinction between native and non-native speakers is no longer valid.

  10. Testing Template and Testing Concept of Operations for Speaker Authentication Technology

    National Research Council Canada - National Science Library

    Sipko, Marek M

    2006-01-01

    This thesis documents the findings of developing a generic testing template and supporting concept of operations for speaker verification technology as part of the Iraqi Enrollment via Voice Authentication Project (IEVAP...

  11. Is the superior verbal memory span of Mandarin speakers due to faster rehearsal?

    Science.gov (United States)

    Mattys, Sven L; Baddeley, Alan; Trenkic, Danijela

    2018-04-01

    It is well established that digit span in native Chinese speakers is atypically high. This is commonly attributed to a capacity for more rapid subvocal rehearsal for that group. We explored this hypothesis by testing a group of English-speaking native Mandarin speakers on digit span and word span in both Mandarin and English, together with a measure of speed of articulation for each. When compared to the performance of native English speakers, the Mandarin group proved to be superior on both digit and word spans while predictably having lower spans in English. This suggests that the Mandarin advantage is not limited to digits. Speed of rehearsal correlated with span performance across materials. However, this correlation was more pronounced for English speakers than for any of the Chinese measures. Further analysis suggested that speed of rehearsal did not provide an adequate account of differences between Mandarin and English spans or for the advantage of digits over words. Possible alternative explanations are discussed.

  12. Artificially intelligent recognition of Arabic speaker using voice print-based local features

    Science.gov (United States)

    Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

    2016-11-01

    Local features for any pattern recognition system are based on the information extracted locally. In this paper, a local feature extraction technique was developed. This feature was extracted in the time-frequency plain by taking the moving average on the diagonal directions of the time-frequency plane. This feature captured the time-frequency events producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we referred to this technique as voice print-based local feature. The proposed feature was compared to other features including mel-frequency cepstral coefficient (MFCC) for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consisted of two short sentences uttered by 182 speakers. The proposed feature attained 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.

  13. Speaker-Sex Discrimination for Voiced and Whispered Vowels at Short Durations

    OpenAIRE

    Smith, David R. R.

    2016-01-01

    Whispered vowels, produced with no vocal fold vibration, lack the periodic temporal fine structure which in voiced vowels underlies the perceptual attribute of pitch (a salient auditory cue to speaker sex). Voiced vowels possess no temporal fine structure at very short durations (below two glottal cycles). The prediction was that speaker-sex discrimination performance for whispered and voiced vowels would be similar for very short durations but, as stimulus duration increases, voiced vowel pe...

  14. Does training make French speakers more able to identify lexical stress?

    OpenAIRE

    Schwab, Sandra; Llisterri, Joaquim

    2013-01-01

    This research takes the stress deafness hypothesis as a starting point (e.g. Dupoux et al., 2008), and, more specifically, the fact that French speakers present difficulties in perceiving lexical stress in a free-stress language. In this framework, we aim at determining whether a prosodic training could improve the ability of French speakers to identify the stressed syllable in Spanish words. Three groups of participants took part in this experiment. The Native group was composed of 16 speake...

  15. Robust loss functions for boosting.

    Science.gov (United States)

    Kanamori, Takafumi; Takenouchi, Takashi; Eguchi, Shinto; Murata, Noboru

    2007-08-01

    Boosting is known as a gradient descent algorithm over loss functions. It is often pointed out that the typical boosting algorithm, Adaboost, is highly affected by outliers. In this letter, loss functions for robust boosting are studied. Based on the concept of robust statistics, we propose a transformation of loss functions that makes boosting algorithms robust against extreme outliers. Next, the truncation of loss functions is applied to contamination models that describe the occurrence of mislabels near decision boundaries. Numerical experiments illustrate that the proposed loss functions derived from the contamination models are useful for handling highly noisy data in comparison with other loss functions.

  16. Theoretical Framework for Robustness Evaluation

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2011-01-01

    This paper presents a theoretical framework for evaluation of robustness of structural systems, incl. bridges and buildings. Typically modern structural design codes require that ‘the consequence of damages to structures should not be disproportional to the causes of the damages’. However, although...... the importance of robustness for structural design is widely recognized the code requirements are not specified in detail, which makes the practical use difficult. This paper describes a theoretical and risk based framework to form the basis for quantification of robustness and for pre-normative guidelines...

  17. Robustness of airline route networks

    Science.gov (United States)

    Lordan, Oriol; Sallan, Jose M.; Escorihuela, Nuria; Gonzalez-Prieto, David

    2016-03-01

    Airlines shape their route network by defining their routes through supply and demand considerations, paying little attention to network performance indicators, such as network robustness. However, the collapse of an airline network can produce high financial costs for the airline and all its geographical area of influence. The aim of this study is to analyze the topology and robustness of the network route of airlines following Low Cost Carriers (LCCs) and Full Service Carriers (FSCs) business models. Results show that FSC hubs are more central than LCC bases in their route network. As a result, LCC route networks are more robust than FSC networks.

  18. Presenting and processing information in background noise: A combined speaker-listener perspective.

    Science.gov (United States)

    Bockstael, Annelies; Samyn, Laurie; Corthals, Paul; Botteldooren, Dick

    2018-01-01

    Transferring information orally in background noise is challenging, for both speaker and listener. Successful transfer depends on complex interaction between characteristics related to listener, speaker, task, background noise, and context. To fully assess the underlying real-life mechanisms, experimental design has to mimic this complex reality. In the current study, the effects of different types of background noise have been studied in an ecologically valid test design. Documentary-style information had to be presented by the speaker and simultaneously acquired by the listener in four conditions: quiet, unintelligible multitalker babble, fluctuating city street noise, and little varying highway noise. For both speaker and listener, the primary task was to focus on the content that had to be transferred. In addition, for the speakers, the occurrence of hesitation phenomena was assessed. The listener had to perform an additional secondary task to address listening effort. For the listener the condition with the most eventful background noise, i.e., fluctuating city street noise, appeared to be the most difficult with markedly longer duration of the secondary task. In the same fluctuating background noise, speech appeared to be less disfluent, suggesting a higher level of concentration from the speaker's side.

  19. Speaker diarization system on the 2007 NIST rich transcription meeting recognition evaluation

    Science.gov (United States)

    Sun, Hanwu; Nwe, Tin Lay; Koh, Eugene Chin Wei; Bin, Ma; Li, Haizhou

    2007-09-01

    This paper presents a speaker diarization system developed at the Institute for Infocomm Research (I2R) for NIST Rich Transcription 2007 (RT-07) evaluation task. We describe in details our primary approaches for the speaker diarization on the Multiple Distant Microphones (MDM) conditions in conference room scenario. Our proposed system consists of six modules: 1). Least-mean squared (NLMS) adaptive filter for the speaker direction estimate via Time Difference of Arrival (TDOA), 2). An initial speaker clustering via two-stage TDOA histogram distribution quantization approach, 3). Multiple microphone speaker data alignment via GCC-PHAT Time Delay Estimate (TDE) among all the distant microphone channel signals, 4). A speaker clustering algorithm based on GMM modeling approach, 5). Non-speech removal via speech/non-speech verification mechanism and, 6). Silence removal via "Double-Layer Windowing"(DLW) method. We achieves error rate of 31.02% on the 2006 Spring (RT-06s) MDM evaluation task and a competitive overall error rate of 15.32% for the NIST Rich Transcription 2007 (RT-07) MDM evaluation task.

  20. Modulation of Lactobacillus plantarum gastrointestinal robustness by fermentation conditions enables identification of bacterial robustness markers

    NARCIS (Netherlands)

    Bokhorst-van de Veen, van H.; Lee, I.; Marco, M.L.; Bron, P.A.; Kleerebezem, M.

    2012-01-01

    Background - Lactic acid bacteria (LAB) are applied worldwide in the production of a variety of fermented food products. Additionally, specific Lactobacillus species are nowadays recognized for their health-promoting effects on the consumer. To optimally exert such beneficial effects, it is

  1. Voice patterns in adult English speakers with Autism Spectrum Disorder

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Lambrechts, Anna; Yarrow, Kielan

    Individuals with Autism Spectrum Disorder (ASD) often display atypical modulation of speech described as awkward, monotone, or sing-songy (Shriberg et al., 2001). These patterns are a robust signal of social communication deficit (Paul et al., 2005) and contribute to reaching a diagnosis of ASD...... to be re-trained. Conclusions: The current data suggest than ASD adults produce highly regular patterns of speech (as measured by pitch and pause distribution). Importantly this provides a quantifiable measurement to capture some of the clinical reports which contribute to reaching a diagnosis of autism...

  2. Robust identification of polyethylene terephthalate (PET plastics through Bayesian decision.

    Directory of Open Access Journals (Sweden)

    Mohd Asyraf Zulkifley

    Full Text Available Recycling is one of the most efficient methods for environmental friendly waste management. Among municipal wastes, plastics are the most common material that can be easily recycled and polyethylene terephthalate (PET is one of its major types. PET material is used in consumer goods packaging such as drinking bottles, toiletry containers, food packaging and many more. Usually, a recycling process is tailored to a specific material for optimal purification and decontamination to obtain high grade recyclable material. The quantity and quality of the sorting process are limited by the capacity of human workers that suffer from fatigue and boredom. Several automated sorting systems have been proposed in the literature that include using chemical, proximity and vision sensors. The main advantages of vision based sensors are its environmentally friendly approach, non-intrusive detection and capability of high throughput. However, the existing methods rely heavily on deterministic approaches that make them less accurate as the variations in PET plastic waste appearance are too high. We proposed a probabilistic approach of modeling the PET material by analyzing the reflection region and its surrounding. Three parameters are modeled by Gaussian and exponential distributions: color, size and distance of the reflection region. The final classification is made through a supervised training method of likelihood ratio test. The main novelty of the proposed method is the probabilistic approach in integrating various PET material signatures that are contaminated by stains under constant lighting changes. The system is evaluated by using four performance metrics: precision, recall, accuracy and error. Our system performed the best in all evaluation metrics compared to the benchmark methods. The system can be further improved by fusing all neighborhood information in decision making and by implementing the system in a graphics processing unit for faster processing speed.

  3. Robust identification of polyethylene terephthalate (PET) plastics through Bayesian decision.

    Science.gov (United States)

    Zulkifley, Mohd Asyraf; Mustafa, Mohd Marzuki; Hussain, Aini; Mustapha, Aouache; Ramli, Suzaimah

    2014-01-01

    Recycling is one of the most efficient methods for environmental friendly waste management. Among municipal wastes, plastics are the most common material that can be easily recycled and polyethylene terephthalate (PET) is one of its major types. PET material is used in consumer goods packaging such as drinking bottles, toiletry containers, food packaging and many more. Usually, a recycling process is tailored to a specific material for optimal purification and decontamination to obtain high grade recyclable material. The quantity and quality of the sorting process are limited by the capacity of human workers that suffer from fatigue and boredom. Several automated sorting systems have been proposed in the literature that include using chemical, proximity and vision sensors. The main advantages of vision based sensors are its environmentally friendly approach, non-intrusive detection and capability of high throughput. However, the existing methods rely heavily on deterministic approaches that make them less accurate as the variations in PET plastic waste appearance are too high. We proposed a probabilistic approach of modeling the PET material by analyzing the reflection region and its surrounding. Three parameters are modeled by Gaussian and exponential distributions: color, size and distance of the reflection region. The final classification is made through a supervised training method of likelihood ratio test. The main novelty of the proposed method is the probabilistic approach in integrating various PET material signatures that are contaminated by stains under constant lighting changes. The system is evaluated by using four performance metrics: precision, recall, accuracy and error. Our system performed the best in all evaluation metrics compared to the benchmark methods. The system can be further improved by fusing all neighborhood information in decision making and by implementing the system in a graphics processing unit for faster processing speed.

  4. Robust Identification of Polyethylene Terephthalate (PET) Plastics through Bayesian Decision

    Science.gov (United States)

    Zulkifley, Mohd Asyraf; Mustafa, Mohd Marzuki; Hussain, Aini; Mustapha, Aouache; Ramli, Suzaimah

    2014-01-01

    Recycling is one of the most efficient methods for environmental friendly waste management. Among municipal wastes, plastics are the most common material that can be easily recycled and polyethylene terephthalate (PET) is one of its major types. PET material is used in consumer goods packaging such as drinking bottles, toiletry containers, food packaging and many more. Usually, a recycling process is tailored to a specific material for optimal purification and decontamination to obtain high grade recyclable material. The quantity and quality of the sorting process are limited by the capacity of human workers that suffer from fatigue and boredom. Several automated sorting systems have been proposed in the literature that include using chemical, proximity and vision sensors. The main advantages of vision based sensors are its environmentally friendly approach, non-intrusive detection and capability of high throughput. However, the existing methods rely heavily on deterministic approaches that make them less accurate as the variations in PET plastic waste appearance are too high. We proposed a probabilistic approach of modeling the PET material by analyzing the reflection region and its surrounding. Three parameters are modeled by Gaussian and exponential distributions: color, size and distance of the reflection region. The final classification is made through a supervised training method of likelihood ratio test. The main novelty of the proposed method is the probabilistic approach in integrating various PET material signatures that are contaminated by stains under constant lighting changes. The system is evaluated by using four performance metrics: precision, recall, accuracy and error. Our system performed the best in all evaluation metrics compared to the benchmark methods. The system can be further improved by fusing all neighborhood information in decision making and by implementing the system in a graphics processing unit for faster processing speed. PMID:25485630

  5. Identification of Robust Terminal-Area Routes in Convective Weather

    Science.gov (United States)

    Pfeil, Diana Michalek; Balakrishnan, Hamsa

    2012-01-01

    Convective weather is responsible for large delays and widespread disruptions in the U.S. National Airspace System, especially during summer. Traffic flow management algorithms require reliable forecasts of route blockage to schedule and route traffic. This paper demonstrates how raw convective weather forecasts, which provide deterministic predictions of the vertically integrated liquid (the precipitation content in a column of airspace) can be translated into probabilistic forecasts of whether or not a terminal area route will be blocked. Given a flight route through the terminal area, we apply techniques from machine learning to determine the likelihood that the route will be open in actual weather. The likelihood is then used to optimize terminalarea operations by dynamically moving arrival and departure routes to maximize the expected capacity of the terminal area. Experiments using real weather scenarios on stormy days show that our algorithms recommend that a terminal-area route be modified 30% of the time, opening up 13% more available routes that were forecast to be blocked during these scenarios. The error rate is low, with only 5% of cases corresponding to a modified route being blocked in reality, whereas the original route is in fact open. In addition, for routes predicted to be open with probability 0.95 or greater by our method, 96% of these routes (on average over time horizon) are indeed open in the weather that materializes

  6. Identification of robust synthon in the molecular salts of 2 ...

    Indian Academy of Sciences (India)

    Crystal engineering has drawn the attention of the sci- entific community ... organic synthesis,2 host-guest chemistry,3 photographic film formulation,4 ... 2.1 Materials and general methods ... absorption correction was applied to the collected.

  7. Robust Portfolio Optimization Using Pseudodistances.

    Science.gov (United States)

    Toma, Aida; Leoni-Aubin, Samuela

    2015-01-01

    The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature.

  8. Robust methods for data reduction

    CERN Document Server

    Farcomeni, Alessio

    2015-01-01

    Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis.The first part of the book illustrates how dimension reduction techniques synthesize available information by reducing the dimensionality of the data. The second part focuses on cluster and discriminant analy

  9. Robust hashing for 3D models

    Science.gov (United States)

    Berchtold, Waldemar; Schäfer, Marcel; Rettig, Michael; Steinebach, Martin

    2014-02-01

    3D models and applications are of utmost interest in both science and industry. With the increment of their usage, their number and thereby the challenge to correctly identify them increases. Content identification is commonly done by cryptographic hashes. However, they fail as a solution in application scenarios such as computer aided design (CAD), scientific visualization or video games, because even the smallest alteration of the 3D model, e.g. conversion or compression operations, massively changes the cryptographic hash as well. Therefore, this work presents a robust hashing algorithm for 3D mesh data. The algorithm applies several different bit extraction methods. They are built to resist desired alterations of the model as well as malicious attacks intending to prevent correct allocation. The different bit extraction methods are tested against each other and, as far as possible, the hashing algorithm is compared to the state of the art. The parameters tested are robustness, security and runtime performance as well as False Acceptance Rate (FAR) and False Rejection Rate (FRR), also the probability calculation of hash collision is included. The introduced hashing algorithm is kept adaptive e.g. in hash length, to serve as a proper tool for all applications in practice.

  10. Proficiency in English sentence stress production by Cantonese speakers who speak English as a second language (ESL).

    Science.gov (United States)

    Ng, Manwa L; Chen, Yang

    2011-12-01

    The present study examined English sentence stress produced by native Cantonese speakers who were speaking English as a second language (ESL). Cantonese ESL speakers' proficiency in English stress production as perceived by English-speaking listeners was also studied. Acoustical parameters associated with sentence stress including fundamental frequency (F0), vowel duration, and intensity were measured from the English sentences produced by 40 Cantonese ESL speakers. Data were compared with those obtained from 40 native speakers of American English. The speech samples were also judged by eight native listeners who were native speakers of American English for placement, degree, and naturalness of stress. Results showed that Cantonese ESL speakers were able to use F0, vowel duration, and intensity to differentiate sentence stress patterns. Yet, both female and male Cantonese ESL speakers exhibited consistently higher F0 in stressed words than English speakers. Overall, Cantonese ESL speakers were found to be proficient in using duration and intensity to signal sentence stress, in a way comparable with English speakers. In addition, F0 and intensity were found to correlate closely with perceptual judgement and the degree of stress with the naturalness of stress.

  11. Complimenting Functions by Native English Speakers and Iranian EFL Learners: A Divergence or Convergence

    Directory of Open Access Journals (Sweden)

    Ali Akbar Ansarin

    2016-01-01

    Full Text Available The study of compliment speech act has been under investigation on many occasions in recent years. In this study, an attempt is made to explore appraisals performed by native English speakers and Iranian EFL learners to find out how these two groups diverge or converge from each other with regard to complimenting patterns and norms. The participants of the study were 60 advanced Iranian EFL learners who were speaking Persian as their first language and 60 native English speakers. Through a written Discourse Completion Task comprised of eight different scenarios, compliments were analyzed with regard to topics (performance, personality, possession, and skill, functions (explicit, implicit, and opt-out, gender differences and the common positive adjectives used by two groups of native and nonnative participants. The findings suggested that native English speakers praised individuals more implicitly in comparison with Iranian EFL learners and native speakers provided opt-outs more frequently than Iranian EFL learners did. The analysis of data by Chi-square showed that gender and macro functions are independent of each other among Iranian EFL learners’ compliments while for native speakers, gender played a significant role in the distribution of appraisals. Iranian EFL learners’ complimenting patterns converge more towards those of native English speakers. Moreover, both groups favored explicit compliments. However, Iranian EFL learners were more inclined to provide explicit compliments. It can be concluded that there were more similarities rather than differences between Iranian EFL learners and native English speakers regarding compliment speech act. The results of this study can benefit researchers, teachers, material developers, and EFL learners.

  12. Physiological Indices of Bilingualism: Oral–Motor Coordination and Speech Rate in Bengali–English Speakers

    Science.gov (United States)

    Chakraborty, Rahul; Goffman, Lisa; Smith, Anne

    2009-01-01

    Purpose To examine how age of immersion and proficiency in a 2nd language influence speech movement variability and speaking rate in both a 1st language and a 2nd language. Method A group of 21 Bengali–English bilingual speakers participated. Lip and jaw movements were recorded. For all 21 speakers, lip movement variability was assessed based on productions of Bengali (L1; 1st language) and English (L2; 2nd language) sentences. For analyses related to the influence of L2 proficiency on speech production processes, participants were sorted into low- (n = 7) and high-proficiency (n = 7) groups. Lip movement variability and speech rate were evaluated for both of these groups across L1 and L2 sentences. Results Surprisingly, adult bilingual speakers produced equally consistent speech movement patterns in their production of L1 and L2. When groups were sorted according to proficiency, highly proficient speakers were marginally more variable in their L1. In addition, there were some phoneme-specific effects, most markedly that segments not shared by both languages were treated differently in production. Consistent with previous studies, movement durations were longer for less proficient speakers in both L1 and L2. Interpretation In contrast to those of child learners, the speech motor systems of adult L2 speakers show a high degree of consistency. Such lack of variability presumably contributes to protracted difficulties with acquiring nativelike pronunciation in L2. The proficiency results suggest bidirectional interactions across L1 and L2, which is consistent with hypotheses regarding interference and the sharing of phonological space. A slower speech rate in less proficient speakers implies that there are increased task demands on speech production processes. PMID:18367680

  13. Robust design optimization using the price of robustness, robust least squares and regularization methods

    Science.gov (United States)

    Bukhari, Hassan J.

    2017-12-01

    In this paper a framework for robust optimization of mechanical design problems and process systems that have parametric uncertainty is presented using three different approaches. Robust optimization problems are formulated so that the optimal solution is robust which means it is minimally sensitive to any perturbations in parameters. The first method uses the price of robustness approach which assumes the uncertain parameters to be symmetric and bounded. The robustness for the design can be controlled by limiting the parameters that can perturb.The second method uses the robust least squares method to determine the optimal parameters when data itself is subjected to perturbations instead of the parameters. The last method manages uncertainty by restricting the perturbation on parameters to improve sensitivity similar to Tikhonov regularization. The methods are implemented on two sets of problems; one linear and the other non-linear. This methodology will be compared with a prior method using multiple Monte Carlo simulation runs which shows that the approach being presented in this paper results in better performance.

  14. Wavelet Filtering to Reduce Conservatism in Aeroservoelastic Robust Stability Margins

    Science.gov (United States)

    Brenner, Marty; Lind, Rick

    1998-01-01

    Wavelet analysis for filtering and system identification was used to improve the estimation of aeroservoelastic stability margins. The conservatism of the robust stability margins was reduced with parametric and nonparametric time-frequency analysis of flight data in the model validation process. Nonparametric wavelet processing of data was used to reduce the effects of external desirableness and unmodeled dynamics. Parametric estimates of modal stability were also extracted using the wavelet transform. Computation of robust stability margins for stability boundary prediction depends on uncertainty descriptions derived from the data for model validation. F-18 high Alpha Research Vehicle aeroservoelastic flight test data demonstrated improved robust stability prediction by extension of the stability boundary beyond the flight regime.

  15. Phoneme Error Pattern by Heritage Speakers of Spanish on an English Word Recognition Test.

    Science.gov (United States)

    Shi, Lu-Feng

    2017-04-01

    Heritage speakers acquire their native language from home use in their early childhood. As the native language is typically a minority language in the society, these individuals receive their formal education in the majority language and eventually develop greater competency with the majority than their native language. To date, there have not been specific research attempts to understand word recognition by heritage speakers. It is not clear if and to what degree we may infer from evidence based on bilingual listeners in general. This preliminary study investigated how heritage speakers of Spanish perform on an English word recognition test and analyzed their phoneme errors. A prospective, cross-sectional, observational design was employed. Twelve normal-hearing adult Spanish heritage speakers (four men, eight women, 20-38 yr old) participated in the study. Their language background was obtained through the Language Experience and Proficiency Questionnaire. Nine English monolingual listeners (three men, six women, 20-41 yr old) were also included for comparison purposes. Listeners were presented with 200 Northwestern University Auditory Test No. 6 words in quiet. They repeated each word orally and in writing. Their responses were scored by word, word-initial consonant, vowel, and word-final consonant. Performance was compared between groups with Student's t test or analysis of variance. Group-specific error patterns were primarily descriptive, but intergroup comparisons were made using 95% or 99% confidence intervals for proportional data. The two groups of listeners yielded comparable scores when their responses were examined by word, vowel, and final consonant. However, heritage speakers of Spanish misidentified significantly more word-initial consonants and had significantly more difficulty with initial /p, b, h/ than their monolingual peers. The two groups yielded similar patterns for vowel and word-final consonants, but heritage speakers made significantly

  16. Combining Behavioral and ERP Methodologies to Investigate the Differences Between McGurk Effects Demonstrated by Cantonese and Mandarin Speakers

    Directory of Open Access Journals (Sweden)

    Juan Zhang

    2018-05-01

    Full Text Available The present study investigated the impact of Chinese dialects on McGurk effect using behavioral and event-related potential (ERP methodologies. Specifically, intra-language comparison of McGurk effect was conducted between Mandarin and Cantonese speakers. The behavioral results showed that Cantonese speakers exhibited a stronger McGurk effect in audiovisual speech perception compared to Mandarin speakers, although both groups performed equally in the auditory and visual conditions. ERP results revealed that Cantonese speakers were more sensitive to visual cues than Mandarin speakers, though this was not the case for the auditory cues. Taken together, the current findings suggest that the McGurk effect generated by Chinese speakers is mainly influenced by segmental phonology during audiovisual speech integration.

  17. Combining Behavioral and ERP Methodologies to Investigate the Differences Between McGurk Effects Demonstrated by Cantonese and Mandarin Speakers

    Science.gov (United States)

    Zhang, Juan; Meng, Yaxuan; McBride, Catherine; Fan, Xitao; Yuan, Zhen

    2018-01-01

    The present study investigated the impact of Chinese dialects on McGurk effect using behavioral and event-related potential (ERP) methodologies. Specifically, intra-language comparison of McGurk effect was conducted between Mandarin and Cantonese speakers. The behavioral results showed that Cantonese speakers exhibited a stronger McGurk effect in audiovisual speech perception compared to Mandarin speakers, although both groups performed equally in the auditory and visual conditions. ERP results revealed that Cantonese speakers were more sensitive to visual cues than Mandarin speakers, though this was not the case for the auditory cues. Taken together, the current findings suggest that the McGurk effect generated by Chinese speakers is mainly influenced by segmental phonology during audiovisual speech integration. PMID:29780312

  18. Musical Sophistication and the Effect of Complexity on Auditory Discrimination in Finnish Speakers

    Science.gov (United States)

    Dawson, Caitlin; Aalto, Daniel; Šimko, Juraj; Vainio, Martti; Tervaniemi, Mari

    2017-01-01

    Musical experiences and native language are both known to affect auditory processing. The present work aims to disentangle the influences of native language phonology and musicality on behavioral and subcortical sound feature processing in a population of musically diverse Finnish speakers as well as to investigate the specificity of enhancement from musical training. Finnish speakers are highly sensitive to duration cues since in Finnish, vowel and consonant duration determine word meaning. Using a correlational approach with a set of behavioral sound feature discrimination tasks, brainstem recordings, and a musical sophistication questionnaire, we find no evidence for an association between musical sophistication and more precise duration processing in Finnish speakers either in the auditory brainstem response or in behavioral tasks, but they do show an enhanced pitch discrimination compared to Finnish speakers with less musical experience and show greater duration modulation in a complex task. These results are consistent with a ceiling effect set for certain sound features which corresponds to the phonology of the native language, leaving an opportunity for music experience-based enhancement of sound features not explicitly encoded in the language (such as pitch, which is not explicitly encoded in Finnish). Finally, the pattern of duration modulation in more musically sophisticated Finnish speakers suggests integrated feature processing for greater efficiency in a real world musical situation. These results have implications for research into the specificity of plasticity in the auditory system as well as to the effects of interaction of specific language features with musical experiences. PMID:28450829

  19. Musical Sophistication and the Effect of Complexity on Auditory Discrimination in Finnish Speakers.

    Science.gov (United States)

    Dawson, Caitlin; Aalto, Daniel; Šimko, Juraj; Vainio, Martti; Tervaniemi, Mari

    2017-01-01

    Musical experiences and native language are both known to affect auditory processing. The present work aims to disentangle the influences of native language phonology and musicality on behavioral and subcortical sound feature processing in a population of musically diverse Finnish speakers as well as to investigate the specificity of enhancement from musical training. Finnish speakers are highly sensitive to duration cues since in Finnish, vowel and consonant duration determine word meaning. Using a correlational approach with a set of behavioral sound feature discrimination tasks, brainstem recordings, and a musical sophistication questionnaire, we find no evidence for an association between musical sophistication and more precise duration processing in Finnish speakers either in the auditory brainstem response or in behavioral tasks, but they do show an enhanced pitch discrimination compared to Finnish speakers with less musical experience and show greater duration modulation in a complex task. These results are consistent with a ceiling effect set for certain sound features which corresponds to the phonology of the native language, leaving an opportunity for music experience-based enhancement of sound features not explicitly encoded in the language (such as pitch, which is not explicitly encoded in Finnish). Finally, the pattern of duration modulation in more musically sophisticated Finnish speakers suggests integrated feature processing for greater efficiency in a real world musical situation. These results have implications for research into the specificity of plasticity in the auditory system as well as to the effects of interaction of specific language features with musical experiences.

  20. Encoding, rehearsal, and recall in signers and speakers: shared network but differential engagement.

    Science.gov (United States)

    Bavelier, D; Newman, A J; Mukherjee, M; Hauser, P; Kemeny, S; Braun, A; Boutla, M

    2008-10-01

    Short-term memory (STM), or the ability to hold verbal information in mind for a few seconds, is known to rely on the integrity of a frontoparietal network of areas. Here, we used functional magnetic resonance imaging to ask whether a similar network is engaged when verbal information is conveyed through a visuospatial language, American Sign Language, rather than speech. Deaf native signers and hearing native English speakers performed a verbal recall task, where they had to first encode a list of letters in memory, maintain it for a few seconds, and finally recall it in the order presented. The frontoparietal network described to mediate STM in speakers was also observed in signers, with its recruitment appearing independent of the modality of the language. This finding supports the view that signed and spoken STM rely on similar mechanisms. However, deaf signers and hearing speakers differentially engaged key structures of the frontoparietal network as the stages of STM unfold. In particular, deaf signers relied to a greater extent than hearing speakers on passive memory storage areas during encoding and maintenance, but on executive process areas during recall. This work opens new avenues for understanding similarities and differences in STM performance in signers and speakers.

  1. Analysis of Acoustic Features in Speakers with Cognitive Disorders and Speech Impairments

    Science.gov (United States)

    Saz, Oscar; Simón, Javier; Rodríguez, W. Ricardo; Lleida, Eduardo; Vaquero, Carlos

    2009-12-01

    This work presents the results in the analysis of the acoustic features (formants and the three suprasegmental features: tone, intensity and duration) of the vowel production in a group of 14 young speakers suffering different kinds of speech impairments due to physical and cognitive disorders. A corpus with unimpaired children's speech is used to determine the reference values for these features in speakers without any kind of speech impairment within the same domain of the impaired speakers; this is 57 isolated words. The signal processing to extract the formant and pitch values is based on a Linear Prediction Coefficients (LPCs) analysis of the segments considered as vowels in a Hidden Markov Model (HMM) based Viterbi forced alignment. Intensity and duration are also based in the outcome of the automated segmentation. As main conclusion of the work, it is shown that intelligibility of the vowel production is lowered in impaired speakers even when the vowel is perceived as correct by human labelers. The decrease in intelligibility is due to a 30% of increase in confusability in the formants map, a reduction of 50% in the discriminative power in energy between stressed and unstressed vowels and to a 50% increase of the standard deviation in the length of the vowels. On the other hand, impaired speakers keep good control of tone in the production of stressed and unstressed vowels.

  2. Age differences in vocal emotion perception: on the role of speaker age and listener sex.

    Science.gov (United States)

    Sen, Antarika; Isaacowitz, Derek; Schirmer, Annett

    2017-10-24

    Older adults have greater difficulty than younger adults perceiving vocal emotions. To better characterise this effect, we explored its relation to age differences in sensory, cognitive and emotional functioning. Additionally, we examined the role of speaker age and listener sex. Participants (N = 163) aged 19-34 years and 60-85 years categorised neutral sentences spoken by ten younger and ten older speakers with a happy, neutral, sad, or angry voice. Acoustic analyses indicated that expressions from younger and older speakers denoted the intended emotion with similar accuracy. As expected, younger participants outperformed older participants and this effect was statistically mediated by an age-related decline in both optimism and working-memory. Additionally, age differences in emotion perception were larger for younger as compared to older speakers and a better perception of younger as compared to older speakers was greater in younger as compared to older participants. Last, a female perception benefit was less pervasive in the older than the younger group. Together, these findings suggest that the role of age for emotion perception is multi-faceted. It is linked to emotional and cognitive change, to processing biases that benefit young and own-age expressions, and to the different aptitudes of women and men.

  3. Advances in robust fractional control

    CERN Document Server

    Padula, Fabrizio

    2015-01-01

    This monograph presents design methodologies for (robust) fractional control systems. It shows the reader how to take advantage of the superior flexibility of fractional control systems compared with integer-order systems in achieving more challenging control requirements. There is a high degree of current interest in fractional systems and fractional control arising from both academia and industry and readers from both milieux are catered to in the text. Different design approaches having in common a trade-off between robustness and performance of the control system are considered explicitly. The text generalizes methodologies, techniques and theoretical results that have been successfully applied in classical (integer) control to the fractional case. The first part of Advances in Robust Fractional Control is the more industrially-oriented. It focuses on the design of fractional controllers for integer processes. In particular, it considers fractional-order proportional-integral-derivative controllers, becau...

  4. Robustness of digital artist authentication

    DEFF Research Database (Denmark)

    Jacobsen, Robert; Nielsen, Morten

    In many cases it is possible to determine the authenticity of a painting from digital reproductions of the paintings; this has been demonstrated for a variety of artists and with different approaches. Common to all these methods in digital artist authentication is that the potential of the method...... is in focus, while the robustness has not been considered, i.e. the degree to which the data collection process influences the decision of the method. However, in order for an authentication method to be successful in practice, it needs to be robust to plausible error sources from the data collection....... In this paper we investigate the robustness of the newly proposed authenticity method introduced by the authors based on second generation multiresolution analysis. This is done by modelling a number of realistic factors that can occur in the data collection....

  5. Attractive ellipsoids in robust control

    CERN Document Server

    Poznyak, Alexander; Azhmyakov, Vadim

    2014-01-01

    This monograph introduces a newly developed robust-control design technique for a wide class of continuous-time dynamical systems called the “attractive ellipsoid method.” Along with a coherent introduction to the proposed control design and related topics, the monograph studies nonlinear affine control systems in the presence of uncertainty and presents a constructive and easily implementable control strategy that guarantees certain stability properties. The authors discuss linear-style feedback control synthesis in the context of the above-mentioned systems. The development and physical implementation of high-performance robust-feedback controllers that work in the absence of complete information is addressed, with numerous examples to illustrate how to apply the attractive ellipsoid method to mechanical and electromechanical systems. While theorems are proved systematically, the emphasis is on understanding and applying the theory to real-world situations. Attractive Ellipsoids in Robust Control will a...

  6. Robustness of holonomic quantum gates

    International Nuclear Information System (INIS)

    Solinas, P.; Zanardi, P.; Zanghi, N.

    2005-01-01

    Full text: If the driving field fluctuates during the quantum evolution this produces errors in the applied operator. The holonomic (and geometrical) quantum gates are believed to be robust against some kind of noise. Because of the geometrical dependence of the holonomic operators can be robust against this kind of noise; in fact if the fluctuations are fast enough they cancel out leaving the final operator unchanged. I present the numerical studies of holonomic quantum gates subject to this parametric noise, the fidelity of the noise and ideal evolution is calculated for different noise correlation times. The holonomic quantum gates seem robust not only for fast fluctuating fields but also for slow fluctuating fields. These results can be explained as due to the geometrical feature of the holonomic operator: for fast fluctuating fields the fluctuations are canceled out, for slow fluctuating fields the fluctuations do not perturb the loop in the parameter space. (author)

  7. Robust estimation and hypothesis testing

    CERN Document Server

    Tiku, Moti L

    2004-01-01

    In statistical theory and practice, a certain distribution is usually assumed and then optimal solutions sought. Since deviations from an assumed distribution are very common, one cannot feel comfortable with assuming a particular distribution and believing it to be exactly correct. That brings the robustness issue in focus. In this book, we have given statistical procedures which are robust to plausible deviations from an assumed mode. The method of modified maximum likelihood estimation is used in formulating these procedures. The modified maximum likelihood estimators are explicit functions of sample observations and are easy to compute. They are asymptotically fully efficient and are as efficient as the maximum likelihood estimators for small sample sizes. The maximum likelihood estimators have computational problems and are, therefore, elusive. A broad range of topics are covered in this book. Solutions are given which are easy to implement and are efficient. The solutions are also robust to data anomali...

  8. Robustness in Railway Operations (RobustRailS)

    DEFF Research Database (Denmark)

    Jensen, Jens Parbo; Nielsen, Otto Anker

    This study considers the problem of enhancing railway timetable robustness without adding slack time, hence increasing the travel time. The approach integrates a transit assignment model to assess how passengers adapt their behaviour whenever operations are changed. First, the approach considers...

  9. A Robust Design Applicability Model

    DEFF Research Database (Denmark)

    Ebro, Martin; Lars, Krogstie; Howard, Thomas J.

    2015-01-01

    to be applicable in organisations assigning a high importance to one or more factors that are known to be impacted by RD, while also experiencing a high level of occurrence of this factor. The RDAM supplements existing maturity models and metrics to provide a comprehensive set of data to support management......This paper introduces a model for assessing the applicability of Robust Design (RD) in a project or organisation. The intention of the Robust Design Applicability Model (RDAM) is to provide support for decisions by engineering management considering the relevant level of RD activities...

  10. Ins-Robust Primitive Words

    OpenAIRE

    Srivastava, Amit Kumar; Kapoor, Kalpesh

    2017-01-01

    Let Q be the set of primitive words over a finite alphabet with at least two symbols. We characterize a class of primitive words, Q_I, referred to as ins-robust primitive words, which remain primitive on insertion of any letter from the alphabet and present some properties that characterizes words in the set Q_I. It is shown that the language Q_I is dense. We prove that the language of primitive words that are not ins-robust is not context-free. We also present a linear time algorithm to reco...

  11. Simultaneous Assessment of Speech Identification and Spatial Discrimination

    Directory of Open Access Journals (Sweden)

    Jennifer K. Bizley

    2015-12-01

    Full Text Available With increasing numbers of children and adults receiving bilateral cochlear implants, there is an urgent need for assessment tools that enable testing of binaural hearing abilities. Current test batteries are either limited in scope or are of an impractical duration for routine testing. Here, we report a behavioral test that enables combined testing of speech identification and spatial discrimination in noise. In this task, multitalker babble was presented from all speakers, and pairs of speech tokens were sequentially presented from two adjacent speakers. Listeners were required to identify both words from a closed set of four possibilities and to determine whether the second token was presented to the left or right of the first. In Experiment 1, normal-hearing adult listeners were tested at 15° intervals throughout the frontal hemifield. Listeners showed highest spatial discrimination performance in and around the frontal midline, with a decline at more eccentric locations. In contrast, speech identification abilities were least accurate near the midline and showed an improvement in performance at more lateral locations. In Experiment 2, normal-hearing listeners were assessed using a restricted range of speaker locations designed to match those found in clinical testing environments. Here, speakers were separated by 15° around the midline and 30° at more lateral locations. This resulted in a similar pattern of behavioral results as in Experiment 1. We conclude, this test offers the potential to assess both spatial discrimination and the ability to use spatial information for unmasking in clinical populations.

  12. Improved autonomous star identification algorithm

    International Nuclear Information System (INIS)

    Luo Li-Yan; Xu Lu-Ping; Zhang Hua; Sun Jing-Rong

    2015-01-01

    The log–polar transform (LPT) is introduced into the star identification because of its rotation invariance. An improved autonomous star identification algorithm is proposed in this paper to avoid the circular shift of the feature vector and to reduce the time consumed in the star identification algorithm using LPT. In the proposed algorithm, the star pattern of the same navigation star remains unchanged when the stellar image is rotated, which makes it able to reduce the star identification time. The logarithmic values of the plane distances between the navigation and its neighbor stars are adopted to structure the feature vector of the navigation star, which enhances the robustness of star identification. In addition, some efforts are made to make it able to find the identification result with fewer comparisons, instead of searching the whole feature database. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition rate and robustness by the proposed algorithm are better than those by the LPT algorithm and the modified grid algorithm. (paper)

  13. Speaker information affects false recognition of unstudied lexical-semantic associates.

    Science.gov (United States)

    Luthra, Sahil; Fox, Neal P; Blumstein, Sheila E

    2018-05-01

    Recognition of and memory for a spoken word can be facilitated by a prior presentation of that word spoken by the same talker. However, it is less clear whether this speaker congruency advantage generalizes to facilitate recognition of unheard related words. The present investigation employed a false memory paradigm to examine whether information about a speaker's identity in items heard by listeners could influence the recognition of novel items (critical intruders) phonologically or semantically related to the studied items. In Experiment 1, false recognition of semantically associated critical intruders was sensitive to speaker information, though only when subjects attended to talker identity during encoding. Results from Experiment 2 also provide some evidence that talker information affects the false recognition of critical intruders. Taken together, the present findings indicate that indexical information is able to contact the lexical-semantic network to affect the processing of unheard words.

  14. Perceptual and acoustic analysis of lexical stress in Greek speakers with dysarthria.

    Science.gov (United States)

    Papakyritsis, Ioannis; Müller, Nicole

    2014-01-01

    The study reported in this paper investigated the abilities of Greek speakers with dysarthria to signal lexical stress at the single word level. Three speakers with dysarthria and two unimpaired control participants were recorded completing a repetition task of a list of words consisting of minimal pairs of Greek disyllabic words contrasted by lexical stress location only. Fourteen listeners were asked to determine the attempted stress location for each word pair. Acoustic analyses of duration and intensity ratios, both within and across words, were undertaken to identify possible acoustic correlates of the listeners' judgments concerning stress location. Acoustic and perceptual data indicate that while each participant with dysarthria in this study had some difficulty in signaling stress unambiguously, the pattern of difficulty was different for each speaker. Further, it was found that the relationship between the listeners' judgments of stress location and the acoustic data was not conclusive.

  15. Speaker-Sex Discrimination for Voiced and Whispered Vowels at Short Durations.

    Science.gov (United States)

    Smith, David R R

    2016-01-01

    Whispered vowels, produced with no vocal fold vibration, lack the periodic temporal fine structure which in voiced vowels underlies the perceptual attribute of pitch (a salient auditory cue to speaker sex). Voiced vowels possess no temporal fine structure at very short durations (below two glottal cycles). The prediction was that speaker-sex discrimination performance for whispered and voiced vowels would be similar for very short durations but, as stimulus duration increases, voiced vowel performance would improve relative to whispered vowel performance as pitch information becomes available. This pattern of results was shown for women's but not for men's voices. A whispered vowel needs to have a duration three times longer than a voiced vowel before listeners can reliably tell whether it's spoken by a man or woman (∼30 ms vs. ∼10 ms). Listeners were half as sensitive to information about speaker-sex when it is carried by whispered compared with voiced vowels.

  16. Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification

    DEFF Research Database (Denmark)

    Sarkar, Achintya Kumar; Tan, Zheng-Hua

    2017-01-01

    In this paper, we present a time-contrastive learning (TCL) based bottleneck (BN) feature extraction method for speech signals with an application to text-dependent (TD) speaker verification (SV). It is well-known that speech signals exhibit quasi-stationary behavior in and only in a short interval......, and the TCL method aims to exploit this temporal structure. More specifically, it trains deep neural networks (DNNs) to discriminate temporal events obtained by uniformly segmenting speech signals, in contrast to existing DNN based BN feature extraction methods that train DNNs using labeled data...... to discriminate speakers or pass-phrases or phones or a combination of them. In the context of speaker verification, speech data of fixed pass-phrases are used for TCL-BN training, while the pass-phrases used for TCL-BN training are excluded from being used for SV, so that the learned features can be considered...

  17. Forensic Automatic Speaker Recognition Based on Likelihood Ratio Using Acoustic-phonetic Features Measured Automatically

    Directory of Open Access Journals (Sweden)

    Huapeng Wang

    2015-01-01

    Full Text Available Forensic speaker recognition is experiencing a remarkable paradigm shift in terms of the evaluation framework and presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition using the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate the within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested using two Mandarin databases: A mobile telephone database and a landline database. The experiment's results indicate that these acoustic-phonetic features do have some discriminating potential and are worth trying in discrimination. The automatic acoustic-phonetic features have acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kind of voice features.

  18. Language control in different contexts: the behavioural ecology of bilingual speakers

    Directory of Open Access Journals (Sweden)

    David William Green

    2011-05-01

    Full Text Available This paper proposes that different experimental contexts (single or dual language contexts permit different neural loci at which words in the target language can be selected. However, in order to develop a fuller understanding of the neural circuit mediating language control we need to consider the community context in which bilingual speakers typically use their two languages (the behavioural ecology of bilingual speakers. The contrast between speakers from code-switching and non-code switching communities offers a way to increase our understanding of the cortical, subcortical and, in particular, cerebellar structures involved in language control. It will also help us identify the non-verbal behavioural correlates associated with these control processes.

  19. Speaker transfer in children's peer conversation: completing communication-aid-mediated contributions.

    Science.gov (United States)

    Clarke, Michael; Bloch, Steven; Wilkinson, Ray

    2013-03-01

    Managing the exchange of speakers from one person to another effectively is a key issue for participants in everyday conversational interaction. Speakers use a range of resources to indicate, in advance, when their turn will come to an end, and listeners attend to such signals in order to know when they might legitimately speak. Using the principles and findings from conversation analysis, this paper examines features of speaker transfer in a conversation between a boy with cerebral palsy who has been provided with a voice-output communication aid (VOCA), and a peer without physical or communication difficulties. Specifically, the analysis focuses on turn exchange, where a VOCA-mediated contribution approach completion, and the child without communication needs is due to speak next.

  20. Infant sensitivity to speaker and language in learning a second label.

    Science.gov (United States)

    Bhagwat, Jui; Casasola, Marianella

    2014-02-01

    Two experiments examined when monolingual, English-learning 19-month-old infants learn a second object label. Two experimenters sat together. One labeled a novel object with one novel label, whereas the other labeled the same object with a different label in either the same or a different language. Infants were tested on their comprehension of each label immediately following its presentation. Infants mapped the first label at above chance levels, but they did so with the second label only when requested by the speaker who provided it (Experiment 1) or when the second experimenter labeled the object in a different language (Experiment 2). These results show that 19-month-olds learn second object labels but do not readily generalize them across speakers of the same language. The results highlight how speaker and language spoken guide infants' acceptance of second labels, supporting sociopragmatic views of word learning. Copyright © 2013 Elsevier Inc. All rights reserved.

  1. Essays on robust asset pricing

    NARCIS (Netherlands)

    Horváth, Ferenc

    2017-01-01

    The central concept of this doctoral dissertation is robustness. I analyze how model and parameter uncertainty affect financial decisions of investors and fund managers, and what their equilibrium consequences are. Chapter 1 gives an overview of the most important concepts and methodologies used in

  2. Robust visual hashing via ICA

    International Nuclear Information System (INIS)

    Fournel, Thierry; Coltuc, Daniela

    2010-01-01

    Designed to maximize information transmission in the presence of noise, independent component analysis (ICA) could appear in certain circumstances as a statistics-based tool for robust visual hashing. Several ICA-based scenarios can attempt to reach this goal. A first one is here considered.

  3. Robustness of raw quantum tomography

    Science.gov (United States)

    Asorey, M.; Facchi, P.; Florio, G.; Man'ko, V. I.; Marmo, G.; Pascazio, S.; Sudarshan, E. C. G.

    2011-01-01

    We scrutinize the effects of non-ideal data acquisition on the tomograms of quantum states. The presence of a weight function, schematizing the effects of a finite window or equivalently noise, only affects the state reconstruction procedure by a normalization constant. The results are extended to a discrete mesh and show that quantum tomography is robust under incomplete and approximate knowledge of tomograms.

  4. Robustness of raw quantum tomography

    Energy Technology Data Exchange (ETDEWEB)

    Asorey, M. [Departamento de Fisica Teorica, Facultad de Ciencias, Universidad de Zaragoza, 50009 Zaragoza (Spain); Facchi, P. [Dipartimento di Matematica, Universita di Bari, I-70125 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Florio, G. [Dipartimento di Fisica, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Man' ko, V.I., E-mail: manko@lebedev.r [P.N. Lebedev Physical Institute, Leninskii Prospect 53, Moscow 119991 (Russian Federation); Marmo, G. [Dipartimento di Scienze Fisiche, Universita di Napoli ' Federico II' , I-80126 Napoli (Italy); INFN, Sezione di Napoli, I-80126 Napoli (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Pascazio, S. [Dipartimento di Fisica, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Sudarshan, E.C.G. [Department of Physics, University of Texas, Austin, TX 78712 (United States)

    2011-01-31

    We scrutinize the effects of non-ideal data acquisition on the tomograms of quantum states. The presence of a weight function, schematizing the effects of a finite window or equivalently noise, only affects the state reconstruction procedure by a normalization constant. The results are extended to a discrete mesh and show that quantum tomography is robust under incomplete and approximate knowledge of tomograms.

  5. Aspects of robust linear regression

    NARCIS (Netherlands)

    Davies, P.L.

    1993-01-01

    Section 1 of the paper contains a general discussion of robustness. In Section 2 the influence function of the Hampel-Rousseeuw least median of squares estimator is derived. Linearly invariant weak metrics are constructed in Section 3. It is shown in Section 4 that $S$-estimators satisfy an exact

  6. Manipulation Robustness of Collaborative Filtering

    OpenAIRE

    Benjamin Van Roy; Xiang Yan

    2010-01-01

    A collaborative filtering system recommends to users products that similar users like. Collaborative filtering systems influence purchase decisions and hence have become targets of manipulation by unscrupulous vendors. We demonstrate that nearest neighbors algorithms, which are widely used in commercial systems, are highly susceptible to manipulation and introduce new collaborative filtering algorithms that are relatively robust.

  7. Robustness Regions for Dichotomous Decisions.

    Science.gov (United States)

    Vijn, Pieter; Molenaar, Ivo W.

    1981-01-01

    In the case of dichotomous decisions, the total set of all assumptions/specifications for which the decision would have been the same is the robustness region. Inspection of this (data-dependent) region is a form of sensitivity analysis which may lead to improved decision making. (Author/BW)

  8. Theoretical Framework for Robustness Evaluation

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2011-01-01

    This paper presents a theoretical framework for evaluation of robustness of structural systems, incl. bridges and buildings. Typically modern structural design codes require that ‘the consequence of damages to structures should not be disproportional to the causes of the damages’. However, althou...

  9. Robust control design with MATLAB

    CERN Document Server

    Gu, Da-Wei; Konstantinov, Mihail M

    2013-01-01

    Robust Control Design with MATLAB® (second edition) helps the student to learn how to use well-developed advanced robust control design methods in practical cases. To this end, several realistic control design examples from teaching-laboratory experiments, such as a two-wheeled, self-balancing robot, to complex systems like a flexible-link manipulator are given detailed presentation. All of these exercises are conducted using MATLAB® Robust Control Toolbox 3, Control System Toolbox and Simulink®. By sharing their experiences in industrial cases with minimum recourse to complicated theories and formulae, the authors convey essential ideas and useful insights into robust industrial control systems design using major H-infinity optimization and related methods allowing readers quickly to move on with their own challenges. The hands-on tutorial style of this text rests on an abundance of examples and features for the second edition: ·        rewritten and simplified presentation of theoretical and meth...

  10. Robust Portfolio Optimization Using Pseudodistances

    Science.gov (United States)

    2015-01-01

    The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature. PMID:26468948

  11. Facial Symmetry in Robust Anthropometrics

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2012-01-01

    Roč. 57, č. 3 (2012), s. 691-698 ISSN 0022-1198 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords : forensic science * anthropology * robust image analysis * correlation analysis * multivariate data * classification Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.244, year: 2012

  12. Sparse and Robust Factor Modelling

    NARCIS (Netherlands)

    C. Croux (Christophe); P. Exterkate (Peter)

    2011-01-01

    textabstractFactor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity; that is, having

  13. Robust distributed cognitive relay beamforming

    KAUST Repository

    Pandarakkottilil, Ubaidulla; Aissa, Sonia

    2012-01-01

    design takes into account a parameter of the error in the channel state information (CSI) to render the performance of the beamformer robust in the presence of imperfect CSI. Though the original problem is non-convex, we show that the proposed design can

  14. Approximability of Robust Network Design

    NARCIS (Netherlands)

    Olver, N.K.; Shepherd, F.B.

    2014-01-01

    We consider robust (undirected) network design (RND) problems where the set of feasible demands may be given by an arbitrary convex body. This model, introduced by Ben-Ameur and Kerivin [Ben-Ameur W, Kerivin H (2003) New economical virtual private networks. Comm. ACM 46(6):69-73], generalizes the

  15. Does dynamic information about the speaker's face contribute to semantic speech processing? ERP evidence.

    Science.gov (United States)

    Hernández-Gutiérrez, David; Abdel Rahman, Rasha; Martín-Loeches, Manuel; Muñoz, Francisco; Schacht, Annekathrin; Sommer, Werner

    2018-07-01

    Face-to-face interactions characterize communication in social contexts. These situations are typically multimodal, requiring the integration of linguistic auditory input with facial information from the speaker. In particular, eye gaze and visual speech provide the listener with social and linguistic information, respectively. Despite the importance of this context for an ecological study of language, research on audiovisual integration has mainly focused on the phonological level, leaving aside effects on semantic comprehension. Here we used event-related potentials (ERPs) to investigate the influence of facial dynamic information on semantic processing of connected speech. Participants were presented with either a video or a still picture of the speaker, concomitant to auditory sentences. Along three experiments, we manipulated the presence or absence of the speaker's dynamic facial features (mouth and eyes) and compared the amplitudes of the semantic N400 elicited by unexpected words. Contrary to our predictions, the N400 was not modulated by dynamic facial information; therefore, semantic processing seems to be unaffected by the speaker's gaze and visual speech. Even though, during the processing of expected words, dynamic faces elicited a long-lasting late posterior positivity compared to the static condition. This effect was significantly reduced when the mouth of the speaker was covered. Our findings may indicate an increase of attentional processing to richer communicative contexts. The present findings also demonstrate that in natural communicative face-to-face encounters, perceiving the face of a speaker in motion provides supplementary information that is taken into account by the listener, especially when auditory comprehension is non-demanding. Copyright © 2018 Elsevier Ltd. All rights reserved.

  16. Continuing Medical Education Speakers with High Evaluation Scores Use more Image-based Slides

    Directory of Open Access Journals (Sweden)

    Ferguson, Ian

    2017-01-01

    Full Text Available Although continuing medical education (CME presentations are common across health professions, it is unknown whether slide design is independently associated with audience evaluations of the speaker. Based on the conceptual framework of Mayer’s theory of multimedia learning, this study aimed to determine whether image use and text density in presentation slides are associated with overall speaker evaluations. This retrospective analysis of six sequential CME conferences (two annual emergency medicine conferences over a three-year period used a mixed linear regression model to assess whether postconference speaker evaluations were associated with image fraction (percentage of image-based slides per presentation and text density (number of words per slide. A total of 105 unique lectures were given by 49 faculty members, and 1,222 evaluations (70.1% response rate were available for analysis. On average, 47.4% (SD=25.36 of slides had at least one educationally-relevant image (image fraction. Image fraction significantly predicted overall higher evaluation scores [F(1, 100.676=6.158, p=0.015] in the mixed linear regression model. The mean (SD text density was 25.61 (8.14 words/slide but was not a significant predictor [F(1, 86.293=0.55, p=0.815]. Of note, the individual speaker [χ2 (1=2.952, p=0.003] and speaker seniority [F(3, 59.713=4.083, p=0.011] significantly predicted higher scores. This is the first published study to date assessing the linkage between slide design and CME speaker evaluations by an audience of practicing clinicians. The incorporation of images was associated with higher evaluation scores, in alignment with Mayer’s theory of multimedia learning. Contrary to this theory, however, text density showed no significant association, suggesting that these scores may be multifactorial. Professional development efforts should focus on teaching best practices in both slide design and presentation skills.

  17. Neural decoding of attentional selection in multi-speaker environments without access to clean sources

    Science.gov (United States)

    O'Sullivan, James; Chen, Zhuo; Herrero, Jose; McKhann, Guy M.; Sheth, Sameer A.; Mehta, Ashesh D.; Mesgarani, Nima

    2017-10-01

    Objective. People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment with which to compare with the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. Approach. We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener’s neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker’s voice to assist the listener. Main results. Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. Significance. Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.

  18. Insight into the Attitudes of Speakers of Urban Meccan Hijazi Arabic towards their Dialect

    Directory of Open Access Journals (Sweden)

    Sameeha D. Alahmadi

    2016-04-01

    Full Text Available The current study mainly aims to examine the attitudes of speakers of Urban Meccan Hijazi Arabic (UMHA towards their dialect, which is spoken in Mecca, Saudi Arabia. It also investigates whether the participants’ age, sex and educational level have any impact on their perception of their dialect. To this end, I designed a 5-point-Likert-scale questionnaire, requiring participants to rate their attitudes towards their dialect. I asked 80 participants, whose first language is UMHA, to fill out the questionnaire. On the basis of the three independent variables, namely, age, sex and educational level, the participants were divided into three groups: old and young speakers, male and female speakers and educated and uneducated speakers. The results reveal that in general, all the groups (young and old, male and female, and educated and uneducated participants have a sense of responsibility towards their dialect, making their attitudes towards their dialect positive. However, differences exist between the three groups. For instance, old speakers tend to express their pride of their dialect more than young speakers. The same pattern is observed in male and female groups. The results show that females may feel embarrassed to provide answers that may imply that they are not proud of their own dialect, since the majority of women in the Arab world, in general, are under more pressure to conform to the overt norms of the society than males. Therefore, I argue that most Arab women may not have the same freedom to express their opinions and feelings about various issues. Based on the results, the study concludes with some recommendations for further research.  Keywords: sociolinguistics, language attitudes, dialectology, social variables, Urban Meccan Hijazi Arabic

  19. Continuing Medical Education Speakers with High Evaluation Scores Use more Image-based Slides.

    Science.gov (United States)

    Ferguson, Ian; Phillips, Andrew W; Lin, Michelle

    2017-01-01

    Although continuing medical education (CME) presentations are common across health professions, it is unknown whether slide design is independently associated with audience evaluations of the speaker. Based on the conceptual framework of Mayer's theory of multimedia learning, this study aimed to determine whether image use and text density in presentation slides are associated with overall speaker evaluations. This retrospective analysis of six sequential CME conferences (two annual emergency medicine conferences over a three-year period) used a mixed linear regression model to assess whether post-conference speaker evaluations were associated with image fraction (percentage of image-based slides per presentation) and text density (number of words per slide). A total of 105 unique lectures were given by 49 faculty members, and 1,222 evaluations (70.1% response rate) were available for analysis. On average, 47.4% (SD=25.36) of slides had at least one educationally-relevant image (image fraction). Image fraction significantly predicted overall higher evaluation scores [F(1, 100.676)=6.158, p=0.015] in the mixed linear regression model. The mean (SD) text density was 25.61 (8.14) words/slide but was not a significant predictor [F(1, 86.293)=0.55, p=0.815]. Of note, the individual speaker [χ 2 (1)=2.952, p=0.003] and speaker seniority [F(3, 59.713)=4.083, p=0.011] significantly predicted higher scores. This is the first published study to date assessing the linkage between slide design and CME speaker evaluations by an audience of practicing clinicians. The incorporation of images was associated with higher evaluation scores, in alignment with Mayer's theory of multimedia learning. Contrary to this theory, however, text density showed no significant association, suggesting that these scores may be multifactorial. Professional development efforts should focus on teaching best practices in both slide design and presentation skills.

  20. Are Cantonese-speakers really descriptivists? Revisiting cross-cultural semantics.

    Science.gov (United States)

    Lam, Barry

    2010-05-01

    In an article in Cognition [Machery, E., Mallon, R., Nichols, S., & Stich, S. (2004). Semantics cross-cultural style. Cognition, 92, B1-B12] present data which purports to show that East Asian Cantonese-speakers tend to have descriptivist intuitions about the referents of proper names, while Western English-speakers tend to have causal-historical intuitions about proper names. Machery et al. take this finding to support the view that some intuitions, the universality of which they claim is central to philosophical theories, vary according to cultural background. Machery et al. conclude from their findings that the philosophical methodology of consulting intuitions about hypothetical cases is flawed vis a vis the goal of determining truths about some philosophical domains like philosophical semantics. In the following study, three new vignettes in English were given to Western native English-speakers, and Cantonese translations were given to native Cantonese-speaking immigrants from a Cantonese community in Southern California. For all three vignettes, questions were given to elicit intuitions about the referent of a proper name and the truth-value of an uttered sentence containing a proper name. The results from this study reveal that East Asian Cantonese-speakers do not differ from Western English-speakers in ways that support Machery et al.'s conclusions. This new data concerning the intuitions of Cantonese-speakers raises questions about whether cross-cultural variation in answers to questions on certain vignettes reveal genuine differences in intuitions, or whether such differences stem from non-intuitional differences, such as differences in linguistic competence. Copyright 2009 Elsevier B.V. All rights reserved.

  1. Robust input design for nonlinear dynamic modeling of AUV.

    Science.gov (United States)

    Nouri, Nowrouz Mohammad; Valadi, Mehrdad

    2017-09-01

    Input design has a dominant role in developing the dynamic model of autonomous underwater vehicles (AUVs) through system identification. Optimal input design is the process of generating informative inputs that can be used to generate the good quality dynamic model of AUVs. In a problem with optimal input design, the desired input signal depends on the unknown system which is intended to be identified. In this paper, the input design approach which is robust to uncertainties in model parameters is used. The Bayesian robust design strategy is applied to design input signals for dynamic modeling of AUVs. The employed approach can design multiple inputs and apply constraints on an AUV system's inputs and outputs. Particle swarm optimization (PSO) is employed to solve the constraint robust optimization problem. The presented algorithm is used for designing the input signals for an AUV, and the estimate obtained by robust input design is compared with that of the optimal input design. According to the results, proposed input design can satisfy both robustness of constraints and optimality. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  2. A comparison of three speaker-intrinsic vowel formant frequency normalization algorithms for sociophonetics

    DEFF Research Database (Denmark)

    Fabricius, Anne; Watt, Dominic; Johnson, Daniel Ezra

    2009-01-01

    from RP and Aberdeen English (northeast Scotland). We conclude that, for the data examined here, the S-centroid W&F procedures performs at least as well as the two most recognized speaker-intrinsic, vowel-extrinsic, formant-intrinsic normalization methods, Lobanov's (1971) z-score procedure and Nearey......This paper evaluates a speaker-intrinsic vowel formant frequency normalization algorithm initially proposed in Watt & Fabricius (2002). We compare how well this routine, known as the S-centroid procedure, performs as a sociophonetic research tool in three ways: reducing variance in area ratios...

  3. Gender parity trends for invited speakers at four prominent virology conference series.

    Science.gov (United States)

    Kalejta, Robert F; Palmenberg, Ann C

    2017-06-07

    Scientific conferences are most beneficial to participants when they showcase significant new experimental developments, accurately summarize the current state of the field, and provide strong opportunities for collaborative networking. A top-notch slate of invited speakers, assembled by conference organizers or committees, is key to achieving these goals. The perceived underrepresentation of female speakers at prominent scientific meetings is currently a popular topic for discussion, but one that often lacks supportive data. We compiled the full rosters of invited speakers over the last 35 years for four prominent international virology conferences, the American Society for Virology Annual Meeting (ASV), the International Herpesvirus Workshop (IHW), the Positive-Strand RNA Virus Symposium (PSR), and the Gordon Research Conference on Viruses & Cells (GRC). The rosters were cross-indexed by unique names, gender, year, and repeat invitations. When plotted as gender-dependent trends over time, all four conferences showed a clear proclivity for male-dominated invited speaker lists. Encouragingly, shifts toward parity are emerging within all units, but at different rates. Not surprisingly, both selection of a larger percentage of first time participants and the presence of a woman on the speaker selection committee correlated with improved parity. Session chair information was also collected for the IHW and GRC. These visible positions also displayed a strong male dominance over time that is eroding slowly. We offer our personal interpretation of these data to aid future organizers achieve improved equity among the limited number of available positions for session moderators and invited speakers. IMPORTANCE Politicians and media members have a tendency to cite anecdotes as conclusions without any supporting data. This happens so frequently now, that a name for it has emerged: fake news. Good science proceeds otherwise. The under representation of women as invited

  4. Improving the Effectiveness of Speaker Verification Domain Adaptation With Inadequate In-Domain Data

    Science.gov (United States)

    2017-08-20

    M speakers. We seek a probabilistic solution to domain adap- tation, and so we encode knowledge of the out-of-domain data in prior distributions...the VB solution from (16)-(21) becomes: µ =αȳ + (1− α)µout, (24) Σa =α ( 1 NT NT∑ n=1 〈ynyTn 〉 − ȳȳT ) + (1− α) Σouta (25) + α (1− α) ( ȳ − µout...non- English languages and from unseen channels. An inadequate in-domain set was provided, which consisted of 2272 samples from 1164 speakers, and

  5. Objective eye-gaze behaviour during face-to-face communication with proficient alaryngeal speakers: a preliminary study.

    Science.gov (United States)

    Evitts, Paul; Gallop, Robert

    2011-01-01

    There is a large body of research demonstrating the impact of visual information on speaker intelligibility in both normal and disordered speaker populations. However, there is minimal information on which specific visual features listeners find salient during conversational discourse. To investigate listeners' eye-gaze behaviour during face-to-face conversation with normal, laryngeal and proficient alaryngeal speakers. Sixty participants individually participated in a 10-min conversation with one of four speakers (typical laryngeal, tracheoesophageal, oesophageal, electrolaryngeal; 15 participants randomly assigned to one mode of speech). All speakers were > 85% intelligible and were judged to be 'proficient' by two certified speech-language pathologists. Participants were fitted with a head-mounted eye-gaze tracking device (Mobile Eye, ASL) that calculated the region of interest and mean duration of eye-gaze. Self-reported gaze behaviour was also obtained following the conversation using a 10 cm visual analogue scale. While listening, participants viewed the lower facial region of the oesophageal speaker more than the normal or tracheoesophageal speaker. Results of non-hierarchical cluster analyses showed that while listening, the pattern of eye-gaze was predominantly directed at the lower face of the oesophageal and electrolaryngeal speaker and more evenly dispersed among the background, lower face, and eyes of the normal and tracheoesophageal speakers. Finally, results show a low correlation between self-reported eye-gaze behaviour and objective regions of interest data. Overall, results suggest similar eye-gaze behaviour when healthy controls converse with normal and tracheoesophageal speakers and that participants had significantly different eye-gaze patterns when conversing with an oesophageal speaker. Results are discussed in terms of existing eye-gaze data and its potential implications on auditory-visual speech perception. © 2011 Royal College of Speech

  6. Robust Reliability or reliable robustness? - Integrated consideration of robustness and reliability aspects

    DEFF Research Database (Denmark)

    Kemmler, S.; Eifler, Tobias; Bertsche, B.

    2015-01-01

    products are and vice versa. For a comprehensive understanding and to use existing synergies between both domains, this paper discusses the basic principles of Reliability- and Robust Design theory. The development of a comprehensive model will enable an integrated consideration of both domains...

  7. Robust Watermarking of Video Streams

    Directory of Open Access Journals (Sweden)

    T. Polyák

    2006-01-01

    Full Text Available In the past few years there has been an explosion in the use of digital video data. Many people have personal computers at home, and with the help of the Internet users can easily share video files on their computer. This makes possible the unauthorized use of digital media, and without adequate protection systems the authors and distributors have no means to prevent it.Digital watermarking techniques can help these systems to be more effective by embedding secret data right into the video stream. This makes minor changes in the frames of the video, but these changes are almost imperceptible to the human visual system. The embedded information can involve copyright data, access control etc. A robust watermark is resistant to various distortions of the video, so it cannot be removed without affecting the quality of the host medium. In this paper I propose a video watermarking scheme that fulfills the requirements of a robust watermark. 

  8. Robust Decentralized Formation Flight Control

    Directory of Open Access Journals (Sweden)

    Zhao Weihua

    2011-01-01

    Full Text Available Motivated by the idea of multiplexed model predictive control (MMPC, this paper introduces a new framework for unmanned aerial vehicles (UAVs formation flight and coordination. Formulated using MMPC approach, the whole centralized formation flight system is considered as a linear periodic system with control inputs of each UAV subsystem as its periodic inputs. Divided into decentralized subsystems, the whole formation flight system is guaranteed stable if proper terminal cost and terminal constraints are added to each decentralized MPC formulation of the UAV subsystem. The decentralized robust MPC formulation for each UAV subsystem with bounded input disturbances and model uncertainties is also presented. Furthermore, an obstacle avoidance control scheme for any shape and size of obstacles, including the nonapriorily known ones, is integrated under the unified MPC framework. The results from simulations demonstrate that the proposed framework can successfully achieve robust collision-free formation flights.

  9. Inefficient but robust public leadership.

    OpenAIRE

    Matsumura, Toshihiro; Ogawa, Akira

    2014-01-01

    We investigate endogenous timing in a mixed duopoly in a differentiated product market. We find that private leadership is better than public leadership from a social welfare perspective if the private firm is domestic, regardless of the degree of product differentiation. Nevertheless, the public leadership equilibrium is risk-dominant, and it is thus robust if the degree of product differentiation is high. We also find that regardless of the degree of product differentiation, the public lead...

  10. Testing Heteroscedasticity in Robust Regression

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2011-01-01

    Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf

  11. Robust power system frequency control

    CERN Document Server

    Bevrani, Hassan

    2014-01-01

    This updated edition of the industry standard reference on power system frequency control provides practical, systematic and flexible algorithms for regulating load frequency, offering new solutions to the technical challenges introduced by the escalating role of distributed generation and renewable energy sources in smart electric grids. The author emphasizes the physical constraints and practical engineering issues related to frequency in a deregulated environment, while fostering a conceptual understanding of frequency regulation and robust control techniques. The resulting control strategi

  12. Infants' Understanding of False Labeling Events: The Referential Roles of Words and the Speakers Who Use Them.

    Science.gov (United States)

    Koenig, Melissa A.; Echols, Catharine H.

    2003-01-01

    Four studies examined whether 16-month-olds' responses to true/false utterances interacted with their knowledge of human agents. Findings suggested that infants are developing a critical conception of human speakers as truthful communicators and that infants understand that human speakers may provide uniquely useful information when a word fails…

  13. Phraseology and Frequency of Occurrence on the Web: Native Speakers' Perceptions of Google-Informed Second Language Writing

    Science.gov (United States)

    Geluso, Joe

    2013-01-01

    Usage-based theories of language learning suggest that native speakers of a language are acutely aware of formulaic language due in large part to frequency effects. Corpora and data-driven learning can offer useful insights into frequent patterns of naturally occurring language to second/foreign language learners who, unlike native speakers, are…

  14. Accent, Intelligibility, and the Role of the Listener: Perceptions of English-Accented German by Native German Speakers

    Science.gov (United States)

    Hayes-Harb, Rachel; Watzinger-Tharp, Johanna

    2012-01-01

    We explore the relationship between accentedness and intelligibility, and investigate how listeners' beliefs about nonnative speech interact with their accentedness and intelligibility judgments. Native German speakers and native English learners of German produced German sentences, which were presented to 12 native German speakers in accentedness…

  15. Identifying Core Vocabulary for Urdu Language Speakers Using Augmentative Alternative Communication

    Science.gov (United States)

    Mukati, Abdul Samad

    2013-01-01

    The purpose of this research is to identify a core set of vocabulary used by native Urdu language (UL) speakers during dyadic conversation for social interaction and relationship building. This study was conducted in Karachi, Pakistan at an institution of higher education. This research seeks to distinguish between general (nonspecific…

  16. Native Speaker Norms and China English: From the Perspective of Learners and Teachers in China

    Science.gov (United States)

    He, Deyuan; Zhang, Qunying

    2010-01-01

    This article explores the question of whether the norms based on native speakers of English should be kept in English teaching in an era when English has become World Englishes. This is an issue that has been keenly debated in recent years, not least in the pages of "TESOL Quarterly." However, "China English" in such debates…

  17. Diversity in the lexical and syntactic abilities of fluent aphasic speakers

    NARCIS (Netherlands)

    Bastiaanse, Y.R.M.; Edwards, S.

    In an earlier study by the authors, it was suggested that some fluent aphasic speakers exhibit subtle grammatical deficits. In this paper, how far lexical accessing problems might account for these deficits is considered. For this study, spontaneous speech data collected from two groups of aphasic

  18. A virtual speaker in noisy classroom conditions: supporting or disrupting children's listening comprehension?

    Science.gov (United States)

    Nirme, Jens; Haake, Magnus; Lyberg Åhlander, Viveka; Brännström, Jonas; Sahlén, Birgitta

    2018-04-05

    Seeing a speaker's face facilitates speech recognition, particularly under noisy conditions. Evidence for how it might affect comprehension of the content of the speech is more sparse. We investigated how children's listening comprehension is affected by multi-talker babble noise, with or without presentation of a digitally animated virtual speaker, and whether successful comprehension is related to performance on a test of executive functioning. We performed a mixed-design experiment with 55 (34 female) participants (8- to 9-year-olds), recruited from Swedish elementary schools. The children were presented with four different narratives, each in one of four conditions: audio-only presentation in a quiet setting, audio-only presentation in noisy setting, audio-visual presentation in a quiet setting, and audio-visual presentation in a noisy setting. After each narrative, the children answered questions on the content and rated their perceived listening effort. Finally, they performed a test of executive functioning. We found significantly fewer correct answers to explicit content questions after listening in noise. This negative effect was only mitigated to a marginally significant degree by audio-visual presentation. Strong executive function only predicted more correct answers in quiet settings. Altogether, our results are inconclusive regarding how seeing a virtual speaker affects listening comprehension. We discuss how methodological adjustments, including modifications to our virtual speaker, can be used to discriminate between possible explanations to our results and contribute to understanding the listening conditions children face in a typical classroom.

  19. Multistage Data Selection-based Unsupervised Speaker Adaptation for Personalized Speech Emotion Recognition

    NARCIS (Netherlands)

    Kim, Jaebok; Park, Jeong-Sik

    This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is the user-dependent performance induced by the speaker independent (SI) acoustic model framework. But,

  20. Non-Native English Speakers and Nonstandard English: An In-Depth Investigation

    Science.gov (United States)

    Polat, Brittany

    2012-01-01

    Given the rising prominence of nonstandard varieties of English around the world (Jenkins 2007), learners of English as a second language are increasingly called on to communicate with speakers of both native and non-native nonstandard English varieties. In many classrooms around the world, however, learners continue to be exposed only to…

  1. An Evaluation of Native-speaker Judgements of Foreign-accented British and American English

    NARCIS (Netherlands)

    Doel, W.Z. van den

    2006-01-01

    This study is the first ever to employ a large-scale Internet survey to investigate priorities in English pronunciation training. Well over 500 native speakers from throughout the English-speaking world, including North America, the British Isles, Australia and New Zealand, were asked to detect and

  2. Impact of Industry Guest Speakers on Business Students' Perceptions of Employability Skills Development

    Science.gov (United States)

    Riebe, L.; Sibson, R.; Roepen, D.; Meakins, K.

    2013-01-01

    This study provides insights into the perceptions and expectations of Australian undergraduate business students (n=150) regarding the incorporation of guest speakers into the curriculum of a leadership unit focused on employability skills development. The authors adopted a mixed methods approach. A survey was conducted, with quantitative results…

  3. Lifting the Curtain on the Wizard of Oz: Biased Voice-Based Impressions of Speaker Size

    Science.gov (United States)

    Rendall, Drew; Vokey, John R.; Nemeth, Christie

    2007-01-01

    The consistent, but often wrong, impressions people form of the size of unseen speakers are not random but rather point to a consistent misattribution bias, one that the advertising, broadcasting, and entertainment industries also routinely exploit. The authors report 3 experiments examining the perceptual basis of this bias. The results indicate…

  4. A Comparative Study on the Use of Compliment Response Strategies by Persian and English Native Speakers

    Science.gov (United States)

    Shabani, Mansour; Zeinali, Maryam

    2015-01-01

    The significance of pragmatic knowledge and politeness strategies has recently been emphasized in language learning and teaching. Most communication failures originate in the lack of pragmatic awareness which is evident among EFL learners while communicating with English native speakers. The present study aimed at investigating compliment response…

  5. Politics of Participation in Benoît Maubrey’s Speaker Sculptures

    DEFF Research Database (Denmark)

    Keylin, Vadim

    a designated number, or using Bluetooth or WiFi technologies, and express themselves freely through the sculpture. In my paper, I investigate the strategies of audience engagement the Maubrey employs and their applicability to the acoustic design of urban spaces. Through their numerous loudspeakers, Speaker...

  6. Early-Stage Chunking of Finger Tapping Sequences by Persons Who Stutter and Fluent Speakers

    Science.gov (United States)

    Smits-Bandstra, Sarah; De Nil, Luc F.

    2013-01-01

    This research note explored the hypothesis that chunking differences underlie the slow finger-tap sequencing performance reported in the literature for persons who stutter (PWS) relative to fluent speakers (PNS). Early-stage chunking was defined as an immediate and spontaneous tendency to organize a long sequence into pauses, for motor planning,…

  7. The Role of Statistical Learning and Working Memory in L2 Speakers' Pattern Learning

    Science.gov (United States)

    McDonough, Kim; Trofimovich, Pavel

    2016-01-01

    This study investigated whether second language (L2) speakers' morphosyntactic pattern learning was predicted by their statistical learning and working memory abilities. Across three experiments, Thai English as a Foreign Language (EFL) university students (N = 140) were exposed to either the transitive construction in Esperanto (e.g., "tauro…

  8. HotTips for Speakers: 25 Surefire Ways To Engage and Captivate Any Group or Audience.

    Science.gov (United States)

    Abernathy, Rob; Reardon, Mark

    From managing stage fright to keeping the audience hanging on their every word, experienced public speakers have the techniques to make every presentation memorable. This book contains a collection of 25 strategies for public speaking that have already worked for many people. Each "HotTip" (strategy) has been tested and used with…

  9. Gender and Number Agreement in the Oral Production of Arabic Heritage Speakers

    Science.gov (United States)

    Albirini, Abdulkafi; Benmamoun, Elabbas; Chakrani, Brahim

    2013-01-01

    Heritage language acquisition has been characterized by various asymmetries, including the differential acquisition rates of various linguistic areas and the unbalanced acquisition of different categories within a single area. This paper examines Arabic heritage speakers' knowledge of subject-verb agreement versus noun-adjective agreement with the…

  10. Extending Situated Language Comprehension (Accounts) with Speaker and Comprehender Characteristics: Toward Socially Situated Interpretation.

    Science.gov (United States)

    Münster, Katja; Knoeferle, Pia

    2017-01-01

    More and more findings suggest a tight temporal coupling between (non-linguistic) socially interpreted context and language processing. Still, real-time language processing accounts remain largely elusive with respect to the influence of biological (e.g., age) and experiential (e.g., world and moral knowledge) comprehender characteristics and the influence of the 'socially interpreted' context, as for instance provided by the speaker. This context could include actions, facial expressions, a speaker's voice or gaze, and gestures among others. We review findings from social psychology, sociolinguistics and psycholinguistics to highlight the relevance of (the interplay between) the socially interpreted context and comprehender characteristics for language processing. The review informs the extension of an extant real-time processing account (already featuring a coordinated interplay between language comprehension and the non-linguistic visual context) with a variable ('ProCom') that captures characteristics of the language user and with a first approximation of the comprehender's speaker representation. Extending the CIA to the sCIA (social Coordinated Interplay Account) is the first step toward a real-time language comprehension account which might eventually accommodate the socially situated communicative interplay between comprehenders and speakers.

  11. Self-Disclosure in Initial Interactions amongst Speakers of American and Australian English

    Science.gov (United States)

    Haugh, Michael; Carbaugh, Donal

    2015-01-01

    Getting acquainted with others is one of the most basic interpersonal communication events. Yet there has only been a limited number of studies that have examined variation in the interactional practices through which unacquainted persons become acquainted and establish relationships across speakers of the same language. The current study focuses…

  12. Teaching Standard Italian to Dialect Speakers: A Pedagogical Perspective of Linguistic Systems in Contact

    Science.gov (United States)

    Danesi, Marcel

    1974-01-01

    The teaching of standard Italian to speakers of Italian dialects both in Italy and in North America is discussed, specifically through a specialized pedagogical program within the framework of a sociolinguistic and psycholinguistic perspective, and based on a structural analysis of linguistic systems in contact. Italian programs in Toronto are…

  13. Visual and auditory digit-span performance in native and nonnative speakers

    NARCIS (Netherlands)

    Olsthoorn, N.M.; Andringa, S.; Hulstijn, J.H.

    2014-01-01

    We compared 121 native and 114 non-native speakers of Dutch (with 35 different first languages) on four digit-span tasks, varying modality (visual/auditory) and direction (forward/backward). An interaction was observed between nativeness and modality, such that, while natives performed better than

  14. Understanding and Overcoming Pragmatic Failure in Intercultural Communication: From Focus on Speakers to Focus on Hearers

    Science.gov (United States)

    Padilla Cruz, Manuel

    2013-01-01

    For learners to communicate efficiently in the L2, they must avoid pragmatic failure. In many cases, teachers' praxis centres on the learner's performance in the L2 or his role as a speaker, which neglects the importance of his role as interpreter of utterances. Assuming that, as hearers, learners also have a responsibility to avoid…

  15. The "Tse Tsa Watle" Speaker Series: An Example of Ensemble Leadership and Generative Adult Learning

    Science.gov (United States)

    McKendry, Virginia

    2017-01-01

    This chapter examines an Indigenous speaker series formed to foster intercultural partnerships at a Canadian university. Using ensemble leadership and generative learning theories to make sense of the project, the author argues that ensemble leadership is key to designing the generative learning adult learners need in an era of ambiguity.

  16. Direct Measurement of the Speed of Sound Using a Microphone and a Speaker

    Science.gov (United States)

    Gómez-Tejedor, José A.; Castro-Palacio, Juan C.; Monsoriu, Juan A.

    2014-01-01

    We present a simple and accurate experiment to obtain the speed of sound in air using a conventional speaker and a microphone connected to a computer. A free open source digital audio editor and recording computer software application allows determination of the time-of-flight of the wave for different distances, from which the speed of sound is…

  17. 78 FR 65511 - Death of Thomas S. Foley Former Speaker of the House of Representatives

    Science.gov (United States)

    2013-10-31

    ... Vol. 78 Thursday, No. 211 October 31, 2013 Part IV The President Proclamation 9046--Death of Thomas S. Foley Former Speaker of the House of Representatives #0; #0; #0; Presidential Documents #0; #0...; #0; #0;Title 3-- #0;The President [[Page 65513

  18. A Nonword Repetition Task for Speakers with Misarticulations: The Syllable Repetition Task (SRT)

    Science.gov (United States)

    Shriberg, Lawrence D.; Lohmeier, Heather L.; Campbell, Thomas F.; Dollaghan, Christine A.; Green, Jordan R.; Moore, Christopher A.

    2009-01-01

    Purpose: Conceptual and methodological confounds occur when non(sense) word repetition tasks are administered to speakers who do not have the target speech sounds in their phonetic inventories or who habitually misarticulate targeted speech sounds. In this article, the authors (a) describe a nonword repetition task, the Syllable Repetition Task…

  19. Insight into the Attitudes of Speakers of Urban Meccan Hijazi Arabic towards Their Dialect

    Science.gov (United States)

    Alahmadi, Sameeha D.

    2016-01-01

    The current study mainly aims to examine the attitudes of speakers of Urban Meccan Hijazi Arabic (UMHA) towards their dialect, which is spoken in Mecca, Saudi Arabia. It also investigates whether the participants' age, sex and educational level have any impact on their perception of their dialect. To this end, I designed a 5-point-Likert-scale…

  20. Spanish Is Foreign: Heritage Speakers' Interpretations of the Introductory Spanish Language Curriculum

    Science.gov (United States)

    DeFeo, Dayna Jean

    2015-01-01

    This article presents a case study of the perceptions of Spanish heritage speakers enrolled in introductory-level Spanish foreign language courses. Despite their own identities that were linked to the United States and Spanish of the Borderlands, the participants felt that the curriculum acknowledged the Spanish of Spain and foreign countries but…

  1. Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions

    DEFF Research Database (Denmark)

    Ma, Ning; Brown, Guy J.; May, Tobias

    2015-01-01

    This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete crosscorrelation function (CCF) and interaural...

  2. Be My Guest: A Survey of Mass Communication Students' Perception of Guest Speakers

    Science.gov (United States)

    Merle, Patrick F.; Craig, Clay

    2017-01-01

    The use of guest speakers as a pedagogical technique across disciplines at the college level is hardly novel. However, empirical assessment of journalism and mass communication students' perceptions of this practice has not previously been conducted. To fill this gap, this article presents results from an online survey specifically administered to…

  3. Participation of Second Language and Second Dialect Speakers in the Legal System.

    Science.gov (United States)

    Eades, Diana

    2003-01-01

    Overviews current theory and practice and research on second language and second dialect speakers and the language of the law. Suggests most of the studies on the topic have analyzed language in courtrooms, where access to data is much easier than in other legal settings, such as police interviews, mediation sessions, or lawyer-client interviews.…

  4. Examining the Native Speakers' Understanding of Communicative Purposes of a Written Genre in Modern Standard Chinese.

    Science.gov (United States)

    Yunxia, Zhu

    1997-01-01

    Examines the different attitudes of native speakers in understanding a written genre of Modern Standard Chinese--sales letters. The study focuses on the use of formulaic components appearing in real Chinese sales letters and compares these components with the advice given in textbooks. Findings reveal a gap between business teaching and business…

  5. Dialocalization: Acoustic speaker diarization and visual localization as joint optimization problem

    NARCIS (Netherlands)

    Friedland, G.; Yeo, C.; Hung, H.

    2010-01-01

    The following article presents a novel audio-visual approach for unsupervised speaker localization in both time and space and systematically analyzes its unique properties. Using recordings from a single, low-resolution room overview camera and a single far-field microphone, a state-of-the-art

  6. Speech overlap detection in a two-pass speaker diarization system

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; Leeuwen, D.A. van; Jong, F. M. G de

    2009-01-01

    In this paper we present the two-pass speaker diarization system that we developed for the NIST RT09s evaluation. In the first pass of our system a model for speech overlap detection is gen- erated automatically. This model is used in two ways to reduce the diarization errors due to overlapping

  7. Speech overlap detection in a two-pass speaker diarization system

    NARCIS (Netherlands)

    Huijbregts, M.; Leeuwen, D.A. van; Jong, F.M.G. de

    2009-01-01

    In this paper we present the two-pass speaker diarization system that we developed for the NIST RT09s evaluation. In the first pass of our system a model for speech overlap detection is generated automatically. This model is used in two ways to reduce the diarization errors due to overlapping

  8. Compliment Responses of Thai and Punjabi Speakers of English in Thailand

    Science.gov (United States)

    Sachathep, Sukchai

    2014-01-01

    This variational pragmatics (VP) study investigates the similarities and differences of compliment responses (CR) between Thai and Punjabi speakers of English in Thailand, focusing on the strategies used in CR when the microsociolinguistic variables are integrated into the Discourse Completion Task (DCT). The participants were 20 Thai and 20…

  9. Tormenta Espacial: Engaging Spanish Speakers in the Planetarium and K-12 Classroom

    Science.gov (United States)

    Salas, F.; Duncan, D.; Traub-Metlay, S.

    2008-06-01

    Reaching out to Spanish speakers is increasingly vital to workforce development and public support of space science projects. Building on a successful partnership with NASA's TIMED mission, LASP and Space Science Institute, Fiske Planetarium has translated its original planetarium show - ``Space Storm'' - into ``Tormenta Espacial.''

  10. Muchas Caras: Engaging Spanish Speakers in the Planetarium and K--12 Classroom

    Science.gov (United States)

    Traub-Metlay, S.; Salas, F.; Duncan, D.

    2008-11-01

    Reaching out to Spanish speakers is increasingly vital to workforce development and public support of space science projects. Fiske Planetarium offers Spanish translations of our newest planetarium shows, such as ``Las Personas del Telescopio Hubble'' (``The Many Faces of Hubble'') and ``Tormenta Espacial'' (``Space Storm'').

  11. Intonation Contrast in Cantonese Speakers with Hypokinetic Dysarthria Associated with Parkinson's Disease

    Science.gov (United States)

    Ma, Joan K.-Y.; Whitehill, Tara L.; So, Susanne Y.-S.

    2010-01-01

    Purpose: Speech produced by individuals with hypokinetic dysarthria associated with Parkinson's disease (PD) is characterized by a number of features including impaired speech prosody. The purpose of this study was to investigate intonation contrasts produced by this group of speakers. Method: Speech materials with a question-statement contrast…

  12. Neural Control of Rising and Falling Tones in Mandarin Speakers Who Stutter

    Science.gov (United States)

    Howell, Peter; Jiang, Jing; Peng, Danling; Lu, Chunming

    2012-01-01

    Neural control of rising and falling tones in Mandarin people who stutter (PWS) was examined by comparing with that which occurs in fluent speakers [Howell, Jiang, Peng, and Lu (2012). Neural control of fundamental frequency rise and fall in Mandarin tones. "Brain and Language, 121"(1), 35-46]. Nine PWS and nine controls were scanned. Functional…

  13. Non-Native Speakers of the Language of Instruction: Self-Perceptions of Teaching Ability

    Science.gov (United States)

    Samuel, Carolyn

    2017-01-01

    Given the linguistically diverse instructor and student populations at Canadian universities, mutually comprehensible oral language may not be a given. Indeed, both instructors who are non-native speakers of the language of instruction (NNSLIs) and students have acknowledged oral communication challenges. Little is known, though, about how the…

  14. Topic Continuity in Informal Conversations between Native and Non-Native Speakers of English

    Science.gov (United States)

    Morris-Adams, Muna

    2013-01-01

    Topic management by non-native speakers (NNSs) during informal conversations has received comparatively little attention from researchers, and receives surprisingly little attention in second language learning and teaching. This article reports on one of the topic management strategies employed by international students during informal, social…

  15. Linguistic skills of adult native speakers, as a function of age and level of education

    NARCIS (Netherlands)

    Mulder, K.; Hulstijn, J.H.

    2011-01-01

    This study assessed, in a sample of 98 adult native speakers of Dutch, how their lexical skills and their speaking proficiency varied as a function of their age and level of education and profession (EP). Participants, categorized in terms of their age (18-35, 36-50, and 51-76 years old) and the

  16. Revealing Word Order: Using Serial Position in Binomials to Predict Properties of the Speaker

    Science.gov (United States)

    Iliev, Rumen; Smirnova, Anastasia

    2016-01-01

    Three studies test the link between word order in binomials and psychological and demographic characteristics of a speaker. While linguists have already suggested that psychological, cultural and societal factors are important in choosing word order in binomials, the vast majority of relevant research was focused on general factors and on broadly…

  17. Objective measures to improve the selection of training speakers in HMM-based child speech synthesis

    CSIR Research Space (South Africa)

    Govender, Avashna

    2016-12-01

    Full Text Available Building synthetic child voices is considered a difficult task due to the challenges associated with data collection. As a result, speaker adaptation in conjunction with Hidden Markov Model (HMM)-based synthesis has become prevalent in this domain...

  18. Using Reversed MFCC and IT-EM for Automatic Speaker Verification

    Directory of Open Access Journals (Sweden)

    Sheeraz Memon

    2012-01-01

    Full Text Available This paper proposes text independent automatic speaker verification system using IMFCC (Inverse/ Reverse Mel Frequency Coefficients and IT-EM (Information Theoretic Expectation Maximization. To perform speaker verification, feature extraction using Mel scale has been widely applied and has established better results. The IMFCC is based on inverse Mel-scale. The IMFCC effectively captures information available at the high frequency formants which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at input level is proposed. GMMs (Gaussian Mixture Models based on EM (Expectation Maximization have been widely used for classification of text independent verification. However EM comes across the convergence issue. In this paper we use our proposed IT-EM which has faster convergence, to train speaker models. IT-EM uses information theory principles such as PDE (Parzen Density Estimation and KL (Kullback-Leibler divergence measure. IT-EM acclimatizes the weights, means and covariances, like EM. However, IT-EM process is not performed on feature vector sets but on a set of centroids obtained using IT (Information Theoretic metric. The IT-EM process at once diminishes divergence measure between PDE estimates of features distribution within a given class and the centroids distribution within the same class. The feature level fusion and IT-EM is tested for the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC has better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector size is also much smaller than the delta MFCC thus reducing the computational burden as well. IT-EM method also showed faster convergence, than the conventional EM method, and thus it leads to higher speaker recognition scores.

  19. Communication‐related affective, behavioral, and cognitive reactions in speakers with spasmodic dysphonia

    Science.gov (United States)

    Vanryckeghem, Martine

    2017-01-01

    Objectives To investigate the self‐perceived affective, behavioral, and cognitive reactions associated with communication of speakers with spasmodic dysphonia as a function of employment status. Study Design Prospective cross‐sectional investigation Methods 148 Participants with spasmodic dysphonia (SD) completed an adapted version of the Behavior Assessment Battery (BAB‐Voice), a multidimensional assessment of self‐perceived reactions to communication. The BAB‐Voice consisted of four subtests: the Speech Situation Checklist for A) Emotional Reaction (SSC‐ER) and B) Speech Disruption (SSC‐SD), C) the Behavior Checklist (BCL), and D) the Communication Attitude Test for Adults (BigCAT). Participants were assigned to groups based on employment status (working versus retired). Results Descriptive comparison of the BAB‐Voice in speakers with SD to previously published non‐dysphonic speaker data revealed substantially higher scores associated with SD across all four subtests. Multivariate Analysis of Variance (MANOVA) revealed no significantly different BAB‐Voice subtest scores as a function of SD group status (working vs. retired). Conclusions BAB‐Voice scores revealed that speakers with SD experienced substantial impact of their voice disorder on communication attitude, coping behaviors, and affective reactions in speaking situations as reflected in their high BAB scores. These impacts do not appear to be influenced by work status, as speakers with SD who were employed or retired experienced similar levels of affective and behavioral reactions in various speaking situations and cognitive responses. These findings are consistent with previously published pilot data. The specificity of items assessed by means of the BAB‐Voice may inform the clinician of valid patient‐centered treatment goals which target the impairment extended beyond the physiological dimension. Level of Evidence 2b PMID:29299525

  20. Communication-related affective, behavioral, and cognitive reactions in speakers with spasmodic dysphonia.

    Science.gov (United States)

    Watts, Christopher R; Vanryckeghem, Martine

    2017-12-01

    To investigate the self-perceived affective, behavioral, and cognitive reactions associated with communication of speakers with spasmodic dysphonia as a function of employment status. Prospective cross-sectional investigation. 148 Participants with spasmodic dysphonia (SD) completed an adapted version of the Behavior Assessment Battery (BAB-Voice), a multidimensional assessment of self-perceived reactions to communication. The BAB-Voice consisted of four subtests: the Speech Situation Checklist for A) Emotional Reaction (SSC-ER) and B) Speech Disruption (SSC-SD), C) the Behavior Checklist (BCL), and D) the Communication Attitude Test for Adults (BigCAT). Participants were assigned to groups based on employment status (working versus retired). Descriptive comparison of the BAB-Voice in speakers with SD to previously published non-dysphonic speaker data revealed substantially higher scores associated with SD across all four subtests. Multivariate Analysis of Variance (MANOVA) revealed no significantly different BAB-Voice subtest scores as a function of SD group status (working vs. retired). BAB-Voice scores revealed that speakers with SD experienced substantial impact of their voice disorder on communication attitude, coping behaviors, and affective reactions in speaking situations as reflected in their high BAB scores. These impacts do not appear to be influenced by work status, as speakers with SD who were employed or retired experienced similar levels of affective and behavioral reactions in various speaking situations and cognitive responses. These findings are consistent with previously published pilot data. The specificity of items assessed by means of the BAB-Voice may inform the clinician of valid patient-centered treatment goals which target the impairment extended beyond the physiological dimension. 2b.