WorldWideScience

Sample records for terms text words

  1. THE RELATIONSHIP BETWEEN WORDS, TEXTS, CLOTHES AND TEXTILES

    Directory of Open Access Journals (Sweden)

    STURZA Amalia

    2017-05-01

    In this paper we speculate on the possible relationships between “word,” “text,” “textile,” and “clothing”. Many of the terms we use to describe our interactions with words are derived from a common linguistic root, and numerous other expressions associated with reading and writing are drawn from the rich vocabulary of cloth. Textiles are one of the most ubiquitous components of material culture, and they are also integral to the material history of texts. The intersection between texts and textiles locates the relationship between language and dress, as together they structure the fashion scene over the century. We compare texts and storytelling with the process of making clothes, which starts from fibers that are spun to create the fabric out of which the clothes are made. Besides the similarity of the words “text” and “textile,” which share four letters, there is also a resemblance in the way they transmit a message. Just as texts are meant to transmit something to the reader, to enchant and to create emotions in various ways, clothes are meant to transmit emotions and feelings to the wearer or to the people watching them.

  2. Genes2WordCloud: a quick way to identify biological themes from gene lists and free text

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2011-10-01

    Background: Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Results: Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Methods: Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, MySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Conclusions: Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications.
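
The frequency-based weighting mentioned in this abstract can be illustrated with a minimal sketch. It is not the Genes2WordCloud implementation: the stop-word list, the tokenizer, and the normalization by the maximum count are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "by"}


def term_weights(text, top_n=10):
    """Return the top_n (term, weight) pairs, with weight = count / highest count."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    if not counts:
        return []
    top = counts.most_common(top_n)
    max_count = top[0][1]
    return [(term, count / max_count) for term, count in top]


sample = "Kinase signaling and kinase inhibitors: the kinase pathway in signaling."
for term, weight in term_weights(sample, top_n=3):
    print(f"{term}: {weight:.2f}")
```

A word-cloud renderer would then map each weight to a font size; that step is left out here.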

  3. On variation of word frequencies in Russian literary texts

    Science.gov (United States)

    Kargin, Vladislav

    2016-03-01

    We study the variation of word frequencies in Russian literary texts. Our findings indicate that the standard deviation of a word's frequency across texts depends on its average frequency according to a power law with exponent 1/2 volatility (that is, higher "burstiness"). A latent factor model has been estimated to investigate the structure of the word frequency distribution. The findings suggest that the dependence of a word's frequency volatility on its average frequency can be explained by the asymmetry in the distribution of latent factors.

  4. The Power of Social Media Analytics: Text Analytics Based on Sentiment Analysis and Word Clouds on R

    Directory of Open Access Journals (Sweden)

    Ahmed Imran KABIR

    2018-01-01

    Word clouds have emerged as a clear and appealing visualization strategy for text. They are used in various settings to give an overview by distilling text down to the words that occur most frequently. Typically, this is done purely as a text summary. There is, however, greater potential in this simple yet powerful visualization paradigm for text analytics. In this work, we investigate the adequacy of word clouds for general text analysis tasks, analyze tweets to determine their sentiment, and discuss the legal aspects of text mining. We used R to pull Twitter data, relying on word clouds as the visualization technique and on lists of positive and negative words to determine user sentiment. We show how this approach can be used effectively to support text analysis tasks and assess it in a qualitative user study.

  5. Genes2WordCloud: a quick way to identify biological themes from gene lists and free text.

    Science.gov (United States)

    Baroukh, Caroline; Jenkins, Sherry L; Dannenfelser, Ruth; Ma'ayan, Avi

    2011-10-13

    Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications.

  6. Text Comprehension and Oral Language as Predictors of Word-Problem Solving: Insights into Word-Problem Solving as a Form of Text Comprehension.

    Science.gov (United States)

    Fuchs, Lynn S; Gilbert, Jennifer K; Fuchs, Douglas; Seethaler, Pamela M; Martin, BrittanyLee N

    2018-01-01

    This study was designed to deepen insights on whether word-problem (WP) solving is a form of text comprehension (TC) and on the role of language in WPs. A sample of 325 second graders, representing high, average, and low reading and math performance, was assessed on (a) start-of-year TC, WP skill, language, nonlinguistic reasoning, working memory, and foundational skill (word identification, arithmetic) and (b) year-end WP solving, WP-language processing (understanding WP statements, without calculation demands), and calculations. Multivariate, multilevel path analysis, accounting for classroom and school effects, indicated that TC was a significant and comparably strong predictor of all outcomes. Start-of-year language was a significantly stronger predictor of both year-end WP outcomes than of calculations, whereas start-of-year arithmetic was a significantly stronger predictor of calculations than of either WP measure. Implications are discussed in terms of WP solving as a form of TC and a theoretically coordinated approach, focused on language, for addressing TC and WP-solving instruction.

  7. Automatic Text Analysis Based on Transition Phenomena of Word Occurrences

    Science.gov (United States)

    Pao, Miranda Lee

    1978-01-01

    Describes a method of selecting index terms directly from a word frequency list, an idea originally suggested by Goffman. Results of the analysis of word frequencies of two articles seem to indicate that the automated selection of index terms from a frequency list holds some promise for automatic indexing. (Author/MBR)
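
As a rough illustration of selecting index terms from a word frequency list, the sketch below assumes the transition-point formula usually attributed to Goffman in later literature, T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the number of words that occur exactly once. Neither the formula nor the `band` parameter is stated in the abstract above, so both should be read as assumptions.

```python
import math
from collections import Counter


def transition_point(counts):
    """Transition frequency T from the number of words occurring once (assumed formula)."""
    singletons = sum(1 for c in counts.values() if c == 1)
    return (-1 + math.sqrt(1 + 8 * singletons)) / 2


def candidate_index_terms(tokens, band=1.0):
    """Words whose frequency lies within `band` of the transition point.

    `band` is a hypothetical tuning parameter introduced for this sketch.
    """
    counts = Counter(tokens)
    t = transition_point(counts)
    return sorted(w for w, c in counts.items() if abs(c - t) <= band)


tokens = "x x y y y z z z z z p q r".split()
print(transition_point(Counter(tokens)))
print(candidate_index_terms(tokens, band=0.5))
```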

  8. Text Comprehension and Oral Language as Predictors of Word-Problem Solving: Insights into Word-Problem Solving as a Form of Text Comprehension

    Science.gov (United States)

    Fuchs, Lynn S.; Gilbert, Jennifer K.; Fuchs, Douglas; Seethaler, Pamela M.; Martin, BrittanyLee N.

    2018-01-01

    This study was designed to deepen insights on whether word-problem (WP) solving is a form of text comprehension (TC) and on the role of language in WPs. A sample of 325 second graders, representing high, average, and low reading and math performance, was assessed on (a) start-of-year TC, WP skill, language, nonlinguistic reasoning, working memory, and foundational skill (word identification, arithmetic) and (b) year-end WP solving, WP-language processing (understanding WP statements, without calculation demands), and calculations. Multivariate, multilevel path analysis, accounting for classroom and school effects, indicated that TC was a significant and comparably strong predictor of all outcomes. Start-of-year language was a significantly stronger predictor of both year-end WP outcomes than of calculations, whereas start-of-year arithmetic was a significantly stronger predictor of calculations than of either WP measure. Implications are discussed in terms of WP solving as a form of TC and a theoretically coordinated approach, focused on language, for addressing TC and WP-solving instruction. PMID:29643723

  9. "Daddy, Where Did the Words Go?" How Teachers Can Help Emergent Readers Develop a Concept of Word in Text

    Science.gov (United States)

    Flanigan, Kevin

    2006-01-01

    This article focuses on a concept that has rarely been studied in beginning reading research--a child's concept of word in text. Recent examinations of this phenomenon suggest that a child's ability to match spoken words to written words while reading--a concept of word in text--plays a pivotal role in early reading development. In this article,…

  10. Word Length Effects in Long-Term Memory

    Science.gov (United States)

    Tehan, Gerald; Tolan, Georgina Anne

    2007-01-01

    The word length effect has been a central feature of theorising about immediate memory. The notion that short-term memory traces rapidly decay unless refreshed by rehearsal is based primarily upon the finding that serial recall for short words is better than that for long words. The decay account of the word length effect has come under pressure…

  11. Word2vec and dictionary based approach for Uyghur text filtering

    Science.gov (United States)

    Tohti, Turdi; Zhao, Yunxing; Musajan, Winira

    2017-08-01

    With the emergence of deep learning, the representation of words in computers has seen major breakthroughs, and the effectiveness of text processing based on word vectors has also improved significantly. First, this paper maps all patterns into a more abstract vector space using a Uyghur-Chinese dictionary and the deep learning tool Word2vec. Second, similar patterns are found according to the characteristics of the original pattern. Finally, texts are filtered using the Wu-Manber algorithm. Experiments show that this method clearly improves the filtering accuracy and recall for Uyghur text.

  12. Word-level recognition of multifont Arabic text using a feature vector matching approach

    Science.gov (United States)

    Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

    1996-03-01

    Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
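
The matching step described in this abstract (a query feature vector compared against a database that may hold several vectors per word) can be sketched as a nearest-neighbour lookup. The abstract does not name the match score, so cosine similarity is an assumption here, and the three-dimensional vectors and transliterated words are toy placeholders rather than real image-morphological features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_hypotheses(query, database, k=3):
    """Return up to k (word, score) pairs, keeping the best score per word.

    `database` is a list of (word, vector) pairs; a word may appear several
    times, for example once per font or noise model.
    """
    best = {}
    for word, vector in database:
        score = cosine(query, vector)
        if score > best.get(word, -1.0):
            best[word] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy database: two stored vectors for one word, one for another.
database = [
    ("kitab", [1.0, 0.0, 0.0]),
    ("kitab", [0.9, 0.1, 0.0]),
    ("qalam", [0.0, 1.0, 0.0]),
]
print(top_hypotheses([1.0, 0.05, 0.0], database, k=2))
```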

  13. Word-Length Correlations and Memory in Large Texts: A Visibility Network Analysis

    Directory of Open Access Journals (Sweden)

    Lev Guzmán-Vargas

    2015-11-01

    We study the correlation properties of word lengths in large texts from 30 ebooks in the English language from the Gutenberg Project (www.gutenberg.org) using the natural visibility graph method (NVG). NVG converts a time series into a graph and then analyzes its graph properties. First, the original sequence of words is transformed into a sequence of values containing the length of each word, and then, it is integrated. Next, we apply the NVG to the integrated word-length series and construct the network. We show that the degree distribution of that network follows a power law, $P(k) \sim k^{-\gamma}$, with two regimes, which are characterized by the exponents $\gamma_s \approx 1.7$ (at short degree scales) and $\gamma_l \approx 1.3$ (at large degree scales). This suggests that word lengths are much more strongly correlated at large distances between words than at short distances between words. That finding is also supported by the detrended fluctuation analysis (DFA) and recurrence time distribution. These results provide new information about the universal characteristics of the structure of written texts beyond that given by word frequencies.
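
The pipeline in this abstract (word lengths, integration, natural visibility graph, degree distribution) can be sketched in a few lines. This is a brute-force illustration, not the authors' code. Two points i < j are linked when every point between them lies strictly below the straight line joining them. "Integrated" is read here as a plain cumulative sum, which is an assumption; subtracting the mean first would only add a linear trend, and that leaves visibility unchanged.

```python
from collections import Counter
from itertools import accumulate


def integrated_word_lengths(text):
    """Cumulative sum of word lengths (whitespace tokenization is an assumption)."""
    return list(accumulate(len(word) for word in text.split()))


def visibility_graph(series):
    """Edges (i, j) of the natural visibility graph, by brute force."""
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Point k must lie strictly below the segment from i to j.
            # Cross-multiplied form avoids division; j - i is positive.
            visible = all(
                (series[k] - series[i]) * (j - i) < (series[j] - series[i]) * (k - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


def degree_counts(edges):
    """Map each degree k to the number of nodes having that degree."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return Counter(degree.values())


series = integrated_word_lengths("the quick brown fox jumps over the lazy dog")
print(sorted(degree_counts(visibility_graph(series)).items()))
```

The triple loop is cubic in the worst case, so a text of book length would need a faster construction; the sketch is meant only to make the definition concrete.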

  14. Scale and time dependence of serial correlations in word-length time series of written texts

    Science.gov (United States)

    Rodriguez, E.; Aguilar-Cornejo, M.; Femat, R.; Alvarez-Ramirez, J.

    2014-11-01

    This work considered the quantitative analysis of large written texts. To this end, the text was converted into a time series by taking the sequence of word lengths. The detrended fluctuation analysis (DFA) was used for characterizing long-range serial correlations of the time series. The DFA was implemented within a rolling window framework for estimating how the strength of correlations, quantified in terms of the scaling exponent, varies along the text. Also, a filtering derivative was used to compute the dependence of the scaling exponent relative to the scale. The analysis was applied to three famous literary narratives written in English; namely, Alice in Wonderland (by Lewis Carroll), Dracula (by Bram Stoker) and Sense and Sensibility (by Jane Austen). The results showed that high correlations appear for scales of about 50-200 words, suggesting that at these scales the text has the strongest coherence. The scaling exponent was not constant along the text, showing important variations with apparent cyclical behavior. An interesting coincidence between the scaling exponent variations and changes in narrative units (e.g., chapters) was found. This suggests that the scaling exponent obtained from the DFA is able to detect changes in narration structure as expressed by the usage of words of different lengths.

  15. The impact of inverted text on visual word processing: An fMRI study.

    Science.gov (United States)

    Sussman, Bethany L; Reddigari, Samir; Newman, Sharlene D

    2018-06-01

    Visual word recognition has been studied for decades. One question that has received limited attention is how different text presentation orientations disrupt word recognition. By examining how word recognition processes may be disrupted by different text orientations it is hoped that new insights can be gained concerning the process. Here, we examined the impact of rotating and inverting text on the neural network responsible for visual word recognition focusing primarily on a region of the occipito-temporal cortex referred to as the visual word form area (VWFA). A lexical decision task was employed in which words and pseudowords were presented in one of three orientations (upright, rotated or inverted). The results demonstrate that inversion caused the greatest disruption of visual word recognition processes. Both rotated and inverted text elicited increased activation in spatial attention regions within the right parietal cortex. However, inverted text recruited phonological and articulatory processing regions within the left inferior frontal and left inferior parietal cortices. Finally, the VWFA was found to not behave similarly to the fusiform face area in that unusual text orientations resulted in increased activation and not decreased activation. It is hypothesized here that the VWFA activation is modulated by feedback from linguistic processes.

  16. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction.

    Science.gov (United States)

    Najafi, Elham; Darooneh, Amir H

    2015-01-01

    A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the importance of the words in a given text using their fractal dimensions and then ranking them according to their importance. This index measures the difference between the fractal pattern of a word in the original text relative to a shuffled version. Because the shuffled text is meaningless (i.e., words have no importance), the difference between the original and shuffled text can be used to ascertain degree of fractality. The degree of fractality may be used for automatic keyword detection. Words with the degree of fractality higher than a threshold value are assumed to be the retrieved keywords of the text. We measure the efficiency of our method for keywords extraction, making a comparison between our proposed method and two other well-known methods of automatic keyword extraction.

  17. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction

    Science.gov (United States)

    Najafi, Elham; Darooneh, Amir H.

    2015-01-01

    A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the importance of the words in a given text using their fractal dimensions and then ranking them according to their importance. This index measures the difference between the fractal pattern of a word in the original text relative to a shuffled version. Because the shuffled text is meaningless (i.e., words have no importance), the difference between the original and shuffled text can be used to ascertain degree of fractality. The degree of fractality may be used for automatic keyword detection. Words with the degree of fractality higher than a threshold value are assumed to be the retrieved keywords of the text. We measure the efficiency of our method for keywords extraction, making a comparison between our proposed method and two other well-known methods of automatic keyword extraction. PMID:26091207

  18. Words Matter: Scene Text for Image Classification and Retrieval

    NARCIS (Netherlands)

    Karaoglu, S.; Tao, R.; Gevers, T.; Smeulders, A.W.M.

    Text in natural images typically adds meaning to an object or scene. In particular, text specifies which business places serve drinks (e.g., cafe, teahouse) or food (e.g., restaurant, pizzeria), and what kind of service is provided (e.g., massage, repair). The mere presence of text, its words, and

  19. Is Word-Problem Solving a Form of Text Comprehension?

    Science.gov (United States)

    Fuchs, Lynn S.; Fuchs, Douglas; Compton, Donald L.; Hamlett, Carol L.; Wang, Amber Y.

    2015-01-01

    This study’s hypotheses were that (a) word-problem (WP) solving is a form of text comprehension that involves language comprehension processes, working memory, and reasoning, but (b) WP solving differs from other forms of text comprehension by requiring WP-specific language comprehension as well as general language comprehension. At the start of the 2nd grade, children (n = 206; on average, 7 years, 6 months) were assessed on general language comprehension, working memory, nonlinguistic reasoning, processing speed (a control variable), and foundational skill (arithmetic for WPs; word reading for text comprehension). In spring, they were assessed on WP-specific language comprehension, WPs, and text comprehension. Path analytic mediation analysis indicated that effects of general language comprehension on text comprehension were entirely direct, whereas effects of general language comprehension on WPs were partially mediated by WP-specific language. By contrast, effects of working memory and reasoning operated in parallel ways for both outcomes. PMID:25866461

  20. Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation.

    Science.gov (United States)

    Jimeno Yepes, Antonio

    2017-09-01

    Word sense disambiguation helps identify the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus, ambiguities appear and highly effective disambiguation methods are required. Supervised learning algorithms are one of the approaches used to perform disambiguation. Features extracted from the context of an ambiguous word are used to identify the proper sense of such a word. The type of features has an impact on machine learning methods and thus affects disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word, and we have also explored more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and also allow the use of recurrent neural network classifiers based on Long-Short Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state-of-the-art performance with a macro accuracy of 95.97 in the MSH WSD data set.

  1. Zur Wortbildung in wissenschaftlichen Texten (Word Formation in Scientific Texts)

    Science.gov (United States)

    Rogalla, Hanna; Rogalla, Willy

    1976-01-01

    Discusses a German frequency list of 1,500 to 2,000 scientific words, which is being developed, and the importance of learning word-building principles. Substantive and adjective suffixes are listed according to frequency, followed by remarks on copulative compounds, with examples and frequency ranking, and, finally, prefixes. (Text is in German.)…

  2. Key word placing in Web page body text to increase visibility to search engines

    Directory of Open Access Journals (Sweden)

    W. T. Kritzinger

    2007-11-01

    The growth of the World Wide Web has spawned a wide variety of new information sources, which has also left users with the daunting task of determining which sources are valid. Many users rely on the Web as an information source because of the low cost of information retrieval. It is also claimed that the Web has evolved into a powerful business tool. Examples include highly popular business services such as Amazon.com and Kalahari.net. It is estimated that around 80% of users utilize search engines to locate information on the Internet. This, by implication, places emphasis on the underlying importance of Web pages being listed in search engine indices. Empirical evidence that the placement of key words in certain areas of the body text will have an influence on a Web site's visibility to search engines could not be found in the literature. The results of two experiments indicated that key words should be concentrated towards the top, and diluted towards the bottom, of a Web page to increase visibility. However, care should be taken in terms of key word density, to prevent search engine algorithms from raising the spam alarm.

  3. Is Word-Problem Solving a Form of Text Comprehension?

    Science.gov (United States)

    Fuchs, Lynn S.; Fuchs, Douglas; Compton, Donald L.; Hamlett, Carol L.; Wang, Amber Y.

    2015-01-01

    This study's hypotheses were that (a) word-problem (WP) solving is a form of text comprehension that involves language comprehension processes, working memory, and reasoning, but (b) WP solving differs from other forms of text comprehension by requiring WP-specific language comprehension as well as general language comprehension. At the start of…

  4. Detecting New Words from Chinese Text Using Latent Semi-CRF Models

    Science.gov (United States)

    Sun, Xiao; Huang, Degen; Ren, Fuji

    Chinese new words and their part-of-speech (POS) are particularly problematic in Chinese natural language processing. With the fast development of the internet and information technology, it is impossible to get a complete system dictionary for Chinese natural language processing, as new words outside the basic system dictionary are always being created. A latent semi-CRF model, which combines the strengths of LDCRF (Latent-Dynamic Conditional Random Field) and semi-CRF, is proposed to detect the new words together with their POS synchronously, regardless of the types of the new words, from Chinese text that has not been pre-segmented. Unlike the original semi-CRF, the LDCRF is applied to generate the candidate entities for training and testing the latent semi-CRF, which accelerates the training speed and decreases the computation cost. The complexity of the latent semi-CRF could be further adjusted by tuning the number of hidden variables in LDCRF and the number of the candidate entities from the Nbest outputs of the LDCRF. A new-words-generating framework is proposed for model training and testing, under which the definitions and distributions of the new words conform to the ones existing in real text. Specific features called “Global Fragment Information” for new word detection and POS tagging are adopted in the model training and testing. The experimental results show that the proposed method is capable of detecting even low frequency new words together with their POS tags. The proposed model is found to perform competitively with the state-of-the-art models presented.

  5. Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network

    OpenAIRE

    Stewart, Ian; Arendt, Dustin; Bell, Eric; Volkova, Svitlana

    2017-01-01

    Language in social media is extremely dynamic: new words emerge, trend and disappear, while the meaning of existing words can fluctuate over time. Such dynamics are especially notable during a period of crisis. This work addresses several important tasks of measuring, visualizing and predicting short term text representation shift, i.e. the change in a word's contextual semantics, and contrasting such shift with surface level word dynamics, or concept drift, observed in social media streams. ...

  6. A Text Steganographic System Based on Word Length Entropy Rate

    Directory of Open Access Journals (Sweden)

    Francis Xavier Kofi Akotoye

    2017-10-01

    The widespread adoption of electronic distribution of material is accompanied by illicit copying and distribution. This is why individuals, businesses and governments have come to think about how to protect their work, prevent such illicit activities and trace the distribution of a document. It is in this context that a lot of attention is being focused on steganography. Implementing steganography in text documents is not an easy undertaking, because a text document has very few places in which to embed hidden data. Any minute change introduced to text objects can easily be noticed, thus attracting attention from possible hackers. This study investigates the possibility of embedding data in a text document by employing the entropy rate of the constituent characters of words at least four characters long. The scheme embeds bits in text according to the alphabetic structure of the words: each character is compared with its neighbouring character, and if the first character is alphabetically lower than the succeeding character according to their ASCII codes, a zero bit is embedded; otherwise a 1 is embedded after the characters have been transposed. Before embedding, the secret message was encrypted with a secret key to add a layer of security, and a pseudorandom number was then generated from the word counts of the text and used to mark the starting point of the embedding process. The embedding capacity of the scheme was relatively high compared with the space encoding and semantic methods.

  7. Evaluating a Bilingual Text-Mining System with a Taxonomy of Key Words and Hierarchical Visualization for Understanding Learner-Generated Text

    Science.gov (United States)

    Kong, Siu Cheung; Li, Ping; Song, Yanjie

    2018-01-01

    This study evaluated a bilingual text-mining system, which incorporated a bilingual taxonomy of key words and provided hierarchical visualization, for understanding learner-generated text in the learning management systems through automatic identification and counting of matching key words. A class of 27 in-service teachers studied a course…

  8. Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words.

    Science.gov (United States)

    Tanaka-Ishii, Kumiko; Bunde, Armin

    2016-01-01

    A fundamental problem in linguistics is how literary texts can be quantified mathematically. It is well known that the frequency of a (rare) word in a text is roughly inversely proportional to its rank (Zipf's law). Here we address the complementary question, if also the rhythm of the text, characterized by the arrangement of the rare words in the text, can be quantified mathematically in a similar basic way. To this end, we consider representative classic single-authored texts from England/Ireland, France, Germany, China, and Japan. In each text, we classify each word by its rank. We focus on the rare words with ranks above some threshold $Q$ and study the lengths of the (return) intervals between them. We find that for all texts considered, the probability $S_Q(r)$ that the length of an interval exceeds $r$ follows a perfect Weibull function, $S_Q(r) = \exp(-b(\beta) r^{\beta})$, with $\beta$ around 0.7. The return intervals themselves are arranged in a long-range correlated self-similar fashion, where the autocorrelation function $C_Q(s)$ of the intervals follows a power law, $C_Q(s) \sim s^{-\gamma}$, with an exponent $\gamma$ between 0.14 and 0.48. We show that these features lead to a pronounced clustering of the rare words in the text.
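
The return-interval construction in this abstract can be made concrete with a short sketch. It is an illustration rather than the authors' procedure: whitespace tokenization and the way ties in frequency are ranked (first occurrence wins) are assumptions. Words are ranked by frequency, words with rank above a threshold Q count as rare, and the empirical survival function gives the fraction of intervals longer than r.

```python
from collections import Counter


def rare_word_intervals(tokens, rank_threshold):
    """Distances between successive occurrences of any word ranked above the threshold."""
    counts = Counter(tokens)
    ranked = [word for word, _ in counts.most_common()]
    rare = set(ranked[rank_threshold:])  # 1-based ranks greater than rank_threshold
    positions = [i for i, token in enumerate(tokens) if token in rare]
    return [b - a for a, b in zip(positions, positions[1:])]


def survival(intervals, r):
    """Empirical survival function: fraction of intervals strictly longer than r."""
    return sum(1 for x in intervals if x > r) / len(intervals)


tokens = "a b a c a d a a e".split()
intervals = rare_word_intervals(tokens, rank_threshold=1)
print(intervals)
print(survival(intervals, 2))
```

If the survival function has the Weibull form exp(-b r^β), then ln(-ln S) is linear in ln r with slope β, so β could be estimated from a straight-line fit on those transformed values.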

  9. Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words.

    Directory of Open Access Journals (Sweden)

    Kumiko Tanaka-Ishii

    Full Text Available A fundamental problem in linguistics is how literary texts can be quantified mathematically. It is well known that the frequency of a (rare) word in a text is roughly inversely proportional to its rank (Zipf's law). Here we address the complementary question, if also the rhythm of the text, characterized by the arrangement of the rare words in the text, can be quantified mathematically in a similar basic way. To this end, we consider representative classic single-authored texts from England/Ireland, France, Germany, China, and Japan. In each text, we classify each word by its rank. We focus on the rare words with ranks above some threshold $Q$ and study the lengths of the (return) intervals between them. We find that for all texts considered, the probability $S_Q(r)$ that the length of an interval exceeds $r$, follows a perfect Weibull-function, $S_Q(r) = \exp(-b(\beta)\, r^{\beta})$, with $\beta$ around 0.7. The return intervals themselves are arranged in a long-range correlated self-similar fashion, where the autocorrelation function $C_Q(s)$ of the intervals follows a power law, $C_Q(s) \sim s^{-\gamma}$, with an exponent $\gamma$ between 0.14 and 0.48. We show that these features lead to a pronounced clustering of the rare words in the text.

  10. Metric Characterizations of Superreflexivity in Terms of Word Hyperbolic Groups and Finite Graphs

    Directory of Open Access Journals (Sweden)

    Ostrovskii Mikhail

    2014-01-01

    Full Text Available We show that superreflexivity can be characterized in terms of bilipschitz embeddability of word hyperbolic groups. We compare characterizations of superreflexivity in terms of diamond graphs and binary trees. We show that there exist sequences of series-parallel graphs of increasing topological complexity which admit uniformly bilipschitz embeddings into a Hilbert space, and thus do not characterize superreflexivity.

  11. Sentiment analysis methods for understanding large-scale texts: a case for using continuum-scored words and word shift graphs

    Directory of Open Access Journals (Sweden)

    Andrew J Reagan

    2017-10-01

    Full Text Available Abstract The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, an extraordinary capacity which has profound implications for our understanding of human behavior. Given the growing assortment of sentiment-measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts. Most importantly they can aid understanding of texts with reliable and meaningful word shift graphs if (1) the dictionary covers a sufficiently large portion of a given text’s lexicon when weighted by word usage frequency; and (2) words are scored on a continuous scale.
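
    To make the dictionary-based approach concrete, here is a minimal sketch, assuming a toy dictionary of continuous word scores; it is not the code or any of the dictionaries evaluated in the paper. The function names are hypothetical. It computes a frequency-weighted average score, the share of tokens the dictionary covers, and per-word contributions to the score difference between two texts, which is the kind of quantity a word shift graph displays.

```python
from collections import Counter


def average_score(tokens, scores):
    """Frequency-weighted mean score over tokens found in the dictionary."""
    counts = Counter(t for t in tokens if t in scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no token is covered by the dictionary")
    return sum(scores[w] * n for w, n in counts.items()) / total


def coverage(tokens, scores):
    """Share of tokens (weighted by usage frequency) that the dictionary scores."""
    return sum(1 for t in tokens if t in scores) / len(tokens)


def word_shift(reference, comparison, scores):
    """Per-word contributions to average_score(comparison) - average_score(reference)."""
    ref_avg = average_score(reference, scores)
    ref = Counter(t for t in reference if t in scores)
    cmp_ = Counter(t for t in comparison if t in scores)
    n_ref, n_cmp = sum(ref.values()), sum(cmp_.values())
    return {
        # (score relative to the reference mean) * (change in relative frequency)
        w: (scores[w] - ref_avg) * (cmp_[w] / n_cmp - ref[w] / n_ref)
        for w in set(ref) | set(cmp_)
    }
```

    The contributions sum exactly to the difference between the two average scores, so sorting them by magnitude shows which words drive the change.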

  12. THE EFFECT OF TEACHING WITHIN-TEXT KEY WORDS ON STUDENTS’ PERFORMANCE IN READING COMPREHENSION

    Directory of Open Access Journals (Sweden)

    Mohammad Reza Khodasenas

    2013-07-01

    Full Text Available Abstract: The present study was conducted to investigate the effects of teaching within-text key word synonyms, opposites and related words on students’ performance on reading comprehension of TOEFL among Iranian EFL learners. To carry out the research, 60 Iranian EFL learners, who participated in a TOEFL preparation course, were selected as the participants of the study. Afterward they were randomly assigned to experimental and comparison groups. The experimental group was given a treatment including within-text key word synonyms, opposites and their translations, while the comparison group was given a placebo. To collect the required data, two instruments (a pre-test and a post-test) were administered to both groups during the experimentation. Subsequently, students’ scores were collected through the administration of different tests and the results were statistically analyzed. The results of these analyses revealed that the experimental group outperformed the comparison group and thus, it was concluded that teaching within-text key word synonyms, opposites and related words can improve the reading comprehension ability and general proficiency of EFL language learners.

  13. The Effect of Frequency of Input-Enhancements on Word Learning and Text Comprehension

    Science.gov (United States)

    Rott, Susanne

    2007-01-01

    Research on second language lexical development during reading has found positive effects for word frequency, the provision of glosses, and elaborative word processing. However, findings have been inconclusive regarding the effect of such intervention tasks on long-term retention. Likewise, few studies have looked at the cumulative effect of…

  14. Model of the Dynamic Construction Process of Texts and Scaling Laws of Words Organization in Language Systems.

    Directory of Open Access Journals (Sweden)

    Shan Li

    Full Text Available Scaling laws characterize diverse complex systems in a broad range of fields, including physics, biology, finance, and social science. The human language is another example of a complex system of words organization. Studies on written texts have shown that scaling laws characterize the occurrence frequency of words, words rank, and the growth of distinct words with increasing text length. However, these studies have mainly concentrated on the western linguistic systems, and the laws that govern the lexical organization, structure and dynamics of the Chinese language remain not well understood. Here we study a database of Chinese and English language books. We report that three distinct scaling laws characterize words organization in the Chinese language. We find that these scaling laws have different exponents and crossover behaviors compared to English texts, indicating different words organization and dynamics of words in the process of text growth. We propose a stochastic feedback model of words organization and text growth, which successfully accounts for the empirically observed scaling laws with their corresponding scaling exponents and characteristic crossover regimes. Further, by varying key model parameters, we reproduce differences in the organization and scaling laws of words between the Chinese and English language. We also identify functional relationships between model parameters and the empirically observed scaling exponents, thus providing new insights into the words organization and growth dynamics in the Chinese and English language.
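
    The two empirical quantities behind these scaling laws, word frequency as a function of rank and the number of distinct words as a function of text length, can be computed with a few lines. This is a minimal sketch with hypothetical function names, not the authors' stochastic feedback model; it assumes the text is already split into word tokens, which for Chinese would require a separate word segmentation step.

```python
from collections import Counter


def rank_frequency(tokens):
    """List of (rank, count) pairs, rank 1 being the most frequent word."""
    ordered = Counter(tokens).most_common()
    return [(rank, n) for rank, (_, n) in enumerate(ordered, start=1)]


def vocabulary_growth(tokens):
    """Number of distinct words seen after each token of the text."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth
```

    Plotting both outputs on log-log axes is the usual way to read off scaling exponents and crossover regimes.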

  15. Alleged nursery words and hypocorisms among Germanic kinship terms

    DEFF Research Database (Denmark)

    Hansen, Bjarne Simmelkjær Sandgaard

    2017-01-01

    By (re-)evaluating the etymologies of the three Proto-Germanic kinship terms *aiþīn-/-ōn- ‘mother’, *aiþma- ‘daughter’s husband’ and *faþōn- ‘father’s sister’ that are all claimed by at least some etymological handbooks to be nursery words or hypocorisms, I contend that we must abandon their nursery-word interpretations and rather regard them as inherited words derived from known Indo-European lexical material in a way that reveals important information on the Old Germanic society and its family pattern.

  16. Model of the Dynamic Construction Process of Texts and Scaling Laws of Words Organization in Language Systems.

    Science.gov (United States)

    Li, Shan; Lin, Ruokuang; Bian, Chunhua; Ma, Qianli D Y; Ivanov, Plamen Ch

    2016-01-01

    Scaling laws characterize diverse complex systems in a broad range of fields, including physics, biology, finance, and social science. The human language is another example of a complex system of words organization. Studies on written texts have shown that scaling laws characterize the occurrence frequency of words, words rank, and the growth of distinct words with increasing text length. However, these studies have mainly concentrated on the western linguistic systems, and the laws that govern the lexical organization, structure and dynamics of the Chinese language remain not well understood. Here we study a database of Chinese and English language books. We report that three distinct scaling laws characterize words organization in the Chinese language. We find that these scaling laws have different exponents and crossover behaviors compared to English texts, indicating different words organization and dynamics of words in the process of text growth. We propose a stochastic feedback model of words organization and text growth, which successfully accounts for the empirically observed scaling laws with their corresponding scaling exponents and characteristic crossover regimes. Further, by varying key model parameters, we reproduce differences in the organization and scaling laws of words between the Chinese and English language. We also identify functional relationships between model parameters and the empirically observed scaling exponents, thus providing new insights into the words organization and growth dynamics in the Chinese and English language.

  17. Artful Terms: A Study on Aesthetic Word Usage for Visual Art versus Film and Music

    Directory of Open Access Journals (Sweden)

    M Dorothee Augustin

    2012-06-01

    Full Text Available Despite the importance of the arts in human life, psychologists still know relatively little about what characterises their experience for the recipient. The current research approaches this problem by studying people's word usage in aesthetics, with a focus on three important art forms: visual art, film, and music. The starting point was a list of 77 words known to be useful to describe aesthetic impressions of visual art (Augustin et al 2012, Acta Psychologica 139 187–201). Focusing on ratings of likelihood of use, we examined to what extent word usage in aesthetic descriptions of visual art can be generalised to film and music. The results support the claim of an interplay of generality and specificity in aesthetic word usage. Terms with equal likelihood of use for all art forms included beautiful, wonderful, and terms denoting originality. Importantly, emotion-related words received higher ratings for film and music than for visual art. To our knowledge this is direct evidence that aesthetic experiences of visual art may be less affectively loaded than, for example, experiences of music. The results render important information about aesthetic word usage in the realm of the arts and may serve as a starting point to develop tailored measurement instruments for different art forms.

  18. Fast words boundaries localization in text fields for low quality document images

    Science.gov (United States)

    Ilin, Dmitry; Novikov, Dmitriy; Polevoy, Dmitry; Nikolaev, Dmitry

    2018-04-01

    The paper examines the problem of word boundaries precise localization in document text zones. Document processing on a mobile device consists of document localization, perspective correction, localization of individual fields, finding words in separate zones, segmentation and recognition. While capturing an image with a mobile digital camera under uncontrolled capturing conditions, digital noise, perspective distortions or glares may occur. Further document processing gets complicated because of its specifics: layout elements, complex background, static text, document security elements, variety of text fonts. However, the problem of word boundaries localization has to be solved at runtime on mobile CPU with limited computing capabilities under specified restrictions. At the moment, there are several groups of methods optimized for different conditions. Methods for the scanned printed text are quick but limited only for images of high quality. Methods for text in the wild have an excessively high computational complexity, thus, are hardly suitable for running on mobile devices as part of the mobile document recognition system. The method presented in this paper solves a more specialized problem than the task of finding text on natural images. It uses local features, a sliding window and a lightweight neural network in order to achieve an optimal algorithm speed-precision ratio. The duration of the algorithm is 12 ms per field running on an ARM processor of a mobile device. The error rate for boundaries localization on a test sample of 8000 fields is 0.3

  19. METHOD OF RARE TERM CONTRASTIVE EXTRACTION FROM NATURAL LANGUAGE TEXTS

    Directory of Open Access Journals (Sweden)

    I. A. Bessmertny

    2017-01-01

    Full Text Available The paper considers the problem of automatic domain term extraction from a document corpus by means of a contrast collection. Existing contrastive methods successfully extract frequently used terms but mishandle rare terms, which can impoverish the resulting thesaurus. Assessment of point-wise mutual information is one of the known statistical methods of term extraction, and it finds rare terms successfully. However, it also extracts many false terms. The proposed approach consists of applying point-wise mutual information to extract rare terms and filtering the candidates by the criterion of joint occurrence with the other candidates. We build a “documents-by-terms” matrix that is subjected to singular value decomposition to eliminate noise and reveal strong interconnections. Then we pass on to the resulting “terms-by-terms” matrix, which reproduces the strength of interconnections between words. This approach was tested on a document collection from the “Geology” domain with the use of contrast documents from such topics as “Politics”, “Culture”, “Economics” and “Accidents” from some Internet resources. The experimental results demonstrate the viability of this method for rare term extraction.
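
    As an illustration of the point-wise mutual information step only, the sketch below scores adjacent word pairs as two-word term candidates. It is a minimal example under stated assumptions, not the authors' pipeline: candidates are taken to be adjacent word pairs, probabilities are plain relative frequencies, and the function name `bigram_pmi` is hypothetical. The co-occurrence filtering and singular value decomposition described above are not shown.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Point-wise mutual information, in bits, for each adjacent word pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (x, y), n_xy in bigrams.items():
        p_xy = n_xy / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        # PMI = log2( p(x, y) / (p(x) * p(y)) )
        pmi[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return pmi
```

    Because the denominator shrinks for infrequent words, pairs of rare words receive high scores, which matches the abstract's remark that the measure finds rare terms but also admits false candidates.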

  20. Short-term verbal memory and psychophysiological response to emotion-related words in children who stutter

    Directory of Open Access Journals (Sweden)

    Stokić Miodrag

    2012-01-01

    Full Text Available Emotions play a significant role in fluency disorders. In this research we wanted to examine immediate and delayed verbal recall for auditorily presented words that carry information about different emotional states (emotion-related words) and emotionally neutral words in children who stutter (N=35) and their peers (N=35). Using only word semantics, we wanted to eliminate emotional verbal expression of words as a factor that can influence memory abilities. In addition, we also wanted to examine skin conductance measure as an indicator of autonomic nervous system arousal during short-term memory task for emotion-related and emotionally neutral words. A parental questionnaire (Stuttering Intensity in Children Who Stutter in Positive and Negative Emotion-Related Everyday Situations) was given to parents of children who stutter in order to collect data regarding stuttering severity in emotionally arousing situations in everyday life. Differences between the experimental and the control group in global memory capacity are highest in immediate recall (p=0.01) with the tendency for lowering statistical significance with prolongation of retention interval. According to the questionnaire results, children who stutter show a higher degree of stuttering in situations with positive emotional valence (p< 0.00). Skin conductance measurements showed higher autonomic nervous system arousal during perception and free recall of positive emotion-related words in children who stutter when compared to negative and emotionally neutral words. The results indicate higher emotional arousal to positive emotions in children who stutter (p=0.02), leading to either less fluent speech or suppression of verbal short-term memory capacity.

  1. Long-term repetition priming with symmetrical polygons and words.

    Science.gov (United States)

    Kersteen-Tucker, Z

    1991-01-01

    In two different tasks, subjects were asked to make lexical decisions (word or nonword) and symmetry judgments (symmetrical or nonsymmetrical) about two-dimensional polygons. In both tasks, every stimulus was repeated at one of four lags (0, 1, 4, or 8 items interposed between the first and second stimulus presentations). This paradigm, known as repetition priming, revealed comparable short-term priming (Lag 0) and long-term priming (Lags 1, 4, and 8) both for symmetrical polygons and for words. A shorter term component (Lags 0 and 1) of priming was observed for nonwords, and only very short-term priming (Lag 0) was observed for nonsymmetrical polygons. These results indicate that response facilitation accruing from repeated exposure can be observed for stimuli that have no preexisting memory representations and suggest that perceptual factors contribute to repetition-priming effects.

  2. Short-term retention of pictures and words: evidence for dual coding systems.

    Science.gov (United States)

    Pellegrino, J W; Siegel, A W; Dhawan, M

    1975-03-01

    The recall of picture and word triads was examined in three experiments that manipulated the type of distraction in a Brown-Peterson short-term retention task. In all three experiments recall of pictures was superior to words under auditory distraction conditions. Visual distraction produced high performance levels with both types of stimuli, whereas combined auditory and visual distraction significantly reduced picture recall without further affecting word recall. The results were interpreted in terms of the dual coding hypothesis and indicated that pictures are encoded into separate visual and acoustic processing systems while words are primarily acoustically encoded.

  3. Word2Vec inversion and traditional text classifiers for phenotyping lupus.

    Science.gov (United States)

    Turner, Clayton A; Jacobs, Alexander D; Marques, Cassios K; Oates, James C; Kamen, Diane L; Anderson, Paul E; Obeid, Jihad S

    2017-08-22

    Identifying patients with certain clinical criteria based on manual chart review of doctors' notes is a daunting task given the massive amounts of text notes in the electronic health records (EHR). This task can be automated using text classifiers based on Natural Language Processing (NLP) techniques along with pattern recognition machine learning (ML) algorithms. The aim of this research is to evaluate the performance of traditional classifiers for identifying patients with Systemic Lupus Erythematosus (SLE) in comparison with a newer Bayesian word vector method. We obtained clinical notes for patients with SLE diagnosis along with controls from the Rheumatology Clinic (662 total patients). Sparse bag-of-words (BOWs) and Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) matrices were produced using NLP pipelines. These matrices were subjected to several different NLP classifiers: neural networks, random forests, naïve Bayes, support vector machines, and Word2Vec inversion, a Bayesian inversion method. Performance was measured by calculating accuracy and area under the Receiver Operating Characteristic (ROC) curve (AUC) of a cross-validated (CV) set and a separate testing set. We calculated the accuracy of the ICD-9 billing codes as a baseline to be 90.00% with an AUC of 0.900, the shallow neural network with CUIs to be 92.10% with an AUC of 0.970, the random forest with BOWs to be 95.25% with an AUC of 0.994, the random forest with CUIs to be 95.00% with an AUC of 0.979, and the Word2Vec inversion to be 90.03% with an AUC of 0.905. Our results suggest that a shallow neural network with CUIs and random forests with both CUIs and BOWs are the best classifiers for this lupus phenotyping task. The Word2Vec inversion method failed to significantly beat the ICD-9 code classification, but yielded promising results. This method does not require explicit features and is more adaptable to non-binary classification tasks. The Word2Vec inversion is

  4. On the origin and meaning of the German word Luft and some meteorological terms concerning atmospheric water, especially fog

    Directory of Open Access Journals (Sweden)

    Möller, Detlev

    2014-12-01

    Full Text Available The English and French word “air” is derived from the Latin aer, which comes from the Greek άήρ. In contrast, the German word “Luft” is a common Proto-Germanic word; in Old English “lift” and “lyft”. The word Luft (also Danish, Swedish and Norwegian) is associated with brightness; the German Licht (light), an air (in an atmospheric sense) without fog or clouds. Air and water were originally “elements” in ancient Greek and were transmutable; they represented two kinds of the “layer of mist” (atmosphere). Dark or thick air was mist or cloud, hiding the gods (who lived in the upper air or sky; the aether). Different terms are presented that describe fog and clouds in connection with the history of the process of understanding. Finally, the word Luft (air) as a term for gaseous chemical compounds (“kinds of gases”) is discussed. In addition to the German, all terms are given in Greek, Latin, English and French.

  5. Knowledge based word-concept model estimation and refinement for biomedical text mining.

    Science.gov (United States)

    Jimeno Yepes, Antonio; Berlanga, Rafael

    2015-02-01

    Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have not been devised for text mining tasks but for human interpretation, thus performance of KB-based methods is usually lower when compared to supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore not useful for large scale biomedical text mining systems. KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method not only takes into account the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. Primary School Text Comprehension Predicts Mathematical Word Problem-Solving Skills in Secondary School

    Science.gov (United States)

    Björn, Piia Maria; Aunola, Kaisa; Nurmi, Jari-Erik

    2016-01-01

    This longitudinal study aimed to investigate the extent to which primary school text comprehension predicts mathematical word problem-solving skills in secondary school among Finnish students. The participants were 224 fourth graders (9-10 years old at the baseline). The children's text-reading fluency, text comprehension and basic calculation…

  7. Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network

    Energy Technology Data Exchange (ETDEWEB)

    Stewart, Ian B.; Arendt, Dustin L.; Bell, Eric B.; Volkova, Svitlana

    2017-05-17

    Language in social media is extremely dynamic: new words emerge, trend and disappear, while the meaning of existing words can fluctuate over time. This work addresses several important tasks of visualizing and predicting short term text representation shift, i.e. the change in a word’s contextual semantics. We study the relationship between short-term concept drift and representation shift on a large social media corpus – VKontakte collected during the Russia-Ukraine crisis in 2014 – 2015. We visualize short-term representation shift for example keywords and build predictive models to forecast short-term shifts in meaning from previous meaning as well as from concept drift. We show that short-term representation shift can be accurately predicted up to several weeks in advance and that visualization provides insight into meaning change. Our approach can be used to explore and characterize specific aspects of the streaming corpus during crisis events and potentially improve other downstream classification tasks including real-time event forecasting in social media.

  8. Ambiguity Analysis Of Words And Terms In Movie Script Entitled “Shooter”

    OpenAIRE

    Putra, Hendry Oktama

    2014-01-01

    General words and terms are used in military language in the movie “Shooter”. In its subtitles and transcripts, many words and terms from general contexts appear in certain scenes. To put it simply, a word from a general context takes on a different meaning when used by military forces. As such, English learners who do not know such language will have difficulty learning these words, even though some of them may have a particular interest in them. Describing the mil...

  9. Basic East German Words, Terms, Names: Why Know Them?

    Science.gov (United States)

    Mayer, Elizabeth M.

    Terms basic to any understanding of East German culture and politics are defined in this paper. The items selected are grouped in five categories: (1) the state, (2) political and philosophical terms, (3) economics, (4) education, and (5) the family, ethics, and the arts. The author emphasizes semantic differences despite similarities to words in…

  10. Using WordNet to Complement Training Information in Text Categorization

    OpenAIRE

    Rodriguez, Manuel de Buenaga; Hidalgo, Jose Maria Gomez; Agudo, Belen Diaz

    1997-01-01

    Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed through the use of a set of manually classified documents, a training collection. We suggest the utilization of additional resources like lexical databases to increase the amount of information that TC systems make use of, and thus, to improve their performance. Our approach integrates WordNet information with two training approaches through the Vector Space Model. ...

  11. 31 CFR 342.1 - Definition of words and terms used in this part.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 2 2010-07-01 2010-07-01 false Definition of words and terms used in this part. 342.1 Section 342.1 Money and Finance: Treasury Regulations Relating to Money and Finance... SAVINGS NOTES § 342.1 Definition of words and terms used in this part. (a) Payroll savings plan refers to...

  12. The Effect of Text Chat Assisted with Word Processors on Saudi English Major Students' Writing Accuracy and Productivity of Authentic Texts

    Directory of Open Access Journals (Sweden)

    Ahmad Mosa Batianeh

    2014-10-01

    Full Text Available Abstract: This study explored the effects of using online chat and word processors on students' writing skills, which include organizing a text, spelling, punctuation, grammar, phrasal verbs, idioms, idiomatic expressions, pragmatics, creativity, vocabulary growth, content, relational words, conjunctions, authenticity, figures of speech, imagination, coherence, style, socio-cultural aspects, language use, and the production of authentic text. The study group consisted of students in the Department of Languages and Translation at Taibah University who registered for the Writing Two course in the first semester of the 2012 - 2013 academic year. Forty subjects were divided into two sections: section one was assigned as an experimental group (supported by Facebook and Skype) and section two was assigned as a control group and was asked to write their essays with paper and pencil. Facebook and Skype accounts were created for every student in the experimental group. Data was analyzed from pre-test and post-test results to evaluate the question posed by the study: Does the use of online text chat assisted with word processors help undergraduate students develop their writing skills more than traditional methods of teaching? The results revealed that students who worked with Facebook and Skype showed a significant improvement in their writing skills when compared to the control group. In light of these findings, it is recommended that online discussions via Facebook, Skype, and other social media sites should be utilized when teaching writing and the other language skills.

  13. Menzerath-Altmann law for distinct word distribution analysis in a large text

    Science.gov (United States)

    Eroglu, Sertac

    2013-06-01

    The empirical law uncovered by Menzerath and formulated by Altmann, known as the Menzerath-Altmann law (henceforth the MA law), reveals the statistical distribution behavior of human language in various organizational levels. Building on previous studies relating organizational regularities in a language, we propose that the distribution of distinct (or different) words in a large text can effectively be described by the MA law. The validity of the proposition is demonstrated by examining two text corpora written in different languages not belonging to the same language family (English and Turkish). The results show not only that distinct word distribution behavior can accurately be predicted by the MA law, but that this result appears to be language-independent. This result is important not only for quantitative linguistic studies, but also may have significance for other naturally occurring organizations that display analogous organizational behavior. We also deliberately demonstrate that the MA law is a special case of the probability function of the generalized gamma distribution.

  14. Word position affects stimulus recognition: evidence for early ERP short-term plastic modulation.

    Science.gov (United States)

    Spironelli, Chiara; Galfano, Giovanni; Umiltà, Carlo; Angrilli, Alessandro

    2011-12-01

    The present study was aimed at investigating the short-term plastic changes that follow word learning at a neurophysiological level. The main hypothesis was that word position (left or right visual field, LVF/RH or RVF/LH) in the initial learning phase would leave a trace that affected, in the subsequent recognition phase, the Recognition Potential (i.e., the first negative component distinguishing words from other stimuli) elicited 220-240 ms after centrally presented stimuli. Forty-eight students were administered, in the learning phase, 125 words for 4s, randomly presented half in the left and half in the right visual field. In the recognition phase, participants were split into two equal groups, one was assigned to the Word task, the other to the Picture task (in which half of the 125 pictures were new, and half matched prior studied words). During the Word task, old RVF/LH words elicited significantly greater negativity in left posterior sites with respect to old LVF/RH words, which in turn showed the same pattern of activation evoked by new words. Therefore, correspondence between stimulus spatial position and hemisphere specialized in automatic word recognition created a robust prime for subsequent recognition. During the Picture task, pictures matching old RVF/LH words showed no differences compared with new pictures, but evoked significantly greater negativity than pictures matching old LVF/RH words. Thus, the priming effect vanished when the task required a switch from visual analysis to stored linguistic information, whereas the lack of correspondence between stimulus position and network specialized in automatic word recognition (i.e., when words were presented to the LVF/RH) revealed the implicit costs for recognition. Results support the view that short-term plastic changes occurring in a linguistic learning task interact with both stimulus position and modality (written word vs. picture representation). Copyright © 2011 Elsevier B.V. All rights

  15. Don’t words come easy? A psychophysical exploration of word superiority

    Directory of Open Access Journals (Sweden)

    Randi Starrfelt

    2013-09-01

    Full Text Available Words are made of letters, and yet sometimes it is easier to identify a word than a single letter. This word superiority effect (WSE) has been observed when written stimuli are presented very briefly or degraded by visual noise. We compare performance with letters and words in three experiments, to explore the extents and limits of the WSE. Using a carefully controlled list of three letter words, we show that a word superiority effect can be revealed in vocal reaction times even to undegraded stimuli. With a novel combination of psychophysics and mathematical modelling, we further show that the typical WSE is specifically reflected in perceptual processing speed: single words are simply processed faster than single letters. Intriguingly, when multiple stimuli are presented simultaneously, letters are perceived more easily than words, and this is reflected both in perceptual processing speed and visual short term memory capacity. So, even if single words come easy, there is a limit to the word superiority effect.

  16. A 38 million words Dutch text corpus and its users | Kruyt | Lexikos

    African Journals Online (AJOL)

    In August 1996, the 38 Million Words Corpus was available for consultation by the international research community. The present paper reports on the characteristics of this corpus (design, text classification, linguistic annotation) and on its use, both in dictionary projects and in linguistic research. In spite of limitations with ...

  17. Infants' long-term memory for the sound patterns of words and voices.

    Science.gov (United States)

    Houston, Derek M; Jusczyk, Peter W

    2003-12-01

    Infants' long-term memory for the phonological patterns of words versus the indexical properties of talkers' voices was examined in 3 experiments using the Headturn Preference Procedure (D. G. Kemler Nelson et al., 1995). Infants were familiarized with repetitions of 2 words and tested on the next day for their orientation times to 4 passages--2 of which included the familiarized words. At 7.5 months of age, infants oriented longer to passages containing familiarized words when these were produced by the original talker. At 7.5 and 10.5 months of age, infants did not recognize words in passages produced by a novel female talker. In contrast, 7.5-month-olds demonstrated word recognition in both talker conditions when presented with passages produced by both the original and the novel talker. The findings suggest that talker-specific information can prime infants' memory for words and facilitate word recognition across talkers. ((c) 2003 APA, all rights reserved)

  18. 31 CFR 306.2 - Definitions of words and terms as used in these regulations.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 2 2010-07-01 2010-07-01 false Definitions of words and terms as used in these regulations. 306.2 Section 306.2 Money and Finance: Treasury Regulations Relating to... GENERAL REGULATIONS GOVERNING U.S. SECURITIES General Information § 306.2 Definitions of words and terms...

  19. Word list recall in youngsters and older adults

    Directory of Open Access Journals (Sweden)

    Sogol Gerami

    2011-01-01

    Full Text Available A word-list recall is an experiment that examines the effect of age on the change in memory. The ability to understand or use language is more or less dependent on the memory capacity. Any person may know what s/he wants to say but may not be able to say it if the memory does not help. We use some form of memory in all aspects of language processing. Whatever we have in our mind is stored whether for seconds, hours, or years. By short-term memory, a person can remember different things for a period of seconds or minutes only. By rehearsal, the duration and the quantity of storage will increase. Therefore, rehearsal transforms the short-term memory into the long-term memory. This experiment, which examines the number of words recalled by different age groups after presenting a word list, reveals that the younger a person is, the more words he or she recalls. The experiment also reveals that semantically related words have a greater chance to be remembered when they are compared with unrelated words.

  20. Word List Recall in Youngsters and Older Adults

    Directory of Open Access Journals (Sweden)

    Sogol Gerami

    2011-01-01

    Full Text Available A word-list recall is an experiment that examines the effect of age on the change in memory. The ability to understand or use language is more or less dependent on the memory capacity. Any person may know what s/he wants to say but may not be able to say it if the memory does not help. We use some form of memory in all aspects of language processing. Whatever we have in our mind is stored whether for seconds, hours, or years. By short-term memory, a person can remember different things for a period of seconds or minutes only. By rehearsal, the duration and the quantity of storage will increase. Therefore, rehearsal transforms the short-term memory into the long-term memory. This experiment, which examines the number of words recalled by different age groups after presenting a word list, reveals that the younger a person is, the more words he or she recalls. The experiment also reveals that semantically related words have a greater chance to be remembered when they are compared with unrelated words.

  1. Examining the Effect of Interference on Short-term Memory Recall of Arabic Abstract and Concrete Words Using Free, Cued, and Serial Recall Paradigms

    Directory of Open Access Journals (Sweden)

    Ahmed Mohammed Saleh Alduais

    2015-12-01

    Full Text Available Purpose: To see if there is a correlation between interference and short-term memory recall and to examine interference as a factor affecting memory recalling of Arabic and abstract words through free, cued, and serial recall tasks. Method: Four groups of undergraduates in King Saud University, Saudi Arabia participated in this study. The first group consisted of 9 undergraduates who were trained to perform three types of recall for 20 Arabic abstract and concrete words. The second, third and fourth groups consisted of 27 undergraduates where each group was trained only to perform one recall type: free recall, cued recall and serial recall respectively. Interference (short-term memory interruption) was the independent variable and the number of recalled abstract and concrete words was the dependent variable. The materials used in this study were: abstract and concrete words classification form based on four factors was distributed to the participants (concreteness, imageability, meaningfulness, and age of acquisition), three oral recall forms, three written recall forms, and observation sheets for each type of recall. Also, three methods were used: auditory, visual, and written methods. Results: Findings indicated that interference effect on short-term memory recall of Arabic abstract and concrete words was not significant especially in the case of free and serial recall paradigms. The difference between the total number of recalled Arabic abstract and concrete words was also very slight. On the other hand, we came to the conclusion that Pearson’s correlation between interference at these memory recall paradigms (M: 1.66, SD= .47) and the short-term memory recall (M: 1.75, SD= .43) supported the research hypothesis that those participants with oral interruptions tended to recall slightly fewer Arabic abstract and concrete words, whereas those participants with no oral interruptions would tend to recall slightly more Arabic abstract and concrete

  2. 31 CFR 339.1 - Definitions of words and terms as used in this circular.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 2 2010-07-01 2010-07-01 false Definitions of words and terms as used in this circular. 339.1 Section 339.1 Money and Finance: Treasury Regulations Relating to Money... OFFERING OF UNITED STATES SAVINGS BONDS, SERIES H § 339.1 Definitions of words and terms as used in this...

  3. Strong and long: effects of word length on phonological binding in verbal short-term memory.

    Science.gov (United States)

    Jefferies, Elizabeth; Frankish, Clive; Noble, Katie

    2011-02-01

    This study examined the effects of item length on the contribution of linguistic knowledge to immediate serial recall (ISR). Long words are typically recalled more poorly than short words, reflecting the greater demands that they place on phonological encoding, rehearsal, and production. However, reverse word length effects--that is, better recall of long than short words--can also occur in situations in which phonological maintenance is difficult, suggesting that long words derive greater support from long-term lexical knowledge. In this study, long and short words and nonwords (containing one vs. three syllables) were presented for immediate serial recall in (a) pure lists and (b) unpredictable mixed lists of words and nonwords. The mixed-list paradigm is known to disrupt the phonological stability of words, encouraging their phonemes to recombine with the elements of other list items. In this situation, standard length effects were seen for nonwords, while length effects for words were absent or reversed. A detailed error analysis revealed that long words were more robust to the mixed-list manipulation than short words: Their phonemes were less likely to be omitted and to recombine with phonemes from other list items. These findings support an interactive view of short-term memory, in which long words derive greater benefits from lexical knowledge than short words-especially when their phonological integrity is challenged by the inclusion of nonwords in mixed lists.

  4. Language Identification of Kannada, Hindi and English Text Words Through Visual Discriminating Features

    Directory of Open Access Journals (Sweden)

    M.C. Padma

    2008-06-01

    Full Text Available In a multilingual country like India, a document may contain text words in more than one language. For a multilingual environment, a multilingual Optical Character Recognition (OCR) system is needed to read the multilingual documents. So, it is necessary to identify different language regions of the document before feeding the document to the OCRs of individual languages. The objective of this paper is to propose a procedure based on visual clues to identify Kannada, Hindi and English text portions of the Indian multilingual document.

  5. Developmental, Component-Based Model of Reading Fluency: An Investigation of Predictors of Word-Reading Fluency, Text-Reading Fluency, and Reading Comprehension.

    Science.gov (United States)

    Kim, Young-Suk Grace

    2015-01-01

    The primary goal was to expand our understanding of text reading fluency (efficiency or automaticity)-how its relation to other constructs (e.g., word reading fluency and reading comprehension) changes over time and how it is different from word reading fluency and reading comprehension. We examined (1) developmentally changing relations among word reading fluency, listening comprehension, text reading fluency, and reading comprehension; (2) the relation of reading comprehension to text reading fluency; (3) unique emergent literacy predictors (i.e., phonological awareness, orthographic awareness, morphological awareness, letter name knowledge, vocabulary) of text reading fluency vs. word reading fluency; and (4) unique language and cognitive predictors (e.g., vocabulary, grammatical knowledge, theory of mind) of text reading fluency vs. reading comprehension. These questions were addressed using longitudinal data (two timepoints; Mean age = 5;24 & 6;08) from Korean-speaking children (N = 143). Results showed that listening comprehension was related to text reading fluency at time 2, but not at time 1. At both times text reading fluency was related to reading comprehension, and reading comprehension was related to text reading fluency over and above word reading fluency and listening comprehension. Orthographic awareness was related to text reading fluency over and above other emergent literacy skills and word reading fluency. Vocabulary and grammatical knowledge were independently related to text reading fluency and reading comprehension whereas theory of mind was related to reading comprehension, but not text reading fluency. These results reveal the developmental nature of relations and mechanism of text reading fluency in reading development.

  6. Developmental, Component-Based Model of Reading Fluency: An Investigation of Predictors of Word-Reading Fluency, Text-Reading Fluency, and Reading Comprehension

    OpenAIRE

    Kim, Young-Suk Grace

    2015-01-01

    The primary goal was to expand our understanding of text reading fluency (efficiency or automaticity)—how its relation to other constructs (e.g., word reading fluency and reading comprehension) changes over time and how it is different from word reading fluency and reading comprehension. We examined (1) developmentally changing relations among word reading fluency, listening comprehension, text reading fluency, and reading comprehension; (2) the relation of reading comprehension to text readi...

  7. Text Detection in Natural Scene Images by Stroke Gabor Words.

    Science.gov (United States)

    Yi, Chucai; Tian, Yingli

    2011-01-01

    In this paper, we propose a novel algorithm, based on stroke components and descriptive Gabor filters, to detect text regions in natural scene images. Text characters and strings are constructed by stroke components as basic units. Gabor filters are used to describe and analyze the stroke components in text characters or strings. We define a suitability measurement to analyze the confidence of Gabor filters in describing stroke component and the suitability of Gabor filters on an image window. From the training set, we compute a set of Gabor filters that can describe principle stroke components of text by their parameters. Then a K-means algorithm is applied to cluster the descriptive Gabor filters. The clustering centers are defined as Stroke Gabor Words (SGWs) to provide a universal description of stroke components. By suitability evaluation on positive and negative training samples respectively, each SGW generates a pair of characteristic distributions of suitability measurements. On a testing natural scene image, heuristic layout analysis is applied first to extract candidate image windows. Then we compute the principle SGWs for each image window to describe its principle stroke components. Characteristic distributions generated by principle SGWs are used to classify text or nontext windows. Experimental results on benchmark datasets demonstrate that our algorithm can handle complex backgrounds and variant text patterns (font, color, scale, etc.).

  8. Text in social networking Web sites: A word frequency analysis of Live Spaces

    OpenAIRE

    Thelwall, Mike

    2008-01-01

    Social networking sites are owned by a wide section of society and seem to dominate Web usage. Despite much research into this phenomenon, little systematic data is available. This article partially fills this gap with a pilot text analysis of one social networking site, Live Spaces. The text in 3,071 English language Live Spaces sites was monitored daily for six months and word frequency statistics calculated and compared with those from the British National Corpus. The results confirmed the...

  9. Short-Term Memory for Pictures and Words by Mentally Retarded and Nonretarded Persons.

    Science.gov (United States)

    Ellis, Norman R.; Wooldridge, Peter W.

    1985-01-01

    Twelve mentally retarded and 12 nonretarded adults were compared in a Brown-Peterson short-term memory task for the retention of words and pictures over intervals up to 30 seconds. The retarded subjects forgot more rapidly over the initial 10 seconds. They also retained pictures better than they did words. (Author/DB)

  10. Dependence of exponents on text length versus finite-size scaling for word-frequency distributions

    Science.gov (United States)

    Corral, Álvaro; Font-Clos, Francesc

    2017-08-01

    Some authors have recently argued that a finite-size scaling law for the text-length dependence of word-frequency distributions cannot be conceptually valid. Here we give solid quantitative evidence for the validity of this scaling law, using both careful statistical tests and analytical arguments based on the generalized central-limit theorem applied to the moments of the distribution (and obtaining a novel derivation of Heaps' law as a by-product). We also find that the picture of word-frequency distributions with power-law exponents that decrease with text length [X. Yan and P. Minnhagen, Physica A 444, 828 (2016), 10.1016/j.physa.2015.10.082] does not stand with rigorous statistical analysis. Instead, we show that the distributions are perfectly described by power-law tails with stable exponents, whose values are close to 2, in agreement with the classical Zipf's law. Some misconceptions about scaling are also clarified.

  11. The role of short-term memory impairment in nonword repetition, real word repetition, and nonword decoding: A case study.

    Science.gov (United States)

    Peter, Beate

    2018-01-01

    In a companion study, adults with dyslexia and adults with a probable history of childhood apraxia of speech showed evidence of difficulty with processing sequential information during nonword repetition, multisyllabic real word repetition and nonword decoding. Results suggested that some errors arose in visual encoding during nonword reading, all levels of processing but especially short-term memory storage/retrieval during nonword repetition, and motor planning and programming during complex real word repetition. To further investigate the role of short-term memory, a participant with short-term memory impairment (MI) was recruited. MI was confirmed with poor performance during a sentence repetition and three nonword repetition tasks, all of which have a high short-term memory load, whereas typical performance was observed during tests of reading, spelling, and static verbal knowledge, all with low short-term memory loads. Experimental results show error-free performance during multisyllabic real word repetition but high counts of sequence errors, especially migrations and assimilations, during nonword repetition, supporting short-term memory as a locus of sequential processing deficit during nonword repetition. Results are also consistent with the hypothesis that during complex real word repetition, short-term memory is bypassed as the word is recognized and retrieved from long-term memory prior to producing the word.

  12. Culture-Bound Words of the Danube Basin Countries: Translation into English

    Directory of Open Access Journals (Sweden)

    Chetverikova Olena

    2015-08-01

    Full Text Available Any course in linguistic country study or popular text translation is impossible without adequate understanding and presentation of culture-bound elements, which present one of the most difficult topics to deal with, especially in multicultural countries. Our investigation aims to show the problems, which appear when we deal with equivalent-lacking words related to culture. Sometimes equivalent-lacking words are associated with culture-bound words, the Ukrainian equivalent for them is “реалії” (derived from Latin realis, pl. realia. However, the term “culture-bound word” is of narrower meaning than the term “equivalent-lacking word”. A culture-bound word names an object peculiar to this or that ethnic culture. Equivalent-lacking words include, along with culture-bound words, neologisms, i.e. newly coined forms, dialect words, slang, taboo-words, foreign (third language terms, proper names, misspellings, archaisms. Comparison of languages and cultures reveals the various types of culture-bound words. Reasons for using them can be extralinguistic, lexical or stylistic. When translating culture-bound words a translator should be aware of the receptor’s potential problems, take into account his background knowledge and choose the best means of translation.

  13. A Novel Text Clustering Approach Using Deep-Learning Vocabulary Network

    Directory of Open Access Journals (Sweden)

    Junkai Yi

    2017-01-01

    Full Text Available Text clustering is an effective approach to collect and organize text documents into meaningful groups for mining valuable information on the Internet. However, there exist some issues to tackle such as feature extraction and data dimension reduction. To overcome these problems, we present a novel approach named deep-learning vocabulary network. The vocabulary network is constructed based on related-word set, which contains the “cooccurrence” relations of words or terms. We replace term frequency in feature vectors with the “importance” of words in terms of vocabulary network and PageRank, which can generate more precise feature vectors to represent the meaning of text clustering. Furthermore, sparse-group deep belief network is proposed to reduce the dimensionality of feature vectors, and we introduce coverage rate for similarity measure in Single-Pass clustering. To verify the effectiveness of our work, we compare the approach to the representative algorithms, and experimental results show that feature vectors in terms of deep-learning vocabulary network have better clustering performance.

  14. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading text and to seek if the text is useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to find the description of vocabulary in the texts. It was found that the 20 texts containing 7,945 words are dominated by low frequency words which account for 16.97% of the words in the texts. The high frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from GSL (General Service List of English Words) (West, 1953). The proportion of the first 1,000 words of GSL only accounts for 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts which are made of a series of short-unrelated texts. This kind of text is subject to the accumulation of low frequency words especially those of content words and a limited number of words from GSL. It could also defeat the development of students' reading skills and vocabulary enrichment.
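    The coverage figures quoted in this abstract are proportions of running words (tokens) that fall inside a given word list. The following is only a minimal sketch of that kind of calculation, not the programs the study used; the sample sentence, the `high_freq` set, and the function names are hypothetical stand-ins for a real text and a real GSL band.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(tokens, word_list):
    """Return the share of running words (tokens) that appear in word_list."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in word_list)
    return hits / len(tokens)


# Hypothetical sample text and word list, for illustration only.
sample = "The cat sat on the mat while the heron watched"
high_freq = {"the", "cat", "sat", "on", "while"}

tokens = tokenize(sample)
share = coverage(tokens, high_freq)
print(f"{share:.1%} of running words are in the list")
```

    Counting tokens rather than distinct word types matches the abstract's phrase "running words"; a type-based figure would need `set(tokens)` instead.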

  15. Periodic words connected with the Fibonacci words

    Directory of Open Access Journals (Sweden)

    G. M. Barabash

    2016-06-01

    Full Text Available In this paper we introduce two families of periodic words (FLP-words of type 1 and FLP-words of type 2) that are connected with the Fibonacci words and investigate their properties.
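    For readers unfamiliar with the Fibonacci words this abstract refers to, the sketch below builds the finite Fibonacci words under one common convention (S_0 = "0", S_1 = "01", S_n = S_{n-1} S_{n-2}). That convention is an assumption here; the paper may use a different alphabet or indexing, and the FLP-words themselves are not reproduced because the abstract does not define them.

```python
def fibonacci_word(n):
    """Return the n-th finite Fibonacci word with S_0 = "0" and S_1 = "01".

    Each later word is the concatenation of the two preceding ones:
    S_n = S_{n-1} + S_{n-2}.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    previous, current = "0", "01"
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, current + previous
    return current


# The word lengths follow the Fibonacci numbers 1, 2, 3, 5, 8, ...
words = [fibonacci_word(i) for i in range(5)]
lengths = [len(w) for w in words]
```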

  16. Text Classification and Distributional features techniques in Datamining and Warehousing

    OpenAIRE

    Bethu, Srikanth; Babu, G Charless; Vinoda, J; Priyadarshini, E; Rao, M Raghavendra

    2013-01-01

    Text Categorization is traditionally done by using the term frequency and inverse document frequency. This type of method is not very good because some words which are not so important may appear in the document. The term frequency of unimportant words may increase and the document may be classified in the wrong category. To reduce the error of classifying documents in the wrong category, the Distributional features are introduced. In the Distributional Features, the Distribution of the words in ...
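    The term frequency and inverse document frequency baseline mentioned in this abstract can be sketched as follows. This is one common textbook variant (relative term frequency times log(N / document frequency)), assumed here for illustration; the abstract does not say which weighting the authors used, and the toy documents and the name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    tf  = count of the term in the document / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["word", "text", "word"],
    ["text", "cloud"],
    ["text", "memory"],
]
weights = tf_idf(docs)
```

    In this toy corpus "text" occurs in every document, so its weight is zero everywhere, while "word" keeps a positive weight in the first document. The weighting uses only counts, not where in a document a word occurs, which is the kind of information the distributional features described in the abstract are meant to add.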

  17. Word Reading Efficiency, Text Reading Fluency, and Reading Comprehension among Chinese Learners of English

    Science.gov (United States)

    Jiang, Xiangying; Sawaki, Yasuyo; Sabatini, John

    2012-01-01

    This study examined the relationship among word reading efficiency, text reading fluency, and reading comprehension for adult English as a Foreign Language (EFL) learners. Data from 185 adult Chinese EFL learners preparing to take the Test-of-English-as-a-Foreign-Language[TM] (TOEFL[R]) were analyzed in this study. The participants completed a…

  18. Social Media Text Classification by Enhancing Well-Formed Text Trained Model

    Directory of Open Access Journals (Sweden)

    Phat Jotikabukkana

    2016-09-01

    Full Text Available Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique of social media text classification by semi-supervised learning utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well-categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to their category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency weighting (TF-IDF) and Word Article Matrix (WAM) are used as main methods. A modification of WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with more than 98.5% of Precision, Recall, and F-measure after updating the model three times.

  19. A spatially-supported forced-choice recognition test reveals children’s long-term memory for newly learned word forms

    Directory of Open Access Journals (Sweden)

    Katherine R. Gordon

    2014-03-01

    Full Text Available Children’s memories for the link between a newly trained word and its referent have been the focus of extensive past research. However, memory for the word form itself is rarely assessed among preschool-age children. When it is, children are typically asked to verbally recall the forms, and they generally perform at floor on such tests. To better measure children’s memory for word forms, we aimed to design a more sensitive test that required recognition rather than recall, provided spatial cues to off-set the phonological memory demands of the test, and allowed pointing rather than verbal responses. We taught 12 novel word-referent pairs via ostensive naming to sixteen 4-to-6-year-olds and measured their memory for the word forms after a week-long retention interval using the new spatially-supported form recognition test. We also measured their memory for the word-referent links and the generalization of the links to untrained referents with commonly used recognition tests. Children demonstrated memory for word forms at above chance levels; however, their memory for forms was poorer than their memory for trained or generalized word-referent links. When in error, children were no more likely to select a foil that was a close neighbor to the target form than a maximally different foil. Additionally, they more often selected correct forms that were among the first six than the last six to be trained. Overall, these findings suggest that children are able to remember word forms after a limited number of ostensive exposures and a long-term delay. However, word forms remain more difficult to learn than word-referent links and there is an upper limit on the number of forms that can be learned within a given period of time.

  20. Uro-words making history: ureter and urethra.

    Science.gov (United States)

    Marx, Franz Josef; Karenberg, Axel

    2010-06-15

    We comprehensively review the history of the terms "ureter" and "urethra" from 700 BC to the present. Using a case study approach, ancient medical texts were analyzed to clarify the etymology and use of both terms. In addition, selected anatomy textbooks from the 15th to 17th centuries were searched to identify and compare descriptions, illustrations, and various expressions used by contemporary authors to designate the upper and lower parts of the urinary tract. The Ancient Greek words "ureter" and "urethra" appear early in Hippocratic and Aristotelian writings. However, both terms designated what we today call the urethra. It was only with increasing anatomical knowledge in Greek medical texts after the 1st century AD that definitions of these words evolved similar to those we employ today. Numerous synonyms were used which served as a basis for translation into Arabic and later Latin during the transfer of ancient knowledge to the cultures of the medieval period. When Greek original texts and their Arabic-Latin version were compared during the Renaissance, this led to terminological confusion which could only be gradually overcome. Around the year 1600, the use of the latinized terms "ureter" and "urethra" became generally accepted. The dissemination of these terms in modern national languages and the emergence of clinical derivatives complete this historical development. The history of the terms "ureter" and "urethra" is exemplary of the difficulties with which the development of a precise urologic terminology had to struggle. The story behind the words also clarifies why even today we still have imprecise or misleading terms.

  1. Word Domain Disambiguation via Word Sense Disambiguation

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.

    2006-06-04

    Word subject domains have been widely used to improve the performance of word sense disambiguation algorithms. However, comparatively little effort has been devoted so far to the disambiguation of word subject domains. The few existing approaches have focused on the development of algorithms specific to word domain disambiguation. In this paper we explore an alternative approach where word domain disambiguation is achieved via word sense disambiguation. Our study shows that this approach yields very strong results, suggesting that word domain disambiguation can be addressed in terms of word sense disambiguation with no need for special purpose algorithms.

  2. Influence of the Number of Predicted Words on Text Input Speed in Participants With Cervical Spinal Cord Injury.

    Science.gov (United States)

    Pouplin, Samuel; Roche, Nicolas; Vaugier, Isabelle; Jacob, Antoine; Figere, Marjorie; Pottier, Sandra; Antoine, Jean-Yves; Bensmail, Djamel

    2016-02-01

    To determine whether the number of words displayed in the word prediction software (WPS) list affects text input speed (TIS) in people with cervical spinal cord injury (SCI), and whether any influence is dependent on the level of the lesion. A cross-sectional trial. A rehabilitation center. Persons with cervical SCI (N=45). Lesion level was high (C4 and C5, American Spinal Injury Association [ASIA] grade A or B) for 15 participants (high-lesion group) and low (between C6 and C8, ASIA grade A or B) for 30 participants (low-lesion group). TIS was evaluated during four 10-minute copying tasks: (1) without WPS (Without); (2) with a display of 3 predicted words (3Words); (3) with a display of 6 predicted words (6Words); and (4) with a display of 8 predicted words (8Words). During the 4 copying tasks, TIS was measured objectively (characters per minute, number of errors) and subjectively through subject report (fatigue, perception of speed, cognitive load, satisfaction). For participants with low-cervical SCI, TIS without WPS was faster than with WPS, regardless of the number of words displayed (Pwords displayed in a word prediction list on TIS; however, perception of TIS differed according to lesion level. For persons with low-cervical SCI, a small number of words should be displayed, or WPS should not be used at all. For persons with high-cervical SCI, a larger number of words displayed increases the comfort of use of WPS. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  3. Artful terms: A study on aesthetic word usage for visual art versus film and music

    Science.gov (United States)

    Augustin, M Dorothee; Carbon, Claus-Christian; Wagemans, Johan

    2012-01-01

    Despite the importance of the arts in human life, psychologists still know relatively little about what characterises their experience for the recipient. The current research approaches this problem by studying people's word usage in aesthetics, with a focus on three important art forms: visual art, film, and music. The starting point was a list of 77 words known to be useful to describe aesthetic impressions of visual art (Augustin et al 2012, Acta Psychologica 139 187–201). Focusing on ratings of likelihood of use, we examined to what extent word usage in aesthetic descriptions of visual art can be generalised to film and music. The results support the claim of an interplay of generality and specificity in aesthetic word usage. Terms with equal likelihood of use for all art forms included beautiful, wonderful, and terms denoting originality. Importantly, emotion-related words received higher ratings for film and music than for visual art. To our knowledge this is direct evidence that aesthetic experiences of visual art may be less affectively loaded than, for example, experiences of music. The results render important information about aesthetic word usage in the realm of the arts and may serve as a starting point to develop tailored measurement instruments for different art forms. PMID:23145287

  4. Artful terms: A study on aesthetic word usage for visual art versus film and music.

    Science.gov (United States)

    Augustin, M Dorothee; Carbon, Claus-Christian; Wagemans, Johan

    2012-01-01

    Despite the importance of the arts in human life, psychologists still know relatively little about what characterises their experience for the recipient. The current research approaches this problem by studying people's word usage in aesthetics, with a focus on three important art forms: visual art, film, and music. The starting point was a list of 77 words known to be useful to describe aesthetic impressions of visual art (Augustin et al 2012, Acta Psychologica 139 187–201). Focusing on ratings of likelihood of use, we examined to what extent word usage in aesthetic descriptions of visual art can be generalised to film and music. The results support the claim of an interplay of generality and specificity in aesthetic word usage. Terms with equal likelihood of use for all art forms included beautiful, wonderful, and terms denoting originality. Importantly, emotion-related words received higher ratings for film and music than for visual art. To our knowledge this is direct evidence that aesthetic experiences of visual art may be less affectively loaded than, for example, experiences of music. The results render important information about aesthetic word usage in the realm of the arts and may serve as a starting point to develop tailored measurement instruments for different art forms.

  5. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  6. On avoided words, absent words, and their application to biological sequence analysis.

    Science.gov (United States)

    Almirantis, Yannis; Charalampopoulos, Panagiotis; Gao, Jia; Iliopoulos, Costas S; Mohamed, Manal; Pissis, Solon P; Polychronopoulos, Dimitris

    2017-01-01

    The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the deviation of w, denoted by [Formula: see text], effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length [Formula: see text] is a [Formula: see text]-avoided word in x if [Formula: see text], for a given threshold [Formula: see text]. Notice that such a word may be completely absent from x. Hence, computing all such words naïvely can be a very time-consuming procedure, in particular for large k. In this article, we propose an [Formula: see text]-time and [Formula: see text]-space algorithm to compute all [Formula: see text]-avoided words of length k in a given sequence of length n over a fixed-sized alphabet. We also present a time-optimal [Formula: see text]-time algorithm to compute all [Formula: see text]-avoided words (of any length) in a sequence of length n over an integer alphabet of size [Formula: see text]. In addition, we provide a tight asymptotic upper bound for the number of [Formula: see text]-avoided words over an integer alphabet and the expected length of the longest one. We make available an implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency and applicability of our implementation in biological sequence analysis. The systematic search for avoided words is particularly useful for biological sequence analysis. We present a linear-time and linear-space algorithm for the computation of avoided words of length k in a given sequence x. We suggest a modification to this algorithm so that it computes all avoided words of x, irrespective of their length, within the same time complexity. We also present combinatorial results with regards to avoided words and absent words.
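    The formulas in the record above were lost in extraction, so the following is only a naive sketch of the idea under stated assumptions, not the authors' linear-time algorithm. It assumes the commonly used definition in which the expected frequency of w is f(prefix) * f(suffix) / f(infix), the deviation is (f(w) - E(w)) / max(sqrt(E(w)), 1), and a word counts as avoided when its deviation is at most a negative threshold rho. The function names are hypothetical.

```python
from itertools import product
from math import sqrt


def count_occurrences(x: str, w: str) -> int:
    """Count possibly overlapping occurrences of w in x."""
    return sum(1 for i in range(len(x) - len(w) + 1) if x.startswith(w, i))


def deviation(x: str, w: str) -> float:
    """Deviation of the observed frequency of w from its expected frequency.

    Assumed definition (not recoverable from the record itself):
    E(w) = f(longest proper prefix) * f(longest proper suffix) / f(infix).
    """
    if len(w) < 3:
        raise ValueError("this sketch needs words of length >= 3")
    f_w = count_occurrences(x, w)
    f_prefix = count_occurrences(x, w[:-1])
    f_suffix = count_occurrences(x, w[1:])
    f_infix = count_occurrences(x, w[1:-1])
    expected = f_prefix * f_suffix / f_infix if f_infix else 0.0
    return (f_w - expected) / max(sqrt(expected), 1.0)


def avoided_words_naive(x: str, k: int, rho: float) -> list[str]:
    """Enumerate every word of length k over the alphabet of x (exponential in k)."""
    alphabet = sorted(set(x))
    words = ("".join(letters) for letters in product(alphabet, repeat=k))
    return [w for w in words if deviation(x, w) <= rho]


if __name__ == "__main__":
    sequence = "ABABBCBC"
    print(deviation(sequence, "ABC"))
    print(avoided_words_naive(sequence, 3, -1.0))
```

    The enumeration visits all |alphabet|^k candidate words, including words absent from x, which illustrates why the naive approach becomes impractical for large k.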

  7. The Comparison between Contextual Guessing Strategies vs. Memorizing a List of Isolated Words in Vocabulary Learning Regarding Long Term Memory

    Directory of Open Access Journals (Sweden)

    Leyla Vakili SAMIYAN

    2014-03-01

    Full Text Available Guessing the meaning of unknown vocabulary within a text is a way of learning new words known as textual vocabulary acquisition. The main purpose of this study is to investigate the effectiveness of a textual guessing strategy on vocabulary learning at the intermediate level. The textual guessing strategy is to guess the meaning of vocabulary with the help of surrounding words or sentences in the co-text without any translation. This paper reports the findings of two quantitative studies conducted on English language learners with the Intermediate 2 level of proficiency in Kavosh foreign language institute, Mashhad, Iran. Twenty male and female attendants were selected and assigned to 'context' and 'non-context' groups. The context group received an instruction to infer the meaning of new words while the non-context participants were treated as learning new vocabulary individually (autonomously). The result of the independent sample t-test at the post-test stage revealed that the probability value of the t-test with an equality of variances assumption is lower than 0.05 (0.04700). This result indicates that there is a meaningful difference between the experimental group and the control group in their amount of learning. The results indicated that the textual guessing strategy had more effect on their long term memory. It was also revealed that the words learned through context are used more frequently than those learned in isolation in the speaking repertoire of the participants.

  8. Phraseosemantic peculiarities of idioms with the word «silki» (snares) (a case study of Russian classics and modern literature texts)

    Directory of Open Access Journals (Sweden)

    Andrianova D.A.

    2017-03-01

    Full Text Available This article explores semantic and stylistic meaning changes of idioms with the word “silki” (snares) during the XVIII–XXI centuries on the basis of Russian classics, modern literature texts and publicistic writing. It is shown that the word “silki” (snares) was used as a biblical expression in ecclesiastic and some fiction texts; this explains its strong negative connotation, which is out of use in up-to-date contexts.

  9. Set of Frequent Word Item sets as Feature Representation for Text with Indonesian Slang

    Science.gov (United States)

    Sa'adillah Maylawati, Dian; Putri Saptawati, G. A.

    2017-01-01

    Indonesian slang is commonly used in social media. Due to its unstructured syntax, it is difficult to extract its features based on Indonesian grammar for text mining. To do so, we propose Set of Frequent Word Item sets (SFWI) as a text representation that is considered a match for Indonesian slang. Besides, SFWI is able to keep the meaning of Indonesian slang with regard to the order of appearance in the sentence. We use the FP-Growth algorithm, with a sentence separation function added to the algorithm, to extract the features of SFWI. The experiments are done with text data from social media such as Facebook, Twitter, and personal websites. The results of the experiments show that Indonesian slang was more correctly interpreted based on SFWI.
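    As a rough illustration of what a frequent word itemset is, the sketch below treats each sentence as one transaction and counts word combinations by brute force. It is not FP-Growth and it ignores word order, which the SFWI representation is said to preserve; the function name, the `max_size` parameter and the example sentences are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_word_itemsets(sentences, min_support, max_size=2):
    """Return word itemsets that occur in at least `min_support` sentences.

    Brute-force counting: every combination of up to `max_size` distinct
    words per sentence is enumerated, so this only suits small inputs.
    """
    counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for size in range(1, max_size + 1):
            for itemset in combinations(words, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Illustrative sentences only, not data from the study.
    posts = ["gue suka kopi", "gue suka teh", "lo suka kopi"]
    for itemset, support in sorted(frequent_word_itemsets(posts, 2).items()):
        print(itemset, support)
```

    FP-Growth reaches the same frequent itemsets without enumerating every combination, by compressing the transactions into a prefix tree first.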

  10. Word-length effect in verbal short-term memory in individuals with Down's syndrome.

    Science.gov (United States)

    Kanno, K; Ikeda, Y

    2002-11-01

    Many studies have indicated that individuals with Down's syndrome (DS) show a specific deficit in short-term memory for verbal information. The aim of the present study was to investigate the influence of the length of words on verbal short-term memory in individuals with DS. Twenty-eight children with DS and 10 control participants matched for memory span were tested on verbal serial recall and speech rate, which are thought to involve rehearsal and output speed. Although a significant word-length effect was observed in both groups for the recall of a larger number of items with a shorter spoken duration than for those with a longer spoken duration, the number of correct recalls in the group with DS was reduced compared to the control subjects. The results demonstrating poor short-term memory in children with DS were irrelevant to speech rate. In addition, the proportion of repetition-gained errors in serial recall was higher in children with DS than in control subjects. The present findings suggest that poor access to long-term lexical knowledge, rather than overt articulation speed, constrains verbal short-term memory functions in individuals with DS.

  11. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Segmentation of handwritten text into lines, words and characters .... We now discuss here some terms relating to water reservoirs that will be used in feature ..... is found. Next, based on the touching position, reservoir base-area points, ...

  12. Incidental Vocabulary Acquisition: The Effects of Task Type, Word Occurrence and Their Combination

    Science.gov (United States)

    Laufer, Batia; Rozovski-Roitblat, Bella

    2011-01-01

    We investigated how long-term retention of new words was affected by task type, number of word occurrences in the teaching materials and the combination of the two factors. The tasks were: reading a text with occasional Focus on Form when learners used dictionaries (T+F), or reading a text with Focus on Forms, i.e. word focused exercises (T+Fs).…

  13. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  14. The effect of word prediction settings (frequency of use) on text input speed in persons with cervical spinal cord injury: a prospective study.

    Science.gov (United States)

    Pouplin, Samuel; Roche, Nicolas; Antoine, Jean-Yves; Vaugier, Isabelle; Pottier, Sandra; Figere, Marjorie; Bensmail, Djamel

    2017-06-01

    To determine whether activation of the frequency of use and automatic learning parameters of word prediction software has an impact on text input speed. Forty-five participants with cervical spinal cord injury between C4 and C8 ASIA A or B agreed to participate in this study. Participants were separated into two groups: a high lesion group for participants with a lesion level at or above C5 ASIA AIS A or B and a low lesion group for participants with a lesion between C6 and C8 ASIA AIS A or B. A single evaluation session was carried out for each participant. Text input speed was evaluated during three copying tasks: • without word prediction software (WITHOUT condition) • with automatic learning of words and frequency of use deactivated (NOT_ACTIV condition) • with automatic learning of words and frequency of use activated (ACTIV condition) Results: Text input speed was significantly higher in the WITHOUT than the NOT_ACTIV (pword prediction software with the activation of frequency of use and automatic learning increased text input speed in participants with high-level tetraplegia. For participants with low-level tetraplegia, the use of word prediction software with frequency of use and automatic learning activated only decreased the number of errors. Implications in rehabilitation: Access to technology can be difficult for persons with disabilities such as cervical spinal cord injury (SCI). Several methods have been developed to increase text input speed, such as word prediction software. This study shows that a parameter of word prediction software (frequency of use) affected text input speed in persons with cervical SCI and differed according to the level of the lesion. • For persons with a high-level lesion, our results suggest that this parameter must be activated so that text input speed is increased. • For persons in the low lesion group, this parameter must be activated so that the number of errors is decreased. • In all cases, the

  15. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In the Malay language, there are no conjugations or declensions, and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or a verb, depending on its position in the sentence. Although simple root words are used extensively in informal conversations, it is essential to use the precise words in formal speech or written texts. In Malay, to make sentences clear, derivative words are used. Derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in the written language of the educated Malay. Therefore, the composition of Malay words may be complicated. Although there are several types of stemming algorithms available for text processing in English and some other languages, they cannot be used to overcome the difficulties in Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems. It is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The use of the set of rules is aimed at reducing the occurrence of under-stemming errors, while that of the dictionaries is believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, the text data used were actual web pages collected from the World Wide Web to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
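    The interplay of affix rules and dictionaries described above can be sketched roughly as follows. This is a minimal illustration, not the authors' stemmer: the affix lists are a small hand-picked subset, the root dictionary is a hypothetical four-word set, and only a root-word dictionary is modelled (the derivative-word dictionary is omitted).

```python
# Small, illustrative subsets; a real stemmer would need far larger resources.
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ter", "pe")
SUFFIXES = ("kan", "an", "i")
ROOT_WORDS = {"makan", "baca", "main", "ajar"}  # hypothetical root dictionary


def stem(word, roots=ROOT_WORDS):
    """Strip affixes by rule, accepting a result only if it is a known root."""
    word = word.lower()
    # Dictionary check first: a known root is never stripped further,
    # which guards against over-stemming (e.g. "makan" losing "-kan").
    if word in roots:
        return word
    candidates = [word]
    # Affix rules generate candidates, which addresses under-stemming.
    for prefix in PREFIXES:
        if word.startswith(prefix):
            candidates.append(word[len(prefix):])
    for candidate in list(candidates):
        for suffix in SUFFIXES:
            if candidate.endswith(suffix):
                candidates.append(candidate[:-len(suffix)])
    for candidate in candidates:
        if candidate in roots:
            return candidate
    return word  # no rule produced a known root; leave the word unchanged


if __name__ == "__main__":
    for w in ("makanan", "membaca", "bermain", "diajarkan", "makan"):
        print(w, "->", stem(w))
```

    Returning the word unchanged when no candidate is a known root is a deliberate, conservative choice in this sketch: it trades some under-stemming for fewer wrong roots.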

  16. Texts, Transmissions, Receptions. Modern Approaches to Narratives

    NARCIS (Netherlands)

    Lardinois, A.P.M.H.; Levie, S.A.; Hoeken, H.; Lüthy, C.H.

    2015-01-01

    The papers collected in this volume study the function and meaning of narrative texts from a variety of perspectives. The word 'text' is used here in the broadest sense of the term: it denotes literary books, but also oral tales, speeches, newspaper articles and comics. One of the purposes of this

  17. Words with Multiple Meanings in Authentic L2 Texts: An Analysis of "Harry Potter and the Philosopher's Stone"

    Science.gov (United States)

    Ozturk, Meral

    2017-01-01

    Dictionary studies have suggested that nearly half of the English lexicon have multiple meanings. It is not yet clear, however, if second language learners reading English texts will encounter words with multiple meanings to the same degree. This study investigates the use of words with multiple meanings in an authentic English novel. Two samples…

  18. Syllabic Length Effect in Visual Word Recognition

    Directory of Open Access Journals (Sweden)

    Roya Ranjbar Mohammadi

    2014-07-01

    Full Text Available Studies on visual word recognition have resulted in different and sometimes contradictory proposals such as the Multi-Trace Memory Model (MTM), the Dual-Route Cascaded Model (DRC), and the Parallel Distribution Processing Model (PDP). The role of the number of syllables in word recognition was examined by the use of five groups of English words and non-words. The reaction time of the participants to these words was measured using reaction time measuring software. The results indicated that there was a syllabic effect on recognition of both high and low frequency words. The pattern was incremental in terms of syllable number. This pattern prevailed in high and low frequency words and non-words except in one-syllable words. In general, the results are in line with the PDP model, which claims that a single processing mechanism is used in both word and non-word recognition. In other words, the findings suggest that lexical items are mainly processed via a lexical route. A pedagogical implication of the findings would be that reading in English as a foreign language involves analytical processing of the syllables of the words.

  19. Words Do Come Easy (Sometimes)

    DEFF Research Database (Denmark)

    Starrfelt, Randi; Petersen, Anders; Vangkilde, Signe Allerup

    multiple stimuli are presented simultaneously: Are words treated as units or wholes in visual short term memory? Using methods based on a Theory of Visual Attention (TVA), we measured perceptual threshold, visual processing speed and visual short term memory capacity for words and letters, in two simple...... a different pattern: Letters are perceived more easily than words, and this is reflected both in perceptual processing speed and short term memory capacity. So even if single words do come easy, they seem to enjoy no advantage in visual short term memory....

  20. Don't words come easy? A psychophysical exploration of word superiority

    DEFF Research Database (Denmark)

    Starrfelt, Randi; Petersen, Anders; Vangkilde, Signe Allerup

    2013-01-01

    Words are made of letters, and yet sometimes it is easier to identify a word than a single letter. This word superiority effect (WSE) has been observed when written stimuli are presented very briefly or degraded by visual noise. We compare performance with letters and words in three experiments, ...... and visual short term memory capacity. So, even if single words come easy, there is a limit to the word superiority effect....

  1. Combinatorics on words Christoffel words and repetitions in words

    CERN Document Server

    Berstel, Jean; Reutenauer, Christophe; Saliola, Franco V

    2008-01-01

    The two parts of this text are based on two series of lectures delivered by Jean Berstel and Christophe Reutenauer in March 2007 at the Centre de Recherches Mathématiques, Montréal, Canada. Part I represents the first modern and comprehensive exposition of the theory of Christoffel words. Part II presents numerous combinatorial and algorithmic aspects of repetition-free words stemming from the work of Axel Thue, a pioneer in the theory of combinatorics on words. A beginner to the theory of combinatorics on words will be motivated by the numerous examples, and the large variety of exercises, which make the book unique at this level of exposition. The clean and streamlined exposition and the extensive bibliography will also be appreciated. After reading this book, beginners should be ready to read modern research papers in this rapidly growing field and contribute their own research to its development. Experienced readers will be interested in the finitary approach to Sturmian words that Christoffel words offe...

  2. Effects of dynamic text in an AAC app on sight word reading for individuals with autism spectrum disorder.

    Science.gov (United States)

    Caron, Jessica; Light, Janice; Holyfield, Christine; McNaughton, David

    2018-06-01

    The purpose of this study was to investigate the effects of Transition to Literacy (T2L) software features (i.e., dynamic text and speech output upon selection of a graphic symbol) within a grid display in an augmentative and alternative communication (AAC) app, on the sight word reading skills of individuals with autism spectrum disorders (ASD) and complex communication needs. The study implemented a single-subject multiple probe research design across one set of three participants. The same design was utilized with an additional set of two participants. As part of the intervention, the participants were exposed to an AAC app with the T2L features during a highly structured matching task. With only limited exposure to the features, the five participants all demonstrated increased accuracy of identification of 12 targeted sight words. This study provides preliminary evidence that redesigning AAC apps to include the provision of dynamic text combined with speech output, can positively impact the sight-word reading of participants during a structured task. This adaptation in AAC system design could be used to complement literacy instruction and to potentially infuse components of literacy learning into daily communication.

  3. Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death

    Science.gov (United States)

    Petersen, Alexander M.; Tenenbaum, Joel; Havlin, Shlomo; Stanley, H. Eugene

    2012-03-01

    We analyze the dynamic properties of 10^7 words recorded in English, Spanish and Hebrew over the period 1800-2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war show that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis.

  4. Interference Effects on the Recall of Pictures, Printed Words, and Spoken Words.

    Science.gov (United States)

    Burton, John K.; Bruning, Roger H.

    1982-01-01

    Nouns were presented in triads as pictures, printed words, or spoken words and followed by various types of interference. Measures of short- and long-term memory were obtained. In short-term memory, pictorial superiority occurred with acoustic, and visual and acoustic, but not visual interference. Long-term memory showed superior recall for…

  5. Long-term memory traces for familiar spoken words in tonal languages as revealed by the Mismatch Negativity

    Directory of Open Access Journals (Sweden)

    Naiphinich Kotchabhakdi

    2004-11-01

    Full Text Available Mismatch negativity (MMN), a primary response to an acoustic change and an index of sensory memory, was used to investigate the processing of the discrimination between familiar and unfamiliar Consonant-Vowel (CV) speech contrasts. The MMN was elicited by rare familiar words presented among repetitive unfamiliar words. Phonetic and phonological contrasts were identical in all conditions. MMN elicited by the familiar word deviant was larger than that elicited by the unfamiliar word deviant. The presence of syllable contrast did significantly alter the word-elicited MMN in amplitude and scalp voltage field distribution. Thus, our results indicate the existence of word-related MMN enhancement largely independent of the word status of the standard stimulus. This enhancement may reflect the presence of a long-term memory trace for familiar spoken words in tonal languages.

  6. A Typed Text Retrieval Query Language for XML Documents.

    Science.gov (United States)

    Colazzo, Dario; Sartiani, Carlo; Albano, Antonio; Manghi, Paolo; Ghelli, Giorgio; Lini, Luca; Paoli, Michele

    2002-01-01

    Discussion of XML focuses on a description of Tequyla-TX, a typed text retrieval query language for XML documents that can search on both content and structures. Highlights include motivations; numerous examples; word-based and char-based searches; tag-dependent full-text searches; text normalization; query algebra; data models and term language;…

  7. From Word Alignment to Word Senses, via Multilingual Wordnets

    Directory of Open Access Journals (Sweden)

    Dan Tufis

    2006-05-01

    Full Text Available Most of the successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, with the usual motivations stemming from the high computational costs required for dealing with semantics in the case of large volumes of data. With recent advances in corpus linguistics and statistical-based methods in NLP, revealing useful semantic features of linguistic data is becoming cheaper and cheaper and the accuracy of this process is steadily improving. Lately, there seems to be a growing acceptance of the idea that multilingual lexical ontologies might be the key towards aligning different views on the semantic atomic units to be used in characterizing the general meaning of various and multilingual documents. Depending on the granularity at which semantic distinctions are necessary, the accuracy of the basic semantic processing (such as word sense disambiguation) can be very high with relatively low complexity computing. The paper substantiates this statement by presenting a statistical-based system for word alignment and word sense disambiguation in parallel corpora. We describe a word alignment platform which ensures text pre-processing (tokenization, POS-tagging, lemmatization, chunking, sentence and word alignment) as required by an accurate word sense disambiguation.

  8. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new powerful word ranking metric in order to extract keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.
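    For orientation, the standard Tsallis entropy of a probability distribution p is S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. The sketch below computes it for the distribution of a word's occurrences over equal-sized parts of a text. How the authors actually build the distribution and turn the entropy into a ranking score is not stated in the abstract, so the partitioning scheme, the function names and the choice q = 2 are assumptions.

```python
import math


def tsallis_entropy(probabilities, q):
    """Tsallis entropy S_q = (1 - sum(p_i ** q)) / (q - 1); Shannon limit at q = 1."""
    positive = [p for p in probabilities if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in positive)
    return (1.0 - sum(p ** q for p in positive)) / (q - 1.0)


def occurrence_distribution(tokens, word, parts):
    """Share of a word's occurrences falling in each of `parts` text chunks.

    Assumed scheme for illustration: equal-sized consecutive chunks.
    """
    size = math.ceil(len(tokens) / parts)
    counts = [tokens[i:i + size].count(word) for i in range(0, len(tokens), size)]
    total = sum(counts)
    return [c / total for c in counts] if total else []


if __name__ == "__main__":
    tokens = "a b a c a d x e f g h x".split()
    for word in ("a", "x"):
        dist = occurrence_distribution(tokens, word, parts=4)
        print(word, dist, tsallis_entropy(dist, q=2.0))
```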

  9. NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition

    Directory of Open Access Journals (Sweden)

    Hung Hsieh-Chuan

    2006-12-01

    Full Text Available Abstract Background Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. Results To develop our ML
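    Two of the ideas named in this record, numerical normalization and the context window of five, are simple enough to sketch. The code below is an illustration under assumptions, not the NERBio implementation: it is assumed that each run of digits is replaced by the single representative numeral "1", and the function names are hypothetical.

```python
import re


def normalize_numerals(term, representative="1"):
    """Replace every run of digits with one representative numeral.

    Assumption: whole digit runs are collapsed, so "IL2" and "IL12"
    map to the same normalized form.
    """
    return re.sub(r"\d+", representative, term)


def context_window(tokens, index, size=5):
    """Return the current token plus (size // 2) neighbours on each side."""
    half = size // 2
    return tokens[max(0, index - half): index + half + 1]


if __name__ == "__main__":
    print(normalize_numerals("IL2"), normalize_numerals("IL12"))
    sentence = ["the", "IL2", "gene", "is", "expressed", "in", "cells"]
    print(context_window(sentence, 3))
```

    Collapsing the numerals means terms that differ only in their numbers share one feature, which is the data-sparseness reduction the abstract describes.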

  10. Novel word retention in bilingual and monolingual speakers

    Directory of Open Access Journals (Sweden)

    Pui Fong Kan

    2014-09-01

    Full Text Available The goal of this research was to examine word retention in bilinguals and monolinguals. Long-term word retention is an essential part of vocabulary learning. Previous studies have documented that bilinguals outperform monolinguals in terms of retrieving newly-exposed words. Yet, little is known about whether or to what extent bilinguals are different from monolinguals in word retention. Participants were 30 English-speaking monolingual adults and 30 bilingual adults who speak Spanish as a home language and learned English as a second language during childhood. In a previous study (Kan, Sadagopan, Janich, & Andrade, 2014), the participants were exposed to the target novel words in English, Spanish, and Cantonese. In this current study, word retention was measured a week after the fast mapping task. No exposures were given during the one-week interval. Results showed that bilinguals and monolinguals retain a similar number of words. However, participants produced more words in English than in either Spanish or Cantonese. Correlation analyses revealed that language knowledge plays a role in the relationships between fast mapping and word retention. Specifically, within- and across-language relationships between bilinguals’ fast mapping and word retention were found in Spanish and English; by contrast, within-language relationships between monolinguals’ fast mapping and word retention were found in English, and across-language relationships between their fast mapping and word retention performance were found in English and Cantonese. Similarly, bilinguals differed from monolinguals in the relationships among the word retention scores in three languages. Significant correlations were found among bilinguals’ retention scores. However, no such correlations were found among monolinguals’ retention scores. The overall findings suggest that bilinguals’ language experience and language knowledge most likely contribute to how they learn and retain new words.

  11. VisualUrText: A Text Analytics Tool for Unstructured Textual Data

    Science.gov (United States)

    Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.

    2018-05-01

    The amount of unstructured text on the Internet is growing tremendously. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may potentially contain interesting patterns and trends. Text Mining is a well-known technique for discovering interesting patterns and trends, that is, non-trivial knowledge, from massive unstructured text data. Text Mining covers multidisciplinary fields involving information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that is proficient in extracting, processing and analyzing unstructured text data and in visualizing cleaned text data in multiple forms such as Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.
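
    One of the output forms named in this abstract, the Document Term Matrix, can be sketched in a few lines. This is not VisualUrText's implementation, which the abstract does not describe; the whitespace tokenization, lowercasing, sample documents, and the function name document_term_matrix are all assumptions for illustration.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] is the number of
    times vocabulary[j] occurs in docs[i].

    Assumption: tokens are lowercased and split on whitespace only.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        # Counter returns 0 for terms that do not occur in this document.
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical sample documents.
docs = ["text mining finds patterns", "text analysis finds trends in text"]
vocabulary, dtm = document_term_matrix(docs)
print(vocabulary)
for row in dtm:
    print(row)
```

    Each row is one document and each column one term; frequency graphs and word clouds are typically derived from the column sums of such a matrix.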

  12. The word-length effect and disyllabic words.

    Science.gov (United States)

    Lovatt, P; Avons, S E; Masterson, J

    2000-02-01

    Three experiments compared immediate serial recall of disyllabic words that differed on spoken duration. Two sets of long- and short-duration words were selected, in each case maximizing duration differences but matching for frequency, familiarity, phonological similarity, and number of phonemes, and controlling for semantic associations. Serial recall measures were obtained using auditory and visual presentation and spoken and picture-pointing recall. In Experiments 1a and 1b, using the first set of items, long words were better recalled than short words. In Experiments 2a and 2b, using the second set of items, no difference was found between long and short disyllabic words. Experiment 3 confirmed the large advantage for short-duration words in the word set originally selected by Baddeley, Thomson, and Buchanan (1975). These findings suggest that there is no reliable advantage for short-duration disyllables in span tasks, and that previous accounts of a word-length effect in disyllables are based on accidental differences between list items. The failure to find an effect of word duration casts doubt on theories that propose that the capacity of memory span is determined by the duration of list items or the decay rate of phonological information in short-term memory.

  13. How Objective a Neutral Word Is? A Neutrosophic Approach for the Objectivity Degrees of Neutral Words

    Directory of Open Access Journals (Sweden)

    Mihaela Colhon

    2017-11-01

    Full Text Available In the latest studies concerning the sentiment polarity of words, the authors mostly consider the positive and negative constructions, without paying too much attention to the neutral words, which can have, in fact, significant sentiment degrees. More precisely, not all the neutral words have zero positivity or negativity scores, some of them having quite important nonzero scores for these polarities. At this moment, in the literature, a word is considered neutral if its positive and negative scores are equal, which implies two possibilities: (1) zero positive and negative scores; (2) nonzero, but equal positive and negative scores. It is obvious that these cases represent two different categories of neutral words that must be treated separately by a sentiment analysis task. In this paper, we present a comprehensive study of neutral words in English, developed with the aid of SentiWordNet 3.0, the publicly available lexical resource for opinion mining. We designed our study in order to provide an accurate classification of the so-called “neutral words” described in terms of sentiment scores and using measures from neutrosophy theory. The intended scope is to fill the gap concerning the neutrality aspect by giving precise measurements for the words’ objectivity.
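
    The two categories of neutral words distinguished in this abstract can be separated with a simple check. The sketch below relies on the SentiWordNet convention that the objectivity score equals 1 minus the sum of the positive and negative scores; it does not reproduce the paper's neutrosophic measures, and the function names and example scores are hypothetical.

```python
def objectivity(pos: float, neg: float) -> float:
    """SentiWordNet-style objectivity: 1 - (positive + negative)."""
    return 1.0 - (pos + neg)


def neutral_category(pos: float, neg: float) -> str:
    """Classify a (positive, negative) score pair into the two neutral
    cases named in the abstract, or mark it as not neutral."""
    if pos != neg:
        return "not neutral"
    if pos == 0.0:
        return "neutral, zero scores"
    return "neutral, equal nonzero scores"


# Hypothetical score pairs, not taken from SentiWordNet.
for pos, neg in [(0.0, 0.0), (0.25, 0.25), (0.5, 0.125)]:
    print(pos, neg, neutral_category(pos, neg), objectivity(pos, neg))
```

    A pair such as (0.25, 0.25) counts as neutral under the equal-scores definition yet has an objectivity of only 0.5, which is the distinction the abstract argues should not be ignored.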

  14. Word Translation Entropy

    DEFF Research Database (Denmark)

    Schaeffer, Moritz; Dragsted, Barbara; Hvelplund, Kristian Tangsgaard

    This study reports on an investigation into the relationship between the number of translation alternatives for a single word and eye movements on the source text. In addition, the effect of word order differences between source and target text on eye movements on the source text is studied. In p...

  15. Unpacking Direct and Indirect Relationships of Short-Term Memory to Word Reading: Evidence From Korean-Speaking Children.

    Science.gov (United States)

    Kim, Young-Suk Grace; Cho, Jeung-Ryeul; Park, Soon-Gil

    2017-08-01

    We examined the relations of short-term memory (STM), metalinguistic awareness (phonological, morphological, and orthographic awareness), and rapid automatized naming (RAN) to word reading in Korean, a language with a relatively transparent orthography. STM, metalinguistic awareness, and RAN have been shown to be important to word reading, but the nature of the relations of STM, metalinguistic awareness, and RAN to word reading has rarely been investigated. Two alternative models were fitted. In the indirect relation model, STM was hypothesized to be indirectly related to word reading via metalinguistic awareness and RAN. In the direct and indirect relations model, STM was hypothesized to be directly and indirectly related to word reading. Results from 207 beginning readers in South Korea showed that STM was directly related to word reading as well as indirectly via metalinguistic awareness and RAN. Although the direct effect of STM was relatively small (.16), the total effect incorporating the indirect effect was substantial (.42). These results suggest that STM is an important, foundational cognitive capacity that underpins metalinguistic awareness and RAN as well as word reading, and further indicate the importance of considering both direct and indirect effects of language and cognitive skills on word reading.

  16. Interference Effects on the Recall of Pictures, Printed Words and Spoken Words.

    Science.gov (United States)

    Burton, John K.; Bruning, Roger H.

    Thirty college undergraduates participated in a study of the effects of acoustic and visual interference on the recall of word and picture triads in both short-term and long-term memory. The subjects were presented 24 triads of monosyllabic nouns representing all of the possible combinations of presentation types: pictures, printed words, and…

  17. Meaningful Words and Non-Words Repetitive Articulatory Rate (Oral Diadochokinesis) in Persian Speaking Children.

    Science.gov (United States)

    Zamani, Peyman; Rezai, Hossein; Garmatani, Neda Tahmasebi

    2017-08-01

    Repetitive articulatory rate, or Oral Diadochokinesis (oral-DDK), provides a guideline for the appraisal and diagnosis of subjects with oral-motor disorder. Traditionally, repetition of meaningless words has been used in this task, and preschool children have difficulty with them. Therefore, we aimed to determine some meaningful words in order to test oral-DDK in Persian speaking preschool children. Participants were 142 normally developing children (age range 4-6 years), who were asked to produce /motæka, golabi/ as two meaningful Persian words and /pa-ta-ka/ as a non-word in the conventional oral-DDK task. We compared the time taken for 10-times fast repetitions of two meaningful Persian words and the tri-syllabic nonsense word /pa-ta-ka/. Praat software was used to calculate the average time that subjects took to produce the target items. In 4-5 year old children, [Formula: see text] of time taken for 10-times repetitions of /pa-ta-ka, motæka, golabi/ were [Formula: see text], and [Formula: see text] seconds respectively, and in 5-6 year old children were [Formula: see text], and [Formula: see text] seconds respectively. Findings showed that the main effect of type of words on oral diadochokinesis was significant ([Formula: see text]). Children repeated meaningful words /motæka, golabi/ faster than the non-word /pa-ta-ka/. Sex and age factors had no effect on time taken for repetition of oral-DDK test. It is suggested that Speech Therapists can use meaningful words to facilitate oral-DDK test for children.

  18. Multi-word Lexical Units in English and Slovak Linguistics Terminology

    Directory of Open Access Journals (Sweden)

    Магдалена Била

    2016-12-01

    Full Text Available The research issue discussed in the paper falls within pragmatics, lexicographic and translation studies. It is part of the research grant project entitled “Virtual interactive English-Slovak bilingual encyclopedic linguistics dictionary”. One of the key tasks is to deal with the linguistics term as a concept. This presupposes understanding not only the surface structure but also the deep structure of the term. In preparing the inventory of the prospective dictionary, conceptualization has to take place and defining and translating of the term has to be done accordingly. The ongoing research has shown that one of the most problematic terms is “multi-word lexical unit” (in Slovak “viacslovné pomenovanie”). The problem lies in the different conceptualization of the terms in the two languages. Straightforwardly, in Slovak, the term implies examples that in English would be mostly considered compounds (Ološtiak, Ivanová 2015); in other words, word-formation is the case here. In English, the term is more heterogeneous and encompasses categories like collocation, phrasal verb, idioms, speech formulas (on the term, see Sonomura 1996), situation bound utterances (on the term, see Kecskes 2010), and paremiological expressions (Moon 2015). In these categories, pragmatics rather than word-formation and syntax is the case (Erman and Warren 2000; Gibbs Jr. 2002, Kecskes 2014). The paper offers the analysis of the deep structure of the term in question, explores the role of figurativeness, exemplifies the differences, proposes the translation equivalents, and justifies the different nature of the seemingly corresponding terms, often making an impression of being a calque.

  19. Word/sub-word lattices decomposition and combination for speech recognition

    OpenAIRE

    Le, Viet-Bac; Seng, Sopheap; Besacier, Laurent; Bigi, Brigitte

    2008-01-01

    This paper presents the benefit of using multiple lexical units in the post-processing stage of an ASR system. Since the use of sub-word units can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling, we propose several methods to decompose, normalize and combine word and sub-word lattices generated from different ASR systems. By using a sub-word information table, every word in a lattice can be decomposed into ...

  20. Analysis on the Word-formation of English Netspeak Neologism

    Directory of Open Access Journals (Sweden)

    Wei Liu

    2014-12-01

    Full Text Available The emergence of computer-mediated communication provides a resourceful database for language researchers as well as learners. This study focuses on Internet neologisms, a product of the new media age, which in many ways affect netizens in terms of communication. The collected data are examined empirically to figure out the characteristics of netspeak neologisms and their patterns of formation. It suggests that the most frequently occurring word-formation process of netspeak neologisms is compounding, followed by blending, affixation, old words with new meaning, acronyms, conversion, and clipping. Each process is examined and illustrated with examples, and sub-categories of blending are listed for further understanding. This study demonstrates the diversity of word-formation processes in English netspeak neologisms and may shed light on the creativity of language in the online context.

  1. Questionable Word Choice in Scientific Writing in Orthopedic Surgery

    Directory of Open Access Journals (Sweden)

    Casey M. O'Connor

    2017-07-01

    Full Text Available Background: Given the strong influence of thoughts, emotions, and behaviors on musculoskeletal symptoms and limitations, it’s important that both scientific and lay writing use the most positive, hopeful, and adaptive words and concepts consistent with medical evidence. The use of words that might reinforce misconceptions about preference-sensitive conditions (particularly those associated with age) could increase symptoms and limitations and might also distract patients from the treatment preferences they would select when informed and at ease. Methods: We reviewed 100 consecutive papers published in 2014 and 2015 in 6 orthopedic surgery scientific journals. We counted the number and proportion of journal articles with questionable use of one or more of the following words: tear, aggressive, required, and fail. For each word, we counted the rate of misuse per journal and the number of specific terms misused per article per journal. Results: Eighty percent of all orthopedic scientific articles reviewed had questionable use of at least one term. Tear was most questionably used with respect to rotator cuff pathology. The words fail and require were the most common questionably used terms overall. Conclusion: The use of questionable words and concepts is common in scientific writing in orthopedic surgery. It’s worth considering whether traditional ways of referring to musculoskeletal illness merit rephrasing.

  2. GeoSegmenter: A statistically learned Chinese word segmenter for the geoscience domain

    Science.gov (United States)

    Huang, Lan; Du, Youfu; Chen, Gongyang

    2015-03-01

    Unlike English, the Chinese language has no space between words. Segmenting texts into words, known as the Chinese word segmentation (CWS) problem, thus becomes a fundamental issue for processing Chinese documents and the first step in many text mining applications, including information retrieval, machine translation and knowledge acquisition. However, for the geoscience subject domain, the CWS problem remains unsolved. Although generic segmenters can be applied to process geoscience documents, they lack the domain specific knowledge and consequently their segmentation accuracy drops dramatically. This motivated us to develop a segmenter specifically for the geoscience subject domain: the GeoSegmenter. We first proposed a generic two-step framework for domain specific CWS. Following this framework, we built GeoSegmenter using conditional random fields, a principled statistical framework for sequence learning. Specifically, GeoSegmenter first identifies general terms by using a generic baseline segmenter. Then it recognises geoscience terms by learning and applying a model that can transform the initial segmentation into the goal segmentation. Empirical experimental results on geoscience documents and benchmark datasets showed that GeoSegmenter could effectively recognise both geoscience terms and general terms.

  3. Semantic relatedness and similarity of biomedical terms: examining the effects of recency, size, and section of biomedical publications on the performance of word2vec.

    Science.gov (United States)

    Zhu, Yongjun; Yan, Erjia; Wang, Fei

    2017-07-03

    Understanding semantic relatedness and similarity between biomedical terms has a great impact on a variety of applications such as biomedical information retrieval, information extraction, and recommender systems. The objective of this study is to examine word2vec's ability in deriving semantic relatedness and similarity between biomedical terms from large publication data. Specifically, we focus on the effects of recency, size, and section of biomedical publication data on the performance of word2vec. We download abstracts of 18,777,129 articles from PubMed and 766,326 full-text articles from PubMed Central (PMC). The datasets are preprocessed and grouped into subsets by recency, size, and section. Word2vec models are trained on these subsets. Cosine similarities between biomedical terms obtained from the word2vec models are compared against reference standards. Performance of models trained on different subsets are compared to examine recency, size, and section effects. Models trained on recent datasets did not boost the performance. Models trained on larger datasets identified more pairs of biomedical terms than models trained on smaller datasets in relatedness task (from 368 at the 10% level to 494 at the 100% level) and similarity task (from 374 at the 10% level to 491 at the 100% level). The model trained on abstracts produced results that have higher correlations with the reference standards than the one trained on article bodies (i.e., 0.65 vs. 0.62 in the similarity task and 0.66 vs. 0.59 in the relatedness task). However, the latter identified more pairs of biomedical terms than the former (i.e., 344 vs. 498 in the similarity task and 339 vs. 503 in the relatedness task). Increasing the size of dataset does not always enhance the performance. Increasing the size of datasets can result in the identification of more relations of biomedical terms even though it does not guarantee better precision. As summaries of research articles, compared with article
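
    The cosine similarity that this abstract uses to compare term vectors can be sketched without any external library. The vectors below are toy values, not word2vec output, and the function name cosine_similarity is an assumption for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors standing in for learned term embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

    In an evaluation like the one described, such scores for term pairs would be correlated with a human-rated reference standard; a pair can only be scored when both terms appear in the model's vocabulary, which is why larger corpora cover more pairs.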

  4. Nursery words and hypocorisms among Germanic kinship terms

    DEFF Research Database (Denmark)

    Hansen, Bjarne Simmelkjær Sandgaard

    2018-01-01

    By using Jakobson’s (1960: 127-130) criteria for determining the nursery-word status of a given lexeme, I argue in this article that, even if we should no longer regard PG *aiþīn-/-ōn- ‘mother’ (Goth. aiþei), *aiþma- ‘daughter’s husband’ and *faþōn- ‘father’s sister’ as nursery words...

  5. Examining the Conditions of Using an On-Line Dictionary to Learn Words and Comprehend Texts

    Science.gov (United States)

    Dilenschneider, Robert Francis

    2018-01-01

    This study investigated three look-up conditions for language learners to learn unknown target words and comprehend a reading passage when their attention is transferred away to an on-line dictionary. The research questions focused on how each look-up condition impacted the recall and recognition of word forms, word meanings, and passage…

  6. Evidence for human fronto-central gamma activity during long-term memory encoding of word sequences.

    Directory of Open Access Journals (Sweden)

    Esther Berendina Meeuwissen

    Full Text Available Although human gamma activity (30-80 Hz) associated with visual processing is often reported, it is not clear to what extent gamma activity can be reliably detected non-invasively from frontal areas during complex cognitive tasks such as long term memory (LTM) formation. We conducted a memory experiment composed of 35 blocks each having three parts: LTM encoding, working memory (WM) maintenance and LTM retrieval. In the LTM encoding and WM maintenance parts, participants had to respectively encode or maintain the order of three sequentially presented words. During LTM retrieval subjects had to reproduce these sequences. Using magnetoencephalography (MEG), we identified significant differences in the gamma and beta activity. Robust gamma activity (55-65 Hz) in left BA6 (supplementary motor area (SMA)/pre-SMA) was stronger during LTM rehearsal than during WM maintenance. The gamma activity was sustained throughout the 3.4 s rehearsal period during which a fixation cross was presented. Importantly, the difference in gamma band activity correlated with memory performance over subjects. Further, we observed a weak gamma power difference in left BA6 during the first half of the LTM rehearsal interval, larger for successfully than unsuccessfully reproduced word triplets. In the beta band, we found a power decrease in left anterior regions during LTM rehearsal compared to WM maintenance. This suppression of beta power also correlated with memory performance over subjects. Our findings show that an extended network of brain areas, characterized by oscillatory activity in different frequency bands, supports the encoding of word sequences in LTM. Gamma band activity in BA6 possibly reflects memory processes associated with language and timing, and suppression of beta activity at left frontal sensors is likely to reflect the release of inhibition directly associated with the engagement of language functions.

  7. Auditory word recognition: extrinsic and intrinsic effects of word frequency.

    Science.gov (United States)

    Connine, C M; Titone, D; Wang, J

    1993-01-01

    Two experiments investigated the influence of word frequency in a phoneme identification task. Speech voicing continua were constructed so that one endpoint was a high-frequency word and the other endpoint was a low-frequency word (e.g., best-pest). Experiment 1 demonstrated that ambiguous tokens were labeled such that a high-frequency word was formed (intrinsic frequency effect). Experiment 2 manipulated the frequency composition of the list (extrinsic frequency effect). A high-frequency list bias produced an exaggerated influence of frequency; a low-frequency list bias showed a reverse frequency effect. Reaction time effects were discussed in terms of activation and postaccess decision models of frequency coding. The results support a late use of frequency in auditory word recognition.

  8. Representation Learning of Logic Words by an RNN: From Word Sequences to Robot Actions

    Directory of Open Access Journals (Sweden)

    Tatsuro Yamada

    2017-12-01

    Full Text Available An important characteristic of human language is compositionality. We can efficiently express a wide variety of real-world situations, events, and behaviors by compositionally constructing the meaning of a complex expression from a finite number of elements. Previous studies have analyzed how machine-learning models, particularly neural networks, can learn from experience to represent compositional relationships between language and robot actions with the aim of understanding the symbol grounding structure and achieving intelligent communicative agents. Such studies have mainly dealt with the words (nouns, adjectives, and verbs) that directly refer to real-world matters. In addition to these words, the current study deals with logic words, such as “not,” “and,” and “or” simultaneously. These words are not directly referring to the real world, but are logical operators that contribute to the construction of meaning in sentences. In human–robot communication, these words may be used often. The current study builds a recurrent neural network model with long short-term memory units and trains it to learn to translate sentences including logic words into robot actions. We investigate what kind of compositional representations, which mediate sentences and robot actions, emerge as the network's internal states via the learning process. Analysis after learning shows that referential words are merged with visual information and the robot's own current state, and the logical words are represented by the model in accordance with their functions as logical operators. Words such as “true,” “false,” and “not” work as non-linear transformations to encode orthogonal phrases into the same area in a memory cell state space. The word “and,” which required a robot to lift up both its hands, worked as if it was a universal quantifier. The word “or,” which required action generation that looked apparently random, was represented as an

  9. Recurrent Partial Words

    Directory of Open Access Journals (Sweden)

    Francine Blanchet-Sadri

    2011-08-01

    Full Text Available Partial words are sequences over a finite alphabet that may contain wildcard symbols, called holes, which match or are compatible with all letters; partial words without holes are said to be full words (or simply words). Given an infinite partial word w, the number of distinct full words over the alphabet that are compatible with factors of w of length n, called subwords of w, gives a measure of complexity of infinite partial words known as subword complexity. This measure is of particular interest because we can construct partial words with subword complexities not achievable by full words. In this paper, we consider the notion of recurrence over infinite partial words, that is, we study whether all of the finite subwords of a given infinite partial word appear infinitely often, and we establish connections between subword complexity and recurrence in this more general framework.
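
    The definition of subword complexity given in this abstract can be made concrete on a finite example. The abstract concerns infinite partial words; the sketch below applies the same counting to a finite partial word, and the hole symbol "?" and the function names are assumptions for illustration.

```python
from itertools import product

HOLE = "?"  # assumed notation for a hole


def compatible_full_words(factor, alphabet):
    """All full words obtained by filling every hole with any letter."""
    choices = [alphabet if ch == HOLE else ch for ch in factor]
    return {"".join(letters) for letters in product(*choices)}


def subword_complexity(w, n, alphabet):
    """Number of distinct full words of length n that are compatible
    with some length-n factor of the finite partial word w."""
    subwords = set()
    for i in range(len(w) - n + 1):
        subwords |= compatible_full_words(w[i:i + n], alphabet)
    return len(subwords)


# "a?b" over {a, b}: factors "a?" and "?b" yield aa, ab, bb.
print(subword_complexity("a?b", 2, "ab"))  # 3
# The full word "aab" has only the length-2 subwords aa and ab.
print(subword_complexity("aab", 2, "ab"))  # 2
```

    The comparison shows on a small scale how a hole raises the count relative to a full word of the same length, which is the effect the abstract exploits to reach complexities that full words cannot attain.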

  10. Algorithm of Syntactic Idioms Recognition in the Text: Attempt of Construction

    Directory of Open Access Journals (Sweden)

    Sytar Hanna

    2016-12-01

    Full Text Available Background: National and foreign researchers have so far focused on the structural and semantic features of syntactic idioms. Automatic analysis of these peculiar units, which lie on the border between syntax and phraseology, has not yet been carried out in the scientific literature. This issue requires a theoretical understanding and practical implementation. Purpose: To create an algorithm for recognizing syntactic idioms with a one- or two-term core component in a corpus of texts. Results: Based on the results of previous theoretical studies, we identified a number of formal and statistical criteria that make it possible to distinguish syntactic idioms from other language units in a corpus of Ukrainian-language texts. The author developed a block diagram of syntactic idiom recognition, incorporating two branches constructed for sentences with a one-term and sentences with a two-term core component, respectively. The first branch is based on the presence of word repeats (full concurrence of words or presence of other word forms of the word) and the list of core components determined at previous stages of the study (є, це, то, не, так; як; з/із/зі, між, над, серед; а, але, зате, однак, проте). The second branch was created for another type of syntactic idiom, one with a two-term core component. It takes into account the following properties of the analyzed units: the presence of combinations of service parts of speech, service parts of speech with pronoun or adverb, pronoun and adverb; compliance of word combinations with the register of the syntactic idioms core components currently comprising 92 structures; association measure of mutual information ≥9, etc. Discussion: The proposed algorithm enables automatic identification of syntactic idioms in the corpus of texts and removal of contexts of their use; it can be used to improve the procedure of automatic text processing and creation of automated translation
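
    The mutual information threshold mentioned in this abstract can be illustrated as follows. The abstract does not give the formula, so the sketch assumes the pointwise mutual information score commonly used in corpus linguistics, with a base-2 logarithm; the counts and function names are hypothetical.

```python
import math


def mutual_information(pair_count, count_x, count_y, corpus_size):
    """Pointwise mutual information of a word pair (assumed formula):
    log2(f(x, y) * N / (f(x) * f(y)))."""
    return math.log2(pair_count * corpus_size / (count_x * count_y))


def passes_threshold(pair_count, count_x, count_y, corpus_size,
                     threshold=9.0):
    """True if the pair meets the mutual information >= 9 criterion."""
    score = mutual_information(pair_count, count_x, count_y, corpus_size)
    return score >= threshold


# Hypothetical counts in a corpus of 2**20 tokens.
N = 2 ** 20
rare_pair = mutual_information(8, 16, 32, N)        # strongly associated
common_pair = mutual_information(4, 1024, 1024, N)  # weakly associated
print(rare_pair, passes_threshold(8, 16, 32, N))
print(common_pair, passes_threshold(4, 1024, 1024, N))
```

    Under this assumed formula, a pair of rare words that almost always occur together scores high, while a pair of frequent words that co-occur occasionally falls well below the threshold.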

  11. Exploratory analysis of textual data from the Mother and Child Handbook using a text mining method (II): Monthly changes in the words recorded by mothers.

    Science.gov (United States)

    Tagawa, Miki; Matsuda, Yoshio; Manaka, Tomoko; Kobayashi, Makiko; Ohwada, Michitaka; Matsubara, Shigeki

    2017-01-01

    The aim of the study was to examine the possibility of converting subjective textual data written in the free column space of the Mother and Child Handbook (MCH) into objective information using text mining and to compare any monthly changes in the words written by the mothers. Pregnant women without complications (n = 60) were divided into two groups according to State-Trait Anxiety Inventory grade: low trait anxiety (group I, n = 39) and high trait anxiety (group II, n = 21). Exploratory analysis of the textual data from the MCH was conducted by text mining using the Word Miner software program. Using 1203 structural elements extracted after processing, a comparison of monthly changes in the words used in the mothers' comments was made between the two groups. The data was mainly analyzed by a correspondence analysis. The structural elements in groups I and II were divided into seven and six clusters, respectively, by cluster analysis. Correspondence analysis revealed clear monthly changes in the words used in the mothers' comments as the pregnancy progressed in group I, whereas the association was not clear in group II. The text mining method was useful for exploratory analysis of the textual data obtained from pregnant women, and the monthly change in the words used in the mothers' comments as pregnancy progressed differed according to their degree of unease. © 2016 Japan Society of Obstetrics and Gynecology.

  12. Dynamics of Semantic and Word-Formation Subsystems of the Russian Language: Historical Dynamics of the Word Family

    Directory of Open Access Journals (Sweden)

    Olga Ivanovna Dmitrieva

    2015-09-01

    Full Text Available The article provides comprehensive justification of the principles and methods of the synchronic and diachronic research of word-formation subsystems of the Russian language. The authors also study the ways of analyzing historical dynamics of word family as the main macro-unit of word-formation system. In the field of analysis there is a family of words with the stem 'ход-' (the meaning of 'motion'), the word-formation of which is investigated in different periods of the Russian literary language. Significance of motion-verbs in the process of forming a language picture of the world determined the character and the structure of this word family as one of the biggest in the history of the Russian language. In the article a structural and semantic dynamics of the word family 'ход-' is depicted. The results of the study show that in the ancient period the prefixes of verbal derivatives were formed, which became the apex-branched derivational paradigms existing in modern Russian. The old Russian period of language development is characterized by the appearance of words with connotative meaning (with suffixes -ishk-, -ichn-), as well as the words with possessive semantics (with suffixes -ev-, -sk-). In this period the verbs with the postfix -cz also supplement the analyzed word family. The period of formation of the National Russian language was marked by the loss of a large number of abstract nouns and the appearance of neologisms from some old Russian abstract nouns. The studied family in the modern Russian language is characterized by the following processes: the appearance of terms, the active semantic derivation, the weakening of word-formation variability, the semantic differentiation of duplicate units, the development of subsystem of words with connotative meanings, and the preservation of derivatives in all functional styles.

  13. Hemispheric asymmetries in discourse processing: evidence from false memories for lists and texts.

    Science.gov (United States)

    Ben-Artzi, Elisheva; Faust, Miriam; Moeller, Edna

    2009-01-01

    Previous research suggests that the right hemisphere (RH) may contribute uniquely to discourse and text processing by activating and maintaining a wide range of meanings, including more distantly related meanings. The present study used the word-lists false memory paradigm [Roediger, H. L., III, & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 803-814.] to examine the hypothesis that difference between the two cerebral hemispheres in discourse processing may be due, at least partly, to memory representations for implicit text-related semantic information. Specifically, we tested the susceptibility of the left hemisphere (LH) and RH to unpresented target words following the presentation of semantically related words appearing in either word lists or short texts. Findings showed that the RH produced more false alarms than the LH for unpresented target words following either word lists or texts. These findings reveal hemispheric differences in memory for semantically related information and suggest that RH advantage in long-term maintenance of a wide range of text-related word meanings may be one aspect of its unique contribution to the construction of a discourse model. The results support the RH coarse semantic coding theory [Beeman, M. (1998). Coarse semantic coding and discourse comprehension. In M. Beeman & C. Chiarello (Eds.), Right hemisphere language comprehension: Perspectives from cognitive neuroscience (pp. 255-284). Mahwah, NJ: Erlbaum.] and suggest that hemispheric differences in semantic processing during language comprehension extend also to verbal memory.

  14. Center of attention: A network text analysis of American Sniper

    Directory of Open Access Journals (Sweden)

    Starling Hunter

    2016-06-01

    Full Text Available Network Text Analysis (NTA) is a term used to describe a variety of software-supported methods for modeling texts as networks of concepts. In this study we apply NTA to the screenplay of American Sniper, an Academy Award nominee for Best Adapted Screenplay in 2014. Specifically, we establish prior expectations as to the key themes associated with war films. We then empirically test whether words associated with the most influentially positioned nodes in the network signify themes common to the war-film genre. As predicted, we find that words and concepts associated with the least constrained nodes in the text network were significantly more likely to be associated with the war genre and significantly less likely to be associated with genres to which the film did not belong.

  15. Idiomaticity of English Business Terms and Their Equivalents in Lithuanian

    Directory of Open Access Journals (Sweden)

    Pavel Skorupa

    2011-04-01

    Full Text Available The paper presents a survey of idiomatic English business terms and their Lithuanian equivalents. The study was based on the theory of idioms and idiomaticity, highlighting the idea that idiomaticity can affect single words, word combinations, and longer text passages. Idiomatic business terms were taken from different English and Lithuanian general and special dictionaries, course books, and business texts. The analyzed terms were classified into distinct groups according to their meaning. The key problem encountered was the lack of Lithuanian translation equivalents for certain idiomatic English business terms. Possible Lithuanian translations were provided.

  16. Fracture Mechanics Method for Word Embedding Generation of Neural Probabilistic Linguistic Model

    Directory of Open Access Journals (Sweden)

    Size Bi

    2016-01-01

    Full Text Available Word embedding, a lexical vector representation generated via the neural linguistic model (NLM), has been empirically demonstrated to improve the performance of traditional language models. However, the supreme dimensionality that is inherent in NLM contributes to the problems of hyperparameters and long training times in modeling. Here, we propose a force-directed method to mitigate these problems and simplify the generation of word embeddings. In this framework, each word is treated as a point in the real world; thus it can approximately simulate physical movement following certain mechanics. To simulate the variation of meaning in phrases, we use fracture mechanics to model the formation and breakdown of meaning combined by a 2-gram word group. In experiments on the natural linguistic tasks of part-of-speech tagging, named entity recognition and semantic role labeling, the results demonstrated that the 2-dimensional word embedding can rival the word embeddings generated by classic NLMs in terms of accuracy, recall, and text visualization.

  17. Using Gazetteers to Extract Sets of Keywords from Free-Flowing Texts

    Directory of Open Access Journals (Sweden)

    Adam Crymble

    2015-12-01

    Full Text Available If you have a copy of a text in electronic format stored on your computer, it is relatively easy to keyword search for a single term. Often you can do this by using the built-in search features in your favourite text editor. However, scholars are increasingly needing to find instances of many terms within a text or texts. For example, a scholar may want to use a gazetteer to extract all mentions of English placenames within a collection of texts so that those places can later be plotted on a map. Alternatively, they may want to extract all male given names, all pronouns, stop words, or any other set of words. Using those same built-in search features to achieve this more complex goal is time consuming and clunky. This lesson will teach you how to use Python to extract a set of keywords very quickly and systematically from a set of texts. It is expected that once you have completed this lesson, you will be able to generalise the skills to extract custom sets of keywords from any set of locally saved files.
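
    The kind of gazetteer lookup this record describes can be illustrated with a minimal Python sketch. It is not the lesson's own code: the function name `extract_keywords`, the sample sentence, and the four-entry gazetteer are assumptions made purely for illustration. Each gazetteer entry is matched as a whole word (or phrase) against a lower-cased, punctuation-stripped copy of the text.

    ```python
    import re

    def extract_keywords(text, gazetteer):
        """Return the gazetteer entries that occur in text, in gazetteer order.

        Hypothetical helper for illustration; not taken from the lesson itself.
        """
        # Lower-case the text and turn punctuation into spaces, then pad with
        # spaces so that whole-word matching also works at the two ends.
        cleaned = " " + re.sub(r"[^\w\s]", " ", text.lower()) + " "
        cleaned = re.sub(r"\s+", " ", cleaned)
        found = []
        for entry in gazetteer:
            needle = " " + entry.lower().strip() + " "
            if needle in cleaned and entry not in found:
                found.append(entry)
        return found

    # Illustrative data only (assumed, not from the source).
    sample_text = "He travelled from Oxford to Cambridge, then on to York."
    sample_gazetteer = ["Cambridge", "London", "Oxford", "York"]
    print(extract_keywords(sample_text, sample_gazetteer))
    ```

    Padding both the text and each entry with spaces is a simple way to avoid partial matches (for example "York" inside "Yorkshire"); a real collection would likely need further normalisation, such as handling hyphens or spelling variants.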

  18. Reader, Word, and Character Attributes Contributing to Chinese Children's Concept of Word

    Science.gov (United States)

    Chen, Jing; Lin, Tzu-Jung; Ku, Yu-Min; Zhang, Jie; O'Connell, Ann

    2018-01-01

    Concept of word--the awareness of how words differ from nonwords or other linguistic properties--is important to learning to read Chinese because words in Chinese texts are not separated by space, and most characters can be productively compounded with other characters to form new words. The current study examined the effects of reader, word, and…

  19. Word sets, keywords, and text contents: an investigation of text topic on the computer Iniciando a língüística do corpus do português: explorando um corpus para ensinar português como língua estrangeira

    Directory of Open Access Journals (Sweden)

    Antonio P. BERBER SARDINHA

    1999-02-01

    Full Text Available This study presents a methodology for the identification of coherent word sets. Eight sets were initially identified and further grouped into two main sets: a 'company' set and a 'non-company' set. These two sets shared very few collocates, and therefore they seemed to represent distinct topics. The positions of the words in the 'company' and 'non-company' sets across the text were computed. The results indicated that the 'non-company' sets referred to 'company' implicitly. Finally, the key words were compared to an automatic abridgment of the text, which revealed that nearly all key words were present in the abridgment. This was interpreted as suggesting that the key words may indeed represent the main contents of the text.

  20. Google Advertising Tools Cashing in with AdSense and AdWords

    CERN Document Server

    Davis, Harold

    2010-01-01

    With this book, you'll learn how to take full advantage of Google AdWords and AdSense, the sophisticated online advertising tools used by thousands of large and small businesses. This new edition provides a substantially updated guide to advertising on the Web, including how it works in general, and how Google's advertising programs in particular help you make money. You'll find everything you need to work with AdWords, which lets you generate text ads to accompany specific search term results, and AdSense, which automatically delivers precisely targeted text and image ads to your website.

  1. Short-Term and Long-Term Effects on Visual Word Recognition

    Science.gov (United States)

    Protopapas, Athanassios; Kapnoula, Efthymia C.

    2016-01-01

    Effects of lexical and sublexical variables on visual word recognition are often treated as homogeneous across participants and stable over time. In this study, we examine the modulation of frequency, length, syllable and bigram frequency, orthographic neighborhood, and graphophonemic consistency effects by (a) individual differences, and (b) item…

  2. Word Origins of Common Neuroscience Terms for Use in an Undergraduate Classroom.

    Science.gov (United States)

    Hallock, Robert M; Brand, Emma C; Mihalic, Taylor B

    2016-01-01

    We compiled a list of nearly 300 neuroscience terms and listed their language of origin (typically Latin or Greek), their literal meaning, and their pronunciation in a table. The table was distributed to students in an undergraduate neuroscience class a few weeks before the first examination. A follow-up survey asked students how long they spent with the handout, and also assessed whether they thought it helped them better understand the terms, apply the terms, and whether they thought it helped them enough to get a higher grade on the exam. Results were positive: nearly 78% of students used the table while reviewing the material, and these students overwhelmingly reported that the table helped them better understand and apply the terms. However, students were equally split on whether the handout resulted in a better grade on the first exam. It was our premise that better understanding the derivation of the words can help students make associations between the terms and their meanings/functions. This handout can be used in any undergraduate neuroscience course to help students better understand the complex terminology associated with the material.

  3. Increase in posterior alpha activity during rehearsal predicts successful long-term memory formation of word sequences.

    Science.gov (United States)

    Meeuwissen, Esther B; Takashima, Atsuko; Fernández, Guillén; Jensen, Ole

    2011-12-01

    It is becoming increasingly clear that demanding cognitive tasks rely on an extended network engaging task-relevant areas and, importantly, disengaging task-irrelevant areas. Given that alpha activity (8-12 Hz) has been shown to reflect the disengagement of task-irrelevant regions in attention and working memory tasks, we here ask if alpha activity plays a related role for long-term memory formation. Subjects were instructed to encode and maintain the order of word sequences while the ongoing brain activity was recorded using magnetoencephalography (MEG). In each trial, three words were presented followed by a 3.4 s rehearsal interval. Considering the good temporal resolution of MEG this allowed us to investigate the word presentation and rehearsal interval separately. The sequences were grouped in trials where word order either could be tested immediately (working memory trials; WM) or later (LTM trials) according to instructions. Subjects were tested on their ability to retrieve the order of the three words. The data revealed that alpha power in parieto-occipital regions was lower during word presentation compared to rehearsal. Our key finding was that parieto-occipital alpha power during the rehearsal period was markedly stronger for successfully than unsuccessfully encoded LTM sequences. This subsequent memory effect demonstrates that high posterior alpha activity creates an optimal brain state for successful LTM formation possibly by actively reducing parieto-occipital activity that might interfere with sequence encoding. Copyright © 2010 Wiley Periodicals, Inc.

  4. Convolutional Neural Networks for Text Categorization: Shallow Word-level vs. Deep Character-level

    OpenAIRE

    Johnson, Rie; Zhang, Tong

    2016-01-01

    This paper reports the performances of shallow word-level convolutional neural networks (CNN), our earlier work (2015), on the eight datasets with relatively large training data that were used for testing the very deep character-level CNN in Conneau et al. (2016). Our findings are as follows. The shallow word-level CNNs achieve better error rates than the error rates reported in Conneau et al., though the results should be interpreted with some consideration due to the unique pre-processing o...

  5. The Relationships among Verbal Short-Term Memory, Phonological Awareness, and New Word Learning: Evidence from Typical Development and Down Syndrome

    Science.gov (United States)

    Jarrold, Christopher; Thorn, Annabel S. C.; Stephens, Emma

    2009-01-01

    This study examined the correlates of new word learning in a sample of 64 typically developing children between 5 and 8 years of age and a group of 22 teenagers and young adults with Down syndrome. Verbal short-term memory and phonological awareness skills were assessed to determine whether learning new words involved accurately representing…

  6. Reading Authentic Texts

    DEFF Research Database (Denmark)

    Balling, Laura Winther

    2013-01-01

    Most research on cognates has focused on words presented in isolation that are easily defined as cognate between L1 and L2. In contrast, this study investigates what counts as cognate in authentic texts and how such cognates are read. Participants with L1 Danish read news articles in their highly...... proficient L2, English, while their eye-movements were monitored. The experiment shows a cognate advantage for morphologically simple words, but only when cognateness is defined relative to translation equivalents that are appropriate in the context. For morphologically complex words, a cognate disadvantage...... word predictability indexed by the conditional probability of each word....

  7. Learning Semantic Tags from Big Data for Clinical Text Representation.

    Science.gov (United States)

    Li, Yanpeng; Liu, Hongfang

    2015-01-01

    In clinical text mining, it is one of the biggest challenges to represent medical terminologies and n-gram terms in sparse medical reports using either supervised or unsupervised methods. Addressing this issue, we propose a novel method for word and n-gram representation at semantic level. We first represent each word by its distance with a set of reference features calculated by reference distance estimator (RDE) learned from labeled and unlabeled data, and then generate new features using simple techniques of discretization, random sampling and merging. The new features are a set of binary rules that can be interpreted as semantic tags derived from word and n-grams. We show that the new features significantly outperform classical bag-of-words and n-grams in the task of heart disease risk factor extraction in i2b2 2014 challenge. It is promising to see that semantics tags can be used to replace the original text entirely with even better prediction performance as well as derive new rules beyond lexical level.

  8. Phonological Adaptation of Borrowed Terms in Duramazwi reMimhanzi

    Directory of Open Access Journals (Sweden)

    Gift Mheta

    2011-10-01

    Full Text Available Abstract: This article analyses the phonological characteristics of Shona musical terms borrowed from English. It discusses the phonological processes that take place when words are borrowed directly or indirectly from English. Essentially, the article analyses the adoption and adaptation of Shona loan-words at phonological level. It draws examples from the dictionary of Shona musical terms Duramazwi reMimhanzi (2005). This exploration of loan-word adaptation enhances the understanding of the phonological changes that the musical terms undergo during the borrowing process.

  9. Flexible Word Classes

    DEFF Research Database (Denmark)

    • First major publication on the phenomenon • Offers cross-linguistic, descriptive, and diverse theoretical approaches • Includes analysis of data from different language families and from lesser studied languages This book is the first major cross-linguistic study of 'flexible words', i.e. words...... that cannot be classified in terms of the traditional lexical categories Verb, Noun, Adjective or Adverb. Flexible words can - without special morphosyntactic marking - serve in functions for which other languages must employ members of two or more of the four traditional, 'specialised' word classes. Thus......, flexible words are underspecified for communicative functions like 'predicating' (verbal function), 'referring' (nominal function) or 'modifying' (a function typically associated with adjectives and e.g. manner adverbs). Even though linguists have been aware of flexible word classes for more than...

  10. Effects of word frequency and transitional probability on word reading durations of younger and older speakers

    NARCIS (Netherlands)

    Moers, C.; Meyer, A.S.; Janse, E.

    2017-01-01

    High-frequency units are usually processed faster than low-frequency units in language comprehension and language production. Frequency effects have been shown for words as well as word combinations. Word co-occurrence effects can be operationalized in terms of transitional probability (TP). TPs
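
    As a rough illustration of the transitional probability (TP) measure mentioned in this record, the sketch below estimates the forward TP of a word pair as count(w1 w2) / count(w1) from a token list. This is one common operationalization and is an assumption here; the truncated abstract does not say which variant or corpus the authors used, and the function name and toy sentence are hypothetical.

    ```python
    from collections import Counter

    def transitional_probabilities(tokens):
        """Forward TP for each adjacent word pair: count(w1 w2) / count(w1).

        Illustrative definition only; the study's exact measure is not
        given in the abstract.
        """
        # Count each word only where it can start a bigram (all but the last).
        first_word_counts = Counter(tokens[:-1])
        bigram_counts = Counter(zip(tokens, tokens[1:]))
        return {
            (w1, w2): n / first_word_counts[w1]
            for (w1, w2), n in bigram_counts.items()
        }

    # Toy corpus (assumed for illustration).
    tokens = "the cat sat on the mat the cat ran".split()
    tp = transitional_probabilities(tokens)
    print(tp[("the", "cat")])  # "the" starts 3 bigrams, 2 of them "the cat"
    ```

    Under this definition a pair such as "sat on" in the toy corpus has a TP of 1.0, because "sat" is always followed by "on", whereas "the cat" has a lower TP because "the" is also followed by "mat".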

  11. Short-term retention of pictures and words as a function of type of distraction and length of delay interval.

    Science.gov (United States)

    Pellegrino, J W; Siegel, A W; Dhawan, M

    1976-01-01

    Picture and word triads were tested in a Brown-Peterson short-term retention task at varying delay intervals (3, 10, or 30 sec) and under acoustic and simultaneous acoustic and visual distraction. Pictures were superior to words at all delay intervals under single acoustic distraction. Dual distraction consistently reduced picture retention while simultaneously facilitating word retention. The results were interpreted in terms of the dual coding hypothesis with modality-specific interference effects in the visual and acoustic processing systems. The differential effects of dual distraction were related to the introduction of visual interference and differential levels of functional acoustic interference across dual and single distraction tasks. The latter was supported by a constant 2/1 ratio in the backward counting rates of the acoustic vs. dual distraction tasks. The results further suggest that retention may not depend on total processing load of the distraction task, per se, but rather that processing load operates within modalities.

  12. Word classes

    DEFF Research Database (Denmark)

    Rijkhoff, Jan

    2007-01-01

    in grammatical descriptions of some 50 languages, which together constitute a representative sample of the world’s languages (Hengeveld et al. 2004: 529). It appears that there are both quantitative and qualitative differences between word class systems of individual languages. Whereas some languages employ...... a parts-of-speech system that includes the categories Verb, Noun, Adjective and Adverb, other languages may use only a subset of these four lexical categories. Furthermore, quite a few languages have a major word class whose members cannot be classified in terms of the categories Verb – Noun – Adjective...... – Adverb, because they have properties that are strongly associated with at least two of these four traditional word classes (e.g. Adjective and Adverb). Finally, this article discusses some of the ways in which word class distinctions interact with other grammatical domains, such as syntax and morphology....

  13. Finding Rising and Falling Words

    NARCIS (Netherlands)

    Tjong Kim Sang, E.

    2016-01-01

    We examine two different methods for finding rising words (among which neologisms) and falling words (among which archaisms) in decades of magazine texts (millions of words) and in years of tweets (billions of words): one based on correlation coefficients of relative frequencies and time, and one
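
    The first method named in this record, correlating relative frequencies with time, might be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `pearson` and `trend`, the yearly totals, and the counts for the two example words are all invented for the example.

    ```python
    from math import sqrt

    def pearson(xs, ys):
        """Pearson correlation coefficient of two equal-length sequences."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
        sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
        # A constant series has no trend; report 0.0 instead of dividing by zero.
        return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

    def trend(word_counts, totals):
        """Correlate a word's relative frequency per period with time."""
        periods = sorted(totals)
        relative = [word_counts.get(p, 0) / totals[p] for p in periods]
        return pearson(periods, relative)

    # Invented counts per year (assumptions for illustration only).
    totals = {2010: 1000, 2011: 1000, 2012: 1000, 2013: 1000}
    rising = {2010: 1, 2011: 2, 2012: 4, 2013: 9}
    falling = {2010: 8, 2011: 5, 2012: 3, 2013: 1}
    print(trend(rising, totals), trend(falling, totals))
    ```

    A coefficient near +1 would flag a candidate rising word and one near -1 a candidate falling word; where to set the threshold, and how to treat words with very low counts, are choices the abstract does not specify.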

  14. WORD PROCESSING AND SECOND LANGUAGE WRITING: A LONGITUDINAL CASE STUDY

    Directory of Open Access Journals (Sweden)

    Alister Cumming

    2001-12-01

    Full Text Available The purpose of this study was to determine whether word processing might change a second language (L2) learner's writing processes and improve the quality of his essays over a relatively long period of time. We worked from the assumption that research comparing word-processing to pen and paper composing tends to show positive results when studies include lengthy terms of data collection and when appropriate instruction and training are provided. We compared the processes and products of L2 composing displayed by a 29-year-old, male Mandarin learner of English with intermediate proficiency in English while he wrote, over 8 months, 14 compositions grouped into 7 comparable pairs of topics alternating between uses of a lap-top computer and of pen and paper. All keystrokes were recorded electronically in the computer environment; visual records of all text changes were made for the pen-and-paper writing. Think-aloud protocols were recorded in all sessions. Analyses indicate advantages for the word-processing medium over the pen-and-paper medium in terms of: a greater frequency of revisions made at the discourse level and at the syntactical level; higher scores for content on analytic ratings of the completed compositions; and more extensive evaluation of written texts in think-aloud verbal reports.

  15. "What is relevant in a text document?": An interpretable machine learning approach.

    Directory of Open Access Journals (Sweden)

    Leila Arras

    Full Text Available Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications.
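
    To make the idea of word-wise relevance scores concrete, here is a deliberately simplified Python sketch for a linear bag-of-words classifier, where each word's contribution to the decision score is taken to be its weight times its count. This is an assumption-laden illustration, not the paper's LRP procedure (which also covers the CNN and may distribute the bias differently); the weights, bias, and document are invented.

    ```python
    def word_relevances(doc_tokens, weights, bias=0.0):
        """Decompose a linear bag-of-words score into per-word contributions.

        Simplified illustration: relevance of a word = weight * count.
        The bias is added to the score but not redistributed onto words.
        """
        counts = {}
        for token in doc_tokens:
            counts[token] = counts.get(token, 0) + 1
        # Words without a learned weight contribute nothing to the score.
        relevances = {w: weights.get(w, 0.0) * c for w, c in counts.items()}
        score = sum(relevances.values()) + bias
        return score, relevances

    # Hypothetical weights for a "space" topic class (not from the paper).
    weights = {"rocket": 1.5, "orbit": 1.0, "goal": -2.0}
    doc = "the rocket reached orbit the rocket".split()
    score, relevances = word_relevances(doc, weights, bias=-0.5)
    print(score, relevances)
    ```

    Because the per-word relevances sum to the score minus the bias, sorting them shows which words pushed the document toward or away from the class, which is the kind of word-level explanation the abstract describes.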

  16. Assessing neglect dyslexia with compound words.

    Science.gov (United States)

    Reinhart, Stefan; Schunck, Alexander; Schaadt, Anna Katharina; Adams, Michaela; Simon, Alexandra; Kerkhoff, Georg

    2016-10-01

    The neglect syndrome is frequently associated with neglect dyslexia (ND), which is characterized by omissions or misread initial letters of single words. ND is usually assessed with standardized reading texts in clinical settings. However, particularly in the chronic phase of ND, patients often report reading deficits in everyday situations but show (nearly) normal performances in test situations that are commonly well-structured. To date, sensitive and standardized tests to assess the severity and characteristics of ND are lacking, although reading is of high relevance for daily life and vocational settings. Several studies found modulating effects of different word features on ND. We combined those features in a novel test to enhance test sensitivity in the assessment of ND. Low-frequency words of different length that contain residual pronounceable words when the initial letter strings are neglected were selected. We compared these words in a group of 12 ND-patients suffering from right-hemispheric first-ever stroke with word stimuli containing no existing residual words. Finally, we tested whether the serially presented words are more sensitive for the diagnosis of ND than text reading. The severity of ND was modulated strongly by the ND-test words and error frequencies in single word reading of ND words were on average more than 10 times higher than in a standardized text reading test (19.8% vs. 1.8%). The novel ND-test maximizes the frequency of specific ND-errors and is therefore more sensitive for the assessment of ND than conventional text reading tasks. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  17. Pre-learning stress differentially affects long-term memory for emotional words, depending on temporal proximity to the learning experience.

    Science.gov (United States)

    Zoladz, Phillip R; Clark, Brianne; Warnecke, Ashlee; Smith, Lindsay; Tabar, Jennifer; Talbot, Jeffery N

    2011-07-06

    Stress exerts a profound, yet complex, influence on learning and memory and can enhance, impair or have no effect on these processes. Here, we have examined how the administration of stress at different times before learning affects long-term (24-hr) memory for neutral and emotional information. Participants submerged their dominant hand into a bath of ice cold water (Stress) or into a bath of warm water (No stress) for 3 min. Either immediately (Exp. 1) or 30 min (Exp. 2) after the water bath manipulation, participants were presented with a list of 30 words varying in emotional valence. The next day, participants' memory for the word list was assessed via free recall and recognition tests. In both experiments, stressed participants exhibited greater blood pressure, salivary cortisol levels, and subjective pain and stress ratings than non-stressed participants in response to the water bath manipulation. Stress applied immediately prior to learning (Exp. 1) enhanced the recognition of positive words, while stress applied 30 min prior to learning (Exp. 2) impaired free recall of negative words. Participants' recognition of positive words in Experiment 1 was positively associated with their heart rate responses to the water bath manipulation, while participants' free recall of negative words in Experiment 2 was negatively associated with their blood pressure and cortisol responses to the water bath manipulation. These findings indicate that the differential effects of pre-learning stress on long-term memory may depend on the temporal proximity of the stressor to the learning experience and the emotional nature of the to-be-learned information. Copyright © 2011. Published by Elsevier Inc.

  18. Gesture en route to words

    DEFF Research Database (Denmark)

    Jensen de López, Kristine M.

    2010-01-01

    This study explores the communicative production of gestural and vocal modalities by 8 normally developing children in two different cultures (Danish and Zapotec: Mexican indigenous) aged 16 to 20 months. We analyzed spontaneous production of gestures and words in children's transition to the two-word...... the children showed an early preference for the gestural or vocal modality. Through analyses of two-element combinations of words and/or gestures, we observed a relative increase in cross-modal (gesture-word and two-word) combinations. The results are discussed in terms of understanding gestures as a transition...

  19. Encoding in the visual word form area: an fMRI adaptation study of words versus handwriting.

    Science.gov (United States)

    Barton, Jason J S; Fox, Christopher J; Sekunova, Alla; Iaria, Giuseppe

    2010-08-01

    Written texts are not just words but complex multidimensional stimuli, including aspects such as case, font, and handwriting style, for example. Neuropsychological reports suggest that left fusiform lesions can impair the reading of text for word (lexical) content, being associated with alexia, whereas right-sided lesions may impair handwriting recognition. We used fMRI adaptation in 13 healthy participants to determine if repetition-suppression occurred for words but not handwriting in the left visual word form area (VWFA) and the reverse in the right fusiform gyrus. Contrary to these expectations, we found adaptation for handwriting but not for words in both the left VWFA and the right VWFA homologue. A trend to adaptation for words but not handwriting was seen only in the left middle temporal gyrus. An analysis of anterior and posterior subdivisions of the left VWFA also failed to show any adaptation for words. We conclude that the right and the left fusiform gyri show similar patterns of adaptation for handwriting, consistent with a predominantly perceptual contribution to text processing.

  20. Pictures Improve Memory of SAT Vocabulary Words.

    Science.gov (United States)

    Price, Melva; Finkelstein, Arleen

    1994-01-01

    Suggests that students can improve their memory of Scholastic Aptitude Test vocabulary words by associating the words with corresponding pictures taken from magazines. Finds that long-term recall of words associated with pictures was higher than recall of words not associated with pictures. (RS)

  1. Professional Music Training and Novel Word Learning: From Faster Semantic Encoding to Longer-lasting Word Representations.

    Science.gov (United States)

    Dittinger, Eva; Barbaroux, Mylène; D'Imperio, Mariapaola; Jäncke, Lutz; Elmer, Stefan; Besson, Mireille

    2016-10-01

    On the basis of previous results showing that music training positively influences different aspects of speech perception and cognition, the aim of this series of experiments was to test the hypothesis that adult professional musicians would learn the meaning of novel words through picture-word associations more efficiently than controls without music training (i.e., fewer errors and faster RTs). We also expected musicians to show faster changes in brain electrical activity than controls, in particular regarding the N400 component that develops with word learning. In line with these hypotheses, musicians outperformed controls in the most difficult semantic task. Moreover, although a frontally distributed N400 component developed in both groups of participants after only a few minutes of novel word learning, in musicians this frontal distribution rapidly shifted to parietal scalp sites, as typically found for the N400 elicited by known words. Finally, musicians showed evidence for better long-term memory for novel words 5 months after the main experimental session. Results are discussed in terms of cascading effects from enhanced perception to memory as well as in terms of multifaceted improvements of cognitive processing due to music training. To our knowledge, this is the first report showing that music training influences semantic aspects of language processing in adults. These results open new perspectives for education in showing that early music training can facilitate later foreign language learning. Moreover, the design used in the present experiment can help to specify the stages of word learning that are impaired in children and adults with word learning difficulties.

  2. Baby's first 10 words.

    Science.gov (United States)

    Tardif, Twila; Fletcher, Paul; Liang, Weilan; Zhang, Zhixiang; Kaciroti, Niko; Marchman, Virginia A

    2008-07-01

    Although there has been much debate over the content of children's first words, few large sample studies address this question for children at the very earliest stages of word learning. The authors report data from comparable samples of 265 English-, 336 Putonghua- (Mandarin), and 369 Cantonese-speaking 8- to 16-month-old infants whose caregivers completed MacArthur-Bates Communicative Development Inventories and reported them to produce between 1 and 10 words. Analyses of individual words indicated striking commonalities in the first words that children learn. However, substantive cross-linguistic differences appeared in the relative prevalence of common nouns, people terms, and verbs as well as in the probability that children produced even one of these word types when they had a total of 1-3, 4-6, or 7-10 words in their vocabularies. These data document cross-linguistic differences in the types of words produced even at the earliest stages of vocabulary learning and underscore the importance of parental input and cross-linguistic/cross-cultural variations in children's early word-learning.

  3. A Finnic holy word and its subsequent history

    Directory of Open Access Journals (Sweden)

    Mauno Koski

    1990-01-01

    Full Text Available This article concentrates on a specific ancient holy word in Finnish and its subsequent development, hiisi. In the Finnish language region hiisi appears as an element in place names in over 230 villages established by the end of the thirteenth century, and at least a majority of these must have existed since prehistoric times. In Finland as well as in Estonia it is possible to demonstrate an earlier sacral function in places which contain hiisi as a component of their name, partly with the help of archeological discoveries, and partly with the help of oral folk tradition. It is particularly among the earliest settlement areas of Southwest Finland, Satakunta and Häme that hiisi features in the names of sacrificial sites or trees, in other words in the same areas where it features in the names of burial grounds. Names in which the hiisi element precedes a word meaning a lake, pond, or other water formation, occur particularly in the eastern Finnish dialect regions, as well as in the regions of Karelian, Olonets, Lydian, and Vepsian. In addition to its factual meaning of cult place, the Finnish word hiisi has come to denote a supernatural entity both in terms of its reference to a place and in terms of its reference to a being.

  4. EVALUATION OF THE TEXTS IN TURKISH AS A FOREIGN LANGUAGE COURSE BOOKS IN TERMS OF FORMULAIC EXPRESSIONS

    Directory of Open Access Journals (Sweden)

    Nil Didem ŞİMŞEK

    2015-07-01

    Full Text Available Since primitive times, the need to communicate with each other has paved the way for the use of different types of languages, and the question of language has become an unsolvable, complex issue. It is not possible to limit language with definitions. Language, as a social institution, differs from other languages through the cultural and social structure that has shaped it, and forms its own lexicon. Aksan (1996:9) considers the lexicon of a language as “a whole made up of not only the words, but also the idioms, communicative expressions, formulaic expressions, proverbs, terms and various sets of expressions of that language.” As there are numerous lexical items in a language, there are numerous cultural elements as well. Each unit in the lexicon provides an important link between the speaker of that language and the cultural values to which that language belongs, and strengthens the relationship between them. Formulaic expressions, or in other words, communicative expressions, are the most significant among the units that constitute the lexicon. Cultural transfer has an important role especially in teaching Turkish to foreigners. The functionality of these units is noteworthy in the transfer and the deliberate use of the cultural elements of that language. The aim of this study is to evaluate the texts in beginner-level (A1) Turkish as a foreign language course books in terms of formulaic expressions (communicative expressions). The data sources for the study are the A1 level books of the Lale and İstanbul series. Transferring the culture is quite important in teaching a language. In order to present the language along with the culture, formulaic expressions (communicative expressions) should be included frequently, particularly in the beginner-level course books.

  5. Niche as a determinant of word fate in online groups.

    Directory of Open Access Journals (Sweden)

    Eduardo G Altmann

    2011-05-01

    Full Text Available Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity.

  6. Corticospinal excitability during the processing of handwritten and typed words and non-words.

    Science.gov (United States)

    Gordon, Chelsea L; Spivey, Michael J; Balasubramaniam, Ramesh

    2017-06-09

    A number of studies have suggested that perception of actions is accompanied by motor simulation of those actions. To further explore this proposal, we applied Transcranial magnetic stimulation (TMS) to the left primary motor cortex during the observation of handwritten and typed language stimuli, including words and non-word consonant clusters. We recorded motor-evoked potentials (MEPs) from the right first dorsal interosseous (FDI) muscle to measure cortico-spinal excitability during written text perception. We observed a facilitation in MEPs for handwritten stimuli, regardless of whether the stimuli were words or non-words, suggesting potential motor simulation during observation. We did not observe a similar facilitation for the typed stimuli, suggesting that motor simulation was not occurring during observation of typed text. By demonstrating potential simulation of written language text during observation, these findings add to a growing literature suggesting that the motor system plays a strong role in the perception of written language. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Arabic word recognizer for mobile applications

    Science.gov (United States)

    Khanna, Nitin; Abdollahian, Golnaz; Brame, Ben; Boutin, Mireille; Delp, Edward J.

    2011-03-01

    When traveling in a region where the local language is not written using a "Roman alphabet," translating written text (e.g., documents, road signs, or placards) is a particularly difficult problem since the text cannot be easily entered into a translation device or searched using a dictionary. To address this problem, we are developing the "Rosetta Phone," a handheld device (e.g., PDA or mobile telephone) capable of acquiring an image of the text, locating the region (word) of interest within the image, and producing both an audio and a visual English interpretation of the text. This paper presents a system targeted for interpreting words written in Arabic script. The goal of this work is to develop an autonomous, segmentation-free Arabic phrase recognizer, with computational complexity low enough to deploy on a mobile device. A prototype of the proposed system has been deployed on an iPhone with a suitable user interface. The system was tested on a number of noisy images, in addition to the images acquired from the iPhone's camera. It identifies Arabic words or phrases by extracting appropriate features and assigning "codewords" to each word or phrase. On a dictionary of 5,000 words, the system uniquely mapped (word-image to codeword) 99.9% of the words. The system has an 82% recognition accuracy on images of words captured using the iPhone's built-in camera.

  8. MOJIBAKE – The Rehearsal of Word Fragments In Verbal Recall

    Directory of Open Access Journals (Sweden)

    Dr. Christiane Lange-Küttner

    2015-04-01

    Full Text Available Theories of verbal rehearsal usually assume that whole words are being rehearsed. However, words consist of letter sequences, or syllables, or word onset-vowel-coda, amongst many other conceptualizations of word structure. A more general term is the ‘grain size’ of word units (Ziegler & Goswami, 2005). In the current study, a new method measured the quantitative percentage of correctly remembered word structure. The number of letters in the correct letter sequence as a percentage of word length was calculated, disregarding missing or added letters. A forced rehearsal was tested by repeating each memory list four times. We tested low frequency (LF) English words versus geographical UK town names to control for content. We also tested unfamiliar international (INT) non-words and names of international (INT) European towns to control for familiarity. An immediate versus distributed repetition was tested with a between-subject design. Participants responded with word fragments in their written recall, especially when they had to remember unfamiliar words. While memory of whole words was sensitive to content, presentation distribution and individual sex and language differences, recall of word fragments was not. There was no trade-off between memory of word fragments and whole word recall during the repetition; instead, word fragments also increased significantly. Moreover, while whole word responses correlated with each other during repetition, and word fragment responses correlated with each other during repetition, these two types of word recall responses were not correlated with each other. Thus there may be a lower layer consisting of free, sparse word fragments and an upper layer that consists of language-specific, orthographically and semantically constrained words.
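
    The scoring idea in this abstract (letters recalled in the correct sequence, as a percentage of word length, ignoring missing or added letters) can be illustrated with a small sketch. It assumes that "letters in the correct letter sequence" can be approximated by the longest common subsequence between the target word and the written response; that reading, and the function names, are assumptions for illustration and are not confirmed as the study's own procedure.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def percent_correct_sequence(target: str, response: str) -> float:
    """Letters recalled in the right order, as a percentage of target length.

    Assumption: in-order letters are counted via the longest common
    subsequence, so added letters in the response do not lower the score
    and missing letters simply go uncounted.
    """
    if not target:
        raise ValueError("target word must be non-empty")
    matched = lcs_length(target.lower(), response.lower())
    return 100.0 * matched / len(target)
```

    Under this reading, a response that drops one letter of a six-letter word scores five sixths of the maximum, while a response with one extra letter still scores 100.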

  9. Effects of word width and word length on optimal character size for reading of horizontally scrolling Japanese words

    Directory of Open Access Journals (Sweden)

    Wataru Teramoto

    2016-02-01

    Full Text Available The present study investigated whether word width and length affect the optimal character size for reading of horizontally scrolling Japanese words, using reading speed as a measure. In Experiment 1, three Japanese words, each consisting of 4 Hiragana characters, sequentially scrolled on a display screen from right to left. Participants, all Japanese native speakers, were instructed to read the words aloud as accurately as possible, irrespective of their order within the sequence. To quantitatively measure their reading performance, we used a rapid serial visual presentation paradigm, where the scrolling rate was increased until the participants began to make mistakes. Thus, the highest scrolling rate at which the participants’ performance exceeded 88.9% correct rate was calculated for each character size (0.3, 0.6, 1.0, and 3.0°) and scroll window size (5 or 10 character spaces). Results showed that the reading performance was highest in the range of 0.6° to 1.0°, irrespective of the scroll window size. Experiment 2 investigated whether the optimal character size observed in Experiment 1 was applicable for any word width and word length (i.e., the number of characters in a word). Results showed that reading speeds were slower for longer than shorter words and the word width of 3.6° was optimal among the word lengths tested (3, 4, and 6 character words). Considering that character size varied depending on word width and word length in the present study, this means that the optimal character size can be changed by word width and word length.

  10. Generating descriptive visual words and visual phrases for large-scale image applications.

    Science.gov (United States)

    Zhang, Shiliang; Tian, Qi; Hua, Gang; Huang, Qingming; Gao, Wen

    2011-09-01

    Bag-of-visual Words (BoWs) representation has been applied for various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to the text words. Notwithstanding its great success and wide adoption, visual vocabulary created from single-image local descriptors is often shown to be not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to the frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed by the visual words and their combinations which are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive to certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with the text words than the classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency and memory consumption. The proposed image search re-ranking algorithm: DWPRank outperforms the state-of-the-art algorithm by 12.4% in mean average precision and about 11 times faster in efficiency.

  11. Word Recognition during Reading: The Interaction between Lexical Repetition and Frequency

    Science.gov (United States)

    Lowder, Matthew W.; Choi, Wonil; Gordon, Peter C.

    2013-01-01

    Memory studies utilizing long-term repetition priming have generally demonstrated that priming is greater for low-frequency words than for high-frequency words and that this effect persists if words intervene between the prime and the target. In contrast, word-recognition studies utilizing masked short-term repetition priming typically show that the magnitude of repetition priming does not differ as a function of word frequency and does not persist across intervening words. We conducted an eye-tracking while reading experiment to determine which of these patterns more closely resembles the relationship between frequency and repetition during the natural reading of a text. Frequency was manipulated using proper names that were high-frequency (e.g., Stephen) or low-frequency (e.g., Dominic). The critical name was later repeated in the sentence, or a new name was introduced. First-pass reading times and skipping rates on the critical name revealed robust repetition-by-frequency interactions such that the magnitude of the repetition-priming effect was greater for low-frequency names than for high-frequency names. In contrast, measures of later processing showed effects of repetition that did not depend on lexical frequency. These results are interpreted within a framework that conceptualizes eye-movement control as being influenced in different ways by lexical- and discourse-level factors. PMID:23283808

  12. Graphic Organizer in Action: Solving Secondary Mathematics Word Problems

    Directory of Open Access Journals (Sweden)

    Khoo Jia Sian

    2016-09-01

    Full Text Available Mathematics word problems are one of the most challenging topics to learn and teach in secondary schools. This is especially the case in countries where English is not the first language for the majority of the people, such as in Brunei Darussalam. Researchers proclaimed that limited language proficiency and limited Mathematics strategies are the possible causes of this problem. However, whatever the reason behind the difficulties students face in solving Mathematical word problems, it is perhaps the teaching and learning of Mathematics that need to be modified. For example, the use of a four-square-and-a-diamond graphic organizer that infuses model drawing skill and Polya’s problem solving principles to solve Mathematical word problems may be one of the strategies that can help in improving students’ word problem solving skills. This study, through quantitative analysis, found that the use of the graphic organizer improved students’ performance in terms of Mathematical knowledge, Mathematical strategy and Mathematical explanation in solving word problems. Further qualitative analysis revealed that the use of the graphic organizer boosted students’ confidence level and positive attitudes towards solving word problems. Keywords: Word Problems, Graphic Organizer, Algebra, Action Research, Secondary School Mathematics DOI: http://dx.doi.org/10.22342/jme.7.2.3546.83-90

  13. NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition.

    Science.gov (United States)

    Tsai, Richard Tzong-Han; Sung, Cheng-Lung; Dai, Hong-Jie; Hung, Hsieh-Chuan; Sung, Ting-Yi; Hsu, Wen-Lian

    2006-12-18

    Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. 
To develop our ML-based Bio-NER system, we employ conditional
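
    Two of the ideas described in this abstract, numerical normalization and the five-token context window, are simple enough to sketch. The snippet below is illustrative only: the choice of "1" as the representative numeral, the decision to collapse each run of digits into a single numeral, and both function names are assumptions, since the abstract does not specify them.

```python
import re

# Assumption: "1" stands in for every numeral; the abstract does not say
# which representative numeral the authors chose.
REPRESENTATIVE_NUMERAL = "1"


def normalize_numerals(term: str) -> str:
    """Replace each run of digits in a term with one representative numeral."""
    return re.sub(r"\d+", REPRESENTATIVE_NUMERAL, term)


def context_window(tokens, index, size=5):
    """Return the tokens in a window centred on tokens[index].

    With size=5 this is the current token plus up to two preceding and two
    following tokens, truncated at the sentence boundaries.
    """
    half = size // 2
    start = max(0, index - half)
    return tokens[start:index + half + 1]
```

    With this normalization, terms such as IL2 and IL12 map to the same surface form, which is the kind of feature sharing the abstract describes as a remedy for data sparseness.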

  14. Linking Working Memory and Long-Term Memory: A Computational Model of the Learning of New Words

    Science.gov (United States)

    Jones, Gary; Gobet, Fernand; Pine, Julian M.

    2007-01-01

    The nonword repetition (NWR) test has been shown to be a good predictor of children's vocabulary size. NWR performance has been explained using phonological working memory, which is seen as a critical component in the learning of new words. However, no detailed specification of the link between phonological working memory and long-term memory…

  15. Tracing Knowledge Transfer from Universities to Industry: A Text Mining Approach

    DEFF Research Database (Denmark)

    Woltmann, Sabrina; Alkærsig, Lars

    2017-01-01

    This paper identifies transferred knowledge between universities and the industry by proposing the use of a computational linguistic method. Current research on university-industry knowledge exchange relies often on formal databases and indicators such as patents, collaborative publications and license agreements, to assess the contribution to the socioeconomic surrounding of universities. We, on the other hand, use the texts from university abstracts to identify university knowledge and compare them with texts from firm webpages. We use these text data to identify common key words and thereby identify overlapping contents among the texts. As method we use a well-established word ranking method from the field of information retrieval, term frequency–inverse document frequency (TFIDF), to identify commonalities between texts from university. In examining the outcomes of the TFIDF statistic we find... is the first step to enable the identification of common knowledge and knowledge transfer via text mining to increase its measurability.
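
    As a rough illustration of the TFIDF ranking mentioned in this abstract, the sketch below computes a textbook variant (relative term frequency multiplied by the logarithm of inverse document frequency) and then lists the terms that carry positive weight in two documents. The exact weighting, tokenization and comparison step used in the paper are not given here, so everything in the snippet, including the function names, is an assumption.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Tokenization is a plain lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def common_terms(weights_a, weights_b):
    """Terms with a positive TFIDF weight in both documents."""
    return {t for t, w in weights_a.items() if w > 0 and weights_b.get(t, 0) > 0}
```

    A term that occurs in every document receives weight zero under this variant, so only vocabulary that distinguishes a subset of the collection can show up as shared content.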

  16. Arabic text preprocessing for the natural language processing applications

    International Nuclear Information System (INIS)

    Awajan, A.

    2007-01-01

    A new approach for processing vowelized and unvowelized Arabic texts in order to prepare them for Natural Language Processing (NLP) purposes is described. The developed approach is rule-based and made up of four phases: text tokenization, word light stemming, word morphological analysis and text annotation. The first phase preprocesses the input text in order to isolate the words and represent them in a formal way. The second phase applies a light stemmer in order to extract the stem of each word by eliminating the prefixes and suffixes. The third phase is a rule-based morphological analyzer that determines the root and the morphological pattern for each extracted stem. The last phase produces an annotated text where each word is tagged with its morphological attributes. The preprocessor presented in this paper is capable of dealing with vowelized and unvowelized words, and provides the input words along with the relevant linguistic information needed by different applications. It is designed to be used with different NLP applications such as machine translation, text summarization, text correction, information retrieval and automatic vowelization of Arabic text. (author)

  17. Word Recognition in Auditory Cortex

    Science.gov (United States)

    DeWitt, Iain D. J.

    2013-01-01

    Although spoken word recognition is more fundamental to human communication than text recognition, knowledge of word-processing in auditory cortex is comparatively impoverished. This dissertation synthesizes current models of auditory cortex, models of cortical pattern recognition, models of single-word reading, results in phonetics and results in…

  18. Semantic Change Type in Old Javanese Word and Sanskrit Loan Word to Modern Javanese

    Directory of Open Access Journals (Sweden)

    Hendy Yuniarto

    2016-12-01

    Full Text Available This research aims to classify the types of semantic change and to explain the factors causing semantic change. It was conducted with a qualitative-descriptive approach. The method compares the meaning of Old Javanese words and Sanskrit loan words with their meaning in Modern Javanese. Data were collected by looking in an Old Javanese dictionary for words whose meaning is suspected to have changed. Word meanings were determined precisely by tracing them to Old Javanese texts. Furthermore, word meanings were compared to present-day meanings through a Modern Javanese dictionary. In addition, Modern Javanese meanings were also searched for in Javanese news on internet pages. The analysis classifies the meanings of Old Javanese words and Sanskrit loan words that have undergone change in Modern Javanese. It also explains why the change in word meaning can occur. The result shows that semantic change of Old Javanese words and Sanskrit loan words to Modern Javanese can be classified into seven types: widening, narrowing, shifting, metaphor, metonymy, pejoration, and euphemism. In addition, the result shows that semantic change can occur because of several factors: psychological factors concerning emotive and taboo meanings, polysemy, the spread of religion, the growth of science and technology, socio-political development, and the need for new names.   DOI: https://doi.org/10.24071/llt.2013.160101

  19. Cultural Words in Slovenian Translations of the Works of Juan Rulfo and Carlos Fuentes

    Directory of Open Access Journals (Sweden)

    Uršula Kastelic Vukadinović

    2016-12-01

    Full Text Available The paper, based on the author's thesis entitled Kulturno besedje v slovenskih prevodih del Juana Rulfa in Carlosa Fuentesa, discusses cultural words in the Slovenian translations of the novel Pedro Páramo, some of the stories of The Plain in Flames by Juan Rulfo published in the same volume and the novel The Death of Artemio Cruz by Carlos Fuentes. In the analyzed texts there are a number of terms (also including cultural words) which denote animals, plants, dishes, clothes and geographical surroundings of Mexico that may be unknown even to Spanish-speaking readers who are not familiar with the wider Hispanic environment, and the Mexican one in particular. Cultural words have no exact equivalents in other languages and cultures; therefore, they most commonly indicate that we are reading a translated text. These are also the elements that contribute to the foreignization of the target text and show us the textual world as exotic and unknown. The translator of the studied texts, Alenka Bole Vrabec, tends to choose the transfer of the cultural words in order to retain some local colour, even when she could find equivalents in Slovenian or, at least, adapt the words to the writing in accordance with the rules of the Slovenian language. These decisions accentuate the foreignizing character of her translations.

  20. The effect of word length in short-term memory: Is rehearsal necessary?

    Science.gov (United States)

    Campoy, Guillermo

    2008-05-01

    Three experiments investigated the effect of word length on a serial recognition task when rehearsal was prevented by a high presentation rate with no delay between study and test lists. Results showed that lists of short four-phoneme words were better recognized than lists of long six-phoneme words. Moreover, this effect was equivalent to that observed in conditions in which there was a delay between lists, thereby making rehearsal possible in the interval. These findings imply that rehearsal does not play a central role in the origin of the word length effect. An alternative explanation based on differences in the degree of retroactive interference generated by long and short words is proposed.

  1. Fracture Mechanics Method for Word Embedding Generation of Neural Probabilistic Linguistic Model.

    Science.gov (United States)

    Bi, Size; Liang, Xiao; Huang, Ting-Lei

    2016-01-01

    Word embedding, a lexical vector representation generated via the neural linguistic model (NLM), is empirically demonstrated to be appropriate for improvement of the performance of traditional language model. However, the supreme dimensionality that is inherent in NLM contributes to the problems of hyperparameters and long-time training in modeling. Here, we propose a force-directed method to improve such problems for simplifying the generation of word embedding. In this framework, each word is assumed as a point in the real world; thus it can approximately simulate the physical movement following certain mechanics. To simulate the variation of meaning in phrases, we use the fracture mechanics to do the formation and breakdown of meaning combined by a 2-gram word group. With the experiments on the natural linguistic tasks of part-of-speech tagging, named entity recognition and semantic role labeling, the result demonstrated that the 2-dimensional word embedding can rival the word embeddings generated by classic NLMs, in terms of accuracy, recall, and text visualization.

  2. Mining protein function from text using term-based support vector machines

    Science.gov (United States)

    Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

    2005-01-01

    Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835

  3. A scalable machine-learning approach to recognize chemical names within large text databases

    Directory of Open Access Journals (Sweden)

    Wren Jonathan D

    2006-09-01

    Full Text Available Abstract Motivation The use or study of chemical compounds permeates almost every scientific field and in each of them, the amount of textual information is growing rapidly. There is a need to accurately identify chemical names within text for a number of informatics efforts such as database curation, report summarization, tagging of named entities and keywords, or the development/curation of reference databases. Results A first-order Markov Model (MM) was evaluated for its ability to distinguish chemical names from words, yielding ~93% recall in recognizing chemical terms and ~99% precision in rejecting non-chemical terms on smaller test sets. However, because total false-positive events increase with the number of words analyzed, the scalability of name recognition was measured by processing 13.1 million MEDLINE records. The method yielded precision ranges from 54.7% to 100%, depending upon the cutoff score used, averaging 82.7% for approximately 1.05 million putative chemical terms extracted. Extracted chemical terms were analyzed to estimate the number of spelling variants per term, which correlated with the total number of times the chemical name appeared in MEDLINE. This variability in term construction was found to affect both information retrieval and term mapping when using PubMed and Ovid.
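
    To make the idea of scoring a token with a first-order Markov model more concrete, here is a minimal sketch. It assumes a character-level model in which each letter depends only on the previous one, trains one model on chemical names and one on ordinary words, and scores a token by the difference of the two log-likelihoods, which can then be compared against a cutoff. The class name, the add-one smoothing, the alphabet size and the tiny training lists are all illustrative assumptions and are not taken from the paper.

```python
import math
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary markers


class CharMarkov:
    """First-order character model: P(next letter | current letter)."""

    def __init__(self, words, alphabet_size=30):
        # alphabet_size is an assumed constant used for add-one smoothing.
        self.alphabet_size = alphabet_size
        self.counts = defaultdict(lambda: defaultdict(int))
        for word in words:
            seq = START + word.lower() + END
            for a, b in zip(seq, seq[1:]):
                self.counts[a][b] += 1

    def log_prob(self, word):
        """Smoothed log-probability of the word's letter transitions."""
        seq = START + word.lower() + END
        total = 0.0
        for a, b in zip(seq, seq[1:]):
            row = self.counts.get(a, {})
            numerator = row.get(b, 0) + 1
            denominator = sum(row.values()) + self.alphabet_size
            total += math.log(numerator / denominator)
        return total


def chemical_score(word, chem_model, word_model):
    """Positive when the chemical-name model explains the token better."""
    return chem_model.log_prob(word) - word_model.log_prob(word)


# Toy training lists, for illustration only.
chem = CharMarkov(["ethanol", "methanol", "propanol", "butanol", "benzene", "toluene"])
plain = CharMarkov(["table", "window", "reading", "letter", "garden", "people"])
```

    Raising the cutoff applied to `chemical_score` would trade recall for precision, which matches the abstract's remark that precision depended on the cutoff score used.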

  4. Combine CRF and MMSEG to Boost Chinese Word Segmentation in Social Media

    OpenAIRE

    Yushi, Yao; Zheng, Huang

    2015-01-01

    In this paper, we propose a joint algorithm for word segmentation on Chinese social media. Previous work mainly focuses on word segmentation for plain Chinese text. To develop a Chinese social media processing tool, we need to take the main features of social media into account: its grammatical structure is not rigorous, and the tendency to use colloquial and Internet terms makes it hard for existing Chinese-processing tools to perform well on social media. In ou...

  5. Divineness regarding the words of the Holy Qur'an

    Directory of Open Access Journals (Sweden)

    Sheibani Muhammad

    2014-01-01

    Full Text Available One of the many questions concerning the words of the Holy Qur'an is whether their content and meaning were truly a divine revelation or they were revealed to the Prophet from God and then transferred into the form of words. In this regard, there are two perspectives. First, as all Muslims believe, the words of the Qur'an are the result of a divine revelation, whereas the second viewpoint is that the words of the Qur'an are written by a human and not God. According to this latter perspective, the words of the Qur'an are sayings of the Prophet of which only the contents are based on a divine revelation. The theory of the words of the Qur'an not being a divine revelation has been an abandoned and rejected one throughout the Islamic history. This is the reason it has not been the subject of any pertinent discussions. How can the words of the Qur'an be created by Muhammad or Gabriel even though it is believed that the Qur'an is a miracle? This article first defines the concept of revelation and then analyzes various viewpoints and opinions regarding this topic in order to conclude (with evidence) that the Qur'an is the word of God and not the word of the Prophet. If he had composed the words of the Qur'an and expressed the meaning of the revelation in his own words, then the Qur'an would not be the word of God. In this case, the term 'word of God' indicates that the concept of the 'word' should be considered. Thus, it is clear that the words of the Qur'an are divine and they can be referred and attributed to God.

  6. AARP Word 2010 for dummies

    CERN Document Server

    Gookin, Dan

    2011-01-01

    It's a whole new Word - make the most of it! Here's exactly what you need to know to get going with Word 2010. From firing up Word, using the spell checker, and working with templates to formatting documents, adding images, and saving your stuff, you'll get the first and last word on Word 2010 with this fun and easy mini guide. So get ready to channel your inner writer and start creating Word files that wow! Open the book and find: Tips for navigating Word with the keyboard and mouse; Advice on using the Ribbon; How to edit text and undo mistakes; Things to know

  7. Pre-activation negativity (PrAN) in brain potentials to unfolding words

    Directory of Open Access Journals (Sweden)

    Pelle Söderström

    2016-10-01

    Full Text Available We describe an ERP effect termed the ‘pre-activation negativity’ (PrAN), which is proposed to index the degree of pre-activation of upcoming word-internal morphemes in speech processing. Using lexical competition measures based on word-initial speech fragments (WIFs), as well as statistical analyses of ERP data from three experiments, it is shown that the PrAN is sensitive to lexical competition and that it reflects the degree of predictive certainty: the negativity is larger when there are fewer upcoming lexical competitors.

  8. The Place of the Proclamation of the Word of God

    Directory of Open Access Journals (Sweden)

    Jarosław Superson

    2016-12-01

    Full Text Available This article presents a chronological evolution of the place of the proclamation of the Word of God. On the basis of pericopes from the Old Testament, it indicates the places that God chose to speak with man, and then the places chosen by Jesus Christ and by the Church in its early centuries. Use of the term “ambo” (ἄμβων), which appeared in the Church probably at the end of the fourth century, became widespread, and over time it was adopted as the name of the place for the Liturgy of the Word and for the book.

  9. Creating a medical dictionary using word alignment: The influence of sources and resources

    Directory of Open Access Journals (Sweden)

    Åhlfeldt Hans

    2007-11-01

    Full Text Available Abstract Background Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality. Methods We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which part of the terminology systems that we used to generate the resources, which parts that we word aligned and which types of resources we used in the alignment process to explore the influence the different terminology systems and resources have on the recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. Results The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. 
The verified English

  10. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Directory of Open Access Journals (Sweden)

    Znikina Ludmila

    2017-01-01

    Full Text Available The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  11. Words, Words, Words: English, Vocabulary.

    Science.gov (United States)

    Lamb, Barbara

    The Quinmester course on words gives the student the opportunity to increase his proficiency by investigating word origins, word histories, morphology, and phonology. The course includes the following: dictionary skills and familiarity with the "Oxford," "Webster's Third," and "American Heritage" dictionaries; word…

  12. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word

  13. Using ontology network structure in text mining.

    Science.gov (United States)

    Berndt, Donald J; McCart, James A; Luther, Stephen L

    2010-11-13

    Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
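
    The idea of computing a graph measure of term importance over an ontology can be sketched with a small power-iteration PageRank in Python. This is only an illustration under assumptions: the toy "is-a" ontology, the damping factor, the iteration count, and the `weight_terms` helper (one hypothetical way of injecting the measure into bag-of-words counts) are not taken from the record above.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [linked nodes]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def weight_terms(term_counts, rank):
    """Hypothetical injection step: scale term frequencies by importance."""
    return {t: c * rank.get(t, 0.0) for t, c in term_counts.items()}


# Toy ontology with child -> parent ("is-a") edges, for illustration only.
ONTOLOGY = {
    "cigarette": ["tobacco product"],
    "cigar": ["tobacco product"],
    "tobacco product": ["substance"],
    "nicotine": ["substance"],
    "substance": [],
}

importance = pagerank(ONTOLOGY)
weighted = weight_terms({"cigarette": 3, "nicotine": 1}, importance)
```

    In this toy graph the more general concepts accumulate rank from their descendants, so a term such as "substance" ends up with a higher importance than a leaf such as "cigarette"; whether that direction of edges suits a given ontology is a design decision left open here.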

  14. Analysis Of Aspects Of Messages Hiding In Text Environments

    Directory of Open Access Journals (Sweden)

    Afanasyeva Olesya

    2015-09-01

    Full Text Available This work examines the problems that arise when hiding messages in text environments transmitted over electronic communication channels and the Internet. It analyzes the selection of places in a text environment (TE) where a word can be replaced by a word from the message. Words in the text environment are selected and replaced on the basis of a semantic analysis of the text fragment consisting of the inserted word and its surroundings in the TE. This analysis uses the concepts of semantic parameters of word coordination and the semantic value of an individual word, and the values of these parameters are determined by well-known methods. This makes it possible to move from a qualitative to a quantitative analysis of the semantics of text fragments as they are modified by word substitution. The invisibility of embedded messages is ensured by keeping the deviations of the semantic cooperation parameter within preset values.

  15. Prediction of Learning and Comprehension when Adolescents Read Multiple Texts: The Roles of Word-Level Processing, Strategic Approach, and Reading Motivation

    Science.gov (United States)

    Braten, Ivar; Ferguson, Leila E.; Anmarkrud, Oistein; Stromso, Helge I.

    2013-01-01

    Sixty-five Norwegian 10th graders used the software Read&Answer 2.0 (Vidal-Abarca et al., 2011) to read five different texts presenting conflicting views on the controversial scientific issue of sun exposure and health. Participants were administered a multiple-choice topic-knowledge measure before and after reading, a word recognition task,…

  16. [Representation of letter position in visual word recognition process].

    Science.gov (United States)

    Makioka, S

    1994-08-01

    Two experiments investigated the representation of letter position in the visual word recognition process. In Experiment 1, subjects (12 undergraduates and graduates) were asked to detect a target word in a briefly-presented probe. Probes consisted of two kanji words. The letters which formed targets (critical letters) were always contained in probes. (e.g. target: [symbol: see text] probe: [symbol: see text]) A high false alarm rate was observed when critical letters occupied the same within-word relative position (left or right within the word) in the probe words as in the target word. In Experiment 2 (subjects were ten undergraduates and graduates), spaces adjacent to probe words were replaced by randomly chosen hiragana letters (e.g. [symbol: see text]), because spaces are not used to separate words in regular Japanese sentences. In addition to the effect of within-word relative position as in Experiment 1, the effect of between-word relative position (left or right across the probe words) was observed. These results suggest that information about within-word relative position of a letter is used in the word recognition process. The effect of within-word relative position was explained by a connectionist model of word recognition.

  17. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Science.gov (United States)

    Znikina, Ludmila; Rozhneva, Elena

    2017-11-01

    The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  18. Shakespeare and the Words of Early Modern Physic: Between Academic and Popular Medicine. A Lexicographical Approach to the Plays

    Directory of Open Access Journals (Sweden)

    Roberta Mullini

    2013-03-01

    Full Text Available The article aims at showing how Shakespeare relied on the medical vocabulary shared by his coeval society, which had, for centuries, been witnessing the continuous process of vernacularization of ancient and medieval scientific texts. After outlining the state of early modern medicine, the author presents and discusses the results of her search for relevant medical terms in nine plays by Shakespeare. In order to do this, a wide range of medical treatises has been analysed (either directly or through specific corpora such as Medieval English Medical Texts, MEMT 2005, and Early Modern English Medical Texts, EMEMT 2010), so as to verify the ancestry or the novelty of Shakespearean medical words. In addition to this, the author has also built a corpus of word types derived from seventeenth-century quack doctors’ handbills, with the purpose of creating a word list of medical terms connected to popular rather than university medicine, comparable with the list drawn out of the Shakespearean plays. The results most stressed in the article concern Shakespeare’s use of medical terminology already well known to his contemporary society (thus confuting the Oxfordian thesis about the impossibility for William Shakespeare the actor to master so many medical words) and the playwright’s skill in transforming – rather than inventing – old popular terms. The article is accompanied by five tables that collect the results of the various lexicographical searches.

  19. Scientific word, Version 1.0

    Directory of Open Access Journals (Sweden)

    Semen Köksal

    1993-01-01

    Full Text Available Scientific Word is the first fully integrated mathematical word processor in the Windows 3.1 environment that uses the TeX typesetting language for output. It runs as a Microsoft Windows application program and has a two-way interface to TeX. Scientific Word is an object-oriented WYSIWYG word processor for virtually all users who need to typeset scientific books, manuals and papers. It includes automatic equation numbering, spell checking, and a LaTeX and DVI previewer.

  20. Newly-acquired words are more phonologically robust in verbal short-term memory when they have associated semantic representations.

    Science.gov (United States)

    Savill, Nicola; Ellis, Andrew W; Jefferies, Elizabeth

    2017-04-01

    Verbal short-term memory (STM) is a crucial cognitive function central to language learning, comprehension and reasoning, yet the processes that underlie this capacity are not fully understood. In particular, although STM primarily draws on a phonological code, interactions between long-term phonological and semantic representations might help to stabilise the phonological trace for words ("semantic binding hypothesis"). This idea was first proposed to explain the frequent phoneme recombination errors made by patients with semantic dementia when recalling words that are no longer fully understood. However, converging evidence in support of semantic binding is scant: it is unusual for studies of healthy participants to examine serial recall at the phoneme level and also it is difficult to separate the contribution of phonological-lexical knowledge from effects of word meaning. We used a new method to disentangle these influences in healthy individuals by training new 'words' with or without associated semantic information. We examined phonological coherence in immediate serial recall (ISR), both immediately and the day after training. Trained items were more likely to be recalled than novel nonwords, confirming the importance of phonological-lexical knowledge, and items with semantic associations were also produced more accurately than those with no meaning, at both time points. For semantically-trained items, there were fewer phoneme ordering and identity errors, and consequently more complete target items were produced in both correct and incorrect list positions. These data show that lexical-semantic knowledge improves the robustness of verbal STM at the sub-item level, even when the effect of phonological familiarity is taken into account. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Lexical Information in Memory for Text.

    Science.gov (United States)

    Hayes-Roth, Barbara

    Cued-recall and two-alternative, forced-choice recognition measures were used to evaluate subjects' retention of the specific wordings of studied texts. Results obtained after 10-minute and 24 hour retention intervals suggest that the studied wordings of texts are functional components of their memory representations. Theories that assume…

  2. AUTOMATIC RETRIEVAL AND THE FORMALIZATION OF MULTI WORDS EXPRESSIONS WITH F-WORDS IN THE CORPUS OF CONTEMPORARY AMERICAN ENGLISH

    Directory of Open Access Journals (Sweden)

    Prihantoro Prihantoro

    2016-01-01

    Full Text Available The research problems in this study are (1) how lexicogrammar plays a role in determining the polarity of F-words and (2) how to formalize it for corpus processing. The data are obtained from the Corpus of Contemporary American English (COCA). In this corpus, the F-word is shown to be highest in frequency as compared to its distribution across corpora. Corpus methodology is applied by sending queries to the COCA interface to retrieve F-words. The combinations of tokens surrounding F-words yield the phrase and clause units accompanying them, which are significant cues for determining F-word polarity. The polarity is later shown to be not necessarily negative. I also designed a computational resource to allow the retrieval of F-words offline so that users might apply it to any digital text collections.

  3. Stemming of Slovenian library science texts

    Directory of Open Access Journals (Sweden)

    Polona Vilar

    2002-01-01

    Full Text Available The theme of the article is the preparation of a stemming algorithm for Slovenian library science texts. The procedure consisted of three phases: learning, testing and evaluation. The preparation of the optimal stemmer for Slovenian texts from the field of library science is presented, its testing and comparison with two other stemmers for the Slovenian language: the Popovič stemmer and the Generic stemmer. A corpus of 790,000 words from the field of library science was used for learning. Lists of stems, word endings and stop-words were built. In the testing phase, the component parts of the algorithm were tested on an additional corpus of 167,000 words. In the evaluation phase, a comparison of the three stemmers processing the same word corpus was made. The results of each stemmer were compared with an intellectually prepared control result of the stemming of the corpus. It consisted of groups of semantically connected words with no errors. Understemming was especially monitored – the number of stems for semantically connected words, produced by an algorithm. The results were statistically processed with the Kruskal-Wallis test. The Optimal stemmer produced the best results. It matched best with the reference results and also gave the smallest number of stems for one semantic meaning. The Popovič stemmer followed closely. The Generic stemmer proved to be the least accurate. The procedures described in the thesis can represent a platform for the development of the tools for automatic indexing and retrieval for library science texts in Slovenian language.
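
    A suffix-stripping stemmer driven by lists of word endings and stop-words, together with the understemming measure mentioned above (the number of distinct stems produced for a group of semantically connected words), might be sketched in Python as follows. The ending list, the stop-word list, the minimum stem length, and the function names are toy assumptions for illustration; they are not the lists or the algorithm built in the study.

```python
# Toy lists, for illustration only; longest endings are tried first.
ENDINGS = sorted(["ami", "ih", "am", "a", "e", "i", "o", "u"], key=len, reverse=True)
STOP_WORDS = {"in", "je", "za"}
MIN_STEM = 3  # assumed minimum number of characters left after stripping


def stem(word):
    """Strip the longest matching ending; return None for stop-words."""
    word = word.lower()
    if word in STOP_WORDS:
        return None
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) - len(ending) >= MIN_STEM:
            return word[: -len(ending)]
    return word


def stems_per_group(groups):
    """Average number of distinct stems per semantic group (1.0 is ideal)."""
    counts = [len({stem(w) for w in group} - {None}) for group in groups]
    return sum(counts) / len(counts)


# Two hand-made groups of related word forms (a stand-in for a control result).
GROUPS = [
    ["knjiga", "knjige", "knjigi", "knjigo", "knjigami"],
    ["knjižnica", "knjižnice", "knjižnicah"],
]
```

    With these toy lists the first group collapses to a single stem, while "knjižnicah" keeps its own stem because the ending "ah" is missing from the list, so `stems_per_group(GROUPS)` rises above 1.0. That is the kind of understemming signal that a comparison of stemmers against a reference grouping could track.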

  4. BioWord: A sequence manipulation suite for Microsoft Word

    Directory of Open Access Journals (Sweden)

    Anzaldi Laura J

    2012-06-01

    Full Text Available Abstract Background The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms.

  5. Long-term temporal tracking of speech rate affects spoken-word recognition.

    Science.gov (United States)

    Baese-Berk, Melissa M; Heffner, Christopher C; Dilley, Laura C; Pitt, Mark A; Morrill, Tuuli H; McAuley, J Devin

    2014-08-01

    Humans unconsciously track a wide array of distributional characteristics in their sensory environment. Recent research in spoken-language processing has demonstrated that the speech rate surrounding a target region within an utterance influences which words, and how many words, listeners hear later in that utterance. On the basis of hypotheses that listeners track timing information in speech over long timescales, we investigated the possibility that the perception of words is sensitive to speech rate over such a timescale (e.g., an extended conversation). Results demonstrated that listeners tracked variation in the overall pace of speech over an extended duration (analogous to that of a conversation that listeners might have outside the lab) and that this global speech rate influenced which words listeners reported hearing. The effects of speech rate became stronger over time. Our findings are consistent with the hypothesis that neural entrainment by speech occurs on multiple timescales, some lasting more than an hour. © The Author(s) 2014.

  6. PEDANT: Parallel Texts in Göteborg

    Directory of Open Access Journals (Sweden)

    Daniel Ridings

    2012-09-01

    Full Text Available The article presents the status of the PEDANT project with parallel corpora at the Language Bank at Göteborg University. The solutions for access to the corpus data are presented. Access is provided by way of the internet and standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is also outlined. The methods allow working with everything from plain text to texts densely encoded with linguistic information.

  7. Preschool Children’s Memory for Word Forms Remains Stable Over Several Days, but Gradually Decreases after Six Months

    Directory of Open Access Journals (Sweden)

    Katherine Ruth Gordon

    2016-09-01

    Full Text Available Research on word learning has focused on children’s ability to identify a target object when given the word form after a minimal number of exposures to novel word-object pairings. However, relatively little research has focused on children’s ability to retrieve the word form when given the target object. The exceptions involve asking children to recall and produce forms, and children typically perform near floor on these measures. In the current study, 3- to 5-year-old children were administered a novel test of word form that allowed for recognition memory and manual responses. Specifically, when asked to label a previously trained object, children were given three forms to choose from: the target, a minimally different form, and a maximally different form. Children demonstrated memory for word forms at three post-training delays: 10 minutes (short-term), 2 to 3 days (long-term), and 6 months to 1 year (very long-term). However, children performed worse at the very long-term delay than the other time points, and the length of the very long-term delay was negatively related to performance. When in error, children were no more likely to select the minimally different form than the maximally different form at all time points. Overall, these results suggest that children remember word forms that are linked to objects over extended post-training intervals, but that their memory for the forms gradually decreases over time without further exposures. Furthermore, memory traces for word forms do not become less phonologically specific over time; rather children either identify the correct form, or they perform at chance.

  8. New Computer Terms in Bloggers’ Language

    Directory of Open Access Journals (Sweden)

    Vilija Celiešienė

    2012-06-01

    Full Text Available The article presents an analysis of new words in computer terminology that make their way to blogs and analyzes how the official neologisms and computer terms, especially the equivalents to barbarisms, are employed in everyday use. The article also discusses the ways of including the new computer terms into texts. The blogs on topics of information technology are the objects of the research. The analysis of the aforementioned blogs allowed highlighting certain trends in the use of new computer terms. An observation was made that even though the authors of the blogs could freely choose their writing style, they were not bound by the standards of literary language. Thus, their language was full of non-standard vocabulary; however, self-control regarding the language used could still be noticed. An interest in novelties of computer terminology and the tendency to accept some of the suggested new Lithuanian and loaned computer terms were noticed. When using the new words the bloggers frequently employed specific graphical elements and/or comments. The graphical elements were often chosen by bloggers to express their feelings of doubt regarding the suitability of the use of the suggested loanword. Attempting to explain the meaning of the new word to the readers the bloggers tended to post comments about the new computer terms.

  9. Word of Mouth Marketing in Mouth and Dental Health Centers towards Consumers

    Directory of Open Access Journals (Sweden)

    Aykut Ekiyor

    2014-09-01

    Full Text Available Influencing the shopping style of others by passing on experiences of goods purchased or services received is a behavior with deep roots in history. The main objective of this research is to analyze the effects of demographic factors, within the scope of word of mouth marketing, on the choice of mouth and dental health services. Consumers receiving service from mouth and dental health centers of the Turkish Republic Ministry of Health constitute the environment of the research. The research was conducted in order to determine consumers' selection of mouth and dental health centers within the scope of word of mouth marketing. The research has been conducted in Ankara through simple random sampling. The sample size has been determined as 400. For the third hypothesis of the study, which concerns word of mouth marketing, analysis of the statistical relationship between mouth and dental health center preference and demographic factor groups showed a meaningful difference in terms of age, level of education, level of income and some dimensions of marital status, and no meaningful difference in terms of gender. It has been attempted to determine the importance of word of mouth marketing in healthcare services

  10. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Based on vertical projection profiles and structural features of Oriya characters, text lines are segmented into words. For character segmentation, at first, the isolated and connected (touching) characters in a word are detected. Using structural, topological and water reservoir concept-based features, characters of the word ...
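
    Word segmentation from a vertical projection profile can be illustrated with a short Python sketch: count the ink pixels in each column of a binarized text line and split wherever a run of empty columns is wide enough. The sketch covers only the projection step and assumes a binary image given as lists of 0/1 values and a hypothetical `min_gap` threshold; the structural features of Oriya characters that the record also relies on are not modeled.

```python
def vertical_projection(line_image):
    """Number of ink pixels in each column of a binary text-line image."""
    return [sum(column) for column in zip(*line_image)]


def segment_words(line_image, min_gap=2):
    """Return (start, end) column ranges of words, with end exclusive.

    A run of at least `min_gap` empty columns is treated as a word gap;
    narrower runs are assumed to be spacing inside a word.
    """
    profile = vertical_projection(line_image)
    words, start, gap = [], None, 0
    for x, ink in enumerate(profile):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                words.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        words.append((start, len(profile) - gap))
    return words


# Tiny hand-made binary "text line" (1 = ink), for illustration only.
LINE = [
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0],
]
```

    Here the single empty column at index 2 is kept inside the first word, while the three empty columns at indices 5 to 7 separate the two words. On real handwriting the gap threshold would presumably have to be estimated from the data, for example from the distribution of gap widths in the line.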

  11. The effects of visual crowding, text size, and positional uncertainty on text legibility at a glance.

    Science.gov (United States)

    Dobres, Jonathan; Wolfe, Benjamin; Chahine, Nadine; Reimer, Bryan

    2018-07-01

    Reading at a glance, once a relatively infrequent mode of reading, is becoming common. Mobile interaction paradigms increasingly dominate the way in which users obtain information about the world, which often requires reading at a glance, whether from a smartphone, wearable device, or in-vehicle interface. Recent research in these areas has shown that a number of factors can affect text legibility when words are briefly presented in isolation. Here we expand upon this work by examining how legibility is affected by more crowded presentations. Word arrays were combined with a lexical decision task, in which the size of the text elements and the inter-line spacing (leading) between individual items were manipulated to gauge their relative impacts on text legibility. In addition, a single-word presentation condition that randomized the location of presentation was compared with previous work that held position constant. Results show that larger text was more legible than smaller text. Wider leading significantly enhanced legibility as well, but contrary to expectations, wider leading did not fully counteract decrements in legibility at smaller text sizes. Single-word stimuli presented with random positioning were more difficult to read than stationary counterparts from earlier studies. Finally, crowded displays required much greater processing time compared to single-word displays. These results have implications for modern interface design, which often present interactions in the form of scrollable and/or selectable lists. The present findings are of practical interest to the wide community of graphic designers and interface engineers responsible for developing our interfaces of daily use. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Monolingual accounting dictionaries for EFL text production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2006-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  13. Monolingual Accounting Dictionaries for EFL Text Production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2009-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  14. The Effect of Sleep on Children’s Word Retention and Generalization

    Directory of Open Access Journals (Sweden)

    Emma L. Axelsson

    2016-08-01

    Full Text Available In the first few years of life children spend a good proportion of time sleeping as well as acquiring the meanings of hundreds of words and their related associations. There is now ample evidence of the effects of sleep on memory in adults and the number of studies demonstrating the effects of napping and nocturnal sleep in children is also mounting. In particular, sleep appears to benefit children’s memory for recently-encountered novel words. The effect of sleep on children’s generalization of novel words across multiple items, however, is less clear. Given that sleep is polyphasic in the early years, made up of multiple episodes, and children’s word learning is gradual and strengthened slowly over time, it is highly plausible that sleep is a strong candidate in supporting children’s memory for novel words. Importantly, it appears that when children sleep shortly after exposure to novel word-object pairs retention is better than if sleep is delayed, suggesting that napping plays a vital role in long-term word retention for young children. Word learning is a complex, challenging and important part of development, thus the role that sleep plays in children’s retention of novel words is worthy of attention. As such, ensuring children get sufficient good quality sleep and regular opportunities to nap may be critical for strong language acquisition.

  15. Whole-Word Phonological Representations of Disyllabic Words in the Chinese Lexicon: Data From Acquired Dyslexia

    Directory of Open Access Journals (Sweden)

    Sam-Po Law

    2005-01-01

    Full Text Available This study addresses the issue of the existence of whole-word phonological representations of disyllabic and multisyllabic words in the Chinese mental lexicon. A Cantonese brain-injured dyslexic individual with semantic deficits, YKM, was assessed on his abilities to read aloud and to comprehend disyllabic words containing homographic heterophonous characters, the pronunciation of which can only be disambiguated in word context. Superior performance on reading to comprehension was found. YKM could produce the target phonological forms without understanding the words. The dissociation is taken as evidence for whole-word representations for these words at the phonological level. The claim is consistent with previous account for discrepancy of the frequencies of tonal errors between reading aloud and object naming in Cantonese reported of another case study of similar deficits. Theoretical arguments for whole-word form representations for all multisyllabic Chinese words are also discussed.

  16. Decorporation: officially a word.

    Science.gov (United States)

    Fisher, D R

    2000-05-01

    This note is the brief history of a word. Decorporation is a scientific term known to health physicists who have an interest in the removal of internally deposited radionuclides from the body after an accidental or inadvertent intake. Although the word decorporation appears many times in the radiation protection literature, it was only recently accepted by the editors of the Oxford English Dictionary as an entry for their latest edition.

  17. Decorporation: Officially a word

    International Nuclear Information System (INIS)

    Fisher, D.R.

    2000-01-01

    This note is the brief history of a word. Decorporation is a scientific term known to health physicists who have an interest in the removal of internally deposited radionuclides from the body after an accidental or inadvertent intake. Although the word decorporation appears many times in the radiation protection literature, it was only recently accepted by the editors of the Oxford English Dictionary as an entry for their latest edition

  18. Decorporation: Officially a word

    Energy Technology Data Exchange (ETDEWEB)

    Fisher, D.R.

    2000-05-01

    This note is the brief history of a word. Decorporation is a scientific term known to health physicists who have an interest in the removal of internally deposited radionuclides from the body after an accidental or inadvertent intake. Although the word decorporation appears many times in the radiation protection literature, it was only recently accepted by the editors of the Oxford English Dictionary as an entry for their latest edition.

  19. Decorporation: Officially a word

    International Nuclear Information System (INIS)

    Fisher, Darrell R.

    1999-01-01

    This note is the brief history of a word. Decorporation is a scientific term known to health physicists who have an interest in the removal of internally deposited radionuclides from the body after an accidental or inadvertent intake. Although the word decorporation appears many times in the radiation protection literature, it was only recently accepted by the editors of the Oxford English Dictionary as an entry for their latest edition

  20. Word length, set size, and lexical factors: Re-examining what causes the word length effect.

    Science.gov (United States)

    Guitard, Dominic; Gabel, Andrew J; Saint-Aubin, Jean; Surprenant, Aimée M; Neath, Ian

    2018-04-19

    The word length effect, better recall of lists of short (fewer syllables) than long (more syllables) words has been termed a benchmark effect of working memory. Despite this, experiments on the word length effect can yield quite different results depending on set size and stimulus properties. Seven experiments are reported that address these 2 issues. Experiment 1 replicated the finding of a preserved word length effect under concurrent articulation for large stimulus sets, which contrasts with the abolition of the word length effect by concurrent articulation for small stimulus sets. Experiment 2, however, demonstrated that when the short and long words are equated on more dimensions, concurrent articulation abolishes the word length effect for large stimulus sets. Experiment 3 shows a standard word length effect when output time is equated, but Experiments 4-6 show no word length effect when short and long words are equated on increasingly more dimensions that previous demonstrations have overlooked. Finally, Experiment 7 compared recall of a small and large neighborhood words that were equated on all the dimensions used in Experiment 6 (except for those directly related to neighborhood size) and a neighborhood size effect was still observed. We conclude that lexical factors, rather than word length per se, are better predictors of when the word length effect will occur. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  1. Powerful Vocabulary Acquisition through Texts Comparison

    Directory of Open Access Journals (Sweden)

    Mohammad Reza Hasannejad

    2015-03-01

    Full Text Available This study aimed to investigate whether dual version reading comprehension had a positive effect on intermediate EFL students’ general vocabulary acquisition, receptive and productive knowledge of vocabulary and students’ synonymous power of words. Two groups were selected: the experimental group and the control group. The study included (1) four pretests, (2) the dual version reading comprehension, and (3) four posttests. It was found that there was no significant difference between the two groups of students on the pretests. However, there was a significant difference between the two groups of students on the posttests. Overall, dual version reading comprehension vocabulary learning made the experimental group learners outperform the control group in terms of their performance on four types of vocabulary tests. This indicates that students following dual version reading comprehension were more successful in vocabulary acquisition, in developing their receptive knowledge of vocabulary, in transferring their receptive knowledge into productive knowledge and in enhancing the memorization of synonymous words.

  2. Text-based language identification of multilingual names

    CSIR Research Space (South Africa)

    Giwa, O

    2015-11-01

    Full Text Available Text-based language identification (T-LID) of isolated words has been shown to be useful for various speech processing tasks, including pronunciation modelling and data categorisation. When the words to be categorised are proper names, the task...

  3. The Effect of Known-and-Unknown Word Combinations on Intentional Vocabulary Learning

    Science.gov (United States)

    Kasahara, Kiwamu

    2011-01-01

    The purpose of this study is to examine whether learning a known-and-unknown word combination is superior in terms of retention and retrieval of meaning to learning a single unknown word. The term "combination" in this study means a two-word collocation of a familiar word and a word that is new to the participants. Following the results of…

  4. Word form Encoding in Chinese Word Naming and Word Typing

    Science.gov (United States)

    Chen, Jenn-Yeu; Li, Cheng-Yi

    2011-01-01

    The process of word form encoding was investigated in primed word naming and word typing with Chinese monosyllabic words. The target words shared or did not share the onset consonants with the prime words. The stimulus onset asynchrony (SOA) was 100 ms or 300 ms. Typing required the participants to enter the phonetic letters of the target word,…

  5. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…

  6. Learning to Read Words: Theory, Findings, and Issues

    Science.gov (United States)

    Ehri, Linnea C.

    2005-01-01

    Reading words may take several forms. Readers may utilize decoding, analogizing, or predicting to read unfamiliar words. Readers read familiar words by accessing them in memory, called sight word reading. With practice, all words come to be read automatically by sight, which is the most efficient, unobtrusive way to read words in text. The process…

  7. Wine and Words: A Trilingual Wine Dictionary for South Africa

    Directory of Open Access Journals (Sweden)

    Michelle F. van der Merwe

    2011-10-01

    Full Text Available

    Abstract: The South African wine industry identified the need for a special-field on-line dictionary on viticulture and oenology in Afrikaans, English and isi-Xhosa. The dictionary provides information on wine terminology as well as linguistic information on the use of such terminology. The purpose of this article is to give a description of the project. The process of compiling the dictionary is described, from the co-operation between the wine industry and lexicographers to the intended target users and the choice of languages of the dictionary. Functions of the dictionary are discussed, with reference to specific user situations, namely text production, text reception and translation. A system of labels has been designed for the dictionary and its benefit for the user is explained. In assisting the user to make an informed choice of a term, the notion of proscriptiveness has been followed in the presentation of information in the wine dictionary.

    Keywords: TRILINGUAL WINE DICTIONARY, SPECIALISED LEXICOGRAPHY, VITICULTURE AND OENOLOGY TERMS, ON-LINE DICTIONARY, TARGET USERS, USER SITUATIONS, FUNCTIONS, TEXT RECEPTION, TEXT PRODUCTION, TRANSLATION, LABELS, ENCYCLOPEDIC KNOWLEDGE, LINGUISTIC KNOWLEDGE, PROSCRIPTION

    Opsomming: Wyn en woorde: 'n Drietalige Wynwoordeboek vir Suid-Afrika. Die Suid-Afrikaanse wynbedryf het die behoefte aan 'n aanlynvakwoordeboek oor wynenwingerdkunde in Afrikaans, Engels en isiXhosa geïdentifiseer. Die woordeboek verskaf inligting oor wynterminologie, sowel as taalkundige inligting oor die gebruik van sulke terminologie. Die doel van hierdie artikel is om 'n beskrywing van die projek te gee. Die samestellingsproses van die woordeboek word beskryf, vanaf die samewerking tussen die wynbedryf en die leksikograwe, tot die voorgestelde teikengebruikers en die keuse van die tale van die woordeboek. Funksies van die woordeboek word bespreek, met verwysing na spesifieke gebruikersituasies, naamlik teksproduksie

  8. ARABIC TEXT CLASSIFICATION USING NEW STEMMER FOR FEATURE SELECTION AND DECISION TREES

    Directory of Open Access Journals (Sweden)

    SAID BAHASSINE

    2017-06-01

    Full Text Available Text classification is the process of assigning unclassified text to appropriate classes based on its content. The most prevalent representation for text classification is the bag-of-words vector. In this representation, the words that appear in documents often have multiple morphological structures and grammatical forms. In most cases, these morphological variants of a word belong to the same category. In the first part of this paper, a new stemming algorithm was developed in which each term of a given document is represented by its root. In the second part, a comparative study is conducted of the impact of two stemming algorithms, namely Khoja’s stemmer and our new stemmer (referred to hereafter as origin-stemmer), on Arabic text classification. This investigation was carried out using chi-square feature selection to reduce the dimensionality of the feature space and a decision tree classifier. In order to evaluate the performance of the classifier, this study used a corpus that consists of 5070 documents independently classified into six categories: sport, entertainment, business, Middle East, switch and world on WEKA toolkit. The recall, f-measure and precision measures are used to compare the performance of the obtained models. The experimental results show that text classification using the root stemmer outperforms classification using Khoja’s stemmer. The f-measure was 92.9% in sport category and 89.1% in business category.
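    The chi-square feature selection step named in this abstract can be sketched as follows. This is a minimal illustration using the standard 2x2 contingency form of the chi-square statistic for one term and one category; it is not the WEKA implementation used in the study, and the function names, the toy documents and the parameter `k` are assumptions made for the example. Stemming and the decision tree classifier are not shown.

```python
def chi_square(term, category, documents):
    """Chi-square score of `term` for `category`.

    `documents` is a list of (set_of_terms, label) pairs. The score is
    N * (A*D - B*C)^2 / ((A+B) * (C+D) * (A+C) * (B+D)), where
    A = in category, contains term      B = other category, contains term
    C = in category, lacks term         D = other category, lacks term
    """
    a = b = c = d = 0
    for terms, label in documents:
        has_term = term in terms
        in_category = label == category
        if has_term and in_category:
            a += 1
        elif has_term:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # term or category is constant, so it carries no signal
    return n * (a * d - b * c) ** 2 / denominator


def select_features(documents, category, k):
    """Return the k terms with the highest chi-square score (ties by name)."""
    vocabulary = set().union(*(terms for terms, _ in documents))
    ranked = sorted(
        vocabulary,
        key=lambda term: (-chi_square(term, category, documents), term),
    )
    return ranked[:k]
```

    Note that the statistic is symmetric: a term that strongly indicates the absence of a category scores as high as one that indicates its presence, which is usually acceptable for a tree classifier.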

  9. Multilingual text induced spelling correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a multilingual, language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from raw text corpora, without supervision, and contains word unigrams

  10. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Considering the explosive growth of data, the increased amount of text data’s effect on the performance of text categorization forward the need for higher requirements, such that the existing classification method cannot be satisfied. Based on the study of existing text classification technology and semantics, this paper puts forward a kind of Chinese text classification oriented SAW (Structural Auxiliary Word) algorithm. The algorithm uses the special space effect of Chinese text where words...

  11. Effect of Name Change of Schizophrenia on Mass Media Between 1985 and 2013 in Japan: A Text Data Mining Analysis.

    Science.gov (United States)

    Koike, Shinsuke; Yamaguchi, Sosei; Ojio, Yasutaka; Ohta, Kazusa; Ando, Shuntaro

    2016-05-01

    Mass media such as newspapers and TV news affect mental health-related stigma. In Japan, the name of schizophrenia was changed in 2002 for the purposes of stigma reduction; however, little has been known about the effect of name change of schizophrenia on mass media. Articles including old and new names of schizophrenia, depressive disorder, and diabetes mellitus (DM) in headlines and/or text were extracted from 23169092 articles in 4 major Japanese newspapers and 1 TV news program (1985-2013). The trajectory of the number of articles including each term was determined across years. Then, all text in news headlines was segmented as per part-of-speech level using text data mining. Segmented words were classified into 6 categories and in each category of extracted words by target term and period were also tested. Total 51789 and 1106 articles including target terms in newspaper articles and TV news segments were obtained, respectively. The number of articles including the target terms increased across years. Relative increase was observed in the articles published on schizophrenia since 2003 compared with those on DM and between 2000 and 2005 compared with those on depressive disorder. Word tendency used in headlines was equivalent before and after 2002 for the articles including each target term. Articles for schizophrenia contained more negative words than depressive disorder and DM (31.5%, 16.0%, and 8.2%, respectively). Name change of schizophrenia had a limited effect on the articles published and little effect on its contents. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  12. Hearing taboo words can result in early talker effects in word recognition for female listeners.

    Science.gov (United States)

    Tuft, Samantha E; MᶜLennan, Conor T; Krestar, Maura L

    2018-02-01

    Previous spoken word recognition research using the long-term repetition-priming paradigm found performance costs for stimuli mismatching in talker identity. That is, when words were repeated across the two blocks, and the identity of the talker changed reaction times (RTs) were slower than when the repeated words were spoken by the same talker. Such performance costs, or talker effects, followed a time course, occurring only when processing was relatively slow. More recent research suggests that increased explicit and implicit attention towards the talkers can result in talker effects even during relatively fast processing. The purpose of the current study was to examine whether word meaning would influence the pattern of talker effects in an easy lexical decision task and, if so, whether results would differ depending on whether the presentation of neutral and taboo words was mixed or blocked. Regardless of presentation, participants responded to taboo words faster than neutral words. Furthermore, talker effects for the female talker emerged when participants heard both taboo and neutral words (consistent with an attention-based hypothesis), but not for participants that heard only taboo or only neutral words (consistent with the time-course hypothesis). These findings have important implications for theoretical models of spoken word recognition.

  13. Different Neural Correlates of Emotion-Label Words and Emotion-Laden Words: An ERP Study

    Directory of Open Access Journals (Sweden)

    Juan Zhang

    2017-09-01

    Full Text Available It is well-documented that both emotion-label words (e.g., sadness, happiness) and emotion-laden words (e.g., death, wedding) can induce emotion activation. However, the neural correlates of emotion-label word and emotion-laden word recognition have not been examined. The present study aimed to compare the underlying neural responses when processing the two kinds of words by employing event-related potential (ERP) measurements. Fifteen Chinese native speakers were asked to perform a lexical decision task in which they had to judge whether a two-character compound stimulus was a real word or not. Results showed that (1) emotion-label words and emotion-laden words elicited similar P100 at the posterior sites, (2) larger N170 was found for emotion-label words than for emotion-laden words at the occipital sites on the right hemisphere, and (3) negative emotion-label words elicited larger Late Positivity Complex (LPC) on the right hemisphere than on the left hemisphere, while such an effect was not found for emotion-laden words and positive emotion-label words. The results indicate that emotion-label words and emotion-laden words elicit different cortical responses at both early (N170) and late (LPC) stages. In addition, a right hemisphere advantage for emotion-label words over emotion-laden words can be observed in certain time windows (i.e., N170 and LPC) but fails to be detected in another time window (i.e., P100). The implications of the current findings for future emotion research were discussed.

  14. Semantic Structure of the Legal Term

    Directory of Open Access Journals (Sweden)

    Екатерина Владимировна Кулевская

    2016-12-01

    Full Text Available The article examines the semantic structure of the legal term. Nowadays, with the rapid development of cross-cultural communication, people, while pursuing their professional career, learn specific languages, including the language of law, with terms being its important component. Terms can often impede the process of successful cross-cultural communication, so teaching cross-cultural communication, according to many researchers, including P. Cranmer and K. Koskinen, is immensely important. The article aims to demonstrate that a legal term, a word or phrase used in legislation, is a generalized name for a legal concept that may lack a precise meaning in practice as it is polysemous. To prove this statement, the semantic structure of the legal term is studied from the cognitive point of view. The key terms (term, frame, lexico-semantic variant of a word, microframe (reference category)) are defined at the beginning of the article. The article also describes the classification of various semantic structures of terms developed by Prof. Belyayevskaya, based on an analysis of the cognitive foundations of the typology of semantic structures as well as on the classification of meanings. Homogeneous semantic structures are those in which different lexico-semantic variants of a polysemous word represent different aspects of one microframe; these structures include monosemous terms, polysemous terms with a homogeneous semantic structure, and terms with the intermediate type of lexemes. Heterogeneous semantic structures are those in which a lexico-semantic variant of a word represents two or more reference categories rather than one category; these structures are considered to be “classical” polysemy. Two types of such structures are introduced in the article, and examples of the actualization of their lexical meaning in speech are analysed (examples were taken from the British and Russian National corpora; official legal documents and

  15. Automatic generation of stop word lists for information retrieval and analysis

    Science.gov (United States)

    Rose, Stuart J

    2013-01-08

    Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
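    The ratio test described in this abstract can be sketched in a few lines. The sketch below is one possible reading of the description and not the patented method itself: it assumes that "keyword adjacency frequency" counts how often a term occurs directly next to a keyword occurrence and that "keyword frequency" counts how often the term occurs inside a keyword. The names `generate_stop_words`, `min_ratio` and `max_size`, the tokenizer, and the choice to rank by corpus frequency and truncate at a fixed size are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_stop_words(documents, keywords, min_ratio=1.0, max_size=10):
    """Build a stop word list from a corpus and a set of known keywords."""
    keyword_tokens = [tokenize(keyword) for keyword in keywords]
    term_freq = Counter()       # occurrences of each term in the corpus
    adjacency_freq = Counter()  # occurrences directly next to a keyword
    keyword_freq = Counter()    # occurrences inside a keyword
    for document in documents:
        tokens = tokenize(document)
        term_freq.update(tokens)
        for keyword in keyword_tokens:
            n = len(keyword)
            if n == 0:
                continue
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != keyword:
                    continue
                keyword_freq.update(keyword)
                if i > 0:
                    adjacency_freq[tokens[i - 1]] += 1
                if i + n < len(tokens):
                    adjacency_freq[tokens[i + n]] += 1
    candidates = []
    for term in term_freq:
        adjacent = adjacency_freq[term]
        inside = keyword_freq[term]
        if adjacent == 0:
            continue  # ratio is zero, below any positive threshold
        ratio = adjacent / inside if inside else float("inf")
        if ratio >= min_ratio:
            candidates.append(term)
    # Assumed truncation criterion: keep the most frequent candidates.
    candidates.sort(key=lambda term: (-term_freq[term], term))
    return candidates[:max_size]
```

    With the two toy documents "the analysis of text data" and "the retrieval of text data from the corpus" and the keywords "text data" and "corpus", the terms "the", "of" and "from" survive the ratio test, while "text", "data" and "corpus" are excluded because they occur inside keywords but never next to one.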

  16. THE SPECIAL STATUS OF EXOGENOUS WORD-FORMATION WITHIN THE GERMAN WORD-FORMATION SYSTEM

    Directory of Open Access Journals (Sweden)

    Zhilyuk Sergey Aleksandrovich

    2014-06-01

    Full Text Available The article presents the properties of the exogenous word-formation system, taking into account the existence of two word-formation systems in modern German. On the basis of foreign research which reveals modern trends in German word-formation connected with internationalization and the development of a new European Latin language, the author defines key features of exogenous word-formation, i.e. foreign origin of word-formation units, unmotivated units, unmotivated interchange in bases and affixes, as well as limited distribution rules in combination with German word-formation. The article analyzes various approaches to word-division, as well as motivated and unmotivated interchange of consonants in bases and in affixes. Unmotivated interchange showcases a special status of exogenous word-formation within German. Another item covered by the article is the issue of the confix. The article presents opinions of researchers about the correctness of its separation and a list of its features. The author presents his definition of confix: a confix is a bound exogenous word-formation unit with a certain lexical and semantic meaning, joining other units directly or indirectly (through the linking morpheme -o-), which is able to make a base. Moreover, some confixes are able to participate in word-combination and have unlimited distribution. Thus, the confix showcases the integration of exogenous word-formation and traditional German word-formation. The research proves the special status of exogenous word-formation in German. Its results can be used as a base for further analysis of co-existing word-formation systems in German and determination of their characteristic features.

  17. Non-word repetition in children with specific language impairment: a deficit in phonological working memory or in long-term verbal knowledge?

    Science.gov (United States)

    Casalini, Claudia; Brizzolara, Daniela; Chilosi, Anna; Cipriani, Paola; Marcolini, Stefania; Pecini, Chiara; Roncoli, Silvia; Burani, Cristina

    2007-08-01

    In this study we investigated the effects of long-term memory (LTM) verbal knowledge on short-term memory (STM) verbal recall in a sample of Italian children affected by different subtypes of specific language impairment (SLI). The aim of the study was to evaluate if phonological working memory (PWM) abilities of SLI children can be supported by LTM linguistic representations and if PWM performances can be differently affected in the various subtypes of SLI. We tested a sample of 54 children affected by Mixed Receptive-Expressive (RE), Expressive (Ex) and Phonological (Ph) SLI (DSM-IV - American Psychiatric Association, 1994) by means of a repetition task of words (W) and non-words (NW) differing in morphemic structure [morphological non-words (MNW), consisting of combinations of roots and affixes - and simple non-words - with no morphological constituency]. We evaluated the effects of lexical and morpho-lexical LTM representations on STM recall by comparing the repetition accuracy across the three types of stimuli. Results indicated that although SLI children, as a group, showed lower repetition scores than controls, their performance was affected similarly to controls by the type of stimulus and the experimental manipulation of the non-words (better repetition of W than MNW and NW, and of MNW than NW), confirming the recourse to LTM verbal representations to support STM recall. The influence of LTM verbal knowledge on STM recall in SLI improved with age and did not differ among the three types of SLI. However, the three types of SLI differed in the accuracy of their repetition performances (PWM abilities), with the Phonological group showing the best scores. The implications for SLI theory and practice are discussed.

  18. Translation Memory and Computer Assisted Translation Tool for Medieval Texts

    Directory of Open Access Journals (Sweden)

    Törcsvári Attila

    2013-05-01

    Full Text Available Translation memories (TMs), as part of Computer Assisted Translation (CAT) tools, support translators in reusing portions of formerly translated text. Fencing books are good candidates for using TMs due to the high number of repeated terms. Medieval texts suffer from a number of drawbacks that make even “simple” rewording to the modern version of the same language hard. The analyzed difficulties are: lack of systematic spelling, unusual word orders and typos in the original. A hypothesis is made and verified that even simple modernization increases legibility and is feasible, and that it is worthwhile to apply translation memories due to the numerous and even extremely long repeated terms. Therefore, methods and algorithms are presented (1) for automated transcription of medieval texts (when a limited training set is available), and (2) for collection of repeated patterns. The efficiency of the algorithms is analyzed for recall and precision.
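    The idea of collecting repeated patterns for a translation memory can be illustrated with a simple n-gram count over text segments. This is only a hedged sketch of the general technique, not the algorithm from the paper: the function name `repeated_ngrams`, the fixed pattern length `n` and the threshold `min_count` are assumptions, and no spelling normalization of medieval text is attempted.

```python
from collections import Counter


def repeated_ngrams(segments, n=3, min_count=2):
    """Return word n-grams that occur at least `min_count` times.

    `segments` is a list of strings (e.g. sentences of a source text).
    The result maps each repeated phrase to its number of occurrences.
    """
    counts = Counter()
    for segment in segments:
        tokens = segment.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {
        " ".join(ngram): count
        for ngram, count in counts.items()
        if count >= min_count
    }
```

    For the two invented segments "strike from the right side" and "parry from the right foot", the only repeated trigram is "from the right", which would be a candidate entry for reuse.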

  19. Filaments of Meaning in Word Space

    OpenAIRE

    Karlgren, Jussi; Holst, Anders; Sahlgren, Magnus

    2008-01-01

    Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character than the global s...

  20. Words and melody are intertwined in perception of sung words: EEG and behavioral evidence.

    Directory of Open Access Journals (Sweden)

    Reyna L Gordon

    Full Text Available Language and music, two of the most unique human cognitive abilities, are combined in song, rendering it an ecological model for comparing speech and music cognition. The present study was designed to determine whether words and melodies in song are processed interactively or independently, and to examine the influence of attention on the processing of words and melodies in song. Event-Related brain Potentials (ERPs) and behavioral data were recorded while non-musicians listened to pairs of sung words (prime and target) presented in four experimental conditions: same word, same melody; same word, different melody; different word, same melody; different word, different melody. Participants were asked to attend to either the words or the melody, and to perform a same/different task. In both attentional tasks, different word targets elicited an N400 component, as predicted based on previous results. Most interestingly, different melodies (sung with the same word) elicited an N400 component followed by a late positive component. Finally, ERP and behavioral data converged in showing interactions between the linguistic and melodic dimensions of sung words. The finding that the N400 effect, a well-established marker of semantic processing, was modulated by musical melody in song suggests that variations in musical features affect word processing in sung language. Implications of the interactions between words and melody are discussed in light of evidence for shared neural processing resources between the phonological/semantic aspects of language and the melodic/harmonic aspects of music.

  1. Exploring the word superiority effect using TVA

    DEFF Research Database (Denmark)

    Starrfelt, Randi

    Words are made of letters, and yet sometimes it is easier to identify a word than a single letter. This word superiority effect (WSE) has been observed when written stimuli are presented very briefly or degraded by visual noise. It is unclear, however, if this is due to a lower threshold for perception of words, or a higher speed of processing for words than letters. We have investigated the WSE using methods based on a Theory of Visual Attention. In an experiment using single stimuli (words or letters) presented centrally, we show that the classical WSE is specifically reflected in perceptual... simultaneously we find a different pattern: In a whole report experiment with six stimuli (letters or words), letters are perceived more easily than words, and this is reflected both in perceptual processing speed and short term memory capacity.

  2. ONTOGRABBING: Extracting Information from Texts Using Generative Ontologies

    DEFF Research Database (Denmark)

    Nilsson, Jørgen Fischer; Szymczak, Bartlomiej Antoni; Jensen, P.A.

    2009-01-01

    We describe principles for extracting information from texts using a so-called generative ontology in combination with syntactic analysis. Generative ontologies are introduced as semantic domains for natural language phrases. Generative ontologies extend ordinary finite ontologies with rules for producing recursively shaped terms representing the ontological content (ontological semantics) of NL noun phrases and other phrases. We focus here on achieving a robust, often only partial, ontology-driven parsing of and ascription of semantics to a sentence in the text corpus. The aim of the ontological analysis is primarily to identify paraphrases, thereby achieving a search functionality beyond mere keyword search with synsets. We further envisage use of the generative ontology as a phrase-based rather than word-based browser into text corpora.

  3. Bread and Water as Metaphors for the Word of God in the Four Gospels

    Directory of Open Access Journals (Sweden)

    Dawid Ledwoń

    2017-06-01

    Full Text Available Bread and water are among the best-known Biblical metaphors for the Word of God. This article presents a study of their occurrence in the four Gospels against the backdrop of the Old Testament. However, an analysis of the explicit references to bread and water is not exhaustive with regard to the topic under discussion. Therefore, other terms that relate to them, such as food, spring, hunger, thirst, feeding, and drinking, are also of great interest. Studying the metaphors for the Word of God reveals both a continuation of the Biblical ideas within the four Gospels and a total novelty in the expression of the Word that became flesh (Jn 1:14).

  4. Similar words analysis based on POS-CBOW language model

    Directory of Open Access Journals (Sweden)

    Dongru RUAN

    2015-10-01

    Full Text Available Similar-word analysis is an important task in natural language processing, with research and application value in text classification, machine translation and information recommendation. Focusing on the features of Sina Weibo's short texts, this paper presents a language model named POS-CBOW, a continuous bag-of-words language model with a filtering layer and a part-of-speech tagging layer. The proposed approach can adjust the similarity of word vectors according to their cosine similarity and their part-of-speech metrics. It can also filter the resulting sets of similar words on the basis of a statistical analysis model. The experimental results show that the similar-word analysis algorithm based on the proposed POS-CBOW language model performs better than the one based on the traditional CBOW language model.

  5. [Emotional valence of words in schizophrenia].

    Science.gov (United States)

    Jalenques, I; Enjolras, J; Izaute, M

    2013-06-01

    Emotion recognition is a domain in which deficits have been reported in schizophrenia. A number of emotion classification studies have indicated that emotion processing deficits in schizophrenia are more pronounced for negative affects. Given the difficulty of developing material suitable for the study of these emotional deficits, it would be interesting to examine whether patients suffering from schizophrenia are responsive to positively and negatively charged emotion-related words that could be used within the context of remediation strategies. The emotional perception of words was examined in a clinical experiment involving schizophrenia patients. This emotional perception was expressed by the patients in terms of the valence associated with the words. In the present study, we investigated whether schizophrenia patients would assign the same negative and positive valences to words as healthy individuals. Twenty volunteer, clinically stable, outpatients from the Psychiatric Service of the University Hospital of Clermont-Ferrand were recruited. Diagnoses were based on DSM-IV criteria. Global psychiatric symptoms were assessed using the Positive and Negative Symptoms Scale (PANSS). The patients had to evaluate the emotional valence of a set of 300 words on a 5-point scale ranging from "very unpleasant" to "very pleasant". The collected results were compared with those obtained by Bonin et al. (2003) [13] from 97 University students. Correlational analyses of the two studies revealed that the emotional valences were highly correlated, i.e. the schizophrenia patients estimated very similar emotional valences. More precisely, it was possible to examine three separate sets of 100 words each (positive words, neutral words and negative words). The positive words that were evaluated were the more positive words from the norms collected by Bonin et al. (2003) [13], and the negative words were the more negative examples taken from these norms. The neutral words

  6. Recall of short word lists presented visually at fast rates: effects of phonological similarity and word length.

    Science.gov (United States)

    Coltheart, V; Langdon, R

    1998-03-01

    Phonological similarity of visually presented list items impairs short-term serial recall. Lists of long words are also recalled less accurately than are lists of short words. These results have been attributed to phonological recoding and rehearsal. If subjects articulate irrelevant words during list presentation, both phonological similarity and word length effects are abolished. Experiments 1 and 2 examined effects of phonological similarity and recall instructions on recall of lists shown at fast rates (from one item per 0.114-0.50 sec), which might not permit phonological encoding and rehearsal. In Experiment 3, recall instructions and word length were manipulated using fast presentation rates. Both phonological similarity and word length effects were observed, and they were not dependent on recall instructions. Experiments 4 and 5 investigated the effects of irrelevant concurrent articulation on lists shown at fast rates. Both phonological similarity and word length effects were removed by concurrent articulation, as they were with slow presentation rates.

  7. [Paraphrase search using word2vec and the Word Mover's Distance in Ancient Greek]

    Directory of Open Access Journals (Sweden)

    Marcus Pöckelmann

    2017-12-01

    Full Text Available Automatic methods would be a useful aid in finding receptions of Plato's work within ancient Greek literature. Unfortunately, such methods are often knowledge-based and thus restricted to extensively annotated texts, which are not available to a sufficient extent for ancient Greek. In this article, we describe an approach that is based on the distributional hypothesis instead, to overcome the problem of missing annotations. This approach uses word2vec and the related Word Mover's Distance to determine phrases with similar meaning. Despite its experimental state, the method produces some meaningful results, as shown in three examples.

  8. Combining position weight matrices and document-term matrix for efficient extraction of associations of methylated genes and diseases from free text.

    Directory of Open Access Journals (Sweden)

    Arwa Bin Raies

    Full Text Available BACKGROUND: In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually. METHODOLOGY: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text. CONCLUSION: The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.

  9. [LITERATURE REVIEW: WHICH IS MORE EFFECTIVE? TRADITIONAL WORD OF MOUTH OR ELECTRONIC WORD OF MOUTH]

    Directory of Open Access Journals (Sweden)

    Putu Adriani Prayustika

    2016-12-01

    Full Text Available Word of mouth has been recognized as one of the most effective communication strategies for conveying company information to consumers. Companies use word-of-mouth communication to market their products and services. However, conventional WOM communication is effective only within the bounds of limited social contact. Advances in information technology and the emergence of online social networking sites have changed the way information is transmitted and have overcome the limitations of traditional WOM. Word-of-mouth communication that makes use of this technology is often called electronic word of mouth (eWOM); it relies on new media such as social media. This paper reviews the literature from several earlier studies that compared the effectiveness of traditional word of mouth and electronic word of mouth. The results indicate that, in general, given the current state of technology, eWOM is far more effective than traditional WOM.

  10. Novel word retention in bilingual and monolingual speakers.

    Science.gov (United States)

    Kan, Pui Fong; Sadagopan, Neeraja

    2014-01-01

    The goal of this research was to examine word retention in bilinguals and monolinguals. Long-term word retention is an essential part of vocabulary learning. Previous studies have documented that bilinguals outperform monolinguals in terms of retrieving newly-exposed words. Yet, little is known about whether or to what extent bilinguals are different from monolinguals in word retention. Participants were 30 English-speaking monolingual adults and 30 bilingual adults who speak Spanish as a home language and learned English as a second language during childhood. In a previous study (Kan et al., 2014), the participants were exposed to the target novel words in English, Spanish, and Cantonese. In this current study, word retention was measured a week after the fast mapping task. No exposures were given during the one-week interval. Results showed that bilinguals and monolinguals retain a similar number of words. However, participants produced more words in English than in either Spanish or Cantonese. Correlation analyses revealed that language knowledge plays a role in the relationships between fast mapping and word retention. Specifically, within- and across-language relationships between bilinguals' fast mapping and word retention were found in Spanish and English; by contrast, within-language relationships between monolinguals' fast mapping and word retention were found in English, and across-language relationships between their fast mapping and word retention performance were found in English and Cantonese. Similarly, bilinguals differed from monolinguals in the relationships among the word retention scores in three languages. Significant correlations were found among bilinguals' retention scores. However, no such correlations were found among monolinguals' retention scores. The overall findings suggest that bilinguals' language experience and language knowledge most likely contribute to how they learn and retain new words.

  11. 50 CFR 600.910 - Definitions and word usage.

    Science.gov (United States)

    2010-10-01

    ... 50 Wildlife and Fisheries 8 2010-10-01 2010-10-01 false Definitions and word usage. 600.910..., Consultation, and Recommendations § 600.910 Definitions and word usage. (a) Definitions. In addition to the... undertaken by a state agency. (b) Word usage. The terms “must”, “shall”, “should”, “may”, “may not”, “will...

  12. 50 CFR 600.810 - Definitions and word usage.

    Science.gov (United States)

    2010-10-01

    ... 50 Wildlife and Fisheries 8 2010-10-01 2010-10-01 false Definitions and word usage. 600.810...) § 600.810 Definitions and word usage. (a) Definitions. In addition to the definitions in the Magnuson...-Stevens Act. (b) Word usage. The terms “must”, “shall”, “should”, “may”, “may not”, “will”, “could”, and...

  13. Understanding the power of word-of-mouth.

    Directory of Open Access Journals (Sweden)

    Suzana Z. Gildin

    2003-06-01

    Full Text Available Word-of-mouth has been considered one of the most powerful forms of communication in the market today. Understanding what makes word-of-mouth such a persuasive and powerful communication tool is important to organizations that intend to build strong relationships with consumers. For this reason, organizations are concerned about promoting positive word-of-mouth and retarding negative word-of-mouth, which can be harmful to the image of the company or a brand. This work focuses on the major aspects involving word-of-mouth communication. Recommendations to generate positive word-of-mouth and retard negative word-of-mouth are also highlighted.

  14. Language Model Adaptation Using Machine-Translated Text for Resource-Deficient Languages

    Directory of Open Access Journals (Sweden)

    Sadaoki Furui

    2009-01-01

    Full Text Available Text corpus size is an important issue when building a language model (LM). This is a particularly important issue for languages where little data is available. This paper introduces an LM adaptation technique to improve an LM built using a small amount of task-dependent text with the help of a machine-translated text corpus. Icelandic speech recognition experiments were performed using data machine translated (MT) from English to Icelandic on a word-by-word and sentence-by-sentence basis. LM interpolation using the baseline LM and an LM built from either word-by-word or sentence-by-sentence translated text reduced the word error rate significantly when the manually obtained utterances used as a baseline were very sparse.
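
    The LM interpolation mentioned in this abstract can be illustrated with a minimal sketch. It assumes add-one smoothed unigram models and a fixed weight `lam`; the abstract does not state the model order, smoothing, or weights actually used, and the toy English sentences below are hypothetical stand-ins for the task-dependent and machine-translated corpora.

```python
from collections import Counter


def unigram_lm(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_base, p_mt, lam):
    """Linear interpolation: lam * baseline + (1 - lam) * MT-derived model."""
    return {w: lam * p_base[w] + (1 - lam) * p_mt[w] for w in p_base}


# Toy data (hypothetical): a tiny in-domain corpus and a larger MT corpus.
base_tokens = "open the door".split()
mt_tokens = "open the window close the door open the window".split()
vocab = sorted(set(base_tokens) | set(mt_tokens))

p_base = unigram_lm(base_tokens, vocab)
p_mix = interpolate(p_base, unigram_lm(mt_tokens, vocab), lam=0.5)

# "window" never occurs in the small corpus, so the mixture raises its probability.
print(p_base["window"], p_mix["window"])
```

    Because both component models sum to one over the shared vocabulary, any weight between 0 and 1 yields a valid distribution; in practice the weight would be tuned on held-out data.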

  15. When a text is translated does the complexity of its vocabulary change? Translations and target readerships.

    Science.gov (United States)

    Rêgo, Hênio Henrique Aragão; Braunstein, Lidia A; D'Agostino, Gregorio; Stanley, H Eugene; Miyazima, Sasuke

    2014-01-01

    In linguistic studies, the academic level of the vocabulary in a text can be described in terms of statistical physics by using a "temperature" concept related to the text's word-frequency distribution. We propose a "comparative thermo-linguistic" technique to analyze the vocabulary of a text to determine its academic level and its target readership in any given language. We apply this technique to a large number of books by several authors and examine how the vocabulary of a text changes when it is translated from one language to another. Unlike the uniform results produced using the Zipf law, using our "word energy" distribution technique we find variations in the power-law behavior. We also examine some common features that span across languages and identify some intriguing questions concerning how to determine when a text is suitable for its intended readership.

  16. Term Familiarity to indicate Perceived and Actual Difficulty of Text in Medical Digital Libraries.

    Science.gov (United States)

    Leroy, Gondy; Endicott, James E

    2011-10-01

    With increasing text digitization, digital libraries can personalize materials for individuals with different education levels and language skills. To this end, documents need meta-information describing their difficulty level. Previous attempts at such labeling used readability formulas but the formulas have not been validated with modern texts and their outcome is seldom associated with actual difficulty. We focus on medical texts and are developing new, evidence-based meta-tags that are associated with perceived and actual text difficulty. This work describes a first tag, term familiarity, which is based on term frequency in the Google corpus. We evaluated its feasibility to serve as a tag by looking at a document corpus (N=1,073) and found that terms in blogs or journal articles displayed unexpected but significantly different scores. Term familiarity was then applied to texts and results from a previous user study (N=86) and could better explain differences for perceived and actual difficulty.

  17. Modeling statistical properties of written text.

    Directory of Open Access Journals (Sweden)

    M Angeles Serrano

    Full Text Available Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the nontrivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
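
    Two of the regularities named in this abstract can be measured directly from a token stream. The sketch below is only an illustration of the quantities involved, not the generative model the abstract describes: it computes the vocabulary growth curve V(n) that Heaps' law characterizes as sublinear, and the rank-ordered frequencies to which Zipf's law applies. The function name and the toy sentence are hypothetical.

```python
from collections import Counter


def vocabulary_growth(tokens):
    """Return V(n): the number of distinct words among the first n tokens."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Toy text (hypothetical); real measurements need far longer documents.
tokens = "the cat sat on the mat and the dog sat on the cat".split()

growth = vocabulary_growth(tokens)

# Frequencies sorted from most to least common, i.e. indexed by rank.
rank_frequency = [count for _, count in Counter(tokens).most_common()]

print(growth)
print(rank_frequency)
```

    On a long text, plotting `growth` against token position on log-log axes would show the sublinear trend; this toy input is too short to estimate an exponent.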

  18. Effects of Word Frequency and Transitional Probability on Word Reading Durations of Younger and Older Speakers.

    Science.gov (United States)

    Moers, Cornelia; Meyer, Antje; Janse, Esther

    2017-06-01

    High-frequency units are usually processed faster than low-frequency units in language comprehension and language production. Frequency effects have been shown for words as well as word combinations. Word co-occurrence effects can be operationalized in terms of transitional probability (TP). TPs reflect how probable a word is, conditioned by its right or left neighbouring word. This corpus study investigates whether Dutch speakers in three different age groups, namely younger children (8-12 years), adolescents (12-18 years) and older adults (62-95 years), show frequency and TP context effects on spoken word durations in reading aloud, and whether age groups differ in the size of these effects. Results show consistent effects of TP on word durations for all age groups. Thus, TP seems to influence the processing of words in context, beyond the well-established effect of word frequency, across the entire age range. However, the study also indicates that age groups differ in the size of TP effects, with older adults having smaller TP effects than adolescent readers. Our results show that probabilistic reduction effects in reading aloud may at least partly stem from contextual facilitation that leads to faster reading times in skilled readers, as well as in young language learners.
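
    Transitional probability as defined in this abstract (the probability of a word given its left or right neighbour) can be estimated from bigram counts. The following is a minimal sketch using plain relative frequencies with no smoothing; the function name and toy sentence are hypothetical, and the study's own corpus and estimation details are not given in the abstract.

```python
from collections import Counter


def transitional_probabilities(tokens):
    """Estimate forward and backward TPs for every adjacent word pair.

    forward[(a, b)]  = P(b follows | a)   = count(a b) / count(a as left word)
    backward[(a, b)] = P(a precedes | b)  = count(a b) / count(b as right word)
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    left = Counter(tokens[:-1])    # words that have a right neighbour
    right = Counter(tokens[1:])    # words that have a left neighbour
    forward = {(a, b): c / left[a] for (a, b), c in bigrams.items()}
    backward = {(a, b): c / right[b] for (a, b), c in bigrams.items()}
    return forward, backward


tokens = "the dog saw the cat and the dog ran".split()
forward, backward = transitional_probabilities(tokens)

# "the" is followed by "dog" in 2 of its 3 occurrences as a left word.
print(forward[("the", "dog")])
# Every "dog" in this toy text is preceded by "the".
print(backward[("the", "dog")])
```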

  19. The Influence of Orthographic Neighborhood Density and Word Frequency on Visual Word Recognition: Insights from RT Distributional Analyses

    Directory of Open Access Journals (Sweden)

    Stephen Wee Hun Lim

    2016-03-01

    Full Text Available The effects of orthographic neighborhood density and word frequency in visual word recognition were investigated using distributional analyses of response latencies in visual lexical decision. Main effects of density and frequency were observed in mean latencies. Distributional analyses, in addition, revealed a density x frequency interaction: for low-frequency words, density effects were mediated predominantly by distributional shifting whereas for high-frequency words, density effects were absent except at the slower RTs, implicating distributional skewing. The present findings suggest that density effects in low-frequency words reflect processes involved in early lexical access, while the effects observed in high-frequency words reflect late postlexical checking processes.

  20. Word segmentation in children’s literacy: a study about word awareness

    Directory of Open Access Journals (Sweden)

    Débora Mattos Marques

    2016-10-01

    Full Text Available The present research aimed to investigate how linguistic awareness regarding the concept of “word” may influence some mistakes in segmenting words in children’s writing in Elementary School. The observed data comprised cases of hyper- and hyposegmentation, which were then related to word awareness. For the analysis of linguistic awareness data, the Representational Redescription model, proposed by Karmiloff-Smith (1986-1992), was used. It postulates four levels at which knowledge is redescribed in the human mind, becoming accessible to awareness and verbalization over time. The research methodology consisted of six tests, four of which were applied in order to verify word awareness and the other two to obtain samples of writing data. It was noticed that a great part of the segmentation mistakes identified in the collected writings are related to the informants' ability to distinguish between different words up to the moment they were observed. As a result, the uncommon segmentation mistakes found in the analyzed data showed that they are not only motivated by prosodic or phonological matters but are also influenced by linguistic awareness issues involving the informants’ understanding of word.

  1. Detecting causality from online psychiatric texts using inter-sentential language patterns

    Directory of Open Access Journals (Sweden)

    Wu Jheng-Long

    2012-07-01

    Full Text Available Abstract Background Online psychiatric texts are natural language texts expressing depressive problems, published by Internet users via community-based web services such as web forums, message boards and blogs. Understanding the cause-effect relations embedded in these psychiatric texts can provide insight into the authors’ problems, thus increasing the effectiveness of online psychiatric services. Methods Previous studies have proposed the use of word pairs extracted from a set of sentence pairs to identify cause-effect relations between sentences. A word pair is made up of two words, with one coming from the cause text span and the other from the effect text span. Analysis of the relationship between these words can be used to capture individual word associations between cause and effect sentences. For instance, (broke up, life) and (boyfriend, meaningless) are two word pairs extracted from the sentence pair: “I broke up with my boyfriend. Life is now meaningless to me”. The major limitation of word pairs is that individual words in sentences usually cannot reflect the exact meaning of the cause and effect events, and thus may produce semantically incomplete word pairs, as the previous examples show. Therefore, this study proposes the use of inter-sentential language patterns such as ≪broke up, boyfriend>, Results Performance was evaluated on a corpus of texts collected from PsychPark (http://www.psychpark.org), a virtual psychiatric clinic maintained by a group of volunteer professionals from the Taiwan Association of Mental Health Informatics. Experimental results show that the use of inter-sentential language patterns outperformed the use of word pairs proposed in previous studies. Conclusions This study demonstrates the acquisition of inter-sentential language patterns for causality detection from online psychiatric texts. Such semantically more complete and precise features can improve causality detection performance.

  2. Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic

    Directory of Open Access Journals (Sweden)

    Fawaz S. Al-Anzi

    2017-04-01

    Full Text Available Cosine similarity is one of the most popular distance measures in text classification problems. In this paper, we used this important measure to investigate the performance of Arabic language text classification. For textual features, the vector space model (VSM) is generally used to represent textual information as numerical vectors. However, Latent Semantic Indexing (LSI) is a better textual representation technique as it maintains semantic information between the words. Hence, we used the singular value decomposition (SVD) method to extract textual features based on LSI. In our experiments, we compared several well-known classification methods: Naïve Bayes, k-Nearest Neighbors, Neural Network, Random Forest, Support Vector Machine, and classification tree. We used a corpus that contains 4,000 documents on ten topics (400 documents for each topic). The corpus contains 2,127,197 words with about 139,168 unique words. The testing set contains 400 documents, 40 documents for each topic. As a weighting scheme, we used Term Frequency.Inverse Document Frequency (TF.IDF). This study reveals that the classification methods that use LSI features significantly outperform the TF.IDF-based methods. It also reveals that k-Nearest Neighbors (based on the cosine measure) and support vector machine are the best performing classifiers.
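
    The TF.IDF weighting and cosine-based nearest-neighbour classification named in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: raw term counts times log(N/df) as the weight, a 1-nearest-neighbour rule, and hypothetical English toy documents and labels. The SVD step that produces LSI features is deliberately omitted, and nothing here reproduces the paper's Arabic corpus or results.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {term: tf * idf} dict."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all-zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_label(query_vec, train_vecs, labels):
    """1-nearest-neighbour classification by cosine similarity."""
    best = max(range(len(train_vecs)),
               key=lambda i: cosine(query_vec, train_vecs[i]))
    return labels[best]


# Toy corpus (hypothetical): four labelled documents plus one query.
train_docs = [
    "match team goal".split(),
    "team player goal".split(),
    "market stock price".split(),
    "bank price market".split(),
]
labels = ["sport", "sport", "economy", "economy"]
query = "player match team".split()

# IDF is computed over all documents so the query shares the same weights.
vecs = tfidf_vectors(train_docs + [query])
train_vecs, query_vec = vecs[:-1], vecs[-1]

print(nearest_label(query_vec, train_vecs, labels))
```

    An LSI variant would project these vectors onto the leading singular vectors of the term-document matrix before computing cosine similarity.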

  3. Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text

    KAUST Repository

    Bin Raies, Arwa

    2013-10-16

    Background: In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually. Methodology: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text. Conclusion: The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download. © 2013 Bin Raies et al.

  4. When a text is translated does the complexity of its vocabulary change? Translations and target readerships.

    Directory of Open Access Journals (Sweden)

    Hênio Henrique Aragão Rêgo

    Full Text Available In linguistic studies, the academic level of the vocabulary in a text can be described in terms of statistical physics by using a "temperature" concept related to the text's word-frequency distribution. We propose a "comparative thermo-linguistic" technique to analyze the vocabulary of a text to determine its academic level and its target readership in any given language. We apply this technique to a large number of books by several authors and examine how the vocabulary of a text changes when it is translated from one language to another. Unlike the uniform results produced using the Zipf law, using our "word energy" distribution technique we find variations in the power-law behavior. We also examine some common features that span across languages and identify some intriguing questions concerning how to determine when a text is suitable for its intended readership.

  5. Short-term retention of a single word relies on retrieval from long-term memory when both rehearsal and refreshing are disrupted.

    Science.gov (United States)

    Rose, Nathan S; Buchsbaum, Bradley R; Craik, Fergus I M

    2014-07-01

    Many working memory (WM) models propose that the focus of attention (or primary memory) has a capacity limit of one to four items, and therefore, that performance on WM tasks involves retrieving some items from long-term (or secondary) memory (LTM). In the present study, we present evidence suggesting that recall of even one item on a WM task can involve retrieving it from LTM. The WM task required participants to make a deep (living/nonliving) or shallow ("e"/no "e") level-of-processing (LOP) judgment on one word and to recall the word after a 10-s delay on each trial. During the delay, participants either rehearsed the word or performed an easy or a hard math task. When the to-be-remembered item could be rehearsed, recall was fast and accurate. When it was followed by a math task, recall was slower, error-prone, and benefited from a deeper LOP at encoding, especially for the hard math condition. The authors suggest that a covert-retrieval mechanism may have refreshed the item during easy math, and that the hard math condition shows that even a single item cannot be reliably held in WM during a sufficiently distracting task--therefore, recalling the item involved retrieving it from LTM. Additionally, performance on a final free recall (LTM) test was better for items recalled following math than following rehearsal, suggesting that initial recall following math involved elaborative retrieval from LTM, whereas rehearsal did not. The authors suggest that the extent to which performance on WM tasks involves retrieval from LTM depends on the amounts of disruption to both rehearsal and covert-retrieval/refreshing maintenance mechanisms.

  6. Examining the Effect of Interference on Short-term Memory Recall of Arabic Abstract and Concrete Words Using Free, Cued, and Serial Recall Paradigms

    OpenAIRE

    Ahmed Mohammed Saleh Alduais; Yasir Saad Almukhaizeem

    2015-01-01

    Purpose: To see if there is a correlation between interference and short-term memory recall and to examine interference as a factor affecting memory recalling of Arabic and abstract words through free, cued, and serial recall tasks. Method: Four groups of undergraduates in King Saud University, Saudi Arabia participated in this study. The first group consisted of 9 undergraduates who were trained to perform three types of recall for 20 Arabic abstract and concrete words. The second, third and...

  7. Using Date Specific Searches on Google Books to Disconfirm Prior Origination Knowledge Claims for Particular Terms, Words, and Names

    Directory of Open Access Journals (Sweden)

    Mike Sutton

    2018-04-01

    Full Text Available Back in 2004, Google Inc. (Menlo Park, CA, USA) began digitizing full texts of magazines, journals, and books dating back centuries. At present, over 25 million books have been scanned and anyone can use the service (currently called Google Books) to search for materials free of charge (including academics of any discipline). All the books have been scanned, converted to text using optical character recognition and stored in its digital database. The present paper describes a very precise six-stage Boolean date-specific research method on Google, referred to as Internet Date Detection (IDD for short). IDD can be used to examine countless alleged facts and myths in a systematic and verifiable way. Six examples of the IDD method in action are provided (the terms, words, and names ‘self-fulfilling prophecy’, ‘Humpty Dumpty’, ‘living fossil’, ‘moral panic’, ‘boredom’, and ‘selfish gene’) and each of these examples is shown to disconfirm widely accepted expert knowledge belief claims about their history of coinage, conception, and published origin. The paper also notes that Google’s autonomous deep learning AI program RankBrain has possibly caused the IDD method to no longer work so well, addresses how it might be recovered, and how such problems might be avoided in the future.

  8. Impact of reading purpose on incidental word learning from context

    NARCIS (Netherlands)

    Swanborn, MSL; de Glopper, Kees

    Children read texts for various reasons. We examined how reading texts for different purposes affected amounts of incidental word learning. Grade 6 students were asked to read texts for fun, to learn about the topic of the text, and for text comprehension. Proportions of words learned incidentally

  9. The blocked-random effect in pictures and words.

    Science.gov (United States)

    Toglia, M P; Hinman, P J; Dayton, B S; Catalano, J F

    1997-06-01

    Picture and word recall was examined in conjunction with list organization. 60 subjects studied a list of 30 items, either words or their pictorial equivalents. The 30 words/pictures, members of five conceptual categories, each represented by six exemplars, were presented either blocked by category or in a random order. While pictures were recalled better than words and a standard blocked-random effect was observed, the interaction indicated that the recall advantage of a blocked presentation was restricted to the word lists. A similar pattern emerged for clustering. These findings are discussed in terms of limitations upon the pictorial superiority effect.

  10. The Spanish word tiempo: its omnipresence and conceptual, logical and lexical versatility

    Directory of Open Access Journals (Sweden)

    Karlo Budor

    2012-12-01

    Full Text Available The common Spanish word tiempo corresponds to three English terms, each of them being a lexical equivalent based on a specific notion: (1) time – physical, astronomical, philosophical reference; (2) weather – geographical, climatological, meteorological reference; (3) tense – linguistic, lexical, grammatical reference. As far as universal and metalinguistic referential distinctions are concerned, all natural languages in fact present a considerable degree of variation ranging from inexistent or very vague to complete differentiation of these terms. In order to express these three types of specific references, some languages have a single word of general usage covering all its lexical acceptations. Therefore in such languages, Spanish included, different references can be distinguished only in part lexically. However, a semantic analysis of the Spanish word tiempo reveals its complexity as well as its conceptual, logical and lexical versatility. This is reflected in its capacity to combine in numerous lexical units, i.e. word compounds and/or phrases, endowed with different or specific semantic meanings. The repertory of these virtual and derived lexical forms appears to be practically unlimited, although their sphere of application and their boundaries are neither always very clear nor precise, which can be illustrated with the examples given in Spanish dictionaries.

  11. nal Sesotho texts

    African Journals Online (AJOL)

    with literary texts written in indigenous South African languages. The project ... Homi Bhabha uses the words of Salman Rushdie to underline the fact that new .... I could not conceptualise an African-language-to-African-language dictionary. An.

  12. Actual Arabic loan-words of religious content (on the material of modern foreign words

    Directory of Open Access Journals (Sweden)

    Al Shammari Majid Jamil Ashur

    2015-03-01

    Full Text Available Applying a thematic classification of the actual vocabulary as a whole to the formation of loan words allows one to see the uniqueness of separate groups of the vocabulary. English loan words prevail in the spheres of economy, science and technology, while loan words from Arabic dominate the religious vocabulary. Application of the field approach to the analysis of actual religious Arabisms revealed both nuclear and peripheral components of the field. At the core of the field there are such Arabisms as Allah and Islam, which can be characterized as key words. However, despite these unifying features, the words differ in a number of parameters. The word Allah has zero derivation productivity and in lexicographical description (as opposed to its functioning in the language of the media) is free of connotations. The Arabism Islam, by contrast, has a high derivation productivity, and its derived words can express evaluation. Lexicographic description of the Arabism Islam is also quite diverse stylistically and in content. The core of the field “Muslim religion” also includes a number of words fixed in most modern dictionaries of foreign words. At the periphery of the field there are Arabisms that do not have high levels of frequency, but that at the same time serve as an indicator of the dominance of religious content among topical Arabisms.

  13. Error-Free Text Typing Performance of an Inductive Intra-Oral Tongue Computer Interface for Severely Disabled Individuals.

    Science.gov (United States)

    Andreasen Struijk, Lotte N S; Bentsen, Bo; Gaihede, Michael; Lontis, Eugen R

    2017-11-01

    For severely paralyzed individuals, alternative computer interfaces are becoming increasingly essential for everyday life as social and vocational activities are facilitated by information technology and as the environment becomes more automatic and remotely controllable. Tongue computer interfaces have proven to be desirable by the users partly due to their high degree of aesthetic acceptability, but so far the mature systems have shown a relatively low error-free text typing efficiency. This paper evaluated the intra-oral inductive tongue computer interface (ITCI) in its intended use: Error-free text typing in a generally available text editing system, Word. Individuals with tetraplegia and able bodied individuals used the ITCI for typing using a MATLAB interface and for Word typing for 4 to 5 experimental days, and the results showed an average error-free text typing rate in Word of 11.6 correct characters/min across all participants and of 15.5 correct characters/min for participants familiar with tongue piercings. Improvements in typing rates between the sessions suggest that typing rates can be improved further through long-term use of the ITCI.

  14. Connected text reading and differences in text reading fluency in adult readers.

    Directory of Open Access Journals (Sweden)

    Sebastian Wallot

    Full Text Available The process of connected text reading has received very little attention in contemporary cognitive psychology. This lack of attention is in part due to a research tradition that emphasizes the role of basic lexical constituents, which can be studied in isolated words or sentences. However, this lack of attention is in part also due to the lack of statistical analysis techniques that accommodate interdependent time series. In this study, we investigate text reading performance with traditional and nonlinear analysis techniques and show how outcomes from multiple analyses can be used to create a more detailed picture of the process of text reading. Specifically, we investigate reading performance of groups of literate adult readers that differ in reading fluency during a self-paced text reading task. Our results indicate that classical metrics of reading (such as word frequency) do not capture text reading very well, and that classical measures of reading fluency (such as average reading time) distinguish relatively poorly between participant groups. Nonlinear analyses of distribution tails and reading time fluctuations provide more fine-grained information about the reading process and reading fluency.

  15. Grounding word learning in space.

    Directory of Open Access Journals (Sweden)

    Larissa K Samuelson

    Full Text Available Humans and objects, and thus social interactions about objects, exist within space. Words direct listeners' attention to specific regions of space. Thus, a strong correspondence exists between where one looks, one's bodily orientation, and what one sees. This leads to further correspondence with what one remembers. Here, we present data suggesting that children use associations between space and objects and space and words to link words and objects--space binds labels to their referents. We tested this claim in four experiments, showing that the spatial consistency of where objects are presented affects children's word learning. Next, we demonstrate that a process model that grounds word learning in the known neural dynamics of spatial attention, spatial memory, and associative learning can capture the suite of results reported here. This model also predicts that space is special, a prediction supported in a fifth experiment that shows children do not use color as a cue to bind words and objects. In a final experiment, we ask whether spatial consistency affects word learning in naturalistic word learning contexts. Children of parents who spontaneously keep objects in a consistent spatial location during naming interactions learn words more effectively. Together, the model and data show that space is a powerful tool that can effectively ground word learning in social contexts.

  16. Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy

    Directory of Open Access Journals (Sweden)

    Hakenberg Jörg

    2009-01-01

    Full Text Available Abstract Background Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge to automated methods, which identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Another useful source of information for disambiguation is metadata. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively. Results The 'Closest Sense' method assumes that the ontology defines multiple senses of the term. It computes the shortest path of co-occurring terms in the document to one of these senses. The 'Term Cooc' method defines a log-odds ratio for co-occurring terms including co-occurrences inferred from the ontology structure. The 'MetaData' approach trains a classifier on metadata. It does not require any ontology, but requires training data, which the other methods do not. To evaluate these approaches we defined a manually curated training corpus of 2600 documents for seven ambiguous terms from the Gene Ontology and MeSH. All approaches over all conditions achieve 80% success rate on average. The 'MetaData' approach performed best with 96%, when trained on high-quality data. Its performance deteriorates as quality of the training data decreases. The 'Term Cooc' approach performs better on Gene Ontology (92% success) than on MeSH (73% success) as MeSH is not a strict is-a/part-of, but rather a loose is-related-to hierarchy. The 'Closest Sense' approach achieves on average 80% success rate. Conclusion Metadata is valuable for disambiguation, but requires high quality training data. Closest Sense requires no training, but a large, consistently modelled ontology, which are two opposing conditions. Term Cooc achieves greater 90
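    To make the idea of a log-odds ratio over co-occurring terms concrete, here is a minimal sketch of a two-sense scorer. It is not the paper's 'Term Cooc' formula, which is not given in the abstract and which also uses co-occurrences inferred from the ontology structure; the add-one smoothing, the function names, and the toy sense labels are assumptions chosen only to illustrate the general technique.

```python
import math
from collections import Counter


def count_terms(docs_by_sense):
    """Map each sense label to a Counter of terms seen in its example documents."""
    return {
        sense: Counter(term for doc in docs for term in doc)
        for sense, docs in docs_by_sense.items()
    }


def log_odds(term, counts, sense_a, sense_b):
    """Smoothed log-odds that `term` co-occurs with sense_a rather than sense_b."""
    vocab_size = len(set(counts[sense_a]) | set(counts[sense_b]))
    # Add-one smoothing (an assumption) keeps unseen terms from producing log(0).
    p_a = (counts[sense_a][term] + 1) / (sum(counts[sense_a].values()) + vocab_size)
    p_b = (counts[sense_b][term] + 1) / (sum(counts[sense_b].values()) + vocab_size)
    return math.log(p_a / p_b)


def disambiguate(context_terms, counts, sense_a, sense_b):
    """Pick the sense whose co-occurrence profile best matches the context."""
    score = sum(log_odds(t, counts, sense_a, sense_b) for t in context_terms)
    return sense_a if score >= 0 else sense_b


# Toy training data for one ambiguous label (hypothetical, not from the paper).
counts = count_terms({
    "biology": [["embryo", "cell", "growth"], ["cell", "tissue"]],
    "software": [["code", "release", "bug"], ["code", "team"]],
})
```

    With these toy counts, a context containing "cell" and "embryo" scores positive and resolves to the "biology" sense, while a context containing only "code" resolves to "software".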

  17. A Word Count of Modern Arabic Prose.

    Science.gov (United States)

    Landau, Jacob M.

    This book presents a word count of Arabic prose based on 60 twentieth-century Egyptian books. The text is divided into an alphabetical list and a word frequency list. This word count is intended as an aid in the: (1) writing of primers and the compilation of graded readers, (2) examination of the vocabulary selection of primers and readers…

  18. Creating a medical dictionary using word alignment: the influence of sources and resources.

    Science.gov (United States)

    Nyström, Mikael; Merkel, Magnus; Petersson, Håkan; Ahlfeldt, Hans

    2007-11-23

    Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality. We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which part of the terminology systems that we used to generate the resources, which parts that we word aligned and which types of resources we used in the alignment process to explore the influence the different terminology systems and resources have on the recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base

  19. When a Text Is Translated Does the Complexity of Its Vocabulary Change? Translations and Target Readerships

    Science.gov (United States)

    Rêgo, Hênio Henrique Aragão; Braunstein, Lidia A.; D′Agostino, Gregorio; Stanley, H. Eugene; Miyazima, Sasuke

    2014-01-01

    In linguistic studies, the academic level of the vocabulary in a text can be described in terms of statistical physics by using a “temperature” concept related to the text's word-frequency distribution. We propose a “comparative thermo-linguistic” technique to analyze the vocabulary of a text to determine its academic level and its target readership in any given language. We apply this technique to a large number of books by several authors and examine how the vocabulary of a text changes when it is translated from one language to another. Unlike the uniform results produced using the Zipf law, using our “word energy” distribution technique we find variations in the power-law behavior. We also examine some common features that span across languages and identify some intriguing questions concerning how to determine when a text is suitable for its intended readership. PMID:25353343

  20. Monolingual accounting dictionaries for EFL text production

    Directory of Open Access Journals (Sweden)

    Sandro Nielsen

    2006-10-01

    Full Text Available Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items that deal with these aspects are necessary for the international user group as they produce subject-field specific and register-specific texts in a foreign language, and the data items are relevant for the various stages in text production: draft writing, copyediting, stylistic editing and proofreading.

  1. The interaction of short-term and long-term memory in phonetic category formation

    Science.gov (United States)

    Harnsberger, James D.

    2002-05-01

    This study examined the role that short-term memory capacity plays in the relationship between novel stimuli (e.g., non-native speech sounds, native nonsense words) and phonetic categories in long-term memory. Thirty native speakers of American English were administered five tests: categorial AXB discrimination using nasal consonants from Malayalam; categorial identification, also using Malayalam nasals, which measured the influence of phonetic categories in long-term memory; digit span; nonword span, a short-term memory measure mediated by phonetic categories in long-term memory; and paired-associate word learning (word-word and word-nonword pairs). The results showed that almost all measures were significantly correlated with one another. The strongest predictors for the discrimination and word-nonword learning results were nonword span (r=+0.62) and digit span (r=+0.51), respectively. When the identification test results were partialed out, only nonword span significantly correlated with discrimination. The results show a strong influence of short-term memory capacity on the encoding of phonetic detail within phonetic categories and suggest that long-term memory representations regulate the capacity of short-term memory to preserve information for subsequent encoding. The results of this study will also be discussed with regard to resolving the tension between episodic and abstract models of phonetic category structure.

  2. Word Sense Disambiguation with LSTM : Do We Really Need 100 Billion Words?

    NARCIS (Netherlands)

    Le, Minh; Postma, Marten; Urbani, Jacopo

    2017-01-01

    Recently, Yuan et al. (2016) have shown the effectiveness of using Long Short-Term Memory (LSTM) for performing Word Sense Disambiguation (WSD). Their proposed technique outperformed the previous state-of-the-art with several benchmarks, but neither the training data nor the source code was

  3. Word Processors: A Look at Four Popular Programs.

    Science.gov (United States)

    Press, Larry

    1980-01-01

    Described are types of programs used for processing text (editors, print formatters, and word processors), followed by a comparison of four word-processing packages: Auto Scribe, Electric Pencil, Magic Wand and Word Star. With the exception of Auto Scribe, all programs reviewed are CP/M versions. (KC)

  4. Phonological short-term memory impairment and the word length effect in children with intellectual disabilities.

    Science.gov (United States)

    Poloczek, Sebastian; Büttner, Gerhard; Hasselhorn, Marcus

    2014-02-01

    There is mounting evidence that children and adolescents with intellectual disabilities (ID) of nonspecific aetiology perform poorer on phonological short-term memory tasks than children matched for mental age indicating a structural deficit in a process contributing to short-term recall of verbal material. One explanation is that children with ID of nonspecific aetiology do not activate subvocal rehearsal to refresh degrading memory traces. However, existing research concerning this explanation is inconclusive since studies focussing on the word length effect (WLE) as indicator of rehearsal have revealed inconsistent results for samples with ID and because in several existing studies, it is unclear whether the WLE was caused by rehearsal or merely appeared during output of the responses. We assumed that in children with ID only output delays produce a small WLE while in typically developing 6- to 8-year-olds rehearsal and output contribute to the WLE. From this assumption we derived several predictions that were tested in an experiment including 34 children with mild or borderline ID and 34 typically developing children matched for mental age (MA). As predicted, results revealed a small but significant WLE for children with ID that was significantly smaller than the WLE in the control group. Additionally, for children with ID, a WLE was not found for the first word of each trial but the effect emerged only in later serial positions. The findings corroborate the notion that in children with ID subvocal rehearsal does not develop in line with their mental age and provide a potential explanation for the inconsistent results on the WLE in children with ID. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. The Relationships among Cognitive Correlates and Irregular Word, Non-Word, and Word Reading

    Science.gov (United States)

    Abu-Hamour, Bashir; University, Mu'tah; Urso, Annmarie; Mather, Nancy

    2012-01-01

    This study explored four hypotheses: (a) the relationships among rapid automatized naming (RAN) and processing speed (PS) to irregular word, non-word, and word reading; (b) the predictive power of various RAN and PS measures, (c) the cognitive correlates that best predicted irregular word, non-word, and word reading, and (d) reading performance of…

  6. Effects of lexical competition on immediate memory span for spoken words.

    Science.gov (United States)

    Goh, Winston D; Pisoni, David B

    2003-08-01

    Current theories and models of the structural organization of verbal short-term memory are primarily based on evidence obtained from manipulations of features inherent in the short-term traces of the presented stimuli, such as phonological similarity. In the present study, we investigated whether properties of the stimuli that are not inherent in the short-term traces of spoken words would affect performance in an immediate memory span task. We studied the lexical neighbourhood properties of the stimulus items, which are based on the structure and organization of words in the mental lexicon. The experiments manipulated lexical competition by varying the phonological neighbourhood structure (i.e., neighbourhood density and neighbourhood frequency) of the words on a test list while controlling for word frequency and intra-set phonological similarity (family size). Immediate memory span for spoken words was measured under repeated and nonrepeated sampling procedures. The results demonstrated that lexical competition only emerged when a nonrepeated sampling procedure was used and the participants had to access new words from their lexicons. These findings were not dependent on individual differences in short-term memory capacity. Additional results showed that the lexical competition effects did not interact with proactive interference. Analyses of error patterns indicated that item-type errors, but not positional errors, were influenced by the lexical attributes of the stimulus items. These results complement and extend previous findings that have argued for separate contributions of long-term knowledge and short-term memory rehearsal processes in immediate verbal serial recall tasks.

  7. Beyond word recognition: understanding pediatric oral health literacy.

    Science.gov (United States)

    Richman, Julia Anne; Huebner, Colleen E; Leggott, Penelope J; Mouradian, Wendy E; Mancl, Lloyd A

    2011-01-01

    Parental oral health literacy is proposed to be an indicator of children's oral health. The purpose of this study was to test if word recognition, commonly used to assess health literacy, is an adequate measure of pediatric oral health literacy. This study evaluated 3 aspects of oral health literacy and parent-reported child oral health. A 3-part pediatric oral health literacy inventory was created to assess parents' word recognition, vocabulary knowledge, and comprehension of 35 terms used in pediatric dentistry. The inventory was administered to 45 English-speaking parents of children enrolled in Head Start. Parents' ability to read dental terms was not associated with vocabulary knowledge (r=0.29, P.06) of the terms. Vocabulary knowledge was strongly associated with comprehension (r=0.80, P…). Parent-reported child oral health status was not associated with word recognition, vocabulary knowledge, or comprehension; however, parents reporting either excellent or fair/poor ratings had higher scores on all components of the inventory. Word recognition is an inadequate indicator of comprehension of pediatric oral health concepts; pediatric oral health literacy is a multifaceted construct. Parents with adequate reading ability may have difficulty understanding oral health information.

  8. 1001 most useful French words

    CERN Document Server

    McCoy, Heather

    2012-01-01

    Up-to-date entries cover technology terms, and sections on vocabulary and grammar offer helpful tips. Each word is accompanied by a brief definition, a sentence demonstrating proper usage, and a translation.

  9. JAVANESE CULTURAL WORDS IN LOCAL NEWSPAPERS IN CENTRAL JAVA AS A LANGUAGE MAINTENANCE MODEL

    Directory of Open Access Journals (Sweden)

    Deli Nirmala

    2016-04-01

    Full Text Available Javanese cultural words are the linguistic units which are very specific to Javanese culture and society. This article aims at describing which Javanese cultural words are found in the local newspapers, what they represent, and why they are used in the local newspapers in Central Java. Non-participant observation is used to collect the data for analysis, followed by page-filing and note-taking techniques. Referential, reflective-introspective, and abductive inferential methods are used to analyze the data. The result indicates that the Javanese cultural words found in the local newspapers represent festivals, rituals, Javanese ways of life, social activities, actions, feelings, thoughts, behavior, and experiences. The words are indicators that the journalists of the local newspapers in Central Java have positive attitudes toward Javanese words. This becomes a model for language maintenance of Javanese. This implies that the words are stored in long-term memory, where they become mental images that are used when needed by the user for communication. The existence of the concepts residing in the mind will make Javanese language maintenance possible, which is supported by the attitudes of the Javanese.

  10. Indonesian Text-To-Speech System Using Diphone Concatenative Synthesis

    Directory of Open Access Journals (Sweden)

    Sutarman

    2015-02-01

    Full Text Available In this paper, we describe the design and development of an Indonesian diphone database for synthesis, in which segments of recorded speech are used to convert text to speech and save it as an audio file such as WAV or MP3. Designing and developing the Indonesian diphone database involved several steps. First, the diphone database was developed: a list of sample words containing the diphones was created, giving priority to words in which the diphone occurs in the middle and otherwise at the beginning or end; the sample words were recorded and segmented; and the diphones were created with the tool Diphone Studio 1.3. Second, the system was developed using Microsoft Visual Delphi 6.0, including the conversion of input numbers, acronyms, words, and sentences into diphone representations. There are two kinds of conversion (processes) in the Indonesian text-to-speech system: one converts the text to be sounded into phonemes, and the other converts the phonemes into speech. The method used in this research is called diphone concatenative synthesis, in which recorded sound segments are collected. Every segment consists of a diphone (2 phonemes). This synthesizer may produce voice with a high level of naturalness. The Indonesian text-to-speech system can differentiate special phonemes, as in ‘Beda’ and ‘Bedak’, but samples of other specific words need to be added to the system. This Indonesian TTS system can handle texts with abbreviations, and there is a facility to add such words.
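    The step of turning a phoneme sequence into diphone units can be sketched as follows. This is only an illustration of the general diphone idea, not the system described in the abstract: the function name, the "_" silence marker, and the one-letter-per-phoneme simplification are assumptions, and a real Indonesian front end would also need to handle digraphs such as "ng" and "ny" before this step.

```python
def to_diphones(phonemes, silence="_"):
    """Return the diphone units (adjacent phoneme pairs) covering an utterance.

    The sequence is padded with a silence marker so that the first and
    last phonemes each get a transition unit as well.
    """
    padded = [silence] + list(phonemes) + [silence]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# Simplifying assumption: each letter stands for exactly one phoneme.
beda = to_diphones("beda")
bedak = to_diphones("bedak")
```

    Under these assumptions the two words share their first four units and differ only at the end ("a-_" versus "a-k" and "k-_"), which is where a concatenative synthesizer would select different recorded segments.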

  11. Combinatorics of compositions and words

    CERN Document Server

    Heubach, Silvia

    2009-01-01

    A One-Stop Source of Known Results, a Bibliography of Papers on the Subject, and Novel Research Directions Focusing on a very active area of research in the last decade, Combinatorics of Compositions and Words provides an introduction to the methods used in the combinatorics of pattern avoidance and pattern enumeration in compositions and words. It also presents various tools and approaches that are applicable to other areas of enumerative combinatorics. After a historical perspective on research in the area, the text introduces techniques to solve recurrence relations, including iteration and generating functions. It then focuses on enumeration of basic statistics for compositions. The text goes on to present results on pattern avoidance for subword, subsequence, and generalized patterns in compositions and then applies these results to words. The authors also cover automata, the ECO method, generating trees, and asymptotic results via random compositions and complex analysis. Highlighting both established a...

  12. Automatic Prompt System in the Process of Mapping plWordNet on Princeton WordNet

    Directory of Open Access Journals (Sweden)

    Paweł Kędzia

    2015-06-01

    Full Text Available The paper offers a critical evaluation of the power and usefulness of an automatic prompt system based on the extended Relaxation Labelling algorithm in the process of (manual) mapping plWordNet on Princeton WordNet. To this end the results of manual mapping – that is inter-lingual relations between plWN and PWN synsets – are juxtaposed with the automatic prompts that were generated for the source language synsets to be mapped. We check the number and type of inter-lingual relations introduced on the basis of automatic prompts and the distance of the respective prompt synsets from the actual target language synsets.

  13. Teach yourself visually Word 2013

    CERN Document Server

    Marmel, Elaine

    2013-01-01

    Get up to speed on the newest version of Word with visual instruction Microsoft Word is the standard for word processing programs, and the newest version offers additional functionality you'll want to use. Get up to speed quickly and easily with the step-by-step instructions and full-color screen shots in this popular guide! You'll see how to perform dozens of tasks, including how to set up and format documents and text; work with diagrams, charts, and pictures; use Mail Merge; post documents online; and much more. Easy-to-follow, two-page lessons make learning a snap.Full-

  14. Fixed versus dynamic co-occurrence windows in TextRank term weights for information retrieval

    DEFF Research Database (Denmark)

    Lu, Wei; Cheng, Qikai; Lioma, Christina

    2012-01-01

    iteratively is a score for each vertex, i.e. a term weight, that can be used for information retrieval (IR) just like conventional term frequency based term weights. So far, when computing TextRank term weights over co-occurrence graphs, the window of term co-occurrence is always fixed. This work departs from...
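
    The fixed-window setting mentioned in this record can be sketched in Python as follows: terms that co-occur within a fixed window are linked in an undirected graph, and a PageRank-style score is then computed iteratively for each vertex. The function names, the damping factor of 0.85, and the iteration count are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link terms that appear within `window` consecutive tokens.

    With window=2 only adjacent tokens are linked.
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Iteratively compute an unweighted PageRank-style score per vertex."""
    scores = {vertex: 1.0 for vertex in graph}
    for _ in range(iterations):
        updated = {}
        for vertex in graph:
            # Each neighbour shares its score equally among its own links.
            incoming = sum(scores[n] / len(graph[n]) for n in graph[vertex])
            updated[vertex] = (1 - damping) + damping * incoming
        scores = updated
    return scores


tokens = "term weights rank term cooccurrence graphs".split()
weights = textrank(cooccurrence_graph(tokens, window=2))
print(sorted(weights, key=weights.get, reverse=True))
```

    Changing `window` changes which edges exist and therefore the resulting term weights, which is the parameter the record says is normally held fixed.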

  15. Visual word learning in adults with dyslexia

    Directory of Open Access Journals (Sweden)

    Rosa Kit Wan Kwok

    2014-05-01

    Full Text Available We investigated word learning in university and college students with a diagnosis of dyslexia and in typically-reading controls. Participants read aloud short (4-letter) and longer (7-letter) nonwords as quickly as possible. The nonwords were repeated across 10 blocks, using a different random order in each block. Participants returned 7 days later and repeated the experiment. Accuracy was high in both groups. The dyslexics were substantially slower than the controls at reading the nonwords throughout the experiment. They also showed a larger length effect, indicating less effective decoding skills. Learning was demonstrated by faster reading of the nonwords across repeated presentations and by a reduction in the difference in reading speeds between shorter and longer nonwords. The dyslexics required more presentations of the nonwords before the length effect became non-significant, only showing convergence in reaction times between shorter and longer items in the second testing session where controls achieved convergence part-way through the first session. Participants also completed a psychological test battery assessing reading and spelling, vocabulary, phonological awareness, working memory, nonverbal ability and motor speed. The dyslexics performed at a similar level to the controls on nonverbal ability but significantly less well on all the other measures. Regression analyses found that decoding ability, measured as the speed of reading aloud nonwords when they were presented for the first time, was predicted by a composite of word reading and spelling scores (‘literacy’). Word learning was assessed in terms of the improvement in naming speeds over 10 blocks of training. Learning was predicted by vocabulary and working memory scores, but not by literacy, phonological awareness, nonverbal ability or motor speed. The results show that young dyslexic adults have problems both in pronouncing novel words and in learning new written words.

  16. A Network Text Analysis of David Ayer’s Fury

    Directory of Open Access Journals (Sweden)

    Starling David Hunter

    2015-12-01

    Full Text Available Network Text Analysis (NTA) involves the creation of networks of words and/or concepts from linguistic data. Its key insight is that the position of words and concepts in a text network provides vital clues to the central and underlying themes of the text as a whole. Recent research has relied on inductive approaches to identify these themes. In this study we demonstrate a deductive approach that we apply to the screenplay of the 2014 World War II-era film Fury. Specifically, we first use genre expectations theory to establish prior expectations as to the key themes associated with war films. We then empirically test whether words and concepts associated with the most influentially-positioned nodes are consistent with themes common to the war-film genre. As predicted, we find that words and concepts associated with the least constrained nodes in the text network were significantly more likely to be associated with the war, action, and biography genres and significantly less likely to be associated with the mystery, science-fiction, fantasy, and film-noir genres. Keywords: content analysis, text analysis, network text analysis, semantic network analysis, film studies, screenplay, screenwriting, war movies, World War II, tanks

  17. Word add-in for ontology recognition: semantic enrichment of scientific literature

    Directory of Open Access Journals (Sweden)

    Naim Oscar

    2010-02-01

    Full Text Available Abstract Background In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. Results The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. Conclusions The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata.

  18. Sub-word based Arabic handwriting analysis for writer identification

    Science.gov (United States)

    Maliki, Makki; Al-Jawad, Naseer; Jassim, Sabah

    2013-05-01

    Analysing a text or part of it is key to handwriting identification. Generally, handwriting is learnt over time and people develop habits in the style of writing. These habits are embedded in special parts of handwritten text. In Arabic each word consists of one or more sub-word(s). The end of each sub-word is considered to be a connect stroke. The main hypothesis in this paper is that sub-words are an essential reflection of an Arabic writer's habits that could be exploited for writer identification. This hypothesis is tested through experiments that evaluate writer identification, mainly using K nearest neighbor on groups of sub-words extracted from longer text. The experimental results show that a group of sub-words can be used to identify the writer with a success rate between 52.94% and 82.35% when top-1 is used, rising to 100% when top-5 is used, based on K nearest neighbor. The results show that the majority of writers are identified using 7 sub-words with a reliability confidence of about 90% (i.e. 90% of the rejected templates have significantly larger distances to the tested example than the distance from the correctly identified template). However, previous work using complete words shows a success rate of at most 90% in the top 10.

  19. WORD OF MOUTH ON SOCIAL MEDIA

    Directory of Open Access Journals (Sweden)

    Ana Raluca CHIOSA

    2014-12-01

    Full Text Available Through access to information, the Internet has transformed people's lifestyles, their preferences for products, and how they relate to brands. Perceived as an open space without limitation, social media has become the main channel for the expression of word-of-mouth (WOM), with both positive and negative effects. Thus the Internet has allowed the development of WOM, making it contemporary in our technological world. This paper examines the motives for adopting WOM behavior, forms of WOM, the WOM model and principles, and directions of WOM research. Brand engagement has made consumers more powerful in terms of their requirements and evaluation of products and brands, and more demanding and impatient in brand communication and market response.

  20. Word Variant Identification in Old French

    Directory of Open Access Journals (Sweden)

    Peter Willett

    1997-01-01

    Full Text Available Increasing numbers of historical texts are available in machine-readable form that retains the original spelling, which can be very different from the modern-day equivalents due to the natural evolution of a language, and because the concept of standardisation in spelling is comparatively modern. Among medieval vernacular writers, the same word could be spelled in different ways and the same author (or scribe) might even use several alternative spellings in the same passage. Thus, we do not know, a priori, how many variant forms of a particular word there are in such texts, let alone what these variants might be. Searching on the modern equivalent, or even the commonest historical variant, of a particular word may thus fail to retrieve an appreciable number of occurrences unless the searcher already has an extensive knowledge of the language of the documents. Moreover, even specialist scholars may be unaware of some idiosyncratic variants. Here, we consider the use of computer methods to retrieve variant historical spellings.

  1. LINGUISTIC DATABASE FOR AUTOMATIC GENERATION SYSTEM OF ENGLISH ADVERTISING TEXTS

    Directory of Open Access Journals (Sweden)

    N. A. Metlitskaya

    2017-01-01

    Full Text Available The article deals with the linguistic database for the system of automatic generation of English advertising texts on cosmetics and perfumery. The database for such a system includes two main blocks: an automatic dictionary (that contains semantic and morphological information for each word), and semantic-syntactical formulas of the texts in a special formal language, SEMSINT. The database is built on the result of the analysis of 30 English advertising texts on cosmetics and perfumery. First, each word was given a unique code. For example, N stands for nouns, A for adjectives, V for verbs, etc. Then all the lexicon of the analyzed texts was distributed into different semantic categories. According to this semantic classification each word was given a special semantic code. For example, the record N01 that is attributed to the word «lip» in the dictionary means that this word refers to nouns of the semantic category «part of a human’s body». The second block of the database includes the semantic-syntactical formulas of the analyzed advertising texts written in the special formal language SEMSINT. The author gives a brief description of this language, presenting its essence and structure. Also, an example of one formalized advertising text in SEMSINT is provided.

  2. Short Vowels Versus Word Familiarity in the Reading Comprehension of Arab Readers: A Revisited Issue

    Directory of Open Access Journals (Sweden)

    Abdullah M. SERAYE

    2016-03-01

    Full Text Available Arab readers, both beginning and advanced, are encouraged to read and accustomed to unvowelized and undiacriticized texts. Previous literature claimed that the presence of short vowels in the text would facilitate the reading comprehension of both beginning and advanced Arab readers. However, with a claimed strict controlling procedure, different results emerged, revealing that the only variable that affected the reading process of Arab adult skilled readers was word frequency, and its effect was limited to the time load of the reading process; this result raised the question of whether the neutral role of short vowels in the text reading process of experienced Arab readers would be maintained for less experienced readers, as represented by fourth graders, or whether word frequency would be the only variable that plays a role in their reading process. In experiment 1, 141 fourth-grade students were randomly assigned to 5 reading conditions: plain, only shaddah, short vowels plus shaddah, only short vowels, and finally the wrong short vowels plus shaddah. In experiment 2, 38 participants from the same population were assigned to a fully vowelized and diacriticized reading condition. Each participant was asked to read two texts, of high and low frequency words and then given recall and multiple-choice tests. In general, the multivariate analysis showed that the only manipulated variable that was found to affect their reading process in terms of reading time load and, to some degree, reading comprehension was word frequency, although its effect was marginal. Accordingly, pedagogical recommendations and future research were proposed.

  3. Short vowels versus word familiarity in the reading comprehension of arab readers: A revisited issue

    Directory of Open Access Journals (Sweden)

    Abdullah M. Seraye

    2016-03-01

    Full Text Available Arab readers, both beginning and advanced, are encouraged to read and accustomed to unvowelized and undiacriticized texts. Previous literature claimed that the presence of short vowels in the text would facilitate the reading comprehension of both beginning and advanced Arab readers. However, with a claimed strict controlling procedure, different results emerged, revealing that the only variable that affected the reading process of Arab adult skilled readers was word frequency, and its effect was limited to the time load of the reading process; this result raised the question of whether the neutral role of short vowels in the text reading process of experienced Arab readers would be maintained for less experienced readers, as represented by fourth graders, or whether word frequency would be the only variable that plays a role in their reading process. In experiment 1, 141 fourth-grade students were randomly assigned to 5 reading conditions: plain, only shaddah, short vowels plus shaddah, only short vowels, and finally the wrong short vowels plus shaddah. In experiment 2, 38 participants from the same population were assigned to a fully vowelized and diacriticized reading condition. Each participant was asked to read two texts, of high and low frequency words and then given recall and multiple-choice tests. In general, the multivariate analysis showed that the only manipulated variable that was found to affect their reading process in terms of reading time load and, to some degree, reading comprehension was word frequency, although its effect was marginal. Accordingly, pedagogical recommendations and future research were proposed.

  4. Word translation entropy in translation

    DEFF Research Database (Denmark)

    Schaeffer, Moritz; Dragsted, Barbara; Hvelplund, Kristian Tangsgaard

    2016-01-01

    This study reports on an investigation into the relationship between the number of translation alternatives for a single word and eye movements on the source text. In addition, the effect of word order differences between source and target text on eye movements on the source text is studied. ... In particular, the current study investigates the effect of these variables on early and late eye movement measures. Early eye movement measures are indicative of processes that are more automatic while late measures are more indicative of conscious processing. Most studies that found evidence of target ... language activation during source text reading in translation, i.e. co-activation of the two linguistic systems, employed late eye movement measures or reaction times. The current study therefore aims to investigate if and to what extent earlier eye movement measures in reading for translation show...
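
    The record's title refers to word translation entropy, but the excerpt does not define it. A minimal Python sketch, assuming it is the Shannon entropy of the translation choices observed for one source word across several translators, could look like this; the function name and the sample translations are hypothetical.

```python
import math
from collections import Counter


def translation_entropy(translations):
    """Shannon entropy (in bits) of the observed translation alternatives.

    `translations` lists the target-language rendering chosen by each
    translator for the same source word (hypothetical input format).
    """
    counts = Counter(translations)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Every translator agrees: a single alternative, so entropy is zero.
print(translation_entropy(["hus", "hus", "hus", "hus"]))

# Four translators, four different renderings: maximal disagreement.
print(translation_entropy(["hus", "bolig", "hjem", "bygning"]))
```

    Under this assumption, a word with more, and more evenly used, translation alternatives receives a higher value than a word that everyone translates the same way.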

  5. Age-Dependent Positivity-Bias in Children’s Processing of Emotion Terms

    Directory of Open Access Journals (Sweden)

    Daniela Bahn

    2017-07-01

    Full Text Available Emotions play an important role in human communication, and the daily-life interactions of young children often include situations that require the verbalization of emotional states with verbal means, e.g., with emotion terms. Through them, one can express own emotional states and those of others. Thus, the acquisition of emotion terms allows children to participate more intensively in social contexts – a basic requirement for learning new words and for elaborating socio-emotional skills. However, little is known about how children acquire and process this specific word category, which is positioned between concrete and abstract words. In particular, the influence of valence on emotion word processing during childhood has not been sufficiently investigated. Previous research points to an advantage of positive words over negative and neutral words in word processing. While previous studies found valence effects to be influenced by factors such as arousal, frequency, concreteness, and task, it is still unclear if and how valence effects are also modified by age. The present study compares the performance of children aged from 5 to 12 years and adults in two experimental tasks: lexical decision (word or pseudoword) and emotional categorization (positive or negative). Stimuli consisted of 48 German emotion terms (24 positive and 24 negative) matched for arousal, concreteness, age of acquisition, word class, word length, morphological complexity, frequency, and neighborhood density. Results from both tasks reveal two developmental trends: First, with increasing age children responded faster and more correctly, suggesting that emotion vocabulary gradually becomes more stable and differentiated during middle childhood. Second, the influence of valence varied with age: younger children (5- and 6-year-olds) showed significantly higher performance levels for positive emotion terms compared to negative emotion terms, whereas older children and adults did not

  6. The Language Environment and Syntactic Word-Class Acquisition.

    Science.gov (United States)

    Zavrel, Jakub; Veenstra, Jorn

    A study analyzed the distribution of words in a three-million-word corpus of text from the "Wall Street Journal," in order to test a theory of the acquisition of word categories. The theory, an alternative to the semantic bootstrapping hypothesis, proposes that the child exploits multiple sources of cues (distributional, semantic, or…

  7. The problem of polysemy in the first thousand words of the General Service List: A corpus study of secondary chemistry texts

    Science.gov (United States)

    Clemmons, Karina

    Vocabulary in a second language is an indispensable building block of all comprehension (Folse, 2006; Nation, 2006). Teachers in content area classes such as science, math, and social studies frequently teach content specific vocabulary, but are not aware of the obstacles that can occur when students do not know the basic words. Word lists such as the General Service List (GSL) were created to assist students and teachers (West, 1953). The GSL does not adequately take into account the high level of polysemy of many common English words, nor has it been updated by genre to reflect specific content domains encountered by secondary science students in today's high stakes classes such as chemistry. This study examines how many words of the first 1000 words of the GSL occurred in the secondary chemistry textbooks sampled, how often the first 1000 words of the GSL were polysemous, and specifically which multiple meanings occurred. A discussion of results includes word tables that list multiple meanings present, example phrases that illustrate the context surrounding the target words, suggestions for a GSL that is genre specific to secondary chemistry textbooks and that is ranked by meaning as well as type, and implications for both vocabulary materials and classroom instruction for ELLs in secondary chemistry classes. Findings are essential to second language (L2) researchers, materials developers, publishers, and teachers.

  8. Maximum Entropy, Word-Frequency, Chinese Characters, and Multiple Meanings

    Science.gov (United States)

    Yan, Xiaoyong; Minnhagen, Petter

    2015-01-01

    The word-frequency distribution of a text written by an author is well accounted for by a maximum entropy distribution, the RGF (random group formation)-prediction. The RGF-distribution is completely determined by the a priori values of the total number of words in the text (M), the number of distinct words (N) and the number of repetitions of the most common word (kmax). It is here shown that this maximum entropy prediction also describes a text written in Chinese characters. In particular it is shown that although the same Chinese text written in words and Chinese characters have quite differently shaped distributions, they are nevertheless both well predicted by their respective three a priori characteristic values. It is pointed out that this is analogous to the change in the shape of the distribution when translating a given text to another language. Another consequence of the RGF-prediction is that taking a part of a long text will change the input parameters (M, N, kmax) and consequently also the shape of the frequency distribution. This is explicitly confirmed for texts written in Chinese characters. Since the RGF-prediction has no system-specific information beyond the three a priori values (M, N, kmax), any specific language characteristic has to be sought in systematic deviations from the RGF-prediction and the measured frequencies. One such systematic deviation is identified and, through a statistical information theoretical argument and an extended RGF-model, it is proposed that this deviation is caused by multiple meanings of Chinese characters. The effect is stronger for Chinese characters than for Chinese words. The relation between Zipf’s law, the Simon-model for texts and the present results are discussed. PMID:25955175
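
    The three a priori values named in this record (M, N, kmax) are simple to obtain from a tokenized text. The Python sketch below computes them and shows that taking only part of a text changes all three; the function name and the toy sentence are assumptions for illustration, and the RGF prediction itself is not implemented here.

```python
from collections import Counter


def rgf_inputs(tokens):
    """Return (M, N, k_max) for a list of word or character tokens.

    M     = total number of tokens in the text
    N     = number of distinct tokens
    k_max = number of occurrences of the most common token
    """
    counts = Counter(tokens)
    k_max = max(counts.values(), default=0)
    return len(tokens), len(counts), k_max


tokens = "the cat and the dog and the bird".split()
print(rgf_inputs(tokens))      # (8, 5, 3)
print(rgf_inputs(tokens[:4]))  # (4, 3, 2)
```

    Passing a list of single characters instead of words gives the character-level values, which is the comparison the record makes for Chinese text.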

  9. Forehearing words: Pre-activation of word endings at word onset.

    Science.gov (United States)

    Roll, Mikael; Söderström, Pelle; Frid, Johan; Mannfolk, Peter; Horne, Merle

    2017-09-29

    Occurring at rates up to 6-7 syllables per second, speech perception and understanding involves rapid identification of speech sounds and pre-activation of morphemes and words. Using event-related potentials (ERPs) and functional magnetic resonance imaging (fMRI), we investigated the time-course and neural sources of pre-activation of word endings as participants heard the beginning of unfolding words. ERPs showed a pre-activation negativity (PrAN) for word beginnings (first two segmental phonemes) with few possible completions. PrAN increased gradually as the number of possible completions of word onsets decreased and the lexical frequency of the completions increased. The early brain potential effect for few possible word completions was associated with a blood-oxygen-level-dependent (BOLD) contrast increase in Broca's area (pars opercularis of the left inferior frontal gyrus) and angular gyrus of the left parietal lobe. We suggest early involvement of the left prefrontal cortex in inhibiting irrelevant left parietal activation during lexical selection. The results further our understanding of the importance of Broca's area in rapid online pre-activation of words. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  10. WORD-ATTACK SKILLS FOR INDONESIAN LEARNERS

    Directory of Open Access Journals (Sweden)

    Joko Pranowo

    2006-01-01

    Full Text Available The typical drawbacks that affect most Indonesian learners studying English as the target language concern the strategies for dealing with new words. As a rule, learners are tempted to look up the meaning directly in a dictionary when they cannot apply other strategies, such as guessing the meaning from the context or dissecting the word into smaller units to get a hint from the base word. As a result, they miss crucial points in the realm of word enrichment. This article sheds some light on how to deal with new words and claims that the meaning of a new word should not be the first priority.

  11. Text-Mining Applications for Creation of Biofilm Literature Database

    Directory of Open Access Journals (Sweden)

    Kanika Gupta

    2017-10-01

    In the present research, a published corpus of 34306 documents on biofilms was collected from the PubMed database along with non-indexed resources such as books, conferences, and newspaper articles, and these were divided into five categories: classification, growth and development, physiology, drug effects, and radiation effects. Each of the five categories was further divided into three parts (Journal Title, Abstract Title, and Abstract Text) to make indexing highly specific. Text processing was done using the software Rapid Miner_v5.3, which tokenizes the entire text into words and provides the frequency of each word within the document. The obtained words were normalized using the Remove Stop and Stem Word command of Rapid Miner_v5.3, which removes stop words and reduces words to their stems. The obtained words were stored in MS-Excel 2007 and were sorted in decreasing order of frequency using the Sort & Filter command of MS-Excel 2007. The words were visualized through networks obtained with Cytoscape_v2.7.0. The resulting words were highly specific for biofilms, generating a controlled biofilm vocabulary that could be used for indexing articles on biofilms (similar to the MeSH database, which indexes articles for PubMed). The obtained keyword information was stored in a relational database that is locally hosted using the WAMP_v2.4 (Windows, Apache, MySQL, PHP) server. The available biofilm vocabulary will be significant for researchers studying biofilm literature, making their search easy and efficient.
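
    The processing steps described in this record (tokenize, drop stop words, stem, count, sort by decreasing frequency) can be sketched in plain Python as follows. This is not the Rapid Miner workflow the authors used: the tiny stop-word list and the `crude_stem` helper are stand-ins invented for illustration, and a real pipeline would use a full stop-word list and a proper stemmer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "is", "are"}


def crude_stem(word):
    """Very rough stand-in for a real stemmer: drop a plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def term_frequencies(text):
    """Tokenize, remove stop words, stem, and sort terms by frequency."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = [crude_stem(t) for t in tokens if t not in STOP_WORDS]
    return Counter(terms).most_common()


sample = "Biofilms grow in the matrix. The biofilm matrix protects biofilms."
for term, freq in term_frequencies(sample):
    print(term, freq)
```

    The sorted list plays the role of the frequency-ordered spreadsheet in the record; the most frequent terms are the candidates for a controlled vocabulary.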

  12. DrugQuest - a text mining workflow for drug association discovery.

    Science.gov (United States)

    Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Vizirianakis, Ioannis S; Iliopoulos, Ioannis

    2016-06-06

    Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of bio-medical literature and information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases. Herein, we apply a text mining approach on the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques on these fields to identify chemicals, proteins, genes, pathways, diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a plethora of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible scenarios why these records are clustered together. Different views such as clustered chemicals based on their textual information, tag clouds consisting of Significant Terms along with the terms that were used for clustering are delivered to the user through a user-friendly web interface. DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest .

  13. A new universality class in corpus of texts; A statistical physics study

    Science.gov (United States)

    Najafi, Elham; Darooneh, Amir H.

    2018-05-01

    Text can be regarded as a complex system. There are some methods in statistical physics which can be used to study this system. In this work, by means of statistical physics methods, we reveal new universal behaviors of texts associated with the fractality values of words in a text. The fractality measure indicates the importance of words in a text by considering the distribution pattern of words throughout the text. We observed a power law relation between fractality of text and vocabulary size for texts and corpora. We also observed this behavior in studying biological data.

  14. [Pilot study of domain-specific terminology adaptation for morphological analysis: research on unknown terms in national examination documents of radiological technologists].

    Science.gov (United States)

    Tsuji, Shintarou; Nishimoto, Naoki; Ogasawara, Katsuhiko

    2008-07-20

    Although large medical texts are stored in electronic format, they are seldom reused because of the difficulty of processing narrative texts by computer. Morphological analysis is a key technology for extracting medical terms correctly and automatically. This process parses a sentence into its smallest unit, the morpheme. Phrases consisting of two or more technical terms, however, cause morphological analysis software to fail in parsing the sentence and output unprocessed terms as "unknown words." The purpose of this study was to reduce the number of unknown words in medical narrative text processing. The results of parsing the text with additional dictionaries were compared with the analysis of the number of unknown words in the national examination for radiologists. The ratio of unknown words was reduced from 1.0% to 0.36% by adding terminologies of radiological technology, MeSH, and ICD-10 labels. The terminology of radiological technology was the most effective resource, yielding a reduction of 0.62%. This result clearly showed the necessity of additional dictionary selection and trends in unknown words. The potential for this investigation is to make available a large body of clinical information that would otherwise be inaccessible for applications other than manual health care review by personnel.

  15. Evidence for Human Fronto-Central Gamma Activity during Long-Term Memory Encoding of Word Sequences

    Science.gov (United States)

    Meeuwissen, Esther Berendina; Takashima, Atsuko; Fernández, Guillén; Jensen, Ole

    2011-01-01

    Although human gamma activity (30–80 Hz) associated with visual processing is often reported, it is not clear to what extent gamma activity can be reliably detected non-invasively from frontal areas during complex cognitive tasks such as long term memory (LTM) formation. We conducted a memory experiment composed of 35 blocks each having three parts: LTM encoding, working memory (WM) maintenance and LTM retrieval. In the LTM encoding and WM maintenance parts, participants had to respectively encode or maintain the order of three sequentially presented words. During LTM retrieval subjects had to reproduce these sequences. Using magnetoencephalography (MEG) we identified significant differences in the gamma and beta activity. Robust gamma activity (55–65 Hz) in left BA6 (supplementary motor area (SMA)/pre-SMA) was stronger during LTM rehearsal than during WM maintenance. The gamma activity was sustained throughout the 3.4 s rehearsal period during which a fixation cross was presented. Importantly, the difference in gamma band activity correlated with memory performance over subjects. Further, we observed a weak gamma power difference in left BA6 during the first half of the LTM rehearsal interval, larger for successfully than unsuccessfully reproduced word triplets. In the beta band, we found a power decrease in left anterior regions during LTM rehearsal compared to WM maintenance. This suppression of beta power also correlated with memory performance over subjects. Our findings show that an extended network of brain areas, characterized by oscillatory activity in different frequency bands, supports the encoding of word sequences in LTM. Gamma band activity in BA6 possibly reflects memory processes associated with language and timing, and suppression of beta activity at left frontal sensors is likely to reflect the release of inhibition directly associated with the engagement of language functions. PMID:21738641

  16. Learning from Scientific Texts: Personalizing the Text Increases Transfer Performance and Task Involvement

    Science.gov (United States)

    Dutke, Stephan; Grefe, Anna Christina; Leopold, Claudia

    2016-01-01

    In an experiment with 65 high-school students, we tested the hypothesis that personalizing learning materials would increase students' learning performance and motivation to study the learning materials. Students studied either a 915-word standard text on the anatomy and functionality of the human eye or a personalized version of the same text in…

  17. New baseline correction algorithm for text-line recognition with bidirectional recurrent neural networks

    Science.gov (United States)

    Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle

    2013-04-01

    Many preprocessing techniques have been proposed for isolated word recognition. However, recently, recognition systems have dealt with text blocks and their compound text lines. In this paper, we propose a new preprocessing approach to efficiently correct baseline skew and fluctuations. Our approach is based on a sliding window within which the vertical position of the baseline is estimated. Segmentation of text lines into subparts is, thus, avoided. Experiments conducted on a large publicly available database (Rimes), with a BLSTM (bidirectional long short-term memory) recurrent neural network recognition system, show that our baseline correction approach highly improves performance.

  18. Voice congruency facilitates word recognition.

    Directory of Open Access Journals (Sweden)

    Sandra Campeanu

    Full Text Available Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, with the possibility that none, one or two of these parameters was congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. Behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection reflective of context congruency between study and test words. Namely, the same speaker condition provided the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same speaker condition compared to the different speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected behaviorally and in ERP modulations traditionally associated with recognition memory.

  19. The Functions Of Taboo Words And Their Translation In Subtitling: A Case Study In “The Help”

    Directory of Open Access Journals (Sweden)

    Agus Darma Yoga Pratama

    2016-10-01

    Full Text Available Translating taboo words in subtitling, especially translating them into Indonesian, is quite difficult since most Indonesian people are not used to uttering taboo or offensive words publicly. In addition, watching a movie is more of a social activity than reading, which is why reading taboo expressions while watching might be embarrassing. This study explores the functions of taboo words found in the movie “The Help” and examines how the translator translates the taboo words into the target language in order to produce the closest functions to the source language without ignoring the technical aspects of subtitling. This study also deals with the strategy used by the translator to translate the taboo words. The main theories applied herein are from Karamitroglou (1998), Ljung (2011), Toury (1995), and Gottlieb (1992). There are 70 taboo words found in the raw data, and the functions of those taboo words are to express sympathy, surprise, disappointment, disbelief, fear, annoyance, metaphorical interpretation, reaction to mishap, to emphasize the associated item, function as adjectival intensifier, name-calling, anaphoric use of epithet, oath, curse, and unfriendly suggestion; four of the taboo words appear as non-swearing words or in dysphemism form. The strategies applied are omission (16), transfer (27), and euphemism (26). In terms of the technical aspect of subtitling, all of the subtitles in the target language are presented at a maximum of two lines at once. However, there are three lines of the subtitles which exceed the proposed maximum number of characters. Since taboo words are not only used to offend someone, it is important for the translator to get the closest equivalence in the target language in order to maintain their function. The translator may choose whether he/she wants to follow the source language norms to produce adequate target text or follow the target language norms in order to produce acceptable

  20. Prophetic sensing of Yahweh’s word

    Directory of Open Access Journals (Sweden)

    Wilhelm J. Wessels

    2015-07-01

    Full Text Available This article focuses on Jeremiah 23:18, which implies that the prophet stood in the council of Yahweh (sôd) to see and hear the word of Yahweh. In this verse, it seems that the senses of the prophet played a role in receiving Yahweh’s words. Verse 18 forms part of 23:16–22 in which Jeremiah warned the people of Judah not to listen to prophets who mislead them with optimistic messages. In this article, attention is given to the question whether standing in the council of Yahweh is a deciding criterion for receiving true words from Yahweh. The motif of the divine council is also investigated. An argument is presented that ‘sensing’ should be understood in the double sense of the word, namely sensory experience as well as the intellectual activity of understanding. It is argued that both meanings of the word sensing are necessary to determine the truth of Yahweh’s word.

  1. Efficacy of a Word- and Text-Based Intervention for Students With Significant Reading Difficulties.

    Science.gov (United States)

    Vaughn, Sharon; Roberts, Garrett J; Miciak, Jeremy; Taylor, Pat; Fletcher, Jack M

    2018-05-01

    We examine the efficacy of an intervention to improve word reading and reading comprehension in fourth- and fifth-grade students with significant reading problems. Using a randomized control trial design, we compare the fourth- and fifth-grade reading outcomes of students with severe reading difficulties who were provided a researcher-developed treatment with reading outcomes of students in a business-as-usual (BAU) comparison condition. A total of 280 fourth- and fifth-grade students were randomly assigned within school in a 1:1 ratio to either the BAU comparison condition (n = 139) or the treatment condition (n = 141). Treatment students were provided small-group tutoring for 30 to 45 minutes for an average of 68 lessons (mean hours of instruction = 44.4, SD = 11.2). Treatment students performed statistically significantly higher than BAU students on a word reading measure (effect size [ES] = 0.58) and a measure of reading fluency (ES = 0.46). Though not statistically significant, effect sizes for students in the treatment condition were consistently higher than BAU students for decoding measures (ES = 0.06, 0.08), and mixed for comprehension (ES = -0.02, 0.14).

  2. Appraisal of space words and allocation of emotion words in bodily space.

    Directory of Open Access Journals (Sweden)

    Fernando Marmolejo-Ramos

    Full Text Available The body-specificity hypothesis (BSH) predicts that right-handers and left-handers allocate positive and negative concepts differently on the horizontal plane, i.e., while left-handers allocate negative concepts on the right-hand side of their bodily space, right-handers allocate such concepts to the left-hand side. Similar research shows that people, in general, tend to allocate positive and negative concepts in upper and lower areas, respectively, in relation to the vertical plane. Further research shows a higher salience of the vertical plane over the horizontal plane in the performance of sensorimotor tasks. The aim of the paper is to examine whether there should be a dominance of the vertical plane over the horizontal plane, not only at a sensorimotor level but also at a conceptual level. In Experiment 1, various participants from diverse linguistic backgrounds were asked to rate the words "up", "down", "left", and "right". In Experiment 2, right-handed participants from two linguistic backgrounds were asked to allocate emotion words into a square grid divided into four boxes of equal areas. Results suggest that the vertical plane is more salient than the horizontal plane regarding the allocation of emotion words and positively-valenced words were placed in upper locations whereas negatively-valenced words were placed in lower locations. Together, the results lend support to the BSH while also suggesting a higher saliency of the vertical plane over the horizontal plane in the allocation of valenced words.

  3. Knowledge-based biomedical word sense disambiguation: comparison of approaches

    Directory of Open Access Journals (Sweden)

    Aronson Alan R

    2010-11-01

    Full Text Available Abstract Background Word sense disambiguation (WSD) algorithms attempt to select the proper sense of ambiguous terms in text. Resources like the UMLS provide a reference thesaurus to be used to annotate the biomedical literature. Statistical learning approaches have produced good results, but the size of the UMLS makes the production of training data infeasible to cover the whole domain. Methods We present research on existing WSD approaches based on knowledge bases, which complement the studies performed on statistical learning. We compare four approaches which rely on the UMLS Metathesaurus as the source of knowledge. The first approach compares the overlap of the context of the ambiguous word to the candidate senses based on a representation built out of the definitions, synonyms and related terms. The second approach collects training data for each of the candidate senses to perform WSD based on queries built using monosemous synonyms and related terms. These queries are used to retrieve MEDLINE citations. Then, a machine learning approach is trained on this corpus. The third approach is a graph-based method which exploits the structure of the Metathesaurus network of relations to perform unsupervised WSD. This approach ranks nodes in the graph according to their relative structural importance. The last approach uses the semantic types assigned to the concepts in the Metathesaurus to perform WSD. The context of the ambiguous word and semantic types of the candidate concepts are mapped to Journal Descriptors. These mappings are compared to decide among the candidate concepts. Results are provided estimating accuracy of the different methods on the WSD test collection available from the NLM. Conclusions We have found that the last approach achieves better results compared to the other methods. The graph-based approach, using the structure of the Metathesaurus network to estimate the relevance of the Metathesaurus concepts, does not perform well
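
    The first approach in the record above (overlap between the context of the ambiguous word and a profile of each candidate sense) can be sketched as a simple word-overlap count. This is a minimal illustration under stated assumptions: the sense profiles, the example sentence, and the function names are hypothetical, nothing here is taken from the UMLS, and the authors' actual representation and weighting are not described in this abstract.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,;:()").lower() for w in text.split()} - {""}

def disambiguate(context, sense_profiles):
    """Return the sense whose profile shares the most words with the context."""
    ctx = tokenize(context)
    scores = {
        sense: len(ctx & tokenize(profile))
        for sense, profile in sense_profiles.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical profiles for the ambiguous term "cold" (illustrative only).
# In a real system each profile would be built from definitions, synonyms
# and related terms of the candidate concept.
sense_profiles = {
    "common_cold": "viral infection of the nose and throat with cough and sneezing",
    "cold_temperature": "low temperature sensation exposure to freezing weather",
}

best, scores = disambiguate(
    "patient presents with cough and sneezing after a cold", sense_profiles
)
```

    Function words such as "and" and "with" contribute to the count in this sketch; a practical version would likely remove stop words and weight the remaining terms.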

  4. Blending Words Found In Social Media

    Directory of Open Access Journals (Sweden)

    Giyatmi Giyatmi

    2017-12-01

    Full Text Available There are many new words from social media such as Netizen, Trentop, and Delcon. These words are blends. Blending is a word-formation process that combines two clipped words to form a brand new word. The researchers are interested in analyzing blend words used in social media such as Instagram, Twitter, Facebook, and Blackberry Messenger. This research aims at (1) finding blend words used in social media, (2) describing kinds of blend words used in social media, and (3) describing the process of blend word formation used in social media. This research uses some theories dealing with the definition of blending and kinds of blending. This research is descriptive qualitative research. The data are English blend words used in social media. The data sources are websites listing English words used in social media and some social media users as informants. The data collection techniques are observation and simak catat. Observation is done by observing websites listing English words used in social media. Simak catat is done by taking notes on the data and encoding them in symbols such as No/Blend words/Kinds of Blending. The researchers use source triangulation to check the researchers' data against the informants and theory triangulation to determine kinds of blending and blend word formation in social media. There are 115 data of blend words. Those data consist of 65 data from Instagram, 47 data from Twitter, 1 datum from Facebook, and 2 data from Blackberry Messenger. There are 2 types of blending used in social media: 108 data of blending with clipping and 7 data of blending with overlapping. There are 10 ways of blend word formation found in this research.

  5. Word encoding during sleep is suggested by correlations between word-evoked up-states and post-sleep semantic priming

    Directory of Open Access Journals (Sweden)

    Simon Ruch

    2014-11-01

    Full Text Available To test whether humans can encode words during sleep we played everyday words to men while they were napping and assessed priming from sleep-played words following waking. Words were presented during non-rapid eye movement (NREM) sleep. Priming was assessed using a semantic and a perceptual priming test. These tests measured differences in the processing of words that had been or had not been played during sleep. Synonyms to sleep-played words were the targets in the semantic priming test that tapped the meaning of sleep-played words. All men responded to sleep-played words by producing up-states in their electroencephalogram. Up-states are NREM sleep-specific phases of briefly increased neuronal excitability. The word-evoked up-states might have promoted word processing during sleep. Yet, the mean performance in the priming tests administered following sleep was at chance level, which suggests that participants as a group failed to show priming following sleep. However, performance in the two priming tests was positively correlated to each other and to the magnitude of the word-evoked up-states. Hence, the larger a participant’s word-evoked up-states, the larger his perceptual and semantic priming. Those participants who scored high on all variables must have encoded words during sleep. We conclude that some humans are able to encode words during sleep, but more research is needed to pin down the factors that modulate this ability.

  6. Infinite permutations vs. infinite words

    Directory of Open Access Journals (Sweden)

    Anna E. Frid

    2011-08-01

    Full Text Available I am going to compare well-known properties of infinite words with those of infinite permutations, a new object studied since the mid-2000s. Basically, it was Sergey Avgustinovich who invented this notion, although in an early study by Davis et al. permutations appear in a very similar framework as early as 1977. I will discuss the periodicity of permutations, their complexity according to several definitions and their automatic properties, that is, the usual parameters of words, now extended to permutations and behaving sometimes similarly to those for words, sometimes not. Another series of results concerns permutations generated by infinite words and their properties. Although this direction of research is young, many people, including two other speakers of this meeting, have participated in it, and I believe that several more topics for further study are really promising.

  7. Terminology of the public relations field: corpus — automatic term recognition — terminology database

    Directory of Open Access Journals (Sweden)

    Nataša Logar Berginc

    2013-12-01

    Full Text Available The article describes an analysis of automatic term recognition results performed for single- and multi-word terms with the LUIZ term extraction system. The target application of the results is a terminology database of Public Relations and the main resource the KoRP Public Relations Corpus. Our analysis is focused on two segments: (a) single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and evaluate termhood on the basis of the judgements of two domain experts, and (b) multi-word term candidates with verb and noun as headword. In order to better assess the performance of the system and the soundness of our approach we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The most productive patterns of multi-word terms with noun as a headword have the following structure: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.
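
    The three noun-headed patterns listed above can be illustrated with a small matcher over part-of-speech-tagged tokens. This is a sketch under stated assumptions, not the LUIZ system: the tag names ("ADJ", "CCONJ", "NOUN"), the function name, and the English example phrase are all hypothetical, and the conjunction slot is matched by tag rather than by the literal word "and".

```python
# Tag sequences corresponding to the three patterns named in the abstract.
PATTERNS = [
    ("ADJ", "NOUN"),                  # [adjective + noun]
    ("ADJ", "CCONJ", "ADJ", "NOUN"),  # [adjective + and + adjective + noun]
    ("ADJ", "ADJ", "NOUN"),           # [adjective + adjective + noun]
]

def extract_candidates(tagged):
    """Return every token span whose tag sequence equals one of PATTERNS."""
    tags = [tag for _, tag in tagged]
    found = []
    for i in range(len(tagged)):
        for pattern in PATTERNS:
            window = tuple(tags[i:i + len(pattern)])
            if window == pattern:
                words = [word for word, _ in tagged[i:i + len(pattern)]]
                found.append(" ".join(words))
    return found

# Hypothetical pre-tagged input (a real pipeline would use a POS tagger).
tagged = [
    ("strategic", "ADJ"),
    ("and", "CCONJ"),
    ("operational", "ADJ"),
    ("planning", "NOUN"),
]
candidates = extract_candidates(tagged)
```

    Pattern matching of this kind only proposes candidates; as the abstract notes, termhood still has to be judged, for example by domain experts.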

  8. Privacy Preserving Similarity Based Text Retrieval through Blind Storage

    Directory of Open Access Journals (Sweden)

    Pinki Kumari

    2016-09-01

    Full Text Available Cloud computing is improving rapidly because of its advantages, and more data owners are interested in outsourcing their data to cloud storage in order to centralize it. As huge files are stored in cloud storage, a keyword-based search process is needed for data users. At the same time, to protect the privacy of the data, encryption techniques are used for sensitive data, and that encryption is done before outsourcing the data to the cloud server. However, searching over encrypted data is difficult. In this system we propose similarity-based text retrieval from blind storage blocks in encrypted format. This system provides more security because of the blind storage system, in which data is stored randomly on cloud storage. In the existing system, the data owner cannot encrypt the document data, as encryption is done only at the server end. Everyone can access the data, as no private-key concept is applied to maintain the privacy of the data. In our proposed system, the data owner can encrypt the data himself using the RSA algorithm. RSA is a public-key cryptosystem that is widely used for sensitive data storage over the Internet. In our system we use a text mining process to identify the index files of user documents. Before encryption we also use an NLP (Natural Language Processing) technique to identify the keyword synonyms of the data owner's document. Here the text mining process examines the text word by word and collects the literal meaning behind the group of words that composes the sentence. Those words are looked up through the WordNet API so that only equivalent words are identified for use in the index file. Our proposed system provides a more secure and authorized way of recovering text in cloud storage with access control. Finally, our experimental results show that our system is better than the existing one.

  9. Facilitating text reading in posterior cortical atrophy.

    Science.gov (United States)

    Yong, Keir X X; Rajdev, Kishan; Shakespeare, Timothy J; Leff, Alexander P; Crutch, Sebastian J

    2015-07-28

    We report (1) the quantitative investigation of text reading in posterior cortical atrophy (PCA), and (2) the effects of 2 novel software-based reading aids that result in dramatic improvements in the reading ability of patients with PCA. Reading performance, eye movements, and fixations were assessed in patients with PCA and typical Alzheimer disease and in healthy controls (experiment 1). Two reading aids (single- and double-word) were evaluated based on the notion that reducing the spatial and oculomotor demands of text reading might support reading in PCA (experiment 2). Mean reading accuracy in patients with PCA was significantly worse (57%) compared with both patients with typical Alzheimer disease (98%) and healthy controls (99%); spatial aspects of passages were the primary determinants of text reading ability in PCA. Both aids led to considerable gains in reading accuracy (PCA mean reading accuracy: single-word reading aid = 96%; individual patient improvement range: 6%-270%) and self-rated measures of reading. Data suggest a greater efficiency of fixations and eye movements under the single-word reading aid in patients with PCA. These findings demonstrate how neurologic characterization of a neurodegenerative syndrome (PCA) and detailed cognitive analysis of an important everyday skill (reading) can combine to yield aids capable of supporting important everyday functional abilities. This study provides Class III evidence that for patients with PCA, 2 software-based reading aids (single-word and double-word) improve reading accuracy. © 2015 American Academy of Neurology.

  10. Facilitating text reading in posterior cortical atrophy

    Science.gov (United States)

    Rajdev, Kishan; Shakespeare, Timothy J.; Leff, Alexander P.; Crutch, Sebastian J.

    2015-01-01

    Objective: We report (1) the quantitative investigation of text reading in posterior cortical atrophy (PCA), and (2) the effects of 2 novel software-based reading aids that result in dramatic improvements in the reading ability of patients with PCA. Methods: Reading performance, eye movements, and fixations were assessed in patients with PCA and typical Alzheimer disease and in healthy controls (experiment 1). Two reading aids (single- and double-word) were evaluated based on the notion that reducing the spatial and oculomotor demands of text reading might support reading in PCA (experiment 2). Results: Mean reading accuracy in patients with PCA was significantly worse (57%) compared with both patients with typical Alzheimer disease (98%) and healthy controls (99%); spatial aspects of passages were the primary determinants of text reading ability in PCA. Both aids led to considerable gains in reading accuracy (PCA mean reading accuracy: single-word reading aid = 96%; individual patient improvement range: 6%–270%) and self-rated measures of reading. Data suggest a greater efficiency of fixations and eye movements under the single-word reading aid in patients with PCA. Conclusions: These findings demonstrate how neurologic characterization of a neurodegenerative syndrome (PCA) and detailed cognitive analysis of an important everyday skill (reading) can combine to yield aids capable of supporting important everyday functional abilities. Classification of evidence: This study provides Class III evidence that for patients with PCA, 2 software-based reading aids (single-word and double-word) improve reading accuracy. PMID:26138948

  11. Analysis of Corporal Punishment of Children in the Family based on the Semantic Reading of Traditional Islamic Narratives including the Word “Dharb” [Hitting]

    Directory of Open Access Journals (Sweden)

    حمیدرضا بصیری

    2016-06-01

    Full Text Available There are various traditional Islamic narratives in relation to treating children. The common understanding of these narratives allows for the corporal punishment of children in the family, although some narratives forbid parents from such behavior. With reference to traditional Islamic texts, the word “hitting” (Arabic: dharb) is the main and most frequently used word implying the permissibility of corporal punishment. An investigation of the use of the term “dharb” (Arabic: ضرب) in the Quran, traditional Islamic narratives and the Arabic language reveals different instances of a general sense of “occurrence” which can denote “doing or carrying out”. On this basis, a wide range of usages for this word in absolute terms (without preposition) is conceivable with the meanings of protecting, financially supporting, guiding and nurturing children. It seems that limiting the meaning of dharb to corporal punishment in all traditional Islamic narratives without considering the other meanings has led to incorrect interpretations of such narratives.

  12. Theology of Jesus’ words from the cross

    Directory of Open Access Journals (Sweden)

    Bogdan Zbroja

    2012-09-01

    Full Text Available The article presents the theological message of the last words that Jesus spoke from the height of the cross. The content is arranged according to three kinds of Christ’s relations: the words addressed to God the Father; the words addressed to the good people standing by the cross; and the so-called declarations, which the Master addressed to no one in particular but uttered in general. All these words speak of the Master’s love. They express His full awareness of what is being done and of His decision, voluntarily taken. Above all, the Lord’s statements reveal His obedience to the will of God expressed in the inspired words of the Holy Scriptures. Jesus fulfills all the prophecies of the Old Testament by the words He pronounced and the works He accomplished, which will become the content of the New Testament.

  13. Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech

    Directory of Open Access Journals (Sweden)

    Philip A. Huebner

    2018-02-01

    Full Text Available Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized for being incapable of learning the kind of abstract and structured knowledge that many think is required for acquisition of semantic knowledge. In this paper, we show that recurrent neural networks, trained on noisy naturalistic speech to children, do in fact learn what appears to be abstract and structured knowledge. We trained two types of recurrent neural networks (Simple Recurrent Network, and Long Short-Term Memory) to predict word sequences in a 5-million-word corpus of speech directed to children ages 0–3 years old, and assessed what semantic knowledge they acquired. We found that learned internal representations are encoding various abstract grammatical and semantic features that are useful for predicting word sequences. Assessing the organization of semantic knowledge in terms of the similarity structure, we found evidence of emergent categorical and hierarchical structure in both models. We found that the Long Short-Term Memory (LSTM) and SRN are both learning very similar kinds of representations, but the LSTM achieved higher levels of performance on a quantitative evaluation. We also trained a non-recurrent neural network, Skip-gram, on the same input to compare our results to the state-of-the-art in machine learning. We found that Skip-gram achieves relatively similar performance to the LSTM, but is representing words more in terms of thematic compared to taxonomic relations, and we provide reasons why this might be the case. Our findings show that a learning system that derives abstract, distributed representations for the purpose of predicting sequential dependencies in naturalistic language may provide insight into emergence of many properties of the developing

  14. Macrostructural Treatment of Multi-word Lexical Items

    Directory of Open Access Journals (Sweden)

    Alenka Vrbinc

    2011-05-01

    Full Text Available The paper discusses the macrostructural treatment of multi-word lexical items in mono- and bilingual dictionaries. First, the classification of multi-word lexical items is presented, and special attention is paid to the discussion of compounds – a specific group of multi-word lexical items that is most commonly afforded headword status but whose inclusion in the headword list may also depend on spelling. Then the inclusion of multi-word lexical items in monolingual dictionaries is dealt with in greater detail, while the results of a short survey on the inclusion of five randomly chosen multi-word lexical items in seven English monolingual dictionaries are presented. The proposals as to how to treat these five multi-word lexical items in bilingual dictionaries are presented in the section about the inclusion of multi-word lexical items in bilingual dictionaries. The conclusion is that it is most important to take the users’ needs into consideration and to make any dictionary as user friendly as possible.

  15. Craftsmanship and Applied Arts. A few words from nowadays.

    Directory of Open Access Journals (Sweden)

    Alberto Pratelli

    2009-06-01

    Full Text Available Within today’s Italian experience, we can see how, as a society, we are adapting to a new world in the fields of work, art and technical production, mainly following a virtual road, often fake and scenographic, often very far from the real contents. This approach is realized through the use of new words, and mainly through the use of old terms intended for new stereotypes, in order to create a new world, more attractive and beautiful, but parallel and, at the end of the day, useless. The paper focuses on these words, also with the help of the enclosed drawings, which are only generally related to the topic, and takes into account their use, as they are truly pervasive but lack real significance. Our aim is also to ask artists and technicians to rediscover a better connection between brain and hand, in order to give the development of new ideas through craftsmanship a chance.

  16. Estimation of Cross-Lingual News Similarities Using Text-Mining Methods

    Directory of Open Access Journals (Sweden)

    Zhouhao Wang

    2018-01-01

    Full Text Available In this research, two estimation algorithms for extracting cross-lingual news pairs based on machine learning from financial news articles have been proposed. Every second, innumerable text data, including all kinds of news, reports, messages, reviews, comments, and tweets, are generated on the Internet, and these are written not only in English but also in other languages such as Chinese, Japanese, French, etc. By taking advantage of multi-lingual text resources provided by Thomson Reuters News, we developed two estimation algorithms for extracting cross-lingual news pairs from multilingual text resources. In our first method, we propose a novel structure that uses the word information and the machine learning method effectively in this task. Simultaneously, we developed a bidirectional Long Short-Term Memory (LSTM)-based method to calculate cross-lingual semantic text similarity for long text and short text, respectively. Thus, when an important news article is published, users can read similar news articles that are written in their native language using our method.

  17. Word Prosody and Intonation of Sgaw Karen

    Science.gov (United States)

    West, Luke Alexander

    The prosodic, and specifically intonation, systems of Tibeto-Burman languages have received less attention in research than those of other families. This study investigates the word prosody and intonation of Sgaw Karen, a tonal Tibeto-Burman language of eastern Burma, and finds similarities to both closely related Tibeto-Burman languages and the more distant Sinitic languages like Mandarin. Sentences of varying lengths with controlled tonal environments were elicited from a total of 12 participants (5 male). In terms of word prosody, Sgaw Karen does not exhibit word stress cues, but does maintain a prosodic distinction between the more prominent major syllable and the phonologically reduced minor syllable. In terms of intonation, Sgaw Karen patterns like related Pwo Karen in its limited use of post-lexical tone, which is only present at Intonation Phrase (IP) boundaries. Unlike the intonation systems of Pwo Karen and Mandarin, however, Sgaw Karen exhibits downstep across its Accentual Phrases (AP), similarly to phenomena identified in Tibetan and Burmese.

  18. Wording effects in moral judgments

    Directory of Open Access Journals (Sweden)

    Ross E. O'Hara

    2010-12-01

    Full Text Available As the study of moral judgments grows, it becomes imperative to compare results across studies in order to create unified theories within the field. These efforts are potentially undermined, however, by variations in wording used by different researchers. The current study sought to determine whether, when, and how variations in wording influence moral judgments. Online participants responded to 15 different moral vignettes (e.g., the trolley problem) using 1 of 4 adjectives: "wrong", "inappropriate", "forbidden", or "blameworthy". For half of the sample, these adjectives were preceded by the adverb "morally". Results indicated that people were more apt to judge an act as wrong or inappropriate than forbidden or blameworthy, and that disgusting acts were rated as more acceptable when "morally" was included. Although some wording differences emerged, effect sizes were small and suggest that studies of moral judgment with different wordings can legitimately be compared.

  19. Journal of Clipped Words in Reader's Digest Magazine

    OpenAIRE

    Simanjuntak, Lestari

    2012-01-01

    This study deals with clipped words in the “Laughter, the Best Medicine” section of Reader's Digest. The objectives of the study are to find out the types of clipped words used in “Laughter, the Best Medicine”, to find out the type dominantly used in the whole story, and to explain the dominant use of clipped words in the text. The study uses a descriptive qualitative method. The data were collected from seventeen selected issues of Reader's Digest that contain clipped words by applie...

  20. Term Croatian considered in russian context

    Directory of Open Access Journals (Sweden)

    Željka Čelić

    2008-01-01

    Full Text Available The term Croatian is considered in the Russian context, i.e. the context of Russian scientific material (which is comparable to the unwritten situation in universities). Russian scientific texts connect the term Croatian, almost without exception, with the term Serbian in words such as Serbo-Croatian. This point of view was politically approved in the period until the 1990s, but it persists in the scientific material of the 21st century. The nature of the problem lies, at the same time, in politics, language and society; thus, the question is: what is the reason for the context in which the Croatian language is placed now? There are no arguments for it, especially given that the Slovak language is accepted, on political and linguistic grounds, as an entity distinct from Czech; likewise Ukrainian (once Little Russian), at least in principle, in comparison to Russian, or, more convincingly, Belarusian in comparison to Russian (the standard Belarusian language has existed since 1905). Even in new books, the term Croatian is independently connected with terms of soil, state and nation, but not language. And although today, for political reasons, there is an awareness of the Croatian language without its Serbian mirror reflection, the term Serbo-Croatian remains. Thus, this paper looks through the history of the Croatian language in Russian philology of the 19th, 20th and 21st centuries, including Juraj Križanić and Vatroslav Jagić, innovators of the Croatian word in Russia.

  1. Using machine learning to disentangle homonyms in large text corpora.

    Science.gov (United States)

    Roll, Uri; Correia, Ricardo A; Berger-Tal, Oded

    2018-06-01

    Systematic reviews are an increasingly popular decision-making tool that provides an unbiased summary of evidence to support conservation action. These reviews bridge the gap between researchers and managers by presenting a comprehensive overview of all studies relating to a particular topic and identify specifically where and under which conditions an effect is present. However, several technical challenges can severely hinder the feasibility and applicability of systematic reviews, for example, homonyms (terms that share spelling but differ in meaning). Homonyms add noise to search results and cannot be easily identified or removed. We developed a semiautomated approach that can aid in the classification of homonyms among narratives. We used a combination of automated content analysis and artificial neural networks to quickly and accurately sift through large corpora of academic texts and classify them to distinct topics. As an example, we explored the use of the word reintroduction in academic texts. Reintroduction is used within the conservation context to indicate the release of organisms to their former native habitat; however, a Web of Science search for this word returned thousands of publications in which the term has other meanings and contexts. Using our method, we automatically classified a sample of 3000 of these publications with over 99% accuracy, relative to a manual classification. Our approach can be used easily with other homonyms and can greatly facilitate systematic reviews or similar work in which homonyms hinder the harnessing of large text corpora. Beyond homonyms we see great promise in combining automated content analysis and machine-learning methods to handle and screen big data for relevant information in conservation science. © 2017 Society for Conservation Biology.

  2. DINAMIC ASPECTS OF INTERNATIONALIZATION IN MODERN MEDIA WORD CREATION

    Directory of Open Access Journals (Sweden)

    Ratsiburskaya Larisa Viktorovna

    2014-12-01

    Full Text Available In the context of globalization at the turn of the 20th and 21st centuries, linguists have noted various manifestations of internationalization in the Russian language. Internationalization in modern Russian word-formation reveals itself in the derivational activity of loan morphemes and models, as well as in the borrowing of new derivational morphemes. In modern mass media word creation, well-known derivational affixes (the prefixes anti-, contr-, pseudo-, quasi-, super-, hyper-, ultra-, ex-) as well as some new derivational elements (the prefix mega-, the suffixes -ing, -land, -wood, suffixed -gate) are noted, and the author puts forward arguments to substantiate their morphemic status. "Ameroglobalization", and in particular the influence of the English language on the semantics of prefixes, contributes to the growth of word-building activity and productivity of the prefixes of Greek and Latin origin. The appearance of new word-building affixes in the modern Russian language is affected by the active usage of the corresponding foreign morphemes in journalists' word creation, in media texts and on internet forums. The activation of foreign derivation elements in media word creation is conditioned by sociocultural factors. Advertising and mass media encourage the trend of using Anglo-Americanisms and facilitate the growth of word-building productivity of new affixes and new word-building models.

  3. Unsupervised Learning of Word-Sequence Representations from Scratch via Convolutional Tensor Decomposition

    OpenAIRE

    Huang, Furong; Anandkumar, Animashree

    2016-01-01

    Unsupervised extraction of text embeddings is crucial for text understanding in machine learning. Word2Vec and its variants have achieved substantial success in mapping words with similar syntactic or semantic meaning to vectors close to each other. However, extracting context-aware word-sequence embeddings remains a challenging task. Training over a large corpus is difficult as labels are difficult to get. More importantly, it is challenging for pre-trained models to obtain word-...

  4. Pronunciation modelling of foreign words for Sepedi ASR

    CSIR Research Space (South Africa)

    Modipa, T

    2010-11-01

    Full Text Available , specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data...

  5. Romanian Words of Arabic Origin: Scientific and Technical Vocabulary

    Directory of Open Access Journals (Sweden)

    Georgeta Rata

    2016-10-01

    Full Text Available There are 141 Romanian words of Arabic origin acquired either directly from Arabic or else indirectly by passing from Arabic into other languages and then into Romanian. Most entered one or more of the Romance languages before entering Romanian. To qualify for this list, a word must be reported in etymology dictionaries as having descended from Arabic. Words associated with the Islamic religion are omitted. Archaic and rare words are also omitted. Given the nature of the journal in which the paper is to be published, the author selected for analysis only about 126 terms belonging to the scientific and technical vocabulary: Adobe, alambic, albatros, alcalin, alchimie, alcool, alfalfa, algebră, algoritm, alidadă, alizarină, amalgam, ambră, anil, antimoniu, azimuth, azur, benjoin, bezoar, bor, cafea, calibre, camfor, carat, carciofoi, caric, cârmâz, carob, chimie, cifru, coton, curcuma, cuşcuş, erg, falafel, fanfară, felucă, fenec, gazelă, gerbil, girafă, halva, hamada, humus, iasomie, jar, julep, kaliu, lac, lămâie, lazurit, liliac, lime, marcasit, masicot, mizenă, muson, nadir, natriu, papagal, rachetă, realgar, sabkha, safari, şah, sandarac, şaorma, şerbet, sirop, sodium, şofran, sorbet, spanac, sumac, tabac, tahân, taifun, talc, tamarin(d), tangerină, tar, tară, tarhon, tarif, tasă, ţechin, ton, varan, zahăr, zenith, zero, zircon, etc. Some of them are obsolescent, but a large number are in everyday use and have been so well assimilated into Romanian that they have produced other words through derivation and composition, or they have acquired new meanings.

  6. Learning word order at birth: A NIRS study

    Directory of Open Access Journals (Sweden)

    Silvia Benavides-Varela

    2017-06-01

    Full Text Available In language, the relative order of words in sentences carries important grammatical functions. However, the developmental origins and the neural correlates of the ability to track word order are to date poorly understood. The current study therefore investigates the origins of infants’ ability to learn about the sequential order of words, using near-infrared spectroscopy (NIRS) with newborn infants. We have conducted two experiments: one in which a word order change was implemented in 4-word sequences recorded with a list intonation (as if each word was a separate item in a list; list prosody condition, Experiment 1) and one in which the same 4-word sequences were recorded with a well-formed utterance-level prosodic contour (utterance prosody condition, Experiment 2). We found that newborns could detect the violation of the word order in the list prosody condition, but not in the utterance prosody condition. These results suggest that while newborns are already sensitive to word order in linguistic sequences, prosody appears to be a stronger cue than word order for the identification of linguistic units at birth.

  7. Scaling laws and fluctuations in the statistics of word frequencies

    Science.gov (United States)

    Gerlach, Martin; Altmann, Eduardo G.

    2014-11-01

    In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps’ law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps’ and Taylor's) by modeling the usage of words using a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of lexical richness of texts with different lengths.
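
The sublinear vocabulary growth (Heaps' law) mentioned in this abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' model: it draws words independently from a fixed Zipf-like distribution with no topic dependence, so it would not be expected to reproduce the fluctuation scaling (Taylor's law) that the abstract attributes to topical variation. The function names, the exponent and the number of word types are illustrative assumptions.

```python
import random

def zipf_weights(n_types, exponent=1.0):
    """Unnormalised Zipf weights: the word of rank r gets weight r**-exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, n_types + 1)]

def vocabulary_growth(n_tokens, n_types=50_000, exponent=1.0, seed=0):
    """Return a list whose i-th entry is the number of distinct words
    among the first i+1 sampled tokens."""
    rng = random.Random(seed)
    tokens = rng.choices(range(n_types),
                         weights=zipf_weights(n_types, exponent),
                         k=n_tokens)
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

growth = vocabulary_growth(20_000)
# Compare vocabulary size at 10,000 and 20,000 tokens: doubling the text
# length should less than double the vocabulary.
print(growth[9_999], growth[19_999])
```

Repeating the run with different seeds gives a rough feel for the spread of vocabulary sizes under pure sampling, which is the baseline the abstract contrasts with its topic-dependent model.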

  8. Do handwritten words magnify lexical effects in visual word recognition?

    Science.gov (United States)

    Perea, Manuel; Gil-López, Cristina; Beléndez, Victoria; Carreiras, Manuel

    2016-01-01

    An examination of how the word recognition system is able to process handwritten words is fundamental to formulate a comprehensive model of visual word recognition. Previous research has revealed that the magnitude of lexical effects (e.g., the word-frequency effect) is greater with handwritten words than with printed words. In the present lexical decision experiments, we examined whether the quality of handwritten words moderates the recruitment of top-down feedback, as reflected in word-frequency effects. Results showed a reading cost for difficult-to-read and easy-to-read handwritten words relative to printed words. But the critical finding was that difficult-to-read handwritten words, but not easy-to-read handwritten words, showed a greater word-frequency effect than printed words. Therefore, the inherent physical variability of handwritten words does not necessarily boost the magnitude of lexical effects.

  9. Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources

    Directory of Open Access Journals (Sweden)

    Paweł Kędzia

    2015-12-01

    Full Text Available Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet, a very large wordnet for Polish, as a central structure that is a basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in the traditional lexicon can be compared with text contexts using Lesk’s algorithm in order to find the best matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on PageRank. According to it, text words are mapped onto their senses in the plWordNet graph and the PageRank algorithm is run to find the senses with the highest scores. The method produces results lower than but comparable to those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and the limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with a mapping onto SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on combined networks of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and then plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined in one large network that is then used for the disambiguation of senses. The additional knowledge sources used in WSD improved the performance
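
The gloss-overlap idea behind Lesk's algorithm, which this abstract mentions, can be sketched in a few lines. This is a simplified variant written for illustration, not the system described in the paper: the function name, the stopword list and the English toy glosses are all assumptions, and nothing here is taken from plWordNet.

```python
def simplified_lesk(context_words, sense_glosses, stopwords=frozenset()):
    """Pick the sense whose gloss shares the most words with the context.

    sense_glosses maps a sense label to its textual description.
    Ties are resolved in favour of the sense listed first.
    """
    context = {w.lower() for w in context_words} - stopwords
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        gloss_words = {w.lower() for w in gloss.split()} - stopwords
        overlap = len(context & gloss_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Toy glosses invented for illustration (hypothetical sense labels).
GLOSSES = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a river or lake",
}
STOPWORDS = frozenset({"the", "of", "a", "an", "and", "or", "on"})

sentence = "she sat on the bank of the river and watched the water".split()
print(simplified_lesk(sentence, GLOSSES, STOPWORDS))
```

Filtering stopwords matters in this sketch: without it, function words shared by the context and a gloss would count as evidence for a sense.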

  10. Connected text reading and differences in text reading fluency in adult readers

    NARCIS (Netherlands)

    Wallot, S.; Hollis, G.; Rooij, M. de

    2013-01-01

    The process of connected text reading has received very little attention in contemporary cognitive psychology. This lack of attention is in parts due to a research tradition that emphasizes the role of basic lexical constituents, which can be studied in isolated words or sentences. However, this

  11. Flooding Vocabulary Gaps to Accelerate Word Learning

    Science.gov (United States)

    Brabham, Edna; Buskist, Connie; Henderson, Shannon Coman; Paleologos, Timon; Baugh, Nikki

    2012-01-01

    Students entering school with limited vocabularies are at a disadvantage compared to classmates with robust knowledge of words and meanings. Teaching a few unrelated words at a time is insufficient for catching these students up with peers and preparing them to comprehend texts they will encounter across the grades. This article presents…

  12. Word of mouth komunikacija

    Directory of Open Access Journals (Sweden)

    Žnideršić-Kovač Ružica

    2009-01-01

    Full Text Available Consumers' buying decision is a very complex multistep process on which many factors have a significant impact. The traditional approach to the problem of communication between a company and its consumers implies the use of marketing mix instruments, mostly the promotion mix, in order to achieve a positive purchase decision. Formal communication between company and consumers is dominant compared to informal communication, and even in the marketing literature not enough attention is paid to informal communication such as Word of Mouth. Numerous studies show that consumers emphasize the crucial impact of Word of Mouth on their buying decisions.

  13. THE STUDENTS’ PERCEPTIONS OF AUTHENTIC TEXTS-BASED TRANSLATION

    Directory of Open Access Journals (Sweden)

    Rusiana

    2017-12-01

    Full Text Available Translation requires lots of practice. As is generally known, authentic texts provide fruitful experience for students translating either from Indonesian into English or vice versa. Authentic texts give many real uses of language in varied, meaningful contexts. The texts used were advertisement, abstract, local stories, tourist attraction, community service and project for money. This research is aimed at investigating whether the use of authentic texts benefits the students and at describing the students’ perceptions of the use of authentic texts in Translation class. It is qualitative research. Questionnaires were used to obtain the students’ perceptions of the use of authentic texts in translation. The findings show that authentic text-based translation benefits students in experiencing better translation. Advertisement was considered to be the most relevant text. On the other hand, students find it difficult to cope with authentic texts, particularly when dealing with words/terms/vocabulary, meanings, culture, and grammar. The recommendations are that the students have to be exposed to many authentic texts on varied topics in both English and Indonesian so that they understand both the SL and TL well. Future researchers could study the influence of authentic text-based translation on students’ translation skills.

  14. How Many Words Is a Picture Worth? Integrating Visual Literacy in Language Learning with Photographs

    Science.gov (United States)

    Baker, Lottie

    2015-01-01

    Cognitive research has shown that the human brain processes images quicker than it processes words, and images are more likely than text to remain in long-term memory. With the expansion of technology that allows people from all walks of life to create and share photographs with a few clicks, the world seems to value visual media more than ever…

  15. Computer-Aided Qualitative Data Analysis with Word

    Directory of Open Access Journals (Sweden)

    Bruno Nideröst

    2002-05-01

    Full Text Available Despite some fragmentary references in the literature about qualitative methods, it is fairly unknown that Word can be successfully used for computer-aided Qualitative Data Analysis (QDA). Based on several Word standard operations, elementary QDA functions such as sorting data, code-and-retrieve and frequency counts can be realized. Word is particularly interesting for those users who wish to have first experiences with computer-aided analysis before investing time and money in a specialized QDA program. The well-known standard software could also be an option for those qualitative researchers who usually work with word processing but have certain reservations towards computer-aided analysis. The following article deals with the most important requirements and options of Word for computer-aided QDA. URN: urn:nbn:de:0114-fqs0202225

  16. Chinese Unknown Word Recognition for PCFG-LA Parsing

    Directory of Open Access Journals (Sweden)

    Qiuping Huang

    2014-01-01

    Full Text Available This paper investigates the recognition of unknown words in Chinese parsing. Two methods are proposed to handle this problem. One is the modification of a character-based model. We model the emission probability of an unknown word using the first and last characters in the word. It aims to reduce the POS tag ambiguities of unknown words to improve the parsing performance. In addition, a novel method using graph-based semisupervised learning (SSL) is proposed to improve the syntax parsing of unknown words. Its goal is to discover additional lexical knowledge from a large amount of unlabeled data to help the syntax parsing. The method mainly propagates lexical emission probabilities to unknown words by building similarity graphs over the words of labeled and unlabeled data. The derived distributions are incorporated into the parsing process. The proposed methods are effective in dealing with unknown words to improve the parsing. Empirical results for the Penn Chinese Treebank and the TCT Treebank revealed their effectiveness.

  17. The Activation of Embedded Words in Spoken Word Recognition.

    Science.gov (United States)

    Zhang, Xujin; Samuel, Arthur G

    2015-01-01

    The current study investigated how listeners understand English words that have shorter words embedded in them. A series of auditory-auditory priming experiments assessed the activation of six types of embedded words (2 embedded positions × 3 embedded proportions) under different listening conditions. Facilitation of lexical decision responses to targets (e.g., pig) associated with words embedded in primes (e.g., hamster) indexed activation of the embedded words (e.g., ham). When the listening conditions were optimal, isolated embedded words (e.g., ham) primed their targets in all six conditions (Experiment 1a). Within carrier words (e.g., hamster), the same set of embedded words produced priming only when they were at the beginning or comprised a large proportion of the carrier word (Experiment 1b). When the listening conditions were made suboptimal by expanding or compressing the primes, significant priming was found for isolated embedded words (Experiment 2a), but no priming was produced when the carrier words were compressed/expanded (Experiment 2b). Similarly, priming was eliminated when the carrier words were presented with one segment replaced by noise (Experiment 3). When cognitive load was imposed, priming for embedded words was again found when they were presented in isolation (Experiment 4a), but not when they were embedded in the carrier words (Experiment 4b). The results suggest that both embedded position and proportion play important roles in the activation of embedded words, but that such activation only occurs under unusually good listening conditions.

  18. WORD LEVEL DISCRIMINATIVE TRAINING FOR HANDWRITTEN WORD RECOGNITION

    NARCIS (Netherlands)

    Chen, W.; Gader, P.

    2004-01-01

    Word level training refers to the process of learning the parameters of a word recognition system based on word level criteria functions. Previously, researchers trained lexicon-driven handwritten word recognition systems at the character level individually. These systems generally use statistical

  19. A text analysis of the poems of Sylvia Plath.

    Science.gov (United States)

    Lester, David; McSwain, Stephanie

    2011-08-01

    Changes in the words used in the poems of Sylvia Plath were examined using the Linguistic Inquiry and Word Count, a computer program for analyzing the content of texts. Major changes in the content of her poems were observed over the course of Plath's career, as well as in the final year of her life. As the time of her suicide came closer, words expressing positive emotions became more frequent, while words concerned with causation and insight became less frequent.

  20. Effects of music on memory for text.

    Science.gov (United States)

    Purnell-Webb, Patricia; Speelman, Craig P

    2008-06-01

    Previous research has suggested that the use of song can facilitate recall of text. This study examined the effect of repetition of a melody across verses, familiarity with the melody, rhythm, and other structural processing hypotheses to explain this phenomenon. Two experiments were conducted, each with 100 participants recruited from undergraduate Psychology programs (44 men, 156 women, M age = 28.5 yr., SD = 9.4). In Exp. 1, participants learned a four-verse ballad in one of five encoding conditions (familiar melody, unfamiliar melody, unknown rhythm, known rhythm, and spoken). Exp. 2 assessed the effect of familiarity in rhythm-only conditions and of pre-exposure with a previously unfamiliar melody. Measures taken were number of verbatim words recalled and number of lines produced with correct syllabic structure. Analysis indicated that rhythm, with or without musical accompaniment, can facilitate recall of text, suggesting that rhythm may provide a schematic frame to which text can be attached. Similarly, familiarity with the rhythm or melody facilitated recall. Findings are discussed in terms of integration and dual-processing theories.

  1. An Algorithm for Morphological Segmentation of Esperanto Words

    Directory of Open Access Journals (Sweden)

    Guinard Theresa

    2016-04-01

    Full Text Available Morphological analysis (finding the component morphemes of a word and tagging morphemes with part-of-speech information) is a useful preprocessing step in many natural language processing applications, especially for synthetic languages. Compound words from the constructed language Esperanto are formed by straightforward agglutination, but for many words, there is more than one possible sequence of component morphemes. However, one segmentation is usually more semantically probable than the others. This paper presents a modified n-gram Markov model that finds the most probable segmentation of any Esperanto word, where the model’s states represent morpheme part-of-speech and semantic classes. The overall segmentation accuracy was over 98% for a set of presegmented dictionary words.
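
The general idea of choosing among competing segmentations with a Markov model over morpheme classes can be sketched as follows. This is only an illustration under strong assumptions, not the paper's modified n-gram model: the tiny lexicon, the coarse class labels and every probability below are invented, and the preference the sketch outputs simply follows from those invented numbers.

```python
import math

# Hypothetical toy lexicon: morpheme -> coarse class (invented for illustration).
LEXICON = {"sent": "root", "em": "suffix", "sen": "prefix", "tem": "root", "a": "ending"}

# Hypothetical bigram probabilities P(class | previous class); "^" marks word start.
BIGRAM = {
    ("^", "prefix"): 0.3, ("^", "root"): 0.7,
    ("prefix", "root"): 1.0,
    ("root", "suffix"): 0.5, ("root", "ending"): 0.5,
    ("suffix", "ending"): 0.8, ("suffix", "suffix"): 0.2,
}

def segmentations(word):
    """Yield every way to split `word` into morphemes found in LEXICON."""
    if not word:
        yield []
        return
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in LEXICON:
            for rest in segmentations(word[i:]):
                yield [head] + rest

def log_score(morphemes):
    """Log-probability of the class sequence under BIGRAM (-inf if a transition is unseen)."""
    total, prev = 0.0, "^"
    for morpheme in morphemes:
        cls = LEXICON[morpheme]
        p = BIGRAM.get((prev, cls), 0.0)
        if p == 0.0:
            return -math.inf
        total += math.log(p)
        prev = cls
    return total

def best_segmentation(word):
    """Return the highest-scoring segmentation, or None if none exists."""
    candidates = list(segmentations(word))
    return max(candidates, key=log_score) if candidates else None

# "sentema" can be split as sent+em+a or sen+tem+a with this lexicon.
print(list(segmentations("sentema")))
print(best_segmentation("sentema"))
```

In a realistic setting the transition probabilities would be estimated from presegmented words rather than written by hand, and exhaustive enumeration would be replaced by dynamic programming for long compounds.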

  2. Interactions between Digital Geometry and Combinatorics on Words

    Directory of Open Access Journals (Sweden)

    Srečko Brlek

    2011-08-01

    Full Text Available We review some recent results in digital geometry obtained by using a combinatorics on words approach to discrete geometry. Motivated on the one hand by the well-known theory of Sturmian words which model conveniently discrete lines in the plane, and on the other hand by the development of digital geometry, this study reveals strong links between the two fields. Discrete figures are identified with polyominoes encoded by words. The combinatorial tools lead to elegant descriptions of geometrical features and efficient algorithms. Among these, radix-trees are useful for efficiently detecting path intersection, Lyndon and Christoffel words appear as the main tools for describing digital convexity; equations on words allow to better understand tilings by translations.
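
Lyndon words, named in this abstract, are non-empty words that are strictly smaller in lexicographic order than all of their proper suffixes. As a small illustration of the combinatorics-on-words toolbox, the sketch below implements Duval's standard algorithm for factorizing a word into a non-increasing sequence of Lyndon words. It is a generic textbook routine, not code from the reviewed work, and it does not by itself test digital convexity.

```python
def lyndon_factorization(word):
    """Duval's algorithm: split `word` into a non-increasing sequence of Lyndon words."""
    factors = []
    n, i = len(word), 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power of a Lyndon word.
        while j < n and word[k] <= word[j]:
            k = i if word[k] < word[j] else k + 1
            j += 1
        # Emit copies of the Lyndon word of length j - k found in the run.
        while i <= k:
            factors.append(word[i:i + j - k])
            i += j - k
    return factors

def is_lyndon(word):
    """A non-empty word is Lyndon iff its factorization is the word itself."""
    return bool(word) and lyndon_factorization(word) == [word]

print(lyndon_factorization("banana"))
print(is_lyndon("aabab"), is_lyndon("abaab"))
```

The algorithm runs in time linear in the length of the word, which is one reason such factorizations are attractive as building blocks for efficient geometric tests on boundary words.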

  3. The x-word and its usage : Taboo words and swearwords in general, and x-words in newspapers

    OpenAIRE

    Lindahl, Katarina

    2008-01-01

    All languages have words that are considered taboo – words that are not supposed to be said or used. Taboo words, or swearwords, can be used in many different ways and they can have different meanings depending on what context they appear in. Another aspect of taboo words is the euphemisms that are used in order to avoid obscene speech. This paper will focus on x-words, words like the f-word or the c-word, which replace the words fuck or cunt, but as the study will show they also have other m...

  4. Mojibake - The rehearsal of word fragments in verbal recall.

    Science.gov (United States)

    Lange-Küttner, Christiane; Sykorova, Eva

    2015-01-01

    Theories of verbal rehearsal usually assume that whole words are being rehearsed. However, words consist of letter sequences, or syllables, or word onset-vowel-coda, amongst many other conceptualizations of word structure. A more general term is the 'grain size' of word units (Ziegler and Goswami, 2005). In the current study, a new method measured the quantitative percentage of correctly remembered word structure. The amount of letters in the correct letter sequence as per cent of word length was calculated, disregarding missing or added letters. A forced rehearsal was tested by repeating each memory list four times. We tested low frequency (LF) English words versus geographical (UK) town names to control for content. We also tested unfamiliar international (INT) non-words and names of international (INT) European towns to control for familiarity. An immediate versus distributed repetition was tested with a between-subject design. Participants responded with word fragments in their written recall especially when they had to remember unfamiliar words. While memory of whole words was sensitive to content, presentation distribution and individual sex and language differences, recall of word fragments was not. There was no trade-off between memory of word fragments with whole word recall during the repetition, instead also word fragments significantly increased. Moreover, while whole word responses correlated with each other during repetition, and word fragment responses correlated with each other during repetition, these two types of word recall responses were not correlated with each other. Thus there may be a lower layer consisting of free, sparse word fragments and an upper layer that consists of language-specific, orthographically and semantically constrained words.
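
The fragment measure described in this abstract (letters in the correct sequence as a percentage of word length, ignoring missing or added letters) can be approximated in code. The abstract does not give the exact scoring procedure, so the sketch below rests on an assumption: it uses the longest common subsequence between the target word and the written response as the count of correctly ordered letters. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def percent_correct_sequence(target, response):
    """Letters recalled in the right order, as a percentage of the target's length."""
    if not target:
        raise ValueError("target word must be non-empty")
    return 100.0 * lcs_length(target.lower(), response.lower()) / len(target)

# A response with one letter missing still scores most of the word.
print(round(percent_correct_sequence("hamster", "hamstr"), 1))
```

Under this reading, added letters in a response never reduce the score and missing letters reduce it proportionally, which matches the abstract's statement that missing or added letters were disregarded.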

  5. Contextual diversity facilitates learning new words in the classroom.

    Directory of Open Access Journals (Sweden)

    Eva Rosa

    Full Text Available In the field of word recognition and reading, it is commonly assumed that frequently repeated words create more accessible memory traces than infrequently repeated words, thus capturing the word-frequency effect. Nevertheless, recent research has shown that a seemingly related factor, contextual diversity (defined as the number of different contexts [e.g., films] in which a word appears), is a better predictor than word-frequency in word recognition and sentence reading experiments. Recent research has shown that contextual diversity plays an important role when learning new words in a laboratory setting with adult readers. In the current experiment, we directly manipulated contextual diversity in a very ecological scenario: at school, when Grade 3 children were learning words in the classroom. The new words appeared in different contexts/topics (high-contextual diversity) or only in one of them (low-contextual diversity). Results showed that words encountered in different contexts were learned and remembered more effectively than those presented in redundant contexts. We discuss the practical (educational [e.g., curriculum design]) and theoretical (models of word recognition) implications of these findings.
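
    The distinction between word frequency and contextual diversity can be illustrated with a minimal sketch. The toy corpus, the context names and the function name below are hypothetical and are not taken from the study; the sketch only shows how the two counts can diverge for the same word.

```python
from collections import Counter

def frequency_and_diversity(contexts):
    """contexts: dict mapping a context name to a list of word tokens."""
    frequency = Counter()   # total occurrences across all contexts
    diversity = Counter()   # number of distinct contexts containing the word
    for tokens in contexts.values():
        frequency.update(tokens)
        diversity.update(set(tokens))  # each context counts at most once per word
    return frequency, diversity

# Hypothetical toy corpus: three "films" with a few tokens each.
contexts = {
    "film_a": ["storm", "storm", "storm", "boat"],
    "film_b": ["boat", "harbor"],
    "film_c": ["boat", "storm"],
}
frequency, diversity = frequency_and_diversity(contexts)
```

    In this toy corpus "storm" is the more frequent word, while "boat" appears in more contexts, which is the kind of dissociation the contextual diversity measure is meant to capture.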

  6. The Activation of Embedded Words in Spoken Word Recognition

    Science.gov (United States)

    Zhang, Xujin; Samuel, Arthur G.

    2015-01-01

    The current study investigated how listeners understand English words that have shorter words embedded in them. A series of auditory-auditory priming experiments assessed the activation of six types of embedded words (2 embedded positions × 3 embedded proportions) under different listening conditions. Facilitation of lexical decision responses to targets (e.g., pig) associated with words embedded in primes (e.g., hamster) indexed activation of the embedded words (e.g., ham). When the listening conditions were optimal, isolated embedded words (e.g., ham) primed their targets in all six conditions (Experiment 1a). Within carrier words (e.g., hamster), the same set of embedded words produced priming only when they were at the beginning or comprised a large proportion of the carrier word (Experiment 1b). When the listening conditions were made suboptimal by expanding or compressing the primes, significant priming was found for isolated embedded words (Experiment 2a), but no priming was produced when the carrier words were compressed/expanded (Experiment 2b). Similarly, priming was eliminated when the carrier words were presented with one segment replaced by noise (Experiment 3). When cognitive load was imposed, priming for embedded words was again found when they were presented in isolation (Experiment 4a), but not when they were embedded in the carrier words (Experiment 4b). The results suggest that both embedded position and proportion play important roles in the activation of embedded words, but that such activation only occurs under unusually good listening conditions. PMID:25593407

  7. Word selection affects perceptions of synthetic biology

    Directory of Open Access Journals (Sweden)

    Tonidandel Scott

    2011-07-01

    Full Text Available Abstract Members of the synthetic biology community have discussed the significance of word selection when describing synthetic biology to the general public. In particular, many leaders proposed the word "create" was laden with negative connotations. We found that word choice and framing do affect public perception of synthetic biology. In a controlled experiment, participants perceived synthetic biology more negatively when "create" was used to describe the field compared to "construct" (p = 0.008). Contrary to popular opinion among synthetic biologists, however, low religiosity individuals were more negatively influenced by the framing manipulation than high religiosity people. Our results suggest that synthetic biologists directly influence public perception of their field through avoidance of the word "create".

  8. Word posets, with applications to Coxeter groups

    Directory of Open Access Journals (Sweden)

    Matthew J. Samuel

    2011-08-01

    Full Text Available We discuss the theory of certain partially ordered sets that capture the structure of commutation classes of words in monoids. As a first application, it follows readily that counting words in commutation classes is #P-complete. We then apply the partially ordered sets to Coxeter groups. Some results are a proof that enumerating the reduced words of elements of Coxeter groups is #P-complete, a recursive formula for computing the number of commutation classes of reduced words, as well as stronger bounds on the maximum number of commutation classes than were previously known. This also allows us to improve the known bounds on the number of primitive sorting networks.

  9. Some words on Word

    NARCIS (Netherlands)

    Janssen, Maarten; Visser, A.

    In many disciplines, the notion of a word is of central importance. For instance, morphology studies le mot comme tel, pris isolément (Mel’čuk, 1993 [74]). In the philosophy of language the word was often considered to be the primary bearer of meaning. Lexicography has as its fundamental role

  10. Word Similarity from Dictionaries: Inferring Fuzzy Measures from Fuzzy Graphs

    Directory of Open Access Journals (Sweden)

    Vicenc Torra

    2008-01-01

    Full Text Available The computation of similarities between words is a basic element of information retrieval systems when retrieval is not solely based on word matching. In this work we consider a measure between words based on dictionaries. This is achieved by assuming that a dictionary is formalized as a fuzzy graph. We show that the approach makes it possible to compute measures not only for pairs of words but also for sets of them.

  11. SUBTLEX-AL: Albanian word frequencies based on film subtitles

    Directory of Open Access Journals (Sweden)

    Dr.Sc. Rrezarta Avdyli

    2013-06-01

    Full Text Available Several recent studies have shown that word frequency estimates based on subtitle files explain the variance in word recognition performance better than traditional word frequency estimates do. The present study provides such a frequency estimate for Albanian, based on more than 2M words from film subtitles. Our results show a high correlation between the reaction times (RT) from a lexical decision (LD) study (120 stimuli) and SUBTLEX-AL, as well as a high correlation between SUBTLEX-AL and the only existing frequency list, which covers the hundred most frequent Albanian words. These findings suggest that SUBTLEX-AL is a good frequency estimate; furthermore, it is the first database of frequency estimates in Albanian larger than 100 words.

  12. WORD FORMATION ON DRAGON NEST CHAT LANGUAGE

    Directory of Open Access Journals (Sweden)

    Shavitri Cecillia Harsono

    2016-11-01

    Full Text Available Word formation is the creation of new words, which sometimes changes a word’s meaning. Words can be formed from multi-word phrases as well. In many cases, the vocabulary of a language is formed from combinations of words (Haspelmath 2010: 102). Word formation does not only involve changing the physical form of the word itself, but also changing the meaning of that word. There are also instances where the physical form is retained while the meaning changes. This phenomenon is called semantic change (Stockwell-Minkova 2001: 149). This thesis proposes that the phenomenon occurs in virtual environments such as MMORPGs, multiplayer online games that feature fantasy-setting virtual environments. For the purpose of this research, the Dragon Nest South East Asia (SEA) server was chosen as the data source. The samples were taken from players using the [World] communication channel. The data analysis shows that word formation can occur in the virtual environment of an MMORPG, specifically in Dragon Nest SEA. Two kinds of word formation processes were found: processes that involve physical changes, and processes that do not involve physical changes but rather changes in innate meaning. Players process everyday vocabulary both physically and by changing its innate meaning to create new words that suit the context of the virtual environment. This finding may offer future research a fresh perspective on an untilled field.

  13. Scaling laws and fluctuations in the statistics of word frequencies

    International Nuclear Information System (INIS)

    Gerlach, Martin; Altmann, Eduardo G

    2014-01-01

    In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps’ law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps’ and Taylor's) by modeling the usage of words using a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of lexical richness of texts with different lengths.
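
    The sublinear vocabulary growth described by Heaps' law can be reproduced with a minimal sketch. The code below is not the authors' model: it draws independent tokens from a Zipf-like rank distribution (no topic dependence), and the function names, the number of word types, the exponent and the seed are all assumptions chosen for illustration.

```python
import random

def vocabulary_growth(tokens):
    """Return V(n), the number of distinct words among the first n tokens."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth

def sample_zipf_text(n_tokens, n_types=5000, exponent=1.0, seed=0):
    """Draw independent tokens with probability proportional to rank ** -exponent."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_types + 1)]
    return rng.choices(range(n_types), weights=weights, k=n_tokens)

tokens = sample_zipf_text(20000)   # hypothetical text of 20,000 tokens
growth = vocabulary_growth(tokens)
```

    Doubling the text length in this sketch less than doubles the vocabulary. Repeating the draw with different seeds would give the simple-sampling fluctuations that the abstract contrasts with the topic-dependent case.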

  14. Contextual Richness and Word Learning: Context Enhances Comprehension but Retrieval Enhances Retention

    Science.gov (United States)

    van den Broek, Gesa S. E.; Takashima, Atsuko; Segers, Eliane; Verhoeven, Ludo

    2018-01-01

    Learning new vocabulary from context typically requires multiple encounters during which word meaning can be retrieved from memory or inferred from context. We compared the effect of memory retrieval and context inferences on short- and long-term retention in three experiments. Participants studied novel words and then practiced the words either…

  15. Verbal Reports of Proficient Readers In Coping With Unfamiliar Words

    Directory of Open Access Journals (Sweden)

    Kusumarasdyati Kusumarasdyati

    2016-02-01

    Full Text Available The present study reports the actual strategy use of good readers when they face hindrances in the form of unfamiliar words. Eight undergraduates majoring in English at Surabaya State University performed think-aloud while reading two texts to find out how they coped with such difficulties. The verbal protocol indicated that half of the participants mainly relied on a bilingual (English-Indonesian) dictionary to attack unfamiliar words, and only one of them preferred to use a monolingual (English-English) one. Two of them employed context cues to infer the meaning of the words, while one participant combined the use of context cues and a monolingual dictionary as the major strategy. All but one of the participants skipped some of the lexical items whose meaning was unknown to them, especially when these words did not make a key contribution to the meaning of the whole text.

  16. Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach.

    Science.gov (United States)

    Yan, Erjia; Williams, Jake; Chen, Zheng

    2017-01-01

    Publication metadata help deliver rich analyses of scholarly communication. However, research concepts and ideas are more effectively expressed through unstructured fields such as full texts. Thus, the goals of this paper are to employ a full-text enabled method to extract terms relevant to disciplinary vocabularies, and through them, to understand the relationships between disciplines. This paper uses an efficient, domain-independent term extraction method to extract disciplinary vocabularies from a large multidisciplinary corpus of PLoS ONE publications. It finds a power-law pattern in the frequency distributions of terms present in each discipline, indicating a semantic richness potentially sufficient for further study and advanced analysis. The salient relationships amongst these vocabularies become apparent in application of a principal component analysis. For example, Mathematics and Computer and Information Sciences were found to have similar vocabulary use patterns along with Engineering and Physics; while Chemistry and the Social Sciences were found to exhibit contrasting vocabulary use patterns along with the Earth Sciences and Chemistry. These results have implications to studies of scholarly communication as scholars attempt to identify the epistemological cultures of disciplines, and as a full text-based methodology could lead to machine learning applications in the automated classification of scholarly work according to disciplinary vocabularies.

  17. Word Similarity From Dictionaries: Inferring Fuzzy Measures From Fuzzy Graphs

    Directory of Open Access Journals (Sweden)

    Torra

    2008-01-01

    Full Text Available The computation of similarities between words is a basic element of information retrieval systems when retrieval is not solely based on word matching. In this work we consider a measure between words based on dictionaries. This is achieved by assuming that a dictionary is formalized as a fuzzy graph. We show that the approach makes it possible to compute measures not only for pairs of words but also for sets of them.

  18. Core Vocabulary: Its Morphological Content and Presence in Exemplar Texts

    Science.gov (United States)

    Hiebert, Elfrieda H.; Goodwin, Amanda P.; Cervetti, Gina N.

    2018-01-01

    This study addresses the distribution of words in texts at different points of schooling. The first aim was to identify a core vocabulary that accounts for the majority of the words in texts through the lens of morphological families. Results showed that 2,451 morphological families, averaging 4.61 members, make up the core vocabulary of school…

  19. Finding words in a language that allows words without vowels.

    Science.gov (United States)

    El Aissati, Abder; McQueen, James M; Cutler, Anne

    2012-07-01

    Across many languages from unrelated families, spoken-word recognition is subject to a constraint whereby potential word candidates must contain a vowel. This constraint minimizes competition from embedded words (e.g., in English, disfavoring win in twin because t cannot be a word). However, the constraint would be counter-productive in certain languages that allow stand-alone vowelless open-class words. One such language is Berber (where t is indeed a word). Berber listeners here detected words affixed to nonsense contexts with or without vowels. Length effects seen in other languages replicated in Berber, but in contrast to prior findings, word detection was not hindered by vowelless contexts. When words can be vowelless, otherwise universal constraints disfavoring vowelless words do not feature in spoken-word recognition. Copyright © 2012 Elsevier B.V. All rights reserved.

  20. WORD ORIGIN HELPS EXPAND LEARNERS’ VOCABULARY A VOCABULARY TEACHING APPROACH

    Directory of Open Access Journals (Sweden)

    Li Jing

    2012-12-01

    Full Text Available Word origin (motivation) deals with the connection between name and sense, explaining how a word originated. With knowledge of how words originated, learners can grasp a word more easily and thus expand their vocabulary more quickly. The introduction of word origin (motivation) by teachers can also help learners gain interest in the process of learning and learn more about the cultural and historical background of the English-speaking countries. This paper tries to clarify this method of teaching from four aspects: onomatopoeia, word formation, cultural and historical background, and cognitive linguistics.

  1. Entity recognition from clinical texts via recurrent neural network.

    Science.gov (United States)

    Liu, Zengjian; Yang, Ming; Wang, Xiaolong; Chen, Qingcai; Tang, Buzhou; Wang, Zhe; Xu, Hua

    2017-07-05

    Entity recognition is one of the most primary steps for text analysis and has long attracted considerable attention from researchers. In the clinical domain, various types of entities, such as clinical entities and protected health information (PHI), widely exist in clinical texts. Recognizing these entities has become a hot topic in clinical natural language processing (NLP), and a large number of traditional machine learning methods, such as support vector machine and conditional random field, have been deployed to recognize entities from clinical texts in the past few years. In recent years, recurrent neural network (RNN), one of the deep learning methods that has shown great potential on many problems including named entity recognition, also has been gradually used for entity recognition from clinical texts. In this paper, we comprehensively investigate the performance of LSTM (long short-term memory), a representative variant of RNN, on clinical entity recognition and protected health information recognition. The LSTM model consists of three layers: input layer - generates representation of each word of a sentence; LSTM layer - outputs another word representation sequence that captures the context information of each word in this sentence; Inference layer - makes tagging decisions according to the output of LSTM layer, that is, outputting a label sequence. Experiments conducted on corpora of the 2010, 2012 and 2014 i2b2 NLP challenges show that LSTM achieves highest micro-average F1-scores of 85.81% on the 2010 i2b2 medical concept extraction, 92.29% on the 2012 i2b2 clinical event detection, and 94.37% on the 2014 i2b2 de-identification, which is considerably competitive with other state-of-the-art systems. LSTM that requires no hand-crafted feature has great potential on entity recognition from clinical texts. It outperforms traditional machine learning methods that suffer from fussy feature engineering. A possible future direction is how to integrate knowledge
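
    The micro-average F1-scores reported above pool true positives, false positives and false negatives over all documents before computing precision and recall. A minimal sketch follows, assuming exact matching of (start, end, label) entity tuples; the matching criterion, the function name and the toy annotations are assumptions for illustration and not the official i2b2 evaluation scripts.

```python
def micro_f1(gold_by_doc, pred_by_doc):
    """Each argument is a list of sets of (start, end, label) tuples, one set per document."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_by_doc, pred_by_doc):
        tp += len(gold & pred)   # predicted entities that exactly match a gold entity
        fp += len(pred - gold)   # predicted entities with no exact gold match
        fn += len(gold - pred)   # gold entities that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations for two documents.
gold = [{(0, 2, "problem"), (5, 6, "test")}, {(1, 3, "treatment")}]
pred = [{(0, 2, "problem"), (4, 6, "test")}, {(1, 3, "treatment")}]
score = micro_f1(gold, pred)
```

    Because counts are pooled before division, documents with many entities weigh more heavily in the micro-average than documents with few.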

  2. Normative data for 148 Spanish emotional words in terms of attributions of humanity

    Directory of Open Access Journals (Sweden)

    Armando Rodríguez-Pérez

    2014-10-01

    Full Text Available Research on outgroup infrahumanization is based on the subtle and not deliberate distinction of secondary emotions, an exclusively human emotion, and primary emotions, which are shared by animals and human beings. According to prior studies, people attribute more secondary emotions to the ingroup than to the outgroup, to which they deny or restrict the ability to experience them. This study presents normative measures for 148 emotional words viewed by Spanish people in seven dimensions related to humanity assessments. Two factors were revealed by the principal components analysis (PCA). The first component was loaded on dimensions that differentiate the emotions depending on the cognitive demands (cognition, moral quality and duration), whereas the second one was loaded on their expressive profile (visibility, age at which they are acquired, universality and causal locus). These dimensions were analyzed in relation to desirability, familiarity and explicit humanity.

  3. A multiresolutional approach to fuzzy text meaning: A first attempt

    Energy Technology Data Exchange (ETDEWEB)

    Mehler, A.

    1996-12-31

    The present paper focuses on the connotative meaning aspect of language signs especially above the level of words. In this context the view is taken that texts can be defined as a kind of supersign, to which, in the same way as to other signs, a meaning can be assigned. A text can therefore be described as the result of a sign articulation which connects the material text sign with a corresponding meaning. For the constitution of the structural text meaning a kind of a semiotic composition principle is responsible, which leads to the emergence of interlocked levels of language units, demonstrating different grades of resolution. Starting on the level of words, and going through the level of sentences this principle reaches finally the level of texts by aggregating step by step the meaning of a unit on a higher level out of the meanings of all components one level below, which occur within this unit. Besides, this article will elaborate the hypothesis that the meaning constitution as a two-stage process, corresponding to the syntagmatic and paradigmatic restrictions of language elements among each other, obtains equally on the level of texts. On text level this two-levelledness leads to the constitution of the connotative text meaning, whose constituents are determined on word level by the syntagmatic and paradigmatic relations of the words. The formalization of the text meaning representation occurs with the help of fuzzy set theory.

  4. Familiar units prevail over statistical cues in word segmentation.

    Science.gov (United States)

    Poulin-Charronnat, Bénédicte; Perruchet, Pierre; Tillmann, Barbara; Peereman, Ronald

    2017-09-01

    In language acquisition research, the prevailing position is that listeners exploit statistical cues, in particular transitional probabilities between syllables, to discover words of a language. However, other cues are also involved in word discovery. Assessing the weight learners give to these different cues leads to a better understanding of the processes underlying speech segmentation. The present study evaluated whether adult learners preferentially used known units or statistical cues for segmenting continuous speech. Before the exposure phase, participants were familiarized with part-words of a three-word artificial language. This design allowed the dissociation of the influence of statistical cues and familiar units, with statistical cues favoring word segmentation and familiar units favoring (nonoptimal) part-word segmentation. In Experiment 1, performance in a two-alternative forced choice (2AFC) task between words and part-words revealed part-word segmentation (even though part-words were less cohesive in terms of transitional probabilities and less frequent than words). By contrast, an unfamiliarized group exhibited word segmentation, as usually observed in standard conditions. Experiment 2 used a syllable-detection task to remove the likely contamination of performance by memory and strategy effects in the 2AFC task. Overall, the results suggest that familiar units overrode statistical cues, ultimately questioning the need for computation mechanisms of transitional probabilities (TPs) in natural language speech segmentation.
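
    The transitional probability (TP) between syllables that the abstract refers to is commonly computed as the count of a syllable pair divided by the count of the pair's first syllable. A minimal sketch follows; the syllable stream is a made-up example and is not the artificial language used in the study.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(a -> b) = count(a immediately followed by b) / count(a in non-final position)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Hypothetical stream built from the made-up words "tupiro", "golabu" and "padoti".
stream = "tu pi ro go la bu tu pi ro pa do ti go la bu tu pi ro".split()
tps = transitional_probabilities(stream)
```

    In this stream the within-word transition "tu" to "pi" has a TP of 1.0, whereas the transitions out of "ro", which cross a word boundary, are each 0.5. That drop is the statistical cue to a word boundary.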

  5. Large-corpus phoneme and word recognition and the generality of lexical context in CVC word perception.

    Science.gov (United States)

    Gelfand, Jessica T; Christie, Robert E; Gelfand, Stanley A

    2014-02-01

    Speech recognition may be analyzed in terms of recognition probabilities for perceptual wholes (e.g., words) and parts (e.g., phonemes), where j or the j-factor reveals the number of independent perceptual units required for recognition of the whole (Boothroyd, 1968b; Boothroyd & Nittrouer, 1988; Nittrouer & Boothroyd, 1990). For consonant-vowel-consonant (CVC) nonsense syllables, j ∼ 3 because all 3 phonemes are needed to identify the syllable, but j ∼ 2.5 for real-word CVCs (revealing ∼2.5 independent perceptual units) because higher level contributions such as lexical knowledge enable word recognition even if less than 3 phonemes are accurately received. These findings were almost exclusively determined with the 120-word corpus of the isophonemic word lists (Boothroyd, 1968a; Boothroyd & Nittrouer, 1988), presented one word at a time. It is therefore possible that its generality or applicability may be limited. This study thus determined j by using a much larger and less restricted corpus of real-word CVCs presented in 3-word groups as well as whether j is influenced by test size. The j-factor for real-word CVCs was derived from the recognition performance of 223 individuals with a broad range of hearing sensitivity by using the Tri-Word Test (Gelfand, 1998), which involves 50 three-word presentations and a corpus of 450 words. The influence of test size was determined from a subsample of 96 participants with separate scores for the first 10, 20, and 25 (and all 50) presentation sets of the full test. The mean value of j was 2.48 with a 95% confidence interval of 2.44-2.53, which is in good agreement with values obtained with isophonemic word lists, although its value varies among individuals. A significant correlation was found between percent-correct scores and j, but it was small and accounted for only 12.4% of the variance in j for phoneme scores ≥60%. Mean j-factors for the 10-, 20-, 25-, and 50-set test sizes were between 2.49 and 2.53 and were not
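
    The j-factor follows from the relation p_whole = p_part ** j (Boothroyd & Nittrouer, 1988), so j = log(p_whole) / log(p_part). A minimal sketch of that computation is given below; the function name and the example probability are illustrative assumptions and are not data from the study.

```python
import math

def j_factor(p_whole, p_part):
    """Solve p_whole = p_part ** j for j; both probabilities must lie strictly between 0 and 1."""
    if not (0.0 < p_whole < 1.0 and 0.0 < p_part < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    return math.log(p_whole) / math.log(p_part)

# Hypothetical case: three independent phonemes, each recognized with probability 0.8,
# so the whole syllable is recognized with probability 0.8 ** 3.
j_nonsense = j_factor(0.8 ** 3, 0.8)
```

    When the parts are statistically independent, as assumed for nonsense syllables, j equals the number of parts. Lexical context raises whole-word recognition for a given phoneme score, which lowers j below 3.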

  6. Reading in Developmental Prosopagnosia: Evidence for a Dissociation Between Word and Face Recognition

    DEFF Research Database (Denmark)

    Starrfelt, Randi; Klargaard, Solja; Petersen, Anders

    2018-01-01

    ......, that is, impaired reading in developmental prosopagnosia. Method: We tested 10 adults with developmental prosopagnosia and 20 matched controls. All participants completed the Cambridge Face Memory Test, the Cambridge Face Perception test and a Face recognition questionnaire used to quantify everyday face recognition experience. Reading was measured in four experimental tasks, testing different levels of letter, word, and text reading: a) single word reading with words of varying length, b) vocal response times in single letter and short word naming, c) recognition of single letters and short words at brief exposure durations (targeting the word superiority effect), and d) text reading. Results: Participants with developmental prosopagnosia performed strikingly similar to controls across the four reading tasks. Formal analysis revealed a significant dissociation between word and face recognition......

  7. Social interaction facilitates word learning in preverbal infants: Word-object mapping and word segmentation.

    Science.gov (United States)

    Hakuno, Yoko; Omori, Takahide; Yamamoto, Jun-Ichi; Minagawa, Yasuyo

    2017-08-01

    In natural settings, infants learn spoken language with the aid of a caregiver who explicitly provides social signals. Although previous studies have demonstrated that young infants are sensitive to these signals that facilitate language development, the impact of real-life interactions on early word segmentation and word-object mapping remains elusive. We tested whether infants aged 5-6 months and 9-10 months could segment a word from continuous speech and acquire a word-object relation in an ecologically valid setting. In Experiment 1, infants were exposed to a live tutor, while in Experiment 2, another group of infants were exposed to a televised tutor. Results indicate that both younger and older infants were capable of segmenting a word and learning a word-object association only when the stimuli were derived from a live tutor in a natural manner, suggesting that real-life interaction enhances the learning of spoken words in preverbal infants. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Acoustic and semantic interference effects in words and pictures.

    Science.gov (United States)

    Dhawan, M; Pellegrino, J W

    1977-05-01

    Interference effects for pictures and words were investigated using a probe-recall task. Word stimuli showed acoustic interference effects for items at the end of the list and semantic interference effects for items at the beginning of the list, similar to results of Kintsch and Buschke (1969). Picture stimuli showed large semantic interference effects at all list positions with smaller acoustic interference effects. The results were related to latency data on picture-word processing and interpreted in terms of the differential order, probability, and/or speed of access to acoustic and semantic levels of processing. A levels of processing explanation of picture-word retention differences was related to dual coding theory. Both theoretical positions converge on an explanation of picture-word retention differences as a function of the relative capacity for semantic or associative processing.

  9. Image Captioning with Word Gate and Adaptive Self-Critical Learning

    Directory of Open Access Journals (Sweden)

    Xinxin Zhu

    2018-06-01

    Full Text Available Although the policy-gradient methods for reinforcement learning have shown significant improvement in image captioning, how to achieve high performance during the reinforcement optimizing process is still not a simple task. There are at least two difficulties: (1) The large size of vocabulary leads to a large action space, which makes it difficult for the model to accurately predict the current word. (2) The large variance of gradient estimation in reinforcement learning usually causes severe instabilities in the training process. In this paper, we propose two innovations to boost the performance of self-critical sequence training (SCST). First, we modify the standard long short-term memory (LSTM)-based decoder by introducing a gate function to reduce the search scope of the vocabulary for any given image, which is termed the word gate decoder. Second, instead of only considering current maximum actions greedily, we propose a stabilized gradient estimation method whose gradient variance is controlled by the difference between the sampling reward from the current model and the expectation of the historical reward. We conducted extensive experiments, and results showed that our method could accelerate the training process and increase the prediction accuracy. Our method was validated on MS COCO datasets and yielded state-of-the-art performance.
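
    The idea of weighting the policy gradient by the difference between a sampled reward and an expectation of historical rewards can be sketched with a running baseline. The abstract does not say how the historical expectation is estimated, so the exponential moving average, the class name and the momentum value below are assumptions for illustration and not the authors' method.

```python
class HistoricalBaseline:
    """Running estimate of past rewards, used as a baseline for policy-gradient updates."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.value = None  # no reward history yet

    def advantage(self, sampled_reward):
        """Return reward minus the historical baseline, then fold the reward into the history."""
        if self.value is None:
            self.value = sampled_reward
        adv = sampled_reward - self.value
        self.value = self.momentum * self.value + (1 - self.momentum) * sampled_reward
        return adv

# Hypothetical caption rewards (for example, CIDEr scores) from three training steps.
baseline = HistoricalBaseline(momentum=0.5)
advantages = [baseline.advantage(r) for r in [1.0, 2.0, 2.0]]
```

    In a policy-gradient update, each advantage would scale the gradient of the log-probability of the sampled caption. Subtracting a baseline leaves the expected gradient unchanged while typically reducing its variance.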

  10. Does neighborhood size really cause the word length effect?

    Science.gov (United States)

    Guitard, Dominic; Saint-Aubin, Jean; Tehan, Gerald; Tolan, Anne

    2018-02-01

    In short-term serial recall, it is well-known that short words are remembered better than long words. This word length effect has been the cornerstone of the working memory model and a benchmark effect that all models of immediate memory should account for. Currently, there is no consensus as to what determines the word length effect. Jalbert and colleagues (Jalbert, Neath, Bireta, & Surprenant, 2011a; Jalbert, Neath, & Surprenant, 2011b) suggested that neighborhood size is one causal factor. In six experiments we systematically examined their suggestion. In Experiment 1, with an immediate serial recall task, multiple word lengths, and a large pool of words controlled for neighborhood size, the typical word length effect was present. In Experiments 2 and 3, with an order reconstruction task and words with either many or few neighbors, we observed the typical word length effect. In Experiment 4 we tested the hypothesis that the previous abolition of the word length effect when neighborhood size was controlled was due to a confounded factor: frequency of orthographic structure. As predicted, we reversed the word length effect when using short words with less frequent orthographic structures than the long words, as was done in both of Jalbert et al.'s studies. In Experiments 5 and 6, we again observed the typical word length effect, even if we controlled for neighborhood size and frequency of orthographic structure. Overall, the results were not consistent with the predictions of Jalbert et al. and clearly showed a large and reliable word length effect after controlling for neighborhood size.

  11. Deep Belief Networks Based Toponym Recognition for Chinese Text

    Directory of Open Access Journals (Sweden)

    Shu Wang

    2018-06-01

    Full Text Available In Geographical Information Systems, geo-coding is used for the task of mapping from implicitly geo-referenced data to explicitly geo-referenced coordinates. At present, an enormous amount of implicitly geo-referenced information is hidden in unstructured text, e.g., Wikipedia, social data and news. Toponym recognition is the foundation of mining this useful geo-referenced information by identifying words as toponyms in text. In this paper, we propose an adapted toponym recognition approach based on a deep belief network (DBN) by exploring two key issues: word representation and model interpretation. A Skip-Gram model is used in the word representation process to represent words with contextual information that is ignored by current word representation models. We then determine the core hyper-parameters of the DBN model by illustrating the relationship between the performance and the hyper-parameters, e.g., vector dimensionality, DBN structures and probability thresholds. The experiments evaluate the performance of the Skip-Gram model implemented by the Word2Vec open-source tool, determine stable hyper-parameters and compare our approach with a conditional random field (CRF) based approach. The experimental results show that the DBN model outperforms the CRF model with a smaller corpus. When the corpus size is large enough, their statistical metrics converge. However, their recognition results show differences and complementarity on different kinds of toponyms. More importantly, combining their results can directly improve the performance of toponym recognition relative to their individual performances. It seems that the scale of the corpus has an obvious effect on the performance of toponym recognition. Generally, there is no adequate tagged corpus on specific toponym recognition tasks, especially in the era of Big Data. In conclusion, we believe that the DBN-based approach is a promising and powerful method to extract geo

  12. Modeling Word Burstiness Using the Dirichlet Distribution

    DEFF Research Database (Denmark)

    Madsen, Rasmus Elsborg; Kauchak, David; Elkan, Charles

    2005-01-01

    Multinomial distributions are often used to model text documents. However, they do not capture well the phenomenon that words in a document tend to appear in bursts: if a word appears once, it is more likely to appear again. In this paper, we propose the Dirichlet compound multinomial model (DCM......) as an alternative to the multinomial. The DCM model has one additional degree of freedom, which allows it to capture burstiness. We show experimentally that the DCM is substantially better than the multinomial at modeling text data, measured by perplexity. We also show using three standard document collections...
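
    The contrast drawn in this abstract can be illustrated with a minimal sketch that evaluates the standard multinomial and Dirichlet compound multinomial (Pólya) log-likelihoods on two toy documents. The four-word vocabulary, the counts, and the parameter values below are assumptions chosen for illustration and do not come from the paper.

```python
import math

def log_multinomial(counts, probs):
    """Log-probability of a word-count vector under a multinomial."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)

def log_dcm(counts, alphas):
    """Log-probability of a word-count vector under the DCM (Polya) model."""
    n = sum(counts)
    alpha_sum = sum(alphas)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    log_norm = math.lgamma(alpha_sum) - math.lgamma(n + alpha_sum)
    log_terms = sum(math.lgamma(c + a) - math.lgamma(a)
                    for c, a in zip(counts, alphas))
    return log_coef + log_norm + log_terms

# Hypothetical toy documents over a 4-word vocabulary (illustrative only).
bursty = [4, 0, 0, 0]   # one word repeated four times
spread = [1, 1, 1, 1]   # four different words, once each

uniform_probs = [0.25] * 4   # assumed multinomial parameters
small_alphas = [0.1] * 4     # assumed DCM parameters; small values favour bursts

print("multinomial:", log_multinomial(bursty, uniform_probs),
      log_multinomial(spread, uniform_probs))
print("DCM:        ", log_dcm(bursty, small_alphas),
      log_dcm(spread, small_alphas))
```

    With these assumed parameters the multinomial prefers the evenly spread document, while the DCM assigns more probability to the bursty one.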

  13. Morpheme matching based text tokenization for a scarce resourced language.

    Science.gov (United States)

    Rehman, Zobia; Anwar, Waqas; Bajwa, Usama Ijaz; Xuan, Wang; Chaoying, Zhou

    2013-01-01

    Text tokenization is a fundamental pre-processing step for almost all information processing applications. This task is nontrivial for scarce resourced languages such as Urdu, as there is inconsistent use of space between words. In this paper a morpheme matching based approach has been proposed for Urdu text tokenization, along with some other algorithms to solve the additional issues of boundary detection of compound words, affixation, reduplication, names and abbreviations. This study resulted in 97.28% precision, 93.71% recall, and 95.46% F1-measure when tokenizing a corpus of 57000 words using a morpheme list with 6400 entries.
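
    As a quick arithmetic check, the reported F1-measure can be recomputed from the reported precision and recall, assuming it is the usual harmonic mean of the two. The function name below is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(97.28, 93.71)
print(round(f1, 2))  # 95.46
```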

  14. A Few Words about Words | Poster

    Science.gov (United States)

    By Ken Michaels, Guest Writer In Shakespeare’s play “Hamlet,” Polonius inquires of the prince, “What do you read, my lord?” Not at all pleased with what he’s reading, Hamlet replies, “Words, words, words.” I have previously described the communication model in which a sender encodes a message and then sends it via some channel (or medium) to a receiver, who decodes the message

  15. Information content versus word length in random typing

    International Nuclear Information System (INIS)

    Ferrer-i-Cancho, Ramon; Moscoso del Prado Martín, Fermín

    2011-01-01

    Recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization and it has been shown that this linearity is supported by a strong correlation between information content and word length in many languages (Piantadosi et al 2011 Proc. Nat. Acad. Sci. 108 3825). Here, we study in detail some connections between this measure and standard information theory. The relationship between the measure and word length is studied for the popular random typing process where a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter. Although this random process does not optimize word lengths according to information content, it exhibits a linear relationship between information content and word length. The exact slope and intercept are presented for three major variants of the random typing process. A strong correlation between information content and word length can simply arise from the units making a word (e.g., letters) and not necessarily from the interplay between a word and its context as proposed by Piantadosi and co-workers. In itself, the linear relation does not entail the results of any optimization process. (letter)
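
    The linearity described above can be sketched for the simplest case. The sketch assumes equiprobable letters, independent keystrokes, and information content taken as the negative log-probability of typing a given word followed by a space; the paper's exact slopes and intercepts for its three variants are not reproduced here, and the parameter values are illustrative.

```python
import math

def word_information(length, n_letters=26, p_space=0.2):
    """Bits needed for one specific word of `length` letters followed by a space,
    assuming independent keystrokes and equiprobable letters."""
    p_letter = (1.0 - p_space) / n_letters
    p_word = (p_letter ** length) * p_space
    return -math.log2(p_word)

def linear_coefficients(n_letters=26, p_space=0.2):
    """Slope and intercept of information content as a function of word length."""
    slope = math.log2(n_letters / (1.0 - p_space))
    intercept = -math.log2(p_space)
    return slope, intercept

slope, intercept = linear_coefficients()
for length in range(1, 6):
    print(length, word_information(length), slope * length + intercept)
```

    Under these assumptions each extra letter adds the same number of bits, so the relationship is exactly linear even though nothing is being optimized.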

  16. WORD OF MOUTH SEBAGAI KONSEKUENSI KEPUASAN PELANGGAN

    Directory of Open Access Journals (Sweden)

    Eny Purbandari

    2018-03-01

    Full Text Available The objective of this study is to investigate the impact of price and service quality on customer satisfaction to increase word of mouth. Data were collected by distributing questionnaires to 110 patients of Bhayangkara Polda DIY Hospital and were analyzed using structural equation modeling. The results showed that service quality, price and image have a positive effect on patient satisfaction, and that patient satisfaction has a positive effect on word of mouth. The results also show that image has the strongest effect in creating satisfaction. Therefore, the word-of-mouth model is acceptable.

  17. Examining the Effect of Interference on Short-Term Memory Recall of Arabic Abstract and Concrete Words Using Free, Cued, and Serial Recall Paradigms

    Science.gov (United States)

    Alduais, Ahmed Mohammed Saleh; Almukhaizeem, Yasir Saad

    2015-01-01

    Purpose: To see if there is a correlation between interference and short-term memory recall and to examine interference as a factor affecting memory recalling of Arabic and abstract words through free, cued, and serial recall tasks. Method: Four groups of undergraduates in King Saud University, Saudi Arabia participated in this study. The first…

  18. Word final schwa is driven by intonation-The case of Bari Italian.

    Science.gov (United States)

    Grice, Martine; Savino, Michelina; Roettger, Timo B

    2018-04-01

    In order to convey pragmatic functions, a speaker has to select an intonation contour (the tune) in addition to the words that are to be spoken (the text). The tune and text are assumed to be independent of each other, such that any one intonation contour can be produced on different phrases, regardless of the number and nature of the segments they are made up of. However, if the segmental string is too short, certain tunes, especially those with a rising component, call for adjustments to the text. In Italian, for instance, loan words such as "chat" can be produced with a word final schwa when this word occurs at the end of a question. This paper investigates this word final schwa in the Bari variety in a number of different intonation contours. Although its presence and duration are to some extent dependent on idiosyncratic properties of speakers and words, schwa is largely conditioned by intonation. Schwa cannot thus be considered a mere phonetic artefact, since it is relevant for phonology, in that it facilitates the production of communicatively relevant intonation contours.

  19. Words and possible words in early language acquisition.

    Science.gov (United States)

    Marchetto, Erika; Bonatti, Luca L

    2013-11-01

    In order to acquire language, infants must extract its building blocks (words) and master the rules governing their legal combinations from speech. These two problems are not independent, however: words also have internal structure. Thus, infants must extract two kinds of information from the same speech input. They must find the actual words of their language. Furthermore, they must identify its possible words, that is, the sequences of sounds that, being morphologically well formed, could be words. Here, we show that infants' sensitivity to possible words appears to be more primitive and fundamental than their ability to find actual words. We expose 12- and 18-month-old infants to an artificial language containing a conflict between statistically coherent and structurally coherent items. We show that 18-month-olds can extract possible words when the familiarization stream contains marks of segmentation, but cannot do so when the stream is continuous. Yet, they can find actual words from a continuous stream by computing statistical relationships among syllables. By contrast, 12-month-olds can find possible words when familiarized with a segmented stream, but seem unable to extract statistically coherent items from a continuous stream that contains minimal conflicts between statistical and structural information. These results suggest that sensitivity to word structure is in place earlier than the ability to analyze distributional information. The ability to compute nontrivial statistical relationships becomes fully effective relatively late in development, when infants have already acquired a considerable amount of linguistic knowledge. Thus, mechanisms for structure extraction that do not rely on extensive sampling of the input are likely to have a much larger role in language acquisition than general-purpose statistical abilities. Copyright © 2013. Published by Elsevier Inc.

  20. Defining a Conceptual Topography of Word Concreteness: Clustering Properties of Emotion, Sensation, and Magnitude among 750 English Words

    Directory of Open Access Journals (Sweden)

    Joshua Troche

    2017-10-01

    Full Text Available Cognitive science has a longstanding interest in the ways that people acquire and use abstract vs. concrete words (e.g., truth vs. piano). One dominant theory holds that abstract and concrete words are subserved by two parallel semantic systems. We recently proposed an alternative account of abstract-concrete word representation premised upon a unitary, high dimensional semantic space wherein word meaning is nested. We hypothesize that a range of cognitive and perceptual dimensions (e.g., emotion, time, space, color, size, visual form) bound this space, forming a conceptual topography. Here we report a normative study where we examined the clustering properties of a sample of English words (N = 750) spanning a spectrum of concreteness in a continuous manner from highly abstract to highly concrete. Participants (N = 328) rated each target word on a range of 14 cognitive dimensions (e.g., color, emotion, valence, polarity, motion, space). The dimensions reduced to three factors: Endogenous factor, Exogenous factor, and Magnitude factor. Concepts were plotted in a unified, multimodal space with concrete and abstract concepts along a continuum. We discuss theoretical implications and practical applications of this dataset. These word norms are freely available for download and use at http://www.reilly-coglab.com/data/.

  1. Rational kernels for Arabic Root Extraction and Text Classification

    Directory of Open Access Journals (Sweden)

    Attia Nehar

    2016-04-01

    Full Text Available In this paper, we address the problems of Arabic Text Classification and root extraction using transducers and rational kernels. We introduce a new root extraction approach based on the use of Arabic patterns (Pattern Based Stemmer). Transducers are used to model these patterns and root extraction is done without relying on any dictionary. Using transducers for extracting roots, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Root extraction experiments are conducted on three word collections and yield 75.6% accuracy. Classification experiments are done on the Saudi Press Agency dataset and N-gram kernels are tested with different values of N. Accuracy and F1 reach 90.79% and 62.93%, respectively. These results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy and F1.
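
    For readers unfamiliar with N-gram kernels, the quantity they compute can be sketched by direct counting: the kernel value is the sum, over shared N-grams, of the product of their counts in the two sequences. The paper works with transducers; the counting approach and the toy token sequences below are assumptions for illustration only.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_kernel(x, y, n):
    """Sum over shared n-grams of count-in-x times count-in-y."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(cx[g] * cy[g] for g in cx if g in cy)

# Hypothetical token sequences standing in for two documents of extracted roots.
doc_x = "a b a b".split()
doc_y = "b a b c".split()

for n in (1, 2, 3):
    print(n, ngram_kernel(doc_x, doc_y, n))
```

    Larger values of N make the kernel stricter, since longer shared subsequences are required before two documents count as similar.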

  2. ADAPTING HYBRID MACHINE TRANSLATION TECHNIQUES FOR CROSS-LANGUAGE TEXT RETRIEVAL SYSTEM

    Directory of Open Access Journals (Sweden)

    P. ISWARYA

    2017-03-01

    Full Text Available This research work aims at developing a Tamil-to-English cross-language text retrieval system using a hybrid machine translation approach. The hybrid machine translation system is a combination of rule based and statistical based approaches. Existing word-by-word translation systems have many issues, among them ambiguity, out-of-vocabulary words, word inflections, and improper sentence structure. To handle these issues, the proposed architecture contains an improved part-of-speech tagger, a machine learning based morphological analyser, a collocation based word sense disambiguation procedure, a semantic dictionary, tense markers with gerund ending rules, and a two-pass transliteration algorithm. The experimental results show that the proposed Tamil query based translation system achieves significantly better translation quality than the existing system, and reaches 95.88% of monolingual performance.

  3. The effect of post-learning presentation of music on long-term word-list retention.

    Science.gov (United States)

    Judde, Sarah; Rickard, Nikki

    2010-07-01

    Memory consolidation processes occur slowly over time, allowing recently formed memories to be altered soon after acquisition. Although post-learning arousal treatments have been found to modulate memory consolidation, examination of the temporal parameters of these effects in humans has been limited. In the current study, 127 participants learned a neutral word list and were exposed to either a positively or negatively arousing musical piece following delays of 0, 20 or 45 min. One week later, participants completed a long-term memory recognition test, followed by Carver and White's (1994) approach/avoidance personality scales. Retention was significantly enhanced, regardless of valence, when the emotion manipulation occurred at 20 min, but not immediately or 45 min, post-learning. Further, the 20 min interval effect was found to be moderated by high 'drive' approach sensitivity. The selective facilitatory conditions of music identified in the current study (timing and personality) offer valuable insights for future development of more specified memory intervention strategies.

  4. Text Manipulation Techniques and Foreign Language Composition.

    Science.gov (United States)

    Walker, Ronald W.

    1982-01-01

    Discusses an approach to teaching second language composition which emphasizes (1) careful analysis of model texts from a limited, but well-defined perspective and (2) the application of text manipulation techniques developed by the word processing industry to student compositions. (EKN)

  5. Spectrotemporal processing drives fast access to memory traces for spoken words.

    Science.gov (United States)

    Tavano, A; Grimm, S; Costa-Faidella, J; Slabu, L; Schröger, E; Escera, C

    2012-05-01

    The Mismatch Negativity (MMN) component of the event-related potentials is generated when a detectable spectrotemporal feature of the incoming sound does not match the sensory model set up by preceding repeated stimuli. MMN is enhanced at frontocentral scalp sites for deviant words when compared to acoustically similar deviant pseudowords, suggesting that automatic access to long-term memory traces for spoken words contributes to MMN generation. Does spectrotemporal feature matching also drive automatic lexical access? To test this, we recorded human auditory event-related potentials (ERPs) to disyllabic spoken words and pseudowords within a passive oddball paradigm. We first aimed at replicating the word-related MMN enhancement effect for Spanish, thereby adding to the available cross-linguistic evidence (e.g., Finnish, English). We then probed its resilience to spectrotemporal perturbation by inserting short (20 ms) and long (120 ms) silent gaps between first and second syllables of deviant and standard stimuli. A significantly enhanced, frontocentrally distributed MMN to deviant words was found for stimuli with no gap. The long gap yielded no deviant word MMN, showing that prior expectations of word form limits in a given language influence deviance detection processes. Crucially, the insertion of a short gap suppressed deviant word MMN enhancement at frontocentral sites. We propose that spectrotemporal point-wise matching constitutes a core mechanism for fast serial computations in audition and language, bridging sensory and long-term memory systems. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. A Bidirectional Relationship between Conceptual Organization and Word Learning

    Directory of Open Access Journals (Sweden)

    Tanya Kaefer

    2013-01-01

    Full Text Available This study explores the relationship between word learning and conceptual organization for preschool-aged children. We proposed a bidirectional model in which increases in word learning lead to increases in taxonomic organization, which, in turn, leads to further increases in word learning. In order to examine this model, we recruited 104 4-year olds from Head Start classrooms; 52 children participated in a two-week training program, and 52 children were in a control group. Results indicated that children in the training program learned more words and were more likely to sort taxonomically than children in the control condition. Furthermore, the number of words learned over the training period predicted the extent to which children categorized taxonomically. Additionally, this ability to categorize taxonomically predicted the number of words learned outside the training program, over and above the number of words learned in the program. These results suggest a bi-directional relationship between conceptual organization and word learning.

  7. Understanding text as social practice: An exploration of the potential of systemic functional grammar to facilitate students' interpretation of media texts.

    Directory of Open Access Journals (Sweden)

    Jenny Clarence-Fincham

    2008-08-01

    Full Text Available It has frequently been claimed that Halliday's Systemic Functional Grammar (SFG) is a powerful linguistic tool which facilitates analytical and interpretative skills and provides a flexible, yet structured set of analytical tools with which to interpret texts. With this claim as a backdrop, this article asks whether SFG is, in fact, an appropriate analytical approach for undergraduate students and whether it can facilitate their ability to analyse texts. Its context is a second level course, Analysing Media Texts, offered at Natal University. Broadly framed by critical discourse analysis, it traces the development of a thirteen week module and, using student analyses for illustrative purposes, identifies pedagogical challenges and difficulties that need to be confronted before any strong claims can be made. It is concluded that, on the evidence of students' responses to texts analysed during this course, it is not yet possible to make strong claims about the benefits of SFG. There is enough positive evidence, however, to pursue the possibility that with innovative curriculum development and the careful scaffolding and integration of concepts, SFG will be clearly shown to have an extremely important role to play. Daar is dikwels beweer dat Halliday se Sistemies-Funksionele Grammatika (SFG) 'n kragtige linguistiese middel is wat analitiese en interpreterende vaardighede bevorder en 'n plooibare, dog gestruktureerde stel analitiese gereedskap verskaf waarmee tekste geïnterpreteer kan word. Met die bewering as agtergrond vra hierdie artikel of SFG inderdaad 'n toepaslike analitiese benadering vir voorgraadse studente is en of dit hulle vermoë om tekste te ontleed, bevorder. Die konteks is 'n tweedejaarskursus, Analysing Media Texts, wat aan die Universiteit van Natal aangebied word. Breedweg omraam deur kritiese diskoersanalise, speur die artikel die ontwikkeling van 'n module van dertien weke na, met gebruik van studenteontledings ter illustrasie en identifiseer

  8. Journalism, controversy and convincing practices: the words and the training

    Directory of Open Access Journals (Sweden)

    Francisco José Castilhos Karam

    2012-09-01

    Full Text Available Journalism is raised by Greco-Roman Rhetoric and Dialectics and comes through the present days without ever giving up on its central pillar: words. Words are to be found everywhere: in written texts, static or dynamic images; in infographics and great, typical journalistic tales. They are to be found in chronics, comics, informal talk and social networks. To become a journalist means to not give up on words that are central when one acknowledges the importance of the surroundings, of detection methods and narrative models. Words are at the core of controversy and convincing practices. They are at the core of becoming a journalist.

  9. Ancient medical texts, modern reading problems

    Directory of Open Access Journals (Sweden)

    Maria Carlota Rosa

    2006-12-01

    Full Text Available The word tradition has a very specific meaning in linguistics: the passing down of a text, which may have been completed or corrected by different copyists at different times, when the concept of authorship was not the same as it is today. When reading an ancient text the word tradition must be in the reader's mind. To discuss one of the problems an ancient text poses to its modern readers, this work deals with one of the first printed medical texts in Portuguese, the Regimento proueytoso contra ha pestenença, and draws a parallel between it and two related texts, A moche profitable treatise against the pestilence, and the Recopilaçam das cousas que conuem guardar se no modo de preseruar à Cidade de Lixboa E os sãos, & curar os que esteuerem enfermos de Peste. The problems which arise out of the textual structure of those books show how difficult it is to establish a tradition of another type, the medical tradition. The linguistic study of the innumerable medieval plague treatises may throw light on the continuities and on the disruptions of the so-called hippocratic-galenical medical tradition.

  10. COURSES OF MISRECALL OVER LONG-TERM RETENTION INTERVALS AS RELATED TO STRENGTH OF PRE-EXPERIMENTAL HABITS OF WORD ASSOCIATION.

    Science.gov (United States)

    BILODEAU, EDWARD A.; BLICK, KENNETH A.

    THIS STUDY WAS MADE TO COMPARE THE EFFECTS OF STIMULATION AND NONSTIMULATION ON RECALL OF WORDS FOLLOWING TIME-DELAY PERIODS. THE SUBJECTS (670 AIRMEN) WERE TRAINED WITH AN EXAMPLE WORD LIST AND TWO WORD LISTS CONTAINING FIVE OF THE SECONDARY WORDS ASSOCIATED WITH RUSSELL-JENKINS STIMULUS WORDS. AFTER TIME DELAYS OF 2 MINUTES, 20 MINUTES, 2 DAYS,…

  11. The Emar Lexical Texts

    NARCIS (Netherlands)

    Gantzert, Merijn

    2011-01-01

    This four-part work provides a philological analysis and a theoretical interpretation of the cuneiform lexical texts found in the Late Bronze Age city of Emar, in present-day Syria. These word and sign lists, commonly dated to around 1100 BC, were almost all found in the archive of a single school.

  12. Affective Norms for 362 Persian Words

    Directory of Open Access Journals (Sweden)

    Mahdi Bagheri

    2016-10-01

    Full Text Available Background: During the past two decades, a great deal of research has been conducted on developing affective norms for words in various languages, showing that there is an urgent need to create such norms in the Persian language, too. The present study intended to develop a set of 362 Persian words rated according to their emotional valence, arousal, imageability, and familiarity so as to prepare the ground for further research on emotional word processing. This was the first attempt to set affective norms for Persian words in the realm of emotion. Methods: Prior to the study, a multitude of words were selected from a Persian dictionary and academic books on Persian literature. Secondly, three independent experts in Persian literature were asked to extract the suitable words from the list and to choose the best (defined as grammatically correct and most often used). The database normalization process was based on the ratings by a total of 88 participants using a 9-point Likert scale. Each participant evaluated about 120 words on four different scales. Results: There were significant relationships between affective dimensions and some psycholinguistic variables. Also, further analyses were carried out to investigate the possible relationship between different features of valences (positive, negative, and neutral) and other variables included in the dataset. Conclusion: These affective norms for Persian words create a useful and valid dataset which will allow researchers to apply standard verbal materials, as is already possible in other languages, e.g. English, German, French, Spanish, Portuguese, Dutch, etc.

  13. Medical Named Entity Recognition for Indonesian Language Using Word Representations

    Science.gov (United States)

    Rahman, Arief

    2018-03-01

    Nowadays, Named Entity Recognition (NER) system is used in medical texts to obtain important medical information, like diseases, symptoms, and drugs. While most NER systems are applied to formal medical texts, informal ones like those from social media (also called semi-formal texts) are starting to get recognition as a gold mine for medical information. We propose a theoretical Named Entity Recognition (NER) model for semi-formal medical texts in our medical knowledge management system by comparing two kinds of word representations: cluster-based word representation and distributed representation.

  14. Mojibake – The rehearsal of word fragments in verbal recall

    Science.gov (United States)

    Lange-Küttner, Christiane; Sykorova, Eva

    2015-01-01

    Theories of verbal rehearsal usually assume that whole words are being rehearsed. However, words consist of letter sequences, or syllables, or word onset-vowel-coda, amongst many other conceptualizations of word structure. A more general term is the ‘grain size’ of word units (Ziegler and Goswami, 2005). In the current study, a new method measured the quantitative percentage of correctly remembered word structure. The amount of letters in the correct letter sequence as per cent of word length was calculated, disregarding missing or added letters. A forced rehearsal was tested by repeating each memory list four times. We tested low frequency (LF) English words versus geographical (UK) town names to control for content. We also tested unfamiliar international (INT) non-words and names of international (INT) European towns to control for familiarity. An immediate versus distributed repetition was tested with a between-subject design. Participants responded with word fragments in their written recall especially when they had to remember unfamiliar words. While memory of whole words was sensitive to content, presentation distribution and individual sex and language differences, recall of word fragments was not. There was no trade-off between memory of word fragments with whole word recall during the repetition, instead also word fragments significantly increased. Moreover, while whole word responses correlated with each other during repetition, and word fragment responses correlated with each other during repetition, these two types of word recall responses were not correlated with each other. Thus there may be a lower layer consisting of free, sparse word fragments and an upper layer that consists of language-specific, orthographically and semantically constrained words. PMID:25941500
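
    One way to operationalize the fragment score described above (letters in the correct sequence as a percentage of word length, disregarding missing or added letters) is the longest common subsequence between target and response. This is an assumption made for illustration, not the authors' documented scoring procedure, and the example words are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def percent_correct_sequence(target, response):
    """Letters recalled in the correct order, as a percentage of target length."""
    if not target:
        raise ValueError("target word must not be empty")
    return 100.0 * lcs_length(target, response) / len(target)

# Hypothetical written responses to the target town name "london".
for response in ("london", "lodon", "loondon", "xyz"):
    print(response, round(percent_correct_sequence("london", response), 1))
```

    Because a subsequence ignores gaps, a response with one omitted letter still earns credit for the rest, and an added letter does not lower the score.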

  15. Text mining, a race against time? An attempt to quantify possible variations in text corpora of medical publications throughout the years.

    Science.gov (United States)

    Wagner, Mathias; Vicinus, Benjamin; Muthra, Sherieda T; Richards, Tereza A; Linder, Roland; Frick, Vilma Oliveira; Groh, Andreas; Rubie, Claudia; Weichert, Frank

    2016-06-01

    The continuous growth of medical sciences literature indicates the need for automated text analysis. Scientific writing which is neither unitary, transcending social situation nor defined by a timeless idea is subject to constant change as it develops in response to evolving knowledge, aims at different goals, and embodies different assumptions about nature and communication. The objective of this study was to evaluate whether publication dates should be considered when performing text mining. A search of PUBMED for combined references to chemokine identifiers and particular cancer related terms was conducted to detect changes over the past 36 years. Text analyses were performed using freeware available from the World Wide Web. TOEFL Scores of territories hosting institutional affiliations as well as various readability indices were investigated. Further assessment was conducted using Principal Component Analysis. Laboratory examination was performed to evaluate the quality of attempts to extract content from the examined linguistic features. The PUBMED search yielded a total of 14,420 abstracts (3,190,219 words). The range of findings in laboratory experimentation were coherent with the variability of the results described in the analyzed body of literature. Increased concurrence of chemokine identifiers together with cancer related terms was found at the abstract and sentence level, whereas complexity of sentences remained fairly stable. The findings of the present study indicate that concurrent references to chemokines and cancer increased over time whereas text complexity remained stable. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Punctuation effects in english and esperanto texts

    Science.gov (United States)

    Ausloos, M.

    2010-07-01

    A statistical physics study of punctuation effects on sentence lengths is presented for written texts: Alice in Wonderland and Through the Looking-Glass. The translation of the first text into Esperanto is also considered as a test for the role of punctuation in defining a style, and for contrasting natural and artificial, but written, languages. Several log-log plots of the sentence-length-rank relationship are presented for the major punctuation marks. Different power laws are observed with characteristic exponents. The exponent can take a value much less than unity (ca. 0.50 or 0.30) depending on how a sentence is defined. The texts are also mapped into time series based on the word frequencies. The quantitative differences between the original and translated texts are very minute, at the exponent level. It is argued that sentences seem to be more reliable than word distributions in discussing an author's style.
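
    A power-law exponent of the kind mentioned above is commonly estimated from the slope of a straight line fitted on log-log axes. The sketch below does this with ordinary least squares on synthetic data; the fitting procedure, the function name, and the data are assumptions and are not taken from the paper.

```python
import math

def fit_power_law(values):
    """Fit value ~ prefactor * rank**(-exponent) by least squares in log-log space.

    `values` must be positive and listed by rank (rank 1 first).
    Returns (exponent, prefactor).
    """
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_x)

# Synthetic sentence lengths following an exact power law with exponent 0.5.
synthetic_lengths = [100.0 * rank ** -0.5 for rank in range(1, 51)]
exponent, prefactor = fit_power_law(synthetic_lengths)
print(exponent, prefactor)
```

    On real rank-ordered sentence lengths the fit would be approximate, and the estimated exponent depends on which punctuation marks are taken to end a sentence.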

  17. Buzz words in the upstream

    International Nuclear Information System (INIS)

    Knoll, B.

    1998-01-01

    Examples of misleading or misunderstood 'buzz' words that are prevalent in modern upstream technology are illustrated. The terms underbalanced drilling, horizontal wells, and geo-steering, which were unheard of in the early 1980s, have become key 'buzz' words in modern exploitation terminology. The terms are not only misused, but the technologies themselves are frequently mis-applied as shown by the frequency of economic failures, or less than optimal technical successes which have occurred when these technologies have been employed. Two examples, 'horizontal drilling' and 'geosteering', are used to illustrate the point. With regard to horizontal drilling, many oil field professionals consider it as merely a more advanced method of directional drilling. This represents a serious, yet common, misconception. In truth, horizontal wells are not just an altered drilling process, but a fundamental change in exploitation technology. A more appropriate definition would be that a horizontal well is an enhanced oil recovery process, clearly implying a relationship to the exploitation benefit potential of horizontal wells. The other term, 'geo-steering' refers to defining, generating and monitoring a wellpath on geology rather than geometry. It, too, is frequently misused in the technical media. The term is also misrepresented by implying that it is applicable only to the horizontal section of a well, which in fact is far from the truth. To counter these misconceptions, the paper provides appropriate definitions for each of these terms, and defines the conditions under which the techniques themselves are most appropriately used. 7 figs

  18. Word problems and make-believe: Using frame analysis and ethnomethodology to explore aspects of the culture of schooling

    Directory of Open Access Journals (Sweden)

    Benincasa Luciana

    2017-12-01

    Full Text Available The paper applies Goffman’s frame analysis and ethnomethodology to student performance on mathematical word problems. In educational research, frame analysis has usually been limited to primary frames. Instead, in this paper I focus on the kind of secondary frame that Goffman calls ‘utilitarian make-believe’. The data consist of a fragment of verbal interaction between a teacher and a 12-year-old pupil during an oral mathematics exam. By evoking the idea of ‘as-ifness’, word problems introduce pupils to a make-believe world. The text consists only of ‘filler words’ because what really matters are the figures. Word problems and possibly other aspects of schooling can be interpreted in terms of a utilitarian make-believe key. Readiness to adopt this make-believe frame when required may be the difference between school success and failure. I argue that maths achievement takes more than just ‘being good with numbers’. It is a joint enterprise of people interacting within a culturally-shaped setting, organized so as to make some phenomena stand out rather than others. Finally, I argue that word problems and possibly other ‘school genres’ could be added to the list of utilitarian make-believe frames provided by Goffman.

  19. Effects of Word Width and Word Length on Optimal Character Size for Reading of Horizontally Scrolling Japanese Words.

    Science.gov (United States)

    Teramoto, Wataru; Nakazaki, Takuyuki; Sekiyama, Kaoru; Mori, Shuji

    2016-01-01

    The present study investigated whether word width and length affect the optimal character size for reading of horizontally scrolling Japanese words, using reading speed as a measure. In Experiment 1, three Japanese words, each consisting of four Hiragana characters, sequentially scrolled on a display screen from right to left. Participants, all Japanese native speakers, were instructed to read the words aloud as accurately as possible, irrespective of their order within the sequence. To quantitatively measure their reading performance, we used a rapid serial visual presentation paradigm, where the scrolling rate was increased until the participants began to make mistakes. Thus, the highest scrolling rate at which the participants' performance exceeded 88.9% correct rate was calculated for each character size (0.3°, 0.6°, 1.0°, and 3.0°) and scroll window size (5 or 10 character spaces). Results showed that the reading performance was highest in the range of 0.6° to 1.0°, irrespective of the scroll window size. Experiment 2 investigated whether the optimal character size observed in Experiment 1 was applicable for any word width and word length (i.e., the number of characters in a word). Results showed that reading speeds were slower for longer than shorter words and the word width of 3.6° was optimal among the word lengths tested (three, four, and six character words). Considering that character size varied depending on word width and word length in the present study, this means that the optimal character size can be changed by word width and word length in scrolling Japanese words.

  20. Processing and Representation of Ambiguous Words in Chinese Reading: Evidence from Eye Movements.

    Science.gov (United States)

    Shen, Wei; Li, Xingshan

    2016-01-01

    In the current study, we used eye tracking to investigate whether senses of polysemous words and meanings of homonymous words are represented and processed similarly or differently in Chinese reading. Readers read sentences containing target words which were either homonymous words or polysemous words. The contexts of text preceding the target words were manipulated to bias the participants toward reading the ambiguous words according to their dominant, subordinate, or neutral meanings. Similarly, disambiguating regions following the target words were also manipulated to favor either the dominant or subordinate meanings of ambiguous words. The results showed that there were similar eye movement patterns when Chinese participants read sentences containing homonymous and polysemous words. The study also found that participants took longer to read the target word and the disambiguating text following it when the prior context and disambiguating regions favored divergent meanings rather than the same meaning. These results suggested that homonymy and polysemy are represented similarly in the mental lexicon when a particular meaning (sense) is fully specified by disambiguating information. Furthermore, multiple meanings (senses) are represented as separate entries in the mental lexicon.

  1. Abelian primitive words

    OpenAIRE

    Domaratzki, Michael; Rampersad, Narad

    2011-01-01

    We investigate Abelian primitive words, which are words that are not Abelian powers. We show that unlike classical primitive words, the set of Abelian primitive words is not context-free. We can determine whether a word is Abelian primitive in linear time. Also different from classical primitive words, we find that a word may have more than one Abelian root. We also consider enumeration problems and the relation to the theory of codes. Peer reviewed
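    The definition in this abstract can be illustrated with a short sketch. It assumes the usual reading of "Abelian power": a word that splits into two or more consecutive blocks of equal length that are all anagrams of the first block. The function names are illustrative, and this brute-force check over block lengths is not the linear-time procedure the authors mention.

```python
from collections import Counter


def is_abelian_power(word: str) -> bool:
    """True if word = u1 u2 ... uk with k >= 2 and every ui an anagram of u1."""
    n = len(word)
    for d in range(1, n // 2 + 1):  # candidate block length
        if n % d:
            continue  # blocks must tile the word exactly
        first = Counter(word[:d])
        if all(Counter(word[i:i + d]) == first for i in range(d, n, d)):
            return True
    return False


def is_abelian_primitive(word: str) -> bool:
    """A non-empty word is Abelian primitive if it is not an Abelian power."""
    return len(word) > 0 and not is_abelian_power(word)
```

    Under this reading, "abba" is an Abelian power (the blocks "ab" and "ba" are anagrams) even though it is primitive in the classical sense, while "aab" is Abelian primitive.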

  2. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Directory of Open Access Journals (Sweden)

    Thompson Paul

    2011-10-01

    Full Text Available Abstract Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts.
In order to foster interoperability, the BioLexicon is

  3. Building a protein name dictionary from full text: a machine learning term extraction approach

    Directory of Open Access Journals (Sweden)

    Campagne Fabien

    2005-04-01

    Full Text Available Abstract Background The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. Results We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. Conclusion This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.

  4. Adapting the Freiburg monosyllabic word test for Slovenian

    Directory of Open Access Journals (Sweden)

    Tatjana Marvin

    2017-12-01

    Full Text Available Speech audiometry is one of the standard methods used to diagnose the type of hearing loss and to assess the communication function of the patient by determining the level of the patient’s ability to understand and repeat words presented to him or her in a hearing test. For this purpose, the Slovenian adaptation of the German tests developed by Hahlbrock (1953, 1960) – the Freiburg Monosyllabic Word Test and the Freiburg Number Test – are used in Slovenia (adapted in 1968 by Pompe). In this paper we focus on the Freiburg Monosyllabic Word Test for Slovenian, which has been criticized by patients as well as in the literature for the unequal difficulty and frequency of the words, with many of these being extremely rare or even obsolete. As part of the patient’s communication function is retrieving the meaning of individual words by guessing, the less frequent and consequently less familiar words do not contribute to reliable testing results. We therefore adapt the test by identifying and removing such words and supplement them with phonetically similar words to preserve the phonetic balance of the list. The words used for replacement are extracted from the written corpus of Slovenian Gigafida and the spoken corpus of Slovenian GOS, while the optimal combinations of words are established by using computational algorithms.

  5. iWordNet: A New Approach to Cognitive Science and Artificial Intelligence

    OpenAIRE

    Chang, Mark; Chang, Monica

    2017-01-01

    One of the main challenges in artificial intelligence or computational linguistics is understanding the meaning of a word or concept. We argue that the connotation of the term “understanding,” or the meaning of the word “meaning,” is merely a word mapping game due to unavoidable circular definitions. These circular definitions arise when an individual defines a concept, the concepts in its definition, and so on, eventually forming a personalized network of concepts, which we call an iWordNet....

  6. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Information extraction (IE); text mining; text repositories; knowledge discovery from .... general purpose English words. However ... of precision and recall, as extensive experimentation is required due to lack of public tagged corpora. 4. Mining ...

  7. Stopping Antidepressants and Anxiolytics as Major Concerns Reported in Online Health Communities: A Text Mining Approach.

    Science.gov (United States)

    Abbe, Adeline; Falissard, Bruno

    2017-10-23

    Internet is a particularly dynamic way to quickly capture the perceptions of a population in real time. Complementary to traditional face-to-face communication, online social networks help patients to improve self-esteem and self-help. The aim of this study was to use text mining on material from an online forum exploring patients' concerns about treatment (antidepressants and anxiolytics). Concerns about treatment were collected from discussion titles in patients' online community related to antidepressants and anxiolytics. To examine the content of these titles automatically, we used text mining methods, such as word frequency in a document-term matrix and co-occurrence of words using a network analysis. It was thus possible to identify topics discussed on the forum. The forum included 2415 discussions on antidepressants and anxiolytics over a period of 3 years. After a preprocessing step, the text mining algorithm identified the 99 most frequently occurring words in titles, among which were escitalopram, withdrawal, antidepressant, venlafaxine, paroxetine, and effect. Patients' concerns were related to antidepressant withdrawal, the need to share experience about symptoms, effects, and questions on weight gain with some drugs. Patients' expression on the Internet is a potential additional resource in addressing patients' concerns about treatment. Patient profiles are close to that of patients treated in psychiatry. ©Adeline Abbe, Bruno Falissard. Originally published in JMIR Mental Health (http://mental.jmir.org), 23.10.2017.
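    The two techniques named in this abstract, word frequency in a document-term matrix and word co-occurrence, can be sketched as follows. This is only an illustration under stated assumptions: the tokenizer is deliberately naive, the function names are hypothetical, and the example titles in the usage note are invented, not taken from the forum studied.

```python
from collections import Counter
from itertools import combinations


def tokenize(title):
    """Lower-case a title and strip surrounding punctuation (naive tokenizer)."""
    words = (w.strip(".,;:!?()").lower() for w in title.split())
    return [w for w in words if w]


def document_term_matrix(titles):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in title i."""
    docs = [Counter(tokenize(t)) for t in titles]
    vocab = sorted(set().union(*docs))
    return vocab, [[doc[term] for term in vocab] for doc in docs]


def cooccurrence(titles):
    """Count how many titles contain each unordered pair of distinct words."""
    pairs = Counter()
    for t in titles:
        for a, b in combinations(sorted(set(tokenize(t))), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented toy titles, for illustration only.
titles = [
    "Escitalopram withdrawal",
    "Venlafaxine withdrawal effect",
    "Escitalopram weight gain",
]
vocab, matrix = document_term_matrix(titles)
totals = {term: sum(row[j] for row in matrix) for j, term in enumerate(vocab)}
edges = cooccurrence(titles)
```

    Column sums of the matrix (`totals`) give the word frequencies used for ranking, and the pair counts (`edges`) can serve as edge weights in a word network.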

  8. Cherokee self-reliance and word-use in stories of stress.

    Science.gov (United States)

    Lowe, John; Riggs, Cheryl; Henson, Jim; Elder, Tribal; Liehr, Patricia

    2009-01-01

    This study examined the relationship between Cherokee self-reliance and related values expressed through word-use in stories of stress written by Cherokee adolescents. The overall aim of this pilot study was to test the feasibility of using culturally appropriate measurements for a larger intervention study of substance abuse prevention in Cherokee adolescents. A sample of 50 Cherokee adolescent senior high school students completed the Cherokee Self-Reliance Questionnaire and wrote their story of stress. The Linguistic Inquiry and Word Count (LIWC) program, a word-based computerized text analysis software, was used to report the percentage of words used in the selected word categories in relation to all the words used by a participant. Word-use in the stories of stress was found to correlate with Cherokee self-reliance.

  9. Language abstraction in word of mouth

    NARCIS (Netherlands)

    Schellekens, G.A.C.; Verlegh, P.W.J.; Smidts, A.

    2010-01-01

    This research examines the language that consumers use in word of mouth. For both positive and negative product experiences, we demonstrate that consumers use more abstract terms when they describe experiences that are in line with the valence of their product attitude. This effect cannot be

  10. Scale-invariant transition probabilities in free word association trajectories

    Directory of Open Access Journals (Sweden)

    Martin Elias Costa

    2009-09-01

    Full Text Available Free-word association has been used as a vehicle to understand the organization of human thoughts. The original studies relied mainly on qualitative assertions, yielding the widely intuitive notion that trajectories of word associations are structured, yet considerably more random than organized linguistic text. Here we set out to determine a precise characterization of this space, generating a large number of word association trajectories in a web implemented game. We embedded the trajectories in the graph of word co-occurrences from a linguistic corpus. To constrain possible transport models we measured the memory loss and the cycling probability. These two measures could not be reconciled by a bounded diffusive model since the cycling probability was very high (16% of order-2 cycles), implying a majority of short-range associations, whereas the memory loss was very rapid (converging to the asymptotic value in ∼7 steps), which, in turn, forced a high fraction of long-range associations. We show that memory loss and cycling probabilities of free word association trajectories can be simultaneously accounted for by a model in which transitions are determined by a scale invariant probability distribution.

  11. Comparison between BIDE, PrefixSpan, and TRuleGrowth for Mining of Indonesian Text

    Science.gov (United States)

    Sa'adillah Maylawati, Dian; Irfan, Mohamad; Budiawan Zulfikar, Wildan

    2017-01-01

    Text mining for the Indonesian language remains an interesting research topic. A multiple-word representation is claimed to preserve the meaning of text better than a bag of words. In this paper, we compare several sequential pattern algorithms, namely BIDE (BIDirectional Extension), PrefixSpan, and TRuleGrowth. All of these algorithms produce frequent word sequences that preserve the meaning of the text. However, the experimental results, obtained with 14,006 Indonesian tweets from Twitter, show that BIDE produces more efficient frequent word sequences than PrefixSpan and TRuleGrowth without losing the meaning of the text. The average processing time of PrefixSpan is shorter than that of BIDE and TRuleGrowth. On the other hand, PrefixSpan and TRuleGrowth use memory more efficiently than BIDE.

  12. Learning Words from Context and Dictionaries: An Experimental Comparison.

    Science.gov (United States)

    Fischer, Ute

    1994-01-01

    Investigated the independent and interactive effects of contextual and definitional information on vocabulary learning. German students of English received either a text with unfamiliar English words or their monolingual English dictionary entries. A third group received both. Information about word context is crucial to understanding meaning. (44…

  13. The Association between Mathematical Word Problems and Reading Comprehension

    Science.gov (United States)

    Vilenius-Tuohimaa, Piia Maria; Aunola, Kaisa; Nurmi, Jari-Erik

    2008-01-01

    This study aimed to investigate the interplay between mathematical word problem skills and reading comprehension. The participants were 225 children aged 9-10 (Grade 4). The children's text comprehension and mathematical word problem-solving performance was tested. Technical reading skills were investigated in order to categorise participants as…

  14. Activation of words with phonological overlap

    Directory of Open Access Journals (Sweden)

    Claudia K. Friedrich

    2013-08-01

    Full Text Available Multiple lexical representations overlapping with the input (cohort neighbors) are temporarily activated in the listener’s mental lexicon when speech unfolds in time. Activation for cohort neighbors appears to rapidly decline as soon as there is mismatch with the input. However, it is a matter of debate whether or not they are completely excluded from further processing. We recorded behavioral data and event-related brain potentials (ERPs) in auditory-visual word onset priming during a lexical decision task. As primes we used the first two syllables of spoken German words. In a carrier word condition, the primes were extracted from spoken versions of the target words (ano-ANORAK 'anorak'). In a cohort neighbor condition, the primes were taken from words that overlap with the target word up to the second nucleus (ana- taken from ANANAS 'pineapple'). Relative to a control condition, where primes and targets were unrelated, lexical decision responses for cohort neighbors were delayed. This reveals that cohort neighbors are disfavored by the decision processes at the behavioral front end. In contrast, left-anterior ERPs reflected long-lasting facilitated processing of cohort neighbors. We interpret these results as evidence for extended parallel processing of cohort neighbors. That is, in parallel to the preparation and elicitation of delayed lexical decision responses to cohort neighbors, aspects of the processing system appear to keep track of those less efficient candidates.

  15. Universal Lyndon Words

    OpenAIRE

    Carpi, Arturo; Fici, Gabriele; Holub, Stepan; Oprsal, Jakub; Sciortino, Marinella

    2014-01-01

    A word $w$ over an alphabet $\Sigma$ is a Lyndon word if there exists an order defined on $\Sigma$ for which $w$ is lexicographically smaller than all of its conjugates (other than itself). We introduce and study \emph{universal Lyndon words}, which are words over an $n$-letter alphabet that have length $n!$ and such that all the conjugates are Lyndon words. We show that universal Lyndon words exist for every $n$ and exhibit combinatorial and structural properties of these words. We then defi...
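    The two definitions in this abstract can be checked by brute force, as in the sketch below. It reads "conjugates" as cyclic rotations and tries every order on the alphabet, so it is practical only for very small alphabets. The function names are illustrative and the code is not the authors' construction.

```python
from itertools import permutations
from math import factorial


def is_lyndon(word, order):
    """True if word is strictly smaller than every proper rotation of itself.

    order lists the alphabet from smallest to largest letter.
    """
    rank = {ch: i for i, ch in enumerate(order)}

    def key(w):
        return [rank[ch] for ch in w]

    base = key(word)
    return all(base < key(word[i:] + word[:i]) for i in range(1, len(word)))


def is_universal_lyndon(word, alphabet):
    """True if len(word) == n! and each rotation is Lyndon for some order."""
    if len(word) != factorial(len(alphabet)):
        return False
    orders = list(permutations(alphabet))
    return all(
        any(is_lyndon(word[i:] + word[:i], order) for order in orders)
        for i in range(len(word))
    )
```

    For the two-letter alphabet {a, b}, the word "ab" passes the check: "ab" is Lyndon for the order a < b and its conjugate "ba" is Lyndon for the order b < a.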

  16. Word skipping: effects of word length, predictability, spelling and reading skill.

    Science.gov (United States)

    Slattery, Timothy J; Yates, Mark

    2017-08-31

    Readers' eyes often skip over words as they read. Skipping rates are largely determined by word length; short words are skipped more than long words. However, the predictability of a word in context also impacts skipping rates. Rayner, Slattery, Drieghe and Liversedge (2011) reported an effect of predictability on word skipping for even long words (10-13 characters) that extend beyond the word identification span. Recent research suggests that better readers and spellers have an enhanced perceptual span (Veldre & Andrews, 2014). We explored whether reading and spelling skill interact with word length and predictability to impact word skipping rates in a large sample (N=92) of average and poor adult readers. Participants read the items from Rayner et al. (2011) while their eye movements were recorded. Spelling skill (zSpell) was assessed using the dictation and recognition tasks developed by Sally Andrews and colleagues. Reading skill (zRead) was assessed from reading speed (words per minute) and accuracy of three 120 word passages each with 10 comprehension questions. We fit linear mixed models to the target gaze duration data and generalized linear mixed models to the target word skipping data. Target word gaze durations were significantly predicted by zRead, while the skipping likelihoods were significantly predicted by zSpell. Additionally, for gaze durations, zRead significantly interacted with word predictability as better readers relied less on context to support word processing. These effects are discussed in relation to the lexical quality hypothesis and eye movement models of reading.

  17. Position list word aligned hybrid

    DEFF Research Database (Denmark)

    Deliege, Francois; Pedersen, Torben Bach

    2010-01-01

    Compressed bitmap indexes are increasingly used for efficiently querying very large and complex databases. The Word Aligned Hybrid (WAH) bitmap compression scheme is commonly recognized as the most efficient compression scheme in terms of CPU efficiency. However, WAH compressed bitmaps use a lot of storage space. This paper presents the Position List Word Aligned Hybrid (PLWAH) compression scheme that improves significantly over WAH compression by better utilizing the available bits and new CPU instructions. For typical bit distributions, PLWAH compressed bitmaps are often half the size of WAH bitmaps and, at the same time, offer an even better CPU efficiency. The results are verified by theoretical estimates and extensive experiments on large amounts of both synthetic and real-world data.
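    To make the baseline concrete, here is a simplified sketch of plain WAH encoding with 32-bit words. It does not implement PLWAH. It assumes the commonly described layout: the bitmap is cut into 31-bit groups, a literal word has its top bit 0 and carries one group, and a fill word has its top bit 1, then the fill bit, then a 30-bit count of identical all-0 or all-1 groups. Padding the final group with zeros is a simplification made for this example.

```python
GROUP = 31                 # payload bits carried by one 32-bit word
MAX_RUN = (1 << 30) - 1    # largest run length a fill word can hold


def wah_encode(bits):
    """Encode a string of '0'/'1' characters into a list of 32-bit integers."""
    bits += "0" * (-len(bits) % GROUP)  # pad the last group (simplification)
    words = []
    for i in range(0, len(bits), GROUP):
        group = bits[i:i + GROUP]
        if group in ("0" * GROUP, "1" * GROUP):
            fill = int(group[0])
            prev = words[-1] if words else 0
            same_fill = prev >> 31 == 1 and (prev >> 30) & 1 == fill
            if same_fill and (prev & MAX_RUN) < MAX_RUN:
                words[-1] = prev + 1  # extend the current run by one group
            else:
                words.append((1 << 31) | (fill << 30) | 1)
        else:
            words.append(int(group, 2))  # literal word, top bit stays 0
    return words


def wah_decode(words):
    """Expand WAH words back into a '0'/'1' string (including any padding)."""
    out = []
    for w in words:
        if w >> 31:
            out.append(str((w >> 30) & 1) * (GROUP * (w & MAX_RUN)))
        else:
            out.append(format(w, "031b"))
    return "".join(out)
```

    Long runs of identical bits collapse into a single fill word, while any group containing mixed bits still costs a full literal word.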

  18. Comparing different kinds of words and word-word relations to test an habituation model of priming.

    Science.gov (United States)

    Rieth, Cory A; Huber, David E

    2017-06-01

    Huber and O'Reilly (2003) proposed that neural habituation exists to solve a temporal parsing problem, minimizing blending between one word and the next when words are visually presented in rapid succession. They developed a neural dynamics habituation model, explaining the finding that short duration primes produce positive priming whereas long duration primes produce negative repetition priming. The model contains three layers of processing, including a visual input layer, an orthographic layer, and a lexical-semantic layer. The predicted effect of prime duration depends both on this assumed representational hierarchy and the assumption that synaptic depression underlies habituation. The current study tested these assumptions by comparing different kinds of words (e.g., words versus non-words) and different kinds of word-word relations (e.g., associative versus repetition). For each experiment, the predictions of the original model were compared to an alternative model with different representational assumptions. Experiment 1 confirmed the prediction that non-words and inverted words require longer prime durations to eliminate positive repetition priming (i.e., a slower transition from positive to negative priming). Experiment 2 confirmed the prediction that associative priming increases and then decreases with increasing prime duration, but remains positive even with long duration primes. Experiment 3 replicated the effects of repetition and associative priming using a within-subjects design and combined these effects by examining target words that were expected to repeat (e.g., viewing the target word 'BACK' after the prime phrase 'back to'). These results support the originally assumed representational hierarchy and more generally the role of habituation in temporal parsing and priming. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Preschoolers Have Better Long-Term Memory for Rhyming Text than Adults

    Science.gov (United States)

    Király, Ildikó; Takács, Szilvia; Kaldy, Zsuzsa; Blaser, Erik

    2017-01-01

    The dominant view of children's memory is that it is slow to develop and is inferior to adults'. Here we pitted 4-year-old children against adults in a test of verbatim recall of verbal material. Parents read a novel rhyming verse (and an integrated word list) as their child's bedtime story on ten consecutive days. A group of young adults listened…

  20. Assessing semantic similarity of texts - Methods and algorithms

    Science.gov (United States)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which provide for obtaining structured representative model of the documents in a corpus by means of extracting and selecting the features, characterizing their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at syntactical and semantic level. An important text-mining method and similarity measure is latent semantic analysis (LSA). It provides for reducing the dimensionality of the document vector space and better capturing the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
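    For orientation, the standard formulation of LSA that this abstract alludes to can be written out as below. The symbols are assumptions chosen for illustration and are not taken from the paper: $A$ is the term-document matrix, $k$ is the number of retained latent concepts, and $e_i$ is the $i$-th standard basis vector.

```latex
% Singular value decomposition of the term-document matrix
A = U \Sigma V^{\top}

% Rank-k truncation: keep the k largest singular values
A_k = U_k \Sigma_k V_k^{\top}

% Representation of document j in the reduced k-dimensional space
\hat{d}_j = \Sigma_k V_k^{\top} e_j

% Cosine similarity between documents i and j in that space
\operatorname{sim}(i, j) = \frac{\hat{d}_i^{\top} \hat{d}_j}{\lVert \hat{d}_i \rVert \, \lVert \hat{d}_j \rVert}
```

    The rows of $U_k \Sigma_k$ give the corresponding word vectors, so words and documents can be compared in the same reduced space.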

  1. Signal Words

    Science.gov (United States)

    SIGNAL WORDS TOPIC FACT SHEET NPIC fact sheets are designed to answer questions that are commonly asked by the ... making decisions about pesticide use. What are Signal Words? Signal words are found on pesticide product labels, ...

  2. Office automation: a look beyond word processing

    OpenAIRE

    DuBois, Milan Ephriam, Jr.

    1983-01-01

    Approved for public release; distribution is unlimited. Word processing was the first of various forms of office automation technologies to gain widespread acceptance and usability in the business world. For many, it remains the only form of office automation technology. Office automation, however, is not just word processing, although it does include the function of facilitating and manipulating text. In reality, office automation is not one innovation, or one office system, or one tech...

  3. When does word frequency influence written production?

    Directory of Open Access Journals (Sweden)

    Cristina Baus

    2013-12-01

    Full Text Available The aim of the present study was to explore the central (e.g., lexical processing) and peripheral processes (motor preparation and execution) underlying word production during typewriting. To do so, we tested non-professional typists in a picture typing task while continuously recording EEG. Participants were instructed to write (by means of a standard keyboard) the corresponding name for a given picture. The lexical frequency of the words was manipulated: half of the picture names were of high-frequency while the remaining were of low-frequency. Different measures were obtained: (1) first keystroke latency and (2) keystroke latency of the subsequent letters and duration of the word. Moreover, ERPs locked to the onset of the picture presentation were analysed to explore the temporal course of word frequency in typewriting. The results showed an effect of word frequency for the first keystroke latency but not for the duration of the word or the speed at which letters were typed (interstroke intervals). The electrophysiological results showed the expected ERP frequency effect at posterior sites: amplitudes for low-frequency words were more positive than those for high-frequency words. However, relative to previous evidence in the spoken modality, the frequency effect appeared in a later time-window. These results demonstrate two marked differences in the processing dynamics underpinning typing compared to speaking: First, central processing dynamics between speaking and typing differ already in the manner that words are accessed; second, central processing differences in typing, unlike speaking, do not cascade to peripheral processes involved in response execution.

  4. SSC 254 Screen-Based Word Processors: Production Tests. The Lanier Word Processor.

    Science.gov (United States)

    Moyer, Ruth A.

    Designed for use in Trident Technical College's Secretarial Lab, this series of 12 production tests focuses on the use of the Lanier Word Processor for a variety of tasks. In tests 1 and 2, students are required to type and print out letters. Tests 3 through 8 require students to reformat a text; make corrections on a letter; divide and combine…

  5. Word-by-word entrainment of speech rhythm during joint story building

    Directory of Open Access Journals (Sweden)

    Tommi Himberg

    2015-06-01

    Full Text Available Movements and behaviour synchronise during social interaction at many levels, often unintentionally. During smooth conversation, for example, participants adapt to each other's speech rates. Here we aimed to find out to what extent speakers adapt their turn-taking rhythms during a story-building game. Nine sex-matched dyads of adults (12 males, 6 females) created two 5-min stories by contributing to them alternatingly one word at a time. The participants were located in different rooms, with audio connection during one story and audiovisual during the other. They were free to select the topic of the story. Although the participants received no instructions regarding the timing of the story building, their word rhythms were highly entrained (R̄ = 0.70, p < 0.001) even though the rhythms as such were unstable (R̄ = 0.14 for pooled data). Such high entrainment in the absence of steady word rhythm occurred in every individual story, independently of whether the subjects were connected via audio-only or audiovisual link. The observed entrainment was of similar strength as typical entrainment in finger-tapping tasks where participants are specifically instructed to synchronize their behaviour. Thus speech seems to spontaneously induce strong entrainment between the conversation partners, likely reflecting automatic alignment of their semantic and syntactic processes.
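
    The abstract does not define R̄. A minimal sketch, assuming it denotes the mean resultant length used in circular statistics (the length of the average unit vector of a set of phase angles, ranging from 0 for evenly spread phases to 1 for identical phases), could look as follows. The function name and the sample phases are hypothetical and are not taken from the study.

```python
import cmath
import math


def mean_resultant_length(phases):
    """Return the length of the mean unit vector for phases given in radians.

    Assumption: this is the quantity behind the R-bar values in the abstract.
    0 means the phases are evenly spread; 1 means they are all identical.
    """
    if not phases:
        raise ValueError("need at least one phase value")
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) / len(phases)


# Hypothetical relative phases between two speakers' word onsets.
tightly_locked = [0.10, 0.15, 0.05, 0.20]
evenly_spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

print(mean_resultant_length(tightly_locked))
print(mean_resultant_length(evenly_spread))
```

    Under this reading, a high value for the relative phase between speakers together with a low value for each speaker's own rhythm would correspond to the reported pattern of strong entrainment without a steady beat.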

  6. Processing negative valence of word pairs that include a positive word.

    Science.gov (United States)

    Itkes, Oksana; Mashal, Nira

    2016-09-01

    Previous research has suggested that cognitive performance is interrupted by negative relative to neutral or positive stimuli. We examined whether negative valence affects performance at the word or phrase level. Participants performed a semantic decision task on word pairs that included either a negative or a positive target word. In Experiment 1, the valence of the target word was congruent with the overall valence conveyed by the word pair (e.g., fat kid). As expected, response times were slower in the negative condition relative to the positive condition. Experiment 2 included target words that were incongruent with the overall valence of the word pair (e.g., fat salary). Response times were longer for word pairs whose overall valence was negative relative to positive, even though these word pairs included a positive word. Our findings support the Cognitive Primacy Hypothesis, according to which emotional valence is extracted after conceptual processing is complete.

  7. Words of foreign origin in political discourse

    Directory of Open Access Journals (Sweden)

    Sabina Zorčič

    2012-12-01

    Full Text Available The paper discusses the use of words of foreign origin in Slovenian political discourse. At the outset, this usage is broken down into four groups: the first contains specific phrases and terminology inherent to the political domain; the second contains words of foreign origin generally present in the Slovene language (because of their high frequency of nonexclusivistic use, these words are not of interest to the scope of this investigation); the third contains various words of foreign origin used as affectional packaging for messages with the aim of stimulating the desired interpretation (framing reality); the fourth group, which is the most interesting for our research, is made up of words of foreign origin which could have a marker: + marked, + not necessary, + unwanted, but only if we accept the logic of purism. All the words in this group could be replaced - without any loss of meaning - with their Slovene equivalents. The speaker's motivation for using the foreign word is crucial to our discussion. In the framework of Pierre Bourdieu's poststructural theory as well as Austin's and Searle's speech act theory, statistical data is analysed to observe how usage frequency varies in correlation with selected factors which manifest the speaker's habitus. We argue that words of foreign origin represent symbolic cultural capital, a kind of added value which functions as credit and as such is an important form of the accumulation of capital.

  8. Persian Words Used in Kazi Nazrul Islam's Poetry

    Directory of Open Access Journals (Sweden)

    Md. Mumit Al Rashid

    2017-11-01

    Full Text Available Kazi Nazrul Islam, the national poet of Bangladesh, popularly known as Nazrul, the rebel poet, may rightly be called one of the greatest “poets of the people” in the world. He was the first poet in Bengali literature who used Arabic and Persian words extensively to express his views and to create a Muslim renaissance within the whole Bengali nation. He was a multilingual poet. That is why we see a huge number of Arabic, Persian, Hindi, Sanskrit and Urdu words, and even sentences, almost everywhere in his writing. This article is about the Persian words that Nazrul used in his poetry. Although the majority of his poems contain at least some Persian words, in this article we discuss five of his poems, Shat-il-Arab, Moharram, Kamal Pasha, Qorbani and the 12th Fateha, which contain comparatively more Persian words.

  9. Caffeine improves left hemisphere processing of positive words.

    Directory of Open Access Journals (Sweden)

    Lars Kuchinke

    Full Text Available A positivity advantage is known in emotional word recognition in that positive words are consistently processed faster and with fewer errors compared to emotionally neutral words. A similar advantage is not evident for negative words. Results of divided visual field studies, where stimuli are presented in either the left or right visual field and are initially processed by the contra-lateral brain hemisphere, point to a specificity of the language-dominant left hemisphere. The present study examined this effect by showing that the intake of caffeine further enhanced the recognition performance of positive, but not negative or neutral stimuli compared to a placebo control group. Because this effect was only present in the right visual field/left hemisphere condition, and based on the close link between caffeine intake and dopaminergic transmission, this result points to a dopaminergic explanation of the positivity advantage in emotional word recognition.

  10. Examining the central and peripheral processes of written word production through meta-analysis

    Directory of Open Access Journals (Sweden)

    Jeremy Purcell

    2011-10-01

    Full Text Available Producing written words requires central cognitive processes (such as orthographic long-term and working memory) as well as more peripheral processes responsible for generating the motor actions needed for producing written words in a variety of formats (handwriting, typing, etc.). In recent years, various functional neuroimaging studies have examined the neural substrates underlying the central and peripheral processes of written word production. This study provides the first quantitative meta-analysis of these studies by applying Activation Likelihood Estimation methods (Turkeltaub et al., 2002). For alphabet languages, we identified 11 studies (with a total of 17 experimental contrasts) that had been designed to isolate central and/or peripheral processes of word spelling (total number of participants = 146). Three ALE meta-analyses were carried out. One involved the complete set of 17 contrasts; two others were applied to subsets of contrasts to distinguish the neural substrates of central from peripheral processes. These analyses identified a network of brain regions reliably associated with the central and peripheral processes of word spelling. Among the many significant results is the finding that the regions with the greatest correspondence across studies were in the left inferior temporal/fusiform gyri and left inferior frontal gyrus. Furthermore, although the angular gyrus has traditionally been identified as a key site within the written word production network, none of the meta-analyses found it to be a consistent site of activation, identifying instead a region just superior/medial to the left angular gyrus in the left posterior intraparietal sulcus. In general these meta-analyses and the discussion of results provide a valuable foundation upon which future studies that examine the neural basis of written word production can build.

  11. ASM Based Synthesis of Handwritten Arabic Text Pages

    Directory of Open Access Journals (Sweden)

    Laslo Dinges

    2015-01-01

    Full Text Available Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document-analysis-related methods on synthetic samples whenever sufficient natural ground-truthed data is not available.

  12. Towards A New Approach For Arabic Root Extraction: Exploit Relations Between The Word Letters And Their Placement In The Word For Arabic Root Extraction

    Directory of Open Access Journals (Sweden)

    Fatma Abu Hawas

    2013-01-01

    Full Text Available In this paper we present a new root-extraction approach for Arabic words. The approach tries to assign a unique root to each Arabic word without having a database of word roots, a list of word patterns or even a list of all the prefixes and suffixes of Arabic words. Unlike most Arabic rule-based stemmers, it tries to predict, one by one, the letter positions that may form the word root, using rules based on the relations among the letters of the Arabic word and their placement in the word. This paper focuses on two parts of the approach. The first deals with the rules that distinguish between the Arabic definite article “ال -AL” and the permanent component “ال -AL” that may be found in any Arabic word. The second part of the approach segments the word into three parts and classifies Arabic letters into groups according to their positions in each segment. The proposed approach is a system composed of several modules that cooperate to extract the word root. The approach has been tested and evaluated using the words of the Holy Quran. The results of the evaluation show a promising root extraction algorithm.

  13. Examining word association networks: A cross-country comparison of women's perceptions of HPV testing and vaccination.

    Directory of Open Access Journals (Sweden)

    Bernd C Schmid

    Full Text Available In this study, we examined the perceptual associations women hold with regard to cervical cancer testing and vaccination across two countries, the U.S. and Australia. In a large-scale online survey, we presented participants with 'trigger' words, and asked them to state sequentially other words that came to mind. We used this data to construct detailed term co-occurrence network graphs, which we analyzed using basic topological ranking techniques. The results showed that women hold divergent perceptual associations regarding trigger words relating to cervical cancer screening tools, i.e. human papillomavirus (HPV) testing and vaccination, which indicate health knowledge deficiencies with non-HPV related associations emerging from the data. This result was found to be consistent across the country groups studied. Our findings are critical in optimizing consumer education and public service announcements to minimize misperceptions relating to HPV testing and vaccination in order to maximize adoption of cervical cancer prevention tools.

  14. Spatial attention in written word perception

    Directory of Open Access Journals (Sweden)

    Veronica Montani

    2014-02-01

    Full Text Available The role of attention in visual word recognition and reading aloud is a long debated issue. Studies of both developmental and acquired reading disorders provide growing evidence that spatial attention is critically involved in word reading, in particular for the phonological decoding of unfamiliar letter strings. However, studies on healthy participants have produced contrasting results. The aim of this study was to investigate how the allocation of spatial attention may influence the perception of letter strings in skilled readers. High frequency words, low frequency words and pseudowords were briefly and parafoveally presented either in the left or the right visual field. Attentional allocation was modulated by the presentation of a spatial cue before the target string. Accuracy in reporting the target string was modulated by the spatial cue but this effect varied with the type of string. For unfamiliar strings, processing was facilitated when attention was focused on the string location and hindered when it was diverted from the target. This finding is consistent with the assumptions of the CDP+ model of reading aloud, as well as with familiarity sensitivity models that argue for a flexible use of attention according to the specific requirements of the string. Moreover, we found that processing of high-frequency words was facilitated by an extra-large focus of attention. The latter result is consistent with the hypothesis that a broad distribution of attention is the default mode during reading of familiar words because it might optimally engage the broad receptive fields of the highest detectors in the hierarchical system for visual word recognition.

  15. Finding words in a language that allows words without vowels

    NARCIS (Netherlands)

    El Aissati, A.; McQueen, J.M.; Cutler, A.

    2012-01-01

    Across many languages from unrelated families, spoken-word recognition is subject to a constraint whereby potential word candidates must contain a vowel. This constraint minimizes competition from embedded words (e.g., in English, disfavoring win in twin because t cannot be a word). However, the

  16. Language, cultural brokerage and informed consent: will technological terms impede telemedicine use?

    Directory of Open Access Journals (Sweden)

    Caron Jack

    2014-04-01

    Full Text Available Introduction. Telemedicine provides a solution to treatment of economically and geographically compromised patients and enhances the level of care. However, a problem has arisen in safeguarding patients’ rights to informed consent. Objective. To determine the impact of language, translation and interpretation barriers on gaining legally valid informed consent in telemedicine. Design. Forty-one key words relevant to computer terminology and concepts required to gain informed consent for a telemedicine encounter were selected and sent for translation into isiZulu, the local indigenous language of KwaZulu-Natal, South Africa. A questionnaire with the list of words was developed with three domains covering information communication technology (ICT) use, ICT terms and ethics terms. This was administered to patients at four outpatient departments in rural KwaZulu-Natal hospitals. Results. Of the 54 participants, 50 (92.6%) did not know or understand the term ‘telemedicine’, 49 (90.7%) the term ‘video conference’ and 49 (90.7%) the term ‘electronic records’. Words such as ‘consent’ and ‘autonomy’ were understood by less than a third of the participants. Only 19 individuals (35.2%) understood the word ‘consent’, and only 4 (7.4%) understood both the words ‘consent’ and ‘telemedicine’. Conclusions. The results of this study show that obtaining informed consent for a telemedicine consultation is problematic. Alternative ways of gaining informed consent need to be investigated.

  17. A cross-linguistic study of real-word and non-word repetition as predictors of grammatical competence in children with typical language development

    Science.gov (United States)

    Dispaldro, Marco; Deevy, Patricia; Altoe, Gianmarco; Benelli, Beatrice; Leonard, Laurence B.

    2013-01-01

    Background: Although relationships among non-word repetition, real-word repetition and grammatical ability have been documented, it is important to study whether the specific nature of these relationships is tied to the characteristics of a given language. Aims: The aim of this study is to explore the potential cross-linguistic differences (Italian and English) in the relationship among non-word repetition, real-word repetition, and grammatical ability in three- and four-year-old children with typical language development. Methods & Procedures: To reach this goal, two repetition tasks (one real-word list and one non-word list for each language) were used. In Italian the grammatical categories were the third person plural inflection and the direct-object clitic pronouns, while in English they were the third person singular present tense inflection and the past tense in regular and irregular forms. Outcomes & Results: A cross-linguistic comparison showed that in both Italian and English, non-word repetition was a significant predictor of grammatical ability. However, performance on real-word repetition explained children’s grammatical ability in Italian but not in English. Conclusions & Implications: Abilities underlying non-word repetition performance (e.g., the processing and/or storage of phonological material) play an important role in the development of children’s grammatical abilities in both languages. Lexical ability (indexed by real-word repetition) showed a close relationship to grammatical ability in Italian but not in English. Implications of the findings are discussed in terms of cross-linguistic differences, genetic research, clinical intervention and methodological issues. PMID:21899673

  18. A cross-linguistic study of real-word and non-word repetition as predictors of grammatical competence in children with typical language development.

    Science.gov (United States)

    Dispaldro, Marco; Deevy, Patricia; Altoé, Gianmarco; Benelli, Beatrice; Leonard, Laurence B

    2011-01-01

    Although relationships among non-word repetition, real-word repetition and grammatical ability have been documented, it is important to study whether the specific nature of these relationships is tied to the characteristics of a given language. The aim of this study is to explore the potential cross-linguistic differences (Italian and English) in the relationship among non-word repetition, real-word repetition, and grammatical ability in three-and four-year-old children with typical language development. To reach this goal, two repetition tasks (one real-word list and one non-word list for each language) were used. In Italian the grammatical categories were the third person plural inflection and the direct-object clitic pronouns, while in English they were the third person singular present tense inflection and the past tense in regular and irregular forms. A cross-linguistic comparison showed that in both Italian and English, non-word repetition was a significant predictor of grammatical ability. However, performance on real-word repetition explained children's grammatical ability in Italian but not in English. Abilities underlying non-word repetition performance (e.g., the processing and/or storage of phonological material) play an important role in the development of children's grammatical abilities in both languages. Lexical ability (indexed by real-word repetition) showed a close relationship to grammatical ability in Italian but not in English. Implications of the findings are discussed in terms of cross-linguistic differences, genetic research, clinical intervention and methodological issues. © 2011 Royal College of Speech & Language Therapists.

  19. Ukufundisa izicuku zeziqhakancu emagameni (Teaching click clusters in words)

    Directory of Open Access Journals (Sweden)

    Gxowa-Dlayedwa, Ntombizodwa Cynthia

    2015-12-01

    Full Text Available Some teachers find it uninteresting and difficult to teach isiXhosa phonemes and syllables to grade one to three learners. This has a negative impact, as the literacy results are low because learners’ reading and writing skills are poor. The linguistic terms featuring in the title, namely consonants, vowels and syllables as found in words, facilitate reading and thus improve literacy standards in every language. IsiXhosa is one of the eleven official languages in South Africa. Phonemes include clicks and/or click clusters and vowels. On the other hand, there are people who are interested in learning to speak isiXhosa, but the difficulties encountered during the pronunciation of clicks discourage many of them. This study holds that knowledge of phonemes and syllables will boost the literacy standard in isiXhosa. The purposes of this study are therefore, first, to show that clicks and click clusters are found in major word categories which are in life circles. Secondly, if words are divided into segments, it becomes easy to produce them in print and in reading. Thirdly, reading is possible in every language, and most importantly, skills are transferable. The current study, therefore, argues that the knowledge of phonemes and syllables facilitates reading and creative writing skills. The data used in this study were taken from a novel written by Sidlayi (2009). A few examples have been given by the researchers themselves with the objective of clarifying some ideas.

  20. Does length or neighborhood size cause the word length effect?

    Science.gov (United States)

    Jalbert, Annie; Neath, Ian; Surprenant, Aimée M

    2011-10-01

    Jalbert, Neath, Bireta, and Surprenant (2011) suggested that past demonstrations of the word length effect, the finding that words with fewer syllables are recalled better than words with more syllables, included a confound: The short words had more orthographic neighbors than the long words. The experiments reported here test two predictions that would follow if neighborhood size is a more important factor than word length. In Experiment 1, we found that concurrent articulation removed the effect of neighborhood size, just as it removes the effect of word length. Experiment 2 demonstrated that this pattern is also found with nonwords. For Experiment 3, we factorially manipulated length and neighborhood size, and found only effects of the latter. These results are problematic for any theory of memory that includes decay offset by rehearsal, but they are consistent with accounts that include a redintegrative stage that is susceptible to disruption by noise. The results also confirm the importance of lexical and linguistic factors on memory tasks thought to tap short-term memory.

  1. Words in Sheep’s Clothing

    Directory of Open Access Journals (Sweden)

    Dušan Gabrovšek

    2006-06-01

    Full Text Available The paper focuses on various types of dictionary words, i.e. infrequent and rather uncommon words often listed in comprehensive monolingual English dictionaries but virtually nonexistent in actual usage. These are typically learned derivatives of Greek or Latin origin that are given as unlabeled synonyms of everyday vocabulary items. Their inclusion seems to stem from the application of two different bits of lexicographic philosophy: great respect for matters classical and the principle of comprehensiveness. Seen from this perspective, descriptive corpus-based lexicography is still too weak. While in large native-speaker-oriented dictionaries of English such entries do not seem to cause any harm, they can be positively dangerous in EFL/ESL environments, because using them can easily lead to strange or downright incomprehensible lexical items. Learners are advised to be careful and check the status of such “dubious” items also in English monolingual learners’ dictionaries, in which dictionary words are virtually nonexistent.

  2. Infant word recognition: Insights from TRACE simulations.

    Science.gov (United States)

    Mayor, Julien; Plunkett, Kim

    2014-02-01

    The TRACE model of speech perception (McClelland & Elman, 1986) is used to simulate results from the infant word recognition literature, to provide a unified, theoretical framework for interpreting these findings. In a first set of simulations, we demonstrate how TRACE can reconcile apparently conflicting findings suggesting, on the one hand, that consonants play a pre-eminent role in lexical acquisition (Nespor, Peña & Mehler, 2003; Nazzi, 2005), and on the other, that there is a symmetry in infant sensitivity to vowel and consonant mispronunciations of familiar words (Mani & Plunkett, 2007). In a second series of simulations, we use TRACE to simulate infants' graded sensitivity to mispronunciations of familiar words as reported by White and Morgan (2008). An unexpected outcome is that TRACE fails to demonstrate graded sensitivity for White and Morgan's stimuli unless the inhibitory parameters in TRACE are substantially reduced. We explore the ramifications of this finding for theories of lexical development. Finally, TRACE mimics the impact of phonological neighbourhoods on early word learning reported by Swingley and Aslin (2007). TRACE offers an alternative explanation of these findings in terms of mispronunciations of lexical items rather than imputing word learning to infants. Together these simulations provide an evaluation of Developmental (Jusczyk, 1993) and Familiarity (Metsala, 1999) accounts of word recognition by infants and young children. The findings point to a role for both theoretical approaches whereby vocabulary structure and content constrain infant word recognition in an experience-dependent fashion, and highlight the continuity in the processes and representations involved in lexical development during the second year of life.

  3. Cumulative Repetition Effects across Multiple Readings of a Word: Evidence from Eye Movements

    Science.gov (United States)

    Kamienkowski, Juan E.; Carbajal, M. Julia; Bianchi, Bruno; Sigman, Mariano; Shalom, Diego E.

    2018-01-01

    When a word is read more than once, reading time generally decreases in the successive occurrences. This Repetition Effect has been used to study word encoding and memory processes in a variety of experimental measures. We studied naturally occurring repetitions of words within normal texts (stories of around 3,000 words). Using linear mixed…

  4. Using complex networks to quantify consistency in the use of words

    International Nuclear Information System (INIS)

    Amancio, D R; Oliveira Jr, O N; Costa, L da F

    2012-01-01

    In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
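
    The abstract does not give the formula for its consistency index. As a rough illustration of the general idea only, the sketch below builds a word adjacency network for each text (words as nodes, edges between words that occur next to each other) and scores how well a word's neighborhood is preserved across two texts using Jaccard overlap. The Jaccard measure, the function names and the sample sentences are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import defaultdict


def adjacency_neighbors(text):
    """Map each word to the set of words that occur directly before or after it."""
    words = re.findall(r"[a-z']+", text.lower())
    neighbors = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # ignore immediate repetitions (no self-loops)
            neighbors[left].add(right)
            neighbors[right].add(left)
    return neighbors


def neighborhood_overlap(word, text_a, text_b):
    """Jaccard overlap of a word's neighbor sets in two texts.

    Hypothetical stand-in for a consistency index: 1.0 means the word has
    exactly the same neighbors in both texts, 0.0 means none are shared.
    """
    a = adjacency_neighbors(text_a).get(word, set())
    b = adjacency_neighbors(text_b).get(word, set())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample texts.
text_a = "the cat sat on the mat"
text_b = "the cat lay on the mat"

for word in ("cat", "mat"):
    print(word, neighborhood_overlap(word, text_a, text_b))
```

    A measure of this kind is independent of raw frequency, which fits the abstract's point that consistency and frequency of use follow different distributions.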

  5. Seven words you can never say at HHS

    Directory of Open Access Journals (Sweden)

    Robbins RA

    2017-12-01

    Full Text Available No abstract available. Article truncated at 150 words. The recent announcement of the seven words you can never say at Health & Human Services (HHS) reminded me of the late George Carlin’s routine, “Seven Words You Can Never Say on Television” (1). Policy analysts at the Centers for Disease Control (CDC) in Atlanta were told of the list of forbidden words at a meeting last Thursday, December 14, with senior CDC officials who oversee the budget, according to an analyst who took part in the 90-minute briefing (2). The forbidden words are "vulnerable," "entitlement," "diversity," "transgender," "fetus," "evidence-based" and "science-based." In some instances, the analysts were given alternative phrases. Instead of “science-based” or “evidence-based,” the suggested phrase is “CDC bases its recommendations on science in consideration with community standards and wishes,” the person said. In other cases, no replacement words were immediately offered. This is the latest attempt by government departments to distort fact. As an example, the …

  6. The socially-weighted encoding of spoken words: A dual-route approach to speech perception

    Directory of Open Access Journals (Sweden)

    Meghan Sumner

    2014-01-01

    Full Text Available Spoken words are highly variable. A single word may never be uttered the same way twice. As listeners, we regularly encounter speakers of different ages, genders, and accents, increasing the amount of variation we face. How listeners understand spoken words as quickly and adeptly as they do despite this variation remains an issue central to linguistic theory. We propose that learned acoustic patterns are mapped simultaneously to linguistic representations and to social representations. In doing so, we illuminate a paradox that results in the literature from, we argue, the focus on representations and the peripheral treatment of word-level phonetic variation. We consider phonetic variation more fully and highlight a growing body of work that is problematic for current theory: Words with different pronunciation variants are recognized equally well in immediate processing tasks, while an atypical, infrequent, but socially-idealized form is remembered better in the long-term. We suggest that the perception of spoken words is socially-weighted, resulting in sparse, but high-resolution clusters of socially-idealized episodes that are robust in immediate processing and are more strongly encoded, predicting memory inequality. Our proposal includes a dual-route approach to speech perception in which listeners map acoustic patterns in speech to linguistic and social representations in tandem. This approach makes novel predictions about the extraction of information from the speech signal, and provides a framework with which we can ask new questions. We propose that language comprehension, broadly, results from the integration of both linguistic and social information.

  7. Novel-word learning deficits in Mandarin-speaking preschool children with specific language impairments.

    Science.gov (United States)

    Chen, Yuchun; Liu, Huei-Mei

    2014-01-01

    Children with SLI exhibit overall deficits in novel word learning compared to their age-matched peers. However, the manifestation of the word learning difficulty in SLI was not consistent across tasks and the factors affecting the learning performance were not yet determined. Our aim is to examine the extent of word learning difficulties in Mandarin-speaking preschool children with SLI, and to explore the potent influence of existing lexical knowledge on the word learning process. Preschool children with SLI (n=37) and typical language development (n=33) were exposed to novel words for unfamiliar objects embedded in stories. Word learning tasks including the initial mapping and short-term repetitive learning were designed. Results revealed that Mandarin-speaking preschool children with SLI performed as well as their age-peers in the initial form-meaning mapping task. Their word learning difficulty was only evidently shown in the short-term repetitive learning task under a production demand, and their learning speed was slower than the control group. Children with SLI learned the novel words with a semantic head better in both the initial mapping and repetitive learning tasks. Moderate correlations between stand word learning performances and scores on standardized vocabulary were found after controlling for children's age and nonverbal IQ. The results suggested that the word learning difficulty in children with SLI occurred in the process of establishing a robust phonological representation at the beginning stage of word learning. Also, implicit compound knowledge is applied to aid word learning process for children with and without SLI. We also provide the empirical data to validate the relationship between preschool children's word learning performance and their existing receptive vocabulary ability. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. Typing speed, spelling accuracy, and the use of word-prediction

    Directory of Open Access Journals (Sweden)

    Marina Herold

    2008-02-01

    Full Text Available Children with spelling difficulties are limited in their participation in all written school activities. We aimed to investigate the influence of word-prediction as a tool on spelling accuracy and typing speed. To this end, we selected 80 Grade 4 - 6 children with spelling difficulties in a school for special needs to participate in a research project involving a cross-over within-subject design. The research task took the form of entering 30 words through an on-screen keyboard, with and without the use of word-prediction software. The Graded Word Spelling Test served to investigate whether there was a relationship between the children's current spelling knowledge and word-prediction efficacy. The results indicated an increase in spelling accuracy with the use of word-prediction, but at the cost of time and the tendency to use word approximations, and no significant relationship between spelling knowledge and word-prediction efficacy.

  9. Accounting for L2 learners’ errors in word stress placement

    Directory of Open Access Journals (Sweden)

    Clara Herlina Karjo

    2016-01-01

    Full Text Available Stress placement in English words is governed by highly complicated rules. Thus, assigning stress correctly in English words has been a challenging task for L2 learners, especially Indonesian learners since their L1 does not recognize such a stress system. This study explores the production of English word stress by 30 university students. The method used for this study is immediate repetition task. Participants are instructed to identify the stress placement of 80 English words which are auditorily presented as stimuli and immediately repeat the words with correct stress placement. The objectives of this study are to find out whether English word stress placement is problematic for L2 learners and to investigate the phonological factors which account for these problems. Research reveals that L2 learners have different ability in producing the stress, but three-syllable words are more problematic than two-syllable words. Moreover, misplacement of stress is caused by, among others, the influence of vowel length and vowel height.

  10. On advantage of seeing text and hearing speech

    Directory of Open Access Journals (Sweden)

    Živanović Jelena

    2011-01-01

    Full Text Available The aim of this study was to examine the effect of congruence between the sensory modality through which a concept can be experienced and the modality through which the word denoting that concept is perceived during word recognition. Words denoting concepts that can be experienced visually (e.g. “color”) and words denoting concepts that can be experienced auditorily (e.g. “noise”) were presented both visually and auditorily. We observed shorter processing latencies when there was a match between the modality through which a concept could be experienced and the modality through which a word denoting that concept was presented. In visual lexical decision task, “color” was recognized faster than “noise”, whereas in auditory lexical decision task, “noise” was recognized faster than “color”. The obtained pattern of results cannot be accounted for by exclusive amodal theories, whereas it can be easily integrated in theories based on perceptual representations.

  11. The art of words of Branko Miljković

    Directory of Open Access Journals (Sweden)

    Aleksić Slađana V.

    2015-01-01

    Full Text Available The youngest poet of Serbian neosymbolism, Branko Miljković, established his version of national poetic tradition according to the well known model of Eliot's principle and was also always ready to accept something new, modern, the most avant-garde. He argued for the perfection of form as one of important elements of the symbolist poetry within his own poetics as well as within texts related to poetry, while as a poet and translator he was oriented toward modern European poetry of the poet of symbolism with a special talent for patriotic poetry. Branko Miljković presents the poem itself as the subject of his poetry where the power of poetic word can be perceived. That structure considers the improvement of lines, the improvement of the expressive power of words as well as the improvement of language. The words are a powerful framework of world for Miljković. He strived for returning their depth and intensive meaning. Furthermore, he found creative energy of words, its resurrection in the revival of semantic, evocative and expressive effect. He looked for the value of poetic words primarily within their superiority. In Miljković's world of poetry the words invent, shape, they become the only confirmation of our world. He regarded his poetic opinion and the very act of poet's creation as a necessary condition of the real poetic creativity: the author identified himself with the very act of writing being completely left at the mercy of words. The art of dying for this world is the art of spiritual living. At the same time it is the effort that differentiates the place of death and the place of life: the poison and the cure. These tendencies provided the poet with intellectual dimension, confirmed not only the power of poetry but the power of poet as well.

  12. Changing word usage predicts changing word durations in New Zealand English.

    Science.gov (United States)

    Sóskuthy, Márton; Hay, Jennifer

    2017-09-01

    This paper investigates the emergence of lexicalized effects of word usage on word duration by looking at parallel changes in usage and duration over 130 years in New Zealand English. Previous research has found that frequent words are shorter, informative words are longer, and words in utterance-final position are also longer. It has also been argued that some of these patterns are not simply online adjustments, but are incorporated into lexical representations. While these studies tend to focus on the synchronic aspects of such patterns, our corpus shows that word-usage patterns and word durations are not static over time. Many words change in duration and also change with respect to frequency, informativity and likelihood of occurring utterance-finally. Analysis of changing word durations over this time period shows substantial patterns of co-adaptation between word usage and word durations. Words that are increasing in frequency are becoming shorter. Words that are increasing/decreasing in informativity show a change in the same direction in duration (e.g. increasing informativity is associated with increasing duration). And words that are increasingly appearing utterance-finally are lengthening. These effects exist independently of the local effects of the predictors. For example, words that are increasing utterance-finally lengthen in all positions, including utterance-medially. We show that these results are compatible with a number of different views about lexical representations, but they cannot be explained without reference to a production-perception loop that allows speakers to update their representations dynamically on the basis of their experience. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  13. Semi-automatic Term Extraction for the African Languages, with ...

    African Journals Online (AJOL)

    rbr

    for the treatment of single-word terms versus multi-word terms; and the various findings are summarised in a ... these days in many different types of dictionary to use the systematic evidence .... not form the focus of the current investigation. ..... When one studies the first 25 unique terms on the KeyWord list, one sees that.

  14. Using a High-Dimensional Graph of Semantic Space to Model Relationships among Words

    Directory of Open Access Journals (Sweden)

    Alice F Jackson

    2014-05-01

    Full Text Available The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition.
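
The distinction drawn in this abstract, between direct co-occurrence (associative relatedness) and shared first-order neighbours (semantic similarity), can be illustrated with a minimal sketch. This is not the GOLD implementation: the toy corpus, the function names, and the use of a Jaccard ratio to score the neighbour intersection are all assumptions made for illustration only.

```python
from collections import defaultdict

def build_cooccurrence_graph(sentences, window=1):
    """Map each word to the set of words seen within `window` positions of it.

    window=1 loosely mirrors an adjacent-word graph; a larger window
    moves toward sentence- or paragraph-level co-occurrence.
    """
    graph = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] != word:
                    graph[word].add(tokens[j])
    return graph

def associative_relatedness(graph, a, b):
    """1.0 if the two words are directly linked (they co-occur), else 0.0."""
    return 1.0 if b in graph.get(a, set()) else 0.0

def neighbour_overlap(graph, a, b):
    """Jaccard overlap of first-order neighbours (assumed similarity score)."""
    na = graph.get(a, set()) - {b}
    nb = graph.get(b, set()) - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

# Hypothetical toy corpus, invented for this example.
corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the ball".split(),
    "the cat ate fish".split(),
    "the dog ate meat".split(),
]
graph = build_cooccurrence_graph(corpus, window=1)

print(associative_relatedness(graph, "cat", "dog"))   # never adjacent
print(neighbour_overlap(graph, "cat", "dog"))         # same neighbours
```

In this toy corpus "cat" and "dog" never appear next to each other, so they have no direct edge, yet they share every neighbour, which is the pattern the abstract hypothesizes for semantically similar words.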

  15. A WORD-OF-MOUSE APPROACH FOR WORD-OF-MOUTH MEASUREMENT

    OpenAIRE

    Andreia Gabriela ANDREI

    2012-01-01

    Despite the fact that word-of-mouth phenomenon gained unseen dimensions, only few studies have focused on its measurement and only three of them developed a word-of-mouth construct. Our study develops a bi-dimensional scale which assigns usual word-of-mouth mechanisms available in online networking sites (e.g., Recommend, Share, Like, Comment) into the WOM (+) - positive word-of-mouth valence dimension - respectively into the WOM (-) - negative word-of-mouth valence dimension. We adapted e-W...

  16. A Literature Review of Word of Mouth and Electronic Word of Mouth: Implications for Consumer Behavior

    Directory of Open Access Journals (Sweden)

    Nuria Huete-Alcocer

    2017-07-01

    Full Text Available The rise and spread of the Internet has led to the emergence of a new form of word of mouth (WOM): electronic word of mouth (eWOM), considered one of the most influential informal media among consumers, businesses, and the population at large. Drawing on these ideas, this paper reviews the relevant literature, analyzing the impact of traditional WOM and eWOM in the field of consumer behavior and highlighting the main differences between the two types of recommendations, with a view to contributing to a better understanding of the potential of both.

  17. Badomics words and the power and peril of the ome-meme

    Directory of Open Access Journals (Sweden)

    Eisen Jonathan A

    2012-07-01

    Full Text Available Abstract Languages and cultures, like organisms, are constantly evolving. Words, like genes, can come and go–spreading around or going extinct. Here I discuss the spread of one small subset of words that are meant to convey “comprehensiveness” in some way: the “omes” and other words derived from “genome” or “genomics.” I focus on a bad aspect of this spread: the use of what I refer to as “badomics” words. I discuss why these should be considered bad and how to distinguish badomics words from good ones.

  18. The Training of Morphological Decomposition in Word Processing and Its Effects on Literacy Skills

    Directory of Open Access Journals (Sweden)

    Irit Bar-Kochva

    2017-10-01

    Full Text Available This study set out to examine the effects of a morpheme-based training on reading and spelling in fifth and sixth graders (N = 47), who present poor literacy skills and speak German as a second language. A computerized training, consisting of a visual lexical decision task (comprising 2,880 items, presented in 12 sessions), was designed to encourage fast morphological analysis in word processing. The children were divided between two groups: the one underwent a morpheme-based training, in which word-stems of inflections and derivations were presented for a limited duration, while their pre- and suffixes remained on screen until response. Another group received a control training consisting of the same task, except that the duration of presentation of a non-morphological unit was restricted. In a Word Disruption Task, participants read words under three conditions: morphological separation (with symbols separating between the words’ morphemes), non-morphological separation (with symbols separating between non-morphological units of words), and no-separation (with symbols presented at the beginning and end of each word). The group receiving the morpheme-based program improved more than the control group in terms of word reading fluency in the morphological condition. The former group also presented similar word reading fluency after training in the morphological condition and in the no-separation condition, thereby suggesting that the morpheme-based training contributed to the integration of morphological decomposition into the process of word recognition. At the same time, both groups similarly improved in other measures of word reading fluency. With regard to spelling, the morpheme-based training group showed a larger improvement than the control group in spelling of trained items, and a unique improvement in spelling of untrained items (untrained word-stems integrated into trained pre- and suffixes).
The results further suggest some contribution of

  19. Right word making sense of the words that confuse

    CERN Document Server

    Morrison, Elizabeth

    2012-01-01

    'Affect' or 'effect'? 'Right', 'write' or 'rite'? English can certainly be a confusing language, whether you're a native speaker or learning it as a second language. 'The Right Word' is the essential reference to help people master its subtleties and avoid making mistakes. Divided into three sections, it first examines homophones - those tricky words that sound the same but are spelled differently - then looks at words that often confuse before providing a list of commonly misspelled words.

  20. Image-Word Pairing-Congruity Effect on Affective Responses

    Science.gov (United States)

    Sanabria Z., Jorge C.; Cho, Youngil; Sambai, Ami; Yamanaka, Toshimasa

    The present study explores the effects of familiarity on affective responses (pleasure and arousal) to Japanese ad elements, based on the schema incongruity theory. Print ads showing natural scenes (landscapes) were used to create the stimuli (images and words). An empirical study was conducted to measure subjects' affective responses to image-word combinations that varied in terms of incongruity. The level of incongruity was based on familiarity levels, and was statistically determined by a variable called ‘pairing-congruity status’. The tested hypothesis proposed that even highly familiar image-word combinations, when combined incongruously, would elicit strong affective responses. Subjects assessed the stimuli using bipolar scales. The study was effective in tracing interactions between familiarity, pleasure and arousal, although the incongruous image-word combinations did not elicit the predicted strong effects on pleasure and arousal. The results suggest a need for further research incorporating kansei (i.e., creativity) into the process of stimuli selection.

  1. It's a Mad, Mad Wordle: For a New Take on Text, Try This Fun Word Cloud Generator

    Science.gov (United States)

    Foote, Carolyn

    2009-01-01

    Nation. New. Common. Generation. These are among the most frequently used words spoken by President Barack Obama in his January 2009 inauguration speech as seen in a fascinating visual display called a Wordle. Educators, too, can harness the power of Wordle to enhance learning. Imagine providing students with a whole new perspective on…

  2. Positive emotion word use and longevity in famous deceased psychologists.

    Science.gov (United States)

    Pressman, Sarah D; Cohen, Sheldon

    2012-05-01

    This study examined whether specific types of positive and negative emotional words used in the autobiographies of well-known deceased psychologists were associated with longevity. For each of the 88 psychologists, the percent of emotional words used in writing was calculated and categorized by valence (positive or negative) and arousal (activated [e.g., lively, anxious] or not activated [e.g., calm, drowsy]) based on existing emotion scales and models of emotion categorization. After controlling for sex, year of publication, health (based on disclosed illness in autobiography), native language, and year of birth, the use of more activated positive emotional words (e.g., lively, vigorous, attentive, humorous) was associated with increased longevity. Negative terms (e.g., angry, afraid, drowsy, sluggish) and unactivated positive terms (e.g., peaceful, calm) were not related to longevity. The association of activated positive emotions with longevity was also independent of words indicative of social integration, optimism, and the other affect/activation categories. Results indicate that in writing, not every type of emotion correlates with longevity and that there may be value to considering different categories beyond emotional valence in health relevant outcomes.

  3. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Science.gov (United States)

    2011-01-01

    Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. 
In order to foster interoperability, the BioLexicon is modelled using the Lexical

  4. The embodied mind extended: Using words as social tools

    Directory of Open Access Journals (Sweden)

    Anna M Borghi

    2013-05-01

    Full Text Available The extended mind view and the embodied-grounded view of cognition and language are typically considered as rather independent perspectives. In this paper we propose a possible integration of the two views and support it by proposing the idea of 'Words As social Tools' (WAT). In this respect, we will propose that words, also due to their social and public character, can be conceived as quasi-external devices that extend our cognition. Moreover, words function like tools in that they enlarge the bodily space of action thus modifying our sense of body. To support our proposal, we review the relevant literature on tool use and on words as tools and report recent evidence indicating that word use leads to an extension of space close to the body. In addition, we outline a model of the neural processes that may underpin bodily space extension via word use and may reflect possible effects on cognition of the use of words as external means. We also discuss how reconciling the two perspectives can help to overcome the limitations they encounter if considered independently.

  5. WHO ARE FANS OF FACEBOOK FAN PAGES? AN ELECTRONIC WORD-OF-MOUTH COMMUNICATION PERSPECTIVE

    Directory of Open Access Journals (Sweden)

    Xiao Hu

    2014-12-01

    Full Text Available Given its great business value and popularity, Facebook fan pages have attracted more and more attention in both industry and academia. Fans of Facebook fan pages play an important role in electronic word-of-mouth (eWOM) communication. This study focused on the population of fans on Facebook fan pages and examined the differences between fans and non-fans in terms of demographics, social network sites (SNS) use, Internet use, and online shopping behaviors. The results indicated that fans used SNS more frequently than non-fans. Additionally, from the eWOM perspective, the researchers moderated product types in the model of people’s word-of-mouth (WOM) preferences and found that people had different preferences for eWOM and traditional WOM for different products. Traditional WOM is still the most important source of information for people when shopping online.

  6. Smashing WordPress Themes Making WordPress Beautiful

    CERN Document Server

    Hedengren, Thord Daniel

    2011-01-01

    The ultimate guide to WordPress Themes - one of the hottest topics on the web today. WordPress is so much more than a blogging platform, and Smashing WordPress Themes teaches readers how to make it look any way they like - from a corporate site, to a photography gallery and more. WordPress is one of the hottest tools on the web today and is used by sites including The New York Times, Rolling Stone, flickr, CNN, NASA and of course Smashing Magazine. Beautiful full colour throughout - web designers expect nothing less. Smashing Magazine will fully support this book by promoting it through their webs

  7. Goodnight Book: Sleep Consolidation Improves Word Learning via Storybooks

    Directory of Open Access Journals (Sweden)

    Sophie E. Williams

    2014-03-01

    Full Text Available Reading the same storybooks repeatedly helps preschool children learn words. In addition, sleeping shortly after learning also facilitates memory consolidation and aids learning in older children and adults. The current study explored how sleep promotes word learning in preschool children using a shared storybook reading task. Children were either read the same story repeatedly or different stories and either napped after the stories or remained awake. Children’s word retention was tested 2.5 hours later, 24 hours later and 7 days later. Results demonstrate strong, persistent effects for both repeated readings and sleep consolidation on young children’s word learning. A key finding is that children who read different stories before napping learned words as well as children who had the advantage of hearing the same story. In contrast, children who read different stories and remained awake never caught up to their peers on later word learning tests. Implications for educational practices are discussed.

  8. The Motivation of Secondary School Students in Mathematical Word Problem Solving

    Science.gov (United States)

    Gasco, Javier; Villarroel, Jose-Domingo

    2014-01-01

    Introduction: Motivation is an important factor in the learning of mathematics. Within this area of education, word problem solving is central in most mathematics curricula of Secondary School. The objective of this research is to detect the differences in motivation in terms of the strategies used to solve word problems. Method: It analyzed the…

  9. The Emotions of Abstract Words: A Distributional Semantic Analysis.

    Science.gov (United States)

    Lenci, Alessandro; Lebani, Gianluca E; Passaro, Lucia C

    2018-04-06

    Recent psycholinguistic and neuroscientific research has emphasized the crucial role of emotions for abstract words, which would be grounded by affective experience, instead of a sensorimotor one. The hypothesis of affective embodiment has been proposed as an alternative to the idea that abstract words are linguistically coded and that linguistic processing plays a key role in their acquisition and processing. In this paper, we use distributional semantic models to explore the complex interplay between linguistic and affective information in the representation of abstract words. Distributional analyses on Italian norming data show that abstract words have more affective content and tend to co-occur with contexts with higher emotive values, according to affective statistical indices estimated in terms of distributional similarity with a restricted number of seed words strongly associated with a set of basic emotions. Therefore, the strong affective content of abstract words might just be an indirect byproduct of co-occurrence statistics. This is consistent with a version of representational pluralism in which concepts that are fully embodied either at the sensorimotor or at the affective level live side-by-side with concepts only indirectly embodied via their linguistic associations with other embodied words. Copyright © 2018 Cognitive Science Society, Inc.
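
The idea of estimating a word's emotive value from its distributional similarity to a small set of emotion seed words can be sketched roughly as follows. The record does not give the authors' actual formula, so the averaging of cosine similarities, the three-dimensional toy vectors, the seed lists, and all names below are assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def emotive_score(word, emotion, vectors, seeds):
    """Mean cosine similarity between `word` and the seed words of `emotion`.

    Averaging is an assumed aggregation, not the published index.
    """
    sims = [cosine(vectors[word], vectors[s]) for s in seeds[emotion]]
    return sum(sims) / len(sims)

# Hypothetical toy distributional vectors (real ones would come from a corpus).
vectors = {
    "joy":     [1.0, 0.0, 0.0],
    "delight": [0.9, 0.1, 0.0],
    "fear":    [0.0, 1.0, 0.0],
    "terror":  [0.1, 0.9, 0.0],
    "freedom": [0.8, 0.2, 0.1],   # abstract word
    "table":   [0.0, 0.1, 1.0],   # concrete word
}
# Hypothetical seed lists for two basic emotions.
seeds = {"JOY": ["joy", "delight"], "FEAR": ["fear", "terror"]}

for word in ("freedom", "table"):
    scores = {e: round(emotive_score(word, e, vectors, seeds), 3) for e in seeds}
    print(word, scores)
```

With these invented vectors the abstract word scores higher on the JOY index than the concrete word does; with real corpus-derived vectors the outcome would of course depend on the data.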

  10. Words of foreign origin in political discourse

    Directory of Open Access Journals (Sweden)

    Sabina Zorčič

    2012-12-01

    Full Text Available The paper discusses the use of words of foreign origin in Slovenian political discourse. At the outset, this usage is broken down into four groups: the first contains specific phrases and terminology inherent to the political domain; the second contains words of foreign origin generally present in the Slovene language (because of their high frequency of nonexclusivistic use, these words are not of interest to the scope of this investigation); the third contains various words of foreign origin used as affectional packaging for messages with the aim of stimulating the desired interpretation (framing reality); the fourth group, which is the most interesting for our research, is made up of words of foreign origin which could have a marker: + marked, + not necessary, + unwanted, but only if we accept the logic of purism. All the words in this group could be replaced - without any loss of meaning - with their Slovene equivalents. The speakerʼs motivation for using the foreign word is crucial to our discussion. In the framework of Pierre Bourdieuʼs poststructural theory as well as Austinʼs and Searleʼs speech act theory, statistical data is analysed to observe how usage frequency varies in correlation with selected factors which manifest the speakerʼs habitus. We argue that words of foreign origin represent symbolic cultural capital, a kind of added value which functions as credit and as such is an important form of the accumulation of capital.

  11. Question Word in the Mandarin Language

    Directory of Open Access Journals (Sweden)

    Xu Yunyu

    2016-12-01

    Full Text Available In an interrogative sentence in the Mandarin language, a question word can be placed at the beginning, middle or end of a sentence. Because of differences in nation and culture, foreign students learning Mandarin find it difficult to understand the question words and their position in that language. Because of that, the writer proposes to explain such problems. This research aims to find out what the types of question words in Mandarin are, and also to explain the function and usage of question words in the Mandarin interrogative sentence. An interrogative sentence is a very important sentence. In Mandarin, the following question words: 谁 (shuí, “who”), 在哪里 (zài nǎli, “where”), 在哪儿 (zài nǎ’er, “where”), 为什么 (wèi shénme, “why”), 怎么 (zěnme, “why”), 多少 (duō shǎo, “how many”), 多久 (duō jiǔ, “how long”), 什么时候 (shénme shíhòu, “when”), 什么 (shénme, “what”), 做什么 (zuò shénme, “why”), 干什么 (gàn shénme, “why”), 干嘛 (gànma, “why”) and so on are used to ask “who”, “where”, “what”, “how much”, “when”, “what time”, and “why”. Those words have different functions and usage. Each sentence has a certain structure and word order. A question word can be placed at the beginning, middle, or end of a sentence. When the place is changed, there is a possibility of miscommunication.   DOI: https://doi.org/10.24071/llt.2013.160106

  12. Quantifying the Beauty of Words: A Neurocognitive Poetics Perspective

    Directory of Open Access Journals (Sweden)

    Arthur M. Jacobs

    2017-12-01

    Full Text Available In this paper I would like to pave the ground for future studies in Computational Stylistics and (Neuro-)Cognitive Poetics by describing procedures for predicting the subjective beauty of words. A set of eight tentative word features is computed via Quantitative Narrative Analysis (QNA) and a novel metric for quantifying word beauty, the aesthetic potential, is proposed. Application of machine learning algorithms fed with this QNA data shows that a classifier of the decision tree family excellently learns to split words into beautiful vs. ugly ones. The results shed light on surface and semantic features theoretically relevant for affective-aesthetic processes in literary reading and generate quantitative predictions for neuroaesthetic studies of verbal materials.

  13. Neural Correlates of Task-Irrelevant First and Second Language Emotion Words — Evidence from the Face-Word Stroop Task

    Directory of Open Access Journals (Sweden)

    Lin Fan

    2016-11-01

    Full Text Available Emotionally valenced words have thus far not been empirically examined in a bilingual population with the emotional face-word Stroop paradigm. Chinese-English bilinguals were asked to identify facial expressions of emotion while task-irrelevant emotion words in their first (L1) or second (L2) language were superimposed on the face pictures. We attempted to examine how the emotional content of words modulates behavioral performance and cerebral functioning in the bilinguals’ two languages. The results indicated that there were significant congruency effects for both L1 and L2 emotion words, and that identifiable differences in the magnitude of the Stroop effect between the two languages were also observed, suggesting L1 is more capable of activating the emotional response to word stimuli. For event-related potential (ERP) data, an N350-550 effect was observed only in the L1 task, with greater negativity for incongruent than congruent trials. The size of the N350-550 effect differed across languages, whereas no identifiable language distinction was observed in the effect of conflict slow potential (conflict SP). Finally, more pronounced negative amplitude at 230-330 ms was observed in L1 than in L2, but only for incongruent trials. This negativity, likened to an orthographic decoding N250, may reflect the extent of attention to emotion word processing at the word-form level, while the N350-550 reflects a complicated set of processes in conflict processing. Overall, the face-word congruency effect reflected identifiable language distinctions at 230-330 and 350-550 ms, which provides supporting evidence for theoretical proposals assuming attenuated emotionality of L2 processing.

  14. Language Skills in Classical Chinese Text Comprehension

    Science.gov (United States)

    Lau, Kit-ling

    2018-01-01

    This study used both quantitative and qualitative methods to explore the role of lower- and higher-level language skills in classical Chinese (CC) text comprehension. A CC word and sentence translation test, text comprehension test, and questionnaire were administered to 393 Secondary Four students; and 12 of these were randomly selected to…

  15. "I Can Comprehend...I Just Can't Read Big Words."

    Science.gov (United States)

    Allen, Janet

    2003-01-01

    Explains how to help students understand the relationship between figuring out words and the ability to comprehend a text. Describes how to solidify the word-learning/comprehension connection via explaining, modeling, and supporting transfer to independence. Concludes that explaining, modeling, and supporting transfer to independence form the…

  16. The Presentation of Word Formation in General Monolingual ...

    African Journals Online (AJOL)

    This paper gives suggestions regarding the theoretical approaches that could lead to a better user-directed lexicographic practice. Keywords: Afrikaans dictionaries, cognitive function, complex form, compound, derivative, dictionary function, electronic dictionaries, text production, text reception, user needs, word formation ...

  17. Short-Term Free Recall and Sequential Memory for Pictures and Words: A Simultaneous-Successive Processing Interpretation.

    Science.gov (United States)

    Randhawa, Bikkar S.; And Others

    1982-01-01

    Replications of two basic experiments in support of the dual-coding processing model with grade 10 and college subjects used pictures, concrete words, and abstract words as stimuli presented at fast and slow rates for immediate and sequential recall. Results seem to be consistent with predictions of simultaneous-successive cognitive theory. (MBR)

  18. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges.

    Science.gov (United States)

    Stansfield, Claire; O'Mara-Eves, Alison; Thomas, James

    2017-09-01

    Using text mining to aid the development of database search strings for topics described by diverse terminology has potential benefits for systematic reviews; however, methods and tools for accomplishing this are poorly covered in the research methods literature. We briefly review the literature on applications of text mining for search term development for systematic reviewing. We found that the tools can be used in 5 overarching ways: improving the precision of searches; identifying search terms to improve search sensitivity; aiding the translation of search strategies across databases; searching and screening within an integrated system; and developing objectively derived search strategies. Using a case study and selected examples, we then reflect on the utility of certain technologies (term frequency-inverse document frequency and Termine, term frequency, and clustering) in improving the precision and sensitivity of searches. Challenges in using these tools are discussed. The utility of these tools is influenced by the different capabilities of the tools, the way the tools are used, and the text that is analysed. Increased awareness of how the tools perform facilitates the further development of methods for their use in systematic reviews. Copyright © 2017 John Wiley & Sons, Ltd.
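
    As a rough illustration of the term frequency-inverse document frequency (TF-IDF) weighting mentioned in the abstract above, the sketch below scores the words of a few toy records so that words appearing in every record score zero and rarer words score higher. It is not the tooling evaluated in the review: the function names `tokenize` and `tfidf_scores`, the toy records, and the plain `tf * log(N / df)` weighting are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(documents):
    """Return one {term: score} dict per document, using tf * log(N / df).

    tf is the term count divided by the document length; df is the number
    of documents that contain the term. This is one common variant of
    TF-IDF, assumed here for illustration.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy records, standing in for titles or abstracts.
records = [
    "the trial of text mining tools",
    "the review of search strategies",
    "the text mining of search terms",
]

for record, score in zip(records, tfidf_scores(records)):
    ranked = sorted(score.items(), key=lambda item: item[1], reverse=True)
    print(record, "->", [term for term, value in ranked[:3] if value > 0])
```

    In this toy setting, words shared by all records ("the", "of") receive a score of zero, which is the property that makes such weighting a candidate for surfacing distinctive search terms; a real search-development workflow would need a proper tokenizer, phrase handling, and a much larger corpus.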

  19. Auditory word recognition is not more sensitive to word-initial than to word-final stimulus information

    NARCIS (Netherlands)

    Vlugt, van der M.J.; Nooteboom, S.G.

    1986-01-01

    Several accounts of human recognition of spoken words assign special importance to stimulus-word onsets. The experiment described here was designed to find out whether such a word-beginning superiority effect, which is supported by experimental evidence of various kinds, is due to a special

  20. Reinforcement and inference in cross-situational word learning.

    Science.gov (United States)

    Tilles, Paulo F C; Fontanari, José F

    2013-01-01

    Cross-situational word learning is based on the notion that a learner can determine the referent of a word by finding something in common across many observed uses of that word. Here we propose an adaptive learning algorithm that contains a parameter that controls the strength of the reinforcement applied to associations between concurrent words and referents, and a parameter that regulates inference, which includes built-in biases, such as mutual exclusivity, and information of past learning events. By adjusting these parameters so that the model predictions agree with data from representative experiments on cross-situational word learning, we were able to explain the learning strategies adopted by the participants of those experiments in terms of a trade-off between reinforcement and inference. These strategies can vary wildly depending on the conditions of the experiments. For instance, for fast mapping experiments (i.e., the correct referent could, in principle, be inferred in a single observation) inference is prevalent, whereas for segregated contextual diversity experiments (i.e., the referents are separated in groups and are exhibited with members of their groups only) reinforcement is predominant. Other experiments are explained with more balanced doses of reinforcement and inference.
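
    The basic idea of cross-situational learning described above, finding what is common across many uses of a word, can be sketched with a simple co-occurrence counter. This is a generic illustration and not the adaptive algorithm proposed by the authors: it has no inference component or built-in biases such as mutual exclusivity, and the names `learn_associations` and `best_referent`, the `reinforcement` increment, and the toy episodes are all hypothetical.

```python
from collections import defaultdict


def learn_associations(episodes, reinforcement=1.0):
    """Accumulate word-referent association weights over learning episodes.

    Each episode is a (words, referents) pair observed together. Every
    co-occurring word-referent pair is strengthened by `reinforcement`
    (a hypothetical fixed increment, not the paper's parameter).
    """
    weights = defaultdict(lambda: defaultdict(float))
    for words, referents in episodes:
        for word in words:
            for referent in referents:
                weights[word][referent] += reinforcement
    return weights


def best_referent(weights, word):
    """Return the referent most strongly associated with `word`, or None if unseen."""
    candidates = weights.get(word)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy episodes: each scene is ambiguous on its own, but the correct
# pairing is the only one that recurs across scenes.
episodes = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"], ["DOG", "CUP"]),
]

weights = learn_associations(episodes)
for word in ["ball", "dog", "cup"]:
    print(word, "->", best_referent(weights, word))
```

    No single episode identifies a referent here, yet each word ends up most strongly tied to the referent it always co-occurs with. The trade-off the abstract discusses would enter by letting an inference step down-weight referents already explained by other words, which this sketch deliberately omits.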