WorldWideScience

Sample records for folklore text corpora

  1. Two approaches to gathering text corpora from the WorldWideWeb

    CSIR Research Space (South Africa)

    Botha, G

    2005-11-01

    Full Text Available Many applications of pattern recognition to natural language processing require large text corpora in a specified language. For many of the languages of the world, such corpora are not readily available, but significant quantities of text...

  2. Segmenting corpora of texts Segmentação de corpora de textos

    Directory of Open Access Journals (Sweden)

    Tony Berber Sardinha

    2002-01-01

    Full Text Available The aim of the research presented here is to report on a corpus-based method for discourse analysis that is based on the notion of segmentation, or the division of texts into cohesive portions. For the purposes of this investigation, a segment is defined as a contiguous portion of written text consisting of at least two sentences. The segmentation procedure developed for the study is called LSM (link set median), which is based on the identification of lexical repetition in text. The data analysed in this investigation were three corpora of 100 texts each. Each corpus was composed of texts of one particular genre: research articles, annual business reports, and encyclopaedia entries. The total number of words in the three corpora was 1,262,710. The segments inserted in the texts by the LSM procedure were compared to the internal section divisions in the texts. The results obtained through the LSM procedure were then compared to segmentation carried out at random. The results indicated that the LSM procedure worked better than random, suggesting that lexical repetition accounts in part for the way texts are segmented into sections.
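
    The idea of segmenting by lexical repetition can be illustrated with a minimal sketch. This is not the LSM procedure itself, whose details the abstract does not give; it is a simplified stand-in that counts word types repeated across each gap between sentences and treats the gaps with the fewest such links as candidate boundaries. The function names, the `window` parameter, and the four-letter word filter are all assumptions made for illustration.

```python
import re

def content_words(sentence, min_len=4):
    """Lower-cased word types of at least min_len letters (a crude stopword filter)."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if len(w) >= min_len}

def gap_links(sentences, window=2):
    """For each gap between sentences, count word types repeated across the gap."""
    bags = [content_words(s) for s in sentences]
    scores = []
    for gap in range(1, len(bags)):
        left = set().union(*bags[max(0, gap - window):gap])
        right = set().union(*bags[gap:gap + window])
        scores.append(len(left & right))
    return scores

def boundaries(sentences, window=2):
    """Indices of sentences that start a new segment: the gaps with the fewest links."""
    scores = gap_links(sentences, window)
    if not scores:
        return []
    lowest = min(scores)
    return [i + 1 for i, s in enumerate(scores) if s == lowest]

sentences = [
    "The corpus contains research articles.",
    "Each research article has sections.",
    "Dividends rose sharply last quarter.",
    "Shareholders welcomed higher dividends.",
]
print(gap_links(sentences))   # link counts for the three gaps
print(boundaries(sentences))  # sentence indices where a new segment starts
```

    Comparing boundaries proposed this way against the section divisions already present in a text, and against randomly placed boundaries, mirrors the kind of evaluation the abstract describes.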

  3. Using machine learning to disentangle homonyms in large text corpora.

    Science.gov (United States)

    Roll, Uri; Correia, Ricardo A; Berger-Tal, Oded

    2018-06-01

    Systematic reviews are an increasingly popular decision-making tool that provides an unbiased summary of evidence to support conservation action. These reviews bridge the gap between researchers and managers by presenting a comprehensive overview of all studies relating to a particular topic and identify specifically where and under which conditions an effect is present. However, several technical challenges can severely hinder the feasibility and applicability of systematic reviews, for example, homonyms (terms that share spelling but differ in meaning). Homonyms add noise to search results and cannot be easily identified or removed. We developed a semiautomated approach that can aid in the classification of homonyms among narratives. We used a combination of automated content analysis and artificial neural networks to quickly and accurately sift through large corpora of academic texts and classify them to distinct topics. As an example, we explored the use of the word reintroduction in academic texts. Reintroduction is used within the conservation context to indicate the release of organisms to their former native habitat; however, a Web of Science search for this word returned thousands of publications in which the term has other meanings and contexts. Using our method, we automatically classified a sample of 3000 of these publications with over 99% accuracy, relative to a manual classification. Our approach can be used easily with other homonyms and can greatly facilitate systematic reviews or similar work in which homonyms hinder the harnessing of large text corpora. Beyond homonyms we see great promise in combining automated content analysis and machine-learning methods to handle and screen big data for relevant information in conservation science. © 2017 Society for Conservation Biology.

  4. Ontology-based retrieval of bio-medical information based on microarray text corpora

    DEFF Research Database (Denmark)

    Hansen, Kim Allan; Zambach, Sine; Have, Christian Theil

    are exponentially growing, the text corpora are sparse and inconsistent in spite of attempts to standardize the format. Ordinary keyword search may in some cases be insufficient to find relevant information and the potential benefit of using a semantic approach in this context has only been investigated to a limited...

  5. Representativeness in corpora of literary texts: introducing the C18P project

    Directory of Open Access Journals (Sweden)

    Gemeinböck, Iris

    2016-07-01

    Full Text Available Currently there are very few specialised corpora of literary texts that are tailored to the needs of literary critics who are interested in corpus stylistic analyses of prose fiction. Many existing corpora including literary texts were compiled for linguistic research interests and are often unsuitable for corpus stylistic purposes. The paper addresses three of the main problems: the absence of labelling of the texts for literary genre, the use of extracts, and the prevalence of linguistic periodisation schemes. C18P is a corpus of prose fiction designed specifically to address these issues. It traces the early development of the novel from 1700 up until the Victorian era. It can, for instance, be used for an analysis of the characteristic linguistic features of individual literary genres and forms. The following paper introduces the design of the corpus as well as some of its potential uses.

  6. Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies.

    Science.gov (United States)

    Cohen, Raphael; Elhadad, Michael; Elhadad, Noémie

    2013-01-16

    The increasing availability of Electronic Health Record (EHR) data and specifically free-text patient notes presents opportunities for phenotype extraction. Text-mining methods in particular can help disease modeling by mapping named-entity mentions to terminologies and clustering semantically related terms. EHR corpora, however, exhibit specific statistical and linguistic characteristics when compared with corpora in the biomedical literature domain. We focus on copy-and-paste redundancy: clinicians typically copy and paste information from previous notes when documenting a current patient encounter. Thus, within a longitudinal patient record, one expects to observe heavy redundancy. In this paper, we ask three research questions: (i) How can redundancy be quantified in large-scale text corpora? (ii) Conventional wisdom is that larger corpora yield better results in text mining. But how does the observed EHR redundancy affect text mining? Does such redundancy introduce a bias that distorts learned models? Or does the redundancy introduce benefits by highlighting stable and important subsets of the corpus? (iii) How can one mitigate the impact of redundancy on text mining? We analyze a large-scale EHR corpus and quantify redundancy both in terms of word and semantic concept repetition. We observe redundancy levels of about 30% and non-standard distribution of both words and concepts. We measure the impact of redundancy on two standard text-mining applications: collocation identification and topic modeling. We compare the results of these methods on synthetic data with controlled levels of redundancy and observe significant performance variation. Finally, we compare two mitigation strategies to avoid redundancy-induced bias: (i) a baseline strategy, keeping only the last note for each patient in the corpus; (ii) removing redundant notes with an efficient fingerprinting-based algorithm. (a) For text mining, preprocessing the EHR corpus with fingerprinting yields
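
    A rough sketch of how fingerprinting can flag copy-and-paste redundancy follows. It is not the authors' algorithm, which the abstract does not specify; it assumes word trigram shingles hashed with SHA-256, and the function names, the shingle size `n`, and the 0.5 threshold are hypothetical choices for illustration.

```python
import hashlib

def shingles(text, n=3):
    """Set of word n-grams (shingles) in a note."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def fingerprints(text, n=3):
    """Hash each shingle to a short hexadecimal fingerprint."""
    return {hashlib.sha256(s.encode("utf-8")).hexdigest()[:16] for s in shingles(text, n)}

def redundancy(new_note, earlier_notes, n=3):
    """Fraction of the new note's fingerprints already present in earlier notes."""
    new_fp = fingerprints(new_note, n)
    if not new_fp:
        return 0.0
    seen = set()
    for note in earlier_notes:
        seen |= fingerprints(note, n)
    return len(new_fp & seen) / len(new_fp)

def drop_redundant(notes, threshold=0.5, n=3):
    """Keep notes in order, skipping any whose redundancy exceeds the threshold."""
    kept = []
    for note in notes:
        if redundancy(note, kept, n) <= threshold:
            kept.append(note)
    return kept

notes = [
    "patient reports mild chest pain since monday",
    "patient reports mild chest pain since monday and nausea",
    "no acute distress noted on examination today",
]
print(round(redundancy(notes[1], notes[:1]), 3))
print(drop_redundant(notes))
```

    The same `redundancy` score, averaged over a patient record, gives one simple way to quantify how much of a corpus is repeated text.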

  7. A linear-RBF multikernel SVM to classify big text corpora.

    Science.gov (United States)

    Romero, R; Iglesias, E L; Borrajo, L

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper presents a multikernel SVM to manage high-dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists of splitting the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.

  8. Automatic extraction of property norm-like data from large text corpora.

    Science.gov (United States)

    Kelly, Colin; Devereux, Barry; Korhonen, Anna

    2014-01-01

    Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.

  9. Folklore in Antiquity

    Directory of Open Access Journals (Sweden)

    Galit Hasan-Rokem

    2018-05-01

    Full Text Available Folklore exists in all human groups, small and big. Since early modernity, scholars have provided various definitions of the phenomenon, but earlier texts may also reveal awareness of and reflection on the specific character of folklore. In this short article, we wish to explore the various definitions and characterizations of folklore given by ancient writers from various times and cultures. We will try to draw a cultural map of awareness of the phenomenon of folklore in ancient Near-Eastern texts, Greco-Roman culture, the Hebrew Bible, Early Christianity and Rabbinic literature. The main questions we wish to deal with are where and whether we can find explicit mention of folklore, which folk genres are dominant in ancient writings, and what the social context of ancient folklore was. That is to say, how those texts were integrated in social frameworks, enabling their users to gain power or to undermine existing cultural, theological and social structures.

  10. Fast and Effective Approximations for Summarization and Categorization of Very Large Text Corpora

    OpenAIRE

    Godbehere, Andrew B.

    2015-01-01

    Given the overwhelming quantities of data generated every day, there is a pressing need for tools that can extract valuable and timely information. Vast reams of text data are now published daily, containing information of interest to those in social science, marketing, finance, and public policy, to name a few. Consider the case of the micro-blogging website Twitter, which in May 2013 was estimated to contain 58 million messages per day: in a single day, Twitter generates a greater volume of...

  11. The Challenge of Folklore to Medieval Studies

    OpenAIRE

    John Lindow

    2018-01-01

    When folklore began to emerge as a valid expression of a people during the early stages of national romanticism, it did so alongside texts and artifacts from the Middle Ages. The fields of folklore and medieval studies were hardly to be distinguished at that time, and it was only as folklore began to develop its own methodology (actually analogous to medieval textual studies) during the nineteenth century that the fields were distinguished. During the 1970s, however, folklore adopted a wholly...

  12. Building and using comparable corpora

    CERN Document Server

    Sharoff, Serge; Zweigenbaum, Pierre; Fung, Pascale

    2013-01-01

    The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and stu

  13. Building Collections: Folklore

    Science.gov (United States)

    Krapp, JoAnn Vergona

    2005-01-01

    Folklore, the oldest form of storytelling, reflects the culture of a country, hence its nonfiction classification. Through these tales, one senses the values, the humor, and the lifestyles of its peoples. A powerful genre, folklore is the foundation on which high fantasy is created, epic films are produced, and a single story is passed from one…

  14. Text mining, a race against time? An attempt to quantify possible variations in text corpora of medical publications throughout the years.

    Science.gov (United States)

    Wagner, Mathias; Vicinus, Benjamin; Muthra, Sherieda T; Richards, Tereza A; Linder, Roland; Frick, Vilma Oliveira; Groh, Andreas; Rubie, Claudia; Weichert, Frank

    2016-06-01

    The continuous growth of medical sciences literature indicates the need for automated text analysis. Scientific writing, which is neither unitary, transcending social situation, nor defined by a timeless idea, is subject to constant change as it develops in response to evolving knowledge, aims at different goals, and embodies different assumptions about nature and communication. The objective of this study was to evaluate whether publication dates should be considered when performing text mining. A search of PUBMED for combined references to chemokine identifiers and particular cancer related terms was conducted to detect changes over the past 36 years. Text analyses were performed using freeware available from the World Wide Web. TOEFL scores of territories hosting institutional affiliations as well as various readability indices were investigated. Further assessment was conducted using Principal Component Analysis. Laboratory examination was performed to evaluate the quality of attempts to extract content from the examined linguistic features. The PUBMED search yielded a total of 14,420 abstracts (3,190,219 words). The range of findings in laboratory experimentation was coherent with the variability of the results described in the analyzed body of literature. Increased concurrence of chemokine identifiers together with cancer related terms was found at the abstract and sentence level, whereas complexity of sentences remained fairly stable. The findings of the present study indicate that concurrent references to chemokines and cancer increased over time whereas text complexity remained stable. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Folklore and Sociolinguistics

    Directory of Open Access Journals (Sweden)

    John Holmes McDowell

    2018-01-01

    Full Text Available Folklore and sociolinguistics exist in a symbiotic relationship; more than that, at points—in the ethnography of communication and in ethnopoetics, for example—they overlap and become indistinguishable. As part of a reaction to the formal rigor and social detachment of Chomsky’s theoretical linguistics, sociolinguistics emerges in the mid-twentieth century to assess the role of language in social life. Folklorists join the cause and bring to it a commitment to in-depth ethnography and a longstanding engagement with artistic communication. In this essay, I trace key phases in the development of this interdisciplinary movement, revolutionary in its reorientation of language study to the messy but fascinating realm of speech usage. I offer the concept of performative efficacy, the notion that expressive culture performances have the capacity to shape attitude and action and thereby transform perceived realities, as a means of capturing the continuing promise of a sociolinguistically informed folkloristics.

  16. The future of multimodal corpora O futuro dos corpora modais

    Directory of Open Access Journals (Sweden)

    Dawn Knight

    2011-01-01

    Full Text Available This paper takes stock of the current state-of-the-art in multimodal corpus linguistics, and proposes some projections of future developments in this field. It provides a critical overview of key multimodal corpora that have been constructed over the past decade and presents a wish-list of future technological and methodological advancements that may help to increase the availability, utility and functionality of such corpora for linguistic research.

  17. El folklore y sus paradojas

    Directory of Open Access Journals (Sweden)

    HONORIO M. VELASCO MAILLO

    1990-01-01

    Full Text Available One of the most salient features of the history of folklore in Spain and other European nations is its paradoxes. First proposed as a science, it came to be clearly rejected by later scientific circles. A social history of folklore would be of interest. This article suggests that these paradoxes and contradictions are related to the scientific paradigm its promoters adopted, cultural evolutionism, and to an idealized concept of "the people", which they helped to construct by presenting collections of materials. It also analyzes the different social functions that folkloric discourse has fulfilled.

  18. Topic Modeling of Hierarchical Corpora

    OpenAIRE

    Kim, Do-kyum

    2014-01-01

    The sizes of modern digital libraries have grown beyond our capacity to comprehend manually. Thus we need new tools to help us in organizing and browsing large corpora of text that do not require manually examining each document. To this end, machine learning researchers have developed topic models, statistical learning algorithms for automatic comprehension of large collections of text. Topic models provide both global and local views of a corpus; they discover topics that run through the co...

  19. Corpora from a sociolinguistic perspective Corpora sob uma perspectiva sociolinguística

    Directory of Open Access Journals (Sweden)

    Tyler Kendall

    2011-01-01

    Full Text Available In this paper, I consider the use of corpora in sociolinguistic research and, more broadly, the relationships between corpus linguistics and sociolinguistics. I consider the distinction between "conventional" and "unconventional" corpora (Beal et al. 2007a, b) and assess why conventional corpora have not had more traction in sociolinguistics. I then discuss the potential utility of corpora for sociolinguistic study in terms of the recent trajectory of sociolinguistic research interests (Eckert, under review), acknowledging that, while many sociolinguists are increasingly using more advanced corpus-based techniques, many are, at the same time, moving away from corpus-like studies. I suggest two primary areas where corpus developers, both sociolinguistic and non-, could focus to develop more useful corpora: corpora containing a wider range of non-standard (spoken) varieties, and more flexible annotation and treatment of spoken language data.

  20. Electronic folklore among teenagers: SMS messages

    Directory of Open Access Journals (Sweden)

    Cvjetićanin Tijana

    2006-01-01

    Full Text Available The development of ICT media made way for a new form of folklore communication. Newly developed media, such as mobile phones, make it possible for their users to participate in electronically mediated communication, thus approaching the form of oral communication. The exchange of a special type of SMS text message represents a new way of transmitting contemporary short folklore forms. These messages use poetic language, they have standard style themes, patterns and formulas, and they form different genres and categories corresponding with already existing familiar folklore forms. The communication process that happens during the exchange of these messages also has the characteristics of folklore: it takes place within small groups, the communication is informal, the texts circulate in chain style, and they undergo various transformations that generate variants, etc. This form of electronic folklore is especially popular among teenagers, where its social functions and meanings are also most emphasized. Within this population, it adds to an older tradition of children’s written folklore poetry albums. Like poetry albums, SMS exchange influences the development of girls’ gender identity, providing also a socially defined channel for contacts between the sexes. It also functions as a mechanism of socialization and stratification within the group. At the same time, it creates a new field of meaning, which derives from the very media’s novelty and significance. In this sense, the exchange of SMS represents a symbolic act of acknowledging one’s belonging to the group of mobile telephone users. In this way, a new phenomenon is being symbolically processed through a new form of folklore.

  1. Folklore in China: Past, Present, and Challenges

    Directory of Open Access Journals (Sweden)

    Juwen Zhang

    2018-04-01

    Full Text Available This article first outlines the long history of folklore collection in China, and then describes the disciplinary development in the 20th century. In Section 3, it presents the current situation in terms of disciplinary infrastructure, development, contribution, and challenge, with a focus on the recent practice of safeguarding Intangible Cultural Heritage. These accounts are largely based on the views of the Chinese folklorists. In the final section, this article discusses the issues of cultural continuity, integration, and self-healing mechanisms in Chinese culture by putting Chinese folkloristics in a historical and world perspective. This paper suggests that, to understand Chinese folklore and culture, one must be aware of the most basic differences between Chinese fundamental beliefs and values and those of the West, and that Chinese folklore and folkloristics present new challenges to the current paradigms put forward in the post-colonial, post-modern, and imperial ideologies.

  2. Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora

    Directory of Open Access Journals (Sweden)

    Abdulmohsen Al-Thubaity

    2014-01-01

    Full Text Available Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework.

  3. Linguistic Corpora and Lexicography.

    Science.gov (United States)

    Meijs, Willem

    1996-01-01

    Overviews the development of corpus linguistics, reviews the use of corpora in modern lexicography, and presents central issues in ongoing work aimed at broadening the scope of lexicographical use of corpus data. Focuses on how the field has developed in relation to the production of new monolingual English dictionaries by major British…

  4. Electronic Corpora as Translation Tools

    DEFF Research Database (Denmark)

    Laursen, Anne Lise; Mousten, Birthe; Jensen, Vigdis

    2012-01-01

    translator who has to get a cross-linguistic overview of a new area or a new line of business. Relevant internet texts can be compiled ‘on the fly’, but internet data needs to be sorted and analyzed for rational use. Today, such sorting and analysis can be made by a low-tech, analytical software tool....... This article demonstrates how strategic steps of compiling and retrieving linguistic data by means of specific search strategies can be used to make electronic corpora an efficient tool in translators’ daily work with fields that involve new terminology, but where the skills requested to work correspond...

  5. Working with corpora in the translation classroom

    Directory of Open Access Journals (Sweden)

    Ralph Krüger

    2012-10-01

    Full Text Available This article sets out to illustrate possible applications of electronic corpora in the translation classroom. Starting with a survey of corpus use within corpus-based translation studies, the didactic value of corpora in the translation classroom and their epistemic value in translation teaching and practice will be elaborated. A typology of translation practice-oriented corpora will be presented, and the use of corpora in translation will be positioned within two general models of translation competence. Special consideration will then be given to the design and application of so-called Do-it-yourself (DIY) corpora, which are compiled ad hoc with the aim of completing a specific translation task. In this context, possible sources for retrieving corpus texts will be presented and evaluated and it will be argued that, owing to time and availability constraints in real-life translation, the Internet should be used as a major source of corpus data. After a brief discussion of possible Internet research techniques for targeted and quality-focused corpus compilation, the possible use of the Internet itself as a macro-corpus will be elaborated. The article concludes with a brief presentation of corpus use in translation teaching in the MA in Specialised Translation Programme offered at Cologne University of Applied Sciences, Germany.

  6. Overcoming Legal Limitations in Disseminating Slovene Web Corpora

    Directory of Open Access Journals (Sweden)

    Tomaž Erjavec

    2016-09-01

    Full Text Available Web texts are becoming increasingly relevant sources of information, with web corpora useful for corpus linguistic studies and development of language technologies. Even though web texts are directly accessible, which substantially simplifies the collection procedure, compilation of web corpora is still complex, time-consuming and expensive. It is crucial that similar endeavours are not repeated, which is why it is necessary to make the created corpora easily and widely accessible both to researchers and a wider audience. While this is logistically and technically a straightforward procedure, legal constraints, such as copyright, privacy and terms of use, severely hinder the dissemination of web corpora. This paper discusses legal conditions and actual practice in this area, gives an overview of current practices and proposes a range of mitigation measures on the example of the Janes corpus of Slovene user-generated content in order to ensure free and open dissemination of Slovene web corpora.

  7. Avoid violence, rioting, and outrage; approach celebration, delight, and strength: Using large text corpora to compute valence, arousal, and the basic emotions.

    Science.gov (United States)

    Westbury, Chris; Keith, Jeff; Briesemeister, Benny B; Hofmann, Markus J; Jacobs, Arthur M

    2015-01-01

    Ever since Aristotle discussed the issue in Book II of his Rhetoric, humans have attempted to identify a set of "basic emotion labels". In this paper we propose an algorithmic method for evaluating sets of basic emotion labels that relies upon computed co-occurrence distances between words in a 12.7-billion-word corpus of unselected text from USENET discussion groups. Our method uses the relationship between human arousal and valence ratings collected for a large list of words, and the co-occurrence similarity between each word and emotion labels. We assess how well the words in each of 12 emotion label sets (proposed by various researchers over the past 118 years) predict the arousal and valence ratings on a test and validation dataset, each consisting of over 5970 items. We also assess how well these emotion labels predict lexical decision residuals (LDRTs), after co-varying out the effects attributable to basic lexical predictors. We then demonstrate a generalization of our method to determine the most predictive "basic" emotion labels from among all of the putative models of basic emotion that we considered. As well as contributing empirical data towards the development of a more rigorous definition of basic emotions, our method makes it possible to derive principled computational estimates of emotionality (specifically, of arousal and valence) for all words in the language.

  8. 06491 Summary -- Digital Historical Corpora- Architecture, Annotation, and Retrieval

    OpenAIRE

    Burnard, Lou; Dobreva, Milena; Fuhr, Norbert; Lüdeling, Anke

    2007-01-01

    The seminar "Digital Historical Corpora" brought together scholars from (historical) linguistics, (historical) philology, computational linguistics and computer science who work with collections of historical texts. The issues that were discussed include digitization, corpus design, corpus architecture, annotation, search, and retrieval.

  9. Automatic Dictionary Expansion Using Non-parallel Corpora

    Science.gov (United States)

    Rapp, Reinhard; Zock, Michael

    Automatically generating bilingual dictionaries from parallel, manually translated texts is a well established technique that works well in practice. However, parallel texts are a scarce resource. Therefore, it is desirable also to be able to generate dictionaries from pairs of comparable monolingual corpora. For most languages, such corpora are much easier to acquire, and often in considerably larger quantities. In this paper we present the implementation of an algorithm which exploits such corpora with good success. Based on the assumption that the co-occurrence patterns between different languages are related, it expands a small base lexicon. For improved performance, it also realizes a novel interlingua approach. That is, if corpora of more than two languages are available, the translations from one language to another can be determined not only directly, but also indirectly via a pivot language.
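
    The indirect route via a pivot language mentioned above can be illustrated with a minimal sketch. It is not the authors' algorithm (which relies on co-occurrence patterns in comparable corpora); it only shows how two small lexicons might be composed through a pivot. The function name `translate_via_pivot` and the toy German-English-French entries are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def translate_via_pivot(src_to_pivot, pivot_to_tgt):
    """Compose two toy lexicons through a pivot language.

    Each candidate target word is scored by the number of distinct
    pivot words that lead to it (a hypothetical, very simple score).
    """
    candidates = defaultdict(Counter)
    for src_word, pivot_words in src_to_pivot.items():
        for pivot_word in pivot_words:
            for tgt_word in pivot_to_tgt.get(pivot_word, ()):
                candidates[src_word][tgt_word] += 1
    # Rank candidates for each source word, best supported first.
    return {
        src: [word for word, _ in counts.most_common()]
        for src, counts in candidates.items()
    }


# Toy data (illustrative only): German -> English -> French.
german_to_english = {"Haus": ["house", "home"], "Hund": ["dog"]}
english_to_french = {
    "house": ["maison"],
    "home": ["maison", "foyer"],
    "dog": ["chien"],
}
german_to_french = translate_via_pivot(german_to_english, english_to_french)
```

    In this toy case "maison" is reached through two pivot words and is therefore ranked above "foyer"; a real system would combine such indirect evidence with direct corpus-based scores.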

  10. Women and the Study of Folklore.

    Science.gov (United States)

    Jordan, Rosan A.; De Caro, F. A.

    1986-01-01

    Presents a critical overview of academic writing on women and folklore, organized in three categories: (1) literature on images of women in verbal folklore, and the role of negative images in shaping attitudes; (2) research on women's oral genres and performance and female use of folklore; and (3) studies of women as folk performers and artists.…

  11. Folkloric Art in Egyptian Schools.

    Science.gov (United States)

    Osman, Siham

    1983-01-01

    Theories in art education with a western origin have been applied in Egypt to support the revival of folkloric art. There are three important phases in the teaching of a unit on applique, a decorative craft dating back to the earliest Egyptian history. (AM)

  12. THE COMPOSER AND FOLKLORE PROBLEM: FACTORS OF STYLISTIC STRUCTURE

    Directory of Open Access Journals (Sweden)

    COCEAROVA GALINA

    2017-12-01

    Full Text Available This paper continues the author's earlier study of the Composer and Folklore problem from the stylistic point of view. It is noted that in academic music, where attention is focused not only on speech or text characteristics but primarily on the linguistic and stylistic material of folklore, the appeal to folk sources leads to the emergence of a number of stylistic factors, both in the formation of the national style and in the field of ethnic culture as a whole, integral and stable system. The research points to the role of folklore as the genetic code of ethnic culture, as well as to other factors acting at the level of musical discourse and musical language, contributing to the formation of "language flexibility" (A. Kolmogorov) and, as a result, "flexibility of style".

  13. Importancia del folklore musical como práctica educativa

    Directory of Open Access Journals (Sweden)

    Arévalo, Azahara

    2009-06-01

    Full Text Available Today's educational community should reflect on the importance of musical folklore as an educational practice. This paper offers a reflection on folklore and different educational practices, taking as examples some musical pieces from the Jaen Song Book. Such practices are essential because they improve the quality of the learning process in general and of music learning in particular. Nowadays, the school is the unifying means for the reappraisal, communication and transmission of the folklore of our culture. Recovering our folklore is a task that depends on every member of the community, and it can be achieved by adapting these musical pieces to new social changes and spreading them through the media. The Jaen Song Book may serve as a means of promoting the province's folklore among its students. Learning this repertoire may also open a door for the world to see the work done in our schools. This paper tries to make teachers aware that the use of folk materials may improve the learning of music and may open new avenues for future didactic, cultural and anthropological research.

  14. Learner Corpora without Error Tagging

    Directory of Open Access Journals (Sweden)

    Rastelli, Stefano

    2009-01-01

    Full Text Available The article explores the possibility of adopting a form-to-function perspective when annotating learner corpora in order to gain deeper insights into systematic features of interlanguage. A split between forms and functions (or categories) is desirable in order to avoid the "comparative fallacy" and because, especially in basic varieties, forms may precede functions (e.g., what resembles a "noun" might have a different function) or a function may show up in unexpected forms. In the computer-aided error analysis tradition, all items produced by learners are traced to a grid of error tags which is based on the categories of the target language. By contrast, we believe it is possible to record and make retrievable both words and sequences of characters independently of their functional-grammatical label in the target language. For this purpose, at the University of Pavia we adapted a probabilistic POS tagger designed for L1 to L2 data. Despite the criticism that this operation can raise, we found that it is better to work with "virtual categories" rather than with errors. The article outlines the theoretical background of the project and shows some examples in which some of the potential of SLA-oriented (non-error-based) tagging may become clearer.

  15. BENTUK KARAKTER ANAK MELALUI DOKUMENTASI FOLKLOR LISAN KEBUDAYAAN LOKAL

    Directory of Open Access Journals (Sweden)

    Ranggi Ramadhani Ilminisa

    2016-06-01

    Full Text Available This research aims to document the myths, legends and fairy tales of Jombang and to develop this oral folklore into children's stories that carry character education. A descriptive qualitative method was used. The study yielded nine stories from several data collection sites covering northern, western, southern and central Jombang. These nine stories are documented and described in the findings. The oral folklore can thus serve as a reference for shaping children's character education.

  16. Corpora and historical linguistics Corpora e linguística histórica

    Directory of Open Access Journals (Sweden)

    Merja Kytö

    2011-01-01

    Full Text Available The present article aims to survey and assess the current state of electronic historical corpora and corpus methodology, and attempts to look into possible future developments. It highlights the fact that within the wide spectrum of corpus linguistic methodology, historical corpus linguistics has emerged as a vibrant field that has significantly added to the appeal felt for the study of language history and change. In fact, according to a historical linguist with more than fifty years of experience, "[w]e could even go as far as to say that without the support and new impetus provided by corpora, evidence-based historical linguistics would have been close to the end of its life-span in these days of rapid-changing life and research, increasing competition on the academic career track and the methodological attractions offered to young scholars" (RISSANEN, forthcoming). Historical corpora and other electronic resources have also made the study of language history attractive: working on them engages students in an individual and interactive way that they find appealing (CURZAN 2000, p. 81).

  17. KABA MALIN DEMAN: MENYIASATI DAMPAK DUA FALSAFAH MINANGKABAU DALAM FOLKLOR

    Directory of Open Access Journals (Sweden)

    Tienn Immerry

    2017-11-01

    Full Text Available Indonesian folktales are transmitted from one generation to the next by word of mouth. The change from oral to written form has in fact been a long process. Folktales carry the cultural values of a folk, a particular group of people. Research on folklore is one way to reveal the philosophy contained in the written manuscript. Two Minangkabau philosophies, the extinction philosophy and the marriage philosophy, are found in kaba Malin Deman; if an imbalance occurs, it creates problems in the society. Harmonization is the strategy for dealing with the imbalance and is also a function of folklore itself.

  18. Spoken corpora and pragmatics Corpora orais e pragmática

    Directory of Open Access Journals (Sweden)

    Massimo Moneglia

    2011-01-01

    Full Text Available The goal of this paper is to present arguments in favour of two points related to the study of oral corpora and pragmatics: (a) at the level of annotation, corpora must ensure the parsing of the speech flow into utterances on the basis of prosodic cues and provide easy access to the acoustic source; (b) at the level of sampling, corpora must ensure the maximum representation of context variation, rather than speaker variation. We will present the reasons which support the very basic prosodic annotation of speech (prosodic boundaries) as a means to obtain relevant data from the speech flow. Starting from our present knowledge about the distribution of speech act types in spoken corpora, we will present the reasons why building corpora in accordance with a context variation strategy should expand our knowledge of pragmatics. Additionally, we will claim that prosody is the necessary interface between locutive and illocutive acts and we will show that a deeper prosodic analysis is necessary to grasp unknown speech act types from language usage. Finally, we will briefly sketch the main assumptions of the Language into Act Theory (CRESTI, 2000), which is dedicated to the link between prosody and pragmatics and helps make explicit core aspects of pragmatic knowledge.

  19. An Evaluation of Folklore Events in Serbia in Terms of Tourism

    Directory of Open Access Journals (Sweden)

    Željko Bjeljac

    2016-02-01

    Full Text Available In Serbia there are many traditional events based on tradition, folklore, old customs and traditional crafts and trades. Folklore events are the oldest elements in the development of tourism and provide a sufficient motive for tourist visits. On the basis of their program content, these events can be divided into folklore and folk music festivals, festivals of folk customs, and children’s folklore festivals. This paper offers a categorization of folklore events according to economic and geographic criteria; particular attention has been given to events that already are, or have great potential for becoming, a major attraction of the tourist destination in question and can thus contribute to a faster and higher-quality development of tourism.

  20. An Interpretation of Two Oromo Folklore Genres Integrated to ...

    African Journals Online (AJOL)

    The purpose of this study was to analyze and interpret the meanings of two selected folklore genres namely: riddle and pastoral song portrayed in primary Oromo language student text books integrated to enhance the language skills, knowledge, attitude and cultural values of the children. Qualitative method was employed ...

  1. Uso de corpora na formação de tradutores Using corpora in translator training

    Directory of Open Access Journals (Sweden)

    Antonio P. Berber Sardinha

    2003-01-01

    Full Text Available This paper tackles the issue of using corpora in translator training, focussing more specifically on the question of awareness raising. The paper presents a discussion on the role of corpora in translation, their applicability in professional development, and their importance in leading to a better understanding of how language is constituted. Two example analyses are offered and detailed, so that they are applicable to contexts in which computational resources are scarce. The analyses center around the linguistic choices in a newspaper text translated into Portuguese and in the Brazilian version of a slogan from an American advertising campaign. It is suggested that these activities may be carried out with translation students, in such a way that they enable students, while they explore electronic corpora, to become aware of both the complexity and the specificity of the linguistic choices involved in the process of translation.

  2. The use of corpora in English writing classes

    Directory of Open Access Journals (Sweden)

    Paula Pinto Paiva

    2013-01-01

    Full Text Available This study aims at discussing aspects related to learner corpora and linguistic features found in texts written by English learners based on the use of collocations in text production. For this research, we analyzed collocations with the verb “to have” and with the nouns “prejudice” and “regret”.

  3. "Haunting experiences: Ghosts in contemporary folklore," by Diane E. Goldstein et al.

    Directory of Open Access Journals (Sweden)

    Linda Levitt

    2010-03-01

    Full Text Available Diane E. Goldstein, Sylvia Ann Grider, and Jeannie Banks Thomas. Haunting experiences: Ghosts in contemporary folklore. Logan: Utah State University Press, 2007, paperback, $24.95 (272p ISBN 978-0-87421-636-3.

  4. Use of English Corpora as a Primary Resource to Teach English to the Bengali Learners

    Science.gov (United States)

    Dash, Niladri Sekhar

    2011-01-01

    In this paper we argue in favour of teaching English as a second language to Bengali learners through the direct use of English corpora. The proposed strategy is meant to be computer-assisted and is based on data, information, and examples retrieved from present-day English corpora developed with various text samples composed by…

  5. Of Mermaids and Changelings: Human Rights, Folklore and Contemporary Irish Language Poetry

    Directory of Open Access Journals (Sweden)

    Rióna Ní Fhrighil

    2017-10-01

    Full Text Available This article investigates the intersection of human rights discourse, Irish folklore and contemporary Irish-language poetry. The author contends that contemporary Irish-language poets Louis de Paor and Nuala Ní Dhomhnaill exploit the multi-faceted nature of international folklore motifs, along with their local variants, to represent human rights violations in their poetry. Focusing specifically on the motif of the changeling in De Paor’s poetry and on the motif of the mermaid in Ní Dhomhnaill’s, the author traces how folklore material is reimagined in ways that eschew uncomplicated transnational solidarity but which engender empathetic settlement.

  6. Semantics, contrastive linguistics and parallel corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska

    2014-09-01

    Full Text Available In view of the ambiguity of the term “semantics”, the author shows the differences between traditional lexical semantics and contemporary semantics in the light of various semantic schools. She examines semantics differently in connection with contrastive studies, where the description must necessarily go from the meaning towards the linguistic form, whereas in traditional contrastive studies the description proceeded from the form towards the meaning. This requirement regarding theoretical contrastive studies necessitates the construction of a semantic interlanguage, rather than only singling out universal semantic categories expressed with various language means. Such studies can be strongly supported by parallel corpora. However, in order to make them useful for linguists in manual and computer translation, as well as in the development of dictionaries, including online ones, we need not only formal, often automatic, annotation of texts, but also semantic annotation, which is unfortunately manual. In the article we focus on semantic annotation concerning time, aspect and quantification of names and predicates in the whole semantic structure of the sentence, using the example of the “Polish-Bulgarian-Russian parallel corpus”.

  7. “Stories Like the Light of Stars”: Folklore and Narrative Strategies in the Fiction of Éilís Ní Dhuibhne

    Directory of Open Access Journals (Sweden)

    Giovanna Tallone

    2017-10-01

    Full Text Available Besides being one of Ireland’s best-known and most eminent writers, Éilís Ní Dhuibhne is also a professional and recognised folklorist and researcher, whose work covers a diversity of topics and subjects, mostly in the area of the tradition of oral storytelling and urban folklore. Her background in folklore has a significant impact on her fiction, which is marked by the reinvention of folklore patterns and the juxtaposition of ancient stories and their contemporary counterparts. The purpose of this essay is to shed light on the impact of folklore and folklore projects on the fiction of Éilís Ní Dhuibhne in terms of allusions, contents, discourse organization and narrative strategies. The tight link between folklore and storytelling in her writing is analysed taking into account her short stories vis-à-vis her academic work in folklore, focussing on Ní Dhuibhne’s awareness of the continuity of traditional narrative in time.

  8. Building gold standard corpora for medical natural language processing tasks.

    Science.gov (United States)

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre

    2012-01-01

    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released and, to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project.
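
    As a rough illustration of how an F-measure can express inter-annotator agreement, the sketch below treats one annotator's spans as the reference and the other's as the response, and counts only exact matches. The abstract does not state the matching criteria actually used, so the exact-match rule, the function name `agreement_f1` and the sample spans are assumptions.

```python
def agreement_f1(annotator_a, annotator_b):
    """F-measure between two annotators' span sets (exact match only).

    Each annotation is a (start, end, label) tuple. Annotator A is
    treated as the reference; because F1 is symmetric in precision
    and recall, swapping the annotators gives the same value.
    """
    a, b = set(annotator_a), set(annotator_b)
    if not a and not b:
        return 1.0  # nothing to annotate and nobody annotated anything
    matched = len(a & b)
    if matched == 0:
        return 0.0
    precision = matched / len(b)
    recall = matched / len(a)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: two of three annotations agree exactly.
spans_a = [(0, 5, "PHI"), (10, 15, "MEDICATION"), (20, 25, "DISEASE")]
spans_b = [(0, 5, "PHI"), (10, 15, "MEDICATION"), (30, 35, "DISEASE")]
score = agreement_f1(spans_a, spans_b)
```

    With these spans precision and recall are both 2/3, so the F-measure is also 2/3. Relaxed (overlap-based) matching would need a different comparison than the set intersection used here.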

  9. Folklore, creativity, and cultural memory

    DEFF Research Database (Denmark)

    Glaveanu, Vlad Petre

    This paper addresses the question of how folk art can be, simultaneously, a vehicle for cultural memory and cultural creativity. It takes the case of Romanian Easter egg decoration as a practice situated at the intersection between art, folklore, religion and a growing market, in order to unpack the role of tradition and creativity in the life of a rural community. Egg decoration is an old custom, with pre-Christian roots, practiced extensively in the historical region of Bucovina, and relying on a complex system of material artefacts and symbolic elements acquired and enacted by artisans usually... means the opposite of creativity but the actual vehicle of creative activity and its understanding as a stable cultural system ‘engraved’ in collective memory needs to be challenged. The tradition of egg decoration in Romania is a living and evolving social practice that engages the self and community...

  10. Slavic Phraseology: A View Through Corpora

    Directory of Open Access Journals (Sweden)

    Zakharov Victor

    2017-12-01

    Full Text Available The study of word collocability is one of the main tasks of linguistics. The combinatory ability of language units, collocability, is one of the syntagmatic laws of language. This phenomenon is the main object of phraseology and lexicography. The article deals with set phrases of different types in Russian, Czech and Slovak from the point of view of their quantitative evaluation. Corpus linguistics understands set phrases as statistically determined units. This approach is the starting point for different automatic ways of extracting idioms and collocations. The paper describes experiments which show how text corpora and corpus methods and tools can be used to expand the entries in existing dictionaries and how set phrases can be evaluated quantitatively. It is shown that corpus linguistics methods and tools make it possible to create dictionaries of a new type, which include a larger number of set phrases and collocations than before.
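
    One common way to treat set phrases as "statistically determined" is to score adjacent word pairs with an association measure. The sketch below uses pointwise mutual information (PMI); the abstract does not say which measures the authors used, so the choice of PMI, the function name `pmi_scores`, the `min_count` threshold and the toy sentence are all assumptions.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ), estimated from
    raw unigram and bigram frequencies. Pairs seen fewer than
    `min_count` times are skipped, since PMI is unreliable for them.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (illustrative only).
sample_tokens = "strong tea and strong tea with weak coffee".split()
collocations = pmi_scores(sample_tokens)
```

    On this toy input only the pair ("strong", "tea") passes the frequency threshold. On a real corpus the scored pairs would be ranked and filtered before being considered as dictionary candidates.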

  11. Nenets Folklore in Russian: The Movement of Culture in Forms and Languages

    Directory of Open Access Journals (Sweden)

    Karina Lukin

    2008-09-01

    Full Text Available This methodological article discusses the question of the authenticity of folklore material. It deals mainly with the research history of Nenets folklore studies and critically examines two of its paradigms, namely the so-called Finno-Ugric paradigm and Soviet studies. It is argued that these paradigms contained biases that prevented scholars from studying certain kinds of folklore material. The biases were related to the language and form of the material: because of them, folklore performed in languages other than Nenets, or in forms not defined as traditional, was left out of collections and research. Furthermore, it is shown that Russian speech and narratives embedded in speech are part of Nenets everyday communication and thus also material worth studying and collecting. Instead of the criticised paradigms, the Nenets discourse is examined within the framework of communication-centered studies, which have gained attention since the 1980s.

  12. Folklore anecdote between memorata and fabulata: Field research of Serbs in Medina (Hungary)

    Directory of Open Access Journals (Sweden)

    Ilić Marija

    2007-01-01

    Full Text Available This work is based on folklore material gathered during ethnolinguistic field research on the Serbian traditional lexicon and spiritual culture in the village of Medina in Hungary in 2002. The folklore material consists of accounts given by the informant Sava Sokic and can primarily be defined as a series of comic narrations. If we look at these narrations as a genre of oral speech within the context of the ethnolinguistic interview, we can notice the complex structure of this oral genre. That is, the genre functions as a memorate with typical openings and metatextual comments. On the other hand, it respects almost all the genre norms characteristic of the folklore anecdote. Therefore the comic narrations of Sava Sokic, and this holds for the folklore anecdote in general, can be classified as a borderline genre, between memorata and fabulata.

  13. Working with Corpora in the Translation Classroom

    Science.gov (United States)

    Krüger, Ralph

    2012-01-01

    This article sets out to illustrate possible applications of electronic corpora in the translation classroom. Starting with a survey of corpus use within corpus-based translation studies, the didactic value of corpora in the translation classroom and their epistemic value in translation teaching and practice will be elaborated. A typology of…

  14. Using Monolingual and Bilingual Corpora in Lexicography

    Science.gov (United States)

    Miangah, Tayebeh Mosavi

    2009-01-01

    Constructing and exploiting different types of corpora are among the computer applications available to researchers in different branches of science, including lexicography. In lexicography, different types of corpora may be of great help in finding the most appropriate uses of words and expressions by referring to numerous examples and citations.…

  15. Folklore and Folk Songs of Chittagong: A Critical Review

    Directory of Open Access Journals (Sweden)

    Amir Mohammad Khan

    2017-04-01

    Full Text Available Folk songs, which stem from folklore, are very rich in the southern region of Chittagong. In this part of the world folk songs play a pivotal role in people's lifestyle, as a heart-touching and heavenly connection exists between humans, nature and folk songs. Folk songs in this area are special because we found the theme of nature conservation in them. We took the southern part of Chittagong (Lohagara, Satkania, Chandanaish and Patiya) as our research area, selected a village, Chunati, by systematic sampling, and interviewed more than 100 people through focus group discussions and key informant interviews. A sufficient literature review was also carried out. People in this area love nature a lot. Music personalities have emerged here from time to time who not only worked for musical development but also raised people's awareness of the need to love and protect nature. We discuss the origin and patterns of folk songs to clarify the importance of the folk songs of Chittagong, both for their connection to folklore and for promoting the idea of nature conservation. This area of study deserves more attention from researchers. Our ultimate goal should be to conserve and promote the folk songs of Chittagong, with their long heritage, which will in turn enrich folklore and nature conservation.

  16. Primary diffuse large B-cell lymphoma of the corpora cavernosa presented as a perineal mass

    Directory of Open Access Journals (Sweden)

    González-Satué Carlos

    2012-01-01

    Full Text Available Primary male genital lymphomas appear rarely in the testis and exceptionally in the penis and prostate, but there is no previous evidence of a lymphoma arising from the corpora cavernosa. We report the first case in the literature of a primary diffuse large B-cell lymphoma of the corpora cavernosa, which presented with lower urinary tract symptoms, perineal pain and a palpable mass. Diagnosis was based on Tru-Cut biopsy, histopathological studies and computed tomography images.

  17. Pendayagunaan Folklor Sebagai Sumber Ekonomi Kreatif Di Daerah Tujuan Wisata Bali

    Directory of Open Access Journals (Sweden)

    I Nyoman Suarka

    2014-06-01

    Full Text Available Tourism practitioners in Bali commonly do not have an adequate understanding of the local culture, so the service given to tourists is less than optimal. Therefore, efforts to delve into the original culture through scientific research are necessary, as a source of information material and appreciation for developing the cultural outlook of tourism practitioners in Bali. This research aims to delve into, preserve and develop folklore with high cultural potential as a source of creative economy. This is qualitative research with a morphological-ethnographic approach which attempts to describe the narrative elements of folklore as a unified whole by considering its history in the community and its supporting culture. That is, besides looking at the lore aspect through the analysis of folklore structure, it also considers the folk aspect through the analysis of function and significance. Furthermore, this research focuses on the opportunity to use folklore as a source of creative economy, in addition to strengthening local wisdom and preventing cultural pollution resulting from the negative aspects of tourism and globalization.

  18. Guidelines for normalising Early Modern English corpora: Decisions and justifications

    Directory of Open Access Journals (Sweden)

    Archer Dawn

    2015-03-01

    Full Text Available Corpora of Early Modern English have been collected and released for research for a number of years. With large scale digitisation activities gathering pace in the last decade, much more historical textual data is now available for research on numerous topics including historical linguistics and conceptual history. We summarise previous research which has shown that it is necessary to map historical spelling variants to modern equivalents in order to successfully apply natural language processing and corpus linguistics methods. Manual and semiautomatic methods have been devised to support this normalisation and standardisation process. We argue that it is important to develop a linguistically meaningful rationale to achieve good results from this process. In order to do so, we propose a number of guidelines for normalising corpora and show how these guidelines have been applied in the Corpus of English Dialogues.
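
    The mapping of historical spelling variants to modern equivalents described above can be sketched as a simple lookup. This is only an illustration of the general idea, not the authors' guidelines or tooling; the variant table `VARIANTS` and the function name `normalise` are hypothetical, and real normalisation also has to handle context-dependent cases that a plain word list cannot.

```python
import re

# Hypothetical variant table: historical spelling -> modern equivalent.
VARIANTS = {"haue": "have", "loue": "love", "vertue": "virtue"}


def normalise(text, variants=VARIANTS):
    """Replace known historical spellings, keeping initial capitals."""

    def replace(match):
        word = match.group(0)
        modern = variants.get(word.lower())
        if modern is None:
            return word  # unknown words are left untouched
        return modern.capitalize() if word[0].isupper() else modern

    # Only alphabetic runs are looked up; punctuation is preserved.
    return re.sub(r"[A-Za-z]+", replace, text)


normalised = normalise("Loue and vertue haue no end.")
```

    Keeping the original token alongside the normalised form (rather than overwriting it) is usually preferable in a corpus, so that searches can target either spelling.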

  19. The Galileo Legend as Scientific Folklore.

    Science.gov (United States)

    Lessl, Thomas M.

    1999-01-01

    Examines the various ways in which the legend of Galileo's persecution by the Roman Catholic Church diverges from scholarly readings of the Galileo affair. Finds five distinct themes of scientific ideology in the 40 accounts examined. Assesses the part that folklore plays in building and sustaining a professional ideology for the modern scientific…

  20. New approaches for development, analyzing and security of multimedia archive of folklore objects

    Directory of Open Access Journals (Sweden)

    Galina Bogdanova

    2008-07-01

    Full Text Available We present new approaches used in the development of the demo version of a Web-based client/server system that contains an archival fund with folklore materials of the Folklore Institute at the Bulgarian Academy of Sciences (BAS). Some new methods for securing images and text by embedding watermarks in system data are presented. A digital watermark is a visible or perfectly invisible identification code that is permanently embedded in the data and remains present within the data after any decryption process. We have also developed improved tools and algorithms for analyzing the database.

  1. THE STRUCTURE OF POEM IN TALE KERINCI FOLKLORE

    Directory of Open Access Journals (Sweden)

    - Nazurty

    2015-06-01

    Full Text Available Tale is a form of folklore consisting of sung poems. This study aims to gain an in-depth understanding of the structure of the Tale poem in the release of the Kerinci pilgrims. This qualitative study employed content analysis with a structural approach. The results show that a Tale poem consists of a sampiran phrase, a rhyme/sound phrase, and content. It is composed of ten to twenty lines. It has an ab ab rhyme according to the sound phrase flanking each line. The sound expression serves to form rhyme and rhythm.

  2. ROMANIAN FOLKLORE MOTIFS IN FASHION DESIGN

    Directory of Open Access Journals (Sweden)

    MOCENCO Alexandra

    2014-05-01

    Full Text Available The traditional Romanian costume, like the whole of popular art (architecture, woodcarving, pottery, etc.), was born and has endured in our country since ancient times. Closely related to human existence, the traditional costume has reflected over the years, as it does today, the mentality and artistic conception of the people. Today the traditional Romanian costume has become a source of inspiration for designers in the wholesale fashion production industry, both Romanian and international. Although contemporary designers work in accordance with a vision, using a wide range of styles, methods and current technology, they usually return to traditional techniques and ethnic folklore motifs, which they convert and resize, integrating them into their contemporary space. Adrian Oianu is a highly regarded Romanian designer who launched two collections inspired by his native country's traditional costumes: “Suflecata pan’ la brau” (“Turned up ‘til the belt”) and “Bucurie” (“Joy”). Dorin Negrau took the traditional costume of the Bihor region as inspiration for his “Lost” collection. Yves Saint Laurent had a collection inspired by the Romanian traditional flax blouses called “La blouse roumaine”. The paper presents traditional Romanian values through fashion collections. The research activity will create innovative concepts to support the garment industry in developing its own brand and to bring design activities in Romania to an international level. The research was conducted during the initial stage of a project, financed through national funds, consisting of a documentary study on the ethnographic characteristics of the popular costume from different regions of the country.

  3. Multilingual text induced spelling correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a multilingual, language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from raw text corpora, without supervision, and contains word unigrams

  4. The Concept of Love in Lithuanian Folklore and Mythology

    Directory of Open Access Journals (Sweden)

    Doc dr. Daiva Šeškauskaitė

    2013-06-01

    Full Text Available Love is a reserved feeling, meaning amiability and complete mutual understanding. The concept of love has always been important for the human world outlook and attitude. Love can have different meanings and expressions: for some people it is a nice, warm feeling, a way of action and behaviour; for others it is nothing more than sexual attraction. As there are different perceptions of love, there also exist a few common manifestations of love: love can be maternal, childish, juvenile, sexual. Love can also be felt for a homeland, one's own nation, one's home. Naturally, love can be expressed through particular rituals, symbols and signs. Folk songs introduce four main characteristics of a lover: beauty, sweetness, kindness and boon. The latter feature means that a girl or boy is supposed to be well-set, pleasant and comfortable to touch, which is very important when choosing a wife or a husband. Love in folklore is expressed through common metaphorical and allegorical symbols; it does not appear as an explicit word but rather as a metaphor or epithet. Love in folklore can be perceived and felt very differently. Love like an action – love like... special person, essential possession. Prime personal characteristics, such as kindness, tenderness and humility, are the ones that light the fire of love, while beauty, artfulness and eloquence also help. Love is supposed to lead to the sacred sacrament of marriage. Love, if real, is a serious subject. Love is worth dying for. Strong love leads to self-sacrifice. Fairy tales satirize infidelity, stressing that love is right only between a wife and a husband, while other options are considered inglorious and wrong. Love as incest is also common in our folklore.

  5. Text Mining of Supreme Administrative Court Jurisdictions

    OpenAIRE

    Feinerer, Ingo; Hornik, Kurt

    2007-01-01

    Within the last decade text mining, i.e., extracting sensitive information from text corpora, has become a major factor in business intelligence. The automated textual analysis of law corpora is highly valuable because of its impact on a company's legal options and the raw amount of available jurisdiction. The study of supreme court jurisdiction and international law corpora is equally important due to its effects on business sectors. In this paper we use text mining methods to investigate Au...

  6. Russian Folklore as a Reflection of National Character in the Work of Boris Vysheslavtzev

    Directory of Open Access Journals (Sweden)

    Alex L. Nalepin

    2016-09-01

    Full Text Available The essay focuses on the spiritual crisis of Russian culture at the beginning of the 20th century and on the search for philosophical alternatives to overcome the crisis within the framework of Russian philosophical thought. In particular, it highlights the work of Boris P. Vysheslavtzev, a major thinker among Russian immigrants, and his studies of Russian folklore seen as a reflection of Russian national character. The essay introduces for the first time new data concerning the specificity of the choice that was highly important for Russian literature and culture as it was for Russian folklore studies.

  7. Folklore Music on Romanian TV. From State Socialist Television to Private Channels

    Directory of Open Access Journals (Sweden)

    Alexandra Urdea

    2014-06-01

    Full Text Available Music genres rooted in folklore have often been interpreted as ideological manoeuvres to forge a sense of national identity (Gordy, Mihailescu, Baker, Cash). This article explores formalized folklore performances of muzică populară as forms of ‘media rituals’ (Couldry), and focuses on the role that television has played in establishing the genre as we know it today. It analyses the link between muzică populară as rooted in mass participation activities during communism, and ‘media rituals’ as framed on television (Couldry), indiscriminately and democratically involving the entire population that it addresses (and available beyond that).

  8. Folklore and the Internet: The Challenge of an Ephemeral Landscape

    Directory of Open Access Journals (Sweden)

    Trevor J. Blank

    2018-05-01

    Full Text Available Through the lens of memetic folk humor, this essay examines the slippery, ephemeral nature of hybridized forms of contemporary digital folklore. In doing so, it is argued that scholars should not be distracted by the breakneck speed in which expressive materials proliferate and then dissipate but should instead focus on the overarching ways that popular culture and current news events infiltrate digital folk culture in the formation of individuals' cultural inventories. The process of transmission and variation that shapes the resulting hybridized folklore requires greater scrutiny and contextualization.

  9. Proposed framework for the evaluation of standalone corpora processing systems: an application to Arabic corpora.

    Science.gov (United States)

    Al-Thubaity, Abdulmohsen; Al-Khalifa, Hend; Alqifari, Reem; Almazrua, Manal

    2014-01-01

    Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework.
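
    The abstract above names N-gram profile generation as one point of comparison between systems. Purely as an illustration of what such a profile is, and not as a description of any of the evaluated systems, a minimal Python sketch follows. The function name `ngram_profile` and the whitespace tokenization are assumptions made for this example; real Arabic text would normally need a proper tokenizer and normalization.

```python
from collections import Counter

def ngram_profile(tokens, n=2):
    """Count contiguous n-grams (as tuples) in a list of tokens.

    Hypothetical helper for illustration; not taken from any evaluated system.
    """
    # zip over n shifted views of the token list yields every contiguous n-gram
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)

# Usage: naive whitespace tokenization (an assumption; works for any
# whitespace-delimited text, including Arabic script)
tokens = "the corpus and the corpus tools".split()
profile = ngram_profile(tokens, n=2)
top_bigrams = profile.most_common(3)
```

    Returning a `Counter` keeps the sketch short and makes frequency ranking available through `most_common`.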

  10. Using corpora in scientific and technical translation training: resources to identify conventionality and promote creativity

    Directory of Open Access Journals (Sweden)

    Clara Inés López-Rodríguez

    2016-06-01

    Full Text Available Since the first Corpus Use and Learning to Translate (CULT) Conference in Bertinoro (Italy) in 1997, the usefulness of corpora for translators and trainee translators has been highlighted. From an initial approach where translators compiled ad hoc corpora on their hard drives for subsequent study with lexical analysis software, there emerged a new trend towards the use of the Internet as corpus. In this second approach, the Web is perceived as a huge corpus which is accessed by means of online tools which produce monolingual wordlists and concordances from texts available from the Internet or pre-existing corpora, or by means of bilingual or multilingual concordancers displaying aligned texts from international institutions' parallel corpora. Bilingual concordancers and translation memories are widely used by translators and trainee translators because of the immediate translation solutions they offer, but these tools can restrain creativity by offering conventional solutions and eliminating layout and multimodal elements in texts. The aim of this article is to describe the exploitation of quality corpora in a scientific and technical translation course, focusing on texts on health translated from English into Spanish, and on terminological variation as a reflection of creativity in language.
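
    The abstract above mentions tools that produce monolingual wordlists and concordances. As a rough illustration of those two outputs only, and not of any tool used in the article, here is a minimal Python sketch. The function names `concordance` and `wordlist`, the `width` parameter and the regex-based tokenization are all assumptions made for this example.

```python
import re
from collections import Counter

def concordance(text, keyword, width=30):
    """Return keyword-in-context (KWIC) lines for each match of `keyword`.

    Hypothetical helper: matches the keyword as a whole word, ignoring case,
    and shows up to `width` characters of context on each side.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

def wordlist(text):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(re.findall(r"\w+", text.lower())).most_common()

# Usage on a tiny made-up sample text
sample = "The corpus is large. A corpus helps translators."
for line in concordance(sample, "corpus", width=12):
    print(line)
```

    Aligning the keyword in a fixed column is what lets a reader scan recurring left and right contexts quickly, which is the point of a concordance display.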

  11. Using corpora in scientific and technical translation training: resources to identify conventionality and promote creativity

    Directory of Open Access Journals (Sweden)

    Clara Inés López-Rodríguez

    2016-04-01

    Full Text Available http://dx.doi.org/10.5007/2175-7968.2016v36nesp1p88 Since the first Corpus Use and Learning to Translate (CULT) Conference in Bertinoro (Italy) in 1997, the usefulness of corpora for translators and trainee translators has been highlighted. From an initial approach where translators compiled ad hoc corpora on their hard drives for subsequent study with lexical analysis software, there emerged a new trend towards the use of the Internet as corpus. In this second approach, the Web is perceived as a huge corpus which is accessed by means of online tools which produce monolingual wordlists and concordances from texts available from the Internet or pre-existing corpora, or by means of bilingual or multilingual concordancers displaying aligned texts from international institutions' parallel corpora. Bilingual concordancers and translation memories are widely used by translators and trainee translators because of the immediate translation solutions they offer, but these tools can restrain creativity by offering conventional solutions and eliminating layout and multimodal elements in texts. The aim of this article is to describe the exploitation of quality corpora in a scientific and technical translation course, focusing on texts on health translated from English into Spanish, and on terminological variation as a reflection of creativity in language.

  12. Sparse Machine Learning Methods for Understanding Large Text Corpora

    Data.gov (United States)

    National Aeronautics and Space Administration — Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational...

  13. LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual

    OpenAIRE

    Diab, Mona; Habash, Nizar; Rambow, Owen; Roth, Ryan

    2013-01-01

    The Linguistic Data Consortium (LDC) has developed hundreds of data corpora for natural language processing (NLP) research. Among these are a number of annotated treebank corpora for Arabic. Typically, these corpora consist of a single collection of annotated documents. NLP research, however, usually requires multiple data sets for the purposes of training models, developing techniques, and final evaluation. Therefore it becomes necessary to divide the corpora used into the required data sets...

  14. Subdomain sensitive statistical parsing using raw corpora

    NARCIS (Netherlands)

    Plank, B.; Sima'an, K.

    2008-01-01

    Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics, music), which implies that the statistics gathered by current statistical parsers are mixtures of subdomains of language

  15. Corpora in Language Teaching and Learning

    Science.gov (United States)

    Boulton, Alex

    2017-01-01

    This timeline looks at explicit uses of corpora in foreign or second language (L2) teaching and learning, i.e. what happens when end-users explore corpus data, whether directly via concordancers or integrated into CALL programs, or indirectly with prepared printed materials. The underlying rationale is that such contact provides the massive…

  16. Tula song folklore: genre-stylistic and dialectic peculiarities

    Directory of Open Access Journals (Sweden)

    Krasovskaya Nelli Alexandrovna

    2016-06-01

    Full Text Available The article analyzes works of Tula folklore recorded in the western part of the Tula region in terms of genre, stylistic and linguistic features. The relevance of the study lies in the fact that Tula folk songs have not been studied and the linguistic features of the works have not been subjected to serious analysis. The article describes the genre features of songs recorded in the Belevsky district of the Tula region, including ancient fortunetelling chants, wedding ceremony songs, romantic ballads, etc., and cites numerous examples from the lyrics that reflect dialectal features at the phonetic, grammatical and lexical levels. According to the authors, the modern folk song genre retains its diversity and is a kind of storeroom containing priceless linguistic wealth. The analysis supports conclusions about the presence and good preservation of South Russian dialect phonetic and grammatical features in the recorded songs. So far, there is no established typology of Tula dialects; therefore, according to the authors, the recording of folklore in the territories bordering on Tula dialects is very important and interesting for further descriptive and comparative work on identifying the eastern and south-south-west differences in Tula dialects.

  17. NETWORK FOLKLORE AND ITS ROLE IN THE FORMATION OF A COLLECTIVE COGNITIVE SPACE

    Directory of Open Access Journals (Sweden)

    Anastasija Belovodskaja

    2014-04-01

    Full Text Available The global implementation of information-communicative technologies into every sphere of human activity is being accompanied by the emergence of new forms of communication, leading to inevitable changes in the means of both the representation and reception of information. In this respect, the field of interest encompasses research into modern anonymous network creative writing, which, as a result of the technological qualities of the Internet space, produces such texts that require particular skills in both comprehension and reproduction. In turn, the products of network folklore, as they spontaneously spread on the Internet, acquire the status of particular signs of a precedent nature. At the same time, the very nature of anonymous network creative writing—amusing and colloquial—raises the attractiveness of such texts and facilitates their reception, allowing them to be used for manipulative aims. The fact that such network folklore can influence the process of idea-formation in society is predetermined by the fact that, by definition, it is the milieu where collective representations are condensed and transmitted. Thus, network folklore is in the focus of attention not only in folklore studies, but is extremely topical for research in such fields as cognitive science, linguistic-cultural studies, public relations, speech effect, and any others which take interest in the processes of keeping, receiving, and transmitting information.

  18. Contrasting Specific English Corpora: Language Variation

    Directory of Open Access Journals (Sweden)

    María Luisa Carrió Pastor

    2009-12-01

    Full Text Available The scientific community has traditionally considered technical English as neutral and objective, able to transmit ideas and research in simple sentences and specialized vocabulary. Nevertheless, global communication and intense information delivery have produced a range of different ways of knowledge transmission. Although technical English is considered an objective way to transmit science, writers of academic papers use some words or structures with different frequency in the same genre. As a consequence, contrastive studies of the use of second languages have been attracting increasing scholarly attention. In this research, we show that variation in language production is a reality and can be demonstrated by contrasting corpora written by native and non-native writers of English. The objectives of this paper are, first, to detect language variation in a technical English corpus; second, to identify the parts of the sentence that are more sensitive to variation; and finally, to show the non-standardisation of technical English. In order to fulfil these objectives, we analysed a corpus of fifty scientific articles written by native speakers of English and fifty scientific articles written by non-native speakers of English. The occurrences were classified and counted in order to detect the most common variations. Further analysis indicated that the variations were caused by mother tongue interference in virtually all cases, although meaning was only very rarely obscured. These findings suggest that the use of certain patterns and expressions originating from L1 interference should be considered as correct as standard English.

  19. The adversative connectives aber and but in conversational corpora.

    Science.gov (United States)

    Gülzow, Insa; Bartlitz, Victoria; Kuehnast, Milena; Golcher, Felix; Bittner, Dagmar

    2018-03-09

    We analyzed the conversational corpora of two German and two English children to investigate how the different use types of the adversative connectives aber and but influence the probability of monologically versus dialogically constructed utterances in the first year of use. Our findings show that children produce adversative connectives mainly in dialogic structures for illocutionary and theme-management purposes, but that the use types of adversative connectives lead to a different distribution of monologic and dialogic clause combinations. The results suggest that monologic and dialogic realizations as a function of text type must be considered when describing the developmental trajectory of the different use types of adversative connectives.

  20. POLTERGEIST PHENOMENA IN CONTEMPORARY FOLKLORE

    Directory of Open Access Journals (Sweden)

    Oana VOICHICI

    2017-05-01

    Full Text Available The article deals with instances of the supernatural in Romanian urban legends, namely what we call the strigoi, or poltergeist. Usually, folklorists tend to exclude the supernatural from the category of urban legends; however, we have decided to take these accounts into consideration because the transmitters, the narrators, do not distinguish between these elements and the rest of contemporary legends, and today’s popular culture abounds in such accounts.

  1. Corpora and corpus technology for translation purposes in professional and academic environments. Major achievements and new perspectives

    Directory of Open Access Journals (Sweden)

    Cécile Frérot

    2016-06-01

    Full Text Available The “use” of corpora and concordancers in translation teaching has grown increasingly attractive since the mid-1990s, with an abundant literature advocating their use and promoting their benefits in the translation classroom. In translator training, efforts are being made to incorporate the use of corpora and concordancers in master's programmes and to offer specific modules on corpora for translation, as the use of translation memory (TM) systems within Computer-Aided Translation (CAT) courses still dominates. In the translation profession, while TM systems are part of the everyday working environment, the same cannot be said of corpora and concordancers, even though the most recent surveys show that professional translators would like to learn more about the potential of corpora for translation. Overall, the “usefulness” of corpora and corpus technology at the different stages of the translation process remains poorly documented in translation, but a growing number of empirical studies have started to address the issue, as it has now become of paramount importance to assess the extent to which corpora are of added value for translation quality in both professional and academic environments.

  2. Folklore Epistemology: How Does Traditional Folklore Contribute to Children's Thinking and Concept Development?

    Science.gov (United States)

    Agbenyega, Joseph S.; Tamakloe, Deborah E.; Klibthong, Sunanta

    2017-01-01

    This research utilised a "stimulated recall" methodology [Calderhead, J. 1981. "Stimulated Recall: A Method for Research on Teaching." "British Journal of Educational Psychology" 51: 211-217] to explore the potential of African folklore, specifically Ghanaian folk stories in the development of children's reflective…

  3. Discovery learning in the language-for-translation classroom: corpora as learning aids

    Directory of Open Access Journals (Sweden)

    Silvia Bernardini

    2016-06-01

    Full Text Available This contribution reviews the idea of discovery learning with corpora, proposed in the 1990s, evaluating its potential and its implications with reference to the education of translators today. The rationale behind this approach to data-driven learning, combining project-based and form-focused instruction within a socio-constructivistically inspired environment, is discussed. Examples are also provided of authentic, open-ended learning experiences, thanks to which students of translation share responsibility for the development of corpora and their consultation, and teachers can abandon the challenging role of omniscient knowledge providers and wear the more honest hat of "learning experts". Beyond the more straightforward uses of corpora in courses that aim to develop thematic, technological and information mining competences (i.e., courses in which training is offered in the use of corpora as professional aids), attention is focused on foreign language teaching for translators and on corpora as learning aids, highlighting their potential for the development of the three other European Master's in Translation (EMT) competences (translation service provision, language and intercultural ones).

  4. Rileggendo “Folklore e profitto”. Patrimoni immateriali, mercati, turismo [Rereading “Folklore and Profit”: Intangible heritage, markets, tourism]

    Directory of Open Access Journals (Sweden)

    Letizia Bindi

    2014-04-01

    Full Text Available Starting from the anticipatory notes of Luigi M. Lombardi Satriani’s Folklore e profitto [1973], the paper seeks to critically articulate the relation between cultural heritage, the capitalist market and mass media, extending the analysis to the most recent uses of media in promoting and valorizing such traditions. What emerges is a twist of cultural heritage toward consumerism that poses new challenges and questions for anthropologists and cultural heritage scholars, and calls for a late-modern rethinking of critical categories such as commodification, alienation and fetishization. Finally, a central question arises about who and what the social actors should be today who are asked to decide about these processes of cultural manipulation in the new post-industrial and globalized scenario, characterized, inter alia, by a generalized economic crisis.

  5. Pedagogical Application of Specialized Corpora in ESP Teaching: the case of the UVaSTECorpus

    Directory of Open Access Journals (Sweden)

    Pedro A. Fuertes-Olivera

    2015-11-01

    Full Text Available This article contributes to defining the concept of specialized corpora, reviews the rationale for using them instead of general corpora in teaching activities, and presents the state of the art in both corpus-based and corpus-driven approaches to ESP teaching. It also explains some decisions taken regarding the compilation of the University of Valladolid Corpus of Written Scientific and Technical English and illustrates some uses of the corpus. In particular, it presents some tasks with concordances and argues that ESP students should be taught the niceties of lexical gender, as it is a grammatical category with social and/or ideological implications.

  6. From the Problems of Dictionaries and Multi-lingual Corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska-Toszewa

    2015-06-01

    Full Text Available The article describes the work on a number of dictionaries being developed by the Corpus Linguistics and Semantics Group of the Institute of Slavic Studies PAS. They include the “Contemporary Bulgarian-Polish Dictionary”, the “Bulgarian-Polish Online Dictionary” and the “Russian-Bulgarian-Polish Dictionary”. The dictionaries differ in their numbers of entries, as well as in the degree of their connection with the parallel corpora being elaborated under the “Clarin” project. All the discussed dictionaries are similar in their use of traditional syntactic classifiers and of semantic classifiers, the latter introduced for the first time in existing lexicographical practice. Thanks to the “Polish-Bulgarian-Russian Corpus”, the Group has managed to verify the results of contrasting Polish and Bulgarian in the light of scope-based logical quantification. Thanks to the Russian material added to the trilingual corpus, the researchers have managed to confirm that, from the viewpoint of “incomplete quantification”, Russian and Polish (synthetic languages) behave similarly and are opposed to the analytic Bulgarian.

  7. Developing intonation corpora for isiXhosa and isiZulu

    CSIR Research Space (South Africa)

    Govender, N

    2005-11-01

    Full Text Available also show how those corpora can be used without further interpretation to gain insight into matters such as overall pitch contours and gender differences, and discuss the additional steps that will be required to create truly generative models from...

  8. From Annotated Multimodal Corpora to Simulated Human-Like Behaviors

    DEFF Research Database (Denmark)

    Rehm, Matthias; André, Elisabeth

    2008-01-01

    Multimodal corpora prove useful at different stages of the development process of embodied conversational agents. Insights into human-human communicative behaviors can be drawn from such corpora. Rules for planning and generating such behavior in agents can be derived from this information... And even the evaluation of human-agent interactions can rely on corpus data from human-human communication. In this paper, we exemplify how corpora can be exploited at the different development steps, starting with the question of how corpora are annotated and on what level of granularity. The corpus data...

  9. [Folklore and popular medicine in the Amazon].

    Science.gov (United States)

    Henrique, Márcio Couto

    2009-01-01

    This discussion of the relations between folklore and popular medicine in the Amazon takes Canuto Azevedo's story "Filhos do boto" (Children of the porpoise) as an analytical reference point. Replete with elements of cultural reality, folk tales can serve as historical testimonies expressing clashes between different traditions. Folk records are fruit of what is often a quarrelsome dialogue between folklorists, social scientists, physicians, and pajés and their followers, and their analysis should take into account the conditions under which they were produced. Based on the imaginary attached to the figure of the porpoise--a seductive creature with healing powers--the article explores how we might expand knowledge of popular medicine as practiced in the Amazon, where the shamanistic rite known as pajelança cabocla has a strong presence.

  10. Mergelės Marijos ir akmens sąsajos lietuvių folklore [Links between the Virgin Mary and stone in Lithuanian folklore]

    OpenAIRE

    Kairaitytė, Aušra

    2008-01-01

    The object of this article is the relation between stone and the Blessed Virgin Mary. The aim is to define the functions of stone in narratives about the Mother of God in Lithuanian folklore, revealing the place of stone during the advent of Mary and finding parallels in the traditions of different Catholic countries. The aim is achieved by applying text analysis and comparative methods. Lithuanian folk stories tell us about growing or walking stones. [...] The other group consists of storie...

  11. Corpora and Language Assessment: The State of the Art

    Science.gov (United States)

    Park, Kwanghyun

    2014-01-01

    This article outlines the current state of and recent developments in the use of corpora for language assessment and considers future directions with a special focus on computational methodology. Because corpora began to make inroads into language assessment in the 1990s, test developers have increasingly used them as a reference resource to…

  12. POLÍTICAS DE LA REPRESENTACIÓN DEL FOLKLORE EN LOS MUSEOS FOLKLÓRICOS/Folklore representation policies in folk museums

    Directory of Open Access Journals (Sweden)

    Ana María Dupey

    2012-11-01

    Full Text Available This work deals with the invention and reinvention of folk museums. It analyzes the political purposes and the reasons put forward for the establishment of folk museums, and who the agents of these inventions/reinventions were: whether they were the product of state institutions or arose from movements of elites or minority groups belonging to civil society. It also explains how representations of folklore are semanticized to represent local, regional, national and transnational collective identities. It analyzes the current reorientations of these institutions, which stem from (a) the processes of decolonization (external and internal), with their economic, political, social and cognitive consequences; (b) the critiques of the colonialist and classist analyses developed in the past by Ethnology and Folklore, disciplines that informed the respective museographic discourses; and (c) the revision of the definition of the museum as an institution.

  13. Childbirth in ancient Rome: from traditional folklore to obstetrics.

    Science.gov (United States)

    Todman, Donald

    2007-04-01

    In ancient Rome, childbirth was a hazardous event for both mother and child with high rates of infant and maternal mortality. Traditional Roman medicine centred on folklore and religious practices, but with the development of Hippocratic medicine came significant advances in the care of women during pregnancy and confinement. Midwives or obstetrices played an important role and applied rational scientific practices to improve outcomes. This evolution from folklore to obstetrics was a pivotal point in the history of childbirth.

  14. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word

  15. Differences in motor abilities between dancers in professional and amateur folklore ansambles

    Directory of Open Access Journals (Sweden)

    Kocić Jadranka

    2014-01-01

    Full Text Available Differences in motor abilities between dancers in the Serbian professional folklore ensemble for dance and song 'Kolo' in Belgrade and the amateur folklore ensembles of the culture and arts societies 'Vila' and 'Sonja Marinković' in Novi Sad were tested on a sample of 47 members. Motor abilities were examined using the tests of the Provincial Government Institute for Sport in Novi Sad, yielding nine variables: single movement speed, explosive strength of the lower extremities (legs), endurance in jumping, absolute strength of the back flexor muscles, relative strength of the back flexor muscles, absolute strength of the back extensor muscles, relative strength of the back extensor muscles, absolute strength of the back flexor muscles, relative strength of the back flexor muscles. Relative values were calculated mathematically from the absolute values. Multivariate analysis of variance (MANOVA) was used to determine differences between folklore dancers across the whole system of variables. Differences between the sexes in motor abilities were also determined. Data were processed with the statistical package SPSS 10.0. The aim was to find significant differences in the nine variables between professional and amateur dancers and between the sexes. The results showed no significant differences between professional and amateur dancers. Between the sexes there were significant differences in favour of men, except for one variable, single movement speed. The conclusion is that, for better, statistically significant results, professional dancers should broaden the content and increase the intensity of their training.

  16. Use of monolingual and comparable corpora in the classroom to translate adverbial connectors

    Directory of Open Access Journals (Sweden)

    Beatriz Sánchez Cárdenas

    2016-04-01

    This research explored the reasons why certain adverbial discourse connectors, apparently easy to translate, are a source of translation problems that cannot be easily resolved with a bilingual dictionary. Moreover, this study analyzed the use of parallel corpora in the translation classroom and how it can increase the quality of text production. For this purpose, we compared student translations before and after receiving training on the use of corpus analysis tools.

  17. Danish TV Christmas calendars: Folklore, myth and cultural history

    DEFF Research Database (Denmark)

    Agger, Gunhild

    2013-01-01

    This article aims at characterizing the Danish Christmas calendar as a TV institution and a meeting place for the traditions of the almanac, folklore and the history of culture. Against the background of a brief outline of the history of Danish Christmas calendars, the article explores ways in which this traditional genre has succeeded in renewing itself. The so-called Pyrus series, TV 2’s Christmas calendars during the mid-1990s, exhibited folklore, myth and cultural history in a combination of entertainment and information. They were succeeded by calendars such as Jul i Valhal...

  18. When phonetics matters: creation and perception of female images in song folklore

    Directory of Open Access Journals (Sweden)

    Stashko Halyna

    2017-06-01

    Full Text Available This paper presents a stylistic analysis of female images in American song folklore in order to examine how sound symbolic language elements contribute to the construction of verbal images. The results obtained show the link between sound and meaning and how such phonetic means of stylistics as assonance, alliteration, and onomatopoeia function to reinforce the meanings of words or to set the mood typical of the characters. Their synergy helps create and interpret female images and provides relevant atmosphere and background to them in folk song texts.

  19. Citation Matching in Sanskrit Corpora Using Local Alignment

    Science.gov (United States)

    Prasad, Abhinandan S.; Rao, Shrisha

    Citation matching is the problem of finding which citation occurs in a given textual corpus. Most existing citation matching work is done on scientific literature. The goal of this paper is to present methods for performing citation matching on Sanskrit texts. Exact matching and approximate matching are the two methods for performing citation matching. The exact matching method checks for exact occurrence of the citation with respect to the textual corpus. Approximate matching is a fuzzy string-matching method which computes a similarity score between an individual line of the textual corpus and the citation. The Smith-Waterman-Gotoh algorithm for local alignment, which is generally used in bioinformatics, is used here for calculating the similarity score. This similarity score is a measure of the closeness between the text and the citation. The exact- and approximate-matching methods are evaluated and compared. The methods presented can be easily applied to corpora in other Indic languages like Kannada, Tamil, etc. The approximate-matching method can in particular be used in the compilation of critical editions and plagiarism detection in a literary work.
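    The two matching modes described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses plain Smith-Waterman scoring with a linear gap penalty rather than the Gotoh affine-gap variant named in the abstract, and the function names, scoring weights, and the 0.8 threshold are all assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between strings a and b.

    Simplification (assumption): a linear gap penalty is used here instead
    of the affine gap penalty of the Gotoh variant.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(citation, line, match=2):
    """Alignment score divided by the best score the citation could reach."""
    if not citation:
        return 0.0
    return local_alignment_score(citation, line, match=match) / (match * len(citation))


def find_citation(citation, corpus_lines, threshold=0.8):
    """Return (line_number, score) pairs for exact or approximate matches."""
    hits = []
    for number, line in enumerate(corpus_lines, start=1):
        # Exact matching is a substring test; otherwise fall back to the
        # approximate similarity score for that individual line.
        score = 1.0 if citation in line else similarity(citation, line)
        if score >= threshold:
            hits.append((number, round(score, 3)))
    return hits
```

    Called as find_citation(citation, corpus_lines), the sketch reports an exact occurrence with score 1.0 and a line that differs by a dropped or altered character with a score slightly below 1.0; the threshold decides how much variation still counts as a match.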

  20. Collocation lists as instruments for metaphor detection in corpora Listas de colocações como instrumentos para detecção de metáforas em corpora

    Directory of Open Access Journals (Sweden)

    Tony Berber Sardinha

    2006-01-01

    Full Text Available This paper reports a study on the use of collocation lists as instruments for detecting metaphors in corpora. A collocation list contains the collocations for selected words in corpora together with concordances for those words. As corpora become more available to metaphor researchers, there is a growing need for developing ways to gain access to as much data as the corpus can offer. The research described here has hopefully come some way toward meeting the challenges of developing tools for metaphor corpus research. Results suggest that the collocation lists seem to be a good pre-processing instrument for corpus research of metaphor, despite accuracy problems.

  1. American Folk Music and Folklore Recordings 1985: A Selected List.

    Science.gov (United States)

    Library of Congress, Washington, DC. American Folklife Center.

    Thirty outstanding records and tapes of traditional music and folklore which were released in 1985 are described in this illustrated booklet. All of these recordings are annotated with liner notes or accompanying booklets relating the recordings to the performers, their communities, genres, styles, or other pertinent information. The items are…

  2. Dissemination of Values and Culture through the E-Folklore

    Science.gov (United States)

    Rahim, Normaliza Abd; Affendi, Nik Rafidah Nik Muhammad; Pawi, Awang Azman Awang

    2017-01-01

    This study focuses on the values and culture in the e-folklore. The objectives of the study were to identify and discuss the values in the song lyric "The Stork and the Mouse Deer." The song was taken from phone application in the compilation of the "Kingfisher stories" copyrighted by Dewan Bahasa and Pustaka. The e-folklore…

  3. Early Years Education and the Value for Money Folklore

    Science.gov (United States)

    Campbell-Barr, Verity

    2012-01-01

    This article is intended as a contribution to the debate on the role of human capital in determining value for money in early years education. The article explores how the idea that early years education offers value for money has become folklore amongst policymakers and more widely. However, drawing on both interview data and existing literature…

  4. "Early baby teeth": Folklore and facts

    Directory of Open Access Journals (Sweden)

    N Uma Maheswari

    2012-01-01

    Full Text Available Variations in the newborns' oral cavity have been an enduring interest to the pediatric dentist. The occurrence of natal and neonatal teeth is a rare anomaly, which for centuries has been associated with diverse superstitions among many different ethnic groups. Natal teeth are more frequent than neonatal teeth, the ratio being approximately 3:1. The purpose of this case report is to review the literature related to the natal teeth folklore and misconceptions and discuss their possible etiology and treatment.

  5. A Set of Annotation Interfaces for Alignment of Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Singh Anil Kumar

    2014-09-01

    Full Text Available Annotation interfaces for parallel corpora which fit in well with other tools can be very useful. We describe a set of annotation interfaces which fulfill this criterion. This set includes a sentence alignment interface, two different word or word group alignment interfaces and an initial version of a parallel syntactic annotation alignment interface. These tools can be used for manual alignment, or they can be used to correct automatic alignments. Manual alignment can be performed in combination with certain kinds of linguistic annotation. Most of these interfaces use a representation called the Shakti Standard Format that has been found to be very robust and has been used for large and successful projects. It ties together the different interfaces, so that the data created by them is portable across all tools which support this representation. The existence of a query language for data stored in this representation makes it possible to build tools that allow easy search and modification of annotated parallel data.

  6. Corpora amylacea in temporal lobe epilepsy associated with hippocampal sclerosis

    Directory of Open Access Journals (Sweden)

    Ribeiro Marlise de Castro

    2003-01-01

    Full Text Available Hippocampal sclerosis (HS) is the commonest pathology in epileptic patients undergoing temporal lobe epilepsy surgery. In addition, an increased density of corpora amylacea (CA) is found in 6 to 63% of those cases. OBJECTIVE: To verify the presence of CA and the clinical correlates of their occurrence in a consecutive series of patients undergoing temporal lobe surgery with a diagnosis of HS. METHOD: We reviewed 72 hippocampus specimens from January 1997 to July 2000. Student's t test for independent samples, ANOVA and the Tukey test were performed for statistical analysis. RESULTS: CA were found in 35 patients (49%), whose mean epilepsy duration (28.7 years) was significantly longer than that of the group of patients without CA (19.5 years, p = 0.001). In addition, when CA were found, duration was also significantly correlated with their distribution within the hippocampus: 28.7 years with diffuse distribution of CA, 15.4 years with exclusively subpial distribution and 17.4 years with subpial plus perivascular distribution (p = 0.001). CONCLUSION: Our findings corroborate the presence of CA in patients with HS and suggest that a longer duration of epilepsy correlates with a more diffuse distribution of CA in the hippocampus.

  7. FOLKLORE STUDIES AND NATIONALISM IN TURKEY TÜRKİYE’DE FOLKLOR ÇALIŞMALARI VE MİLLİYETÇİLİK

    Directory of Open Access Journals (Sweden)

    İlhan BAŞGÖZ

    2011-09-01

    Full Text Available Interest in folklore began in Turkey in the second half of the nineteenth century when the need was felt to forge a national language which could be understood by the majority. The Tanzimat reforms, which were introduced in 1839, inaugurated a functional change in Ottoman literature. A new generation of writers who were in contact with the West, especially France, and admired the economic, social, and educational institutions of Europe, soon realized that literature played an important role in the development of these institutions. The aim of creating a literature in the language of the "common people," which was pure Turkish and unspoiled by foreign influences, made the Tanzimat writers interested in folklore and folk literature. Many other poets, novelists, playwrights, and intellectuals joined the movement between 1860 and 1900. The emergence of Turkish nationalism marked a new era in the attitude of intellectuals toward folklore, and it was Boratav who introduced folklore to Turkey as an independent, scientific discipline. He enlarged the scope of folklore teaching and research to include verbal and nonverbal tradition.

  8. The transformation of contemporary analyses of oral folklore: Fairy tale versus fantasy

    Directory of Open Access Journals (Sweden)

    Otčenášek Jaroslav

    2010-01-01

    Full Text Available The study focuses on contemporary forms of folklore and their relationship to literary forms like Fantasy, Sci-fi, Horror and Fantasy Game. The first problem is the specification of the terms and the classification of the internal structure of these terms. A typical structure of contemporary oral folklore, such as urban legends, is a combination of classical forms of folklore (subject matter from fairy tales, anecdotes etc.) and the influence of films, television and books. This contamination is really typical for postmodern culture. Fantasy stories can be divided into five categories - 1. alternative history (variants of past history or future evolution); 2. classical fantasy (variants of mythology or classical fairy tales or legends); 3. parody of fantasy or humour fantasy (the fantasy world is mostly only background); 4. urban fantasy (more or less a part of urban legend); 5. comics (the importance of graphic form - Superman, Batman etc.). Sci-fi and horror stories are mostly literary products influenced by classical legends or urban legends. Party games, especially “Dungeons & Dragons”, and their enactments by fans are a special part of the fantasy world. Ethnologists are faced with the questions of which method to use to carry out field research and what is actually relevant. Based on the first experiences we can see that for the research into this “new” field we can use the standard methods without problems. But for a better understanding we need to read fantasy, sci-fi and horror books, watch fantasy, sci-fi and horror movies, and get acquainted with the websites related to fantasy or sci-fi content. For a good analysis of fantasy party games one needs to become a member of a gamers’ group. The use of modern recording equipment like digital video cameras and cameras etc. is also very important.

  9. Folklore information from Assam for family planning and birth control.

    Science.gov (United States)

    Tiwari, K C; Majumder, R; Bhattacharjee, S

    1982-11-01

    The author collected folklore information on herbal treatments to control fertility from different parts of Assam, India. Temporary methods of birth control include Cissampelos pareira L. in combination with Piper nigrum L., root of Mimosa pudica L. and Hibiscus rosa-sinensis L. Plants used for permanent sterilization include Plumbago zeylanica L., Heliotropium indicum L., Salmalia malabarica, Hibiscus rosa-sinensis L., Plumeria rubra L., Bambusa arundinacea. Abortion is achieved through use of Osbeckia nepalensis or Carica papaya L. in combination with resin from Ferula narthex Boiss. It is concluded that there is tremendous scope for the collection of folklore about medicine, family planning agents, and other treatments from Assam and surrounding areas. Such a project requires proper understanding between the survey team and local people, tactful behavior, and a significant amount of time. Monetary rewards can also be helpful for obtaining information from potential respondents.

  10. Combinatorial and compositional aspects of bilingual aligned corpora

    NARCIS (Netherlands)

    Martzoukos, S.

    2016-01-01

    The subject of investigation of this thesis is the building blocks of translation in Statistical Machine Translation (SMT). We find that these building blocks, namely phrase-level dictionary entries, which are extracted from bilingual aligned corpora (training data), admit richer structure than

  11. Promoting free dialog video corpora: the IFADV corpus example

    NARCIS (Netherlands)

    van Son, R.J.J.H.; Wesseling, W.; Sanders, E.; van den Heuvel, H.; Kipp, M.; Martin, J.C.; Paggio, P.; Heylen, D.

    2009-01-01

    Research into spoken language has become more visual over the years. Both fundamental and applied research have progressively included gestures, gaze, and facial expression. Corpora of multi-modal conversational speech are rare and frequently difficult to use due to privacy and copyright

  12. Some Benefits of Corpora as a Language Learning Tool

    Science.gov (United States)

    Marjanovic, Tatjana

    2012-01-01

    What this paper is meant to do is share illustrations and insights into how English learners and teachers alike can benefit from using corpora in their work. Arguments are made for their multifaceted possibilities as grammatical, lexical and discourse pools suitable for discovering ways of the language, be they regularities or idiosyncrasies. The…

  13. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Information extraction (IE); text mining; text repositories; knowledge discovery from .... general purpose English words. However ... of precision and recall, as extensive experimentation is required due to lack of public tagged corpora. 4. Mining ...

  14. Pre-Modern Bosom Serpents and Hippocrates' Epidemiae 5: 86: A Comparative and Contextual Folklore Approach

    Directory of Open Access Journals (Sweden)

    Davide Ermacora

    2016-03-01

    Full Text Available A short Hippocratic passage (Epidemiae 5: 86) might constitute the earliest Western surviving variant of the well-known narrative and experiential theme of snakes or other animals getting into the human body (motif B784, tale-type ATU 285B*). This paper aims: (1) to throw light on this ancient passage through a comparative folkloric analysis and through a philological-contextual study, with reference to modern and contemporary interpretations; and (2) to offer an examination of previous scholarly enquiries on the fantastic intrusion of animals into the human body. In medieval and post-medieval folklore and medicine, sleeping out in the field was dangerous: snakes and similar animals could, it was believed, crawl into the sleeper’s body through the ears, eyes, mouth, nostrils, anus and vagina. Comparative material demonstrates, meanwhile, that the thirsty snake often entered the sleeper’s mouth because of its love of milk and wine. I will argue that while Epidemiae 5: 86 is modelled on this long-standing legendary pattern, for which many interesting literary pre-modern (and modern) parallels exist, its relatively precise historical and cultural framework can be efficiently analysed. The story is embedded in a broad set of Graeco-Roman ideas and practices surrounding ancient beliefs about snakes and attitudes to the drinking of unmixed wine.

  15. Folklore and traditional ecological knowledge of geckos in Southern Portugal: implications for conservation and science

    Directory of Open Access Journals (Sweden)

    Vila-Viçosa Carlos M

    2011-09-01

    Full Text Available Traditional Ecological Knowledge (TEK) and folklore are repositories of large amounts of information about the natural world. Ideas, perceptions and empirical data held by human communities regarding local species are important sources which enable new scientific discoveries to be made, as well as offering the potential to solve a number of conservation problems. We documented the gecko-related folklore and TEK of the people of southern Portugal, with the particular aim of understanding the main ideas relating to gecko biology and ecology. Our results suggest that local knowledge of gecko ecology and biology is both accurate and relevant. As a result of information provided by local inhabitants, knowledge of the current geographic distribution of Hemidactylus turcicus was expanded, with its presence reported in nine new locations. It was also discovered that locals still have some misconceptions of geckos as poisonous and carriers of dermatological diseases. The presence of these ideas has led the population to a fear of and aversion to geckos, resulting in direct persecution being one of the major conservation problems facing these animals. It is essential, from both a scientific and conservationist perspective, to understand the knowledge and perceptions that people have towards the animals, since, only then, may hitherto unrecognized pertinent information and conservation problems be detected and resolved.

  16. Folklore motives in the early compositions of Nikola Borota - Radovan

    Directory of Open Access Journals (Sweden)

    Jovanović Jelena

    2014-01-01

    Full Text Available The creative work of Nikola Borota - Radovan (musician, composer, lyricist, arranger and record producer, based in New Zealand, formerly from Yugoslavia) held a specific place in the development of world music (polygenre) in his native homeland in the early 1970s. This study focuses on his creative principles, applied to works published between the years 1970 and 1975 (while the role of these works in the social, cultural and political context of the time and place will be elaborated in another study, see Jovanović 2014). The platform established to present this unique musical approach authentically was called kamen na kamen (a studio and stage outfit that has included a number of collaborations over many years). Based on the musical models and aesthetics of the folk revival and created under the influence of The Beatles, in addition to many other popular music production directions of the era, Borota’s works reveal significant musical, performance and production qualities, innovative expression and musical solutions that need to be perceived from the contemporary (ethno)musicological point of view. Despite the fact that many prominent creative Yugoslav musicians of the time also worked within a similar framework, I would argue that Mr. Borota’s creative output was significantly different from other Yugoslav popular music creative efforts. This is particularly noticeable in the author’s unique treatment of South-European and other folklore motives, which is the main topic of this study. Folk (ethnic) idioms exploited by Mr. Borota in his compositions originate from the rural traditions of western Dinaric regions. This is especially true for the rhythmic formations of deaf or silent dance; for the semi-urban and urban tradition of the Balkans and the Mediterranean; Middle European traditions; traditions from non-European peoples; elements of Italian Renaissance; and international (mostly Anglo-American) musical models. Compositions are analysed partly in

  17. The transformation of contemporary analyses of oral folklore: Fairy tale versus fantasy

    OpenAIRE

    Otčenášek Jaroslav

    2010-01-01

    The study focuses on contemporary forms of folklore and their relationship to literary forms like Fantasy, Sci-fi, Horror and Fantasy Game. The first problem is the specification of the terms and the classification of the internal structure of these terms. A typical structure of contemporary oral folklore, such as urban legends, is a combination of classical forms of folklore (subject matter from fairy tales, anecdotes etc.) and the influence of films, television and books. This contami...

  18. Corporate Secretarial Bilingual Activity: An English Teaching Proposal Based on Corpora Directed to the Secretaries

    Directory of Open Access Journals (Sweden)

    José Roberto Lourenço

    2015-07-01

    Full Text Available This article presents part of research conducted in the field of Corpus Linguistics about the use of corpora in English Language Teaching specifically directed to corporate secretarial activities. The study, developed at the doctoral level, had FATEC-SP students as voluntary respondents to a questionnaire on corporate secretarial activities; the responses identified "Reading, Preparation and Presentation of Administrative Report" as one of the most important and frequent secretarial activities. We present a model of practice in English Teaching with an initial focus on "Company History, Strategies and Structure".

  19. AHP 10: Folklore: Bear and Rabbit (I)

    Directory of Open Access Journals (Sweden)

    G.yu lha གཡུ་ལྷ།

    2011-06-01

    Full Text Available G.yu lha writes: I recorded this folktale from Thub bstan (b. 1936), the reincarnate lama in Siyuewu Village (Puxi Township, Rangtang County, Aba Tibetan and Qiang Autonomous Prefecture, Sichuan Province), when I visited him in the winter of 2009-2010. Thub bstan learned this folktale from his mother. I heard this tale when I was around six years old from my great grandfather when my family was having dinner near the stove one evening.

  20. The Value in Verifying Medical Folklore

    Directory of Open Access Journals (Sweden)

    Dennis J. Baumgardner

    2017-08-01

    Full Text Available Citing a related article published within this issue of the Journal of Patient-Centered Research and Reviews, the author opines on why traditional ideas regarding human health can persist over decades, and even centuries, despite a lack of scientifically accumulated evidence. It is important to keep in mind that some commonly accepted truths are supported by little to no factual data, and that occasionally patients may benefit from clarification on what is (or, often, is not) actually known about longstanding “rules of thumb” (eg, certain home remedies, disease-prevention measures or behavioral concerns). On the flip side, traditions that are shown to be not harmful, like drinking chicken soup to relieve cold symptoms, may be safely indulged regardless of effectiveness.

  1. Automatically Extracting Typical Syntactic Differences from Corpora

    NARCIS (Netherlands)

    Wiersma, Wybo; Nerbonne, John; Lauttamus, Timo

    We develop an aggregate measure of syntactic difference for automatically finding common syntactic differences between collections of text. With the use of this measure, it is possible to mine for differences between, for example, the English of learners and natives, or between related dialects. If

  2. “The Foresight to Become a Mermaid”: Folkloric Cyborg Women in Éilís Ní Dhuibhne’s Short Stories

    Directory of Open Access Journals (Sweden)

    Rebecca Graham

    2017-10-01

    Full Text Available Éilís Ní Dhuibhne is both a folklorist and a feminist, who “took an interest in rewriting or re-inventing women’s history, a history which had been largely unwritten” (Ní Dhuibhne, “Negotiating” 73. Folklore stories and motifs abound in her writing. Elke D’hoker argues that Ní Dhuibhne reimagines and rewrites folktales to “reflect and interpret the social values and attitudes of a postmodern society” (D’hoker 137. The repurposing of folklore allows Ní Dhuibhne to interrogate some of the complex and controversial ways that Irish society has attempted to represent and control women, entrenching taboos about female behaviours and sexualities. Using Donna Haraway’s cyborg feminism and Karen Barad’s deployment of Haraway’s theory of diffraction, this article focuses on issues of voice and orality, and the female body in “The Mermaid Legend”, “Midwife to the Fairies”, and “Holiday in the Land of Murdered Dreams”, to argue that Ní Dhuibhne’s repurposing of folklore is a radically feminist undertaking. All three short stories, which feature female protagonists, reveal diverse, transgressive, sexual mothers and maidens whose symbolic connections with folklore allow them to challenge the restrictive constructions of women in Irish society, creating spaces to explore alternative, heterogeneous, feminist re-conceptions of identity and belonging.

  3. Folklore as historical and cultural legacy of the lower Volga region in the first third of the XXth century: B.S. Laschilin, A.M. Listopadov

    Directory of Open Access Journals (Sweden)

    Rodionova Olga Igorevna

    2013-11-01

    Full Text Available This article considers the folklore phenomenon in the folk art of the Lower Volga Region in the first third of the 20th century. The research places particular emphasis on Cossack subject matter. The role of B.S. Laschilin and A.M. Listopadov in collecting and publishing folk art, the folklore of the Don Cossacks, is revealed. Boris Stepanovitch Laschilin’s work had a great impact on the artistic life of our region. B.S. Laschilin’s books, which were published in Rostov-on-Don, Saratov, and Stalingrad-Volgograd, contained tales, fairy tales, bylinas, legends, songs, ditties, proverbs, sayings, ancient dramas of the first Russian folk theatres, and exorcisms. Boris Stepanovitch kept selecting songs, ditties, and chastooshkas for the Voronezh Folk Choir “Voronezh girls”, which are still in the repertoire of the Pyatnitsky Russian Folk Chorus. The folklorist and musician Alexander Mikhailovich Listopadov, who collected and studied folk songs from his youth and recorded them in the hamlets and Cossack villages of the Don Region, spent more than 50 years of his life researching the musical culture of the Don Cossacks. Alexander Mikhailovich Listopadov’s heritage made an important contribution to the study of native musical folklore. Folklore compositions are a unique source of knowledge of history, way of life, and moral and other national concepts, which allows us to reconstruct the linguistic personality of a definite historical epoch.

  4. Folklore Traditions in Contemporary Everyday Life: Between Continuity and (Re)construction (based on two examples from the Czech Republic)

    Czech Academy of Sciences Publication Activity Database

    Uhlíková, Lucie; Pavlicová, M.

    2014-01-01

    Vol. 62, No. 2 (2014), pp. 163-181 ISSN 1335-1303 Institutional support: RVO:68378076 Keywords: folklore * folklorism * ethno-cultural tradition * social construction * everyday life * the Czech Republic Subject RIV: AC - Archeology, Anthropology, Ethnology

  5. The Nearly Forgotten Malay Folklore: Shall We Start with the Software?

    Science.gov (United States)

    Abd Rahim, Normaliza

    2014-01-01

    The study focuses on the nearly forgotten Malay folklore in Malaysia. The objectives of the study were to identify and discuss the types of Malay folklore among primary school learners. The samples of the study were 100 male and female students at schools in Selangor. The samples were picked at random from several schools and they were given…

  6. Vodú Chic: Haitian Religion and the Folkloric Imaginary in Socialist Cuba

    Directory of Open Access Journals (Sweden)

    Grete Viddal

    2012-12-01

    Full Text Available During the first three decades of the twentieth century, hundreds of thousands of Haitian agricultural laborers arrived in Cuba seeking employment in the expanding sugar industry. Historically, Haitian cane cutters were marginal and occupied the lowest socio-economic status in Cuban society. Until relatively recently, the maintenance of Haitian spiritual beliefs, music, dance, and language in Cuba was associated with rural isolation and poverty. Today, however, the continuation of Haitian customs is no longer linked with isolation but with exactly the opposite: performance troupes, heritage festivals, art exhibitions, the circulation of religious specialists, collaborations with research centers and academia, endorsement by music promoters, and the tourism industry. Cubans of Haitian heritage have found innovative ways to transform the abject into the exotic, and are currently gaining a public voice in cultural production, particularly through folkloric performance.

  7. El narco-folklore: narrativas e historias de la droga en la frontera

    Directory of Open Access Journals (Sweden)

    Howard Campbell

    2007-01-01

    Full Text Available What the United States government has called the “War on Drugs” rests on the idea that the use and trafficking of narcotics are unequivocally harmful and dangerous activities that the country’s population will fear and reject. However, the results of ethnographic studies on the United States-Mexico border indicate that drug trafficking has become so common an activity that it has generated its own style of subculture, including music and folklore. To date, anthropological studies of narco-culture on the border have focused on narcocorridos, a genre of popular Mexican music that celebrates and narrates the narcotics trade and the lives of high-level traffickers. These studies provide valuable insights into the inner workings of drug organizations and the cultural context from which they emerge. However, most workers in the drug trade are not the superheroes or rich bandits portrayed in narcocorridos. They are ordinary people whose main motivation for becoming involved in the world of narcotics is economic survival. The image of a rich drug-trafficking folklore has become a common profile in the El Paso / Ciudad Juárez border region. This ethnographic study shows how this trade has become a “normal” part of daily life. The everyday folklore surrounding drug trafficking indicates the degree to which the drug trade affects border residents on multiple levels.

  8. Regulation of the corpora allata in male larvae of the cockroach Diploptera punctata

    International Nuclear Information System (INIS)

    Paulson, C.R.

    1986-01-01

    The regulation of corpora allata was studied in final instar males of Diploptera punctata. The glands were manipulated in vivo and removed to determine the effect by in vitro radiochemical assay for juvenile hormone synthesis. Corpora allata were also treated with putative regulatory factors in vitro. During the final stadium the corpora allata were inhibited both by nerves and by humoral factors. Neural inhibition was shown by an increase in juvenile hormone synthesis following denervation of the corpora allata. This operation elicited an extra larval instar. Humoral inhibition was shown by the decline in juvenile hormone synthesis of adult female corpora allata following transplantation into final instar larval hosts, and conversely the increase in juvenile hormone synthesis by larval corpora allata following implantation into adult females. Humoral inhibition was prevented by decapitation of larvae prior to the head critical period for molting and restored by implantation of a larval brain, showing that the brain is the source of this inhibition.

  9. "Sempre tivemos mulheres nos cantos e nas cordas": uma pesquisa sobre o lugar feminino nas corporações musicais

    Directory of Open Access Journals (Sweden)

    Mayara Pacheco Coelho

    2014-04-01

    Full Text Available This article is part of an intervention-research project on music and its identity articulations in the music corporations of the Campos das Vertentes region, especially São João del-Rei and neighboring towns. In this region, music plays a significant role in shaping the cultural identity of citizens and in the history of the municipalities. The present study investigates gender determinations, seeking to understand how women musicians participate in the bands and orchestras of the region. To this end, archaeological discourse analysis was used to contrast the statements of women musicians with those of male musicians in the corporations, and also with the male voices present in philosophy and with utopian discourse on women. It was observed that traditional gender differences remain concealed in the everyday life of the music corporations. However, it was also observed that women musicians are beginning to be recognized in the corporations and, above all, recognize themselves as capable of taking flight within them.

  10. How Can We Use Corpus Wordlists for Language Learning? Interfaces between Computer Corpora and Expert Intervention

    Science.gov (United States)

    Chen, Yu-Hua; Bruncak, Radovan

    2015-01-01

    With the advances in technology, wordlists retrieved from computer corpora have become increasingly popular in recent years. The lexical items in those wordlists are usually selected, according to a set of robust frequency and dispersion criteria, from large corpora of authentic and naturally occurring language. Corpus wordlists are of great value…

  11. WARCProcessor: An Integrative Tool for Building and Management of Web Spam Corpora

    Directory of Open Access Journals (Sweden)

    Miguel Callón

    2017-12-01

    Full Text Available In this work we present the design and implementation of WARCProcessor, a novel multiplatform integrative tool for building scientific datasets to facilitate experimentation in web spam research. The application lets the user specify multiple criteria that change the way new corpora are generated, while reducing the number of repetitive and error-prone tasks involved in maintaining existing corpora. To this end, WARCProcessor supports up to six commonly used data sources for web spam research and can store the output corpus in standard WARC format together with complementary metadata files. Additionally, the application facilitates the automatic and concurrent download of web sites from the Internet, allowing the user to configure the depth of the links to be followed as well as the behaviour when redirected URLs appear. WARCProcessor provides both an interactive GUI and a command-line utility that can run in the background.

  12. Microsyntactic Annotation of Corpora and its Use in Computational Linguistics Tasks

    Directory of Open Access Journals (Sweden)

    Iomdin Leonid

    2017-12-01

    Full Text Available Microsyntax is a linguistic discipline dealing with idiomatic elements whose important properties are strongly related to syntax. In a way, these elements may be viewed as transitional entities between the lexicon and the grammar, which explains why they are often underrepresented in both of these resource types: the lexicographer fails to see such elements as full-fledged lexical units, while the grammarian finds them too specific to justify the creation of individual well-developed rules. As a result, such elements are poorly covered by linguistic models used in advanced modern computational linguistic tasks like high-quality machine translation or deep semantic analysis. A possible way to mend the situation and improve the coverage and adequate treatment of microsyntactic units in linguistic resources is to develop corpora with microsyntactic annotation, closely linked to specially designed lexicons. The paper shows how this task is solved in the deeply annotated corpus of Russian, SynTagRus.

  13. Language and folklore in Hamid Mosaddeq’s poem

    Directory of Open Access Journals (Sweden)

    IRAN

    2016-02-01

    Full Text Available Abstract: "Standard language", "sub-standard language" and "meta-standard language" are the language types of many varieties. The use of sub-standard language in poetry, known as “stylistic deviation”, is one of the ways of highlighting poetic language. In the contemporary period, Nima paid more attention to this technique. Nima believed that all words have the potential to enter the realm of poetry: no word is essentially poetic or non-poetic; rather, the way the poet uses a word determines its poetic value. By using sub-standard language elements, Hamid Mosaddeq not only enriched his poems but also brought them closer to the mind, language and life of the people. The folkloric elements of Mosaddeq’s poems were divided into seven groups: (1) slang words, (2) common and spoken vocabulary, (3) irony and proverbs, (4) popular pronunciations, (5) allusions to folk tales, (6) folk beliefs and customs, and (7) local vocabulary. Slang words in Mosaddeq’s poems were examined in the categories of "verb" and "noun". Many folk verbs, such as "Shangidan" and "gap zadan" (to chat), are used in Mosaddeq’s poems. Some folk verbs in his poems are such that one cannot grasp the point at first. These verbs have several meanings, one or more of which are slang; for example, the verb "gereftan" (to get) has a slang sense when it means "to grow the root of the plant". Folk nouns are abundant in Mosaddeq’s poems. Some of the nouns used there can, in view of their figurative meanings, be placed in the folk-noun group, like "foot" in the figurative sense of "will". Colloquial and current words are among the most frequent folk elements in Mosaddeq’s poetry. These words can be analyzed in the categories of "nouns" and "verbs". Lexical verbs such as "to hip" and "Perfume of Moskow" are of this kind. "Irony and Proverbs" are the other folk elements of Mosaddeq’s poetry. "till eye can see

  14. “Not a Thing of the Past”, Zora Neale Hurston and the Living Legacy of Folklore « Not a Thing of the Past », Zora Neale Hurston et le legs vivant du folklore

    Directory of Open Access Journals (Sweden)

    Margaret Gillespie

    2009-11-01

    Full Text Available An important though atypical author of the Harlem Renaissance and the first African-American anthropologist to study her own culture, Zora Neale Hurston is in many respects an exceptional writer. Unlike others, including Robert Wright and Alain Locke, Hurston in no way disowns the cultural legacy of black folklore, which she appreciates according to her own criteria and which would influence both the form and the substance of her art. Although trained as an anthropologist, Hurston approaches the black American culture of the South not as a vestige of the past to be carefully preserved intact, but as an integral part of present-day lived experience. Through the vernacular oral discursive strategies that she adopts and adapts from the African-American folkloric tradition, Hurston, as a pioneer, opens a path and gives a voice to the Black writers to come.

  15. Blood transfusion and resuscitation using penile corpora: an experimental study.

    Science.gov (United States)

    Abolyosr, Ahmad; Sayed, M A; Elanany, Fathy; Smeika, M A; Shaker, S E

    2005-10-01

    To test the feasibility of using the penile corpora cavernosa for blood transfusion and resuscitation purposes. Three male donkeys were used for autologous blood transfusion into the corpus cavernosum during three sessions with a 1-week interval between each. Two blood units (450 mL each) were transfused per session to each donkey. Moreover, three dogs were bled up until a state of shock was produced. The mean arterial blood pressure decreased to 60 mm Hg. The withdrawn blood (mean volume 396.3 mL) was transfused back into their corpora cavernosa under 150 mm Hg pressure. Different transfusion parameters were assessed. The Assiut faculty of medicine ethical committee approved the study before its initiation. For the donkey model, the mean time of blood collection was 12 minutes. The mean time needed to establish corporal access was 22 seconds. The mean time of blood transfusion was 14.2 minutes. The mean rate of blood transfusion was 31.7 mL/min. Mild penile elongation with or without mild penile tumescence was observed on four occasions. All penile shafts returned spontaneously to their pretransfusion state at a maximum of 5 minutes after cessation of blood transfusion. No extravasation, hematoma formation, or color changes occurred. Regarding the dog model, the mean rate of transfusion was 35.2 mL/min. All dogs were resuscitated at the end of the transfusion. The corpus cavernosum is a feasible, simple, rapid, and effective alternative route for blood transfusion and venous access. It can be resorted to whenever necessary. It is a reliable means for volume replacement and resuscitation in males.

  16. Conservation Implications of the Prevalence and Representation of Locally Extinct Mammals in the Folklore of Native Americans

    Directory of Open Access Journals (Sweden)

    Preston Matthew

    2009-01-01

    Full Text Available Many rationales for wildlife conservation have been suggested. One rationale not often mentioned is the impact of extinctions on the traditions of local people, and conservationists’ subsequent need to strongly consider culturally based reasons for conservation. As a first step in strengthening the case for this rationale, we quantitatively examined the presence and representation of eight potentially extinct mammals in the folklore of 48 Native American tribes that live or lived near 11 national parks in the United States. We aimed to confirm whether these extinct animals were traditionally important species for Native Americans. At least one-third of the tribes included the extinct mammals in their folklore (N=45 of 124), and about half of these accounts featured the extinct species with positive and respectful attitudes, especially the carnivores. This research has shown that mammals that might have gone locally extinct have been prevalent and important in Native American traditions. Research is now needed to investigate whether these extinctions have had, or might have, any effects on traditions. Regardless, given even the possibility that the traditions of local people might be adversely affected by the loss of species, conservationists might need to consider not only all the biological reasons to conserve, but also cultural ones.

  17. La aldea fantasma: Problemas en el estudio del folklore y la cultura popular contemporáneos

    Directory of Open Access Journals (Sweden)

    Díaz G. Viana, Luis

    2003-06-01

    Full Text Available The author analyzes the problems involved in the study of folklore and popular culture in a contemporary world that is transnational and hybrid, apparently different from what the object/subject of study was supposed to be. Nevertheless, he argues that the type of urban legends we can gather today through the Internet does not differ from traditional materials such as legends, games or mores, since they speak (as those did) of people trying to make sense of an ever-changing and mixed world.

  18. Ande-Ande Lumut: Adaptasi Folklor ke Teater Epik Brecht

    Directory of Open Access Journals (Sweden)

    Philipus Nugroho Hari Wibowo

    2013-11-01

    and Japan. Adaptation theory is developing well; anything can be used as an object of adaptation: poems, novels, dramas, paintings, dances, and video games. Kemuning is staged using the performance concept of Brecht’s epic theater, as an effort to find a new way of reading the Ande-Ande Lumut story. Epic theater opposes one of the main elements of Aristotle’s drama, later developed in Stanislavsky’s method: that there should be empathy in every aspect of a performance. According to Brecht, this process produces an effect that should be avoided because it makes the audience passive. He therefore tried to develop a theory of destroying the illusion, of interruption, and of controlling emotion. Brecht’s characteristic works focus on social themes, especially those showing poor people suffering under the policies of the authorities. The common problems between master and worker are reflected in his story. The Kemuning performance tries to show the lives of prostitutes, which are associated with everything negative. In fact, society still needs them. Unfortunately, they sometimes become scapegoats for any trouble and are always blamed. Implicitly, this performance aims to fight for the prostitutes’ lives. The audience is invited to see other points of view on lives that people often regard as negative. Moreover, Brecht said that a good theater, and the one demanded in the modern era, is a theater that can arouse the audience’s critical thinking. This performance is therefore intended to motivate arts lovers to analyze social issues critically and to create a new movement toward significant changes in society. Keywords: Folklore, Ande-Ande Lumut, Adaptation, Brecht’s Epic Theater

  19. Plant derived substances with anti-cancer activity: from folklore to practice

    Directory of Open Access Journals (Sweden)

    Marcelo Fridlender

    2015-10-01

    Full Text Available Plants have had an essential role in the folklore of ancient cultures. In addition to their use as food and spices, plants have also been utilized as medicines for over 5000 years. It is estimated that 70-95% of the population in developing countries continues to use traditional medicines even today. A new trend, involving the isolation of active plant compounds, began in the early 19th century. This trend led to the discovery of different active compounds that are derived from plants. In recent decades, more and more new materials derived from plants have been authorized and subscribed as medicines, including those with anti-cancer activity. Cancer is among the leading causes of morbidity and mortality worldwide. The number of new cases is expected to rise by about 70% over the next two decades. Thus, there is a real need for new, efficient anti-cancer drugs with reduced side effects, and plants are a promising source of such entities. Here we focus on some plant-derived substances exhibiting anti-cancer and chemoprevention activity, their mode of action and bioavailability. These include paclitaxel, curcumin and cannabinoids. In addition, the development and use of their synthetic analogs, and those of strigolactones, are discussed. Also discussed are commercial considerations and future prospects for the development of plant-derived substances with anti-cancer activity.

  20. Laughter annotations in conversational speech corpora - possibilities and limitations for phonetic analysis

    NARCIS (Netherlands)

    Truong, Khiet Phuong; Trouvain, Jürgen

    Existing laughter annotations provided with several publicly available conversational speech corpora (both multiparty and dyadic conversations) were investigated and compared. We discuss the possibilities and limitations of these rather coarse and shallow laughter annotations. There are definition

  1. Using Small Parallel Corpora to Develop Collocation-Centred Activities in Specialized Translation Classes

    Directory of Open Access Journals (Sweden)

    Postolea Sorina

    2016-12-01

    Full Text Available Research devoted to special languages, as well as the activities carried out in specialized translation classes, tends to focus primarily on one-word or multi-word terminological units. However, a very important part in the making of specialist registers and texts is played by specialised collocations, i.e. relatively stable word combinations that do not designate concepts but are nevertheless frequently used in a given field of activity. This is why helping students acquire competences in identifying and processing collocations should become an important objective in specialised translation classes. An easily accessible and dependable resource that may be successfully used for this purpose is corpora and corpus analysis tools, whose usefulness in translator training has been highlighted by numerous studies. This article proposes a series of practical, task-based activities, developed with the help of a small parallel corpus of specialised texts, that aim to raise translation trainees’ awareness of the collocations present in specialised texts and to provide suggestions about their processing in translation.

  2. An analysis on the entity annotations in biological corpora [v1; ref status: indexed, http://f1000r.es/2o0]

    Directory of Open Access Journals (Sweden)

    Mariana Neves

    2014-04-01

    Full Text Available Collections of documents annotated with semantic entities and relationships are crucial resources to support the development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and an analysis of the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities that can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals, are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.

  3. Developing resources for sentiment analysis of informal Arabic text in social media

    OpenAIRE

    Itani, Maher; Roast, Chris; Al-Khayatt, Samir

    2017-01-01

    Natural Language Processing (NLP) applications such as text categorization, machine translation, sentiment analysis, etc., need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed is that they are derived from informal Arabic that does not conform to grammatical or spelling standards. We refer to Ara...

  4. Use of monolingual and comparable corpora in the classroom to translate adverbial connectors

    Directory of Open Access Journals (Sweden)

    Beatriz Sánchez Cárdenas

    2016-06-01

    Full Text Available Research in terminology has traditionally focused on nouns. Considerably less attention has been paid to other grammatical categories such as adverbs. However, these words can also be problematic for the novice translator, who tends to use the translation correspondences in bilingual dictionaries without realizing that formal equivalence is not necessarily the same as textual equivalence. However, semantic values, acquired in context, go far beyond dictionary meaning and are related to phenomena such as semantic prosody and preferences of lexical selection that can vary, depending on text type and specialized domain. This research explored the reasons why certain adverbial discourse connectors, apparently easy to translate, are a source of translation problems that cannot be easily resolved with a bilingual dictionary. Moreover, this study analyzed the use of parallel corpora in the translation classroom and how it can increase the quality of text production. For this purpose, we compared student translations before and after receiving training on the use of corpus analysis tools

  5. Visualizing the semantic content of large text databases using text maps

    Science.gov (United States)

    Combs, Nathan

    1993-01-01

    A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. A set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles is described. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.

  6. Russian Folk Culture in the 20th Century: Oral Evidence of the Villagers (On the Materials of Folklore Expeditions)

    Directory of Open Access Journals (Sweden)

    Ekaterina A. Dorokhova

    2017-12-01

    Full Text Available Folk culture is capable of developing certain adaptation mechanisms that help it promptly react to the changing conditions of the natural, socio-political, and economic environment. This is evidenced by the stories of the villagers recorded during folklore expeditions to different regions of Russia. The article highlights changes that took place in traditional Russian culture under the influence of collectivization in the 1920s–1930s, the collapse of kolkhozes in the 1990s, the development of rural club amateur performances in the Soviet period, the events of World War II, modern military conflicts, and the Chernobyl ecological catastrophe. The authors come to the conclusion that representatives of traditional culture flexibly adapt to their new living conditions, while extreme conditions such as wars and ecological catastrophes often contribute to the actualization of folk culture and enable the return of certain of its aspects to living practice.

  7. The specificity of folklore and mythological motifs in the novel “Tsar Maiden” by Vsevolod Solovyov

    Directory of Open Access Journals (Sweden)

    Lyapina Svetlana Mitrofanovna

    2014-12-01

    Full Text Available The article deals with folklore motifs in Vsevolod Solovyov’s novel “Tsar Maiden” and reveals the link between this work and the magic tale. The author concludes that the appeal to the image of the Tsar Maiden stems from the writer’s desire to show the irrational spirit of pre-Petrine Russia and the people’s judgment of the holders of imperial power. In the popular view, the fact that a woman had become a monarch was beyond comprehension and was considered a miracle akin to a fairy tale. Therefore, from Vsevolod Solovyov’s viewpoint, the fairy-tale image of the Tsar Maiden coincided in the minds of the people with the image of Princess Sophia.

  8. Juvenile hormone biosynthesis gene expression in the corpora allata of honey bee (Apis mellifera L.) female castes.

    Directory of Open Access Journals (Sweden)

    Ana Durvalina Bomtorin

    Full Text Available Juvenile hormone (JH) controls key events in the honey bee life cycle, viz. caste development and age polyethism. We quantified transcript abundance of 24 genes involved in the JH biosynthetic pathway in the corpora allata-corpora cardiaca (CA-CC) complex. The expression of six of these genes showing relatively high transcript abundance was contrasted with CA size, hemolymph JH titer, as well as JH degradation rates and JH esterase (jhe) transcript levels. Gene expression did not match the contrasting JH titers in queen and worker fourth instar larvae, but jhe transcript abundance and JH degradation rates were significantly lower in queen larvae. Consequently, transcriptional control of JHE is of importance in regulating larval JH titers and caste development. In contrast, the same analyses applied to adult worker bees allowed us to infer that the high JH levels in foragers are due to increased JH synthesis. Upon RNAi-mediated silencing of the methyl farnesoate epoxidase gene (mfe), encoding the enzyme that catalyzes methyl farnesoate-to-JH conversion, the JH titer was decreased, thus corroborating that JH titer regulation in adult honey bees depends on this final JH biosynthesis step. The molecular pathway differences underlying JH titer regulation in larval caste development versus adult age polyethism lead us to propose that mfe and jhe genes be assayed when addressing questions on the role(s) of JH in social evolution.

  9. 'DELİ DUMRUL' BY SUAT TAŞER, WITHIN THE SCOPE OF FOLKLORE - IDEOLOGY – LITERATURE FOLKLOR-İDEOLOJİ-EDEBİYAT ÜÇGENİNDE SUAT TAŞER’İN DELİ DUMRUL’U

    Directory of Open Access Journals (Sweden)

    Nezir TEMUR

    2011-12-01

    Full Text Available It is a sociologically inevitable phenomenon that the social and political changes occurring in societies are prominently reflected in cultural productions. Since the 19th century, when national identities began to take form along with romantic nationalism, folkloric artifacts, which are significant conveyers of cultural recollections such as history and language, have confronted us as a field emphasized by ideological and literary movements, notably in the social sciences. In the world of the 20th century, when ideologies began to take form in a political sense, folkloric artifacts undertook significant functions in the culture policies envisaged by dominant ideologies for the new forms of society they tried to build. The style of the folkloric artifacts, the cultural codes they conveyed, and their functionality have been active components in this approach. In this sense, the intensifying process that begins to head towards works of folk narration, folk poetry, and folk literature in Turkish literature after the 1930s gradually increases after the 1940s, and this tendency becomes one of the significant sources fostering literature. At this point, substantial works of Turkish folklore such as epics, folktales, tales, and legends have been released to the public with new perspectives and techniques. New pursuits in expression and utterance can be seen, as in 'Deli Dumrul - Ölüm ve Aşk' (Epic and Play) by Dede Korkut, which can be considered the rewriting of one of his epics with a new understanding. This study aims to compare 'Deli Dumrul - Ölüm ve Aşk' (Epic - Play) by Suat Taşer with the original text of the Epic of Deli Dumrul and to examine how folkloric artifacts, and the cultural values transmitted through those artifacts, have been modernized, adapted to contemporary times, and released; the parts where the writer digressed from the source text during the adaptation; to what extent the traditional context has been changed

  10. A Linguistic Inquiry and Word Count Analysis of the Adult Attachment Interview in Two Large Corpora.

    Science.gov (United States)

    Waters, Theodore E A; Steele, Ryan D; Roisman, Glenn I; Haydon, Katherine C; Booth-LaForce, Cathryn

    2016-01-01

    An emerging literature suggests that variation in Adult Attachment Interview (AAI; George, Kaplan, & Main, 1985) states of mind about childhood experiences with primary caregivers is reflected in specific linguistic features captured by the Linguistic Inquiry Word Count automated text analysis program (LIWC; Pennebaker, Booth, & Francis, 2007). The current report addressed limitations of prior studies in this literature by using two large AAI corpora (Ns = 826 and 857) and a broader range of linguistic variables, as well as examining associations of LIWC-derived AAI dimensions with key developmental antecedents. First, regression analyses revealed that dismissing states of mind were associated with transcripts that were more truncated and deemphasized discussion of the attachment relationship whereas preoccupied states of mind were associated with longer, more conflicted, and angry narratives. Second, in aggregate, LIWC variables accounted for over a third of the variation in AAI dismissing and preoccupied states of mind, with regression weights cross-validating across samples. Third, LIWC-derived dismissing and preoccupied state of mind dimensions were associated with direct observations of maternal and paternal sensitivity as well as infant attachment security in childhood, replicating the pattern of results reported in Haydon, Roisman, Owen, Booth-LaForce, and Cox (2014) using coder-derived dismissing and preoccupation scores in the same sample.

  11. Designing and Implementing a Cross-Language Information Retrieval System Using Linguistic Corpora

    Directory of Open Access Journals (Sweden)

    Amin Nezarat

    2012-03-01

    Full Text Available Information retrieval (IR) is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval (CLIR) refers to a kind of information retrieval in which the language of the query and that of the searched document are different. In fact, it is a retrieval process where the user presents queries in one language to retrieve documents in another language. This paper tried to construct a bilingual lexicon of parallel chunks of English and Persian from two very large monolingual corpora and an English-Persian parallel corpus, which could be directly applied to cross-language information retrieval tasks. For this purpose, a statistical measure known as Association Score (AS) was used to compute the association value between every two corresponding chunks in the corpus using a couple of complicated algorithms. Once the CLIR system was developed using this bilingual lexicon, an experiment was performed on a set of one hundred English and Persian phrases and collocations to see to what extent this system was effective in assisting users in finding the most relevant and suitable equivalents of their queries in either language.
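    As an illustration of how an association value between corresponding chunks can be computed from a parallel corpus, the following is a minimal sketch only. The abstract does not give the formula of the Association Score (AS), so the Dice coefficient is used here as a stand-in, and the function names, the toy data, and the transliterated Persian chunks are all hypothetical.

```python
from collections import Counter

def dice_scores(aligned_pairs):
    """Score every (source chunk, target chunk) pair seen in aligned sentences.

    aligned_pairs: list of (source_chunks, target_chunks), one per aligned
    sentence pair. The Dice coefficient is a stand-in for the paper's
    Association Score, whose formula is not given in the abstract.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_chunks, tgt_chunks in aligned_pairs:
        src_set, tgt_set = set(src_chunks), set(tgt_chunks)
        src_count.update(src_set)   # sentences containing each source chunk
        tgt_count.update(tgt_set)   # sentences containing each target chunk
        pair_count.update((s, t) for s in src_set for t in tgt_set)
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }

def best_equivalent(scores, source_chunk):
    """Return the target chunk with the highest score for a source chunk."""
    candidates = {t: v for (s, t), v in scores.items() if s == source_chunk}
    return max(candidates, key=candidates.get) if candidates else None

# Hypothetical toy corpus: English chunks paired with transliterated chunks.
toy_corpus = [
    (["the book", "is new"], ["ketab", "jadid ast"]),
    (["the book", "is old"], ["ketab", "ghadimi ast"]),
    (["the house", "is new"], ["khaneh", "jadid ast"]),
]
toy_scores = dice_scores(toy_corpus)
print(best_equivalent(toy_scores, "the book"))
```

    A lexicon built this way maps each query chunk to its highest-scoring counterpart, which is the lookup a CLIR system would perform before retrieval.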

  12. Entropy Rate Estimates for Natural Language—A New Extrapolation of Compressed Large-Scale Corpora

    Directory of Open Access Journals (Sweden)

    Ryosuke Takahira

    2016-10-01

    Full Text Available One of the fundamental questions about human language is whether its entropy rate is positive. The entropy rate measures the average amount of information communicated per unit time. The question about the entropy of language dates back to experiments by Shannon in 1951, but in 1990 Hilberg raised doubt regarding a correct interpretation of these experiments. This article provides an in-depth empirical analysis, using 20 corpora of up to 7.8 gigabytes across six languages (English, French, Russian, Korean, Chinese, and Japanese), to conclude that the entropy rate is positive. To obtain the estimates for data length tending to infinity, we use an extrapolation function given by an ansatz. Whereas some ansatzes were proposed previously, here we use a new stretched exponential extrapolation function that has a smaller error of fit. Thus, we conclude that the entropy rates of human languages are positive but approximately 20% smaller than without extrapolation. Although the entropy rate estimates depend on the script kind, the exponent of the ansatz function turns out to be constant across different languages and governs the complexity of natural language in general. In other words, in spite of typological differences, all languages seem equally hard to learn, which partly confirms Hilberg’s hypothesis.
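    The quantity being extrapolated can be pictured as a compression rate measured on growing prefixes of a corpus. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses Python's standard zlib compressor rather than whichever compressor the study used, the function names are hypothetical, and the fitting of the extrapolation function is not reproduced.

```python
import zlib

def encoding_rate(text, n):
    """Bits per character that zlib needs to encode the first n characters."""
    prefix = text[:n].encode("utf-8")
    compressed = zlib.compress(prefix, 9)  # level 9 = strongest zlib setting
    return 8 * len(compressed) / n

def rate_curve(text, lengths):
    """Encoding rate at each prefix length that fits inside the text."""
    return [(n, encoding_rate(text, n)) for n in lengths if n <= len(text)]

if __name__ == "__main__":
    # A highly repetitive sample, so the rate falls quickly with length.
    sample = "the quick brown fox jumps over the lazy dog " * 500
    for n, rate in rate_curve(sample, [100, 1000, 10000]):
        print(n, round(rate, 3))
```

    An upper bound on the entropy rate would then come from fitting a decreasing function to such a curve and reading off its limit as the length tends to infinity.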

  13. Glandectomy with preservation of corpora cavernosa in the treatment of penile carcinoma

    Directory of Open Access Journals (Sweden)

    Fonseca Aluizio G. da

    2003-01-01

    Full Text Available INTRODUCTION: The objective of this work is to describe a conservative surgical technique as an alternative to classic penile amputations, aiming at local control of the disease while trying to preserve the patient's sexual function. SURGICAL TECHNIQUE: After a circular incision of the skin around the penis, the subfascial plane is developed down to the base of the organ. The dorsal neurovascular bundle and the urethra are isolated at their distal extremities. The neurovascular bundle is sectioned distally. A retrocoronal dissection plane is developed between the glans and the corpora cavernosa. When this stage is complete, the glans is attached only to the urethra, which is sectioned distally as well. The neurovascular bundle is fixed to the dorsal albuginea. Following spatulation of the urethra, a neomeatus is created using the overlying skin of the penis. Between January 2001 and July 2002, we employed this technique in 6 patients who had epidermoid carcinoma of the penis that was limited to the glans, superficial, well or moderately differentiated, and measuring up to 3 cm. COMMENTS: Several conservative surgical methods for the treatment of carcinoma of the penis aim at preserving the organ in an attempt to improve patients' quality of life; however, the rates of local recurrence and of failure in disease control are significant. The described technique proved to be safe and effective for disease control, in addition to preserving sexual function in all patients who were treated, and thus represents an appealing conservative surgical alternative in selected cases.

  14. Specification of Drosophila corpora cardiaca neuroendocrine cells from mesoderm is regulated by Notch signaling.

    Directory of Open Access Journals (Sweden)

    Sangbin Park

    2011-08-01

    Full Text Available Drosophila neuroendocrine cells comprising the corpora cardiaca (CC are essential for systemic glucose regulation and represent functional orthologues of vertebrate pancreatic α-cells. Although Drosophila CC cells have been regarded as developmental orthologues of pituitary gland, the genetic regulation of CC development is poorly understood. From a genetic screen, we identified multiple novel regulators of CC development, including Notch signaling factors. Our studies demonstrate that the disruption of Notch signaling can lead to the expansion of CC cells. Live imaging demonstrates localized emergence of extra precursor cells as the basis of CC expansion in Notch mutants. Contrary to a recent report, we unexpectedly found that CC cells originate from head mesoderm. We show that Tinman expression in head mesoderm is regulated by Notch signaling and that the combination of Daughterless and Tinman is sufficient for ectopic CC specification in mesoderm. Understanding the cellular, genetic, signaling, and transcriptional basis of CC cell specification and expansion should accelerate discovery of molecular mechanisms regulating ontogeny of organs that control metabolism.

  15. Regeneration of rat corpora cavernosa tissue by transplantation of CD133+ cells derived from human bone marrow and placement of biodegradable gel sponge sheet

    Directory of Open Access Journals (Sweden)

    Shogo Inoue

    2017-01-01

    Full Text Available The objective is to develop an easier technique for regenerating corpora cavernosa tissue through transplantation of human bone marrow-derived CD133+ cells into a rat corpora cavernosa defect model. We excised 2 mm × 2 mm squares of the right corpora cavernosa of twenty-three 8-week-old male nude rats. Alginate gel sponge sheets supplemented with 1 × 10^4 CD133+ cells were then placed over the excised area of nine rats. Functional and histological evaluations were carried out 8 weeks later. The mean intracavernous pressure/mean arterial pressure ratio for the nine rats (0.34258 ± 0.0831) was significantly higher than that for eight rats with only the excision (0.0580 ± 0.0831, P = 0.0238) and similar to that for five rats for which the penis was exposed and there was no excision (0.37228 ± 0.1051, P = 0.8266). Immunohistochemical analysis revealed that the nine fully treated rats had venous sinus-like structures, and quantitative reverse transcription polymerase chain reaction analysis of extracts from their alginate gel sponge sheets revealed that the amounts of mRNA encoding nerve growth factor (NGF) and vascular endothelial growth factor (VEGF) were significantly higher than those for rats treated with alginate gel sheets without cell supplementation (NGF: P = 0.0309; VEGF: P < 0.0001). These findings show that transplantation of CD133+ cells accelerates functional and histological recovery in the corpora cavernosa defect model.

  16. Discovery learning in the language-for-translation classroom: corpora as learning aids

    Directory of Open Access Journals (Sweden)

    Silvia Bernardini

    2016-04-01

    This contribution reviews the idea of discovery learning with corpora, proposed in the 1990s, evaluating its potential and its implications with reference to the education of translators today. The rationale behind this approach to data-driven learning, combining project-based and form-focused instruction within a socio-constructivistically inspired environment, is discussed. Examples are also provided of authentic, open-ended learning experiences, thanks to which students of translation share responsibility over the development of corpora and their consultation, and teachers can abandon the challenging role of omniscient knowledge providers and wear the more honest hat of "learning experts". Adding to the more straightforward uses of corpora in courses that aim to develop thematic, technological and information mining competences (i.e., in which training is offered in the use of corpora as professional aids), attention is focused on foreign language teaching for translators and on corpora as learning aids, highlighting their potential for the development of the three other European Master's in Translation (EMT) competences (translation service provision, language and intercultural ones).

  17. Uma investigação dos sentidos de um phrasal verb por meio dos corpora e dicionários on-line

    Directory of Open Access Journals (Sweden)

    Emiliana Fernandes Bonalumi

    2014-06-01

    Full Text Available In this study we analyze the use of the phrasal verb throw up as found in two online corpora originally written in English, namely the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA), as well as in the textbook adopted in the classroom, New English File Upper-Intermediate, with the support of the online dictionaries Cambridge Online Dictionary and Macmillan Dictionary. We aim to identify, classify and generalize the use and meanings of the phrasal verb selected for analysis in the respective online corpora in relation to its use and meaning in the aforementioned textbook. Through online corpora and dictionaries, students can expand their knowledge of the use and meanings of a given phrasal verb, such as the one analyzed in this investigation. Keywords: corpus linguistics; data-driven learning; phrasal verbs.

  18. Preceitos e normas internas (kakun) de casas comerciais japonesas: um estudo sobre a longevidade e a ética da corporação japonesa

    Directory of Open Access Journals (Sweden)

    Isao Yamamoto

    Full Text Available The study of corporations from one of the world's largest economies is justified in the borderless world in which we live today, where cultural differences affect business relations. The aim is to explain how Japanese trading houses and other traditional Japanese corporations achieved enormous longevity. Emphasis was placed on the role played by the kakun in these corporations, that is, the role played by a set of precepts and internal rules that emerged in the 17th and 18th centuries and remains in force to the present day. The method chosen for the study was historiography, which seeks to recover events and human activities over time. We conclude that much of what the kakun preached is present today in studies on organizations and management and that, associated with ethical issues, the kakun is largely responsible for the longevity of Japanese companies.

  19. Influence of communal and private folklore on bringing meaning to the experience of persistent pain.

    Science.gov (United States)

    Hendricks, Joyce Marie

    2015-11-01

    To provide an overview of the relevance and strengths of using the literary folkloristic methodology to explore the ways in which people with persistent pain relate to and make sense of their experiences through narrative accounts. Storytelling is a conversation with a purpose. The reciprocal bond between researcher and storyteller enables the examination of the meaning of experiences. Life narratives, in the context of wider traditional and communal folklore, can be analysed to discover how people make sense of their circumstances. This paper draws from the experience of the author, who has previously used this narrative approach. It is a reflection of how the approach may be used to understand those experiencing persistent pain without a consensual diagnosis. Using an integrative method, peer-reviewed research and discussion papers published between January 1990 and December 2014 and listed in the CINAHL, Science Direct, PsycINFO and Google Scholar databases were reviewed. In addition, texts that addressed research methodologies such as literary folkloristic methodology and Marxist literary theory were used. The unique role that nurses play in managing pain is couched in the historical and cultural context of nursing. Literary folkloristic methodology offers an opportunity to gain a better understanding and appreciation of how the experience of pain is constructed and to connect with sufferers. Literary folkloristic methodology reveals that those with persistent pain are often rendered powerless to live their lives. Increasing awareness of how this experience is constructed and maintained also allows an understanding of societal influences on nursing practice. Nurse researchers try to understand experiences in light of specific situations. Literary folkloristic methodology can enable them to understand the inter-relationship between people in persistent pain and how they construct their experiences.

  20. Folclore e medicina popular na Amazônia Folklore and popular medicine in the Amazon

    Directory of Open Access Journals (Sweden)

    Márcio Couto Henrique

    2009-12-01

    Full Text Available This discussion of the relations between folklore and popular medicine in the Amazon takes Canuto Azevedo's story "Filhos do boto" (Children of the porpoise) as an analytical reference point. Replete with elements of cultural reality, folk tales can serve as historical testimonies expressing clashes between different traditions. Folk records are the fruit of what is often a quarrelsome dialogue between folklorists, social scientists, physicians, and pajés and their followers, and their analysis should take into account the conditions under which they were produced. Based on the imaginary attached to the figure of the porpoise - a seductive creature with healing powers - the article explores how we might expand knowledge of popular medicine as practiced in the Amazon, where the shamanistic rite known as pajelança cabocla has a strong presence.

  1. Sleep paralysis in Brazilian folklore and other cultures: a brief review

    Directory of Open Access Journals (Sweden)

    José Felipe Rodriguez de Sá

    2016-09-01

    Full Text Available Sleep paralysis (SP) is a dissociative state that occurs mainly during awakening. SP is characterized by altered motor, perceptual, emotional and cognitive functions, such as inability to perform voluntary movements, visual hallucinations, feelings of chest pressure, delusions about a frightening presence and, in some cases, fear of impending death. Most people experience SP rarely, but typically when sleeping in supine position; however, SP is considered a disease (parasomnia) when recurrent and/or associated with emotional burden. Interestingly, throughout human history, different peoples interpreted SP under a supernatural view. For example, Canadian Eskimos attribute SP to spells of shamans, who hinder the ability to move, and provoke hallucinations of a shapeless presence. In the Japanese tradition, SP is due to a vengeful spirit who suffocates his enemies while sleeping. In Nigerian culture, a female demon attacks during dreaming and provokes paralysis. A modern manifestation of SP is the report of alien abductions, experienced as inability to move during awakening associated with visual hallucinations of aliens. Furthermore, SP is a significant example of how a specific biological phenomenon can be interpreted and shaped by different cultural contexts. In order to further explore the ethnopsychology of SP, the Pisadeira, a character of Brazilian folklore that originated in the country’s Southeast but is also found in other regions under variant names, has been reviewed. Pisadeira is described as a crone with long fingernails who lurks on roofs at night and tramples on the chest of those who sleep on a full stomach with the belly up. This legend is mentioned in many anthropological accounts; however, we found no comprehensive reference on the Pisadeira from the perspective of sleep science. Here we aim to fill this gap. We first review the neuropsychological aspects of SP, and then present the folk tale of the Pisadeira. Finally, we summarize the

  2. Using corpora in scientific and technical translation training: resources to identify conventionality and promote creativity

    OpenAIRE

    Clara Inés López-Rodríguez

    2016-01-01

    http://dx.doi.org/10.5007/2175-7968.2016v36nesp1p88 Since the first Corpus Use and Learning to Translate (CULT) Conference in Bertinoro (Italy) in 1997, the usefulness of corpora for translators and trainee translators has been highlighted. From an initial approach where translators compiled ad hoc corpora in their hard drive for a subsequent study with lexical analysis software, there emerged a new trend towards the use of the Internet as corpus. In this second approach, the Web is perce...

  3. Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span

    Directory of Open Access Journals (Sweden)

    Jordan MI

    2006-05-01

    Full Text Available Background: The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caenorhabditis Genetic Center (CGC) Bibliography using techniques from statistical information retrieval. Items in the CGC biomedical text corpus were modeled using the Latent Dirichlet Allocation (LDA) model. LDA is a hierarchical Bayesian model which represents a document as a random mixture over latent topics; each topic is characterized by a distribution over words. Results: An LDA model estimated from CGC items had better predictive performance than two standard models (unigram and mixture of unigrams) trained using the same data. To illustrate the practical utility of LDA models of biomedical corpora, a trained CGC LDA model was used for a retrospective study of nematode genes known to be associated with life span modification. Corpus-, document-, and word-level LDA parameters were combined with terms from the Gene Ontology to enhance the explanatory value of the CGC LDA model, and to suggest additional candidates for age-related genes. A novel, pairwise document similarity measure based on the posterior distribution on the topic simplex was formulated and used to search the CGC database for "homologs" of a "query" document discussing the life span-modifying clk-2 gene. Inspection of these document homologs enabled and facilitated the production of hypotheses about the function and role of clk-2. Conclusion: Like other graphical models for genetic, genomic and other types of biological data, LDA provides a method for extracting unanticipated insights and generating predictions amenable to subsequent experimental validation.
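    To make the idea of searching for document "homologs" by topic mixture concrete, here is a minimal sketch under stated assumptions. The abstract does not define its pairwise similarity measure, so Jensen-Shannon divergence between topic proportion vectors is swapped in as a common stand-in; the document names and topic proportions are invented for illustration, and no LDA model is actually fitted.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, between 0 (identical) and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def rank_homologs(query_topics, corpus_topics):
    """Rank documents by similarity (1 - JS divergence) to the query mixture."""
    scored = [
        (name, 1.0 - js_divergence(query_topics, theta))
        for name, theta in corpus_topics.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical topic proportions over three topics for three documents.
corpus = {
    "doc_a": [0.6, 0.3, 0.1],
    "doc_b": [0.1, 0.1, 0.8],
    "doc_c": [0.5, 0.4, 0.1],
}
query = [0.7, 0.2, 0.1]
ranking = rank_homologs(query, corpus)
print([name for name, _ in ranking])
```

    Documents whose topic mixtures sit close to the query on the topic simplex rank first, which is the sense in which they act as "homologs" of the query document.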

  4. Correlation among foetal number, corpora lutea and plasma progesterone in rockland-swiss mice. [Progesterone determination by radioimmunoassay]

    Energy Technology Data Exchange (ETDEWEB)

    Simon, N G; Bridges, R S; Gandelmann, R [Rutgers - the State Univ., New Brunswick, NJ (USA). Dept. of Psychology; Rutgers - the State Univ., Newark, NJ (USA). Inst. of Animal Behavior]

    1978-01-01

    The relationship among plasma progesterone, number of corpora lutea, and foetal number was assessed in Rockland-Swiss albino mice. While number of corpora lutea and foetal number were significantly correlated, neither was related to plasma progesterone level. This finding in the mouse is similar to results reported in the rabbit.

  5. Symbolic Machine Learning: A Different Answer to the Problem of the Acquisition of Lexical Knowledge from Corpora

    Directory of Open Access Journals (Sweden)

    Pascale Sébillot

    2008-07-01

    Full Text Available One relevant way to structure the domain of lexical knowledge (e.g. relations between lexical units) acquisition from corpora is to oppose numerical versus symbolic techniques. Numerical approaches to acquisition exploit the frequential aspect of data, have been widely used, and produce portable systems, but give poor explanations of their results. Symbolic approaches exploit the structural aspect of data. Among them, symbolic machine learning (ML) techniques can infer efficient and expressive patterns of a target relation from examples of elements that verify this relation. These methods are, however, far less known, and the aim of this paper is to point out their interest through the description of one precise experiment. To remove their supervised characteristic, and instead of opposing them to numerical approaches, we finally show that it is possible to combine one symbolic ML technique with one numerical one and keep the advantages of both (meaningful patterns, efficient extraction, portability).

  6. Human attitudes towards herpetofauna: The influence of folklore and negative values on the conservation of amphibians and reptiles in Portugal

    Science.gov (United States)

    2012-01-01

    Background: Human values and folklore of wildlife strongly influence the effectiveness of conservation efforts. These values and folklore may also vary with certain demographic characteristics such as gender, age, or education. Reptiles and amphibians are among the least appreciated of vertebrates and are victims of many negative values and wrong ideas resulting from the direct interpretation of folklore. We try to demonstrate how these values and folklore can affect the way people relate to them and also the possible conservation impacts on these animals. Methods: A questionnaire survey distributed to 514 people in the district of Évora, Portugal, was used to obtain data regarding the hypothesis that the existence of wrong ideas and negative values contributes to the phenomenon of human-associated persecution of these animals. A structural equation model was specified in order to confirm the hypothesis about the possible relationships between the presence of perceptions and negative values about amphibians and reptiles and persecution and anti-conservation attitudes. Sociodemographic variables were also added. Results: The results of the model suggest that the presence of folklore and negative values clearly predicts persecution and anti-conservation attitudes towards amphibians and reptiles. Also, the existence of folklore varies sociodemographically, but negative values concerning these animals are widespread in the population. Conclusions: With the use of structural equation models, this work is a contribution to the study of how certain ideas and values can directly influence human attitudes towards herpetofauna and how they can be a serious conservation issue. PMID:22316318

  7. Human attitudes towards herpetofauna: the influence of folklore and negative values on the conservation of amphibians and reptiles in Portugal.

    Science.gov (United States)

    Ceríaco, Luis Mp

    2012-02-08

    Human values and folklore of wildlife strongly influence the effectiveness of conservation efforts. These values and folklore may also vary with certain demographic characteristics such as gender, age, or education. Reptiles and amphibians are among the least appreciated of vertebrates and are victims of many negative values and wrong ideas resulting from the direct interpretation of folklore. We try to demonstrate how these values and folklore can affect the way people relate to them and also the possible conservation impacts on these animals. A questionnaire survey distributed to 514 people in the district of Évora, Portugal, was used to obtain data regarding the hypothesis that the existence of wrong ideas and negative values contributes to the phenomenon of human-associated persecution of these animals. A structural equation model was specified in order to confirm the hypothesis about the possible relationships between the presence of perceptions and negative values about amphibians and reptiles and persecution and anti-conservation attitudes. Sociodemographic variables were also added. The results of the model suggest that the presence of folklore and negative values clearly predicts persecution and anti-conservation attitudes towards amphibians and reptiles. Also, the existence of folklore varies sociodemographically, but negative values concerning these animals are widespread in the population. With the use of structural equation models, this work is a contribution to the study of how certain ideas and values can directly influence human attitudes towards herpetofauna and how they can be a serious conservation issue.

  8. Combining Language Corpora with Experimental and Computational Approaches for Language Acquisition Research

    Science.gov (United States)

    Monaghan, Padraic; Rowland, Caroline F.

    2017-01-01

    Historically, first language acquisition research was a painstaking process of observation, requiring the laborious hand coding of children's linguistic productions, followed by the generation of abstract theoretical proposals for how the developmental process unfolds. Recently, the ability to collect large-scale corpora of children's language…

  9. Studies on luteinizing hormone receptors of human corpora lutea during menstrual cycle and pregnancy

    Energy Technology Data Exchange (ETDEWEB)

    Izumi, Yasushi (Keio Univ., Tokyo (Japan). School of Medicine)

    1982-10-01

    With the purpose of explicating the lifespan of human corpora lutea, using human corpora lutea of the menstrual cycle and pregnancy, binding of ¹²⁵I-LH to the 20,000g cell membrane fraction was examined. 1) Specific binding of ¹²⁵I-LH and ¹²⁵I-HCG was demonstrated in the 20,000g cell membrane fraction. Although LH and HCG were parallel in inhibiting ¹²⁵I-LH binding, HCG was found to be more effective. FSH did not inhibit binding. 2) Binding of ¹²⁵I-LH was dependent on time, temperature, ¹²⁵I-LH concentration, amount of the cell membrane fraction protein and pH. The highest binding was seen at pH 6.0 while incubating for 60 min at 37 °C. 3) The number of LH receptors in human corpora lutea of the menstrual cycle increased towards the midluteal phase, especially on the 5th day from ovulation, and decreased towards the late luteal phase. LH receptor was not found in corpus albicans. The apparent dissociation constant of each corpus luteum did not change throughout the menstrual cycle. 4) Corpora lutea of pregnancy contained few or no receptors which bound ¹²⁵I-LH specifically. These data suggest that LH receptor is an important factor regulating the lifespan of the corpus luteum and that exogenous HCG has an effect on luteal insufficiency, but the effect of HCG on threatened abortion is uncertain.

  10. Gonadotropin binding sites in human ovarian follicles and corpora lutea during the menstrual cycle

    Energy Technology Data Exchange (ETDEWEB)

    Shima, K.; Kitayama, S.; Nakano, R.

    1987-05-01

    Gonadotropin binding sites were localized by autoradiography after incubation of human ovarian sections with /sup 125/I-labeled gonadotropins. The binding sites for /sup 125/I-labeled human follicle-stimulating hormone (/sup 125/I-hFSH) were identified in the granulosa cells and in the newly formed corpora lutea. The /sup 125/I-labeled human luteinizing hormone (/sup 125/I-hLH) binding to the thecal cells increased during follicular maturation, and a dramatic increase was preferentially observed in the granulosa cells of the large preovulatory follicle. In the corpora lutea, the binding of /sup 125/I-hLH increased from the early luteal phase and decreased toward the late luteal phase. The changes in 3 beta-hydroxysteroid dehydrogenase activity in the corpora lutea corresponded to the /sup 125/I-hLH binding. Thus, the changes in gonadotropin binding sites in the follicles and corpora lutea during the menstrual cycle may help in some important way to regulate human ovarian function.

  11. Studies on luteinizing hormone receptors of human corpora lutea during menstrual cycle and pregnancy

    International Nuclear Information System (INIS)

    Izumi, Yasushi

    1982-01-01

    With the purpose of explicating the lifespan of human corpora lutea, using human corpora lutea of the menstrual cycle and pregnancy, binding of ¹²⁵I-LH to the 20,000g cell membrane fraction was examined. 1) Specific bindings of ¹²⁵I-LH, ¹²⁵I-HCG were demonstrated in the 20,000g cell membrane fraction. Although LH and HCG were parallel in inhibiting ¹²⁵I-LH binding, HCG was found to be more effective. FSH did not inhibit binding. 2) Binding of ¹²⁵I-LH was dependent on time, temperature, ¹²⁵I-LH concentration, amount of the cell membrane fraction protein and pH. The highest binding was seen at pH 6.0 while incubating for 60 min at 37 °C. 3) The number of LH receptors in human corpora lutea of the menstrual cycle increased towards midluteal phase, especially on 5th day from ovulation, and decreased towards late luteal phase. LH receptor was not found in corpus albicans. The apparent dissociation constant of each corpus luteum did not change throughout the menstrual cycle. 4) Corpora lutea of pregnancy contained a few or no receptors which bound ¹²⁵I-LH specifically. These data suggest that LH receptor is an important factor regulating the lifespan of corpus luteum and exogenous HCG has effect on luteal insufficiency, but the effect of HCG on threatened abortion is uncertain. (author)

  12. The Corpora of China English: Implications for an EFL Dictionary for ...

    African Journals Online (AJOL)

    The localization of the English language in China has brought about a distinctive English variety which has come to be known as China English. Recently, several corpora of China English have been or are being built; these will help us to identify the established linguistic features of this variety, and should greatly facilitate ...

  13. Using Corpora in EFL Classrooms: The Case Study of IELTS Preparation

    Science.gov (United States)

    Smirnova, Elizaveta A.

    2017-01-01

    This article describes the gathered experience in using corpora in an IELTS preparation course. The practice demonstrates an attempt to reduce negative washback effects occurring when preparation courses just concentrate on the test format neglecting the importance of development of learners' language skills and general study skills. Some…

  14. Application of Learner Corpora to Second Language Learning and Teaching: An Overview

    Science.gov (United States)

    Xu, Qi

    2016-01-01

    The paper gives an overview of learner corpora and their application to second language learning and teaching. It is proposed that there are four core components in learner corpus research, namely, corpus linguistics expertise, a good background in linguistic theory, knowledge of SLA theory, and a good understanding of foreign language teaching…

  15. Charles Dicken’s Use of Folklore: A Study of Elements in Bleak House

    Science.gov (United States)

    1981-04-21

    asserts the association between death and blackness by misquoting Shakespeare ; in Hamlet Shakespeare refers to the "fell sergeant, Death," ~66 and Dickens...2t464. One can find many allusions to works by Shakespeare in Dickens’s novels. The relevance of Shakespeare as a source is a field that awaits extensive... Shakespeare Land (London: Mitchell Hughes and Clarke, 1929), p.41. 3Cora Linn Daniels, ed. Encyclopedia of Superstitions, Folklore and the Occult Sciences of

  16. The development of folklore, arts and crafts in ukrainian ethnic minorities: trends (1990 – 2000-s)

    OpenAIRE

    V. M. Pekarchuk

    2014-01-01

    Drawing on a wide range of historical facts, analytical works and scholarly documents, an attempt is made to reconstruct the place and role of the folklore, arts and crafts of Ukraine's ethnic minority cultures in the 1990s and 2000s. The importance of the problem stems, first of all, from the need for a clear understanding of how an independent state resolves problems of interethnic relations. It was found that during the years under study in Ukraine,...

  17. When “She” Is Not Maud: An Esoteric Foundation and Subtext for Irish Folklore in the Works of W.B. Yeats

    Directory of Open Access Journals (Sweden)

    C. Nicholas Serra

    2017-10-01

    Full Text Available This article examines Yeats’s broad use of Irish folklore between 1888 and 1938, and attempts to find a justification for his contention that his own unique metaphysical system expressed in both editions of A Vision, itself an outgrowth of his three decades of ritual practice as an initiate in the Hermetic Order of the Golden Dawn, could somehow function as both an interpretation and enlargement of “the folk-lore of the villages”. Beyond treating Irish fairy stories as a way for Yeats to establish his own Irishness, capture what remained of “reckless Ireland” in its twilight, or create a political counter-discourse set against English hegemony, the immutability and immortality of the sídhe are considered in light of the assertions of several minor lectures from the Golden Dawn. This connection sheds new light on Yeats’s ideas about Unity of Being, and hypothesizes a possible esoteric path to “escape” from his system of phases so as to resolve the body-soul dilemma evident in his poetry.

  18. Corpora and corpus technology for translation purposes in professional and academic environments. Major achievements and new perspectives

    Directory of Open Access Journals (Sweden)

    Cécile Frérot

    2016-04-01

    The “use” of corpora and concordancers in translation teaching has grown increasingly attractive since the mid-1990s, with an abundant literature advocating their use and promoting their benefits in the translation classroom. In translator training, efforts are being made to incorporate the use of corpora and concordancers in master's programmes and to offer specific modules on corpora for translation, as the use of translation memory (TM) systems within Computer-Aided Translation (CAT) courses still dominates. In the translation profession, while TM systems are part of the everyday working environment, the same cannot be said of corpora and concordancers, even though the most recent surveys show that professional translators would like to learn more about the potential of corpora for translation. Overall, the “usefulness” of corpora and corpus technology at the different stages of the translation process remains poorly documented in translation, but a growing number of empirical studies have started to show concern, as it has now become of paramount importance to assess the extent to which corpora are of added value for translation quality in both professional and academic environments.

  19. Parsing with subdomain instance weighting from raw corpora

    NARCIS (Netherlands)

    Plank, B.; Sima'an, K.

    2008-01-01

    The treebanks that are used for training statistical parsers consist of hand-parsed sentences from a single source/domain like newspaper text. However, newspaper text concerns different subdomains of language use (e.g. finance, sports, politics, music), which implies that the statistics gathered by

  20. Parsing with Subdomain Instance Weighting from Raw Corpora

    NARCIS (Netherlands)

    Plank, Barbara; Sima'an, Khalil

    2008-01-01

    The treebanks that are used for training statistical parsers consist of hand-parsed sentences from a single source/domain like newspaper text. However, newspaper text concerns different subdomains of language use (e.g. finance, sports, politics, music), which implies that the statistics gathered by

  1. Chinese legal texts – Quantitative Description

    Directory of Open Access Journals (Sweden)

    Ľuboš GAJDOŠ

    2017-06-01

    Full Text Available The aim of the paper is to provide a quantitative description of legal Chinese. This study adopts the approach of corpus-based analysis and presents basic statistical parameters of legal texts in Chinese, namely sentence length, the proportions of parts of speech, etc. The research is conducted on the Chinese monolingual corpus Hanku. The paper also discusses the issues of processing statistical data from various corpora, e.g. tokenisation and part-of-speech tagging, and their relevance to the study of register variation.
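
    The two parameters named in this abstract, sentence length and part-of-speech proportions, can be illustrated with a minimal sketch. It assumes a corpus that is already tokenised and tagged; the function name `corpus_statistics`, the tag labels and the sample sentences are hypothetical and are not taken from the Hanku corpus or from the paper.

```python
from collections import Counter

def corpus_statistics(sentences):
    """Return (mean sentence length in tokens, part-of-speech proportions).

    `sentences` is a list of sentences, each a list of (token, pos_tag)
    pairs. The input format is an assumption made for this sketch.
    """
    tags = Counter(tag for sentence in sentences for _, tag in sentence)
    total_tokens = sum(tags.values())
    mean_length = total_tokens / len(sentences)
    proportions = {tag: count / total_tokens for tag, count in tags.items()}
    return mean_length, proportions

# Hypothetical tagged sentences, invented for illustration only.
sample = [
    [("法院", "NOUN"), ("应当", "AUX"), ("受理", "VERB")],
    [("当事人", "NOUN"), ("可以", "AUX"), ("提出", "VERB"), ("申请", "NOUN"), ("。", "PUNCT")],
]
mean_length, proportions = corpus_statistics(sample)
print(mean_length)
print(proportions)
```

    Results of this kind depend directly on the tokeniser and tagset used, which is the data-processing issue the abstract raises.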

  2. Aportes materiales y psicoafectivos del negro en el folklore colombiano

    Directory of Open Access Journals (Sweden)

    Manuel Zapata Olivella

    1967-06-01

    Full Text Available Most Latin American countries have been shaped by the basic contributions of the indigenous, Hispanic and African cultures. The degree of this mixing varies from one country to another, depending on the importance of the ethnic groups. In Colombia, the cultural balance does not always correspond to the mixture of races.

  3. The Folklore - Nationalism Relationship in the Balkans. Case Study “Whose Is This Song?” by Adela Peeva

    Directory of Open Access Journals (Sweden)

    Elena-Lorena Nedelcu

    2016-06-01

    Full Text Available This article analyses a 2003 documentary titled “Whose Is This Song?” by Bulgarian movie director Adela Peeva, with the aim of understanding the relationship between folklore and nationalism in the Balkans. The theme of the documentary is the director’s quest to trace the roots of a folk song that she had thought since her childhood was 100 percent Bulgarian. The documentary follows Peeva’s journey with a camera in hand around Turkey, Greece, Macedonia, Albania, Bosnia and Herzegovina, Serbia and Bulgaria, where she discovers that the song is sung by all of these nations. The documentary can be interpreted as showing how an ordinary song can become an instrument of fanatical nationalism, revealing mutual strife instead of Balkan unity. In a region defined by ethnic hatred and war, what begins as a simple investigation of the true origins of a song ends as a sociological and historical exploration of the deep misunderstandings between the people of the Balkans.

  4. Transformation of folklore tradition in the poem by M.I. Tsvetaeva “From the Sea”

    Directory of Open Access Journals (Sweden)

    Galieva Marianna Andreevna

    2015-06-01

    Full Text Available The paper studies how the folk tradition functions in the poetics of M.I. Tsvetaeva. The object of research is the 1926 poem “From the Sea”. Scholars have carefully studied the motivic structure of the poem, but little attention has been paid to its folk elements. Special attention is paid to the motif of travel to “the other world”, which is semantically correlated with the motif of sleep. The folklorism of Tsvetaeva's work has been studied extensively, but there is always a need to identify the implicit forms of folk tradition present in her poetics. The work discusses the breaking of the folk tradition and its inner form. The connection to the archetypal models of poetry (the ship by pre-genre formations. The appeal to the fairy-tale tradition, to the motif of travel to “the other world”, reveals what is archetypal rather than typical in the poetry of the early 20th century. The historical-typological method is applied; Tsvetaeva’s metaphor is traced genetically to the ritual reality expressed in the plot structure of the ship and the eidology of the “other kingdom”. Historical poetics allows the poem “From the Sea” to be read differently.

  5. Luteinizing hormone receptors in human ovarian follicles and corpora lutea during the menstrual cycle

    International Nuclear Information System (INIS)

    Yamoto, M.; Nakano, R.; Iwasaki, M.; Ikoma, H.; Furukawa, K.

    1986-01-01

    The binding of ¹²⁵I-labeled human luteinizing hormone (hLH) to the 2000-g fraction of human ovarian follicles and corpora lutea during the entire menstrual cycle was examined. Specific high affinity, low capacity receptors for hLH were demonstrated in the 2000-g fraction of both follicles and corpora lutea. Specific binding of ¹²⁵I-labeled hLH to follicular tissue increased from the early follicular phase to the ovulatory phase. Specific binding of ¹²⁵I-labeled hLH to luteal tissue increased from the early luteal phase to the midluteal phase and decreased towards the late luteal phase. The results of the present study indicate that the increase and decrease in receptors for hLH during the menstrual cycle might play an important role in the regulation of the ovarian cycle.

  6. Luteinizing hormone receptors in human ovarian follicles and corpora lutea during the menstrual cycle

    Energy Technology Data Exchange (ETDEWEB)

    Yamoto, M.; Nakano, R.; Iwasaki, M.; Ikoma, H.; Furukawa, K.

    1986-08-01

    The binding of ¹²⁵I-labeled human luteinizing hormone (hLH) to the 2000-g fraction of human ovarian follicles and corpora lutea during the entire menstrual cycle was examined. Specific high affinity, low capacity receptors for hLH were demonstrated in the 2000-g fraction of both follicles and corpora lutea. Specific binding of ¹²⁵I-labeled hLH to follicular tissue increased from the early follicular phase to the ovulatory phase. Specific binding of ¹²⁵I-labeled hLH to luteal tissue increased from the early luteal phase to the midluteal phase and decreased towards the late luteal phase. The results of the present study indicate that the increase and decrease in receptors for hLH during the menstrual cycle might play an important role in the regulation of the ovarian cycle.

  7. Dynamics of extracellular matrix in ovarian follicles and corpora lutea of mice

    DEFF Research Database (Denmark)

    Irving-Rodgers, Helen F; Hummitzsch, Katja; Murdiyarso, Lydia S

    2009-01-01

    Despite the mouse being an important laboratory species, little is known about changes in its extracellular matrix (ECM) during follicle and corpora lutea formation and regression. Follicle development was induced in mice (29 days of age/experimental day 0) by injections of pregnant mare's serum...... and antral follicles. The focimatrix, a specialised matrix of the membrana granulosa, contained collagen type IV alpha1 and alpha2, laminin alpha1, beta1 and gamma1 chains, nidogens 1 and 2, perlecan and collagen type XVIII. In the corpora lutea, staining was restricted to capillary sub-endothelial basal...... gonadotrophin on days 0 and 1 and ovulation was induced by injection of human chorionic gonadotrophin on day 2. Ovaries were collected for immunohistochemistry (n=10 per group) on days 0, 2 and 5. Another group was mated and ovaries were examined on day 11 (n=7). Collagen type IV alpha1 and alpha2, laminin...

  8. Text collections for evaluation of Russian morphological taggers

    Directory of Open Access Journals (Sweden)

    Lyashevskaya Olga

    2017-12-01

    Full Text Available The paper describes the preparation and development of the text collections within the framework of the MorphoRuEval-2017 shared task, an evaluation campaign designed to stimulate development of automatic morphological processing technologies for Russian. The main challenge for the organizers was to standardize all available Russian corpora with manually verified high-quality tagging to a single format (Universal Dependencies CoNLL-U). The sources of the data were the disambiguated subcorpus of the Russian National Corpus, SynTagRus, OpenCorpora.org data and the GICR corpus with resolved homonymy, all exhibiting different tagsets, rules for lemmatization, pipeline architecture, technical solutions and error systematicity. The collections include both normative texts (news and modern literature) and more informal discourse (social media and spoken data); the texts are available under the CC BY-NC-SA 3.0 license.

  9. Folklore Studies on Birth Related Customs within the Banat Community

    Directory of Open Access Journals (Sweden)

    Alexin Otilia Daniela

    2017-12-01

    Full Text Available Birth is perceived as a threshold, a milestone, and is best described as passing from one stage to another and from one status to another. This article aims to present the customs regarding the birth of a child, as they were preserved in the Banat folk mentality: the origin of the midwife and her role as mediator, the belief in the unfailing destiny foreseen by the book of fate, the rite of the first bath having a huge importance for the future of the child and a series of magic and religious acts meant to ward off the Evil forces that intend to harm the child and to restore the balance.

  10. Elements of characterology in folklore music of Dinaric area

    Directory of Open Access Journals (Sweden)

    Kenjalović Milorad

    2012-01-01

    Full Text Available The Dinaric type of man, with all its anthropological, genetic and psychological characteristics, presents an orthodox example of patriarchal upbringing and tradition. Regardless of their patriarchalism and apparent insensitivity to other people, in almost every element of their intellectual work (music, dance, sayings, etc.) there appears the fleshly and instinctive, which had to be satisfied regardless of all bans and restraints; the message doubtless confirms that Dinaric man did live in accordance with his instincts, while at the same time having to respect the criteria of patriarchal morality. In this work the authors cite several songs from this area and analyze them from the perspective of psychology and characterology, finding the elements of love, joy and sorrow, cure, passion, women's shyness, etc.

  11. Induction of canine deciduoma in some reproductive stages with the different condition of corpora lutea.

    Science.gov (United States)

    Nomura, K

    1997-03-01

    Bitches were examined to see whether canine deciduoma could be induced at some reproductive stages with the different conditions of corpora lutea by inserting a silk suture into the uterine lumen. The bitches stimulated in the early and middle stages of diestrus or in unilateral pregnancy corresponding to these diestrous stages formed deciduoma at a high induction rate, however, no difference in the strength of decidual reaction between the pregnant and diestrous stages was recognized. On the other hand, no reaction could be seen in bitches in late diestrus, the late stage of unilateral pregnancy or the post partum repair phase in which stromal decidual cells similar to those of the rodentia can be seen. In already implanted uteri, however, no deciduoma was formed in the interplacental areas. Even though the corpora lutea were functional, new additional stimulations were not accepted at the interplacental area in which the uterine horn had already been influenced by fertilized ova. From these results, it was suggested that in the dog as well as the rodentia, the endometrium has to be under the influence of functional corpora lutea in order to form deciduoma.

  12. Deleterious effects of progestagen treatment in VEGF expression in corpora lutea of pregnant ewes.

    Science.gov (United States)

    Letelier, C A; Sanchez, M A; Garcia-Fernandez, R A; Sanchez, B; Garcia-Palencia, P; Gonzalez-Bulnes, A; Flores, J M

    2011-06-01

    The aim of the current study was to determine the possible effects of progestagen oestrous synchronization on vascular endothelial growth factor (VEGF) expression during sheep luteogenesis and the peri-implantation period and the relationship with luteal function. At days 9, 11, 13, 15, 17 and 21 of pregnancy, the ovaries from 30 progestagen treated and 30 ewes cycling after cloprostenol injection were evaluated by ultrasonography and, thereafter, collected and processed for immunohistochemical evaluation of VEGF; blood samples were drawn for evaluating plasma progesterone. The progestagen-treated group showed smaller corpora lutea than cloprostenol-treated and lower progesterone secretion. The expression of VEGF in the luteal cells increased with time in the cloprostenol group, but not in the progestagen-treated group, which even showed a decrease between days 11 and 13. In progestagen-treated sheep, VEGF expression in granulosa-derived parenchymal lobule capillaries was correlated with the size of the luteal tissue, larger corpora lutea had higher expression, and tended to have a higher progesterone secretion. In conclusion, the current study indicates the existence of deleterious effects from exogenous progestagen treatments on progesterone secretion from induced corpora lutea, which correlate with alterations in the expression of VEGF in the luteal tissue and, this, presumably in the processes of neoangiogenesis and luteogenesis. © 2010 Blackwell Verlag GmbH.

  13. [Single and combining effects of Calculus Bovis and zolpidem on inhibitive neurotransmitter of rat striatum corpora].

    Science.gov (United States)

    Liu, Ping; He, Xinrong; Guo, Mei

    2010-04-01

    To investigate the correlation effects between single or combined administration of Calculus Bovis or zolpidem and changes of inhibitive neurotransmitter in rat striatum corpora. Sampling from rat striatum corpora was carried out through microdialysis. The content of two inhibitive neurotransmitters in rat corpus striatum, glycine (Gly) and gamma-aminobutyric acid (GABA), was determined by HPLC, which involved pre-column derivatization with o-phthalaldehyde, reversed-phase gradient elution and fluorescence detection. GABA content of rat striatum corpora in Calculus Bovis group was significantly increased compared with saline group (P Calculus Bovis plus zolpidem group were increased largely compared with saline group as well (P Calculus Bovis group was higher than combination group (P Calculus Bovis or zolpidem group was markedly increased compared with saline group or combination group (P Calculus Bovis group, zolpidem group and combination group. The magnitude of increase was lower in combination group than in Calculus Bovis group and zolpidem group, suggesting that the encephalon inhibition promoted by Calculus Bovis is more powerful than that of zolpidem. The increase in two inhibitive neurotransmitters did not show reinforcing effect in combination group, suggesting that Calculus Bovis and zolpidem may compete for the same receptors. Therefore, combination of Calculus Bovis-containing drugs and zolpidem has no clinical significance. Calculus Bovis should not be used as an aperture-opening drug for resuscitation therapy.

  14. I La Galigo Folklore Illustration on Textile Media

    Directory of Open Access Journals (Sweden)

    Yosepin Sri Ningsih

    2014-01-01

    Full Text Available This project was an effort in conserving the I La Galigo epic story while at the same time adding value to silk, the famous textile product from South Sulawesi and the origin of I La Galigo. As a work of literature, I La Galigo is categorized as an epic. It is known locally as Sureq Galigo in Bugis. It is divided in a number of episodes or tereng. The most well-known tereng is the one which describes the relationship between Sawerigading and a princess called I We Cudai. From that relationship I La Galigo, the central character of this epic is born. For this project six of the most well-known episodes were selected because of the amount of available supporting data, both theoretical and visual. The selected episodes were translated from their original narrative form into visual language or images. The illustration technique used in this project was STP (Space Time Plane. With this technique every object is drawn from varying viewpoints in one frame, both in space and time. Hand embroidery was added to the painted images. The silk painting can be used as an interior element with value added by the I La Galigo illustrations. Keywords: I La Galigo; epic; Sulawesi; illustration; silk painting; space time plane.

  15. Folklore in bureaucracy code: Running a music event

    Directory of Open Access Journals (Sweden)

    Krstanović-Lukić Miroslava

    2004-01-01

    Full Text Available A music folk-created piece of work is a construction expressed as a paradigm part of a set in the bureaucracy system and the public arena. Such a work is a mechanical concept, which defines inheritance as a construction of authenticity saturated with elements of folk, national culture. It is also a subject of certain conventions in the system of regulations; namely, it is a part of the administrative code. The usage of the folk created work as a paradigm and legislations is realized through an organizational apparatus that is, it becomes entertainment, a spectacle. This paper analyzes the functioning of the organizational machinery of a folk spectacle, starting with the government authorities, local self-management and the spectacle's administrative committees. To illustrate this phenomenon, the paper presents the development of a trumpet playing festival in Dragačevo. This particular festival establishes a cultural, economic and political order with a clear and defined division of power. The analysis shows that the folk event in question, through its programs and activities, represents a scene and arena of individual and group interests. Organizational interactions are recognized in binary oppositions: sovereignty/dependency official/unofficial, dominancy/ subordination, innovative/inherited common/different, needed/useful, original/copy, one's own/belonging to someone else.

  16. Cognitive “Boy stories”: urban folklore and urban topographies

    Directory of Open Access Journals (Sweden)

    Bojan Žikić

    2016-02-01

    Full Text Available The culturally cognitive perception of Belgrade’s topographies is considered through its deployment, symbolic use and narrative foundation. The explanatory material considered comprises one football-media incident, the use of certain areas of the city in a spectacle-ceremonial manner, knowledge and lore of certain elements of the Belgrade topographies, and the organization of “the football Belgrade”. The attitude is taken that the topography of a city is a multifaceted cultural constituent, whose structure of particular meaning, as a part of cultural communication, is determined by the very fact that it is an urban space. Physical aspects of spatial-ness are reduced to relationism, i.e. it has a meaning for the cultural communication only when the elements of urban topographies are brought into correlation. Other characteristics of physical spatial-ness are irrelevant for such communication. Meaning relations in which elements of urban topographies exist are formed on the very fact of them being urban, that is, the aforementioned denotation that is ascribed to space stems from those cultural features and artifacts that are associated in a given milieu with certain concrete elements of urban topographies.

  17. Polish Phoneme Statistics Obtained On Large Set Of Written Texts

    Directory of Open Access Journals (Sweden)

    Bartosz Ziółko

    2009-01-01

    Full Text Available The phonetical statistics were collected from several Polish corpora. The paper is a summary of the data which are phoneme n-grams and some phenomena in the statistics. Triphone statistics apply context-dependent speech units which have an important role in speech recognition systems and were never calculated for a large set of Polish written texts. The standard phonetic alphabet for Polish, SAMPA, and methods of providing phonetic transcriptions are described.
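
    Phoneme n-gram statistics of the kind summarised here can be sketched as a sliding-window count over a phonetic transcription. The sketch below is only an illustration: the function name `phoneme_ngrams`, the use of `#` as a word-boundary symbol and the toy SAMPA-like transcription are assumptions, not details from the paper.

```python
from collections import Counter

def phoneme_ngrams(phonemes, n):
    """Count n-grams over a sequence of phoneme symbols."""
    return Counter(
        tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)
    )

# Hypothetical SAMPA-like transcription; '#' marks a word boundary
# (a convention assumed for this sketch).
transcription = ["m", "a", "m", "a", "#", "m", "a"]

triphones = phoneme_ngrams(transcription, 3)
total = sum(triphones.values())
for gram, count in triphones.most_common():
    # Relative frequency of each triphone in the toy transcription.
    print(" ".join(gram), count, count / total)
```

    Setting `n` to 1 or 2 yields phoneme and diphone counts from the same function.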

  18. Exogenous estradiol enhances apoptosis in regressing post-partum rat corpora lutea possibly mediated by prolactin

    Directory of Open Access Journals (Sweden)

    Telleria Carlos M

    2005-08-01

    Full Text Available Background: In pregnant rats, structural luteal regression takes place after parturition and is associated with cell death by apoptosis. We have recently shown that the hormonal environment is responsible for the fate of the corpora lutea (CL). Changing the levels of circulating hormones in post-partum rats, either by injecting androgen, progesterone, or by allowing dams to suckle, was coupled with a delay in the onset of apoptosis in the CL. The objectives of the present investigation were: i) to examine the effect of exogenous estradiol on apoptosis of the rat CL during post-partum luteal regression; and ii) to evaluate the post-partum luteal expression of the estrogen receptor (ER) genes. Methods: In a first experiment, rats after parturition were separated from their pups and injected daily with vehicle or estradiol benzoate for 4 days. On day 4 post-partum, animals were sacrificed, blood samples were taken to determine serum concentrations of hormones, and the ovaries were isolated to study apoptosis in situ. In a second experiment, non-lactating rats after parturition received vehicle, estradiol benzoate or estradiol benzoate plus bromoergocryptine for 4 days, and their CL were isolated and used to study apoptosis ex vivo. In a third experiment, we obtained CL from rats on day 15 of pregnancy and from non-lactating rats on day 4 post-partum, and studied the expression of the messenger RNAs (mRNAs) encoding the ERalpha and ERbeta genes. Results: Exogenous administration of estradiol benzoate induced an increase in the number of apoptotic cells within the CL on day 4 post-partum when compared with animals receiving vehicle alone. Animals treated with the estrogen had higher serum prolactin and progesterone concentrations, with no changes in serum androstenedione. Administration of bromoergocryptine blocked the increase in serum prolactin and progesterone concentrations, and DNA fragmentation induced by the estrogen treatment. ERalpha and

  19. Level set segmentation of bovine corpora lutea in ex situ ovarian ultrasound images

    Directory of Open Access Journals (Sweden)

    Adams Gregg P

    2008-08-01

    Full Text Available Background: The objective of this study was to investigate the viability of level set image segmentation methods for the detection of corpora lutea (corpus luteum, CL) boundaries in ultrasonographic ovarian images. It was hypothesized that bovine CL boundaries could be located within 1–2 mm by a level set image segmentation methodology. Methods: Level set methods embed a 2D contour in a 3D surface and evolve that surface over time according to an image-dependent speed function. A speed function suitable for segmentation of CLs in ovarian ultrasound images was developed. An initial contour was manually placed and contour evolution was allowed to proceed until the rate of change of the area was sufficiently small. The method was tested on ovarian ultrasonographic images (n = 8) obtained ex situ. An expert in ovarian ultrasound interpretation delineated CL boundaries manually to serve as a "ground truth". Accuracy of the level set segmentation algorithm was determined by comparing semi-automatically determined contours with ground truth contours using the mean absolute difference (MAD), root mean squared difference (RMSD), Hausdorff distance (HD), sensitivity, and specificity metrics. Results and discussion: The mean MAD was 0.87 mm (sigma = 0.36 mm), RMSD was 1.1 mm (sigma = 0.47 mm), and HD was 3.4 mm (sigma = 2.0 mm), indicating that, on average, boundaries were accurate within 1–2 mm; however, deviations in excess of 3 mm from the ground truth were observed, indicating under- or over-expansion of the contour. Mean sensitivity and specificity were 0.814 (sigma = 0.171) and 0.990 (sigma = 0.00786), respectively, indicating that CLs were consistently undersegmented but rarely did the contour interior include pixels that were judged by the human expert not to be part of the CL. It was observed that in localities where gradient magnitudes within the CL were strong due to high contrast speckle, contour expansion stopped too early. Conclusion: The
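
    The accuracy measures named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: contours are taken to be lists of (x, y) points in millimetres, MAD and RMSD are computed from each test point's distance to the nearest ground-truth point, and the function names are hypothetical. The paper's exact definitions may differ.

```python
import math

def nearest_distances(contour_a, contour_b):
    """For each point of contour_a, the distance to the closest point of contour_b."""
    return [min(math.dist(p, q) for q in contour_b) for p in contour_a]

def contour_metrics(test, truth):
    """Return (MAD, RMSD, Hausdorff distance) between two point contours."""
    forward = nearest_distances(test, truth)
    backward = nearest_distances(truth, test)
    mad = sum(forward) / len(forward)
    rmsd = math.sqrt(sum(d * d for d in forward) / len(forward))
    # The symmetric Hausdorff distance takes the worst case in both directions.
    hausdorff = max(max(forward), max(backward))
    return mad, rmsd, hausdorff

def sensitivity_specificity(test_mask, truth_mask):
    """Masks are equal-length sequences of 0/1 pixel labels (1 = inside the CL)."""
    pairs = list(zip(test_mask, truth_mask))
    tp = sum(1 for t, g in pairs if t and g)
    tn = sum(1 for t, g in pairs if not t and not g)
    fn = sum(1 for t, g in pairs if not t and g)
    fp = sum(1 for t, g in pairs if t and not g)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data, invented for illustration.
print(contour_metrics([(0, 0), (1, 0)], [(0, 1), (1, 1)]))
print(sensitivity_specificity([1, 0, 0, 0], [1, 1, 0, 0]))
```

    In the toy mask example the test region covers only part of the true region and nothing outside it, which is the undersegmentation pattern the abstract describes: sensitivity below 1 with specificity close to 1.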

  20. Benchmarking infrastructure for mutation text mining.

    Science.gov (United States)

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
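
    The record says performance metrics are computed with SPARQL over RDF annotations but does not name the metrics. As a hedged illustration only, the sketch below computes precision, recall and F1 in Python from two sets of (document, mutation) pairs; the data layout and the function name are assumptions, not the infrastructure's actual schema.

```python
def precision_recall_f1(gold, predicted):
    """Compare system annotations against manually curated (gold) annotations.

    Both arguments are iterables of hashable annotation keys, for example
    (document_id, mutation_string) tuples. This layout is hypothetical.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy annotations, invented for illustration
gold = {("doc1", "A123T"), ("doc1", "G45D"), ("doc2", "R7K")}
predicted = {("doc1", "A123T"), ("doc2", "R7K"), ("doc2", "L9P"), ("doc3", "Q1H")}
print(precision_recall_f1(gold, predicted))
```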

  1. Benchmarking infrastructure for mutation text mining

    Science.gov (United States)

    2014-01-01

    Background: Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results: We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion: We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  2. Folklore and traditional ecological knowledge of geckos in Southern Portugal: implications for conservation and science

    Science.gov (United States)

    2011-01-01

    Traditional Ecological Knowledge (TEK) and folklore are repositories of large amounts of information about the natural world. Ideas, perceptions and empirical data held by human communities regarding local species are important sources which enable new scientific discoveries to be made, as well as offering the potential to solve a number of conservation problems. We documented the gecko-related folklore and TEK of the people of southern Portugal, with the particular aim of understanding the main ideas relating to gecko biology and ecology. Our results suggest that local knowledge of gecko ecology and biology is both accurate and relevant. As a result of information provided by local inhabitants, knowledge of the current geographic distribution of Hemidactylus turcicus was expanded, with its presence reported in nine new locations. It was also discovered that locals still have some misconceptions of geckos as poisonous and carriers of dermatological diseases. The presence of these ideas has led the population to a fear of and aversion to geckos, resulting in direct persecution being one of the major conservation problems facing these animals. It is essential, from both a scientific and conservationist perspective, to understand the knowledge and perceptions that people have towards the animals, since, only then, may hitherto unrecognized pertinent information and conservation problems be detected and resolved. PMID:21892925

  3. The friends that game together: A folkloric expansion of textual poaching to genre farming for socialization in tabletop role-playing games

    Directory of Open Access Journals (Sweden)

    Michael Robert Underwood

    2009-03-01

    Full Text Available Tabletop role-playing games (RPGs) are a folkloric form for creating and reaffirming community bonds and performing identity. Gaming is used to communicate and perform cultural capital and identity through fictional narratives, functioning as a form of community building and/or personal expression. With quotations from ethnographic research over the course of 2 years, including interviews with several groups of gamers and participant observation, I examine the ways that players create and affirm social bonds. I return to Michel De Certeau's idea of textual poaching, as adapted by Henry Jenkins, to contrast with it a new concept of genre farming. As both platform for and object of genre farming, RPGs allow players to display cultural competence, create and reaffirm social ties, and seek entertainment in a collaborative fashion.

  4. GOSPEL TEXT IN SCIENCE FICTION NOVELETTES BY V. P. KRAPIVIN (THE CYCLE "IN THE HEART OF THE GREAT CRYSTAL")

    Directory of Open Access Journals (Sweden)

    Velikanova E. A.

    2011-11-01

    Full Text Available The article analyses Gospel motifs and images in the cycle of science fiction stories "In the Heart of the Great Crystal" by Vladislav Krapivin. References to the Gospel text, together with connections to folklore and literary elements, shape the modern moral content of the writer's books, which are addressed to teenage readers.

  5. Contribuições das teorias institucionais para o estudo de subsidiárias de corporações multinacionais

    Directory of Open Access Journals (Sweden)

    Takeyoshi Imasato

    Full Text Available This essay first highlights the contributions of Organizational Studies to the understanding of multinational corporations. Because of their capacity to influence other actors at the local, national, regional, international and transnational levels, multinationals challenge the traditional organizational-studies approaches followed by researchers in International Management. The essay then explores the possibilities and limits of institutional theory approaches for understanding the subsidiaries of multinational corporations. This theoretical framework can assist in the study both of these companies and of the nature of the differences between institutions in the various countries of operation, since it allows multiple institutional contexts to be analysed simultaneously. As a result, the essay contributes to the theoretical development of the interfaces between Organizational Studies and International Management, particularly with regard to research that emphasizes the strategic role of subsidiaries.

  6. HUBUNGAN ANTARA STATUS GIZI DAN TINGKAT KEBUGARAN JASMANI DENGAN PRODUKTIVITAS KERJA PADA TENAGA KERJA WANITA UNIT SPINNING 1 BAGIAN WINDING PT. APAC INTI CORPORA BAWEN

    Directory of Open Access Journals (Sweden)

    Sri Rahayu Utami

    2014-10-01

    Full Text Available The purpose of this research was to determine the relationship between nutritional status and physical fitness level, on the one hand, and the work productivity of female workers in the Winding section of Spinning Unit 1 at PT. Apac Inti Corpora, Bawen, on the other. This was explanatory research with a cross-sectional approach. The population comprised 73 workers, from which a sample of 45 was drawn by simple random sampling. The instruments were a weight scale and height measure, a Harvard bench, a metronome, a stopwatch and a productivity data sheet. Data were analysed using the Chi-Square test with α = 0.05. The results showed a relationship between work productivity and both nutritional status (p = 0.005) and physical fitness level (p = 0.001). On the basis of this research, workers are encouraged to eat nutritionally balanced food and to exercise to improve their physical fitness.
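
    The Chi-Square test of independence mentioned in this record can be sketched as follows. The 2x2 counts are invented purely for illustration and are not the study's data; the critical value 3.841 applies to one degree of freedom at α = 0.05.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows = nutritional status (normal, not normal),
# columns = productivity (high, low). NOT taken from the study.
table = [[20, 10], [5, 15]]
CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

stat = chi_square_statistic(table)
print(stat, stat > CRITICAL_DF1_ALPHA_05)
```

    With very small expected cell counts, Fisher's exact test is often preferred; the abstract does not say whether that check was needed.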

  7. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  8. Poder e identidade grupal: um estudo em corporações musicais da região das vertentes

    Directory of Open Access Journals (Sweden)

    Marcos Vieira-Silva

    2013-01-01

    Full Text Available This research sought to understand the historical constitution of identity formations and their links with power relations in the everyday activities of three musical corporations in Minas Gerais. The identity process of the musicians was found to be permeated by the prestige and value that the musical tradition carries in the region. Differences in the production of individual and collective identities can influence inter- and intragroup power relations. Likewise, the various ways in which power relations are established among the agents influence the development of the group process and the musical activity. This activity legitimizes both collective and individual identities, keeping the musical life of the Vertentes region alive and intense over time.

  9. Blending research methods: Qualitative and quantitative approaches to researching computer corpora for language learning.

    OpenAIRE

    Boulton, Alex

    2011-01-01

    This paper outlines how corpora (in printed, electronic or multi-modal form) can be used in language learning, an area often referred to as "data-driven learning" or DDL (Johns 1991). The alleged advantages are numerous, but are in need of empirical support which is frequently claimed to be lacking in the field. However, over 80 studies have so far attempted to evaluate some aspect of corpus use by non-native speakers (Boulton 2010): these are briefly reviewed as a who...

  10. Creazione e sviluppo di corpora multimediali. Nuove metodologie di ricerca nella traduzione audiovisiva

    OpenAIRE

    Valentini, Cristina

    2009-01-01

    The construction and use of multimedia corpora has been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation,...

  11. “Tá cuid de na mná blasta/Some Women Are Sweet Talkers”: Representations of Women in Seán Ó hEochaidh’s Field Diaries for the Irish Folklore Commission

    Directory of Open Access Journals (Sweden)

    Lillis Ó Laoire

    2017-10-01

    Full Text Available This article discusses representations of women in diaries written by Seán Ó hEochaidh as part of his work as a field collector for the Irish Folklore Commission (1935-1971). Focusing on a number of well-described events and characters, the article reveals the collector’s attitude to women as they emerge from his writing. It also shows how women could help or hinder his collecting work. The disparities of the lives of a number of working women from Donegal during the period are also highlighted.

  12. Biomechanically Preferred Consonant-Vowel Combinations Fail to Appear in Adult Spoken Corpora

    Science.gov (United States)

    Whalen, D. H.; Giulivi, Sara; Nam, Hosung; Levitt, Andrea G.; Hallé, Pierre; Goldstein, Louis M.

    2012-01-01

    Certain consonant/vowel (CV) combinations are more frequent than would be expected from the individual C and V frequencies alone, both in babbling and, to a lesser extent, in adult language, based on dictionary counts: Labial consonants co-occur with central vowels more often than chance would dictate; coronals co-occur with front vowels, and velars with back vowels (Davis & MacNeilage, 1994). Plausible biomechanical explanations have been proposed, but it is also possible that infants are mirroring the frequency of the CVs that they hear. As noted, previous assessments of adult language were based on dictionaries; these “type” counts are incommensurate with the babbling measures, which are necessarily “token” counts. We analyzed the tokens in two spoken corpora for English, two for French and one for Mandarin. We found that the adult spoken CV preferences correlated with the type counts for Mandarin and French, not for English. Correlations between the adult spoken corpora and the babbling results had all three possible outcomes: significantly positive (French), uncorrelated (Mandarin), and significantly negative (English). There were no correlations of the dictionary data with the babbling results when we consider all nine combinations of consonants and vowels. The results indicate that spoken frequencies of CV combinations can differ from dictionary (type) counts and that the CV preferences apparent in babbling are biomechanically driven and can ignore the frequencies of CVs in the ambient spoken language. PMID:23420980
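
    The comparison of observed CV frequencies with those expected from the individual C and V frequencies can be sketched as an observed-to-expected ratio. This is a minimal illustration that assumes each syllable token has already been classified into a (consonant class, vowel class) pair; the class labels and toy counts are hypothetical and not drawn from the study's corpora.

```python
from collections import Counter

def observed_expected_ratios(cv_pairs):
    """Observed/expected ratio for each (consonant class, vowel class) pair.

    Expected count assumes independence: count(C) * count(V) / N.
    Passing every occurrence gives "token" ratios; passing pairs taken from
    a de-duplicated word list would give dictionary-style "type" ratios.
    """
    n = len(cv_pairs)
    pair_counts = Counter(cv_pairs)
    c_counts = Counter(c for c, _ in cv_pairs)
    v_counts = Counter(v for _, v in cv_pairs)
    return {
        (c, v): count / (c_counts[c] * v_counts[v] / n)
        for (c, v), count in pair_counts.items()
    }

# Toy token list, invented for illustration
tokens = (
    [("labial", "central")] * 3 + [("labial", "front")]
    + [("coronal", "front")] * 3 + [("coronal", "central")]
)
ratios = observed_expected_ratios(tokens)
print(ratios[("labial", "central")])  # a ratio above 1 means more frequent than chance
```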

  13. Pathway computation in models derived from bio-science text sources

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker

    2017-01-01

    This paper outlines a system, OntoScape, serving to accomplish complex inference tasks on knowledge bases and bio-models derived from life-science text corpora. The system applies so-called natural logic, a form of logic which is readable for humans. This logic affords ontological representations...

  14. (Text) Mining the LANDscape: Themes and Trends over 40 years of Landscape and Urban Planning

    Science.gov (United States)

    Paul H. Gobster

    2014-01-01

    In commemoration of the journal's 40th anniversary, the co-editor explores themes and trends covered by Landscape and Urban Planning and its parent journals through a qualitative comparison of co-occurrence term maps generated from the text corpora of its abstracts across the four decadal periods of publication. Cluster maps generated from the...

  15. IDEOLOGICAL APPROACHES OF FOLKLORE STUDIES IN KYRGYZSTAN ON THE SOVIET UNION PERIOD: ERSOLTONOY EPIC EXAMPLE SOVYETLER BİRLİĞİ DÖNEMİNDE KIRGIZİSTAN’DA FOLKLOR ÇALIŞMALARINDA İDEOLOJİK YAKLAŞIMLAR: ER SOLTONOY DESTANI ÖRNEĞİ

    Directory of Open Access Journals (Sweden)

    Mehmet ÇERİBAŞ

    2012-01-01

    Full Text Available Folklore, which emerged in the 19th century with the Romantic movement as a tool of nationalism, acted as a shield against divisive outside movements in countries that had not been able to achieve political unity. Political movements based on a single-party system and lacking freedom of expression, such as Socialism, Nazism and Communism, wanted to take advantage of every communication channel for propaganda purposes, and they assigned important tasks to folklore products, which were regarded as a means of communication and interaction. One of these tasks was to understand the people's value judgements and to develop policies based on them; another was to ensure harmony between the regime and the people, more precisely by shaping the people to the regime's purposes. Socialism, one of the movements that used folklore for ideological purposes, drew on folklore to make the peoples of occupied countries compliant with its imperialist aims. The epic genre, adorned with romantic and nationalist elements and used in wartime by the Turkic peoples, among whom oral culture is dominant, to strengthen national feeling, was made in ordinary times into a mouthpiece of the proletariat. The article examines how this kind of work was tried on the Kyrgyz Turks, nomadic horsemen with a strong interest in the epic genre, taking the Kyrgyz epic Er Soltonoy as its example.

  16. Language and folklore in Hamid Mosaddeq’s poem

    Directory of Open Access Journals (Sweden)

    نداسادات IRAN

    2016-01-01

    Full Text Available "Standard language", "sub-standard language" and "meta-standard language" are among the varieties of language. The use of sub-standard language in poetry, known as "stylistic deviation", is one of the ways of foregrounding poetic language. In the contemporary period, Nima paid particular attention to this technique. Nima believed that all words have the potential to enter the realm of poetry: no word is essentially poetic or non-poetic, but the way the poet uses a word determines its poetic value. By using sub-standard language elements, Hamid Mosaddeq not only increased the richness of his poems but also brought them closer to the mind, language and life of the people. The folkloric elements of Mosaddeq's poems were divided into seven groups: (1) slang words, (2) common and spoken vocabulary, (3) irony and proverbs, (4) popular pronunciations, (5) allusions to folk tales, (6) folk beliefs and customs, and (7) local vocabulary. Slang words in Mosaddeq's poems have been examined in the categories of "verb" and "noun". Many folk verbs, such as "Shangidan" and "gap zadan (to chat)", are used in Mosaddeq's poems. Some of the folk verbs in his poems are such that the point is not understood at first. These verbs have several meanings, of which one or more specific meanings are slang; for example, the verb "gereftan (to get)" has a slang sense when it means "to grow the root of the plant". Folk nouns are used abundantly in Mosaddeq's poems. Some of the nouns used, considering their figurative meanings, can be placed in the folk nouns group, like "foot" in the figurative sense of "will". Colloquial and current words are among the most frequent folk elements in the poetry of Mosaddeq. These words can be analysed in the categories of "nouns" and "verbs". Lexical verbs such as "to hip" and "Perfume of Moskow" are of this kind. "Irony and Proverbs" are the other folk elements of the poetry of Mosaddeq

  17. WARCProcessor: An Integrative Tool for Building and Management of Web Spam Corpora.

    Science.gov (United States)

    Callón, Miguel; Fdez-Glez, Jorge; Ruano-Ordás, David; Laza, Rosalía; Pavón, Reyes; Fdez-Riverola, Florentino; Méndez, Jose Ramón

    2017-12-22

    In this work we present the design and implementation of WARCProcessor, a novel multiplatform integrative tool aimed at building scientific datasets to facilitate experimentation in web spam research. The developed application allows the user to specify multiple criteria that change the way in which new corpora are generated whilst reducing the number of repetitive and error-prone tasks related to existing corpus maintenance. For this goal, WARCProcessor supports up to six commonly used data sources for web spam research, and can store the output corpus in standard WARC format together with complementary metadata files. Additionally, the application facilitates the automatic and concurrent download of web sites from the Internet, making it possible to configure the depth of the links to be followed as well as the behaviour when redirected URLs appear. WARCProcessor provides both an interactive GUI and a command line utility that can be run in the background.

  18. Conservation Implications of the Prevalence and Representation of Locally Extinct Mammals in the Folklore of Native Americans

    OpenAIRE

    Preston Matthew; Harcourt Alexander

    2009-01-01

    Many rationales for wildlife conservation have been suggested. One rationale not often mentioned is the impact of extinctions on the traditions of local people, and conservationists' subsequent need to strongly consider culturally based reasons for conservation. As a first step in strengthening the case for this rationale, we quantitatively examined the presence and representation of eight potentially extinct mammals in folklore of 48 Native American tribes that live/lived near to 11 n...

  19. Building an ontology of pulmonary diseases with natural language processing tools using textual corpora.

    Science.gov (United States)

    Baneyx, Audrey; Charlet, Jean; Jaulent, Marie-Christine

    2007-01-01

    Pathologies and acts are classified in thesauri to help physicians to code their activity. In practice, the use of thesauri is not sufficient to reduce variability in coding and thesauri are not suitable for computer processing. We think the automation of the coding task requires a conceptual modeling of medical items: an ontology. Our task is to help lung specialists code acts and diagnoses with software that represents medical knowledge of this concerned specialty by an ontology. The objective of the reported work was to build an ontology of pulmonary diseases dedicated to the coding process. To carry out this objective, we develop a precise methodological process for the knowledge engineer in order to build various types of medical ontologies. This process is based on the need to express precisely in natural language the meaning of each concept using differential semantics principles. A differential ontology is a hierarchy of concepts and relationships organized according to their similarities and differences. Our main research hypothesis is to apply natural language processing tools to corpora to develop the resources needed to build the ontology. We consider two corpora, one composed of patient discharge summaries and the other being a teaching book. We propose to combine two approaches to enrich the ontology building: (i) a method which consists of building terminological resources through distributional analysis and (ii) a method based on the observation of corpus sequences in order to reveal semantic relationships. Our ontology currently includes 1550 concepts and the software implementing the coding process is still under development. Results show that the proposed approach is operational and indicates that the combination of these methods and the comparison of the resulting terminological structures give interesting clues to a knowledge engineer for the building of an ontology.

  20. PENENTUAN FAKTOR DAN TARAF FAKTOR DALAM PENGENDALIAN KUALITAS PRODUKSI BENANG PCM DI PT APAC INTI CORPORA DENGAN METODE DESAIN EKSPERIMEN

    Directory of Open Access Journals (Sweden)

    Darminto Pujotomo

    2012-02-01

    Full Text Available PT. APAC Inti Corpora is one of the largest textile companies in Southeast Asia, and one of its products is PCM yarn, produced by the Spinning 4 department. The problem is that defective final products exceed the company's target of 0.8% of total production, while the company is required to keep defective output to a minimum. This problem arises because many defects still occur in PCM yarn, dominated by crossing defects (24.67%), ring cone defects (21.98%), tailless defects (16.02%) and contamination (12.50%). This study aims to assess the process and, if the process is indeed found to be out of control, to identify and analyse the factors that significantly influence the occurrence of crossing defects in PCM yarn. The method used to assess the operating process is statistical process control, while the method used to analyse the factors that influence PCM yarn defects is factorial experimental design. The control charts and the process capability determination show that the operating process is out of control because it produces a fairly large number of defective products. The factors studied are yarn size, machine age and machine speed, each with two levels. The yarn size levels are thin and thick. The machine age levels are old machines and new machines. The machine speed levels are 900 MPM and 1000 MPM. Based on the analysis of variance (ANOVA) and hypothesis testing, the factors that significantly cause crossing defects are yarn size and machine age. Keywords: crossing defects, quality control, ANOVA

  1. Opinion Mining in Latvian Text Using Semantic Polarity Analysis and Machine Learning Approach

    Directory of Open Access Journals (Sweden)

    Gatis Špats

    2016-07-01

    Full Text Available In this paper we demonstrate approaches for opinion mining in Latvian text. We have applied, combined and extended the results of several previous studies and public resources to perform opinion mining in Latvian text using two approaches, namely, semantic polarity analysis and machine learning. One of the most significant constraints that makes opinion mining for written content classification in Latvian text challenging is the limited availability of public text corpora for classifier training. We have joined several sources and created a publicly available extended lexicon. Our results are comparable to or outperform current achievements in opinion mining in Latvian. Experiments show that lexicon-based methods provide more accurate opinion mining than the application of a Naive Bayes machine learning classifier on Latvian tweets. Methods used during this study could be further extended using human annotators, unsupervised machine learning and bootstrapping to create larger corpora of classified text.
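
    A lexicon-based polarity classifier of the general kind this record describes can be sketched in a few lines. The two-word lexicon, the scoring rule and the function name are assumptions made for illustration; the paper's extended lexicon and its actual method are not reproduced here.

```python
def polarity(text, lexicon):
    """Classify text by summing per-word polarity scores from a lexicon.

    Words missing from the lexicon contribute 0. Tokenisation is a plain
    whitespace split with surrounding punctuation stripped, a simplification.
    """
    tokens = [tok.strip(".,!?:;\"'()") for tok in text.lower().split()]
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Toy lexicon (hypothetical): "labs" = good, "slikts" = bad
toy_lexicon = {"labs": 1, "slikts": -1}

print(polarity("Labs produkts!", toy_lexicon))
print(polarity("Slikts serviss.", toy_lexicon))
```

    A scorer like this needs no training data, which is one reason lexicon methods are attractive when labelled corpora are scarce.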

  2. The relevance of folkloric usage of plant galls as medicines: Finding the scientific rationale.

    Science.gov (United States)

    Patel, Seema; Rauf, Abdur; Khan, Haroon

    2018-01-01

    Galls, the abnormal growths in plants induced by viruses, bacteria, fungi, nematodes, arthropods, or even other plants, are akin to cancers in fauna. The galls, which occur in a myriad of forms, are phytochemically distinct from normal plant tissues, for they are sites of a tug-of-war, just like granulomas in animals. To counter the stressors, in the form of the effector proteins of the invaders, the host plants elaborate a large repertoire of metabolites that they would not normally produce. Perturbation of the jasmonic acid pathway and the overexpression of auxin and cytokinin promote tissue proliferation and the resultant galls. Though the plant family characteristics and the attackers determine the gall biochemistry, most galls are rich in bioactive phytochemicals such as phenolic acids, anthocyanins, purpurogallin, flavonoids, tannins, steroids, triterpenes, alkaloids and lipophilic components (tanshinone). Throughout the long trajectory of evolution, humans have learned to use galls as therapeutics, much like other plant parts. In diverse cultures, evidence of the folkloric usage of galls abounds. Among others, galls from plant genera such as Rhus, Pistacia, Quercus and Terminalia are popular as ethnomedicine. This review mines the literature on galling agents and the medicinal relevance of galls. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  3. "Old Oxen Cannot Plow": Stereotype Themes of Older Adults in Turkish Folklore.

    Science.gov (United States)

    Marcus, Justin; Sabuncu, Neslihan

    2016-12-01

    Although much research has established the nature of attitudes and stereotypes toward older adults, there are conflicting explanations for the root cause of ageism, including the sociocultural view and interpersonal views, that age bias against older adults is uniquely a product of modernity and occurs through social interactions, and the evolutionary view and intraindividual views, that age bias against older adults is rooted in our naturally occurring and individually held fear of death. We make initial investigations into resolving this conflict, by analyzing literature from a society predating the Industrial Revolution, the society of Ottoman Turks. Using Grounded Theory, we analyzed 1,555 Turkish fairy tales of the most well-known older adult in Turkish folklore, Nasreddin Hoca, for stereotype themes of older adults. Using the same method, we then analyzed 22,000+ Turkish sayings and proverbs for the same themes. Results indicated older adults to be viewed both positively and negatively. Positive stereotypes included wisdom, warmth, deserving of respect, and retirement. Negative stereotypes included incompetence, inadaptability, and frailty/nearing of death. Older females were viewed more negatively relative to older males. Results indicated views of older adults to parallel those found in contemporary research. Results have implications for the design of interventions to reduce ageism and on the cross-cultural generalizability of age-based stereotypes. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  4. Bollywood Movie Corpus for Text, Images and Videos

    OpenAIRE

    Madaan, Nishtha; Mehta, Sameep; Saxena, Mayank; Aggarwal, Aditi; Agrawaal, Taneea S; Malhotra, Vrinda

    2017-01-01

    In the past few years, several data-sets have been released for text and images. We present an approach to create a data-set for use in detecting and removing gender bias from text. We also include a set of challenges we faced while creating this corpus. In this work, we have worked with movie data from Wikipedia plots and movie trailers from YouTube. Our Bollywood Movie corpus contains 4000 movies extracted from Wikipedia and 880 trailers extracted from YouTube which were released from 1...

  5. Adapting computational text analysis to social science (and vice versa)

    Directory of Open Access Journals (Sweden)

    Paul DiMaggio

    2015-11-01

    Full Text Available Social scientists and computer scientists are divided by small differences in perspective and not by any significant disciplinary divide. In the field of text analysis, several such differences are noted: social scientists often use unsupervised models to explore corpora, whereas many computer scientists employ supervised models to train data; social scientists hold to more conventional causal notions than do most computer scientists, and often favor intense exploitation of existing algorithms, whereas computer scientists focus more on developing new models; and computer scientists tend to trust human judgment more than social scientists do. These differences have implications that potentially can improve the practice of social science.

  6. If only Derrida missed that flight... About the assessment of the "academic achievements" of the so-called "American Anthropology" by Belgrade Structural-semiotic School of Folklore

    Directory of Open Access Journals (Sweden)

    Miloš Milenković

    2016-02-01

    Full Text Available Taking into account recent critiques of "underdevelopment", "positivism", "methodological backwardness" and other failings attributed to so-called "American Anthropology" by some of the authors from the Belgrade Structural-semiotic School of Anthropology of Folklore, I analyse the context in which colleagues and students may be tempted to explain the common-sense political connection between polyphone ethnography, neo-romanticism and nationalism as a counter-intuitive history of the discipline. I have already pointed out that the important transformative differences in the attitudes towards structuralism between European anthropologists, especially the Belgrade Structural-semiotic School of Anthropology of Folklore, and so-called "American Anthropology" are the consequence of a pure coincidence: the fact that French structuralism and French poststructuralism were launched simultaneously on the American interdisciplinary intellectual scene ("Theory") at the same conference. This ironic concurrence would not be much more than one entertaining episode for students, historians of anthropology and historians of ideas, if there were no attempts (more and more frequent and increasingly fluently articulated) to compare different intellectual traditions as if they were elements of the same unilineal evolution of the discipline. The Belgrade Structural-semiotic School (hereafter SS), and especially its spiritus movens and most prominent representative, Prof. Kovačević, started in recent years to criticise some "American Anthropology", measuring its academic "achievement" (the author’s term) in comparative perspective and taking as an analytical unit uncritically generalized traditions marked with a single term of "postmodern anthropology" on the one hand, and "anthropology" on the other. The Belgrade SS School did develop a globally original, although badly promoted and never fully used, battery for the synchronic analysis of the folklore phenomena, but this was done only after

  7. Dynamic Penile Corpora Cavernosa Reconstruction Using Bilateral Innervated Gracilis Muscles: A Preclinical Investigation.

    Science.gov (United States)

    Yin, Zhuming; Liu, Liqiang; Xue, Bingjian; Fan, Jincai; Chen, Wenlin; Liu, Zheng

    2018-03-07

    investigation proves that corpora cavernosa reconstruction using bilateral innervated gracilis muscles is technically feasible and functionally efficacious. Yin Z, Liu L, Xue B, et al. Dynamic Penile Corpora Cavernosa Reconstruction Using Bilateral Innervated Gracilis Muscles: A Preclinical Investigation. Sex Med 2018;XX:XXX-XXX. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  8. The Application of Hermeneutical Analysis to Research on the Cold War in Soviet Animation Media Texts from the Second Half of the 1940s

    Science.gov (United States)

    Fedorov, A. V.

    2015-01-01

    The Cold War era, which spawned a mutual ideological confrontation between communist and capitalist countries, left its mark on all categories of media texts, including cartoons and animations. Cartoons were used by the authorities as tools for delivering the necessary confrontational ideological content in an attractive folkloric, fairy-tale…

  9. Transylvanianism, Nationalism, Folklore: The Academic Career of Olga Nagy in the Light of her Posthumous Book, Vallomások (2010)

    Directory of Open Access Journals (Sweden)

    Kata Zsófia Vincze

    2016-01-01

    Full Text Available The volume Vallomások [‘Testimony’], published posthumously in 2010, is the folklorist Olga Nagy’s (1921-2006) last book. In this paper I will analyze Nagy’s academic significance in the light of her own last self-reflection presented in Vallomások. This volume provides an exciting overview of the internal dynamics of East-Central European culture and interethnic relations. While I examine Nagy’s life work, especially her academic work on rural women and her new ideas regarding living folklore, I will also reflect on the ideology of so-called Transylvanianism that constitutes the framework of many Hungarian writings from Romania. Transylvanianism is a complex ideology rooted in the Hungarian national movement of the nineteenth century, one that later turned into a complex manifestation of the Hungarian minorities in Romania through literature, culture, politics and self-definition. Elaborated by writers, historians and journalists, Transylvanianism after 1918 (and even more vehemently after 1947) aimed to preserve and reinforce Hungarian national pride and identity in the region through cultural activities, education and political action.

  10. Effects of prostaglandin F2 alpha and a gonadotropin-releasing hormone agonist on inositol phospholipid metabolism in isolated rat corpora lutea of various ages

    International Nuclear Information System (INIS)

    Lahav, M.; West, L.A.; Davis, J.S.

    1988-01-01

    The sensitivity of rat corpora lutea to luteolytic agents increases with luteal age. We examined the effect of prostaglandin F2 alpha (PGF2 alpha) and [D-Ala6,Des-Gly10]GnRH ethylamide (GnRHa) on inositol phospholipid metabolism in day 2 and day 7 corpora lutea from PMSG-treated rats. Isolated corpora lutea were incubated with 32PO4 or [3H]inositol and were treated with LH, PGF2 alpha, or GnRHa. Phospholipids were purified by TLC, and the water-soluble products of phospholipase-C activity (inositol phosphates) were isolated by ion exchange chromatography. In day 2 corpora lutea, PGF2 alpha, (10 microM) and GnRHa (100 ng/ml) significantly increased 32PO4 incorporation into phosphatidic acid (PA) and phosphatidylinositol (PI), but not into other fractions. LH provoked slight increases in PA. Results were similar with 30 min of prelabeling or simultaneous addition of 32PO4 and stimulants. In other experiments, PGF2 alpha and GnRHa provoked rapid increases (1-5 min) in the accumulation of inositol mono-, bis-, and trisphosphates. LH did not significantly increase inositol phosphate accumulation, but stimulated cAMP accumulation in 2-day-old corpora lutea. Inositol phospholipid metabolism was increased in day 7 corpora lutea compared to that in day 2 corpora lutea. This increase was associated with increased incorporation of 32PO4 into PA and PI and increased accumulation of [3H]inositol phosphates. In day 7 corpora lutea, which are very sensitive to the luteolytic effect of PGF2 alpha, the PG-induced increase in PA labeling was small and inconsistent, whereas PI labeling was unaffected in 30-min incubations. GnRHa was without effect in such corpora lutea. LH, PGF2 alpha, or GnRHa did not increase inositol phosphate accumulation in 7-day-old corpora lutea. These studies demonstrate that the transformation of young (day 2) to mature (day 7) corpora lutea is associated with an increase in luteal inositol phospholipid metabolism

  11. FacetGist: Collective Extraction of Document Facets in Large Technical Corpora.

    Science.gov (United States)

    Siddiqui, Tarique; Ren, Xiang; Parameswaran, Aditya; Han, Jiawei

    2016-10-01

    Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents to be able to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets (e.g., application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search and business intelligence. The major challenge in performing Facet Extraction arises from multiple sources: concept extraction, concept to facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem, and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes.
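
    The abstract names graph-based label propagation but gives no algorithmic detail, so the following is only a generic sketch of that family of methods, not the FacetGist algorithm. The function name `propagate_labels`, the toy graph, the edge weights, and the facet names are all hypothetical.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Spread facet labels from seed nodes to their unlabeled neighbours.

    edges: iterable of (node_a, node_b, weight) for an undirected graph.
    seeds: dict mapping a node to its known facet label (held fixed).
    Returns a dict mapping every node to its highest-scoring label.
    """
    neighbours = defaultdict(list)
    for a, b, w in edges:
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    labels = sorted(set(seeds.values()))
    # Every node keeps one score per label; seeds start at full confidence.
    scores = {n: {l: 0.0 for l in labels} for n in neighbours}
    for node, label in seeds.items():
        scores.setdefault(node, {l: 0.0 for l in labels})[label] = 1.0

    for _ in range(iterations):
        updated = {}
        for node in scores:
            if node in seeds:  # clamp seed nodes to their known label
                updated[node] = scores[node]
                continue
            total = sum(w for _, w in neighbours[node]) or 1.0
            # New score = weighted average of the neighbours' scores.
            updated[node] = {
                l: sum(w * scores[nb][l] for nb, w in neighbours[node]) / total
                for l in labels
            }
        scores = updated

    return {n: max(labels, key=lambda l: s[l]) for n, s in scores.items()}


# Hypothetical concept mentions linked by co-occurrence strength.
edges = [
    ("SVM", "classification", 1.0),
    ("classification", "accuracy", 0.2),
    ("accuracy", "F1", 1.0),
]
seeds = {"SVM": "technique", "F1": "evaluation metric"}
facets = propagate_labels(edges, seeds)
```

    In this toy graph the unlabeled mention "classification" takes the facet of its strongly connected neighbour "SVM", while "accuracy" follows "F1". A real system would build the graph from sentence-level and global context features, as the abstract describes.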

  12. The European Circulation of Nordic Texts in the Romantic Period

    DEFF Research Database (Denmark)

    Jensen-Rix, Robert William

    2017-01-01

    history of rediscovering Old Norse texts (i.e., poetry and prose written in the North Germanic language until the 14th century, known primarily from Icelandic manuscripts) and medieval Nordic folklore (found in medieval ballads, sagas, and heroic legends) differed in various European countries......, there was also a remarkable sense of common aim and purpose in the reception history as it developed during the Romantic period. This was because European scholars and writers had come to see medieval Nordic texts as epitomizing the manners and literature of a common Germanic past. In particular, Old Norse texts...... from Icelandic manuscripts were believed to preserve the pre-Christian religion, as this was once shared by Scandinavians, Anglo-Saxons, Germans, and the Franks. Thus, interest in such texts circulated with particular intensity between Scandinavia, Germany, and Britain, as well as, to a lesser degree...

  13. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  14. A new universality class in corpus of texts; A statistical physics study

    Science.gov (United States)

    Najafi, Elham; Darooneh, Amir H.

    2018-05-01

    Text can be regarded as a complex system. There are some methods in statistical physics which can be used to study this system. In this work, by means of statistical physics methods, we reveal new universal behaviors of texts associated with the fractality values of words in a text. The fractality measure indicates the importance of words in a text by considering the distribution pattern of words throughout the text. We observed a power law relation between fractality of text and vocabulary size for texts and corpora. We also observed this behavior in studying biological data.
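
    The abstract reports a power law between text fractality and vocabulary size but gives neither the exponent nor the fractality formula. As a sketch of how such a relation is commonly checked, the code below fits y = c * x**alpha by ordinary least squares on log-transformed values. The function name `fit_power_law` and the synthetic data are assumptions for illustration only.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**alpha by least squares on (log x, log y).

    Returns (alpha, c). All xs and ys must be positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent.
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return alpha, c


# Synthetic example: values generated from y = 3 * x**0.5 (not real data).
vocabulary_sizes = [10, 100, 1000, 10000]
measure = [3.0 * v ** 0.5 for v in vocabulary_sizes]
alpha, c = fit_power_law(vocabulary_sizes, measure)
```

    A straight line in log-log space is only suggestive of a power law; a careful analysis would also compare alternative distributions.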

  15. Lexical bundles in an advanced INTOCSU writing class and engineering texts: A functional analysis

    Science.gov (United States)

    Alquraishi, Mohammed Abdulrahman

    The purpose of this study is to investigate the functions of lexical bundles in two corpora: a corpus of engineering academic texts and a corpus of IEP advanced writing class texts. This study is concerned with the nature of formulaic language in Pathway IEPs and engineering texts, and whether those types of texts show similar or distinctive formulaic functions. Moreover, the study looked into lexical bundles found in an engineering 1.26 million-word corpus and an ESL 65000-word corpus using a concordancing program. The study then analyzed the functions of those lexical bundles and compared them statistically using chi-square tests. Additionally, the results of this investigation showed 236 unique frequent lexical bundles in the engineering corpus and 37 bundles in the pathway corpus. Also, the study identified several differences between the density and functions of lexical bundles in the two corpora. These differences were evident in the distribution of functions of lexical bundles and the minimal overlap of lexical bundles found in the two corpora. The results of this study call for more attention to formulaic language at ESP and EAP programs.
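
    The study used a concordancing program and chi-square tests. The sketch below shows one way those two steps could look in plain Python. The four-word bundle length, the frequency and dispersion thresholds, the whitespace tokenization, and the function names are all assumptions, not the study's settings.

```python
from collections import Counter


def lexical_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences that recur often enough across enough texts."""
    freq = Counter()    # total occurrences of each n-gram
    spread = Counter()  # number of texts containing each n-gram
    for text in texts:
        tokens = text.lower().split()
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))
    return {
        g: count
        for g, count in freq.items()
        if count >= min_freq and spread[g] >= min_texts
    }


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Toy texts; a real corpus would be far larger.
bundles = lexical_bundles([
    "as a result of the test",
    "as a result of the change",
    "nothing shared here at all",
])
# Hypothetical counts of two bundle functions (columns) in two corpora (rows).
statistic = chi_square([[10, 20], [20, 10]])
```

    Here `bundles` contains "as a result of" and "a result of the", each occurring twice across two texts. The chi-square statistic still has to be compared against a critical value for the table's degrees of freedom.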

  16. Text mixing shapes the anatomy of rank-frequency distributions

    Science.gov (United States)

    Williams, Jake Ryland; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan

    2015-05-01

    Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two scaling regimes result from the act of aggregating texts. We observe that text mixing leads to an effective decay of word introduction, which we show provides accurate predictions of the location and severity of breaks in scaling. Upon examining large corpora from 10 languages in the Project Gutenberg eBooks collection, we find emphatic empirical support for the universality of our claim.
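
    Zipf's law, as stated above, says that a word's frequency is roughly inversely proportional to its rank, so rank times frequency should stay roughly constant. The sketch below builds the rank-frequency table on which such checks are run. It does not reproduce the paper's text-mixing analysis, and the function name and toy tokens are assumptions.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first.

    Ties in count are broken alphabetically so the output is deterministic.
    """
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ordered, start=1)]


# Toy token list; real tests of Zipf's law need large corpora.
table = rank_frequency("the the the the of of a".split())
# Under an ideal Zipf distribution, rank * count is about the same for each row.
products = [rank * count for rank, _, count in table]
```

    For the toy input, `table` is `[(1, 'the', 4), (2, 'of', 2), (3, 'a', 1)]`. On large aggregated corpora, the paper reports that a single slope no longer describes the whole curve.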

  17. Utilité du partage des corpus pour l'analyse des interactions en ligne en situation d'apprentissage : un exemple d'approche méthodologique autour d'une base de corpus d'apprentissage Benefits of Sharing Corpora when Analyzing Online Interactions: an Example of Methodology Related to a Databank of Learning and Teaching Corpora.

    Directory of Open Access Journals (Sweden)

    Maud Ciekanski

    2010-12-01

    Full Text Available Research on online interactions in learning situations still too rarely offers access to the data from which researchers developed the analyses presented in publications. This restricts understanding of the phenomena studied and prevents any replication for the purpose of comparison or of cumulative or contrastive analysis. In the Mulce project, we defend the following methodological position: to allow an analysis of situated interactions, the various data produced by online courses should be linked together to build an object of analysis that different teams and disciplines can use. At present, data are often decontextualized, fragmentary, or simply inaccessible to the research community. We therefore propose structuring the data as learning and teaching corpora (Letec) so that they can be exchanged and the analyses built upon. Such a corpus consists of the research protocol, the pedagogical scenario, the interactions, productions and traces, the licences, and the reusable analyses. This article first presents the theoretical, technical, and methodological questions raised by the design of such a project. We then illustrate our approach with examples from the Simuligne and Copéas courses, indicating the simple processes for transforming the Mulce format into the formats required by two analysis-support programs (one for forums, the other for aligning video and transcription). We place particular emphasis on the value of these tools for analyzing the phenomena of polyfocalization and multimodal writing in the analysis of multimodal interactions, which are characteristic of online learning environments. We will conclude our

  18. 1970 MLA Abstracts of Articles in Scholarly Journals, Volume I: General, English, American, Medieval and Neo-Latin, Celtic Literatures; and Folklore.

    Science.gov (United States)

    Fisher, John H., Comp.; Achtert, Walter S., Comp.

    The first volume of an annual series following the arrangement of the "MLA International Bibliography" includes sections on General, English, American, Medieval and Neo-Latin, Celtic literatures, and Folklore. A classified collection of 1,744 brief abstracts of journal articles on the modern languages and literatures to be used in conjunction with…

  19. 1970 MLA International Bibliography of Books and Articles on the Modern Languages and Literatures, Volume I: General, English, American, Medieval and Neo-Latin, Celtic Literatures; and Folklore.

    Science.gov (United States)

    Meserole, Harrison T., Comp.

    Volume 1 of the four-volume, international bibliography contains over 11,140 entries referring to books, Festschriften, analyzed collections, and articles which focus on General, English, American, medieval and neo-Latin, and Celtic literatures. A section of folklore is also included. The section on general literature includes: (1) aesthetics, (2)…

  20. A practical application of text mining to literature on cognitive rehabilitation and enhancement through neurostimulation

    Directory of Open Access Journals (Sweden)

    Puiu F Balan

    2014-09-01

    Full Text Available The exponential growth in publications represents a major challenge for researchers. Many scientific domains, including neuroscience, are not yet fully engaged in exploiting large bodies of publications. In this paper, we promote the idea to partially automate the processing of scientific documents, specifically using text mining (TM), to efficiently review big corpora of publications. The cognitive advantage given by TM is mainly related to the automatic extraction of relevant trends from corpora of literature, otherwise impossible to analyze in short periods of time. Specifically, the benefits of TM are increased speed, quality and reproducibility of text processing, boosted by rapid updates of the results. First, we selected a set of TM-tools that allow user-friendly approaches of the scientific literature, and which could serve as a guide for researchers willing to incorporate TM in their work. Second, we used these TM-tools to obtain basic insights into the relevant literature on cognitive rehabilitation (CR) and cognitive enhancement (CE) using transcranial magnetic stimulation (TMS). TM readily extracted the diversity of TMS applications in CR and CE from vast corpora of publications, automatically retrieving trends already described in published reviews. TMS emerged as one of the important non-invasive tools that can both improve cognitive and motor functions in numerous neurological diseases and induce modulations/enhancements of many fundamental brain functions. TM also revealed trends in big corpora of publications by extracting occurrence frequency and relationships of particular subtopics. Moreover, we showed that CR and CE share research topics, both aiming to increase the brain’s capacity to process information, thus supporting their integration in a larger perspective. Methodologically, despite limitations of a simple user-friendly approach, TM served well the reviewing process.

  1. Helios: Understanding Solar Evolution Through Text Analytics

    Energy Technology Data Exchange (ETDEWEB)

    Randazzese, Lucien [SRI International, Menlo Park, CA (United States)

    2016-12-02

    This proof-of-concept project focused on developing, testing, and validating a range of bibliometric, text analytic, and machine-learning based methods to explore the evolution of three photovoltaic (PV) technologies: Cadmium Telluride (CdTe), Dye-Sensitized solar cells (DSSC), and Multi-junction solar cells. The analytical approach to the work was inspired by previous work by the same team to measure and predict the scientific prominence of terms and entities within specific research domains. The goal was to create tools that could assist domain-knowledgeable analysts in investigating the history and path of technological developments in general, with a focus on analyzing step-function changes in performance, or “breakthroughs,” in particular. The text-analytics platform developed during this project was dubbed Helios. The project relied on computational methods for analyzing large corpora of technical documents. For this project we ingested technical documents from the following sources into Helios: Thomson Scientific Web of Science (papers), the U.S. Patent & Trademark Office (patents), the U.S. Department of Energy (technical documents), the U.S. National Science Foundation (project funding summaries), and a hand curated set of full-text documents from Thomson Scientific and other sources.

  2. PEDANT: Parallel Texts in Göteborg

    Directory of Open Access Journals (Sweden)

    Daniel Ridings

    2012-09-01

    Full Text Available

    The article presents the status of the PEDANT project with parallel corpora at the Language Bank at Göteborg University. The solutions for access to the corpus data are presented. Access is provided by way of the internet and standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is also outlined. The methods allow working with everything from plain text to texts densely encoded with linguistic information.

     


  3. Mining consumer health vocabulary from community-generated text.

    Science.gov (United States)

    Vydiswaran, V G Vinod; Mei, Qiaozhu; Hanauer, David A; Zheng, Kai

    2014-01-01

    Community-generated text corpora can be a valuable resource to extract consumer health vocabulary (CHV) and link them to professional terminologies and alternative variants. In this research, we propose a pattern-based text-mining approach to identify pairs of CHV and professional terms from Wikipedia, a large text corpus created and maintained by the community. A novel measure, leveraging the ratio of frequency of occurrence, was used to differentiate consumer terms from professional terms. We empirically evaluated the applicability of this approach using a large data sample consisting of MedLine abstracts and all posts from an online health forum, MedHelp. The results show that the proposed approach is able to identify synonymous pairs and label each term as either a consumer or a professional term with high accuracy. We conclude that the proposed approach provides great potential to produce a high quality CHV to improve the performance of computational applications in processing consumer-generated health text.
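
    The abstract says the measure relies on a ratio of frequencies of occurrence but does not define it. The sketch below is one plausible reading, assumed for illustration only: compare a term's relative frequency in consumer-written text with its relative frequency in professional text. The function names, the add-one smoothing, and the counts are hypothetical.

```python
from collections import Counter


def consumer_score(term, consumer_counts, professional_counts, smoothing=1.0):
    """Ratio of a term's relative frequency in consumer vs. professional text.

    Values above 1 suggest a consumer term, values below 1 a professional one.
    Smoothing keeps the ratio finite when a term is absent from one corpus.
    """
    consumer_total = sum(consumer_counts.values())
    professional_total = sum(professional_counts.values())
    consumer_rate = (consumer_counts[term] + smoothing) / (consumer_total + smoothing)
    professional_rate = (professional_counts[term] + smoothing) / (professional_total + smoothing)
    return consumer_rate / professional_rate


def label_pair(term_a, term_b, consumer_counts, professional_counts):
    """Order a synonymous pair as (consumer_term, professional_term)."""
    score_a = consumer_score(term_a, consumer_counts, professional_counts)
    score_b = consumer_score(term_b, consumer_counts, professional_counts)
    return (term_a, term_b) if score_a >= score_b else (term_b, term_a)


# Hypothetical counts standing in for a health forum and for journal abstracts.
forum = Counter({"heart attack": 90, "myocardial infarction": 10})
abstracts = Counter({"heart attack": 20, "myocardial infarction": 80})
pair = label_pair("myocardial infarction", "heart attack", forum, abstracts)
```

    With these made-up counts, `pair` is `("heart attack", "myocardial infarction")`, meaning the first term is labeled as the consumer variant.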

  4. A practical application of text mining to literature on cognitive rehabilitation and enhancement through neurostimulation.

    Science.gov (United States)

    Balan, Puiu F; Gerits, Annelies; Vanduffel, Wim

    2014-01-01

    The exponential growth in publications represents a major challenge for researchers. Many scientific domains, including neuroscience, are not yet fully engaged in exploiting large bodies of publications. In this paper, we promote the idea to partially automate the processing of scientific documents, specifically using text mining (TM), to efficiently review big corpora of publications. The "cognitive advantage" given by TM is mainly related to the automatic extraction of relevant trends from corpora of literature, otherwise impossible to analyze in short periods of time. Specifically, the benefits of TM are increased speed, quality and reproducibility of text processing, boosted by rapid updates of the results. First, we selected a set of TM-tools that allow user-friendly approaches of the scientific literature, and which could serve as a guide for researchers willing to incorporate TM in their work. Second, we used these TM-tools to obtain basic insights into the relevant literature on cognitive rehabilitation (CR) and cognitive enhancement (CE) using transcranial magnetic stimulation (TMS). TM readily extracted the diversity of TMS applications in CR and CE from vast corpora of publications, automatically retrieving trends already described in published reviews. TMS emerged as one of the important non-invasive tools that can both improve cognitive and motor functions in numerous neurological diseases and induce modulations/enhancements of many fundamental brain functions. TM also revealed trends in big corpora of publications by extracting occurrence frequency and relationships of particular subtopics. Moreover, we showed that CR and CE share research topics, both aiming to increase the brain's capacity to process information, thus supporting their integration in a larger perspective. Methodologically, despite limitations of a simple user-friendly approach, TM served well the reviewing process.

  5. Juvenile hormone biosynthesis gene expression in the corpora allata of honey bee (Apis mellifera L.) female castes.

    Science.gov (United States)

    Bomtorin, Ana Durvalina; Mackert, Aline; Rosa, Gustavo Conrado Couto; Moda, Livia Maria; Martins, Juliana Ramos; Bitondi, Márcia Maria Gentile; Hartfelder, Klaus; Simões, Zilá Luz Paulino

    2014-01-01

    Juvenile hormone (JH) controls key events in the honey bee life cycle, viz. caste development and age polyethism. We quantified transcript abundance of 24 genes involved in the JH biosynthetic pathway in the corpora allata-corpora cardiaca (CA-CC) complex. The expression of six of these genes showing relatively high transcript abundance was contrasted with CA size, hemolymph JH titer, as well as JH degradation rates and JH esterase (jhe) transcript levels. Gene expression did not match the contrasting JH titers in queen and worker fourth instar larvae, but jhe transcript abundance and JH degradation rates were significantly lower in queen larvae. Consequently, transcriptional control of JHE is of importance in regulating larval JH titers and caste development. In contrast, the same analyses applied to adult worker bees allowed us to infer that the high JH levels in foragers are due to increased JH synthesis. Upon RNAi-mediated silencing of the methyl farnesoate epoxidase gene (mfe) encoding the enzyme that catalyzes methyl farnesoate-to-JH conversion, the JH titer was decreased, thus corroborating that JH titer regulation in adult honey bees depends on this final JH biosynthesis step. The molecular pathway differences underlying JH titer regulation in larval caste development versus adult age polyethism lead us to propose that mfe and jhe genes be assayed when addressing questions on the role(s) of JH in social evolution.

  6. PESQUISA EM EDUCAÇÃO: O WORDSMITH COMO FERRAMENTA DE EXPLORAÇÃO DE CORPORA

    Directory of Open Access Journals (Sweden)

    Maria Zuleide da Costa Pereira; Samara Wanderley Xavier Barbosa

    2014-09-01

    Full Text Available This text arises from the implementation of the activities of a project of the Institutional Program of Scientific Initiation Scholarships (PIBIC) at UFPB, entitled “Os Sentidos do Currículo nas Escolas da Rede Municipal de Ensino de João Pessoa/PB” [The Meanings of Curriculum in the Schools of the Municipal Education Network of João Pessoa/PB] and carried out from 2013 to 2014. The aim of the plan/project is to highlight the role of the WordSmith Tools 6 software, as a corpus analysis tool, in exploring the meanings of education, curriculum, and teaching in the curriculum documents analyzed, which are the national curriculum policy documents (Law of Guidelines and Bases no. 9394/96, National Curriculum Parameters for grades 1 to 4, General Curriculum Guidelines for Basic Education, and Curriculum Guidelines for Nine-Year Elementary Education) and the local ones (Political-Pedagogical Projects of nine schools of the Municipal Education Network). We thus show the features of the set of tools used, illustrating how they contributed to a more exact and reliable documentary analysis than other perspectives of linguistic analysis would allow. Indeed, while facilitating the possible articulations between education, curriculum, and teaching, the set of tools in question, as Sardinha (2004) argues, made it possible for us to analyze several aspects of language, such as lexical composition, the theme of the selected texts, and the rhetorical and compositional organization of discourse genres. Methodologically, we organized the documents under analysis into a set of computerized texts in a form suitable for the researcher to analyze, always bearing in mind the authenticity, legibility, and length of the texts, and the careful selection of the utterances that would make up the corpus. To undertake the analysis itself, we decided to use WordSmith Tools 6 and its three tools, WordList, Concord, and KeyWords, each

  7. ONTOGRABBING: Extracting Information from Texts Using Generative Ontologies

    DEFF Research Database (Denmark)

    Nilsson, Jørgen Fischer; Szymczak, Bartlomiej Antoni; Jensen, P.A.

    2009-01-01

    We describe principles for extracting information from texts using a so-called generative ontology in combination with syntactic analysis. Generative ontologies are introduced as semantic domains for natural language phrases. Generative ontologies extend ordinary finite ontologies with rules...... for producing recursively shaped terms representing the ontological content (ontological semantics) of NL noun phrases and other phrases. We focus here on achieving a robust, often only partial, ontology-driven parsing of and ascription of semantics to a sentence in the text corpus. The aim of the ontological...... analysis is primarily to identify paraphrases, thereby achieving a search functionality beyond mere keyword search with synsets. We further envisage use of the generative ontology as a phrase-based rather than word-based browser into text corpora....

  8. A religião como meio de inclusão e de exclusão nas corporações de ofício de Estrasburgo (1681-1789) : Religion as a Means of Inclusion and Exclusion in the Craft Guilds of Strasbourg (1681-1789)

    Directory of Open Access Journals (Sweden)

    Hanna Sonkajärvi

    Full Text Available The article analyzes the dynamics of inclusion and exclusion built on religious, or confessional, affiliation in the craft guilds of Strasbourg in the eighteenth century. In Ancien Régime society, religion was, along with social status, family ties, gender, patronage and financial means, language, and citizenship rights, one of the decisive factors for including foreigners in, or excluding them from, access to the economic, political, or social resources of a locality. The construction and preservation of religious boundaries are examined through the example of the joiners and the boatmen in the multi-confessional city of Strasbourg.

  9. Directed Activities Related to Text: Text Analysis and Text Reconstruction.

    Science.gov (United States)

    Davies, Florence; Greene, Terry

    This paper describes Directed Activities Related to Text (DART), procedures that were developed and are used in the Reading for Learning Project at the University of Nottingham (England) to enhance learning from texts and that fall into two broad categories: (1) text analysis procedures, which require students to engage in some form of analysis of…

  10. DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

    Directory of Open Access Journals (Sweden)

    Y. B. Abdullin

    2017-01-01

    Full Text Available Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve the sentiment classification problem for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method that achieves 60% accuracy, and a method to learn bilingual embeddings from a large unlabeled corpus using bilingual word pairs.
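
    The idea of training in one language and predicting in another can be illustrated with a deliberately simplified sketch. It is not the authors' deep neural network: it swaps in a nearest-centroid classifier over a shared embedding space, and the two-dimensional vectors, the `EMBEDDINGS` table, and all function names are invented for illustration only.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, float]

# Toy 2-d "bilingual embeddings": invented values, not taken from the paper.
# Words with similar sentiment are placed close together across languages.
EMBEDDINGS: Dict[str, Vector] = {
    "good": (0.9, 0.1), "great": (0.8, 0.2),
    "bad": (-0.9, 0.1), "awful": (-0.8, 0.3),
    "хороший": (0.85, 0.15), "плохой": (-0.85, 0.2),
}


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise average of a non-empty list of 2-d vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def embed(text: str) -> Vector:
    """Represent a short text as the average of its known word vectors."""
    known = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    return mean(known) if known else (0.0, 0.0)


def train_centroids(examples: List[Tuple[str, str]]) -> Dict[str, Vector]:
    """Compute one centroid per label from (text, label) training pairs."""
    grouped: Dict[str, List[Vector]] = {}
    for text, label in examples:
        grouped.setdefault(label, []).append(embed(text))
    return {label: mean(vectors) for label, vectors in grouped.items()}


def predict(text: str, centroids: Dict[str, Vector]) -> str:
    """Assign the label whose centroid is closest to the text vector."""
    vec = embed(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


if __name__ == "__main__":
    # Train on English only, then classify Russian words.
    centroids = train_centroids([("good great", "pos"), ("bad awful", "neg")])
    print(predict("хороший", centroids), predict("плохой", centroids))
```

    Because both languages share one vector space, the classifier never needs labeled Russian data; the quality of such a transfer depends entirely on how well the bilingual embeddings are aligned.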

  11. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specific to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by the user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  12. The search for novel anticancer agents: a differentiation-based assay and analysis of a folklore product.

    Science.gov (United States)

    Dinnen, R D; Ebisuzaki, K

    1997-01-01

    One alternative approach to the current use of cytotoxic anticancer drugs involves the use of differentiation-inducing agents. However, a wider application of this strategy would require the development of assays to search for new differentiation-inducing agents. In this report we describe an in vitro assay using the murine erythroleukemia (clone 3-1) cells. Tests for the efficacy of this assay for the analysis of antineoplastic activity in natural products led to studies on pau d'arco, a South American folklore product used in the treatment of cancer. Purification of the activity in aqueous extracts by solvent partition and thin layer chromatography (TLC) indicated the presence of two activities, one of which was identified as lapachol. The activity in the pau d'arco extracts and of lapachol was inhibited by vitamin K1. As a vitamin K antagonist, lapachol might target such vitamin K-dependent reactions as the activation of a ligand for the Axl receptor tyrosine kinase.

  13. Declaraciones patrimoniales, turismo y conocimientos locales: Posibilidades de los estudios del folklore para el caso de las ferias en la quebrada de Humahuaca (Jujuy-Argentina Patrimony Statements, Tourism and Local Knowledge: Folklore Studies Posibilities in Quebrada de Humahuaca Fairs Case (Jujuy - Argentina

    Directory of Open Access Journals (Sweden)

    Liliana Bergesio

    2010-12-01

    Full Text Available The Quebrada de Humahuaca lies in the central part of Jujuy Province (in northwestern Argentina) and has been inhabited for around 11,000 years. In 2003 the United Nations Educational, Scientific and Cultural Organization (UNESCO) declared the region "Cultural and Natural Heritage of Humanity". From that date, the development of adventure and cultural tourism circuits increased. The declaration gave the Quebrada de Humahuaca new impetus in the national and international tourism market, and the tourism boom in the area led each town to seek its own ways of attracting visitors. Among the most common strategies is the staging of fairs and festivals that seek to highlight particular local characteristics. In this paper we analyze the case of the locality of Coctaca (Humahuaca Department) and an event held there in February, which includes the fair "Los Sabores de la Historia", the "Encuentro de Mujeres Andinas", and the "Serenata a los Andenes de Cultivo". The aim of the paper is to set out the possibilities that folklore studies offer for bringing together in the analysis themes such as the local and the global; the cultural and the economic; producers and their products; and tourism with its demands and expectations.

  14. Computing Pathways in Bio-Models Derived from Bio-Science Text Sources

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer

    2015-01-01

    This paper outlines a system, OntoScape, serving to accomplish complex inference tasks on knowledge bases and bio-models derived from life-science text corpora. The system applies so-called natural logic, a form of logic which is readable for humans. This logic affords ontological representations of complex terms appearing in the text sources. Along with logical propositions, the system applies a semantic graph representation facilitating calculation of bio-pathways. More generally, the system affords means of query answering appealing to general and domain specific inference rules.

  15. Text Maps: Helping Students Navigate Informational Texts.

    Science.gov (United States)

    Spencer, Brenda H.

    2003-01-01

    Notes that a text map is an instructional approach designed to help students gain fluency in reading content area materials. Discusses how the goal is to teach students about the important features of the material and how the maps can be used to build new understandings. Presents the procedures for preparing and using a text map. (SG)

  16. O papel do folclore na motivação para atividades físicas de idosas The role of folklore in the motivation for physical activity of elderly

    Directory of Open Access Journals (Sweden)

    Berta Leni Costa Cardoso

    2011-03-01

    Full Text Available There are many reports on the biological benefits of physical activity for older individuals, yet the number of physically active elderly people is still unsatisfactory. This article takes that tension as its starting point. It investigates whether local folklore can serve as a useful educational and motivational mechanism for increasing physical activity among elderly women. Elderly women from the "Clube da Amizade" in Caetité, Bahia (Brazil), who were motivated and stimulated through dance, were interviewed. The article also draws on the reflections of Paulo Freire, who regards culture and the context of a person's life as the most important means of motivation and education. The results indicated that this motivational process is effective in stimulating elderly women in their physical education classes. The women reported feeling highly motivated during dance classes, where they can listen to music that reminds them of their past, culture, and moral values.

  17. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2009-03-01

    Full Text Available The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively...

  18. Sound experiences: the vision of experimental musician on the folkloric music in modern society

    Directory of Open Access Journals (Sweden)

    Rieko Tanaka

    2016-11-01

    Full Text Available This work begins by recounting how folk music has always been a lasting influence on classical composers. It makes special mention of the Hungarian musicians Bela Bartok and Zoltan Kodaly, who are considered here the most immediate ancestors of the North American experimental musicians, since both groups are influenced by a passion for folk music. As the principal exponents of American experimental music we select John Cage, Lou Harrison, and Carl Ruggles. Their works are considered and analyzed in this text as sounds and as experiences: these are composers who treat sound as experience, as feeling, as emotion, as time, and as origin, traits shared by folk music and experimental music. Finally, in its concluding remarks, the work addresses the relationship between the musician, creation, society, and art.

  19. AHP 2A: China's na53 mʑi 53 Tibetans: Life, Language and Folklore

    Directory of Open Access Journals (Sweden)

    Libu Lakhi (Li Jianfu 李建富, Dawa Tenzin ཟླ་བ་བསྟན་འཛིན།

    2009-06-01

    Full Text Available This remarkable book is the product of a fruitful collaboration among a native speaker of na53 mʑi53 kha11 tho11, Tibetan and Chinese consultants, and a dedicated group of Westerners resident in China. It affords the reader an intimate glimpse into traditional na53 mʑi53 life, now well on its way to disappearing along with hundreds of similar minority cultures in the world.

  20. AHP 2B: China's na53 mʑi 53 Tibetans: Life, Language and Folklore

    Directory of Open Access Journals (Sweden)

    Libu Lakhi (Li Jianfu 李建富, Dawa Tenzin ཟླ་བ་བསྟན་འཛིན།

    2009-06-01

    Full Text Available This remarkable book is the product of a fruitful collaboration among a native speaker of na53 mʑi53 kha11 tho11, Tibetan and Chinese consultants, and a dedicated group of Westerners resident in China. It affords the reader an intimate glimpse into traditional na53 mʑi53 life, now well on its way to disappearing along with hundreds of similar minority cultures in the world.

  1. A phytopharmacological review on Justicia picta (Acanthaceae): A well known tropical folklore medicinal plant

    Directory of Open Access Journals (Sweden)

    Pradeep Singh

    2015-12-01

    Full Text Available The Acanthaceae family is an important source of therapeutic drugs, and the ethnopharmacological knowledge of this family requires urgent documentation as several of its species are near extinction. Justicia is the largest genus of Acanthaceae with approximately 600 species. The aim of the present review is to present the literature on the traditional uses and pharmacology of Justicia picta (Family: Acanthaceae) and to discuss further research priorities.

  2. Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou|í]cí z hlediska morfologického značkování : Corpora as Data Sources for the Up-Grading of Morphological Tagging

    Directory of Open Access Journals (Sweden)

    Osolsobě, Klára

    2015-10-01

    Full Text Available Adjectives ending with -oucí/-ící are regularly derived from verbs and hence are not usually listed in any of the Czech monolingual dictionaries. On the level of automatic morphological analysis (the dictionary of Czech) they should be generated from verbal roots and tagged as verbal adjectives (pos tag: AG.*). The data from Czech corpora prove (a) inconsistencies in tagging and (b) gaps in the dictionary. The main cause of both kinds of insufficiency is the existence of variants on the level of verbal forms from which the verbal adjectives are potentially derived. Consequently, text corpora are a significant source of knowledge about the formation and use of adjectives with endings -oucí/-ící that can be important for both (a) automatic morphological analysis of Czech and (b) theoretical description of Czech grammar (derivational morphology). Our goal is to present a corpus-based study of the Czech gerund, i.e. verbal adjectives with -oucí/-ící. The link between the inflected and the word-formation variants will be demonstrated using material from the SYN corpus (2.6 billion tokens of written Czech) and the large web corpus czTenTen12 (5.2 billion tokens of Czech text from the Internet, cleaned and deduplicated).
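
    As a rough illustration of how candidate forms might be pulled from untagged text before any morphological tagging, the sketch below matches word forms by surface ending only. It is an assumption-laden toy, not the procedure used in the study: the pattern and the function name `candidate_adjectives` are hypothetical, only the uninflected endings -oucí/-ící are covered, and a surface match overgenerates (for example, the numeral "tisící" is caught as well), which is exactly why tagged corpus evidence is needed.

```python
import re
from collections import Counter

# Hypothetical surface pattern: any word form ending in -oucí or -ící.
# In Python 3, \w and \b are Unicode-aware for str patterns, so Czech
# letters with diacritics are treated as word characters.
CANDIDATE = re.compile(r"\b\w+(?:ou|í)cí\b")


def candidate_adjectives(text: str) -> Counter:
    """Count lower-cased word forms whose surface ending is -oucí/-ící."""
    return Counter(match.group(0).lower() for match in CANDIDATE.finditer(text))


if __name__ == "__main__":
    sample = "Vedoucí kupující viděl nesoucí ženu a tisící dům."
    for form, count in candidate_adjectives(sample).items():
        print(form, count)
```

    Candidates found this way would still have to be checked against the verbal forms they are derived from; the regular expression only narrows the search.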

  3. Progestogen treatments for cycle management in a sheep model of assisted conception affect the growth patterns, the expression of luteinizing hormone receptors, and the progesterone secretion of induced corpora lutea.

    Science.gov (United States)

    Letelier, Claudia; García-Fernández, Rosa Ana; Contreras-Solis, Ignacio; Sanchez, María Angeles; Garcia-Palencia, Pilar; Sanchez, Belen; Gonzalez-Bulnes, Antonio; Flores, Juana María

    2010-03-01

    To determine, in a sheep model, the effect of a short-term progestative treatment on growth dynamics and functionality of induced corpora lutea. Observational, model study. Public university. Sixty adult female sheep. Synchronization and induction of ovulation with progestogens and prostaglandin analogues; ovarian ultrasonography, blood sampling, and ovariectomy. Determination of pituitary function and morphologic characteristics, expression of luteinizing hormone (LH) receptors, and progesterone secretion of corpora lutea. The use of progestative pretreatments for assisted conception affect the growth patterns, the expression of LH receptors, and the progesterone secretion of induced corpora lutea. The current study indicates, in a sheep model, the existence of deleterious effects from progestogens on functionality of induced corpora lutea. Copyright 2010 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  4. SEVERAL REPRESENTATIONS OF THE FOREIGNER IN ROMANIAN POPULAR EPICS NOTES FOR A FOLKLORIC IMAGOLOGY

    Directory of Open Access Journals (Sweden)

    Corina Daniela POPESCU

    2015-05-01

    Full Text Available From the perspective of the anthropology of space, the problem of perception and representation of the alien remains a fertile subject of research, nevertheless inevitably interfering with imagology, in the sense assigned to the concept of image as any representation of a cultural reality through which the individual or the group translate the cultural, social, ideological space in which they are located. Identity does not justify an existence in itself, but only in relation to alterity. The imagological perspective of the foreigner in Romanian traditional culture proves rich in categories of representation dictated inevitably by reference to spatiality.

  5. A tm Plug-In for Distributed Text Mining in R

    Directory of Open Access Journals (Sweden)

    Stefan Theussl

    2012-11-01

    Full Text Available R has gained explicit text mining support with the tm package enabling statisticians to answer many interesting research questions via statistical analysis or modeling of (text) corpora. However, we typically face two challenges when analyzing large corpora: (1) the amount of data to be processed in a single machine is usually limited by the available main memory (i.e., RAM), and (2) the more data to be analyzed the higher the need for efficient procedures for calculating valuable results. Fortunately, adequate programming models like MapReduce facilitate parallelization of text mining tasks and allow for processing data sets beyond what would fit into memory by using a distributed file system possibly spanning over several machines, e.g., in a cluster of workstations. In this paper we present a plug-in package to tm called tm.plugin.dc implementing a distributed corpus class which can take advantage of the Hadoop MapReduce library for large scale text mining tasks. We show on the basis of an application in culturomics that we can efficiently handle data sets of significant size.
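
    The MapReduce programming model referred to in this abstract can be sketched on a single machine in a few lines of Python. This is only an illustration of the pattern, not the tm.plugin.dc interface (an R package) and not Hadoop; the function names `map_chunk`, `reduce_counts`, and `term_frequencies` and the `chunk_size` parameter are hypothetical.

```python
from collections import Counter
from functools import reduce
from typing import Iterable, List


def map_chunk(documents: Iterable[str]) -> Counter:
    """Map step: count terms within one chunk of the corpus."""
    counts: Counter = Counter()
    for document in documents:
        counts.update(document.lower().split())
    return counts


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial term counts into one."""
    return left + right


def term_frequencies(corpus: List[str], chunk_size: int = 2) -> Counter:
    """Split the corpus into chunks, map each chunk, then reduce the results."""
    chunks = [corpus[i:i + chunk_size] for i in range(0, len(corpus), chunk_size)]
    partial_counts = map(map_chunk, chunks)
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    corpus = ["text mining in R", "distributed text mining", "large text corpora"]
    print(term_frequencies(corpus)["text"])
```

    Each chunk is processed independently, so in a real distributed setting the map step could run on different machines and only the much smaller partial counts would need to be combined.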

  6. FOLKLORE ELEMENTS IN BEDRİ RAHMİ EYUBOGLU’S POEMS BEDRİ RAHMİ EYÜBOĞLU’NUN ŞİİRLERİNDE HALK BİLİMİ UNSURLARI

    Directory of Open Access Journals (Sweden)

    Bahar DOĞAN

    2012-01-01

    Full Text Available The aim of this study is to identify the folklore elements in Eyuboglu's poems. To this end, his poetry books Dol Karabakır Dol and Karadut were examined. The survey model was used in the study. In interpreting the findings, the 25 items under which Örnek classifies the subjects of folklore studies in his book "Turk Halk Bilimi" were used. Examples in Eyuboglu's poems cover village, town, and city life; folk architecture; vehicles and transportation techniques; types of economy; folk economy; nutrition, cuisine, and storeroom; methods of measuring, weighing, and calculating; folk arts and handicrafts; folk knowledge; folk beliefs, customs, and traditions; transition periods; stereotyped behaviors and expressions; folk literature; folk dance; and folk music and folk musical instruments. The poet gives folk songs and folk arts a prominent place in his poems. The poet, who says "Whenever I hear a village song, I feel ashamed of my poetry", also gives space to the beauties of his country. His occasional use of local accents in his poems makes him a man of the people. According to this study, including Eyuboglu's poems in textbooks can be an important step toward raising individuals with versatile personalities.

  7. Comparative metabolism of branched-chain amino acids to precursors of juvenile hormone biogenesis in corpora allata of lepidopterous versus nonlepidopterous insects

    International Nuclear Information System (INIS)

    Brindle, P.A.; Schooley, D.A.; Tsai, L.W.; Baker, F.C.

    1988-01-01

    Comparative studies were performed on the role of branched-chain amino acids (BCAA) in juvenile hormone (JH) biosynthesis using several lepidopterous and nonlepidopterous insects. Corpora cardiaca-corpora allata complexes (CC-CA, the corpora allata being the organ of JH biogenesis) were maintained in culture medium containing a uniformly 14C-labeled BCAA, together with [methyl-3H]methionine as mass marker for JH quantification. BCAA catabolism was quantified by directly analyzing the medium for the presence of 14C-labeled propionate and/or acetate, while JHs were extracted, purified by liquid chromatography, and subjected to double-label liquid scintillation counting. Our results indicate that active BCAA catabolism occurs within the CC-CA of lepidopterans, and this efficiently provides propionyl-CoA (from isoleucine or valine) for the biosynthesis of the ethyl branches of JH I and II. Acetyl-CoA, formed from isoleucine or leucine catabolism, is also utilized by lepidopteran CC-CA for biosynthesizing JH III and the acetate-derived portions of the ethyl-branched JHs. In contrast, CC-CA of nonlepidopterans fail to catabolize BCAA. Consequently, exogenous isoleucine or leucine does not serve as a carbon source for the biosynthesis of JH III by these glands, and no propionyl-CoA is produced for genesis of ethyl-branched JHs. This is the first observation of a tissue-specific metabolic difference which in part explains why these novel homosesquiterpenoids exist in lepidopterans, but not in nonlepidopterans.

  8. Quirky Quotes and Needles in the Haystack: Tracing Grammatical Change in Untagged Corpora

    Directory of Open Access Journals (Sweden)

    Norde, Muriel

    2013-12-01

    Full Text Available This paper discusses pivotal theoretical and methodological problems of historical corpus linguistics. In two case studies from Swedish language history, the development of the epistemic adverb kanske and the group genitive respectively, it illustrates how the use of qualitative method in addition to corpus investigation can contribute to understanding grammatical change.

  9. Discovery Learning and Teaching with Electronic Corpora in an Advanced German Grammar Course

    Science.gov (United States)

    Vyatkina, Nina

    2013-01-01

    This study describes the design and implementation of a usage-based and corpus-based advanced German grammar course. Teaching materials for the course included DWDS, or "Digitales Wörterbuch der deutschen Sprache": a large, representative, free and publicly available corpus of contemporary German texts. The article outlines specific…

  10. Computer Learner Corpora: Analysing Interlanguage Errors in Synchronous and Asynchronous Communication

    Science.gov (United States)

    MacDonald, Penny; Garcia-Carbonell, Amparo; Carot, Sierra, Jose Miguel

    2013-01-01

    This study focuses on the computer-aided analysis of interlanguage errors made by the participants in the telematic simulation IDEELS (Intercultural Dynamics in European Education through on-Line Simulation). The synchronous and asynchronous communication analysed was part of the MiLC Corpus, a multilingual learner corpus of texts written by…

  11. Relational Data Modelling of Textual Corpora: The Skaldic Project and its Extensions

    DEFF Research Database (Denmark)

    Wills, Tarrin Jon

    2015-01-01

    Skaldic poetry is a highly complex textual phenomenon both in terms of the intricacy of the poetry and its contextual environment. Extensible Markup Language (XML) applications such as that of the Text Encoding Initiative provide a means of semantic representation of some of these complexities. XML...

  12. Sarna Devī: feste di primavera, folklore e sostenibilità nelle tradizioni del Jharkhand : Sarna Devī: Spring Festivals, Folklore and Sustainability in the Traditions of Jharkhand

    Directory of Open Access Journals (Sweden)

    Stefano Beggiora

    2014-12-01

    Full Text Available In the Chota Nagpur plateau, in India, the worship of Sarna Devi (sarnaism) seems today to unite in a sort of brotherhood many indigenous ethnic groups of the state of Jharkhand. Guardian of the boundaries of villages, Sarna is the goddess of the sacred grove and presides over the good harvest. The present work consists of an ethnographic analysis of the major spring festivals, and related rituals, widespread among the most populous indigenous peoples (ādivāsī) of Jharkhand, with particular reference to Santals and Oraons. By reconstructing a comprehensive overview of the cultural relations among the ethnic groups, I emphasize how religious prescriptions and local shamanism share a common message of sustainability and equilibrium between humans and nature. The essay includes translations of verses, sacred liturgies, and songs employed in the celebration of the goddess and her environment.

  13. It Is Not Just Folklore: The Aqueous Extract of Mung Bean Coat Is Protective against Sepsis

    Directory of Open Access Journals (Sweden)

    Shu Zhu

    2012-01-01

    Full Text Available Mung bean (Vigna radiata) has been traditionally used in China both as nutritional food and herbal medicine against a number of inflammatory conditions since the 1050s. A nucleosomal protein, HMGB1, has recently been established as a late mediator of lethal systemic inflammation with a relatively wider therapeutic window for pharmacological interventions. Here we explored the HMGB1-inhibiting capacity and therapeutic potential of mung bean coat (MBC) extract in vitro and in vivo. We found that MBC extract dose-dependently attenuated LPS-induced release of HMGB1 and several chemokines in macrophage cultures. Oral administration of MBC extract significantly increased animal survival rates from 29.4% (in saline group, N=17 mice) to 70% (in experimental MBC extract group, N=17 mice, P<0.05). In vitro, MBC extract stimulated HMGB1 protein aggregation and facilitated both the formation of microtubule-associated protein-1 light chain-3 (LC3)-containing cytoplasmic vesicles and the production of LC3-II in macrophage cultures. Consequently, MBC extract treatment led to reduction of cellular HMGB1 levels in macrophage cultures, which was impaired by coaddition of two autophagy inhibitors (bafilomycin A1 and 3-methyladenine). Conclusion. MBC extract is protective against lethal sepsis possibly by stimulating autophagic HMGB1 degradation.

  14. Folkloric Modernism – Venice’s Giardini della Biennale and the Geopolitics of Architecture

    Directory of Open Access Journals (Sweden)

    Joel Robinson

    2014-02-01

    Full Text Available This paper considers the national pavilions of the Venice Biennale, the largest and longest running exposition of contemporary art. It begins with an investigation of the post-fascist landscape of Venice’s Giardini della Biennale, as its built environment continued to evolve in the decades after 1945, with the construction of several new pavilions. With a view to exploring the architectural infrastructure of an event that has billed itself as ‘international’ from the first decade of the twentieth century, this paper asks how the mapping of national pavilions here might have changed to reflect the supposedly post-colonial and democratic aspirations of the West after the Second World War. Homing in on the nations that gained representation here in the 1950s and 60s, it looks at three of the more interesting architectural additions to the gardens, namely the pavilions for Israel, Canada and Brazil. These are used to raise questions about how national pavilions are mobilized ideologically, but also to explore broader questions about the geopolitical superstructure of the Biennale as an institution.

  15. ON THE VISUAL ORIGINS OF ONE FOLKLORE MOTIF. THE TOMB IN THE CHURCH

    Directory of Open Access Journals (Sweden)

    Liudmila V. Fadeyeva

    2017-06-01

    Full Text Available The article examines the influence of Christian iconography on poetic images of Russian, Ukrainian, and Belorussian spiritual verses. It claims that the icons that symbolize the Passion of Jesus Christ, both in the Western and Eastern European traditions, are possible sources of images and plots for a spiritual verse “Walking of the Virgin” (“Three Tombs”). The essay specifically focuses on the image of the Holy Sepulcher in Russian spiritual verse and its iconographic sources. It discusses a number of cases from the history of its iconography, from the images of the Holy Sepulcher in the Medieval Catholic churches to the ones in the Orthodox cathedrals and churches of the second half of the 17th to the beginning of the 18th century. In spiritual verse, the notion of the “tomb in the church” as part of liturgical practice was related not only to death symbolism. In the verse “Walking of the Virgin,” the image of three tombs, and primarily the tomb of the Virgin, bears on the Western-European poetic tradition and includes images that function to deny the idea of the finitude of human existence and reaffirm the idea of eternal life. Flowers and birds over the tomb of the Virgin are emblematic: it is a verbal icon of a kind that corresponds with the final episode of the poem, its climax. This emblem refers to conventional images of the Christian iconography that convey Christian dogmas via a combination of contradictory elements that we see, for example, in the traditional image of the Flourishing Cross.

  16. Crowdfunding: entre as Multidões e as Corporações : Crowdfunding: Between the Crowds and the Corporations

    Directory of Open Access Journals (Sweden)

    Erick Felinto

    2013-01-01

    Full Text Available This article examines the practices of crowdfunding and crowdsourcing in the context of the so-called web 2.0. Through a philosophical and sociological exploration of the notions of crowd and individual, we investigate the ideological tensions surrounding these practices, seen at times as libertarian and at times as conservative. The article discusses case studies that help illustrate the contradictory aspects of crowdfunding.

  17. Collecting and evaluating speech recognition corpora for 11 South African languages

    CSIR Research Space (South Africa)

    Badenhorst, J

    2011-08-01

    Full Text Available . In addition, speech-based access to information may empower illiterate or semi-literate people, 98% of whom live in the developing world. SDSs can play a useful role in a wide range of applications. Of particular importance in Africa are applications... speech (i.e. appropriate for the recognition task in terms of the language used, the profile of the speakers, speaking style, etc.) This speech generally needs to be curated and transcribed prior to the development of ASR systems, and for most...

  18. Text-Fabric

    NARCIS (Netherlands)

    Roorda, Dirk

    2016-01-01

    Text-Fabric is a Python3 package for Text plus Annotations. It provides a data model, a text file format, and a binary format for (ancient) text plus (linguistic) annotations. The emphasis of this all is on: data processing; sharing data; and contributing modules. A defining characteristic is that

  19. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  20. XML and Free Text.

    Science.gov (United States)

    Riggs, Ken Roger

    2002-01-01

    Discusses problems with marking free text, text that is either natural language or semigrammatical but unstructured, that prevent well-formed XML from marking text for readily available meaning. Proposes a solution to mark meaning in free text that is consistent with the intended simplicity of XML versus SGML. (Author/LRW)

  1. Kontrastivní lingvistika a paralelní korpusy : Contrastive Linguistics and Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Libuše Dušková

    2017-07-01

    Full Text Available The article presents a brief survey of English-Czech contrastive studies based on original texts and their translations from the beginnings in the mid-fifties of the last century to the present. Until the first decade of the present century, excerption was done manually, which limited the research to a small number of samples. The early studies of English largely concentrated on sentence condensation and nominal tendencies in the expression of the predicate, as compared with the verbal character of Czech. In connection with the development of the theory of functional sentence perspective other topics were found in this sphere, especially as regards word order. While the former studies can be currently pursued on the basis of InterCorp at a qualitatively higher level, research into FSP topics remains restricted to issues involving variables with formalizable realization forms. The main part of the paper focuses on some of the fallacies involved in using translation counterparts as the basis of contrastive research. One of them is the possible influence of the original; others appear in such areas as the choice of translation counterparts with respect to the issue under investigation, the assessment of their adequacy, including the possibility of misrepresentation by the translator, the validity of the translation counterpart (which is in most cases limited, as alternatives are possible), and others. In studies of functional sentence perspective a point to be considered is equivocal interpretation of the FSP structure in the original. These points are illustrated by translation counterparts in two translations of the same novel.

  2. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    text can be defined by taking as point of departure the digital format in which everything is represented in the binary alphabet. While the notion of text, in most cases, lends itself to be independent of medium and embodiment, it is also often tacitly assumed that it is, in fact, modeled around...... the print medium, rather than written text or speech. In the late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities...

  3. U-Compare: share and compare text mining tools with UIMA

    Science.gov (United States)

    Kano, Yoshinobu; Baumgartner, William A.; McCrohon, Luke; Ananiadou, Sophia; Cohen, K. Bretonnel; Hunter, Lawrence; Tsujii, Jun'ichi

    2009-01-01

    Summary: Due to the increasing number of text mining resources (tools and corpora) available to biologists, interoperability issues between these resources are becoming significant obstacles to using them effectively. UIMA, the Unstructured Information Management Architecture, is an open framework designed to aid in the construction of more interoperable tools. U-Compare is built on top of the UIMA framework, and provides both a concrete framework for out-of-the-box text mining and a sophisticated evaluation platform allowing users to run specific tools on any target text, generating both detailed statistics and instance-based visualizations of outputs. U-Compare is a joint project, providing the world's largest, and still growing, collection of UIMA-compatible resources. These resources, originally developed by different groups for a variety of domains, include many famous tools and corpora. U-Compare can be launched straight from the web, without needing to be manually installed. All U-Compare components are provided ready-to-use and can be combined easily via a drag-and-drop interface without any programming. External UIMA components can also simply be mixed with U-Compare components, without distinguishing between locally and remotely deployed resources. Availability: http://u-compare.org/ Contact: kano@is.s.u-tokyo.ac.jp PMID:19414535

  4. From Folklore to Scientific Evidence: Breast-Feeding and Wet-Nursing in Islam and the Case of Non-Puerperal Lactation

    Science.gov (United States)

    Moran, Lia; Gilad, Jacob

    2007-01-01

    Breast-feeding practice has an important medical and socio-cultural role. It has many anthropological aspects concerning the “power structures” that find their expression in breast-feeding and the practices that formed around it, socially, scientifically, and legally. Breast-feeding has been given much attention by religions, and taboos, folklore, and misconceptions abound around it, making it a topic of genuine curiosity. This paper aims at expanding the spectrum of folklore associated with breast-feeding. The paper deals with historical, religious, and folkloristic aspects of breast-feeding, especially wet-nursing, in Islam and focuses on an intriguing Islamic tale on breast-feeding: lactation by non-pregnant women (or non-puerperal lactation). Apparently, accounts of non-puerperal lactation are not restricted to Islam but have been documented in various societies and religions throughout the centuries. Two medical conditions, hyperprolactinemia and induced lactation, appear as possible explanations for this phenomenon. This serves as an excellent example of the value of utilizing contemporary scientific knowledge in order to elucidate the origin, anthropology, and evolution of ancient myth and superstition. PMID:23675050

  5. Texting on the Move

    Science.gov (United States)

    ... text. What's the Big Deal? The problem is multitasking. No matter how young and agile we are, ... on something other than the road. In fact, driving while texting (DWT) can be more dangerous than ...

  6. The learner as lexicographer: using monolingual and bilingual corpora to deepen vocabulary knowledge

    Directory of Open Access Journals (Sweden)

    Kristina HMELJAK SANGAWA

    2014-12-01

    Full Text Available Learning vocabulary is one of the most challenging tasks faced by learners with a non-kanji background when learning Japanese as a foreign language. However, learners are often not aware of the range of different aspects of word knowledge they need in order to successfully use Japanese. This includes not only the spoken and written form of a word and its meaning, but also morphological, grammatical, collocational, connotative and pragmatic knowledge as well as knowledge of social constraints to be observed. In this article, we present some background data on the use of dictionaries among students of Japanese at the University of Ljubljana, a selection of resources and a series of exercises developed with the following aims: (a) to foster greater awareness of the different aspects of Japanese vocabulary, both from a monolingual and a contrastive perspective, (b) to learn about tools and methods that can be applied in different contexts of language learning and language use, and (c) to develop strategies for learning new vocabulary, reinforcing knowledge about known vocabulary, and effectively using this knowledge in receptive and productive language tasks.

  7. Text Coherence in Translation

    Science.gov (United States)

    Zheng, Yanping

    2009-01-01

    In the thesis a coherent text is defined as a continuity of senses of the outcome of combining concepts and relations into a network composed of knowledge space centered around main topics. And the author maintains that in order to obtain the coherence of a target language text from a source text during the process of translation, a translator can…

  8. Molecular characterization and expression of DERL1 in bovine ovarian follicles and corpora lutea

    Directory of Open Access Journals (Sweden)

    Lussier Jacques G

    2010-08-01

    Full Text Available The endoplasmic reticulum (ER) is a major site of protein synthesis and facilitates the folding and assembly of newly synthesized proteins. Misfolded proteins are retrotranslocated across the ER membrane and destroyed at the proteasome. DERL1 is an important protein involved in the retrotranslocation and degradation of a subset of misfolded proteins from the ER. We characterized a 2617 bp cDNA from bovine granulosa cells that corresponded to bovine DERL1. Two transcripts of 3 and 2.6 kb were detected by Northern blot analysis, and showed variations in expression among tissues. During follicular development, DERL1 expression was greater in day 5 dominant follicles compared to small follicles, ovulatory follicles, or corpus luteum (CL). Within the CL, DERL1 mRNA expression was intermediate in midcycle, and lowest in late cycle as compared to early in the estrous cycle. Western blot analyses demonstrated the presence of DERL1 in the bovine CL at days 5, 11, and 18 of the estrous cycle. Co-immunoprecipitation using luteal tissues showed that DERL1 interacts with class I MHC but not with VIMP or p97 ATPase. The interaction between DERL1 and MHC I suggests that, in the CL, DERL1 may regulate the integrity of MHC I molecules that are transported to the ER membrane. Furthermore, the greater expression of DERL1 mRNA is associated with the active follicular development and early luteal stages, suggesting a role of DERL1 in tissue remodeling events and maintenance of function in reproductive tissues.

  9. A reassessment of traditional lexicographical tools in the light of new corpora: sports Anglicisms in Spanish

    Directory of Open Access Journals (Sweden)

    Isabel Balteiro

    2011-12-01

    Full Text Available There is no question nowadays as to the international and powerful status of English on a global scale and, consequently, as to its presence in non-English-speaking countries at different levels. Linguistically speaking, English is one of the languages that have most influenced Spanish throughout its history, especially from the late 1960s. In this study, the impact of English on Spanish is considered in the language of sports; particularly, sports Anglicisms and false Anglicisms are analysed. Due attention is paid to the different forms that an Anglicism may adopt and to which of those forms are more widely accepted or rejected by prescriptivists and speakers at large, in the light of a contrastive analysis of their appearance in the Nuevo diccionario de anglicismos, the Diccionario de la Real Academia Española and the Corpus de Referencia del Español Actual.

  10. Pastourelle et folklore

    OpenAIRE

    Dumas, René

    2014-01-01

    In the state in which it has come down to us, Béroul's Tristan offers few evocations of the town and the dwellings in which the various characters move. Yet many of the key moments of the romance take place precisely in an urban setting: the scene of Iseut's walk to her execution, for example, or that of the festivities celebrating her return to King Marc. We will therefore attempt to specify the living environment in which the adventures of Tristan and Iseut are set and to seek out the urb...

  11. A 38 Million Words Dutch Text Corpus and its Users

    African Journals Online (AJOL)

    part of speech, was made accessible via Internet (Kruyt 1995a, b). A 27 Million ..... corpora yet, and that 16 user accounts are reserved for students of the Free ... are from Norway, Denmark, Austria, Slovenia, Latvia, Malaysia and Korea.

  12. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.

  13. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading texts and to determine whether the texts are useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to find the description of vocabulary in the texts. It was found that the 20 texts containing 7,945 words are dominated by low frequency words which account for 16.97% of the words in the texts. The high frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from the GSL (General Service List of English Words) (West, 1953). The proportion of the first 1,000 words of the GSL only accounts for 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts, which are made of a series of short, unrelated texts. This kind of text is subject to the accumulation of low frequency words, especially content words, and contains a limited number of words from the GSL. It could also defeat the development of students' reading skills and vocabulary enrichment.
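
    The coverage figures above are shares of running words (tokens) that fall into a given word list. A minimal sketch of that calculation follows; the toy word list standing in for the GSL, the regex tokenizer, and the function name are illustrative assumptions, not the programs used in the study.

```python
import re

def coverage(text, word_list):
    """Return the percentage of running words in `text` that appear in `word_list`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in word_list)
    return 100.0 * hits / len(tokens)

# Hypothetical stand-in for a high-frequency list such as the first 1,000 GSL words.
toy_list = {"the", "a", "of", "and", "is", "in", "to", "cat", "on"}
sample = "The cat is on the mat and the dog is in the garden"
print(round(coverage(sample, toy_list), 1))
```

    Words outside every list (here "mat", "dog", "garden") are what the abstract counts as falling outside the levels; a real analysis would also need lemmatization or word-family grouping, which this sketch omits.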

  14. Dictionaries for text production

    DEFF Research Database (Denmark)

    Fuertes-Olivera, Pedro; Bergenholtz, Henning

    2018-01-01

    Dictionaries for Text Production are information tools that are designed and constructed for helping users to produce (i.e. encode) texts, both oral and written texts. These can be broadly divided into two groups: (a) specialized text production dictionaries, i.e., dictionaries that only offer...... a small amount of lexicographic data, most or all of which are typically used in a production situation, e.g. synonym dictionaries, grammar and spelling dictionaries, collocation dictionaries, concept dictionaries such as the Longman Language Activator, which is advertised as the World’s First Production...... Dictionary; (b) general text production dictionaries, i.e., dictionaries that offer all or most of the lexicographic data that are typically used in a production situation. A review of existing production dictionaries reveals that there are many specialized text production dictionaries but only a few general...

  15. Instant Sublime Text starter

    CERN Document Server

    Haughee, Eric

    2013-01-01

    A starter which teaches the basic tasks to be performed with Sublime Text with the necessary practical examples and screenshots. This book requires only basic knowledge of the Internet and basic familiarity with any one of the three major operating systems, Windows, Linux, or Mac OS X. However, as Sublime Text 2 is primarily a text editor for writing software, many of the topics discussed will be specifically relevant to software development. That being said, the Sublime Text 2 Starter is also suitable for someone without a programming background who may be looking to learn one of the tools of

  16. Linguistics in Text Interpretation

    DEFF Research Database (Denmark)

    Togeby, Ole

    2011-01-01

    A model for how text interpretation proceeds from what is pronounced, through what is said, to what is communicated, and a definition of the concepts 'presupposition' and 'implicature'.

  17. LocText

    DEFF Research Database (Denmark)

    Cejuela, Juan Miguel; Vinchurkar, Shrikant; Goldberg, Tatyana

    2018-01-01

    trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast...

  18. Systematic text condensation

    DEFF Research Database (Denmark)

    Malterud, Kirsti

    2012-01-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation and discuss this approach compared with related strategies.

  19. The Perfect Text.

    Science.gov (United States)

    Russo, Ruth

    1998-01-01

    A chemistry teacher describes the elements of the ideal chemistry textbook. The perfect text is focused and helps students draw a coherent whole out of the myriad fragments of information and interpretation. The text would show chemistry as the central science necessary for understanding other sciences and would also root chemistry firmly in the…

  20. Text 2 Mind Map

    OpenAIRE

    Iona, John

    2017-01-01

    This is a review of the web resource 'Text 2 Mind Map' www.Text2MindMap.com. It covers what the resource is and how it might be used in library and education contexts, in particular by school librarians.

  1. Text File Comparator

    Science.gov (United States)

    Kotler, R. S.

    1983-01-01

    The file comparator program IFCOMP is a text file comparator for IBM OS/VS-compatible systems. IFCOMP accepts two text files as input and produces a listing of differences in pseudo-update form. IFCOMP is very useful in monitoring changes made to software at the source code level.
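
    The idea of listing the differences between two text files can be sketched with Python's standard difflib module. This is only an illustration of the general technique: it produces a unified diff, not IFCOMP's pseudo-update form, and the file names and sample lines are hypothetical.

```python
import difflib

def diff_listing(old_lines, new_lines, old_name="old.txt", new_name="new.txt"):
    """Return a unified-diff listing of the differences between two lists of lines."""
    return list(difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
    ))

# Two hypothetical versions of a small source file.
old = ["CALL INIT", "X = 1", "CALL RUN"]
new = ["CALL INIT", "X = 2", "CALL RUN"]
for line in diff_listing(old, new):
    print(line)
```

    Identical inputs yield an empty listing, which makes a check like this convenient for monitoring whether a source file has changed at all.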

  2. Comparative metabolism of branched-chain amino acids to precursors of juvenile hormone biogenesis in corpora allata of lepidopterous versus nonlepidopterous insects

    Energy Technology Data Exchange (ETDEWEB)

    Brindle, P.A.; Schooley, D.A.; Tsai, L.W.; Baker, F.C.

    1988-08-05

    Comparative studies were performed on the role of branched-chain amino acids (BCAA) in juvenile hormone (JH) biosynthesis using several lepidopterous and nonlepidopterous insects. Corpora cardiaca-corpora allata complexes (CC-CA, the corpora allata being the organ of JH biogenesis) were maintained in culture medium containing a uniformly ¹⁴C-labeled BCAA, together with (methyl-³H)methionine as mass marker for JH quantification. BCAA catabolism was quantified by directly analyzing the medium for the presence of ¹⁴C-labeled propionate and/or acetate, while JHs were extracted, purified by liquid chromatography, and subjected to double-label liquid scintillation counting. Our results indicate that active BCAA catabolism occurs within the CC-CA of lepidopterans, and this efficiently provides propionyl-CoA (from isoleucine or valine) for the biosynthesis of the ethyl branches of JH I and II. Acetyl-CoA, formed from isoleucine or leucine catabolism, is also utilized by lepidopteran CC-CA for biosynthesizing JH III and the acetate-derived portions of the ethyl-branched JHs. In contrast, CC-CA of nonlepidopterans fail to catabolize BCAA. Consequently, exogenous isoleucine or leucine does not serve as a carbon source for the biosynthesis of JH III by these glands, and no propionyl-CoA is produced for genesis of ethyl-branched JHs. This is the first observation of a tissue-specific metabolic difference which in part explains why these novel homosesquiterpenoids exist in lepidopterans, but not in nonlepidopterans.

  3. Dose-Volume Parameters of the Corpora Cavernosa Do Not Correlate With Erectile Dysfunction After External Beam Radiotherapy for Prostate Cancer: Results From a Dose-Escalation Trial

    International Nuclear Information System (INIS)

    Wielen, Gerard J. van der; Hoogeman, Mischa S.; Dohle, Gert R.; Putten, Wim L.J. van; Incrocci, Luca

    2008-01-01

    Purpose: To analyze the correlation between dose-volume parameters of the corpora cavernosa and erectile dysfunction (ED) after external beam radiotherapy (EBRT) for prostate cancer. Methods and Materials: Between June 1997 and February 2003, a randomized dose-escalation trial comparing 68 Gy and 78 Gy was conducted. Patients at our institute were asked to participate in an additional part of the trial evaluating sexual function. After exclusion of patients with less than 2 years of follow-up, ED at baseline, or treatment with hormonal therapy, 96 patients were eligible. The proximal corpora cavernosa (crura), the superiormost 1-cm segment of the crura, and the penile bulb were contoured on the planning computed tomography scan and dose-volume parameters were calculated. Results: Two years after EBRT, 35 of the 96 patients had developed ED. No statistically significant correlations between ED 2 years after EBRT and dose-volume parameters of the crura, the superiormost 1-cm segment of the crura, or the penile bulb were found. The few patients using potency aids typically reported having ED. Conclusion: No correlation was found between ED after EBRT for prostate cancer and radiation dose to the crura or penile bulb. The present study is the largest study evaluating the correlation between ED and radiation dose to the corpora cavernosa after EBRT for prostate cancer. Until there is clear evidence that sparing the penile bulb or crura will reduce ED after EBRT, we advise caution in sparing these structures, especially when this involves reducing treatment margins.

  4. Viana, V.; Tagnin, S. E. O. (orgs.). Corpora no ensino de línguas estrangeiras DOI: 10.5007/2175-7968.2011v1n27p294

    Directory of Open Access Journals (Sweden)

    Leticia Rebollo Couto

    2011-11-01

    Full Text Available The works gathered in this volume explore, through the lens of Corpus Linguistics, applications for the teaching of languages and translation, and offer theoretical support and reflections on this emerging subfield of linguistic studies. Corpora no Ensino de Línguas Estrangeiras is the first volume of its kind in the Brazilian publishing market. It innovates in its subject and in bringing together experienced researchers and language teachers who jointly offer readers material to sharpen their curiosity and to put into practice, in their own classrooms, some of the suggestions made by the authors. Besides establishing more firmly the profile of Corpus Linguistics research and applications in Brazil, the book is of interest to language teachers, translators, linguists, and other professionals in language and literature, who will certainly find in it a foundation for developing their competence in the methodologies and applications of this stimulating field of knowledge.

  5. Os regimentos das corporações dos ofícios mecânicos: O caso do Retábulo-mor da Sé de Lamego (1506-1511) do pintor português Vasco Fernandes

    Directory of Open Access Journals (Sweden)

    Joana Salgueiro

    2010-08-01

    Full Text Available The subject of this study, the main altarpiece of Lamego Cathedral (1506-1511), a work of undeniable historical and artistic importance by the sixteenth-century painter Vasco Fernandes, “Grão Vasco”, is an ensemble valuably documented by its work contract, which has survived to the present day. However, it is known that data perceived empirically, or even recorded in the notarial acts concerning the making of the altarpiece, for many reasons do not always correspond fully to reality. The aim of the work that follows is to cross-reference technical and material knowledge of the supports of these paintings with the data analysed in the regulations of the guilds of the mechanical woodworking trades: carpenters, joinery carpenters, cabinetmakers, and woodcarvers (and, by comparison, painters). The goal is to determine, through the methodologies for examining the apprentices of these trades and the other regulations, the execution techniques and materials required in the historical context of the Portuguese Renaissance.

  6. Zum Bildungspotenzial biblischer Texte

    Directory of Open Access Journals (Sweden)

    Theis, Joachim

    2017-11-01

    Full Text Available Biblical education as a holistic process goes far beyond biblical learning. It must be understood as a lifelong process in which biblical texts and their readers each engage the other in a dialogical way. The recipient's horizon of understanding is not an empty room to be filled by the text alone, nor is the text dead material to be examined only cognitively. The recipient discovers the meaning of the biblical text by recomposing it through existential appropriation, so the text is brought to life in each individual reality. Scientific insights, subjective structures, and the community of readers must all be included to avoid potential one-sidedness. Unfortunately, a particular negative association very often obscures the approach to the Bible: biblical work in religious education still tends to be cognitively oriented, regarding neither the vitality and sovereignty of the biblical texts nor the students' desire for meaning. Moreover, the Bible is misused for teaching moral concepts or for pontificating. Such pitfalls can be countered by biblical didactics conceived as empowerment didactics. Respecting the sovereignty of biblical texts, these didactics assist readers in their individuation by opening the texts with a focus on the reader's otherness. Thus both the text and the recipient become subjects in a dialogue. The approach of Biblical-Enabling-Didactics lets the Bible become ever anew a book of life. Understood from within their hermeneutics, empowerment didactics could be raised to the principle of biblical didactics in general and grow into an essential element of holistic education.

  7. From university research to innovation: Detecting knowledge transfer via text mining

    DEFF Research Database (Denmark)

    Woltmann, Sabrina; Clemmensen, Line Katrine Harder; Alkærsig, Lars

    2016-01-01

    and indicators such as patents, collaborative publications and license agreements, to assess the contribution to the socioeconomic surrounding of universities. In this study, we present an extension of the current empirical framework by applying new computational methods, namely text mining and pattern...... associated the former with the latter to obtain insights into possible text and semantic relatedness. The text mining methods are extrapolating the correlations, semantic patterns and content comparison of the two corpora to define the document relatedness. We expect the development of a novel tool using...... recognition. Text samples for this purpose can include files containing social media contents, company websites and annual reports. The empirical focus in the present study is on the technical sciences and in particular on the case of the Technical University of Denmark (DTU). We generated two independent...

  8. Corpora and Cultural Cognition

    DEFF Research Database (Denmark)

    Jensen, Kim Ebensgaard

    2017-01-01

    Cultural cognition is, to a great extent, transmitted through language and, consequently, reflected and replicated in language use. Cultural cognition may be instantiated in various patterns of language use, such as the discursive behavior of constructions. Very often, such instantiations can be ...... is addressed. In the third part of the chapter, three case studies are presented – one from Danish and two from English – to illustrate the analysis of cultural conceptualization via corpus-linguistic techniques....

  9. EST: Evading Scientific Text.

    Science.gov (United States)

    Ward, Jeremy

    2001-01-01

    Examines chemical engineering students' attitudes to text and other parts of English language textbooks. A questionnaire was administered to a group of undergraduates. Results reveal one way students get around the problem of textbook reading. (Author/VWL)

  10. nal Sesotho texts

    African Journals Online (AJOL)

    with literary texts written in indigenous South African languages. The project ... Homi Bhabha uses the words of Salman Rushdie to underline the fact that new .... I could not conceptualise an African-language-to-African-language dictionary. An.

  11. Plagiarism in Academic Texts

    Directory of Open Access Journals (Sweden)

    Marta Eugenia Rojas-Porras

    2012-08-01

    Full Text Available The ethical and social responsibility of citing the sources in a scientific or artistic work is undeniable. This paper explores, in a preliminary way, academic plagiarism in its various forms. It includes findings based on a forensic analysis. The purpose of this paper is to raise awareness of the importance of considering these details when writing and publishing a text. Hopefully, this analysis may open the issue to discussion.

  12. Machine Translation from Text

    Science.gov (United States)

    Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

    Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.

  13. TEXT Energy Storage System

    International Nuclear Information System (INIS)

    Weldon, W.F.; Rylander, H.G.; Woodson, H.H.

    1977-01-01

    The Texas Experimental Tokamak (TEXT) Energy Storage System, designed by the Center for Electromechanics (CEM), consists of four 50 MJ, 125 V homopolar generators and their auxiliaries and is designed to power the toroidal and poloidal field coils of TEXT on a two-minute duty cycle. The four 50 MJ generators connected in series were chosen because they represent the minimum cost configuration and also represent a minimal scale up from the successful 5.0 MJ homopolar generator designed, built, and operated by the CEM.

  14. Reshaping Text Data for Efficient Processing on Amazon EC2

    Directory of Open Access Journals (Sweden)

    Gabriela Turcu

    2011-01-01

    Full Text Available Text analysis tools are nowadays required to process increasingly large corpora which are often organized as small files (abstracts, news articles, etc.). Cloud computing offers a convenient, on-demand, pay-as-you-go computing environment for solving such problems. We investigate provisioning on the Amazon EC2 cloud from the user perspective, attempting to provide a scheduling strategy that is both timely and cost effective. We derive an execution plan using an empirically determined application performance model. A first goal of our performance measurements is to determine an optimal file size for our application to consume. Using the subset-sum first fit heuristic we reshape the input data by merging files in order to match as closely as possible the desired file size. This also speeds up the task of retrieving the results of our application, by having the output be less segmented. Using predictions of the performance of our application based on measurements on small data sets, we devise an execution plan that meets a user specified deadline while minimizing cost.
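
    The merging step described in this abstract can be illustrated with a minimal sketch. It assumes a plain first-fit packing of files into groups no larger than a target size, which may differ from the authors' exact subset-sum variant. The function name `first_fit_merge`, the file names and the sizes are hypothetical.

```python
def first_fit_merge(file_sizes, target_size):
    """Group files so that each group's total size stays within target_size.

    Each file goes into the first existing group that still has room;
    a new group is opened only when no existing group fits. A file larger
    than the target ends up in a group of its own.
    """
    groups = []   # list of lists of file names
    totals = []   # running total size of each group
    for name, size in file_sizes:
        for i, total in enumerate(totals):
            if total + size <= target_size:
                groups[i].append(name)
                totals[i] += size
                break
        else:
            # No existing group had room: open a new one.
            groups.append([name])
            totals.append(size)
    return groups, totals


# Hypothetical input: (file name, size in arbitrary units).
files = [("a.txt", 40), ("b.txt", 70), ("c.txt", 30), ("d.txt", 60), ("e.txt", 20)]
merged, merged_totals = first_fit_merge(files, target_size=100)
print(merged)         # groups of file names to concatenate
print(merged_totals)  # resulting merged sizes
```

    First fit is greedy: it does not guarantee the closest possible match to the target size, but it runs in a single pass over the files, which matters when the corpus consists of many small files.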

  15. A Text-Independent Speaker Authentication System for Mobile Devices

    Directory of Open Access Journals (Sweden)

    Florentin Thullier

    2017-09-01

    Full Text Available This paper presents a text-independent speaker authentication method adapted to mobile devices. Special attention was placed on delivering a fully operational application that offers a sufficient level of reliability and runs efficiently. To this end, we have excluded the need for any network communication. Hence, we opted to complete both the training and the identification processes directly on the mobile device, extracting linear prediction cepstral coefficients and using the naive Bayes algorithm as the classifier. Furthermore, the authentication decision is enhanced to overcome misidentification through access privileges that the user should attribute to each application beforehand. To evaluate the proposed authentication system, eleven participants were involved in the experiment, conducted in quiet and noisy environments. Public speech corpora were also employed to compare this implementation to existing methods. The system was efficient in its consumption of mobile resources. The overall classification performance obtained was accurate with a small number of samples. It thus appeared that our authentication system might be used as a first security layer, but also as part of a multilayer authentication, or as a fall-back mechanism.

  16. New mathematical cuneiform texts

    CERN Document Server

    Friberg, Jöran

    2016-01-01

    This monograph presents in great detail a large number of both unpublished and previously published Babylonian mathematical texts in the cuneiform script. It is a continuation of the work A Remarkable Collection of Babylonian Mathematical Texts (Springer 2007) written by Jöran Friberg, the leading expert on Babylonian mathematics. Focussing on the big picture, Friberg explores in this book several Late Babylonian arithmetical and metro-mathematical table texts from the sites of Babylon, Uruk and Sippar, collections of mathematical exercises from four Old Babylonian sites, as well as a new text from Early Dynastic/Early Sargonic Umma, which is the oldest known collection of mathematical exercises. A table of reciprocals from the end of the third millennium BC, differing radically from well-documented but younger tables of reciprocals from the Neo-Sumerian and Old-Babylonian periods, as well as a fragment of a Neo-Sumerian clay tablet showing a new type of a labyrinth are also discussed. The material is presen...

  17. The Emar Lexical Texts

    NARCIS (Netherlands)

    Gantzert, Merijn

    2011-01-01

    This four-part work provides a philological analysis and a theoretical interpretation of the cuneiform lexical texts found in the Late Bronze Age city of Emar, in present-day Syria. These word and sign lists, commonly dated to around 1100 BC, were almost all found in the archive of a single school.

  18. Texts and Readers.

    Science.gov (United States)

    Iser, Wolfgang

    1980-01-01

    Notes that, since fictional discourse need not reflect prevailing systems of meaning and norms or values, readers gain detachment from their own presuppositions; by constituting and formulating text-sense, readers are constituting and formulating their own cognition and becoming aware of the operations for doing so. (FL)

  19. Documents and legal texts

    International Nuclear Information System (INIS)

    2017-01-01

    This section treats of the following documents and legal texts: 1 - Belgium, 29 June 2014 - Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy; 2 - Belgium, 7 December 2016 - Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy

  20. Strategy as Texts

    DEFF Research Database (Denmark)

    Obed Madsen, Søren

    of the strategy into four categories. Second, the managers produce new texts based on the original strategy document by using four different ways of translation models. The study’s findings contribute to three areas. Firstly, it shows that translation is more than a sociological process. It is also...... a craftsmanship that requires knowledge and skills, which unfortunately seems to be overlooked in both the literature and in practice. Secondly, it shows that even though a strategy text is in singular, the translation makes strategy plural. Thirdly, the article proposes a way to open up the black box of what......This article shows empirically how managers translate a strategy plan at an individual level. By analysing how managers in three organizations translate strategies, it identifies that the translation happens in two steps: First, the managers decipher the strategy by coding the different parts...

  1. tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles.

    Science.gov (United States)

    Cejuela, Juan Miguel; McQuilton, Peter; Ponting, Laura; Marygold, Steven J; Stefancsik, Raymund; Millburn, Gillian H; Rost, Burkhard

    2014-01-01

    The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the 'tagtog' system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog-named entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation. DATABASE URL: www.tagtog.net, www.flybase.org.

  2. Reading Authentic Texts

    DEFF Research Database (Denmark)

    Balling, Laura Winther

    2013-01-01

    Most research on cognates has focused on words presented in isolation that are easily defined as cognate between L1 and L2. In contrast, this study investigates what counts as cognate in authentic texts and how such cognates are read. Participants with L1 Danish read news articles in their highly...... proficient L2, English, while their eye-movements were monitored. The experiment shows a cognate advantage for morphologically simple words, but only when cognateness is defined relative to translation equivalents that are appropriate in the context. For morphologically complex words, a cognate disadvantage...... word predictability indexed by the conditional probability of each word....

  3. Documents and legal texts

    International Nuclear Information System (INIS)

    2016-01-01

    This section treats of the following documents and legal texts: 1 - Brazil: Law No. 13,260 of 16 March 2016 (To regulate the provisions of item XLIII of Article 5 of the Federal Constitution on terrorism, dealing with investigative and procedural provisions and redefining the concept of a terrorist organisation; and amends Laws No. 7,960 of 21 December 1989 and No. 12,850 of 2 August 2013); 2 - India: The Atomic Energy (Amendment) Act, 2015; Department Of Atomic Energy Notification (Civil Liability for Nuclear Damage); 3 - Japan: Act on Subsidisation, etc. for Nuclear Damage Compensation Funds following the implementation of the Convention on Supplementary Compensation for Nuclear Damage

  4. Journalistic Text Production

    DEFF Research Database (Denmark)

    Haugaard, Rikke Hartmann

    , a multiple case study investigated three professional text producers’ practices as they unfolded in their natural setting at the Spanish newspaper, El Mundo. • Results indicate that journalists’ revisions are related to form markedly more often than to content. • Results suggest two writing phases serving...... at the Spanish newspaper, El Mundo, in Madrid. The study applied a combination of quantitative and qualitative methods, i.e. keystroke logging, participant observation and retrospective interview. Results indicate that journalists’ revisions are related to form markedly more often than to content (approx. three...

  5. Weitere Texte physiognomischen Inhalts

    Directory of Open Access Journals (Sweden)

    Böck, Barbara

    2004-12-01

    Full Text Available The present article offers the edition of three cuneiform texts belonging to the Akkadian handbook of omens drawn from the physical appearance as well as the morals and behaviour of man. The book comprising up to 27 chapters with more than 100 omens each was entitled in antiquity Alamdimmû. The edition of the three cuneiform tablets completes, thus, the author's monographic study on the ancient Mesopotamian divinatory discipline of physiognomy (Die babylonisch-assyrische Morphoskopie (Wien 2000) [=AfO Beih. 27]).

    This article presents the editio princeps of three cuneiform texts held in the British Museum (London) and the Vorderasiatisches Museum (Berlin), which belong to the Assyro-Babylonian book of physiognomic omens. This book, originally entitled Alamdimmû ('form, figure'), consists of 27 chapters, each with more than a hundred omens written in Akkadian. The three texts thus complete the author's monographic study of the divinatory discipline of physiognomy in the ancient Near East (Die babylonisch-assyrische Morphoskopie (Wien 2000) [=AfO Beih. 27]).

  6. Utah Text Retrieval Project

    Energy Technology Data Exchange (ETDEWEB)

    Hollaar, L A

    1983-10-01

    The Utah Text Retrieval project seeks well-engineered solutions to the implementation of large, inexpensive, rapid text information retrieval systems. The project has three major components. Perhaps the best known is the work on the specialized processors, particularly search engines, necessary to achieve the desired performance and cost. The other two concern the user interface to the system and the system's internal structure. The work on user interface development is not only concentrating on the syntax and semantics of the query language, but also on the overall environment the system presents to the user. Environmental enhancements include convenient ways to browse through retrieved documents, access to other information retrieval systems through gateways supporting a common command interface, and interfaces to word processing systems. The system's internal structure is based on a high-level data communications protocol linking the user interface, index processor, search processor, and other system modules. This allows them to be easily distributed in a multi- or specialized-processor configuration. It also allows new modules, such as a knowledge-based query reformulator, to be added. 15 references.

  7. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Directory of Open Access Journals (Sweden)

    Anika Oellrich

    Full Text Available Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content...

  8. CUILESS2016: a clinical corpus applying compositional normalization of text mentions.

    Science.gov (United States)

    Osborne, John D; Neu, Matthew B; Danila, Maria I; Solorio, Thamar; Bethard, Steven J

    2018-01-10

    Traditionally text mention normalization corpora have normalized concepts to single ontology identifiers ("pre-coordinated concepts"). Less frequently, normalization corpora have used concepts with multiple identifiers ("post-coordinated concepts") but the additional identifiers have been restricted to a defined set of relationships to the core concept. This approach limits the ability of the normalization process to express semantic meaning. We generated a freely available corpus using post-coordinated concepts without a defined set of relationships that we term "compositional concepts" to evaluate their use in clinical text. We annotated 5397 disorder mentions from the ShARe corpus to SNOMED CT that were previously normalized as "CUI-less" in the "SemEval-2015 Task 14" shared task because they lacked a pre-coordinated mapping. Unlike the previous normalization method, we do not restrict concept mappings to a particular set of the Unified Medical Language System (UMLS) semantic types and allow normalization to occur to multiple UMLS Concept Unique Identifiers (CUIs). We computed annotator agreement and assessed semantic coverage with this method. We generated the largest clinical text normalization corpus to date with mappings to multiple identifiers and made it freely available. All but 8 of the 5397 disorder mentions were normalized using this methodology. Annotator agreement ranged from 52.4% using the strictest metric (exact matching) to 78.2% using a hierarchical agreement that measures the overlap of shared ancestral nodes. Our results provide evidence that compositional concepts can increase semantic coverage in clinical text. To our knowledge we provide the first freely available corpus of compositional concept annotation in clinical text.
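
    The two ends of the agreement range reported here can be illustrated with a rough sketch. It assumes that the hierarchical metric is a Jaccard overlap of each annotation's ancestor sets, which is only one plausible reading of "overlap of shared ancestral nodes" and not necessarily the authors' definition. The concept names and the toy hierarchy are hypothetical, not SNOMED CT content.

```python
def exact_agreement(ann_a, ann_b):
    """Fraction of mentions where both annotators chose the same concept set."""
    matches = sum(1 for a, b in zip(ann_a, ann_b) if set(a) == set(b))
    return matches / len(ann_a)


def ancestors(concept, parent_of):
    """Return the concept plus all of its ancestors in a single-parent hierarchy."""
    seen = set()
    while concept is not None:
        seen.add(concept)
        concept = parent_of.get(concept)
    return seen


def hierarchical_agreement(ann_a, ann_b, parent_of):
    """Mean Jaccard overlap of ancestor sets (an assumed metric, see lead-in)."""
    scores = []
    for a, b in zip(ann_a, ann_b):
        anc_a = set().union(*(ancestors(c, parent_of) for c in a))
        anc_b = set().union(*(ancestors(c, parent_of) for c in b))
        scores.append(len(anc_a & anc_b) / len(anc_a | anc_b))
    return sum(scores) / len(scores)


# Hypothetical toy hierarchy: child -> parent (None marks the root).
parent_of = {
    "fracture_of_femur": "fracture",
    "fracture_of_tibia": "fracture",
    "fracture": "disorder",
    "disorder": None,
}
# One list of concepts per mention, so a mention may map to several concepts.
annotator_a = [["fracture_of_femur"], ["fracture"]]
annotator_b = [["fracture_of_tibia"], ["fracture"]]

exact = exact_agreement(annotator_a, annotator_b)
hierarchical = hierarchical_agreement(annotator_a, annotator_b, parent_of)
print(exact, hierarchical)
```

    In this toy case the annotators disagree on the first mention but pick sibling concepts, so the hierarchical score is higher than the exact-match score, which is the same direction as the gap between the 52.4% and 78.2% figures above.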

  9. From university research to innovation: Detecting knowledge transfer via text mining

    Energy Technology Data Exchange (ETDEWEB)

    Woltmann, S.; Clemmensen, L.; Alkærsig, L

    2016-07-01

    Knowledge transfer by universities is a top priority in innovation policy and a primary purpose for public research funding, because it is an important driver of technical change and innovation. Current empirical research on the impact of university research relies mainly on formal databases and indicators such as patents, collaborative publications and license agreements to assess the contribution to the socioeconomic surroundings of universities. In this study, we present an extension of the current empirical framework by applying new computational methods, namely text mining and pattern recognition. Text samples for this purpose can include files containing social media contents, company websites and annual reports. The empirical focus in the present study is on the technical sciences and in particular on the case of the Technical University of Denmark (DTU). We generated two independent text collections (corpora) to identify correlations between university publications and company webpages. One corpus represents the company sites and serves as a sample of the private economy; the second provides the reference to the university research and contains relevant publications. We associated the former with the latter to obtain insights into possible text and semantic relatedness. The text mining methods extrapolate the correlations, semantic patterns and content comparisons of the two corpora to define document relatedness. We expect the development of a novel tool using contemporary techniques for the measurement of public research impact. The approach aims to be applicable across universities and thus enable a more holistic and comparable assessment. It relies less on formal databases, which is beneficial in terms of data reliability. We seek to provide a supplementary perspective for the detection of the dissemination of university research and hereby enable policy makers to gain additional insights into (informal) contributions of knowledge
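
    The abstract does not say how document relatedness between the two corpora is computed. As a hedged illustration only, the sketch below assumes TF-IDF weighting with cosine similarity, a common baseline for this task. The sample publication and company texts are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so no weight is zero.
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical university publications and one hypothetical company page.
publications = [
    "wind turbine blade fatigue model",
    "enzyme catalysis in yeast cells",
]
company_page = "we design turbine blade coatings against fatigue"

vectors = tfidf_vectors(publications + [company_page])
page_vec = vectors[-1]
scores = [cosine(page_vec, pub_vec) for pub_vec in vectors[:-1]]
print(scores)  # one relatedness score per publication
```

    A purely lexical measure like this only captures shared vocabulary; the semantic relatedness mentioned in the abstract would need something beyond term overlap.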

  10. Documents and legal texts

    International Nuclear Information System (INIS)

    2013-01-01

    This section reprints a selection of recently published legislative texts and documents: - Russian Federation: Federal Law No.170 of 21 November 1995 on the use of atomic energy, Adopted by the State Duma on 20 October 1995; - Uruguay: Law No.19.056 On the Radiological Protection and Safety of Persons, Property and the Environment (4 January 2013); - Japan: Third Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (concerning Damages related to Rumour-Related Damage in the Agriculture, Forestry, Fishery and Food Industries), 30 January 2013; - France and the United States: Joint Statement on Liability for Nuclear Damage (Aug 2013); - Franco-Russian Nuclear Power Declaration (1 November 2013)

  11. Event-based text mining for biology and functional genomics

    Science.gov (United States)

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B.

    2015-01-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research. PMID:24907365

  12. Interconnectedness und digitale Texte

    Directory of Open Access Journals (Sweden)

    Detlev Doherr

    2013-04-01

    Full Text Available Abstract The multimedia information services on the Internet are becoming ever larger and more comprehensive, and libraries are also digitizing documents that exist only in print and putting them online. These documents can be found through online document management systems or search engines and then provided in common formats such as PDF. This article examines how the Humboldt Digital Library (HDL) works; for more than ten years it has made documents by Alexander von Humboldt available on the Web free of charge in English translation. Unlike a digital library, however, it does not merely provide digitized documents as scans or PDFs, but makes the text itself available in interlinked form. The system therefore resembles an information system more than a digital library, which is also reflected in the available functions for finding texts in different versions and translations, comparing paragraphs of different documents, or displaying images in their context. The development of dynamic hyperlinks based on the individual text paragraphs of Humboldt's works, in the form of media assets, makes it possible to use the Google Maps programming interface for both geographic and content-based navigation. Going beyond the service of a digital library, the HDL offers the prototype of a multidimensional information system that works with dynamic structures and enables extensive thematic analyses and comparisons. Summary The multimedia information services on Internet are becoming more and more comprehensive, even the printed documents are digitized and republished as digital Web documents by the libraries. Those digital files can be found by search engines or management tools and provided as files in usual formats as

  13. Documents and legal texts

    International Nuclear Information System (INIS)

    2015-01-01

    This section treats of the following Documents and legal texts: 1 - Canada: Nuclear Liability and Compensation Act (An Act respecting civil liability and compensation for damage in case of a nuclear incident, repealing the Nuclear Liability Act and making consequential amendments to other acts); 2 - Japan: Act on Compensation for Nuclear Damage (The purpose of this act is to protect persons suffering from nuclear damage and to contribute to the sound development of the nuclear industry by establishing a basic system regarding compensation in case of nuclear damage caused by reactor operation etc.); Act on Indemnity Agreements for Compensation of Nuclear Damage; 3 - Slovak Republic: Act on Civil Liability for Nuclear Damage and on its Financial Coverage and on Changes and Amendments to Certain Laws (This Act regulates: a) The civil liability for nuclear damage incurred in the causation of a nuclear incident, b) The scope of powers of the Nuclear Regulatory Authority (hereinafter only as the 'Authority') in relation to the application of this Act, c) The competence of the National Bank of Slovakia in relation to the supervised financial market entities in the financial coverage of liability for nuclear damage; and d) The penalties for violation of this Act)

  14. Documents and legal texts

    International Nuclear Information System (INIS)

    2014-01-01

    This section of the Bulletin presents the recently published documents and legal texts sorted by country: - Brazil: Resolution No. 169 of 30 April 2014. - Japan: Act Concerning Exceptions to Interruption of Prescription Pertaining to Use of Settlement Mediation Procedures by the Dispute Reconciliation Committee for Nuclear Damage Compensation in relation to Nuclear Damage Compensation Disputes Pertaining to the Great East Japan Earthquake (Act No. 32 of 5 June 2013); Act Concerning Measures to Achieve Prompt and Assured Compensation for Nuclear Damage Arising from the Nuclear Plant Accident following the Great East Japan Earthquake and Exceptions to the Extinctive Prescription, etc. of the Right to Claim Compensation for Nuclear Damage (Act No. 97 of 11 December 2013); Fourth Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage Resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.); Outline of 'Fourth Supplement to Interim Guidelines (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.)'. - OECD Nuclear Energy Agency: Decision and Recommendation of the Steering Committee Concerning the Application of the Paris Convention to Nuclear Installations in the Process of Being Decommissioned; Joint Declaration on the Security of Supply of Medical Radioisotopes. - United Arab Emirates: Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage; Ratification of the Federal Supreme Council of Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage

  15. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Science.gov (United States)

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh
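
    The remark that the F1-measure drops because of low recall follows from F1 being the harmonic mean of precision and recall. A small worked sketch is given below. The precision of 0.42 echoes the figure in the abstract, while the recall values are hypothetical, chosen only to show the effect.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision of 0.42 as in the abstract; a recall of 0.10 is an assumed value.
high_precision_low_recall = f1_score(0.42, 0.10)
# A hypothetical system with lower precision but matching recall.
balanced = f1_score(0.30, 0.30)

print(round(high_precision_low_recall, 3))
print(round(balanced, 3))
```

    Because the harmonic mean is pulled toward the smaller of the two values, a system with good precision but poor recall can score below a system that is mediocre on both.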

  16. The interpretation of dream meaning: Resolving ambiguity using Latent Semantic Analysis in a small corpus of text.

    Science.gov (United States)

    Altszyler, Edgar; Ribeiro, Sidarta; Sigman, Mariano; Fernández Slezak, Diego

    2017-11-01

    Computer-based dream content analysis relies on word frequencies within predefined categories in order to identify different elements in text. As a complementary approach, we explored the capabilities and limitations of word-embedding techniques to identify word usage patterns among dream reports. These tools allow us to quantify word associations in text and to identify the meaning of target words. Word-embeddings have been extensively studied in large datasets, but only a few studies analyze semantic representations in small corpora. To fill this gap, we compared Skip-gram and Latent Semantic Analysis (LSA) capabilities to extract semantic associations from dream reports. LSA showed better performance than Skip-gram in small size corpora in two tests. Furthermore, LSA captured relevant word associations in the dream collection, even in cases with low-frequency words or small numbers of dreams. Word associations in dream reports can thus be quantified by LSA, which opens new avenues for dream interpretation and decoding.

  17. Text-Mining Applications for Creation of Biofilm Literature Database

    Directory of Open Access Journals (Sweden)

    Kanika Gupta

    2017-10-01

    In the present research, a published corpus of 34306 biofilm documents was collected from the PubMed database along with non-indexed resources such as books, conferences and newspaper articles, and these were divided into five categories: classification, growth and development, physiology, drug effects and radiation effects. Each of the five categories was further divided into three parts (Journal Title, Abstract Title, and Abstract Text) to make indexing highly specific. Text processing was done using the software Rapid Miner_v5.3, which tokenizes the entire text into words and provides the frequency of each word within the document. The obtained words were normalized using the Remove Stop and Stem Word commands of Rapid Miner_v5.3, which remove stop words and reduce words to their stems. The obtained words were stored in MS-Excel 2007 and sorted in decreasing order of frequency using the Sort & Filter command of MS-Excel 2007. The words were visualized as networks using Cytoscape_v2.7.0. The words obtained were highly specific for biofilms, generating a controlled biofilm vocabulary that could be used for indexing biofilm articles (similar to the MeSH database, which indexes articles for PubMed). The obtained keyword information was stored in a relational database hosted locally on the WAMP_v2.4 (Windows, Apache, MySQL, PHP) server. The available biofilm vocabulary will be significant for researchers studying biofilm literature, making their search easy and efficient.
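
    The tokenizing, stop-word removal, stemming and frequency-sorting steps described in this abstract were carried out with Rapid Miner and MS-Excel. The sketch below is not that workflow; it is a minimal standard-library illustration of the same idea. The stop-word list, the crude suffix stripper `strip_suffix` and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "to", "on"}


def strip_suffix(word):
    """Very crude stemmer: drop a trailing 'ing' or 's' if enough of the word remains."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_frequencies(text):
    """Tokenize, drop stop words, stem, and return (term, count) pairs by decreasing count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = [strip_suffix(t) for t in tokens if t not in STOP_WORDS]
    return Counter(terms).most_common()


sample = "Biofilms grow on implants. The biofilm matrix protects cells in biofilms."
frequencies = term_frequencies(sample)
print(frequencies)
```

    A production pipeline would normally use an established stemmer (for example a Porter-style algorithm) in place of the suffix stripper, which is kept deliberately simple here.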

  18. Aspects of Text Mining From Computational Semiotics to Systemic Functional Hypertexts

    Directory of Open Access Journals (Sweden)

    Alexander Mehler

    2001-05-01

    Full Text Available The significance of natural language texts as the prime information structure for the management and dissemination of knowledge in organisations is still increasing. Making relevant documents available depending on varying tasks in different contexts is of primary importance for efficient task completion. Meeting this demand requires content-based processing of texts, which makes it possible to reconstruct or, if necessary, to explore the relationship of task, context and document. Text mining is a technology that is suitable for solving problems of this kind. In the following, semiotic aspects of text mining are investigated. Based on the primary object of text mining, natural language lexis, the specific complexity of this class of signs is outlined and requirements for the implementation of text mining procedures are derived. This is done with reference to text linkage, introduced as a special task in text mining. Text linkage refers to the exploration of implicit, content-based relations of texts (and their annotation as typed links) in corpora possibly organised as hypertexts. In this context, the term systemic functional hypertext is introduced, which distinguishes genre and register layers for the management of links in a poly-level hypertext system.

  19. Text

    International Nuclear Information System (INIS)

    Anon.

    2009-01-01

    The purpose of this act is to safeguard against the dangers and harmful effects of radioactive waste and to contribute to public safety and environmental protection by laying down requirements for the safe and efficient management of radioactive waste. The text covers definitions, interrelation with other legislation, responsibilities of the state and local governments, responsibilities of radioactive waste management companies and generators, formulation of the basic plan for the control of radioactive waste, radioactive waste management (with public information, financing and part of spent fuel management), the Korea Radioactive Waste Management Corporation (business activities, budget), establishment of a radioactive waste fund to secure the financial resources required for radioactive waste management, and penalties in case of improper operation of radioactive waste management. (N.C.)

  20. Mevlana’nın Menkıbeleri Üzerine Folklorik Bir İnceleme A Folkloric Analysis on the Legends of Mevlana

    Directory of Open Access Journals (Sweden)

    Gülay KARAMAN

    2012-09-01

    Full Text Available behavior to be proud of. Its plural form, menakıb, was first used in this meaning in the corpus of hadith written and compiled by the 9th century, to describe the virtues of the Prophet Muhammed and his companions. Writings that contain the biographies of historical personages, descriptions of the works of notable persons, and even accounts of some of the holy cities are also called menakıb. While menakıbnames were at first created to describe the high moral values of the Prophet Muhammed and his companions, in later periods the lives of important figures of Sufism and the religious orders were added to this body of work. The first known example of Turkish menakıbname literature is Tezkire-i Satuk Buğra Han, from the Karahanlı period. Beginning with this work, Turkish menakıbname literature spread quickly among the Muslim Turks who migrated to Anatolia and settled there. Since an author is a member of his own society, his work naturally mirrors the social, cultural, economic and political life of its century. For this reason, menakıbnames, which tell the extraordinary life stories of saints, are very important sources of information, especially for history, culture, folklore and literature. Careful study of these legends can yield a very rich source of information. In Türkiye, Fuad Köprülü, with his work Türk Edebiyatında İlk Mutasavvıflar, was the first to point out the value of menakıbnames for scholarly studies. In this study we want to call attention to Menâkıbu’l-Ârifîn, which tells the legends of Mevlana and the other Mevlevi saints. Menâkıbu’l-Ârifîn was written in Persian by the Mevlevi Ahmed Eflâkî in the 14th century at the request of his sheikh, Ulu Arif Çelebi. This menakıbname has a definite place in Turkish history and culture, since it gives first-hand information about Mevlana and the other Mevlevi saints. In this

  1. Teaching Text Structure: Examining the Affordances of Children's Informational Texts

    Science.gov (United States)

    Jones, Cindy D.; Clark, Sarah K.; Reutzel, D. Ray

    2016-01-01

    This study investigated the affordances of informational texts to serve as model texts for teaching text structure to elementary school children. Content analysis of a random sampling of children's informational texts from top publishers was conducted on text structure organization and on the inclusion of text features as signals of text…

  2. Important Text Characteristics for Early-Grades Text Complexity

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Koons, Heather; Hiebert, Elfrieda H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2015-01-01

    The Common Core set a standard for all children to read increasingly complex texts throughout schooling. The purpose of the present study was to explore text characteristics specifically in relation to early-grades text complexity. Three hundred fifty primary-grades texts were selected and digitized. Twenty-two text characteristics were identified…

  3. Probing the statistical properties of unknown texts: application to the Voynich Manuscript.

    Science.gov (United States)

    Amancio, Diego R; Altmann, Eduardo G; Rybski, Diego; Oliveira, Osvaldo N; Costa, Luciano da F

    2013-01-01

    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.
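
    The idea of treating a text as a time series and comparing it with its shuffled version can be illustrated with a short sketch. It assumes one common definition of intermittency, the coefficient of variation (standard deviation divided by mean) of the gaps between successive occurrences of a word; the abstract does not give the authors' exact formula, and the function names `intermittency` and `shuffled_baseline` are hypothetical.

```python
import random
from statistics import mean, pstdev


def intermittency(tokens, word):
    """Coefficient of variation of the gaps between successive occurrences of word.

    Returns None when the word occurs too rarely to yield at least two gaps.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return None
    return pstdev(gaps) / mean(gaps)


def shuffled_baseline(tokens, word, trials=200, seed=0):
    """Mean intermittency of word over randomly shuffled copies of the text."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        shuffled = tokens[:]
        rng.shuffle(shuffled)
        value = intermittency(shuffled, word)
        if value is not None:
            values.append(value)
    return mean(values)


# A toy "text" in which the word "key" appears in two tight bursts.
bursty = ["key"] * 4 + ["w"] * 46 + ["key"] * 4
observed = intermittency(bursty, "key")
baseline = shuffled_baseline(bursty, "key")
print(observed > baseline)
```

    A word whose observed value sits well above the shuffled baseline is clustered in the text rather than spread evenly, which is the kind of signal the abstract uses to separate real texts from shuffled ones and to propose keyword candidates.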

  4. Text analysis methods, text analysis apparatuses, and articles of manufacture

    Science.gov (United States)

    Whitney, Paul D; Willse, Alan R; Lopresti, Charles A; White, Amanda M

    2014-10-28

    Text analysis methods, text analysis apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a text analysis method includes accessing information indicative of data content of a collection of text comprising a plurality of different topics, using a computing device, analyzing the information indicative of the data content, and using results of the analysis, identifying a presence of a new topic in the collection of text.

  5. Classroom Texting in College Students

    Science.gov (United States)

    Pettijohn, Terry F.; Frazier, Erik; Rieser, Elizabeth; Vaughn, Nicholas; Hupp-Wilds, Bobbi

    2015-01-01

    A 21-item survey on texting in the classroom was given to 235 college students. Overall, 99.6% of students owned a cellphone and 98% texted daily. Of the 138 students who texted in the classroom, most texted friends or significant others, and indicated that the reason for classroom texting was boredom or work. Students who texted sent a mean of 12.21…

  6. Serum progesterone levels for diagnosing pregnancy and monitoring corpora lutea function during different reproductive stages in hormonally-treated heat synchronized female damascus goats

    International Nuclear Information System (INIS)

    Zakawi, M.

    2003-01-01

    An experiment was conducted on female Damascus goats during the breeding season to diagnose pregnancy on days 21-22 and 40-44 after mating and to monitor corpora lutea function during different reproductive stages by measuring serum progesterone levels using radioimmunoassay. A total of 75 intact female Damascus goats were divided into 3 equal groups: S, P and C. Females in group S were fitted with sponges containing 60 mg of medroxyprogesterone acetate (MAP) for 14 days and injected, at sponge withdrawal, with pregnant mare serum gonadotrophin (PMSG). Females in group P were injected twice with prostaglandin F2α at an 11-day interval. Females in group C (control) received no treatment. The results indicated that the accuracy of positive pregnancy diagnosis on days 21-22 and 40-44 was 90.5% and 94.4%, respectively, and it was 100% for detecting non-pregnancy. There was no significant difference (p>0.05) among the 3 groups in serum progesterone level between days 21-22 and 40-44 after mating. Whereas, there were significant (p -1 at mating, during pregnancy and at kidding. The triplet-carrying goats had a significantly (p -1 , respectively. However, there was no significant difference in serum progesterone levels between the single and twin-carrying goats.

  7. Entity recognition from clinical texts via recurrent neural network.

    Science.gov (United States)

    Liu, Zengjian; Yang, Ming; Wang, Xiaolong; Chen, Qingcai; Tang, Buzhou; Wang, Zhe; Xu, Hua

    2017-07-05

    Entity recognition is one of the primary steps in text analysis and has long attracted considerable attention from researchers. In the clinical domain, various types of entities, such as clinical entities and protected health information (PHI), occur widely in clinical texts. Recognizing these entities has become a hot topic in clinical natural language processing (NLP), and a large number of traditional machine learning methods, such as support vector machines and conditional random fields, have been deployed to recognize entities from clinical texts in the past few years. In recent years, the recurrent neural network (RNN), a deep learning method that has shown great potential on many problems including named entity recognition, has also gradually been used for entity recognition from clinical texts. In this paper, we comprehensively investigate the performance of LSTM (long short-term memory), a representative variant of RNN, on clinical entity recognition and protected health information recognition. The LSTM model consists of three layers: an input layer, which generates a representation of each word of a sentence; an LSTM layer, which outputs another word representation sequence that captures the context information of each word in the sentence; and an inference layer, which makes tagging decisions according to the output of the LSTM layer, that is, outputs a label sequence. Experiments conducted on corpora of the 2010, 2012 and 2014 i2b2 NLP challenges show that LSTM achieves highest micro-average F1-scores of 85.81% on the 2010 i2b2 medical concept extraction, 92.29% on the 2012 i2b2 clinical event detection, and 94.37% on the 2014 i2b2 de-identification, which is competitive with other state-of-the-art systems. LSTM, which requires no hand-crafted features, has great potential for entity recognition from clinical texts. It outperforms traditional machine learning methods that suffer from laborious feature engineering. A possible future direction is how to integrate knowledge
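
    The micro-average F1-scores quoted above pool true positives, false positives and false negatives over all documents before computing precision and recall. The sketch below shows that computation under one assumption that the abstract does not confirm: an entity counts as correct only when its span and type match exactly. The function name `micro_f1` and the `(start, end, type)` tuple layout are hypothetical choices for the example, not the i2b2 scoring tool.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over documents.

    gold, predicted: lists (one item per document) of sets of
    (start, end, type) tuples; exact match is assumed.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # entities found with the right span and type
        fp += len(p - g)   # predicted entities that are not in the gold set
        fn += len(g - p)   # gold entities that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [{(0, 2, "problem"), (5, 6, "test")}, {(1, 3, "treatment")}]
pred = [{(0, 2, "problem")}, {(1, 3, "treatment"), (4, 5, "test")}]
score = micro_f1(gold, pred)
print(round(score, 4))
```

    Because counts are pooled before the ratio is taken, frequent entity types and long documents weigh more heavily than in a macro average, which is worth keeping in mind when comparing scores across the three challenge tasks.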

  8. Argumentation Within Language as Subsidy for the Evaluation of Reading Practices and Production of Argumentative Texts

    Directory of Open Access Journals (Sweden)

    Lauro Gomes

    2016-12-01

    Full Text Available This paper aims to present a proposal for evaluating performance in reading and writing dissertative-argumentative texts, based on principles and concepts from the theory of Argumentation in Language, created by Jean-Claude Anscombre and Oswald Ducrot, especially the version known as the Theory of Semantic Blocks and the works inspired by it. The goal is to create criteria that are less intuitive for judging performance in reading and writing dissertative-argumentative texts. The analysis of the corpora (the Enem 2011 composition proposal and 50 (fifty) texts written by the students) and the test of the reading and writing evaluation criteria in this work revealed the practical functionality and efficiency of the criteria. The results allow these criteria to be applied in any evaluation process for dissertative-argumentative texts. Finally, this paper offers theoretical and methodological subsidies which can help teachers and professors to improve their teaching of reading and writing and the evaluation of students' texts.

  9. Observation of [Formula: see text] and [Formula: see text] decays.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Ajaltouni, Z; Akar, S; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Andreassi, G; Andreotti, M; Andrews, J E; Appleby, R B; Archilli, F; d'Argent, P; Arnau Romeu, J; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Babuschkin, I; Bachmann, S; Back, J J; Badalov, A; Baesso, C; Baker, S; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Baszczyk, M; Batozskaya, V; Batsukh, B; Battista, V; Bay, A; Beaucourt, L; Beddow, J; Bedeschi, F; Bediaga, I; Bel, L J; Bellee, V; Belloli, N; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bertolin, A; Betancourt, C; Betti, F; Bettler, M-O; van Beuzekom, M; Bezshyiko, Ia; Bifani, S; Billoir, P; Bird, T; Birnkraut, A; Bitadze, A; Bizzeti, A; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Boettcher, T; Bondar, A; Bondar, N; Bonivento, W; Bordyuzhin, I; Borgheresi, A; Borghi, S; Borisyak, M; Borsato, M; Bossu, F; Boubdir, M; Bowcock, T J V; Bowen, E; Bozzi, C; Braun, S; Britsch, M; Britton, T; Brodzicka, J; Buchanan, E; Burr, C; Bursche, A; Buytaert, J; Cadeddu, S; Calabrese, R; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D H; Capriotti, L; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carniti, P; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cavallero, G; Cenci, R; Charles, M; Charpentier, Ph; Chatzikonstantinidis, G; Chefdeville, M; Chen, S; Cheung, S-F; Chobanova, V; Chrzaszcz, M; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coco, V; Cogan, J; Cogneras, E; Cogoni, V; Cojocariu, L; Collazuol, G; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombs, G; Coquereau, S; Corti, G; Corvo, M; Costa Sobral, C M; Couturier, B; Cowan, G A; Craik, D C; Crocombe, A; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; 
Da Cunha Marinho, F; Dall'Occo, E; Dalseno, J; David, P N Y; Davis, A; De Aguiar Francisco, O; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Serio, M; De Simone, P; Dean, C-T; Decamp, D; Deckenhoff, M; Del Buono, L; Demmer, M; Dendek, A; Derkach, D; Deschamps, O; Dettori, F; Dey, B; Di Canto, A; Dijkstra, H; Dordei, F; Dorigo, M; Dosil Suárez, A; Dovbnya, A; Dreimanis, K; Dufour, L; Dujany, G; Dungs, K; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Déléage, N; Easo, S; Ebert, M; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; Ely, S; Esen, S; Evans, H M; Evans, T; Falabella, A; Farley, N; Farry, S; Fay, R; Fazzini, D; Ferguson, D; Fernandez Prieto, A; Ferrari, F; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fini, R A; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fleuret, F; Fohl, K; Fontana, M; Fontanelli, F; Forshaw, D C; Forty, R; Franco Lima, V; Frank, M; Frei, C; Fu, J; Furfaro, E; Färber, C; Gallas Torreira, A; Galli, D; Gallorini, S; Gambetta, S; Gandelman, M; Gandini, P; Gao, Y; Garcia Martin, L M; García Pardiñas, J; Garra Tico, J; Garrido, L; Garsed, P J; Gascon, D; Gaspar, C; Gavardi, L; Gazzoni, G; Gerick, D; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianì, S; Gibson, V; Girard, O G; Giubega, L; Gizdov, K; Gligorov, V V; Golubkov, D; Golutvin, A; Gomes, A; Gorelov, I V; Gotti, C; Govorkova, E; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graverini, E; Graziani, G; Grecu, A; Griffith, P; Grillo, L; Gruberg Cazon, B R; Grünberg, O; Gushchin, E; Guz, Yu; Gys, T; Göbel, C; Hadavizadeh, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hatch, M; He, J; Head, T; Heister, A; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hombach, C; Hopchev, H; Hulsbergen, W; Humair, T; Hushchyn, 
M; Hussain, N; Hutchcroft, D; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jawahery, A; Jiang, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kandybei, S; Kanso, W; Karacson, M; Kariuki, J M; Karodia, S; Kecke, M; Kelsey, M; Kenyon, I R; Kenzie, M; Ketel, T; Khairullin, E; Khanji, B; Khurewathanakul, C; Kirn, T; Klaver, S; Klimaszewski, K; Koliiev, S; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Kosmyntseva, A; Kozachuk, A; Kozeiha, M; Kravchuk, L; Kreplin, K; Kreps, M; Krokovny, P; Kruse, F; Krzemien, W; Kucewicz, W; Kucharczyk, M; Kudryavtsev, V; Kuonen, A K; Kurek, K; Kvaratskheliya, T; Lacarrere, D; Lafferty, G; Lai, A; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Leflat, A; Lefrançois, J; Lefèvre, R; Lemaitre, F; Lemos Cid, E; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Likhomanenko, T; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, X; Loh, D; Longstaff, I; Lopes, J H; Lucchesi, D; Lucio Martinez, M; Luo, H; Lupato, A; Luppi, E; Lupton, O; Lusiani, A; Lyu, X; Machefert, F; Maciuc, F; Maev, O; Maguire, K; Malde, S; Malinin, A; Maltsev, T; Manca, G; Mancinelli, G; Manning, P; Maratas, J; Marchand, J F; Marconi, U; Marin Benito, C; Marino, P; Marks, J; Martellotti, G; Martin, M; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massacrier, L M; Massafferri, A; Matev, R; Mathad, A; Mathe, Z; Matteuzzi, C; Mauri, A; Maurin, B; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; Meadows, B; Meier, F; Meissner, M; Melnychuk, D; Merk, M; Merli, A; Michielin, E; Milanes, D A; Minard, M-N; Mitzel, D S; Mogini, A; Molina Rodriguez, J; Monroy, I A; Monteil, S; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Morris, A B; Mountain, R; Muheim, F; Mulder, M; Mussini, M; Müller, D; Müller, J; Müller, K; Müller, V; Naik, P; Nakada, T; Nandakumar, R; Nandi, A; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; 
Nguyen, T D; Nguyen-Mau, C; Nieswand, S; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; O'Hanlon, D P; Oblakowska-Mucha, A; Obraztsov, V; Ogilvy, S; Oldeman, R; Onderwater, C J G; Otalora Goicochea, J M; Otto, A; Owen, P; Oyanguren, A; Pais, P R; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L L; Parker, W; Parkes, C; Passaleva, G; Pastore, A; Patel, G D; Patel, M; Patrignani, C; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perret, P; Pescatore, L; Petridis, K; Petrolini, A; Petrov, A; Petruzzo, M; Picatoste Olloqui, E; Pietrzyk, B; Pikies, M; Pinci, D; Pistone, A; Piucci, A; Playfer, S; Plo Casasus, M; Poikela, T; Polci, F; Poluektov, A; Polyakov, I; Polycarpo, E; Pomery, G J; Popov, A; Popov, D; Popovici, B; Poslavskii, S; Potterat, C; Price, E; Price, J D; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Quagliani, R; Rachwal, B; Rademacker, J H; Rama, M; Ramos Pernas, M; Rangel, M S; Raniuk, I; Ratnikov, F; Raven, G; Redi, F; Reichert, S; Dos Reis, A C; Remon Alepuz, C; Renaudin, V; Ricciardi, S; Richards, S; Rihl, M; Rinnert, K; Rives Molina, V; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Lopez, J A; Rodriguez Perez, P; Rogozhnikov, A; Roiser, S; Rollings, A; Romanovskiy, V; Romero Vidal, A; Ronayne, J W; Rotondo, M; Rudolph, M S; Ruf, T; Ruiz Valls, P; Saborido Silva, J J; Sadykhov, E; Sagidova, N; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santimaria, M; Santovetti, E; Sarti, A; Satriano, C; Satta, A; Saunders, D M; Savrina, D; Schael, S; Schellenberg, M; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmelzer, T; Schmidt, B; Schneider, O; Schopper, A; Schubert, K; Schubiger, M; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Semennikov, A; Sergi, A; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; 
Shevchenko, V; Siddi, B G; Silva Coutinho, R; Silva de Oliveira, L; Simi, G; Simone, S; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, E; Smith, I T; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Souza De Paula, B; Spaan, B; Spradlin, P; Sridharan, S; Stagni, F; Stahl, M; Stahl, S; Stefko, P; Stefkova, S; Steinkamp, O; Stemmle, S; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Sun, L; Sutcliffe, W; Swientek, K; Syropoulos, V; Szczekowski, M; Szumlak, T; T'Jampens, S; Tayduganov, A; Tekampe, T; Tellarini, G; Teubert, F; Thomas, E; van Tilburg, J; Tilley, M J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Toriello, F; Tournefier, E; Tourneur, S; Trabelsi, K; Traill, M; Tran, M T; Tresch, M; Trisovic, A; Tsaregorodtsev, A; Tsopelas, P; Tully, A; Tuning, N; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vacca, C; Vagnoni, V; Valassi, A; Valat, S; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vecchi, S; van Veghel, M; Velthuis, J J; Veltri, M; Veneziano, G; Venkateswaran, A; Vernet, M; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Viemann, H; Vilasis-Cardona, X; Vitti, M; Volkov, V; Vollhardt, A; Voneki, B; Vorobyev, A; Vorobyev, V; Voß, C; de Vries, J A; Vázquez Sierra, C; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wang, J; Ward, D R; Wark, H M; Watson, N K; Websdale, D; Weiden, A; Whitehead, M; Wicht, J; Wilkinson, G; Wilkinson, M; Williams, M; Williams, M P; Williams, M; Williams, T; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wraight, K; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yin, H; Yu, J; Yuan, X; Yushchenko, O; Zarebski, K A; Zavertyaev, M; Zhang, L; Zhang, Y; Zhang, Y; Zhelezov, A; Zheng, Y; Zhokhov, A; Zhu, X; Zhukov, V; Zucchelli, S

    2017-01-01

    The decays [Formula: see text] and [Formula: see text] are observed for the first time using a data sample corresponding to an integrated luminosity of 3.0 fb[Formula: see text], collected by the LHCb experiment in proton-proton collisions at the centre-of-mass energies of 7 and 8 [Formula: see text]. The branching fractions relative to that of [Formula: see text] are measured to be [Formula: see text], where the first uncertainties are statistical and the second are systematic.

  10. Mining the Text: 34 Text Features that Can Ease or Obstruct Text Comprehension and Use

    Science.gov (United States)

    White, Sheida

    2012-01-01

    This article presents 34 characteristics of texts and tasks ("text features") that can make continuous (prose), noncontinuous (document), and quantitative texts easier or more difficult for adolescents and adults to comprehend and use. The text features were identified by examining the assessment tasks and associated texts in the national…

  11. The language of poetic texts in contemporary Tuvan pop songs

    Directory of Open Access Journals (Sweden)

    Oyumaa M. Saaya

    2017-06-01

    Full Text Available The article presents a linguistic analysis of the lyrics of modern Tuvan pop songs. While studying them is important for understanding contemporary songwriting in Tuva, it is also necessary to discover which linguistic means, functional styles and vocabulary are used by modern authors of popular lyrics. The study can also help identify how contemporary global trends influence songwriting in linguistic terms. Three groups of songs can be defined in Tuvan pop music. The first comprises songs written by both professional poets and amateurs with good writing skills. Their texts have a homogeneous literary style and are intended for a general audience (rather than specific groups of listeners). They do not feature any jargon or youth slang. The second group consists of “songs of the people” which are still popular and relevant, but not classified as folklore. This group also contains songs previously banned by censorship, and those written by ex-convicts. Their lyrics differ in style, and the vocabulary is also heterogeneous: they can include slang and contain vernacular language. The third group includes songs following popular global and Russian trends, which triggered rapid evolution in Tuvan songwriting. There is a significant number of authors, and even creative unions, who write both lyrics and music. These songs are stylistically uneven and contain many neologisms, borrowed vocabulary, slang and jargon words, and sometimes even macaronic (mixed) language. The author provides a more in-depth analysis of lyrics belonging to the third group of songs. They can be divided into 6 thematic subgroups which vary greatly in lexical content and the use of tropes. The lyrics of contemporary Tuvan songs are quite close to the everyday language young people use. Active use of jargon in the language of young and middle-aged people, especially in the lyrics of modern songs, steadily erodes the literary norms of the Tuvan language. The author emphasizes that

  12. From Text to Political Positions: Text analysis across disciplines

    NARCIS (Netherlands)

    Kaal, A.R.; Maks, I.; van Elfrinkhof, A.M.E.

    2014-01-01

    From Text to Political Positions addresses cross-disciplinary innovation in political text analysis for party positioning. Drawing on political science, computational methods and discourse analysis, it presents a diverse collection of analytical models including purely quantitative and

  13. Appraisal of Total Phenol, Flavonoid Contents, and Antioxidant Potential of Folkloric Lannea coromandelica Using In Vitro and In Vivo Assays

    Directory of Open Access Journals (Sweden)

    Tekeshwar Kumar

    2015-01-01

    Full Text Available The aim of this study was to determine the potential antioxidant properties of the crude methanolic extract (CME) of leaves of Lannea coromandelica (L. coromandelica) and its two subfractions, ethyl acetate (EAF) and aqueous (AqF), by employing various established in vitro systems and estimating total phenolic and flavonoid content. The results showed that the extract and fractions possessed strong antioxidant activity in vitro and that, among them, EAF had the strongest antioxidant activity. EAF was confirmed to have the highest phenolic content, total flavonoid content, and total antioxidant capacity. EAF showed remarkable scavenging activity on 2,2-diphenylpicrylhydrazyl (DPPH) (EC50 63.9 ± 0.64 µg/mL) and superoxide radical (EC50 8.2 ± 0.12 mg/mL), and Fe2+ chelating activity (EC50 6.2 ± 0.09 mg/mL). Based on our in vitro results, EAF was investigated in an in vivo antioxidant assay. Intragastric administration of EAF significantly increased the levels of superoxide dismutase (SOD), catalase (CAT), glutathione (GSH), and glutathione peroxidase (GSH-Px), and decreased malondialdehyde (MDA) content, in the liver and kidney of CCl4-intoxicated rats. This new evidence shows that L. coromandelica possesses antioxidant activity.

  14. Vestido, identidad y folklore. La invención de un vestido nacional de Guinea Ecuatorial

    Directory of Open Access Journals (Sweden)

    Valenciano-Mañé, Alba

    2012-06-01

    Full Text Available Focussing on clothing viewed as a body practice, this paper attempts to provide an ethnographic overview through the examination of the creation of the national dress of Equatorial Guinea. With a discussion of the importance of the personal motivation of its promoters, the reasons for its alleged failure and the recent attempts in current fashion shows in Malabo to (re)produce identity discourses, this paper also seeks to demonstrate the capacity of dress as a tool for social control, resistance and identity (re)creation.

    Situando el vestido como práctica corporal en el centro de la reflexión, el artículo efectúa un recorrido etnográfico por el proceso de creación de un vestido nacional de Guinea Ecuatorial. Se presentará la importancia de la ecuación personal de sus impulsores, los motivos de su fracaso y las recientes tentativas de (re)producción de discursos identitarios en los actuales fashion shows de Malabo. De este modo, se pretende demostrar el potencial del vestido como herramienta de control social, resistencia y (re)creación identitaria.

  15. FORMATION OF THE JUNIOR SCHOOLCHILDREN ATTITUDE TO THE NATIONAL SONG FOLKLORE BY THE MEANS OF MULTIMEDIA TECHNOLOGIES

    Directory of Open Access Journals (Sweden)

    A. Vladimirovа

    2014-06-01

    Full Text Available The national system of education of Ukraine urges music teachers to find and implement new forms and methods of teaching that help to form national identity by engaging primary school pupils in the use of multimedia technologies. Forming the aesthetic attitudes of younger schoolchildren to national folk song by means of multimedia technologies facilitates more efficient aesthetic, intellectual, moral and spiritual development, drawing children into creative exploration through solving problems of a research and creative nature and more fully disclosing their natural inclinations. At the present stage, folk song information is stored and reproduced on computer disks and in electronic textbooks, which positively affects the development of a coherent national cultural identity, its tastes, and musical and aesthetic upbringing, and acts as a favorable condition and an additional incentive for the assimilation of knowledge both in the educational process of the school and in distance learning. The introduction of multimedia and distance learning technologies into the classroom practice of music helps to combine the didactic function of the computer with traditional ways and means of education, and enriches the educational process of the elementary school with new forms of work that promote more efficient assimilation of musical training material, national folk song, and the customs and traditions of the Ukrainian people.

  16. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  17. How strongly do word reading times and lexical decision times correlate? Combining data from eye movement corpora and megastudies.

    Science.gov (United States)

    Kuperman, Victor; Drieghe, Denis; Keuleers, Emmanuel; Brysbaert, Marc

    2013-01-01

    We assess the amount of shared variance between three measures of visual word recognition latencies: eye movement latencies, lexical decision times, and naming times. After partialling out the effects of word frequency and word length, two well-documented predictors of word recognition latencies, we see that 7-44% of the variance is uniquely shared between lexical decision times and naming times, depending on the frequency range of the words used. A similar analysis of eye movement latencies shows that the percentage of variance they uniquely share either with lexical decision times or with naming times is much lower. It is 5-17% for gaze durations and lexical decision times in studies with target words presented in neutral sentences, but drops to 0.2% for corpus studies in which eye movements to all words are analysed. Correlations between gaze durations and naming latencies are lower still. These findings suggest that processing times in isolated word processing and continuous text reading are affected by specific task demands and presentation format, and that lexical decision times and naming times are not very informative in predicting eye movement latencies in text reading once the effect of word frequency and word length are taken into account. The difference between controlled experiments and natural reading suggests that reading strategies and stimulus materials may determine the degree to which the immediacy-of-processing assumption and the eye-mind assumption apply. Fixation times are more likely to exclusively reflect the lexical processing of the currently fixated word in controlled studies with unpredictable target words rather than in natural reading of sentences or texts.

  18. Working with text tools, techniques and approaches for text mining

    CERN Document Server

    Tourte, Gregory J L

    2016-01-01

    Text mining tools and technologies have long been a part of the repository world, where they have been applied to a variety of purposes, from pragmatic aims to support tools. Research areas as diverse as biology, chemistry, sociology and criminology have seen effective use made of text mining technologies. Working With Text collects a subset of the best contributions from the 'Working with text: Tools, techniques and approaches for text mining' workshop, alongside contributions from experts in the area. Text mining tools and technologies in support of academic research include supporting research on the basis of a large body of documents, facilitating access to and reuse of extant work, and bridging between the formal academic world and areas such as traditional and social media. Jisc have funded a number of projects, including NaCTem (the National Centre for Text Mining) and the ResDis programme. Contents are developed from workshop submissions and invited contributions, including: Legal considerations in te...

  19. Informational Text and the CCSS

    Science.gov (United States)

    Aspen Institute, 2012

    2012-01-01

    What constitutes an informational text covers a broad swath of different types of texts. Biographies & memoirs, speeches, opinion pieces & argumentative essays, and historical, scientific or technical accounts of a non-narrative nature are all included in what the Common Core State Standards (CCSS) envisions as informational text. Also included…

  20. The Only Safe SMS Texting Is No SMS Texting.

    Science.gov (United States)

    Toth, Cheryl; Sacopulos, Michael J

    2015-01-01

    Many physicians and practice staff use short messaging service (SMS) text messaging to communicate with patients. But SMS text messaging is unencrypted, insecure, and does not meet HIPAA requirements. In addition, the short and abbreviated nature of text messages creates opportunities for misinterpretation, and can negatively impact patient safety and care. Until recently, asking patients to sign a statement that they understand and accept these risks--as well as having policies, device encryption, and cyber insurance in place--would have been enough to mitigate the risk of using SMS text in a medical practice. But new trends and policies have made SMS text messaging unsafe under any circumstance. This article explains these trends and policies, as well as why only secure texting or secure messaging should be used for physician-patient communication.

  1. AHP 49: 高原民俗及教育研究 STUDIES OF PLATEAU FOLKLORE & EDUCATION

    Directory of Open Access Journals (Sweden)

    AHP

    2017-08-01

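    Full Text Available This book is the first Chinese-language translated edition of the journal Asian Highlands Perspectives (AHP). It is the result of cooperation among many interested people and reflects the strength of our teamwork. The preface, "The AHP Journal and Acknowledgements", was written by Caixiangduojie (p. 5). Chapter 1, "Review of the book Labrang Monastery", is by Christina Kilby Robinson (pp. 6-9). Chapter 2, "Stag rig Tibetan Village: Hair Changing and Marriage", is by 'Brug mo skyid, CK Stuart and Steve Frediani, translated by 'Brug mo skyid (pp. 10-48). Chapter 3, "The Xunhua Salar Wedding", is by Ma Wei, Ma Jianzhong and CK Stuart, translated by Zhao Ling (pp. 49-101). Chapter 4, "An Abandoned Mountain Deity", is by Limusishiden, translated by Dan Jianua and Nidaosirang (pp. 102-123). Chapter 5, "Pyramid Schemes on the Tibetan Plateau", is by Devin Gonier and Rgyal yum sgrol ma, translated by Duodala (pp. 124-144). Chapter 6, "Nuo Sacrificial Rites in Rural Eastern Qinghai: Minhe Monguor Nadun Songs", is by Zhu Yongzhong and CK Stuart, translated by Zhu Yongzhong (pp. 145-158). Chapter 7, "Minhe Monguor Drinking Songs", is by Zhu Yongzhong and CK Stuart, translated by Zhu Yongzhong (pp. 159-167). Chapter 8, "Bound Together by Affection: Sanchuan Monguor Kugurjia (库咕笳) Songs", is by Zhu Yongzhong, Qi Huimin and CK Stuart, translated by Zhu Yongzhong (pp. 168-198). Abridged English Translation: This is the first AHP volume in the Chinese language, containing the following articles translated from English to Chinese: 1 AHP Preface by Caixiangduojie; 2 Review: Labrang Monastery by Christina Kilby Robinson; 3 Stag rig Tibetan Village: Hair Changing and Marriage by 'Brug mo skyid, CK Stuart, Alexandru Anton-Luca and Steve Frediani ('Brug mo skyid, translator); 4 The Xunhua Salar Wedding by Ma Wei, Ma Jianzhong, and CK Stuart (Zhao Ling, translator); 5 An Abandoned Mountain Deity by Limusishiden (Dan Jianua and Nidaosirang, translators); 6 Pyramid Schemes on the Tibetan Plateau by Devin Gonier and Rgyal yum sgrol ma (Duodala, translator); 7 'Two Bodhisattvas From the East': Minhe Monguor Funeral Orations by Zhu Yongzhong and CK Stuart (Zhu Yongzhong, translator)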

  2. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

  3. Monitoring interaction and collective text production through text mining

    Directory of Open Access Journals (Sweden)

    Macedo, Alexandra Lorandi

    2014-04-01

    Full Text Available This article presents the Concepts Network tool, developed using text mining technology. The main objective of this tool is to extract and relate terms of greatest incidence from a text and exhibit the results in the form of a graph. The Network was implemented in the Collective Text Editor (CTE which is an online tool that allows the production of texts in synchronized or non-synchronized forms. This article describes the application of the Network both in texts produced collectively and texts produced in a forum. The purpose of the tool is to offer support to the teacher in managing the high volume of data generated in the process of interaction amongst students and in the construction of the text. Specifically, the aim is to facilitate the teacher’s job by allowing him/her to process data in a shorter time than is currently demanded. The results suggest that the Concepts Network can aid the teacher, as it provides indicators of the quality of the text produced. Moreover, messages posted in forums can be analyzed without their content necessarily having to be pre-read.
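    The idea of extracting the most frequent terms and relating them in a graph can be illustrated with a minimal sketch. This is not the Concepts Network implementation: the function name `concept_network`, the tiny stopword list, the sentence-level co-occurrence rule and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the article).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is"}

def concept_network(text, top_n=3):
    """Return the top_n most frequent terms and weighted edges between them.

    Two terms are linked when they occur in the same sentence; the edge
    weight is the number of sentences in which both appear.
    """
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z]+", s) if w not in STOPWORDS]
        for s in sentences
    ]
    counts = Counter(w for words in tokenized for w in words)
    top_terms = [term for term, _ in counts.most_common(top_n)]

    edges = Counter()
    for words in tokenized:
        present = sorted(set(words) & set(top_terms))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return top_terms, dict(edges)

# Hypothetical sample text standing in for a collectively written document.
sample = (
    "The teacher reads the text. The students write the text together. "
    "The teacher guides the students."
)
terms, edges = concept_network(sample)
```

    The returned edge dictionary can be passed to any graph-drawing tool; a real system would likely need language-specific tokenization and a fuller stopword list.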

  4. Text recycling: acceptable or misconduct?

    Science.gov (United States)

    Harriman, Stephanie; Patel, Jigisha

    2014-08-16

    Text recycling, also referred to as self-plagiarism, is the reproduction of an author's own text from a previous publication in a new publication. Opinions on the acceptability of this practice vary, with some viewing it as acceptable and efficient, and others as misleading and unacceptable. In light of the lack of consensus, journal editors often have difficulty deciding how to act upon the discovery of text recycling. In response to these difficulties, we have created a set of guidelines for journal editors on how to deal with text recycling. In this editorial, we discuss some of the challenges of developing these guidelines, and how authors can avoid undisclosed text recycling.

  5. TEXT DEIXIS IN NARRATIVE SEQUENCES

    Directory of Open Access Journals (Sweden)

    Josep Rivera

    2007-06-01

    Full Text Available This study looks at demonstrative descriptions, regarding them as text-deictic procedures which contribute to weave discourse reference. Text deixis is thought of as a metaphorical referential device which maps the ground of utterance onto the text itself. Demonstrative expressions with textual antecedent-triggers, considered as the most important text-deictic units, are identified in a narrative corpus consisting of J. M. Barrie’s Peter Pan and its translation into Catalan. Some linguistic and discourse variables related to DemNPs are analysed to characterise adequately text deixis. It is shown that this referential device is usually combined with abstract nouns, thus categorising and encapsulating (non-nominal) complex discourse entities as nouns, while performing a referential cohesive function by means of the text deixis + general noun type of lexical cohesion.

  6. Tunical Outer Layer Plays an Essential Role in Penile Veno-occlusive Mechanism Evidenced from Electrocautery Effects to the Corpora Cavernosa in Defrosted Human Cadavers.

    Science.gov (United States)

    Hsieh, Cheng-Hsing; Huang, Yi-Ping; Tsai, Mang-Hung; Chen, Heng-Shen; Huang, Po-Cheng; Lin, Chung-Wu; Hsu, Geng-Long

    2015-12-01

    To determine the exact anatomical structure for establishing penile veno-occlusive function, we sought to conduct a hemodynamic study on defrosted human cadavers. Thirteen penises were used for this experiment, and 11 intact penises were allocated into the electrocautery group (EG, n = 6) and the ligation group (LG, n = 5). A circumcision was made on the penis to access the veins. Two #19 scalp needles were fixed in the 3 and 9 o'clock positions in the distal penis for colloid infusion and intracavernous pressure (ICP) monitoring, respectively. For the EG, the deep dorsal vein and cavernosal vein trunks were freed for 3-5 cm where at least 3 emissary veins were identified via opening Buck's fascia; these veins underwent electrocautery at 45 watts, while the ICP was maintained at 0, 50, 75, 100, 125, and 150 mmHg, respectively. For control, venous ligation was made but at the ICP of 150 mmHg. A tissue block including the emissary vein was then obtained for histological analysis. Except all in the EG and those whose ICP exceed 125 mmHg in the EG, the sinusoids of the corpora cavernosa sustained varied fulgurated fibrosis in every specimen and the severity appeared reversely commensurate with the ICP regarding sinusoidal clumping and darkish bands (P electrocautery damage to intracavernous sinusoids once the ICP reached a level corresponding to a rigid erection. The outer tunica plays an essential role in fulfilling the veno-occlusive mechanism. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Text against Text: Counterbalancing the Hegemony of Assessment.

    Science.gov (United States)

    Cosgrove, Cornelius

    A study examined whether composition specialists can counterbalance the potential privileging of the assessment perspective, or of self-appointed interpreters of that perspective, through the study of assessment discourse as text. Fourteen assessment texts were examined, most of them journal articles and most of them featuring the common…

  8. SparkText: Biomedical Text Mining on Big Data Framework.

    Directory of Open Access Journals (Sweden)

    Zhan Ye

    Full Text Available Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  9. Knowledge Representation in Travelling Texts

    DEFF Research Database (Denmark)

    Mousten, Birthe; Locmele, Gunta

    2014-01-01

    Today, information travels fast. Texts travel, too. In a corporate context, the question is how to manage which knowledge elements should travel to a new language area or market and in which form? The decision to let knowledge elements travel or not travel highly depends on the limitation...... and the purpose of the text in a new context as well as on predefined parameters for text travel. For texts used in marketing and in technology, the question is whether culture-bound knowledge representation should be domesticated or kept as foreign elements, or should be mirrored or moulded—or should not travel...... at all! When should semantic and pragmatic elements in a text be replaced and by which other elements? The empirical basis of our work is marketing and technical texts in English, which travel into the Latvian and Danish markets, respectively....

  10. Texting while driving: is speech-based text entry less risky than handheld text entry?

    Science.gov (United States)

    He, J; Chaparro, A; Nguyen, B; Burge, R J; Crandall, J; Chaparro, B; Ni, R; Cao, S

    2014-11-01

    Research indicates that using a cell phone to talk or text while maneuvering a vehicle impairs driving performance. However, few published studies directly compare the distracting effects of texting using a hands-free (i.e., speech-based interface) versus handheld cell phone, which is an important issue for legislation, automotive interface design and driving safety training. This study compared the effect of speech-based versus handheld text entries on simulated driving performance by asking participants to perform a car following task while controlling the duration of a secondary text-entry task. Results showed that both speech-based and handheld text entries impaired driving performance relative to the drive-only condition by causing more variation in speed and lane position. Handheld text entry also increased the brake response time and increased variation in headway distance. Text entry using a speech-based cell phone was less detrimental to driving performance than handheld text entry. Nevertheless, the speech-based text entry task still significantly impaired driving compared to the drive-only condition. These results suggest that speech-based text entry disrupts driving, but reduces the level of performance interference compared to text entry with a handheld device. In addition, the difference in the distraction effect caused by speech-based and handheld text entry is not simply due to the difference in task duration. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. SparkText: Biomedical Text Mining on Big Data Framework

    Science.gov (United States)

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652
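    As a rough illustration of one of the classifiers named above, the following is a minimal pure-Python sketch of multinomial Naïve Bayes with add-one smoothing. It is not SparkText and does not use Apache Spark or Cassandra; the function names `train_naive_bayes` and `predict`, and the four toy training snippets, are assumptions invented for this example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs):
    """Fit a multinomial Naive Bayes model from (label, text) pairs."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for label, text in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus; real input would be full-text articles.
training = [
    ("breast", "brca1 mutation mammography tumor"),
    ("breast", "mammography screening brca1"),
    ("lung", "smoking lung nodule tumor"),
    ("lung", "lung smoking biopsy"),
]
model = train_naive_bayes(training)
prediction = predict(model, "brca1 mammography")
```

    Working in log space avoids numeric underflow when many word likelihoods are multiplied, which matters once documents are longer than these toy snippets.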

  12. SparkText: Biomedical Text Mining on Big Data Framework.

    Science.gov (United States)

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  13. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  14. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  15. Larger Subcortical Gray Matter Structures and Smaller Corpora Callosa at Age 5 Years in HIV Infected Children on Early ART

    Directory of Open Access Journals (Sweden)

    Steven R. Randall

    2017-11-01

    Full Text Available Sub-Saharan Africa is home to 90% of HIV infected (HIV+) children. Since the advent of antiretroviral therapy (ART), HIV/AIDS has transitioned to a chronic condition where central nervous system (CNS) damage may be ongoing. Although most guidelines recommend early ART to reduce CNS viral reservoirs, the brain may be more vulnerable to potential neurotoxic effects of ART during the rapid development phase in the first years of life. Here we investigate differences in subcortical volumes between 5-year-old HIV+ children who received early ART (before age 18 months) and uninfected children using manual tracing of Magnetic Resonance Images. Participants included 61 Xhosa children (43 HIV+/18 uninfected, mean age = 5.4 ± 0.3 years, 25 male) from the children with HIV early antiretroviral (CHER) trial; 27 children initiated ART before 12 weeks of age (ART-Before12Wks) and 16 after 12 weeks (ART-After12Wks). Structural images were acquired on a 3T Allegra MRI in Cape Town and manually traced using MultiTracer. Volumetric group differences (HIV+ vs. uninfected; ART-Before12Wks vs. ART-After12Wks) were examined for the caudate, nucleus accumbens (NA), putamen (Pu), globus pallidus (GP), and corpus callosum (CC), as well as associations within infected children of structure volumes with age at ART initiation and CD4/CD8 as a proxy for immune health. HIV+ children had significantly larger NA and Pu volumes bilaterally and left GP volumes than controls, whilst CC was smaller. Bilateral Pu was larger in both treatment groups compared to controls, while left GP and bilateral NA were enlarged only in ART-After12Wks children. CC was smaller in both treatment groups compared to controls, and smaller in ART-After12Wks compared to ART-Before12Wks. Within infected children, delayed ART initiation was associated with larger Pu volumes, effects that remained significant when controlling for sex and duration of treatment interruption (left β = 0.447, p = 0.005; right β = 0

  16. Text and ideology: text-oriented discourse analysis

    Directory of Open Access Journals (Sweden)

    Maria Eduarda Gonçalves Peixoto

    2018-04-01

    Full Text Available The article aims to contribute to the understanding of the connection between text and ideology articulated by the text-oriented analysis of discourse (ADTO). Based on the reflections of Fairclough (1989, 2001, 2003) and Fairclough and Chouliaraki (1999), the debate presents the social ontology that ADTO uses to base its conception of social life as an open system and textually mediated; the article then explains the chronological-narrative development of the main critical theories of ideology, by virtue of which ADTO organizes the assumptions that underpin the particular use it makes of the term. Finally, the discussion presents the main aspects of the connection between text and ideology, offering a conceptual framework that can contribute to the domain of the theme according to a critical discourse analysis approach.

  17. English Metafunction Analysis in Chemistry Text: Characterization of Scientific Text

    Directory of Open Access Journals (Sweden)

    Ahmad Amin Dalimunte, M.Hum

    2013-09-01

    Full Text Available The objectives of this research are to identify which Metafunctions are applied in a chemistry text and how they characterize a scientific text. It was conducted by applying content analysis. The data for this research was a twelve-paragraph chemistry text. The data were collected by applying a documentary technique. The document was read and analyzed to find out the Metafunction. The data were analyzed in several steps: identifying the types of process, counting the processes, categorizing and counting the cohesion devices, classifying the types of modulation and determining modality value, and finally counting the sentences and clauses and scoring the grammatical intricacy index. The findings show that the Material process (71 of 100) is used most, and circumstance of spatial location (26 of 56) is more dominant than the others. Modality (5) is used sparingly in order to avoid subjectivity. Impersonality is implied through sparing use of reference, either pronouns (7) or demonstratives (7); conjunctions (60) are applied to develop ideas; and the total number of clauses (109) is much larger than the total number of sentences (40), which results in a high grammatical intricacy index. The Metafunctions found indicate that the chemistry text fulfils the characteristics of a scientific or academic text, which truly reflects it as a natural science.
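    The counts reported in this abstract can be turned into ratios with a short calculation. The sketch below assumes that the grammatical intricacy index is the number of clauses divided by the number of sentences; the abstract does not state the formula, so this definition and the variable names are assumptions.

```python
# Counts taken from the abstract.
clauses = 109
sentences = 40
material_processes, total_processes = 71, 100
spatial_location, total_circumstances = 26, 56

# Assumed definition: clauses per sentence.
grammatical_intricacy = clauses / sentences

# Shares of the dominant process type and circumstance type.
material_share = material_processes / total_processes
spatial_share = spatial_location / total_circumstances

print(f"Grammatical intricacy: {grammatical_intricacy:.3f}")
print(f"Material processes: {material_share:.0%}")
print(f"Spatial location circumstances: {spatial_share:.1%}")
```

    Under this assumed definition, a value well above 1 means that sentences typically contain more than one clause.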

  18. Text Genres in Information Organization

    Science.gov (United States)

    Nahotko, Marek

    2016-01-01

    Introduction: Text genres used by so-called information organizers in the processes of information organization in information systems were explored in this research. Method: The research employed text genre socio-functional analysis. Five genre groups in information organization were distinguished. Every genre group used in information…

  19. Strategies for Translating Vocative Texts

    Directory of Open Access Journals (Sweden)

    Olga COJOCARU

    2014-12-01

    Full Text Available The paper deals with the linguistic and cultural elements of vocative texts and the techniques used in translating them by giving some examples of texts that are typically vocative (i.e. advertisements and instructions for use). Semantic and communicative strategies are popular in translation studies and each of them has its own advantages and disadvantages in translating vocative texts. The advantage of semantic translation is that it takes more account of the aesthetic value of the SL text, while communicative translation attempts to render the exact contextual meaning of the original text in such a way that both content and language are readily acceptable and comprehensible to the readership. Focus is laid on the strategies used in translating vocative texts, strategies that highlight and introduce a cultural context to the target audience, in order to achieve their overall purpose, that is to sell or persuade the reader to behave in a certain way. Thus, in order to do that, a number of advertisements from the field of cosmetics industry and electronic gadgets were selected for analysis. The aim is to gather insights into vocative text translation and to create new perspectives on this field of research, now considered a process of innovation and diversion, especially in areas as important as economy and marketing.

  20. POLTERGEIST PHENOMENA IN CONTEMPORARY FOLKLORE

    OpenAIRE

    Oana VOICHICI

    2017-01-01

    The article deals with instances of the supernatural in Romanian urban legends, namely what we call the strigoi, or poltergeist. Folklorists usually tend to exclude the supernatural from the category of urban legends; however, we have decided to take these accounts into consideration because the transmitters, the narrators, do not distinguish between these elements and the rest of contemporary legends, and today's popular culture abounds in such accounts.

  1. Discovering Folklore Through Community Resources.

    Science.gov (United States)

    Sumpter, Magdalena Benavides, Ed.

    The folkways and cultural heritage of the Mexican Americans of South Texas are explored in this volume which is designed to provide the student with the opportunity for cultural enrichment, oral language development, and vocabulary expansion. The first chapter deals with "Creencias" which are common beliefs handed down from generation to…

  2. Systematic characterizations of text similarity in full text biomedical publications.

    Science.gov (United States)

    Sun, Zhaohui; Errami, Mounir; Long, Tara; Renard, Chris; Choradia, Nishant; Garner, Harold

    2010-09-15

    Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text articles are becoming increasingly available, yet the similarities among them have not been systematically studied. Here, we quantitatively investigated the full text similarity of biomedical publications in PubMed Central. 72,011 full text articles from PubMed Central (PMC) were parsed to generate three different datasets: full texts, sections, and paragraphs. Text similarity comparisons were performed on these datasets using the text similarity algorithm eTBLAST. We measured the frequency of similar text pairs and compared it among different datasets. We found that high abstract similarity can be used to predict high full text similarity with a specificity of 20.1% (95% CI [17.3%, 23.1%]) and sensitivity of 99.999%. Abstract similarity and full text similarity have a moderate correlation (Pearson correlation coefficient: -0.423) when the similarity ratio is above 0.4. Among pairs of articles in PMC, method sections are found to be the most repetitive (frequency of similar pairs, methods: 0.029, introduction: 0.0076, results: 0.0043). In contrast, among a set of manually verified duplicate articles, results are the most repetitive sections (frequency of similar pairs, results: 0.94, methods: 0.89, introduction: 0.82). Repetition of introduction and methods sections is more likely to be committed by the same authors (odds of a highly similar pair having at least one shared author, introduction: 2.31, methods: 1.83, results: 1.03). There is also significantly more similarity in pairs of review articles than in pairs containing one review and one nonreview paper (frequency of similar pairs: 0.0167 and 0.0023, respectively). While quantifying abstract similarity is an effective approach for finding duplicate citations, a comprehensive full text analysis is necessary to uncover all potential duplicate citations in the scientific literature and is helpful when

  3. Linguistic Dating of Biblical Texts

    DEFF Research Database (Denmark)

    Ehrensvärd, Martin Gustaf

    2003-01-01

    For two centuries, scholars have pointed to consistent differences in the Hebrew of certain biblical texts and interpreted these differences as reflecting the date of composition of the texts. Until the 1980s, this was quite uncontroversial as the linguistic findings largely confirmed the chronology of the texts established by other means: the Hebrew of Genesis-2 Kings was judged to be early and that of Esther, Daniel, Ezra, Nehemiah, and Chronicles to be late. In the current debate where revisionists have questioned the traditional dating, linguistic arguments in the dating of texts have come more into focus. The study critically examines some linguistic arguments adduced to support the traditional position, and reviewing the arguments it points to weaknesses in the linguistic dating of EBH texts to pre-exilic times. When viewing the linguistic evidence in isolation it will be clear...

  4. SAIL: Summation-bAsed Incremental Learning for Information-Theoretic Text Clustering.

    Science.gov (United States)

    Cao, Jie; Wu, Zhiang; Wu, Junjie; Xiong, Hui

    2013-04-01

    Information-theoretic clustering aims to exploit information-theoretic measures as the clustering criteria. A common practice on this topic is the so-called Info-Kmeans, which performs K-means clustering with KL-divergence as the proximity function. While expert efforts on Info-Kmeans have shown promising results, a remaining challenge is to deal with high-dimensional sparse data such as text corpora. Indeed, it is possible that the centroids contain many zero-value features for high-dimensional text vectors, which leads to infinite KL-divergence values and creates a dilemma in assigning objects to centroids during the iteration process of Info-Kmeans. To meet this challenge, in this paper, we propose a Summation-bAsed Incremental Learning (SAIL) algorithm for Info-Kmeans clustering. Specifically, by using an equivalent objective function, SAIL replaces the computation of KL-divergence by the incremental computation of Shannon entropy. This can avoid the zero-feature dilemma caused by the use of KL-divergence. To improve the clustering quality, we further introduce the variable neighborhood search scheme and propose the V-SAIL algorithm, which is then accelerated by a multithreaded scheme in PV-SAIL. Our experimental results on various real-world text collections have shown that, with SAIL as a booster, the clustering performance of Info-Kmeans can be significantly improved. Also, V-SAIL and PV-SAIL indeed help improve the clustering quality at a lower cost of computation.
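
    The zero-feature problem and the entropy-based reformulation described above can be illustrated with a small sketch. This is not the SAIL algorithm itself; it only checks, on hypothetical toy vectors, the standard identity that the weighted sum of KL-divergences to a centroid equals the centroid's Shannon entropy minus the weighted member entropies, and shows how a zero-valued centroid feature yields an infinite KL-divergence. All names below are illustrative assumptions.

```python
import math

def entropy(p):
    """Shannon entropy (natural log); terms with zero probability contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl_divergence(p, q):
    """KL(p || q); infinite when q is zero on a feature where p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def centroid(docs, weights):
    """Weighted mean of the document distributions (weights sum to 1)."""
    dim = len(docs[0])
    return [sum(w * d[j] for d, w in zip(docs, weights)) for j in range(dim)]

# Two toy "documents" as term distributions (hypothetical data).
docs = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
weights = [0.5, 0.5]
m = centroid(docs, weights)

# Objective computed with KL-divergence ...
kl_sum = sum(w * kl_divergence(d, m) for d, w in zip(docs, weights))
# ... and the same quantity computed from entropies only.
entropy_form = entropy(m) - sum(w * entropy(d) for d, w in zip(docs, weights))

# A candidate centroid with a zero where the document is non-zero:
other_centroid = [0.0, 0.5, 0.5]
infinite_case = kl_divergence(docs[0], other_centroid)
```

    The entropy form never divides by a centroid feature, which is why a formulation of this kind sidesteps the infinite values that arise when a document is compared against a centroid it does not belong to.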

  5. Biomarker Identification Using Text Mining

    Directory of Open Access Journals (Sweden)

    Hui Li

    2012-01-01

    Full Text Available Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database.
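
    A dictionary-plus-finite-state-machine matcher of the general kind described above might look like the sketch below. The abstract does not give the authors' actual design, so the token-level trie (used here as the state machine), the tiny dictionary, and the function names are all assumptions made for illustration.

```python
END = None  # sentinel key marking an accepting state; tokens are never None

def build_trie(terms):
    """Build a token-level trie: each node is a state, each token an edge."""
    root = {}
    for term in terms:
        node = root
        for token in term.lower().split():
            node = node.setdefault(token, {})
        node[END] = term  # accepting state remembers the dictionary entry
    return root

def find_terms(text, trie):
    """Scan the text left to right, reporting the longest match at each position."""
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    hits = []
    i = 0
    while i < len(tokens):
        node, j, last = trie, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if END in node:
                last = (node[END], j)
        if last:
            hits.append(last[0])
            i = last[1]  # continue after the matched term
        else:
            i += 1
    return hits

# Hypothetical dictionary entries, for illustration only.
DICTIONARY = ["PSA", "prostate specific antigen", "HER2"]
trie = build_trie(DICTIONARY)
hits = find_terms(
    "Serum prostate specific antigen (PSA) and HER2 levels were measured.", trie
)
```

    Matching whole tokens rather than characters keeps multi-word names intact and avoids hits inside longer words.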

  6. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In the Malay language, there are no conjugations or declensions, and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or a verb, depending on its position in the sentence. Although simple root words are used extensively in informal conversations, it is essential to use the precise words in formal speech or written texts. In Malay, to make sentences clear, derivative words are used. Derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in the written language of the educated Malay. Therefore, the composition of Malay words may be complicated. Although there are several types of stemming algorithms available for text processing in English and some other languages, they cannot be used to overcome the difficulties in Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems. It is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The use of the set of rules is aimed at reducing the occurrence of under-stemming errors, while that of the dictionaries is believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, the text data used were actual web pages collected from the World Wide Web to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
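
    The interplay of affix rules and a root-word dictionary can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' stemmer: the rule lists and the four-word dictionary are hypothetical, and the derivative-word dictionary mentioned in the abstract is omitted.

```python
# Hypothetical root-word dictionary and affix rules, for illustration only.
ROOTS = {"makan", "main", "baca", "ajar"}
PREFIXES = ["mem", "ber", "pel", "di", "me"]
SUFFIXES = ["kan", "an", "i"]

def stem(word):
    """Return the dictionary root reachable by stripping one prefix and/or one suffix."""
    word = word.lower()
    if word in ROOTS:
        # Dictionary lookup first: guards against over-stemming a root
        # that merely looks affixed (e.g. "makan" ends in "an").
        return word
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            candidate = rest[:len(rest) - len(suffix)] if suffix else rest
            if candidate in ROOTS:
                return candidate  # affix rules reduce under-stemming
    return word  # unknown words are left unchanged

examples = {w: stem(w) for w in ["makan", "makanan", "bermain", "membaca", "dimakan"]}
```

    Accepting a stripped form only when it is a known root is one simple way to let the dictionary veto aggressive rules; a production stemmer would also need spelling-variation rules that this sketch leaves out.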

  7. Anomaly Detection with Text Mining

    Data.gov (United States)

    National Aeronautics and Space Administration — Many existing complex space systems have a significant amount of historical maintenance and problem data bases that are stored in unstructured text forms. The...

  8. Social Studies: Texts and Supplements.

    Science.gov (United States)

    Curriculum Review, 1979

    1979-01-01

    This review of selected social studies texts, series, and supplements, mainly for the secondary level, includes a special section examining eight titles on warfare and terrorism for grades 4-12. (SJL)

  9. Text Mining in Organizational Research.

    Science.gov (United States)

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

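  10. Inquérito sôbre a incidência da esquistossomose mansônica entre indivíduos interessados em ingressar em corporação militar do Estado de São Paulo: considerações sôbre a referida verminose como causa de rejeição de candidatos a empregos Survey of the incidence of schistosomiasis mansoni among individuals seeking to join a military corps of the State of São Paulo: considerations on this helminthiasis as a cause for rejecting job candidates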

    Directory of Open Access Journals (Sweden)

    Vicente Amato Neto

    1970-10-01

    Full Text Available In several regions of Brazil, various institutions reject job applicants who have schistosomiasis mansoni, without taking into account the evolutionary stage of the helminthiasis. Concerned with this issue, and in order to collect, by way of example, objective information on a practical aspect of it, the authors carried out a survey among 601 people interested in joining a military corps in the city of São Paulo, based on the use of the intradermal test for diagnosing the helminthiasis. They recorded a positivity rate of 13.3%, which they consider very significant and indicative of a concrete situation that deserves emphatic consideration, given its implications of many kinds, such as social, economic, and medical.

  11. Challenges for automatically extracting molecular interactions from full-text articles.

    Science.gov (United States)

    McIntosh, Tara; Curran, James R

    2009-09-24

    The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved. We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.

  12. GPU-Accelerated Text Mining

    International Nuclear Information System (INIS)

    Cui, X.; Mueller, F.; Zhang, Y.; Potok, Thomas E.

    2009-01-01

    Accelerating hardware devices represent a novel promise for improving performance in many problem domains, but it is not clear which accelerators are suitable for which domains. While there is no room in general-purpose processor design to significantly increase the processor frequency, developers are instead resorting to multi-core chips duplicating conventional computing capabilities on a single die. Yet, accelerators offer more radical designs with a much higher level of parallelism and novel programming environments. This present work assesses the viability of text mining on CUDA. Text mining is one of the key concepts that has become prominent as an effective means to index the Internet, but its applications range beyond this scope and extend to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelization for text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess more complex text search problems for suitability for GPU devices, and it may also be one of the first to exploit and report on the usage of atomic instructions that have recently become available in NVIDIA devices.

  13. Comprehending text in literature class

    Directory of Open Access Journals (Sweden)

    Purić Daliborka S.

    2016-01-01

    Full Text Available The paper discusses the problem of understanding a text and the contribution of the methodological apparatus in the reader book to comprehension of a text being read in junior classes of elementary school. By using the technique of content analysis on the methodological apparatuses in eight reader books for the fourth grade of elementary school, approved for usage in the 2014/2015 academic year, and surveying 350 teachers in 33 elementary schools and 11 administrative districts in the Republic of Serbia, we examined: (a) to what extent the Serbian language textbook contents enable junior students to understand a literary text; (b) to what extent teachers accept the suggestions offered in the textbook for preparing literature teaching. The results show that a large number of suggestions relate to reading comprehension, but some categories of understanding are unevenly distributed in the methodological apparatus. On the other hand, the majority of teachers use the methodological apparatus given in a textbook for preparing classes, not only the textbook he or she selected for teaching but also other textbooks for the same grade.

  14. Automated Analysis of Corpora Callosa

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille; Davies, Rhodri H.

    2003-01-01

    This report describes and evaluates the steps needed to perform modern model-based interpretation of the corpus callosum in MRI. The process is discussed from the initial landmark-free contours to full-fledged statistical models based on the Active Appearance Models framework. Topics treated include landmark placement, background modelling and multi-resolution analysis. Preliminary quantitative and qualitative validation in a cross-sectional study shows that fully automated analysis and segmentation of the corpus callosum are feasible....

  15. Linguistic Corpora and Language Teaching.

    Science.gov (United States)

    Murison-Bowie, Simon

    1996-01-01

    Examines issues raised by corpus linguistics concerning the description of language. The article argues that it is necessary to start from correct descriptions of linguistic units and the contexts in which they occur. Corpus linguistics has joined with language teaching by sharing a recognition of the importance of a larger, schematic view of…

  16. A Guide Text or Many Texts? "That is the Question"

    Directory of Open Access Journals (Sweden)

    Delgado de Valencia Sonia

    2001-08-01

    Full Text Available The use of supplementary materials in the classroom has always been an essential part of the teaching and learning process. To restrict our teaching to the scope of one single textbook means to fall behind the advances of knowledge, in any area and context. Young learners appreciate any new and varied support that expands their knowledge of the world: diaries, letters, panels, free texts, magazines, short stories, poems or literary excerpts, and articles taken from the Internet are materials that will allow learners to share more and work more collaboratively. In this article we deal with some of these materials and with the criteria for selecting, adapting, and creating them so that they may be of interest to the learner and may promote reading and writing processes. Since no text can entirely satisfy the needs of students and teachers, the creativity of both parties will be necessary to improve the quality of teaching through the adequate use and adaptation of supplementary materials.

  17. Individual Profiling Using Text Analysis

    Science.gov (United States)

    2016-04-15

    AFRL-AFOSR-UK-TR-2016-0011 Individual Profiling using Text Analysis 140333. Mark Stevenson, University of Sheffield, Department of Psychology. Report type: Final. Dates covered: 15 Sep 2014 to 14 Sep 2015. ...consisted of collections of tweets for a number of Twitter users whose gender, age and personality scores are known. The task was to construct some system

  18. Identifying issue frames in text.

    Directory of Open Access Journals (Sweden)

    Eyal Sagi

    Full Text Available Framing, the effect of context on cognitive processes, is a prominent topic of research in psychology and public opinion research. Research on framing has traditionally relied on controlled experiments and manually annotated document collections. In this paper we present a method that allows for quantifying the relative strengths of competing linguistic frames based on corpus analysis. This method requires little human intervention and can therefore be efficiently applied to large bodies of text. We demonstrate its effectiveness by tracking changes in the framing of terror over time and comparing the framing of abortion by Democrats and Republicans in the U.S.

  19. Finding text in color images

    Science.gov (United States)

    Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga

    1998-04-01

    In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.
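
    One way to measure similarity on both RGB and spatial proximity is a weighted sum of the two Euclidean distances, as in the sketch below. The abstract does not state the actual measure used, so the formula, the `spatial_weight` parameter, and the function names are assumptions made purely for illustration.

```python
import math

def pixel_distance(a, b, spatial_weight=0.5):
    """Distance between pixels given as (r, g, b, x, y) tuples.

    Colour distance plus a weighted positional distance; the weighting
    scheme is a hypothetical choice, not the one from the paper.
    """
    color = math.dist(a[:3], b[:3])
    space = math.dist(a[3:], b[3:])
    return color + spatial_weight * space

def nearest_cluster(pixel, centres, spatial_weight=0.5):
    """Index of the cluster centre closest to the pixel."""
    return min(
        range(len(centres)),
        key=lambda k: pixel_distance(pixel, centres[k], spatial_weight),
    )

# Two red regions far apart on the page and one blue region (toy data).
centres = [(200, 30, 30, 10, 10), (200, 30, 30, 300, 10), (30, 30, 200, 150, 10)]
pixel = (198, 32, 29, 290, 12)  # reddish pixel near the second centre
assigned = nearest_cluster(pixel, centres)
```

    With the spatial term included, two strings of the same colour in different parts of an image fall into separate clusters, which pure RGB clustering cannot do.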

  20. Menzerath-Altmann law for distinct word distribution analysis in a large text

    Science.gov (United States)

    Eroglu, Sertac

    2013-06-01

    The empirical law uncovered by Menzerath and formulated by Altmann, known as the Menzerath-Altmann law (henceforth the MA law), reveals the statistical distribution behavior of human language in various organizational levels. Building on previous studies relating organizational regularities in a language, we propose that the distribution of distinct (or different) words in a large text can effectively be described by the MA law. The validity of the proposition is demonstrated by examining two text corpora written in different languages not belonging to the same language family (English and Turkish). The results show not only that distinct word distribution behavior can accurately be predicted by the MA law, but that this result appears to be language-independent. This result is important not only for quantitative linguistic studies, but also may have significance for other naturally occurring organizations that display analogous organizational behavior. We also deliberately demonstrate that the MA law is a special case of the probability function of the generalized gamma distribution.
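
    The relation to the generalized gamma distribution can be checked numerically. The sketch below assumes one commonly cited form of the MA law, y = a * x**b * exp(-c*x) (sign conventions for b and c vary between authors), and the usual three-parameter generalized gamma density; the particular parameter mapping is an illustration and is not taken from the paper.

```python
import math

def ma_law(x, a, b, c):
    """Menzerath-Altmann law in the assumed form y = a * x**b * exp(-c*x)."""
    return a * x**b * math.exp(-c * x)

def generalized_gamma_pdf(x, scale, d, p):
    """Generalized gamma density with scale > 0 and shape parameters d, p > 0."""
    return (p / scale**d) * x**(d - 1) * math.exp(-((x / scale) ** p)) / math.gamma(d / p)

# With p = 1 the density reduces to a * x**b * exp(-c*x) where
# b = d - 1, c = 1/scale and a = 1 / (scale**d * Gamma(d)).
scale, d = 2.0, 3.0
a = 1.0 / (scale**d * math.gamma(d))
b = d - 1.0
c = 1.0 / scale

xs = [0.5, 1.0, 2.0, 5.0, 10.0]
pairs = [(ma_law(x, a, b, c), generalized_gamma_pdf(x, scale, d, 1.0)) for x in xs]
```

    Under this mapping the two curves coincide point by point, which is the sense in which the assumed MA form is a special case of the generalized gamma family.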

  1. Automated analysis of instructional text

    Energy Technology Data Exchange (ETDEWEB)

    Norton, L.M.

    1983-05-01

    The development of a capability for automated processing of natural language text is a long-range goal of artificial intelligence. This paper discusses an investigation into the issues involved in the comprehension of descriptive, as opposed to illustrative, textual material. The comprehension process is viewed as the conversion of knowledge from one representation into another. The proposed target representation consists of statements of the Prolog language, which can be interpreted both declaratively and procedurally, much like production rules. A computer program has been written to model in detail some ideas about this process. The program successfully analyzes several heavily edited paragraphs adapted from an elementary textbook on programming, automatically synthesizing as a result of the analysis a working Prolog program which, when executed, can parse and interpret LET commands in the BASIC language. The paper discusses the motivations and philosophy of the project, the many kinds of prerequisite knowledge which are necessary, and the structure of the text analysis program. A sentence-by-sentence account of the analysis of the sample text is presented, describing the syntactic and semantic processing which is involved. The paper closes with a discussion of lessons learned from the project, possible alternative approaches, and possible extensions for future work. The entire project is presented as illustrative of the nature and complexity of the text analysis process, rather than as providing definitive or optimal solutions to any aspects of the task. 12 references.

  2. Solar Concepts: A Background Text.

    Science.gov (United States)

    Gorham, Jonathan W.

    This text is designed to provide teachers, students, and the general public with an overview of key solar energy concepts. Various energy terms are defined and explained. Basic thermodynamic laws are discussed. Alternative energy production is described in the context of the present energy situation. Described are the principal contemporary solar…

  3. FTP: Full-Text Publishing?

    Science.gov (United States)

    Jul, Erik

    1992-01-01

    Describes the use of file transfer protocol (FTP) on the INTERNET computer network and considers its use as an electronic publishing system. The differing electronic formats of text files are discussed; the preparation and access of documents are described; and problems are addressed, including a lack of consistency. (LRW)

  4. Quality Inspection of Printed Texts

    DEFF Research Database (Denmark)

    Pedersen, Jesper Ballisager; Nasrollahi, Kamal; Moeslund, Thomas B.

    2016-01-01

    -folded: for customers of the printing and verification system, the overall grade is used to verify if the text is of sufficient quality, while for the printer's manufacturer, the detailed character/symbol grades and quality measurements are used for the improvement and optimization of the printing task. The proposed system...

  5. Sonidos de un Chile profundo: Hacia un análisis crítico del Archivo Sonoro de Música Tradicional Chilena en relación a la conformación del folclore en Chile Sounds from the Depth of Chile: Toward a Critical Analysis of the Sound Archive of Chilean Traditional Music as regards the establishing of Folklore in Chile

    Directory of Open Access Journals (Sweden)

    Mariana León Villagra

    2011-06-01

    Full Text Available The process of rescuing from destruction the patrimonial legacy contained in the Sound Archive of Chilean Traditional Music serves as a basis for a critical review of the concepts of identity and patrimony within the current situation of traditional and local musics in the worldwide context of digital technologies. According to this perspective, it is necessary to review the history of the Chilean cultural development of the 1940s and 1950s, focusing on the construction of a national identity based on the concept of folklore. If the value of these traditional musics and sounds is brought to light, it is possible to strengthen the presence of local identities in Chilean culture, thus emphasizing their importance for cultural state policies aiming to be truly democratic.

  6. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  7. Linguistic dating of biblical texts

    DEFF Research Database (Denmark)

    Young, Ian; Rezetko, Robert; Ehrensvärd, Martin Gustaf

    Since the beginning of critical scholarship biblical texts have been dated using linguistic evidence. In recent years this has become a controversial topic, especially with the publication of Ian Young (ed.), Biblical Hebrew: Studies in Chronology and Typology (2003). However, until now there has been no introduction and comprehensive study of the field. Volume 1 introduces the field of linguistic dating of biblical texts, particularly to intermediate and advanced students of biblical Hebrew who have a reasonable background in the language, having completed at least an introductory course... in this volume are: What is it that makes Archaic Biblical Hebrew archaic, Early Biblical Hebrew early, and Late Biblical Hebrew late? Does linguistic typology, i.e. different linguistic characteristics, convert easily and neatly into linguistic chronology, i.e. different historical origins? A large amount...

  8. Text as an Autopoietic System

    DEFF Research Database (Denmark)

    Nicolaisen, Maria Skou

    2016-01-01

    The aim of the present research article is to discuss the possibilities and limitations in addressing text as an autopoietic system. The theory of autopoiesis originated in the field of biology in order to explain the dynamic processes entailed in sustaining living organisms at cellular level. ... By comparing the biological with the textual account of autopoietic agency, the conclusion is that a newly derived concept of sociopoiesis might be better suited for discussing the architecture of textual systems....

  9. The TEXT upgrade vertical interferometer

    International Nuclear Information System (INIS)

    Hallock, G.A.; Gartman, M.L.; Li, W.; Chiang, K.; Shin, S.; Castles, R.L.; Chatterjee, R.; Rahman, A.S.

    1992-01-01

    A far-infrared interferometer has been installed on TEXT upgrade to obtain electron density profiles. The primary system views the plasma vertically through a set of large (60-cm radial × 7.62-cm toroidal) diagnostic ports. A 1-cm channel spacing (59 channels total) and fast electronic time response are used to provide high resolution for radial profiles and perturbation experiments. Initial operation of the vertical system was obtained late in 1991, with six operating channels.

  10. Reasoning with Annotations of Texts

    OpenAIRE

    Ma, Yue; Lévy, François; Ghimire, Sudeep

    2011-01-01

    Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  11. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound

  12. The Balinese Unicode Text Processing

    Directory of Open Access Journals (Sweden)

    Imam Habibi

    2009-06-01

    Full Text Available In principle, the computer only recognizes numbers as the representation of a character. Therefore, there are many encoding systems to allocate these numbers, although not all characters are covered. In Europe, a single language may even need more than one encoding system. Hence, a new encoding system known as Unicode has been established to overcome this problem. Unicode provides a unique ID for each character, which does not depend on platform, program, or language. The Unicode standard has been adopted by a number of industry leaders, such as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, and Unisys. In addition, language standards and modern information exchanges such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML make use of Unicode as an official tool for implementing ISO/IEC 10646. There are four things to do for the Balinese script: the algorithms for transliteration, searching, sorting, and word boundary analysis (spell checking). To verify the correctness of the algorithms, some applications were built. These applications can run on the Linux/Windows OS platform using J2SDK 1.5 and the J2ME WTK2 library. The input and output of the algorithms/applications are character sequences obtained from keyboard input and external files. This research produces a module or a library which is able to process Balinese text based on the Unicode standard. The output of this research is the ability, skill, and mastering of 1. The Unicode standard (21-bit) as a substitute for ASCII (7-bit) and ISO8859-1 (8-bit) as the former default character set in many applications. 2. The Balinese Unicode text processing algorithm. 3. An experience of working with and learning from an international team that consists of the foremost experts in the area: Michael Everson (Ireland), Peter Constable (Microsoft US), I Made Suatjana, and Ida Bagus Adi Sudewa.
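
    The encoding point made above can be illustrated with a few lines of Python (the project itself used Java; Python is used here only for brevity). The sketch relies on the Unicode "Balinese" block, U+1B00 to U+1B7F; the run-splitting helper is a deliberately naive stand-in, not the word boundary algorithm from the paper, and the sample string is arbitrary.

```python
BALINESE_BLOCK = range(0x1B00, 0x1B80)  # Unicode block "Balinese", U+1B00..U+1B7F

def is_balinese(ch):
    """True if the character's code point lies in the Balinese block."""
    return ord(ch) in BALINESE_BLOCK

def balinese_runs(text):
    """Split text into maximal runs of Balinese-block characters (naive segmentation)."""
    runs, current = [], []
    for ch in text:
        if is_balinese(ch):
            current.append(ch)
        elif current:
            runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))
    return runs

sample = "abc \u1b13\u1b35 xyz \u1b27"  # arbitrary code points from the block
runs = balinese_runs(sample)

# Code points in this block need three bytes in UTF-8 and cannot be
# represented in 7-bit ASCII or 8-bit ISO 8859-1 at all.
utf8_length = len("\u1b13".encode("utf-8"))
try:
    "\u1b13".encode("latin-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

# Sorting by raw code point is only a baseline; proper Balinese collation
# would need dedicated rules such as those the project developed.
by_code_point = sorted(runs)
```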

  13. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new, powerful word ranking metric in order to extract the keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, on par with the best previous ranking methods.
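
    The abstract does not state how the probabilities are derived from word positions, so the sketch below shows only the standard definition of Tsallis entropy, S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy as q approaches 1. The function name and the sample distribution are assumptions made for illustration, not the authors' implementation.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p_i**q)) / (q - 1).

    Falls back to Shannon entropy (natural log) when q equals 1,
    which is the limit of S_q as q approaches 1.
    """
    probs = [p for p in probs if p > 0]  # zero-probability terms contribute nothing
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


uniform = [0.25] * 4  # made-up distribution for illustration
print(tsallis_entropy(uniform, 2.0))  # 0.75
print(tsallis_entropy(uniform, 1.0))  # Shannon limit, ln(4)
```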

  14. Biased limiter experiments on text

    International Nuclear Information System (INIS)

    Phillips, P.E.; Wootton, A.J.; Rowan, W.L.; Ritz, C.P.; Rhodes, T.L.; Bengtson, R.D.; Hodge, W.L.; Durst, R.D.; McCool, S.C.; Richards, B.; Gentle, K.W.; Schoch, P.; Forster, J.C.; Hickok, R.L.; Evans, T.E.

    1987-01-01

    Experiments using an electrically biased limiter have been performed on the Texas Experimental Tokamak (TEXT). A small movable limiter is inserted past the main poloidal ring limiter (which is electrically connected to the vacuum vessel) and biased at V_Lim with respect to it. The floating potential, plasma potential and shear layer position can be controlled. With |V_Lim| ≥ 50 V the plasma density increases. For V_Lim > 0 the results obtained are inconclusive. Variation of V_Lim changes the electrostatic turbulence, which may explain the observed total flux changes. (orig.)

  15. New Historicism: Text and Context

    Directory of Open Access Journals (Sweden)

    Violeta M. Vesić

    2016-02-01

    Full Text Available During most of the twentieth century, history was seen as a phenomenon outside of literature that guaranteed the veracity of literary interpretation. History was unique and it functioned as a basis for reading literary works. During the 1970s there occurred a change of attitude towards history in American literary theory, and there appeared a new theoretical approach which soon became known as New Historicism. Since its inception, New Historicism has been identified with the study of the Renaissance and Romanticism, but nowadays it is increasingly involved in other literary trends. Although there are great differences in the arguments and practices of the various representatives of this school, New Historicism has clearly recognizable features, and many new historicists will agree with the statement of Walter Cohen that New Historicism, when it appeared in the eighties, represented something quite new in reference to the studies of theory, criticism and history (Cohen 1987, 33). The theoretical connection with Bakhtin, Foucault and Marx is clear, as is a kind of uneasy tie with deconstruction and the work of Paul de Man. At the center of this approach is a renewed interest in the study of literary works in the light of the historical and political circumstances in which they were created. Foucault encouraged readers to begin to move literary texts and to link them with discourses and representations that are not literary, as well as to examine the sociological aspects of the texts in order to take part in the social struggles of today. The study of literary works using New Historicism is the study of the politics, history, culture and circumstances in which these works were created. With regard to one of the main facts located at the center of the criticism, that history cannot be viewed objectively and that reality can only be understood through a cultural context that reveals the work, re-reading and interpretation of

  16. Knowledge based word-concept model estimation and refinement for biomedical text mining.

    Science.gov (United States)

    Jimeno Yepes, Antonio; Berlanga, Rafael

    2015-02-01

    Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have not been devised for text mining tasks but for human interpretation, so the performance of KB-based methods is usually lower than that of supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore not useful for large-scale biomedical text mining systems. KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method takes into account not only the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  18. Linguistic positivity in historical texts reflects dynamic environmental and psychological factors.

    Science.gov (United States)

    Iliev, Rumen; Hoover, Joe; Dehghani, Morteza; Axelrod, Robert

    2016-12-06

    People use more positive words than negative words. Referred to as "linguistic positivity bias" (LPB), this effect has been found across cultures and languages, prompting the conclusion that it is a panhuman tendency. However, although multiple competing explanations of LPB have been proposed, there is still no consensus on what mechanism(s) generate LPB or even on whether it is driven primarily by universal cognitive features or by environmental factors. In this work we propose that LPB has remained unresolved because previous research has neglected an essential dimension of language: time. In four studies conducted with two independent, time-stamped text corpora (Google Books Ngrams and the New York Times), we found that LPB in American English has decreased during the last two centuries. We also observed dynamic fluctuations in LPB that were predicted by changes in objective environment, i.e., war and economic hardships, and by changes in national subjective happiness. In addition to providing evidence that LPB is a dynamic phenomenon, these results suggest that cognitive mechanisms alone cannot account for the observed dynamic fluctuations in LPB. At the least, LPB likely arises from multiple interacting mechanisms involving subjective, objective, and societal factors. In addition to having theoretical significance, our results demonstrate the value of newly available data sources in addressing long-standing scientific questions.

  19. A New English–Arabic Parallel Text Corpus for Lexicographic Applications

    Directory of Open Access Journals (Sweden)

    Hashan Al-Ajmi

    2011-10-01

    Full Text Available

    Abstract: Bilingual lexicographers, translation specialists and English teachers in the Arab world do not have access to computerized corpora of parallel texts for the English–Arabic language pair. This project has been carried out to meet this requirement by establishing the first general parallel corpus of English texts and their Arabic translations. The first phase of the project involved the selection of general source texts having appropriate lexical and stylistic features. The chosen source texts deal with a variety of topics such as the environment, globalization, psychology, history, politics, drama, etc. Their Arabic translations were taken from The World of Knowledge series published by the National Council for Culture, Arts and Letters (NCCAL) in Kuwait.

    Keywords: PARALLEL CORPUS, LEXICOGRAPHY, TRANSLATION, BILINGUAL DICTIONARY, COLLOCATIONS, ALIGNMENT, SYNONYMS, DERIVATIVES, ANTONYMS, GLOSSARY, FREQUENCY

  20. A programmed text in statistics

    CERN Document Server

    Hine, J

    1975-01-01

    Contents (partial): Exercises for Section 2 (physical sciences and engineering; biological sciences; social sciences); Solutions to Exercises, Sections 1 and 2 (same subject groupings); Tables (χ² tests involving variances; χ² one-tailed tests; χ² two-tailed tests; F-distribution). Preface: This project started some years ago when the Nuffield Foundation kindly gave a grant for writing a programmed text to use with service courses in statistics. The work was carried out by Mrs. Joan Hine and Professor G. B. Wetherill at Bath University, together with some other help from time to time by colleagues at Bath University and elsewhere. Testing was done at various colleges and universities, and some helpful comments were received, but we particularly mention King Edwards School, Bath, who provided some sixth formers as 'guinea pigs' for the fir...

  1. Cell line name recognition in support of the identification of synthetic lethality in cancer from text

    Science.gov (United States)

    Kaewphan, Suwisa; Van Landeghem, Sofie; Ohta, Tomoko; Van de Peer, Yves; Ginter, Filip; Pyysalo, Sampo

    2016-01-01

    Motivation: The recognition and normalization of cell line names in text is an important task in biomedical text mining research, facilitating for instance the identification of synthetically lethal genes from the literature. While several tools have previously been developed to address cell line recognition, it is unclear whether available systems can perform sufficiently well in realistic and broad-coverage applications such as extracting synthetically lethal genes from the cancer literature. In this study, we revisit the cell line name recognition task, evaluating both available systems and newly introduced methods on various resources to obtain a reliable tagger not tied to any specific subdomain. In support of this task, we introduce two text collections manually annotated for cell line names: the broad-coverage corpus Gellus and CLL, a focused target domain corpus. Results: We find that the best performance is achieved using NERsuite, a machine learning system based on Conditional Random Fields, trained on the Gellus corpus and supported with a dictionary of cell line names. The system achieves an F-score of 88.46% on the test set of Gellus and 85.98% on the independently annotated CLL corpus. It was further applied at large scale to 24 302 102 unannotated articles, resulting in the identification of 5 181 342 cell line mentions, normalized to 11 755 unique cell line database identifiers. Availability and implementation: The manually annotated datasets, the cell line dictionary, derived corpora, NERsuite models and the results of the large-scale run on unannotated texts are available under open licenses at http://turkunlp.github.io/Cell-line-recognition/. Contact: sukaew@utu.fi PMID:26428294
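
    The F-scores reported above are the harmonic mean of precision and recall over recognized mentions. The abstract does not state the matching criterion used, so the Python sketch below assumes exact matching of (start, end, text) spans; the function name and the sample annotations are made up for illustration and are not taken from the Gellus or CLL corpora.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level precision, recall and F1 under exact span matching."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)  # spans found in both sets
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical (start, end, text) annotations for one document.
gold = {(0, 4, "HeLa"), (10, 15, "MCF-7"), (20, 26, "HEK293")}
pred = {(0, 4, "HeLa"), (10, 15, "MCF-7"), (30, 34, "A549")}
p, r, f1 = precision_recall_f1(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # P=0.667 R=0.667 F1=0.667
```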

  2. Combining machine learning, crowdsourcing and expert knowledge to detect chemical-induced diseases in text.

    Science.gov (United States)

    Bravo, Àlex; Li, Tong Shu; Su, Andrew I; Good, Benjamin M; Furlong, Laura I

    2016-01-01

    Drug toxicity is a major concern for both regulatory agencies and the pharmaceutical industry. In this context, text-mining methods for the identification of drug side effects from free text are key for the development of up-to-date knowledge sources on drug adverse reactions. We present a new system for identification of drug side effects from the literature that combines three approaches: machine learning, rule-based and knowledge-based approaches. This system has been developed to address Task 3.B of the BioCreative V challenge (BC5), dealing with Chemical-induced Disease (CID) relations. The first two approaches focus on identifying relations at the sentence level, while the knowledge-based approach is applied at both sentence and abstract levels. The machine learning method is based on the BeFree system using two corpora as training data: the annotated data provided by the CID task organizers and a new CID corpus developed by crowdsourcing. Different combinations of results from the three strategies were selected for each run of the challenge. In the final evaluation setting, the system achieved the highest recall of the challenge (63%). By performing an error analysis, we identified the main causes of misclassifications and areas for improving our system, and highlighted the need for consistent gold standard data sets for advancing the state of the art in text mining of drug side effects. Database URL: https://zenodo.org/record/29887?ln=en#.VsL3yDLWR_V. © The Author(s) 2016. Published by Oxford University Press.

  3. Analysis of Influence of Different Relations Types on the Quality of Thesaurus Application to Text Classification Problems

    Directory of Open Access Journals (Sweden)

    Nadezhda S. Lagutina

    2017-01-01

    Full Text Available The main purpose of the article is to analyze how effectively different types of thesaurus relations can be used to solve text classification tasks. The basis of the study is an automatically generated thesaurus of a subject area that contains three types of relations: synonymous, hierarchical and associative. To generate the thesaurus, the authors use a hybrid method based on several linguistic and statistical algorithms for extracting semantic relations. The method makes it possible to create a thesaurus with a sufficiently large number of terms and relations among them. The authors consider two problems: topical text classification and sentiment classification of large newspaper articles. To solve them, the authors developed two approaches that complement standard algorithms with a procedure that takes thesaurus relations into account to determine semantic features of texts. The approach to topical classification combines the standard unsupervised BM25 algorithm with a procedure that takes into account the synonymous and hierarchical relations of the subject-area thesaurus. The approach to sentiment classification consists of two steps. In the first step, a thesaurus is created in which the polarity weights of terms are calculated from the term occurrences in the training set or from the weights of related thesaurus terms. In the second step, the thesaurus is used to compute the features of words from texts and to classify texts with the SVM or Naive Bayes algorithm. In experiments with the text corpora BBCSport, Reuters, PubMed and a corpus of articles about American immigrants, the authors varied the types of thesaurus relations involved in the classification and the degree of their use. The results of the experiments make it possible to evaluate the efficiency of applying thesaurus relations to the classification of raw texts and to determine under what conditions particular relations have a greater or lesser effect. In particular, the
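
    To make the topical-classification idea concrete, here is a minimal Python sketch of Okapi BM25 scoring in which the query is first expanded with related thesaurus terms. The abstract does not describe how the authors combine BM25 with thesaurus relations, so the expansion step, the `thesaurus` dictionary, the parameter defaults (k1 = 1.5, b = 0.75) and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, thesaurus=None, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    thesaurus: optional dict mapping a term to related terms; a hypothetical
    stand-in for synonymous/hierarchical relations. Related terms are simply
    added to the query before scoring.
    """
    thesaurus = thesaurus or {}
    terms = set(query)
    for term in query:
        terms.update(thesaurus.get(term, ()))

    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    doc_freq = Counter(t for d in docs for t in set(d))  # documents containing t

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5) + 1.0)
            length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[t] * (k1 + 1.0) / (tf[t] + length_norm)
        scores.append(score)
    return scores


docs = [
    ["football", "match", "result"],
    ["soccer", "league", "table"],
    ["stock", "market", "report"],
]
plain = bm25_scores(["football"], docs)
expanded = bm25_scores(["football"], docs, thesaurus={"football": ["soccer"]})
print(plain)     # only the first document scores above zero
print(expanded)  # the second document now scores above zero as well
```

    With expansion, a document that uses only the related term is no longer scored as irrelevant, which is the kind of effect the study varies by relation type.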

  4. Doing Mathematics with Purpose: Mathematical Text Types

    Science.gov (United States)

    Dostal, Hannah M.; Robinson, Richard

    2018-01-01

    Mathematical literacy includes learning to read and write different types of mathematical texts as part of purposeful mathematical meaning making. Thus in this article, we describe how learning to read and write mathematical texts (proof text, algorithmic text, algebraic/symbolic text, and visual text) supports the development of students'…

  5. The socio-demographics of texting

    DEFF Research Database (Denmark)

    Ling, Richard; Bertel, Troels Fibæk; Sundsøy, Pål

    2012-01-01

    Who texts, and with whom do they text? This article examines the use of texting using metered traffic data from a large dataset (nearly 400 million anonymous text messages). We ask 1) How much do different age groups use mobile phone based texting (SMS)? 2) How wide is the circle of texting...

  6. Bengali text summarization by sentence extraction

    OpenAIRE

    Sarkar, Kamal

    2012-01-01

    Text summarization is the process of producing an abstract or summary by selecting significant portions of the information from one or more texts. In an automatic text summarization process, a text is given to the computer and the computer returns a shorter, less redundant extract or abstract of the original text(s). Many techniques have been developed for summarizing English text(s). However, very few attempts have been made at Bengali text summarization. This paper presents a method for Bengali ...
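
    The abstract is truncated before the method is described, so the following Python sketch is not the paper's algorithm. It shows a generic frequency-based form of sentence extraction: sentences are scored by the average corpus frequency of their words and the top-scoring ones are returned in their original order. The function name, the scoring rule and the sample text are assumptions; the splitter also treats the danda (।) as a sentence terminator, and tokenization is by whitespace only.

```python
import re
from collections import Counter

PUNCTUATION = ".,!?;:।"


def extract_summary(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in original order."""
    # Split after '.', '!', '?' or the danda when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?।])\s+", text) if s.strip()]
    # Whitespace tokenization with surrounding punctuation stripped.
    tokens = [
        [w.strip(PUNCTUATION).lower() for w in s.split() if w.strip(PUNCTUATION)]
        for s in sentences
    ]
    freq = Counter(w for ws in tokens for w in ws)

    def score(words):
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(tokens[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


text = "Cats chase mice. Cats sleep. Dogs bark loudly at night."
print(extract_summary(text, n_sentences=2))  # ['Cats chase mice.', 'Cats sleep.']
```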

  7. BC4GO: a full-text corpus for the BioCreative IV GO task.

    Science.gov (United States)

    Van Auken, Kimberly; Schaeffer, Mary L; McQuilton, Peter; Laulederkind, Stanley J F; Li, Donghui; Wang, Shur-Jen; Hayman, G Thomas; Tweedie, Susan; Arighi, Cecilia N; Done, James; Müller, Hans-Michael; Sternberg, Paul W; Mao, Yuqing; Wei, Chih-Hsuan; Lu, Zhiyong

    2014-01-01

    Gene function curation via Gene Ontology (GO) annotation is a common task among Model Organism Database groups. Owing to its manual nature, this task is considered one of the bottlenecks in literature curation. There have been many previous attempts at automatic identification of GO terms and supporting information from full text. However, few systems have delivered an accuracy that is comparable with humans. One recognized challenge in developing such systems is the lack of marked sentence-level evidence text that provides the basis for making GO annotations. We aim to create a corpus that includes the GO evidence text along with the three core elements of GO annotations: (i) a gene or gene product, (ii) a GO term and (iii) a GO evidence code. To ensure our results are consistent with real-life GO data, we recruited eight professional GO curators and asked them to follow their routine GO annotation protocols. Our annotators marked up more than 5000 text passages in 200 articles for 1356 distinct GO terms. For evidence sentence selection, the inter-annotator agreement (IAA) results are 9.3% (strict) and 42.7% (relaxed) in F1-measures. For GO term selection, the IAAs are 47% (strict) and 62.9% (hierarchical). Our corpus analysis further shows that abstracts contain ~10% of relevant evidence sentences and 30% distinct GO terms, while the Results/Experiment section has nearly 60% relevant sentences and >70% GO terms. Further, of those evidence sentences found in abstracts, less than one-third contain enough experimental detail to fulfill the three core criteria of a GO annotation. This result demonstrates the need for using full-text articles for text mining GO annotations. Through its use at the BioCreative IV GO (BC4GO) task, we expect our corpus to become a valuable resource for the BioNLP research community. Database URL: http://www.biocreative.org/resources/corpora/bc-iv-go-task-corpus/. Published by Oxford University Press 2014. This work is written by US

  8. Text Analysis: Critical Component of Planning for Text-Based Discussion Focused on Comprehension of Informational Texts

    Science.gov (United States)

    Kucan, Linda; Palincsar, Annemarie Sullivan

    2018-01-01

    This investigation focuses on a tool used in a reading methods course to introduce reading specialist candidates to text analysis as a critical component of planning for text-based discussions. Unlike planning that focuses mainly on important text content or information, a text analysis approach focuses both on content and how that content is…

  9. SIAM 2007 Text Mining Competition dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — Subject Area: Text Mining Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining...

  10. Measurement of [Formula: see text] polarisation in pp collisions at √s = 7 TeV.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Affolder, A; Ajaltouni, Z; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Anderson, J; Andreassen, R; Andreotti, M; Andrews, J E; Appleby, R B; Aquines Gutierrez, O; Archilli, F; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Bachmann, S; Back, J J; Badalov, A; Balagura, V; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Batozskaya, V; Bauer, Th; Bay, A; Beddow, J; Bedeschi, F; Bediaga, I; Belogurov, S; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bettler, M-O; van Beuzekom, M; Bien, A; Bifani, S; Bird, T; Bizzeti, A; Bjørnstad, P M; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Bondar, A; Bondar, N; Bonivento, W; Borghi, S; Borgia, A; Borsato, M; Bowcock, T J V; Bowen, E; Bozzi, C; Brambach, T; van den Brand, J; Bressieux, J; Brett, D; Britsch, M; Britton, T; Brook, N H; Brown, H; Bursche, A; Busetto, G; Buytaert, J; Cadeddu, S; Calabrese, R; Callot, O; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carranza-Mejia, H; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cenci, R; Charles, M; Charpentier, Ph; Cheung, S-F; Chiapolini, N; Chrzaszcz, M; Ciba, K; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coca, C; Coco, V; Cogan, J; Cogneras, E; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombes, M; Coquereau, S; Corti, G; Corvo, M; Counts, I; Couturier, B; Cowan, G A; Craik, D C; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Dalseno, J; David, P; David, P N Y; Davis, A; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Silva, W; De Simone, P; Decamp, D; Deckenhoff, M; Del Buono, L; Déléage, N; Derkach, D; Deschamps, O; Dettori, F; Di Canto, A; Dijkstra, H; 
Donleavy, S; Dordei, F; Dorigo, M; Dosil Suárez, A; Dossett, D; Dovbnya, A; Dupertuis, F; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Easo, S; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; El Rifai, I; Elsasser, Ch; Esen, S; Evans, T; Falabella, A; Färber, C; Farinelli, C; Farry, S; Ferguson, D; Fernandez Albor, V; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fontana, M; Fontanelli, F; Forty, R; Francisco, O; Frank, M; Frei, C; Frosini, M; Fu, J; Furfaro, E; Gallas Torreira, A; Galli, D; Gandelman, M; Gandini, P; Gao, Y; Garofoli, J; Garra Tico, J; Garrido, L; Gaspar, C; Gauld, R; Gavardi, L; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianelle, A; Giani, S; Gibson, V; Giubega, L; Gligorov, V V; Göbel, C; Golubkov, D; Golutvin, A; Gomes, A; Gordon, H; Gotti, C; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graziani, G; Grecu, A; Greening, E; Gregson, S; Griffith, P; Grillo, L; Grünberg, O; Gui, B; Gushchin, E; Guz, Yu; Gys, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Hampson, T; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hartmann, T; He, J; Head, T; Heijne, V; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hoballah, M; Hombach, C; Hulsbergen, W; Hunt, P; Hussain, N; Hutchcroft, D; Hynds, D; Iakovenko, V; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jaton, P; Jawahery, A; Jezabek, M; Jing, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kaballo, M; Kandybei, S; Kanso, W; Karacson, M; Karbach, T M; Kelsey, M; Kenyon, I R; Ketel, T; Khanji, B; Khurewathanakul, C; Klaver, S; Kochebina, O; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Korolev, M; Kozlinskiy, A; Kravchuk, L; Kreplin, K; Kreps, M; Krocker, G; Krokovny, P; Kruse, F; Kucharczyk, M; Kudryavtsev, V; 
Kurek, K; Kvaratskheliya, T; La Thi, V N; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lambert, R W; Lanciotti, E; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Lefèvre, R; Leflat, A; Lefrançois, J; Leo, S; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Liles, M; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, G; Lohn, S; Longstaff, I; Longstaff, I; Lopes, J H; Lopez-March, N; Lowdon, P; Lu, H; Lucchesi, D; Luisier, J; Luo, H; Lupato, A; Luppi, E; Lupton, O; Machefert, F; Machikhiliyan, I V; Maciuc, F; Maev, O; Malde, S; Manca, G; Mancinelli, G; Manzali, M; Maratas, J; Marchand, J F; Marconi, U; Marino, P; Märki, R; Marks, J; Martellotti, G; Martens, A; Martín Sánchez, A; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massafferri, A; Matev, R; Mathe, Z; Matteuzzi, C; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; McSkelly, B; Meadows, B; Meier, F; Meissner, M; Merk, M; Milanes, D A; Minard, M-N; Molina Rodriguez, J; Monteil, S; Moran, D; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Mountain, R; Muheim, F; Müller, K; Muresan, R; Muster, B; Naik, P; Nakada, T; Nandakumar, R; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; Nguyen, T D; Nguyen-Mau, C; Nicol, M; Niess, V; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; Oblakowska-Mucha, A; Obraztsov, V; Oggero, S; Ogilvy, S; Okhrimenko, O; Oldeman, R; Onderwater, G; Orlandea, M; Otalora Goicochea, J M; Owen, P; Oyanguren, A; Pal, B K; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Parkes, C; Parkinson, C J; Passaleva, G; Patel, G D; Patel, M; Patrignani, C; Pazos Alvarez, A; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perez Trigo, E; Perret, P; Perrin-Terrin, M; Pescatore, L; Pesen, E; Petridis, K; Petrolini, A; Picatoste Olloqui, E; Pietrzyk, B; Pilař, T; Pinci, D; Pistone, A; Playfer, S; Plo Casasus, M; Polci, F; Polok, G; Poluektov, A; 
Polycarpo, E; Popov, A; Popov, D; Popovici, B; Potterat, C; Powell, A; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Rachwal, B; Rademacker, J H; Rakotomiaramanana, B; Rama, M; Rangel, M S; Raniuk, I; Rauschmayr, N; Raven, G; Redford, S; Reichert, S; Reid, M M; Dos Reis, A C; Ricciardi, S; Richards, A; Rinnert, K; Rives Molina, V; Roa Romero, D A; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Perez, P; Roiser, S; Romanovsky, V; Romero Vidal, A; Rotondo, M; Rouvinet, J; Ruf, T; Ruffini, F; Ruiz, H; Ruiz Valls, P; Sabatino, G; Saborido Silva, J J; Sagidova, N; Sail, P; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santovetti, E; Sapunov, M; Sarti, A; Satriano, C; Satta, A; Savrie, M; Savrina, D; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmidt, B; Schneider, O; Schopper, A; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Seco, M; Semennikov, A; Senderowska, K; Sepp, I; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, V; Shires, A; Silva Coutinho, R; Simi, G; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, N A; Smith, E; Smith, E; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Soomro, F; Souza, D; Souza De Paula, B; Spaan, B; Sparkes, A; Spinella, F; Spradlin, P; Stagni, F; Stahl, S; Steinkamp, O; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Stroili, R; Subbiah, V K; Sun, L; Sutcliffe, W; Swientek, K; Swientek, S; Syropoulos, V; Szczekowski, M; Szczypka, P; Szilard, D; Szumlak, T; T'Jampens, S; Teklishyn, M; Tellarini, G; Teodorescu, E; Teubert, F; Thomas, C; Thomas, E; van Tilburg, J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Torr, N; Tournefier, E; Tourneur, S; Tran, M T; Tresch, M; Tsaregorodtsev, A; Tsopelas, P; Tuning, N; Ubeda Garcia, M; Ukleja, A; 
Ustyuzhanin, A; Uwer, U; Vagnoni, V; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vázquez Sierra, C; Vecchi, S; Velthuis, J J; Veltri, M; Veneziano, G; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Vilasis-Cardona, X; Vollhardt, A; Volyanskyy, D; Voong, D; Vorobyev, A; Vorobyev, V; Voß, C; Voss, H; de Vries, J A; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wandernoth, S; Wang, J; Ward, D R; Watson, N K; Webber, A D; Websdale, D; Whitehead, M; Wicht, J; Wiedner, D; Wiggers, L; Wilkinson, G; Williams, M P; Williams, M; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wright, S; Wu, S; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yuan, X; Yushchenko, O; Zangoli, M; Zavertyaev, M; Zhang, F; Zhang, L; Zhang, W C; Zhang, Y; Zhelezov, A; Zhokhov, A; Zhong, L; Zvyagin, A

    The polarisation of prompt [Formula: see text] mesons is measured by performing an angular analysis of [Formula: see text] decays using proton-proton collision data, corresponding to an integrated luminosity of 1.0 fb⁻¹, collected by the LHCb detector at a centre-of-mass energy of 7 TeV. The polarisation is measured in bins of transverse momentum pT and rapidity y in the kinematic region [Formula: see text] and [Formula: see text], and is compared to theoretical models. No significant polarisation is observed.

  11. Examining Text Complexity in the Early Grades

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Hiebert, Elfrieda H.; Koons, Heather H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2016-01-01

    The Common Core raises the stature of texts to new heights, creating a hubbub. The fuss is especially messy at the early grades, where children are expected to read more complex texts than in the past. But early-grades teachers have been given little actionable guidance about text complexity. The authors recently examined early-grades texts to…

  12. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    With the explosive growth of data, the increasing volume of text places higher demands on the performance of text categorization, demands that existing classification methods cannot meet. Building on a study of existing text classification technology and semantics, this paper proposes a SAW (Structural Auxiliary Word) algorithm for Chinese text classification. The algorithm uses the special space effect of Chinese text where words...

  13. Ekaterina Efimova, Sovremennaia tiur’ma : byt, traditsii, fol’klor [Contemporary prison: ways of life, traditions, folklore], Moscow : OGI, 2004 & Russian Criminal Tatoo Encyclopedia, Steidl / Fuel 2003.

    Directory of Open Access Journals (Sweden)

    Youri Vavokhine

    2005-04-01

    Full Text Available The first study focuses mainly on the inmates and their sub-culture rather than on the Russian-Soviet penal institution itself and provides a close examination of different aspects of social action and social control problems within the prison universe. The issues of power, domination and struggle are apprehended in a very serious and explicit way, and that is how the author diverges from the approach proper to the Russian folklorists. Even if this study doesn’t pretend to a rigorous use of s...

  14. A Proposed Arabic Handwritten Text Normalization Method

    Directory of Open Access Journals (Sweden)

    Tarik Abu-Ain

    2014-11-01

    Full Text Available Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, which include slope correction, text padding, skew correction, and straightening the writing line. In this respect, text normalization has an important role in many procedures such as text segmentation, feature extraction and character recognition. In the present article, a new method for text baseline detection, straightening, and slant correction for Arabic handwritten texts is proposed. The method comprises a set of sequential steps: first, component segmentation is done, followed by component thinning; then, the direction features of the skeletons are extracted, and the candidate baseline regions are determined. After that, the correct baseline region is selected, and finally, the baselines of all components are aligned with the writing line. The experiments are conducted on the IFN/ENIT benchmark Arabic dataset. The results show that the proposed method has a promising and encouraging performance.

  15. Partition of Ni between olivine and sulfide: the effect of temperature, $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$

    Science.gov (United States)

    Fleet, M. E.; Macrae, N. D.

    1987-03-01

    The experimental distribution coefficient for Ni/Fe exchange between olivine and monosulfide (KD3) is 35.6±1.1 at 1385 °C, $f_{\mathrm{O}_2} = 10^{-8.87}$, $f_{\mathrm{S}_2} = 10^{-1.02}$, and olivine of composition Fo96 to Fo92. These are the physicochemical conditions appropriate to hypothesized sulfur-saturated komatiite magma. The present experiments equilibrated natural olivine grains with sulfide-oxide liquid in the presence of a (Mg, Fe)-alumino-silicate melt. By a variety of different experimental procedures, KD3 is shown to be essentially constant at about 30 to 35 in the temperature range 900 to 1400 °C, for olivine of composition Fo97 to Fo0, monosulfide composition with up to 70 mol. % NiS, and a wide range of $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$.

  16. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    Full Text Available In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have recently been applied to English text classification, but they have never been used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a dataset widely used in Arabic text classification: the Al-Jazeera News dataset. We chose this dataset to enable direct comparison of the Polynomial Networks classifier with other well-known classifiers evaluated on it in the Arabic text classification literature. Experimental results show that the Polynomial Networks classifier is competitive with state-of-the-art algorithms in the field of Arabic text classification.

  17. Multimodal Diversity of Postmodernist Fiction Text

    Directory of Open Access Journals (Sweden)

    U. I. Tykha

    2016-12-01

    Full Text Available The article is devoted to the analysis of structural and functional manifestations of multimodal diversity in postmodernist fiction texts. Multimodality is defined as the coexistence of more than one semiotic mode within a certain context. Multimodal texts feature a diversity of semiotic modes in the communication and development of their narrative. Such experimental texts subvert conventional patterns by introducing various semiotic resources – verbal or non-verbal.

  18. Youth Texting: Help or Hindrance to Literacy?

    Science.gov (United States)

    Zebroff, Dmitri

    2018-01-01

    An extensive amount of research has been performed in recent years into the widespread practice of text messaging in youth. As part of this broad area of research, the associations between youth texting and literacy have been investigated in a variety of contexts. A comprehensive, semi-systematic review of the literature into texting and literacy…

  19. Choices of texts for literary education

    DEFF Research Database (Denmark)

    Skyggebjerg, Anna Karlskov

    This paper charts the general implications of the choice of texts for literature teaching in the Danish school system, especially in Grades 8 and 9. It will analyze and discuss the premises of the choice of texts, and the possibilities of a certain choice of text in a concrete classroom situation...

  20. Effects of Text Messaging on Academic Performance

    OpenAIRE

    Barks Amanda; Searight H. Russell; Ratwik Susan

    2011-01-01

    University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were ...

  1. Text-Picture Relations in Cooking Instructions

    NARCIS (Netherlands)

    van der Sluis, Ielka; Leito, Shadira; Redeker, Gisela; Bunt, Harry

    2016-01-01

    Like many other instructions, recipes on packages with ready-to-use ingredients for a dish combine a series of pictures with short text paragraphs. The information presentation in such multimodal instructions can be compact (either text or picture) and/or cohesive (text and picture). In an

  2. Academic Journal Embargoes and Full Text Databases.

    Science.gov (United States)

    Brooks, Sam

    2003-01-01

    Documents the reasons for embargoes of academic journals in full text databases (i.e., publisher-imposed delays on the availability of full text content) and provides insight regarding common misconceptions. Tables present data on selected journals covering a cross-section of subjects and publishers and comparing two full text business databases.…

  3. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to text classification. This paper surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifier, Rocchio’s algorithm, k Nearest Neighbor (kNN), Support Vector Machines (SVM).

  4. Inclusion in the Workplace - Text Version | NREL

    Science.gov (United States)

    This is the text version for the Inclusion: Leading by Example video. I'm Martin Keller. I'm the NREL of the laboratory. Another very important element in inclusion is diversity. Because if we have a

  5. Effects of Text Messaging on Academic Performance

    Directory of Open Access Journals (Sweden)

    Barks Amanda

    2011-12-01

    Full Text Available University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were randomly assigned to one of two conditions: 1. a group that sent and received text messages during a lecture or, 2. a group that did not engage in text messaging during the lecture. Participants who engaged in text messaging demonstrated significantly poorer performance on a test covering lecture content compared with the group that did not send and receive text messages. Participants exhibiting higher levels of text messaging skill had significantly lower test scores than participants who were less proficient at text messaging. It is hypothesized that in terms of retention of lecture material, more frequent task shifting by those with greater text messaging proficiency contributed to poorer performance. Overall, the findings do not support the view, held by many university students, that this form of multitasking has little effect on the acquisition of lecture content. Results provide empirical support for teachers and professors who ban text messaging in the classroom.

  6. The artists' text as work of art

    NARCIS (Netherlands)

    van Rijn, I.A.M.J.

    2017-01-01

    Artists’ texts are texts written and produced by visual artists. Their number increasing since the 2000s, it becomes important to clarify their obscure relationship to art institutions. Analysing and comparing four different artists’ texts on a textual level, this research proposes an alternative to

  7. Figure text extraction in biomedical literature.

    Directory of Open Access Journals (Sweden)

    Daehyun Kim

    2011-01-01

    Full Text Available Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for
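
    The F1-scores quoted in this abstract can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only an illustrative check, not code from the FigTExT authors; the function name and the dictionary labels are made up for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs quoted in the abstract, converted to fractions.
# The labels are informal names chosen for this sketch.
reported = {
    "text localization": (0.84, 0.98),
    "figure text extraction": (0.625, 0.510),
    "important content only": (0.873, 0.688),
    "off-the-shelf OCR alone": (0.366, 0.193),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {100 * f1_score(p, r):.1f}%")
```

    Each computed value agrees with the rounded F1-score given in the abstract, which suggests the figures are internally consistent.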

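  8. La integración del videojuego educativo con el folklore. Una propuesta de aplicación en Educación Primaria [The integration of the educational video game with folklore: a proposal for application in primary education]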

    Directory of Open Access Journals (Sweden)

    Sonsoles Ramos Ahijado

    2016-01-01

    Full Text Available The article focuses on an innovation experience in primary education that forms part of the practical work for the course Música y nuevas tecnologías (Music and New Technologies) in the degree in Primary Education teaching, taught at the Escuela Universitaria de Educación y Turismo de Ávila (Universidad de Salamanca), within the programme Prácticas de campo. Modalidad II (2015/2016_OP4) of the Campus de Excelencia Internacional Studii Salamantini. It consists of applying the video game La granja musical to 46 first- and second-year primary school pupils at CEIP Santa Ana in Ávila, with the aim of recovering the folk tradition of the past, interpreting it in the present and projecting it towards the new generations of the future, so as to familiarise pupils with their folkloric roots. The results show a high degree of achievement of the stated objectives, together with a significant presence of innovative elements, both in the teaching strategies and in the technological resources employed.

  9. Amateur photography in Mexico: Club Fotográfico de México and the presence of folklorization in the construction of Mexico’s national identity - 1950

    Directory of Open Access Journals (Sweden)

    Priscila Miraz de Freitas Grecco

    2016-08-01

    Full Text Available Abstract In this study, we present Club Fotográfico de México, an institution that focuses on amateur photography in Mexico City, in the first years of its operation, in the 1950s. We understand that photo clubs are a path to be explored by photography history, from the moment they are presented as social spaces in which several issues that involve the practice and theory of photographic image can be observed, from the photograph statute to the country’s propagandistic image. In this sense, Club Fotográfico de México had a central role in the construction of a ‘folklorized’ image of the country and its popular culture and landscapes, thus contributing to consolidate a pro-government identity model, mainly through its guidelines for the creation of photographic image, leading amateur photographers from their choices of topics, going through the techniques and aesthetics that may be used to photograph the country, its landscapes, its culture, and its people. Keywords: Amateur photography; Photo clubs; Mexico; Mexicanidad. Original title: A fotografia amadora no México: Club Fotográfico de México e a presença da folclorização na construção da identidade nacional mexicana - 1950.

  10. Analyzing the interaction of a herbal compound Andrographolide from Andrographis paniculata as a folklore against swine flu (H1N1)

    Directory of Open Access Journals (Sweden)

    Chandrabhan Seniya

    2014-09-01

    Full Text Available Objective: To find new bioactive molecules for the treatment of swine flu. Methods: The present study is an attempt to elucidate the inhibition potential of andrographolide and its derivatives, along with the associated binding mechanism, through virtual screening and molecular docking simulation studies. Results: Our findings revealed structural conformation changes in the 150 loop and secondary sialic acid binding site residues of ACZ97474 {Neuraminidase (A/Blore/NIV236/2009(H1N1))}. Andrographolide was identified as having the strongest binding energy, -10.88 kcal/mol, with 3 hydrogen bond interactions (Arg152, Lys150, and Gly197), a total intermolecular energy of -12.07 kcal/mol and a bioactivity value (Ki) of 10.59 nmol/L, while the Food and Drug Administration approved drugs Oseltamivir and Zanamivir showed 2 and 4 hydrogen bond interactions with binding energies of -6.28 kcal/mol and -7.73 kcal/mol, respectively, which are higher (less favourable) than that of andrographolide. The guanidine group of Arg152 has binding affinities to the hydrophilic groups of the inhibitors (-OH and =O groups), as identified by docking of andrographolide (CID: 5318517) on neuraminidase. Conclusions: Hence, andrographolide has the potential to inhibit neuraminidase activity of H1N1 and may be used as an alternative medicinal therapy for swine flu positive patients. With potent antiviral activity and a potentially new mechanism of action, andrographolide may warrant further evaluation as a possible therapy for influenza.
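
    The reported Ki can be related to the reported binding energy through the standard thermodynamic relation Ki = exp(ΔG / RT). The sketch below assumes that relation and a temperature of 298.15 K, which docking programs commonly use; the abstract does not state how its Ki was obtained, so this is a plausibility check under those assumptions rather than the authors' procedure. The function and constant names are invented for the example.

```python
import math

GAS_CONSTANT_KCAL = 1.9872e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15         # assumed temperature; not stated in the abstract


def inhibition_constant(delta_g_kcal_per_mol: float,
                        temperature: float = TEMPERATURE_K) -> float:
    """Estimate Ki in mol/L from a binding free energy: Ki = exp(dG / (R * T))."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature))


# Binding energies (kcal/mol) quoted in the abstract.
binding_energies = [
    ("andrographolide", -10.88),
    ("zanamivir", -7.73),
    ("oseltamivir", -6.28),
]

for name, dg in binding_energies:
    ki = inhibition_constant(dg)
    print(f"{name}: dG = {dg} kcal/mol, estimated Ki = {ki * 1e9:.2f} nmol/L")
```

    Under these assumptions the andrographolide value comes out at about 10.6 nmol/L, in line with the 10.59 nmol/L given in the abstract, and the less negative energies of the two approved drugs correspond to larger estimated Ki values.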

  11. “Canto porque la guitarra / tiene sentido y razón”: folklore and politics in the music of Víctor Jara (1966-1973)

    Directory of Open Access Journals (Sweden)

    NATÁLIA AYO SCHMIEDECKE

    2013-01-01

    Full Text Available The present paper analyzes the political and cultural discourse associated with the Chilean New Song Movement through the musical work of the singer, composer and theater director Víctor Jara (1932-73). Despite his premature death, due to the coup of 1973, Jara left a varied artistic legacy which constitutes a privileged source for thinking about the Chilean context of the 1960s and 1970s, a period marked by the strong presence of social movements set against a background of growing political polarization. Affiliated to the Communist Party, the musician used the communicative and expressive potential of music to defend the need to promote structural political changes in the country, changes which necessarily pass through the sphere of culture. Asking about the presence of identity elements in his musical work, we will analyze the dialogue established by the artist with the national political context, indicating the heterogeneity of the repertoire enclosed under the term Chilean New Song Movement.

  12. Entre prouesse et dérision [Between prowess and derision: Noise imitation in American folklore at the beginning of the XXth century]

    Directory of Open Access Journals (Sweden)

    Victor A. Stoichita

    2011-06-01

    Full Text Available This article focuses on a type of virtuosity which involves the imitation of non-musical sounds in the context of musical performances. It attempts to describe the modalities of these moments of acrobatics in bluegrass and country music in North America. To highlight the historical evolution of this virtuosity system, the analysis follows the transformation of a particular melody, known as the Orange Blossom Special, which remains to this day one of the most famous imitations of trains in North American music.

  13. An Embedded Application for Degraded Text Recognition

    Directory of Open Access Journals (Sweden)

    Thillou Céline

    2005-01-01

    Full Text Available This paper describes a mobile device which tries to give the blind or visually impaired access to text information. Three key technologies are required for this system: text detection, optical character recognition, and speech synthesis. Blind users and the mobile environment imply two strong constraints. First, pictures will be taken without control on camera settings and a priori information on text (font or size and background. The second issue is to link several techniques together with an optimal compromise between computational constraints and recognition efficiency. We will present the overall description of the system from text detection to OCR error correction.

  14. The Instructional Text like a Textual Genre

    Directory of Open Access Journals (Sweden)

    Adiane Fogali Marinello

    2011-07-01

    Full Text Available This article analyses the instructional text as a textual genre and is part of the research called Reading and text production from the textual genre perspective, done at Universidade de Caxias do Sul, Campus Universitário da Região dos Vinhedos. Firstly, some theoretical assumptions about textual genre are presented, then, the instructional text is characterized. After that an instructional text is analyzed and, finally, some activities related to reading and writing of the mentioned genre directed to High School and University students are suggested.

  15. Text segmentation in degraded historical document images

    Directory of Open Access Journals (Sweden)

    A.S. Kavitha

    2016-07-01

    Full Text Available Text segmentation from degraded Historical Indus script images helps Optical Character Recognizer (OCR) to achieve good recognition rates for Hindus scripts; however, it is challenging due to complex background in such images. In this paper, we present a new method for segmenting text and non-text in Indus documents based on the fact that text components are less cursive compared to non-text ones. To achieve this, we propose a new combination of Sobel and Laplacian for enhancing degraded low contrast pixels. Then the proposed method generates skeletons for text components in enhanced images to reduce computational burdens, which in turn helps in studying component structures efficiently. We propose to study the cursiveness of components based on branch information to remove false text components. The proposed method introduces the nearest neighbor criterion for grouping components in the same line, which results in clusters. Furthermore, the proposed method classifies these clusters into text and non-text cluster based on characteristics of text components. We evaluate the proposed method on a large dataset containing varieties of images. The results are compared with the existing methods to show that the proposed method is effective in terms of recall and precision.

  16. Text mining with R: a tidy approach

    CERN Document Server

    Silge, Julia

    2017-01-01

    Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media. Learn how to apply the tidy text format to NLP; use sentiment analysis to mine the emotional content of text; identify a document's most important terms with frequency measurements; E...

  17. The nuclear modification of charged particles in Pb-Pb at $\sqrt{s_{\text{NN}}} = 5.02\,\text{TeV}$ measured with ALICE

    CERN Document Server

    Gronefeld, Julius

    2016-09-21

    The study of inclusive charged-particle production in heavy-ion collisions provides insights into the density of the medium and the energy-loss mechanisms. The observed suppression of high-$p_{\text{T}}$ yield is generally attributed to energy loss of partons as they propagate through a deconfined state of quarks and gluons - Quark-Gluon Plasma (QGP) - predicted by QCD. Such measurements allow the characterization of the QGP by comparison with models. In these proceedings, results on high-$p_{\text{T}}$ particle production measured by ALICE in Pb-Pb collisions at $\sqrt{s_{\text{NN}}} = 5.02\,\text{TeV}$ as well as in pp at $\sqrt{s} = 5.02\,\text{TeV}$ are presented for the first time. The nuclear modification factors ($R_{\text{AA}}$) in Pb-Pb collisions are presented and compared with model calculations.

  18. Deconstruction the end of writing: 'Everything is a text, there is nothing outside context'

    Directory of Open Access Journals (Sweden)

    Gavin P. Hendricks

    2016-03-01

    Language is a constant movement of differences and everything acquires the instability and ambiguity inherent in language (Callinicos 2004). The implications of Derrida's reading based on his work Of Grammatology (1976) have impacted everything in the humanities and social sciences, including law, anthropology, linguistics and gender studies, as the meaning of the text is not only inscribed in the sign (signifier and the signified), but everything is a 'text' and meaning and representation are how we interpret it. Intradisciplinary and/or interdisciplinary implications: Derrida sought to subvert the 'sign' in structuralism, as it opens the door to dialogue with the socially constructed 'Other' in relation to the 'sign' and the false consciousness construction of the text by the West. This challenges the existing interpretive paradigm and opens oral and written dialogue of the text for the 'other' in terms of the meaning and representation of the oral text, the oral archival memory of the other, indigenous knowledge systems, African rituals, folklore, storytelling and verbal arts.

  19. Sentiment analysis methods for understanding large-scale texts: a case for using continuum-scored words and word shift graphs

    Directory of Open Access Journals (Sweden)

    Andrew J Reagan

    2017-10-01

    Full Text Available Abstract The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, an extraordinary capacity which has profound implications for our understanding of human behavior. Given the growing assortment of sentiment-measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts. Most importantly they can aid understanding of texts with reliable and meaningful word shift graphs if (1) the dictionary covers a sufficiently large portion of a given text’s lexicon when weighted by word usage frequency; and (2) words are scored on a continuous scale.
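
    The two conditions named at the end of this abstract (frequency-weighted dictionary coverage and continuous word scores) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny word list and its scores are invented, not taken from any published sentiment dictionary, and the word shift formula is one common way of decomposing a difference in average score, not necessarily the authors' exact definition.

```python
from collections import Counter

# Hypothetical continuous word scores on a 1-9 scale (illustrative values only).
SCORES = {"love": 8.4, "happy": 8.3, "good": 7.9, "rain": 5.0, "bad": 2.6, "hate": 2.3}


def scored_counts(text, scores=SCORES):
    """Count the tokens of `text` that appear in the dictionary; also return the token total."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t in scores), len(tokens)


def dictionary_sentiment(text, scores=SCORES):
    """Return (frequency-weighted mean score, fraction of tokens covered by the dictionary)."""
    counts, total = scored_counts(text, scores)
    covered = sum(counts.values())
    if covered == 0:
        return None, 0.0
    mean = sum(scores[w] * n for w, n in counts.items()) / covered
    return mean, covered / total


def word_shift(reference, comparison, scores=SCORES):
    """Per-word contributions to (comparison mean - reference mean), largest first."""
    ref_counts, _ = scored_counts(reference, scores)
    cmp_counts, _ = scored_counts(comparison, scores)
    ref_total, cmp_total = sum(ref_counts.values()), sum(cmp_counts.values())
    if ref_total == 0 or cmp_total == 0:
        raise ValueError("both texts need at least one word found in the dictionary")
    ref_mean = sum(scores[w] * n for w, n in ref_counts.items()) / ref_total
    shifts = {}
    for w in set(ref_counts) | set(cmp_counts):
        # Change in normalized frequency, weighted by how far the word's
        # score sits from the reference average.
        delta_p = cmp_counts[w] / cmp_total - ref_counts[w] / ref_total
        shifts[w] = (scores[w] - ref_mean) * delta_p
    return sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)


reference = "good rain bad rain"
comparison = "love good happy rain"
print(dictionary_sentiment(reference))
print(dictionary_sentiment(comparison))
for word, contribution in word_shift(reference, comparison):
    print(f"{word:>6}: {contribution:+.3f}")
```

    In this decomposition the per-word contributions sum to the difference between the two mean scores, which is what lets a word shift list explain why one text scores higher than another.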

  20. An indole alkaloid from a tribal folklore inhibits immediate early event in HSV-2 infected cells with therapeutic efficacy in vaginally infected mice.

    Directory of Open Access Journals (Sweden)

    Paromita Bag

    Full Text Available Herpes genitalis, caused by HSV-2, is an incurable genital ulcerative disease transmitted by sexual intercourse. The virus establishes life-long latency in sacral root ganglia and is reported to have a synergistic relationship with HIV-1 transmission. To date no effective vaccine is available, while the existing therapy frequently yields drug resistance, toxicity and treatment failure. Thus, there is a pressing need for a non-nucleotide antiviral agent from a traditional source. Based on ethnomedicinal use we have isolated a compound 7-methoxy-1-methyl-4,9-dihydro-3H-pyrido[3,4-b]indole (HM) from the traditional herb Ophiorrhiza nicobarica Balkr, and evaluated its efficacy on isolates of HSV-2 in vitro and in vivo. The cytotoxicity (CC50), effective concentrations (EC50) and the mode of action of HM were determined by MTT, plaque reduction, time-of-addition, immunofluorescence (IFA), Western blot, qRT-PCR, EMSA, supershift and co-immunoprecipitation assays; while the in vivo toxicity and efficacy were evaluated in BALB/c mice. The results revealed that HM possesses significant anti-HSV-2 activity with EC50 of 1.1-2.8 µg/ml, and selectivity index of >20. The time kinetics and IFA demonstrated that HM dose dependently inhibited 50-99% of HSV-2 infection at 1.5-5.0 µg/ml at 2-4 h post-infection. Further, HM was unable to inhibit viral attachment or penetration and had no synergistic interaction with acyclovir. Moreover, Western blot and qRT-PCR assays demonstrated that HM suppressed viral IE gene expression, while the EMSA and co-immunoprecipitation studies showed that HM interfered with the recruitment of LSD-1 by HCF-1. The in vivo studies revealed that HM at its virucidal concentration was nontoxic and reduced virus yield in the brain of HSV-2 infected mice in a concentration dependent manner, compared to vaginal tissues. Thus, our results suggest that HM can serve as a prototype to develop non-nucleotide antiviral lead targeting the viral IE

  1. Adaptive Text Entry for Mobile Devices

    DEFF Research Database (Denmark)

    Proschowsky, Morten Smidt

    The reduced size of many mobile devices makes it difficult to enter text with them. The text entry methods are often slow or complicated to use. This affects the performance and user experience of all applications and services on the device. This work introduces new easy-to-use text entry methods for mobile devices and a framework for adaptive context-aware language models. Based on analysis of current text entry methods, the requirements to the new text entry methods are established. Transparent User guided Prediction (TUP) is a text entry method for devices with one dimensional touch input. It can be touch sensitive wheels, sliders or similar input devices. The interaction design of TUP is done with a combination of high level task models and low level models of human motor behaviour. Three prototypes of TUP are designed and evaluated by more than 30 users. Observations from the evaluations are used...

  2. Planning Multisentential English Text Using Communicative Acts

    Science.gov (United States)

    1990-12-01

    Composition, Vol. XI in series Advances in Discourse Processing, Alex Publishing Corporation. de Joia, A. and Stenton, A. 1980. Terms in Linguistics: A Guide to... investigate how attentional constraints relate to text planning and linguistic realization. Subject terms: Natural Language Generation... surface form? 4. What is the relation of communicative intentions to text structure and surface form? 5. What effects can texts be designed to have

  3. A unified framework for evaluating the risk of re-identification of text de-identification tools.

    Science.gov (United States)

    Scaiano, Martin; Middleton, Grant; Arbuckle, Luk; Kolhatkar, Varada; Peyton, Liam; Dowling, Moira; Gipson, Debbie S; El Emam, Khaled

    2016-10-01

    It has become regular practice to de-identify unstructured medical text for use in research using automatic methods, the goal of which is to remove patient identifying information to minimize re-identification risk. The metrics commonly used to determine if these systems are performing well do not accurately reflect the risk of a patient being re-identified. We therefore developed a framework for measuring the risk of re-identification associated with textual data releases. We apply the proposed evaluation framework to a data set from the University of Michigan Medical School. Our risk assessment results are then compared with those that would be obtained using a typical contemporary micro-average evaluation of recall in order to illustrate the difference between the proposed evaluation framework and the current baseline method. We demonstrate how this framework compares against common measures of the re-identification risk associated with an automated text de-identification process. For the probability of re-identification using our evaluation framework we obtained a mean value for direct identifiers of 0.0074 and a mean value for quasi-identifiers of 0.0022. The 95% confidence intervals for these estimates were below the relevant thresholds. The threshold for direct identifier risk was based on previously used approaches in the literature. The threshold for quasi-identifiers was determined based on the context of the data release following commonly used de-identification criteria for structured data. Our framework attempts to correct for poorly distributed evaluation corpora, accounts for the data release context, and avoids the often optimistic assumptions that are made using the more traditional evaluation approach. It therefore provides a more realistic estimate of the true probability of re-identification. This framework should be used as a basis for computing re-identification risk in order to more realistically evaluate future text de-identification tools.
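
    The baseline mentioned in this abstract, micro-averaged recall, pools counts over all documents before dividing, so a few large, well-handled documents can hide small documents in which many identifiers were missed. The sketch below only illustrates that effect and is not the authors' risk framework; the function names and the per-document counts are hypothetical.

```python
from typing import List, Tuple

# Each tuple is (true_positives, false_negatives) for one document:
# identifiers correctly removed vs. identifiers missed (hypothetical counts).
Counts = List[Tuple[int, int]]


def micro_recall(counts: Counts) -> float:
    """Pool counts over all documents, then compute a single recall."""
    tp = sum(t for t, _ in counts)
    fn = sum(f for _, f in counts)
    return tp / (tp + fn) if (tp + fn) else 0.0


def macro_recall(counts: Counts) -> float:
    """Compute recall per document, then average the per-document values."""
    per_doc = [t / (t + f) for t, f in counts if (t + f)]
    return sum(per_doc) / len(per_doc) if per_doc else 0.0


# One long document with 100 identifiers and one short document with 2.
example = [(98, 2), (1, 1)]
print(round(micro_recall(example), 4))  # pooled recall
print(round(macro_recall(example), 4))  # average of per-document recalls
```

    In this toy input the pooled figure stays high even though half of the identifiers in the short document were missed, which is the kind of optimism the abstract attributes to the traditional evaluation.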

  4. BioC-compatible full-text passage detection for protein-protein interactions using extended dependency graph.

    Science.gov (United States)

    Peng, Yifan; Arighi, Cecilia; Wu, Cathy H; Vijay-Shanker, K

    2016-01-01

    There has been a large growth in the number of biomedical publications that report experimental results. Many of these results concern detection of protein-protein interactions (PPI). In BioCreative V, we participated in the BioC task and developed a PPI system to detect text passages with PPIs in the full-text articles. By adopting the BioC format, the output of the system can be seamlessly added to the biocuration pipeline with little effort required for the system integration. A distinctive feature of our PPI system is that it utilizes the extended dependency graph, an intermediate level of representation that attempts to abstract away syntactic variations in text. As a result, we are able to use only a limited set of rules to extract PPI pairs in the sentences, and additional rules to detect additional passages for PPI pairs. For evaluation, we used the 95 articles that were provided for the BioC annotation task. We retrieved the unique PPIs from the BioGRID database for these articles and show that our system achieves a recall of 83.5%. In order to evaluate the detection of passages with PPIs, we further annotated Abstract and Results sections of 20 documents from the dataset and show that an f-value of 80.5% was obtained. To evaluate the generalizability of the system, we also conducted experiments on AIMed, a well-known PPI corpus. We achieved an f-value of 76.1% for sentence detection and an f-value of 64.7% for unique PPI detection. Database URL: http://proteininformationresource.org/iprolink/corpora. © The Author(s) 2016. Published by Oxford University Press.

  5. Science and Technology Text Mining Basic Concepts

    National Research Council Canada - National Science Library

    Losiewicz, Paul

    2003-01-01

    ...). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods, such as nearest neighbor, relational learning models, and genetic...

  6. Using Unlabeled Data to Improve Text Classification

    National Research Council Canada - National Science Library

    Nigam, Kamal P

    2001-01-01

    .... This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers...

  7. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    Science.gov (United States)

    Schneider, Nadine; Fechner, Nikolas; Landrum, Gregory A; Stiefl, Nikolaus

    2017-08-28

    Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is no different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods. Handling those huge sets of molecules effectively is quite challenging and requires compromises that often come at the expense of the interpretability of the results. In order to find an intuitive and meaningful approach to organizing large molecular data sets, we adopted a probabilistic framework called "topic modeling" from the text-mining field. Here we present the first chemistry-related implementation of this method, which allows large molecule sets to be assigned to "chemical topics" and the relationships between those topics to be investigated. In this first study, we thoroughly evaluate this novel method in different experiments and discuss both its disadvantages and advantages. We show very promising results in reproducing human-assigned concepts using the approach to identify and retrieve chemical series from sets of molecules. We have also created an intuitive visualization of the chemical topics output by the algorithm. This is a huge benefit compared to other unsupervised machine-learning methods, like clustering, which are commonly used to group sets of molecules. Finally, we applied the new method to the 1.6 million molecules of the ChEMBL22 data set to test its robustness and efficiency. In about 1 h we built a 100-topic model of this large data set in which we could identify interesting topics like "proteins", "DNA", or "steroids". Along with this publication we provide our data sets and an open-source implementation of the new method (CheTo) which

  8. Ensino universitário, corporação e profissão: paradoxos e dilemas brasileiros College education, corporation, and profession: Brazilian dilemmas and paradoxes

    Directory of Open Access Journals (Sweden)

    Edson Nunes

    2007-06-01

    Full Text Available This article examines the relationship between higher education and professional occupation, based on data from the 2000 Demographic Census (IBGE), stressing occupations related to regulated professions. A large gap is seen between having a formal diploma and the actual occupation of graduates. It considers that the lack of correspondence between education and occupation is due mainly to the educational model, historically tied to regulated professions, which no longer meets the reality of Brazilian society.

  9. Classifying Written Texts Through Rhythmic Features

    NARCIS (Netherlands)

    Balint, Mihaela; Dascalu, Mihai; Trausan-Matu, Stefan

    2016-01-01

    Rhythm analysis of written texts focuses on literary analysis and it mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing texts in prosaic form pertaining to different genres. Our contribution is threefold. First, we define a set of rhythmic

  10. Text comprehension strategy instruction with poor readers

    NARCIS (Netherlands)

    Van den Bos, K.P.; Aarnoudse, C.C.; Brand-Gruwel, S.

    1998-01-01

    The goal of this study was to investigate the effects of teaching text comprehension strategies to children with decoding and reading comprehension problems and with a poor or normal listening ability. Two experiments are reported. Four text comprehension strategies, viz., question generation,

  11. Text Manipulation Techniques and Foreign Language Composition.

    Science.gov (United States)

    Walker, Ronald W.

    1982-01-01

    Discusses an approach to teaching second language composition which emphasizes (1) careful analysis of model texts from a limited, but well-defined perspective and (2) the application of text manipulation techniques developed by the word processing industry to student compositions. (EKN)

  12. Teachers' Texts in Culturally Responsive Teaching

    Science.gov (United States)

    Kesler, Ted

    2011-01-01

    In this paper, the author shares three teaching stories that demonstrate the social, cultural, political, and historical factors of all texts in specific interpretive communities. The author shows how the texts that comprised his curriculum constructed particular subject positions that inevitably included some students but marginalized and…

  13. Readability Revisited? The Implications of Text Complexity

    Science.gov (United States)

    Wray, David; Janan, Dahlia

    2013-01-01

    The concept of readability has had a variable history, moving from a position where it was considered as a very important topic for those responsible for producing texts and matching those texts to the abilities and needs of learners, to its current declining visibility in the education literature. Some important work has been coming from the USA…

  14. Tipster Text Phase 2 Architecture Design

    Science.gov (United States)

    1996-06-19

    TIPSTER Text Phase II Architecture Design Version 2.1p 19 June 1996 Ralph Grishman New York University grishman@cs.nyu.edu and the TIPSTER...1996 2. REPORT TYPE 3. DATES COVERED 00-00-1996 to 00-00-1996 4. TITLE AND SUBTITLE TIPSTER Text Phase II Architecture Design 5a. CONTRACT

  15. Using Digital Texts to Promote Fluent Reading

    Science.gov (United States)

    Thoermer, Andrea; Williams, Lunetta

    2012-01-01

    Fluency is a critical skill of adept readers. As listening to read alouds and performing Readers Theatre scripts are two prevalent strategies that can increase students' fluency skills, this article provides suggestions in using these strategies with digital texts through free, online resources. Digital texts can be accessed using a desktop,…

  16. Interest, Inferences, and Learning from Texts

    Science.gov (United States)

    Clinton, Virginia; van den Broek, Paul

    2012-01-01

    Topic interest and learning from texts have been found to be positively associated with each other. However, the reason for this positive association is not well understood. The purpose of this study is to examine a cognitive process, inference generation, that could explain the positive association between interest and learning from texts. In…

  17. Text Fabric: What, How, and Why

    NARCIS (Netherlands)

    Erwich, C.M.; Kingham, Cody

    Text-Fabric (TF) is a promising new framework for the Eep Talstra Center for Bible and Computer corpus plus (linguistic) annotations. TF is a Python 3.x software package that provides scientific, accessible and reproducible ways of processing Biblical Hebrew text data. It also allows sharing the

  18. An Intelligent System For Arabic Text Categorization

    NARCIS (Netherlands)

    Syiam, M.M.; Tolba, Mohamed F.; Fayed, Z.T.; Abdel-Wahab, Mohamed S.; Ghoniemy, Said A.; Habib, Mena Badieh

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. In this paper, an intelligent Arabic text categorization system is presented. Machine learning algorithms are used in this system. Many algorithms for stemming and

  19. Flexible frontiers for text division into rows

    Directory of Open Access Journals (Sweden)

    Dan L. Lacrămă

    2009-01-01

    Full Text Available This paper presents an original solution for flexibly dividing hand-written text into rows. Unlike the standard procedure, the proposed method avoids cutting off the extensions of isolated characters and reduces the recognition error rate in the final stage.

  20. Lexical Information in Memory for Text.

    Science.gov (United States)

    Hayes-Roth, Barbara

    Cued-recall and two-alternative, forced-choice recognition measures were used to evaluate subjects' retention of the specific wordings of studied texts. Results obtained after 10-minute and 24-hour retention intervals suggest that the studied wordings of texts are functional components of their memory representations. Theories that assume…

  1. Undergraduates' Text Messaging Language and Literacy Skills

    Science.gov (United States)

    Grace, Abbie; Kemp, Nenagh; Martin, Frances Heritage; Parrila, Rauno

    2014-01-01

    Research investigating whether people's literacy skill is being affected by the use of text messaging language has produced largely positive results for children, but mixed results for adults. We asked 150 undergraduate university students in Western Canada and 86 in South Eastern Australia to supply naturalistic text messages and to complete…

  2. Language Skills in Classical Chinese Text Comprehension

    Science.gov (United States)

    Lau, Kit-ling

    2018-01-01

    This study used both quantitative and qualitative methods to explore the role of lower- and higher-level language skills in classical Chinese (CC) text comprehension. A CC word and sentence translation test, text comprehension test, and questionnaire were administered to 393 Secondary Four students; and 12 of these were randomly selected to…

  3. Text Structure and Retention of Prose.

    Science.gov (United States)

    Zimmer, John W.

    1985-01-01

    The effects of text structure were studied using two kinds of reading materials: a standard text with headings and illustrations, as well as a nonstructured manuscript. The manuscript readers scored higher on delayed tests, generated more relevant ideas, and wrote better essays both immediately and after a delay. (Author/GDC)

  4. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…

  5. Application of LSP Texts in Translator Training

    Science.gov (United States)

    Ilynska, Larisa; Smirnova, Tatjana; Platonova, Marina

    2017-01-01

    The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate…

  6. Modeling text with generalizable Gaussian mixtures

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Sigurdsson, Sigurdur; Kolenda, Thomas

    2000-01-01

    We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss...

  7. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed considerable variability. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  8. Texts, Transmissions, Receptions. Modern Approaches to Narratives

    NARCIS (Netherlands)

    Lardinois, A.P.M.H.; Levie, S.A.; Hoeken, H.; Lüthy, C.H.

    2015-01-01

    The papers collected in this volume study the function and meaning of narrative texts from a variety of perspectives. The word 'text' is used here in the broadest sense of the term: it denotes literary books, but also oral tales, speeches, newspaper articles and comics. One of the purposes of this

  9. A text in Romani from 1622

    DEFF Research Database (Denmark)

    Bakker, Peter

    2015-01-01

    This is a reprint of a 2012 article: A new old text in Romani: Lord's Prayer, 1622. International Journal of Romani Language and Culture 2 (2011): 193-212.

  10. Where Full-Text Is Viable.

    Science.gov (United States)

    Cotton, P. L.

    1987-01-01

    Defines two types of online databases: source, referring to those intended to be complete in themselves, whether full-text or abstracts; and bibliographic, meaning those that are not complete. Predictions are made about the future growth rate of these two types of databases, as well as full-text versus abstract databases. (EM)

  11. The Medline/full-text research project.

    Science.gov (United States)

    McKinin, E J; Sievert, M; Johnson, E D; Mitchell, J A

    1991-05-01

    This project was designed to test the relative efficacy of index terms and full-text for the retrieval of documents in those MEDLINE journals for which full-text searching was also available. The full-text files used were MEDIS from Mead Data Central and CCML from BRS Information Technologies. One hundred clinical medical topics were searched in these two files as well as the MEDLINE file to accumulate the necessary data. It was found that full-text identified significantly more relevant articles than did the indexed file, MEDLINE. The full-text searches, however, lacked the precision of searches done in the indexed file. Most relevant items missed in the full-text files, but identified in MEDLINE, were missed because the searcher failed to account for some aspect of natural language, used a logical or positional operator that was too restrictive, or included a concept which was implied, but not expressed in the natural language. Very few of the unique relevant full-text citations would have been retrieved by title or abstract alone. Finally, as of July, 1990 the more current issue of a journal was just as likely to appear in MEDLINE as in one of the full-text files.

  12. Ontology Assisted Formal Specification Extraction from Text

    Directory of Open Access Journals (Sweden)

    Andreea Mihis

    2010-12-01

    Full Text Available In the field of knowledge processing, ontologies are the most important means. They make it possible for the computer to better understand natural language and to make judgments. In this paper, a method which uses ontologies in the semi-automatic extraction of formal specifications from a natural language text is proposed.

  13. NOTICING HYBRID RECASTS IN TEXT CHAT

    Directory of Open Access Journals (Sweden)

    Mark J. Oliver

    2016-12-01

    Full Text Available This study examined ten EFL learners’ noticing of the corrective nature of a form of text-based SCMC (text chat) feedback that combined a recast of a grammatical error with metalinguistic information. The feedback, termed a hybrid recast, was provided by a native-speaker interlocutor during two text chat activities: a spot-the-difference task and a picture-ordering task. Data were collected in two ways: analysis of task-based dyadic text chat interaction in which uptake was used as an indicator of learner noticing, and a post-task questionnaire containing questions that identified evidence of learner noticing. Interaction analysis showed that learners responded to almost two thirds of the hybrid recasts with uptake. In addition, every learner provided evidence that they had correctly perceived at least some of the hybrid recasts as corrective in their post-task questionnaire responses.

  14. EXPLORING STUDENTS' DIFFICULTIES IN READING ACADEMIC TEXTS

    Directory of Open Access Journals (Sweden)

    Ira Ernawati

    2017-04-01

    Full Text Available Academic texts play an important role for university students. However, those texts are considered difficult. This study is intended to investigate students' difficulties in reading academic texts. A qualitative approach was employed, with a case study design. The participants were ten fifth-semester students from the CLS: EE (Classroom Language and Strategy: Explaining and Exemplifying) class who were selected by purposive sampling. The data were gathered from students' journal reflections, observation, and interviews. The findings show that the students encountered reading difficulties arising from textual factors, namely vocabulary, comprehending specific information, text organization, and grammar, and from human factors, including background knowledge, mood, laziness, and time constraints.

  15. Frontiers of biomedical text mining: current progress

    Science.gov (United States)

    Zweigenbaum, Pierre; Demner-Fushman, Dina; Yu, Hong; Cohen, Kevin B.

    2008-01-01

    It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or ‘BioNLP’ in general, focusing primarily on papers published within the past year. PMID:17977867

  16. Learning from text benefits from enactment.

    Science.gov (United States)

    Cutica, Ilaria; Ianì, Francesco; Bucciarelli, Monica

    2014-10-01

    Classical studies on enactment have highlighted the beneficial effects of gestures performed in the encoding phase on memory for words and sentences, for both adults and children. In the present investigation, we focused on the role of enactment for learning from scientific texts among primary-school children. We assumed that enactment would favor the construction of a mental model of the text, and we verified the derived predictions that gestures at the time of encoding would result in greater numbers of correct recollections and discourse-based inferences at recall, as compared to no gestures (Exp. 1), and in a bias to confound paraphrases of the original text with the verbatim text in a recognition test (Exp. 2). The predictions were confirmed; hence, we argue in favor of a theoretical framework that accounts for the beneficial effects of enactment on memory for texts.

  17. Rhetorical structure theory and text analysis

    Science.gov (United States)

    Mann, William C.; Matthiessen, Christian M. I. M.; Thompson, Sandra A.

    1989-11-01

    Recent research on text generation has shown that there is a need for stronger linguistic theories that tell in detail how texts communicate. The prevailing theories are very difficult to compare, and it is also very difficult to see how they might be combined into stronger theories. To make comparison and combination a bit more approachable, we have created a book which is designed to encourage comparison. A dozen different authors or teams, all experienced in discourse research, are given exactly the same text to analyze. The text is an appeal for money by a lobbying organization in Washington, DC. It informs, stimulates and manipulates the reader in a fascinating way. The joint analysis is far more insightful than any one team's analysis alone. This paper is our contribution to the book. Rhetorical Structure Theory (RST), the focus of this paper, is a way to account for the functional potential of text, its capacity to achieve the purposes of speakers and produce effects in hearers. It also shows a way to distinguish coherent texts from incoherent ones, and identifies consequences of text structure.

  18. LITURGICAL TEXT IN RUSSIAN LITERATURE. PROBLEM STATEMENT

    Directory of Open Access Journals (Sweden)

    Avetis Serezhaevich Seropyan

    2012-11-01

    Full Text Available The article analyses artistic expressions of liturgical language in the literary text and its interaction with the Holy Tradition. Many Russian authors knew the liturgical text well. Studying it reveals the crucial meaning of the Gospel and liturgical texts (as part of the Holy Tradition) for Russian literature. Authors saw the essence of every phenomenon in the word for it, and the nature of God in His name. Some ideas and sayings of the authors and their characters find their sources in liturgical texts. The article focuses on liturgical sources of some characters' commemorations and invocations, as well as poetical topics of the symbolists, Dostoevsky's famous dictum on beauty which will save the world (The Idiot), etc. Deciphering this liturgical code will help us learn and comprehend the hidden endless meaning of a literary text. The specific feature of Russian literature is its pursuit of the spiritual liturgical exploration of the world, an exploration in which truth takes shape and thus becomes real in both literary text and history.

  19. Application of LSP texts in translator training

    Directory of Open Access Journals (Sweden)

    Larisa Ilynska

    2017-06-01

    Full Text Available The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate the efficiency of this methodology in developing thematic, linguistic and cultural competences of the students, following Bloom’s revised taxonomy and European Master in Translation Network (EMT) translator training competences. The methodology has been tested on the students of a professional Master study programme called Technical Translation implemented by the Institute of Applied Linguistics, Riga Technical University, Latvia. The group of students included representatives of different nationalities, translating from English into Latvian, Russian and French. Analysis of popular LSP texts provides an opportunity to structure student background knowledge and expand it to account for linguistic innovation. Application of popular LSP texts instead of purely technical or scientific texts characterised by neutral style and rigid genre conventions provides an opportunity for student translators to develop advanced text processing and decoding skills, to develop awareness of expressive resources of the source and target languages and to develop understanding of socio-pragmatic language use.

  20. Figure-associated text summarization and evaluation.

    Directory of Open Access Journals (Sweden)

    Balaji Polepalli Ramesh

    Full Text Available Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving an F1 score of 0.66 and a ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
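
    The precision, recall and F1 measures named in this abstract can be sketched for a single figure as a comparison between the sentences a system selects and the sentences an annotator marked. This is a generic illustration under assumed inputs (sets of sentence indices), not the FigSum+ evaluation code; the function name and the example sets are hypothetical.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[int], gold: Set[int]) -> Tuple[float, float, float]:
    """Compare selected sentence indices against annotated (gold) indices."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical figure: the system picks sentences 1-4, the annotator picked 2, 3 and 5.
p, r, f1 = precision_recall_f1({1, 2, 3, 4}, {2, 3, 5})
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

    ROUGE scores, also reported in the abstract, measure n-gram overlap between the generated and reference summaries and are not covered by this sketch.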