WorldWideScience

Sample records for linguistic annotation workshop

  1. Evaluating automatically annotated treebanks for linguistic research

    NARCIS (Netherlands)

    Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.

    2016-01-01

    This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group

  2. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  3. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    Directory of Open Access Journals (Sweden)

    Qiandong Zeng

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  4. Preprocessing Greek Papyri for Linguistic Annotation

    Directory of Open Access Journals (Sweden)

    Vierros, Marja

    2017-08-01

    Greek documentary papyri form an important direct source for Ancient Greek. It has been exploited surprisingly little in Greek linguistics due to a lack of good tools for searching linguistic structures. This article presents a new tool and digital platform, “Sematia”, which enables transforming the digital texts available in TEI EpiDoc XML format to a format which can be morphologically and syntactically annotated (treebanked), and where the user can add new metadata concerning the text type, writer and handwriting of each act of writing. An important aspect in this process is to take into account the original surviving writing vs. the standardization of language and supplements made by the editors. This is performed by creating two different layers of the same text. The platform is in its early development phase. Ongoing and future developments, such as tagging linguistic variation phenomena as well as queries performed within Sematia, are discussed at the end of the article.

  5. Microsyntactic Annotation of Corpora and its Use in Computational Linguistics Tasks

    Directory of Open Access Journals (Sweden)

    Iomdin Leonid

    2017-12-01

    Microsyntax is a linguistic discipline dealing with idiomatic elements whose important properties are strongly related to syntax. In a way, these elements may be viewed as transitional entities between the lexicon and the grammar, which explains why they are often underrepresented in both of these resource types: the lexicographer fails to see such elements as full-fledged lexical units, while the grammarian finds them too specific to justify the creation of individual well-developed rules. As a result, such elements are poorly covered by linguistic models used in advanced modern computational linguistic tasks like high-quality machine translation or deep semantic analysis. A possible way to mend the situation and improve the coverage and adequate treatment of microsyntactic units in linguistic resources is to develop corpora with microsyntactic annotation, closely linked to specially designed lexicons. The paper shows how this task is solved in the deeply annotated corpus of Russian, SynTagRus.

  6. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  7. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  8. Workshops for the Handicapped; An Annotated Bibliography - No. 6.

    Science.gov (United States)

    Perkins, Dorothy C., Comp.; And Others

    An annotated bibliography of workshops for the handicapped covers the literature on work programs for the period July, 1968 through June, 1969. One hundred and fifty four publications were reviewed; the number of articles on administration, management, and planning of facilities and programs has increased since the last edition. (Author/RJ)

  9. Reasoning with Annotations of Texts

    OpenAIRE

    Ma, Yue; Lévy, François; Ghimire, Sudeep

    2011-01-01

    Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  10. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material

  11. Data Acquisition and Linguistic Resources

    Science.gov (United States)

    Strassel, Stephanie; Christianson, Caitlin; McCary, John; Staderman, William; Olive, Joseph

    All human language technology demands substantial quantities of data for system training and development, plus stable benchmark data to measure ongoing progress. While creation of high quality linguistic resources is both costly and time consuming, such data has the potential to profoundly impact not just a single evaluation program but language technology research in general. GALE's challenging performance targets demand linguistic data on a scale and complexity never before encountered. Resources cover multiple languages (Arabic, Chinese, and English) and multiple genres -- both structured (newswire and broadcast news) and unstructured (web text, including blogs and newsgroups, and broadcast conversation). These resources include significant volumes of monolingual text and speech, parallel text, and transcribed audio combined with multiple layers of linguistic annotation, ranging from word aligned parallel text and Treebanks to rich semantic annotation.

  12. Annotating abstract pronominal anaphora in the DAD project

    DEFF Research Database (Denmark)

    Navarretta, Costanza; Olsen, Sussi Anni

    2008-01-01

    In this paper we present an extension of the MATE/GNOME annotation scheme for anaphora (Poesio 2004) which accounts for abstract anaphora in Danish and Italian. By abstract anaphora it is here meant pronouns whose linguistic antecedents are verbal phrases, clauses and discourse segments. The extended scheme, which we call the DAD annotation scheme, allows the annotation of information about abstract anaphora which is important to investigate their use, see Webber (1988), Gundel et al. (2003), Navarretta (2004), and which can influence their automatic treatment. Intercoder agreement scores obtained by applying the DAD annotation scheme on texts and dialogues in the two languages are given and show that the information proposed in the scheme can be recognised in a reliable way.

  13. Semantics, contrastive linguistics and parallel corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska

    2014-09-01

    In view of the ambiguity of the term “semantics”, the author shows the differences between the traditional lexical semantics and the contemporary semantics in the light of various semantic schools. She examines semantics differently in connection with contrastive studies where the description must necessarily go from the meaning towards the linguistic form, whereas in traditional contrastive studies the description proceeded from the form towards the meaning. This requirement regarding theoretical contrastive studies necessitates construction of a semantic interlanguage, rather than only singling out universal semantic categories expressed with various language means. Such studies can be strongly supported by parallel corpora. However, in order to make them useful for linguists in manual and computer translations, as well as in the development of dictionaries, including online ones, we need not only formal, often automatic, annotation of texts, but also semantic annotation, which is unfortunately manual. In the article we focus on semantic annotation concerning time, aspect and quantification of names and predicates in the whole semantic structure of the sentence on the example of the “Polish-Bulgarian-Russian parallel corpus”.

  14. 06491 Summary -- Digital Historical Corpora- Architecture, Annotation, and Retrieval

    OpenAIRE

    Burnard, Lou; Dobreva, Milena; Fuhr, Norbert; Lüdeling, Anke

    2007-01-01

    The seminar "Digital Historical Corpora" brought together scholars from (historical) linguistics, (historical) philology, computational linguistics and computer science who work with collections of historical texts. The issues that were discussed include digitization, corpus design, corpus architecture, annotation, search, and retrieval.

  15. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    This work elaborates the semi-semantic part of speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label the part of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part of speech labels. The corpus contains text of local & international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise finally contributed a part of speech annotation to the URDU.KON-TB treebank. Twenty-two main part of speech categories are divided into subcategories, which include the morphological and semantic information encoded in them. This article mainly reports the annotation guidelines; however, it also briefly describes the development of the URDU.KON-TB treebank, which includes the raw corpus collection, the design and application of the annotation scheme and, finally, its statistical evaluation and results. The guidelines presented here will be useful for the linguistic community to annotate sentences not only for the national language Urdu but also for other indigenous languages like Punjabi, Sindhi, Pashto, etc.

  16. Developing Annotation Solutions for Online Data Driven Learning

    Science.gov (United States)

    Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.

    2009-01-01

    Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…

  17. Evaluating stance-annotated sentences from the Brexit Blog Corpus: A quantitative linguistic analysis

    Directory of Open Access Journals (Sweden)

    Simaki Vasiliki

    2018-03-01

    This paper offers a formally driven quantitative analysis of stance-annotated sentences in the Brexit Blog Corpus (BBC). Our goal is to identify features that determine the formal profiles of six stance categories (contrariety, hypotheticality, necessity, prediction, source of knowledge and uncertainty) in a subset of the BBC. The study has two parts: firstly, it examines a large number of formal linguistic features, such as punctuation, words and grammatical categories that occur in the sentences in order to describe the specific characteristics of each category, and secondly, it compares characteristics in the entire data set in order to determine stance similarities in the data set. We show that among the six stance categories in the corpus, contrariety and necessity are the most discriminative ones, with the former using longer sentences, more conjunctions, more repetitions and shorter forms than the sentences expressing other stances. Necessity has longer lexical forms but shorter sentences, which are syntactically more complex. We show that stance in our data set is expressed in sentences with around 21 words per sentence. The sentences consist mainly of alphabetical characters forming a varied vocabulary without special forms, such as digits or special characters.

  18. Workshop Proceedings

    DEFF Research Database (Denmark)

    2012-01-01

    , the main focus there is on spoken languages in their written and spoken forms. This series of workshops, however, offers a forum for researchers focussing on sign languages. For the third time, the workshop had sign language corpora as its main topic. This time, the focus was on the interaction between...... corpus and lexicon. More than half of the papers presented contribute to this topic. Once again, the papers at this workshop clearly identify the potentials of even closer cooperation between sign linguists and sign language engineers, and we think it is events like this that contribute a lot to a better...

  19. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...... and metonymic. We have conducted an analysis in English, Danish and Spanish. Later on, we have tried to replicate the human judgments by means of unsupervised and semi-supervised sense prediction. The automatic sense-prediction systems have been unable to find empiric evidence for the underspecified sense, even...

  20. A Set of Annotation Interfaces for Alignment of Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Singh Anil Kumar

    2014-09-01

    Annotation interfaces for parallel corpora which fit in well with other tools can be very useful. We describe a set of annotation interfaces which fulfill this criterion. This set includes a sentence alignment interface, two different word or word group alignment interfaces and an initial version of a parallel syntactic annotation alignment interface. These tools can be used for manual alignment, or they can be used to correct automatic alignments. Manual alignment can be performed in combination with certain kinds of linguistic annotation. Most of these interfaces use a representation called the Shakti Standard Format that has been found to be very robust and has been used for large and successful projects. It ties together the different interfaces, so that the data created by them is portable across all tools which support this representation. The existence of a query language for data stored in this representation makes it possible to build tools that allow easy search and modification of annotated parallel data.

  1. Network workshop

    DEFF Research Database (Denmark)

    Bruun, Jesper; Evans, Robert Harry

    2014-01-01

    This paper describes the background for, realisation of and author reflections on a network workshop held at ESERA2013. As a new research area in science education, networks offer a unique opportunity to visualise and find patterns and relationships in complicated social or academic network data....... These include student relations and interactions and epistemic and linguistic networks of words, concepts and actions. Network methodology has already found use in science education research. However, while networks hold the potential for new insights, they have not yet found wide use in the science education...... research community. With this workshop, participants were offered a way into network science based on authentic educational research data. The workshop was constructed as an inquiry lesson with emphasis on user autonomy. Learning activities had participants choose to work with one of two cases of networks...

  2. WORKSHOPS FOR THE HANDICAPPED, AN ANNOTATED BIBLIOGRAPHY--NO. 3.

    Science.gov (United States)

    PERKINS, DOROTHY C.; AND OTHERS

    THESE 126 ANNOTATIONS ARE THE THIRD VOLUME OF A CONTINUING SERIES OF BIBLIOGRAPHIES LISTING ARTICLES APPEARING IN JOURNALS AND CONFERENCE, RESEARCH, AND PROJECT REPORTS. LISTINGS INCLUDE TESTS, TEST RESULTS, STAFF TRAINING PROGRAMS, GUIDES FOR COUNSELORS AND TEACHERS, AND ARCHITECTURAL PLANNING, AND RELATE TO THE MENTALLY RETARDED, EMOTIONALLY…

  3. Planetarium Educator's Workshop Guide. International Planetarium Society Special Report No. 10.

    Science.gov (United States)

    Friedman, Alan; And Others

    Presented is a workshop guide for planetarium educators. Seven modules and four appendices focus on organizational patterns, learning theories, questioning strategies, activities for the planetarium, and incorporating all of the above into teaching. The four appendices include a list of the 1978 workshop participants, an annotated bibliography for…

  4. Annotating patient clinical records with syntactic chunks and named entities: the Harvey Corpus.

    Science.gov (United States)

    Savkov, Aleksandar; Carroll, John; Koeling, Rob; Cassell, Jackie

    The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.

  5. "It's worth our time": a model of culturally and linguistically supportive professional development for K-12 STEM educators

    Science.gov (United States)

    Charity Hudley, Anne H.; Mallinson, Christine

    2017-09-01

    Professional development on issues of language and culture is often separate from professional development on issues related to STEM education, resulting in linguistic and cultural gaps in K-12 STEM pedagogy and practice. To address this issue, we have designed a model of professional development in which we work with educators to build cultural and linguistic competence and to disseminate information about how educators view the relevance of language, communication, and culture to STEM teaching and learning. We describe the design and facilitation of our model of culturally and linguistically responsive professional development, grounded in theories of multicultural education and culturally supportive teaching, through professional development workshops to 60 K-12 STEM educators from schools in Maryland and Virginia that serve African American students. Participants noted that culturally and linguistically responsive approaches had yet to permeate their K-12 STEM settings, which they identified as a critical challenge to effectively teaching and engaging African-American students. Based on pre-surveys, workshops were tailored to participants' stated needs for information on literacy (e.g., disciplinary literacies and discipline-specific jargon), cultural conflict and mismatch (e.g., student-teacher miscommunication), and linguistic bias in student assessment (e.g., test design). Educators shared feedback via post-workshop surveys, and a subset of 28 participants completed in-depth interviews and a focus group. Results indicate the need for further implementation of professional development such as ours that address linguistic and cultural issues, tailored for K-12 STEM educators. Although participants in this study enumerated several challenges to meeting this need, they also identified opportunities for collaborative solutions that draw upon teacher expertise and are integrated with curricula across content areas.

  6. Linguistic measures of chemical diversity and the "keywords" of molecular collections.

    Science.gov (United States)

    Woźniak, Michał; Wołos, Agnieszka; Modrzyk, Urszula; Górski, Rafał L; Winkowski, Jan; Bajczyk, Michał; Szymkuć, Sara; Grzybowski, Bartosz A; Eder, Maciej

    2018-05-15

    Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections ("corpora"), including those deposited on the Internet - indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms extracting most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic "chemical words" that span more than traditional functional groups and, instead, look at common structural fragments molecules share. Using these words, it is possible to quantify the diversity of chemical collections/databases in new ways and to define molecular "keywords" by which such collections are best characterized and annotated.

  7. Feeling Expression Using Avatars and Its Consistency for Subjective Annotation

    Science.gov (United States)

    Ito, Fuyuko; Sasaki, Yasunari; Hiroyasu, Tomoyuki; Miki, Mitsunori

    Consumer Generated Media (CGM) is growing rapidly and the amount of content is increasing. However, it is often difficult for users to extract important contents and the existence of contents recording their experiences can easily be forgotten. As there are no methods or systems to indicate the subjective value of the contents or ways to reuse them, subjective annotation appending subjectivity, such as feelings and intentions, to contents is needed. Representation of subjectivity depends on not only verbal expression, but also nonverbal expression. Linguistically expressed annotation, typified by collaborative tagging in social bookmarking systems, has come into widespread use, but there is no system of nonverbally expressed annotation on the web. We propose the utilization of controllable avatars as a means of nonverbal expression of subjectivity, and confirmed the consistency of feelings elicited by avatars over time for an individual and in a group. In addition, we compared the expressiveness and ease of subjective annotation between collaborative tagging and controllable avatars. The result indicates that the feelings evoked by avatars are consistent in both cases, and using controllable avatars is easier than collaborative tagging for representing feelings elicited by contents that do not express meaning, such as photos.

  8. Preface to Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)

    NARCIS (Netherlands)

    Krahmer, E.; Krahmer, E.; Theune, Mariet

    We are pleased to present the Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009). ENLG 2009 was held in Athens, Greece, as a workshop at the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009). Following our call, we

  9. Annotating public fungal ITS sequences from the built environment according to the MIxS-Built Environment standard – a report from a May 23-24, 2016 workshop (Gothenburg, Sweden)

    Directory of Open Access Journals (Sweden)

    Kessy Abarenkov

    2016-09-01

    Recent molecular studies have identified substantial fungal diversity in indoor environments. Fungi and fungal particles have been linked to a range of potentially unwanted effects in the built environment, including asthma, decay of building materials, and food spoilage. The study of the built mycobiome is hampered by a number of constraints, one of which is the poor state of the metadata annotation of fungal DNA sequences from the built environment in public databases. In order to enable precise interrogation of such data – for example, “retrieve all fungal sequences recovered from bathrooms” – a workshop was organized at the University of Gothenburg (May 23-24, 2016) to annotate public fungal barcode (ITS) sequences according to the MIxS-Built Environment annotation standard (http://gensc.org/mixs/). The 36 participants assembled a total of 45,488 data points from the published literature, including the addition of 8,430 instances of countries of collection from a total of 83 countries, 5,801 instances of building types, and 3,876 instances of surface-air contaminants. The results were implemented in the UNITE database for molecular identification of fungi (http://unite.ut.ee) and were shared with other online resources. Data obtained from human/animal pathogenic fungi will furthermore be verified on culture based metadata for subsequent inclusion in the ISHAM-ITS database (http://its.mycologylab.org).

  10. Discovering and annotating fish early life-stage (FELS) adverse outcome pathways: Putting the research strategy into practice

    Science.gov (United States)

    In May 2012, a HESI-sponsored expert workshop yielded a proposed research strategy for systematically discovering, characterizing, and annotating fish early life-stage (FELS) adverse outcome pathways (AOPs) as well as prioritizing AOP development in light of current restrictions ...

  11. t4 Workshop Report*

    Science.gov (United States)

    Kleensang, Andre; Maertens, Alexandra; Rosenberg, Michael; Fitzpatrick, Suzanne; Lamb, Justin; Auerbach, Scott; Brennan, Richard; Crofton, Kevin M.; Gordon, Ben; Fornace, Albert J.; Gaido, Kevin; Gerhold, David; Haw, Robin; Henney, Adriano; Ma’ayan, Avi; McBride, Mary; Monti, Stefano; Ochs, Michael F.; Pandey, Akhilesh; Sharan, Roded; Stierum, Rob; Tugendreich, Stuart; Willett, Catherine; Wittwehr, Clemens; Xia, Jianguo; Patton, Geoffrey W.; Arvidson, Kirk; Bouhifd, Mounir; Hogberg, Helena T.; Luechtefeld, Thomas; Smirnova, Lena; Zhao, Liang; Adeleye, Yeyejide; Kanehisa, Minoru; Carmichael, Paul; Andersen, Melvin E.; Hartung, Thomas

    2014-01-01

    Summary Despite wide-spread consensus on the need to transform toxicology and risk assessment in order to keep pace with technological and computational changes that have revolutionized the life sciences, there remains much work to be done to achieve the vision of toxicology based on a mechanistic foundation. A workshop was organized to explore one key aspect of this transformation – the development of Pathways of Toxicity (PoT) as a key tool for hazard identification based on systems biology. Several issues were discussed in depth in the workshop: The first was the challenge of formally defining the concept of a PoT as distinct from, but complementary to, other toxicological pathway concepts such as mode of action (MoA). The workshop came up with a preliminary definition of PoT as “A molecular definition of cellular processes shown to mediate adverse outcomes of toxicants”. It is further recognized that normal physiological pathways exist that maintain homeostasis and these, sufficiently perturbed, can become PoT. Second, the workshop sought to define the adequate public and commercial resources for PoT information, including data, visualization, analyses, tools, and use-cases, as well as the kinds of efforts that will be necessary to enable the creation of such a resource. Third, the workshop explored ways in which systems biology approaches could inform pathway annotation, and which resources are needed and available that can provide relevant PoT information to the diverse user communities. PMID:24127042

  12. Final report: 'Rhodopseudomonas palustris' genome workshop to be held in Spring of 2001; FINAL

    International Nuclear Information System (INIS)

    Harwood, Caroline S.

    2002-01-01

    The 'Rhodopseudomonas palustris' genome workshop took place in Iowa City on April 6-8, 2001. The purpose of the meeting was to instruct members of the annotation working group in approaches to accomplishing the 'human' phase of the 'R. palustris' genome annotation. A partial draft of a paper describing the 'Rhodopseudomonas palustris' genome has been written, and a full version of the paper should be ready for submission by the end of summer 2002.

  13. Cognitive aspects in games workshops for learning a foreign language

    Directory of Open Access Journals (Sweden)

    Claudia Ferrareto Lopes

    2014-08-01

    Full Text Available The goal of the study was to analyze the cognitive aspects related to learning English as a foreign language, by means of games workshops with 6th-grade elementary school students from a state school in Londrina. The paper is grounded in Piagetian theory and is a descriptive-interpretative study with a qualitative perspective. Two guiding questions motivate the study: what is the role of games workshops in learning English as a foreign language? In what way do the cognitive processes unfold in the games workshops for learning English? To meet the proposed goals, workshops were implemented with games containing the linguistic contents studied in English classes. The games workshops enabled the observation and analysis of the cognitive aspects involved in learning a foreign language. Results show that the games workshops promote student participation, motivating action and output, revealing gaps in knowledge, and prompting equilibration processes. Subjects are asked to produce outputs in response to the games' demands, thus evoking know-how as well as reflection on their own products, which suggests a process of conscious awareness.

  14. Linguistic Engineering and Linguistic of Engineering: Adaptation of Linguistic Paradigm for Circumstance of Engineering Epoch

    OpenAIRE

    Natalya Halina

    2014-01-01

    The article is devoted to the problems of linguistic knowledge in the Engineering Epoch. The Engineering Epoch is a time of adaptation to information flows through knowledge management. The system of adaptation mechanisms is connected with linguistics and linguistic technologies, which take shape in new linguistic patterns: Linguistic Engineering and Linguistic of Engineering.

  15. Mining of relations between proteins over biomedical scientific literature using a deep-linguistic approach.

    Science.gov (United States)

    Rinaldi, Fabio; Schneider, Gerold; Kaljurand, Kaarel; Hess, Michael; Andronis, Christos; Konstandi, Ourania; Persidis, Andreas

    2007-02-01

    The number of new discoveries (as published in the scientific literature) in the biomedical area is growing at an exponential rate. This growth makes it very difficult to filter the most relevant results, and thus the extraction of the core information becomes very expensive. Therefore, there is a growing interest in text processing approaches that can deliver selected information from scientific publications, which can limit the amount of human intervention normally needed to gather those results. This paper presents and evaluates an approach aimed at automating the process of extracting functional relations (e.g. interactions between genes and proteins) from scientific literature in the biomedical domain. The approach, using a novel dependency-based parser, is based on a complete syntactic analysis of the corpus. We have implemented a state-of-the-art text mining system for biomedical literature, based on a deep-linguistic, full-parsing approach. The results are validated on two different corpora: the manually annotated genomics information access (GENIA) corpus and the automatically annotated Arabidopsis thaliana circadian rhythms (ATCR) corpus. We show how a deep-linguistic approach (contrary to common belief) can be used in a real world text mining application, offering high-precision relation extraction, while at the same time retaining a sufficient recall.

  16. Estimating the annotation error rate of curated GO database sequence annotations

    Directory of Open Access Journals (Sweden)

    Brown Alfred L

    2007-05-01

    Full Text Available Background: Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results: We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%. Conclusion: While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.
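
    The general idea of injecting errors at known rates and fitting a regression can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the authors' model: it assumes that two matched annotations agree only when both are correct, that errors are independent, and that injected errors compound with the unknown base error rate, so that sqrt(agreement) = (1 - base_error) * (1 - injected_rate). All names and the simulated base rate of 0.20 are hypothetical.

```python
import math
import random


def simulate_agreement(base_error, injected_rate, n_pairs, rng):
    """Simulate the fraction of matched pairs in which both annotations are correct."""
    # Probability that a single annotation survives both error sources (toy assumption).
    p_correct = (1.0 - base_error) * (1.0 - injected_rate)
    agree = 0
    for _ in range(n_pairs):
        if rng.random() < p_correct and rng.random() < p_correct:
            agree += 1
    return agree / n_pairs


def estimate_base_error(rates, agreements):
    """Least-squares fit through the origin of sqrt(agreement) against (1 - rate).

    Under the toy model the slope equals (1 - base_error).
    """
    xs = [1.0 - rate for rate in rates]
    ys = [math.sqrt(agreement) for agreement in agreements]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 1.0 - slope


rng = random.Random(0)          # fixed seed so the sketch is repeatable
TRUE_BASE_ERROR = 0.20          # hypothetical value, used only to drive the simulation
rates = [0.0, 0.1, 0.2, 0.3, 0.4]
agreements = [simulate_agreement(TRUE_BASE_ERROR, r, 20000, rng) for r in rates]
estimate = estimate_base_error(rates, agreements)
print(f"estimated base error rate: {estimate:.3f}")
```

    In this toy setting the regression recovers the base rate because the injected rates are known exactly; real annotation data would need a model of how precision responds to errors, which the abstract does not specify.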

  17. Taxonomic annotation of public fungal ITS sequences from the built environment – a report from an April 10–11, 2017 workshop (Aberdeen, UK)

    Directory of Open Access Journals (Sweden)

    R. Henrik Nilsson

    2018-01-01

    Full Text Available Recent DNA-based studies have shown that the built environment is surprisingly rich in fungi. These indoor fungi – whether transient visitors or more persistent residents – may hold clues to the rising levels of human allergies and other medical and building-related health problems observed globally. The taxonomic identity of these fungi is crucial in such pursuits. Molecular identification of the built mycobiome is no trivial undertaking, however, given the large number of unidentified, misidentified, and technically compromised fungal sequences in public sequence databases. In addition, the sequence metadata required to make informed taxonomic decisions – such as country and host/substrate of collection – are often lacking even from reference and ex-type sequences. Here we report on a taxonomic annotation workshop (April 10–11, 2017) organized at the James Hutton Institute/University of Aberdeen (UK) to facilitate reproducible studies of the built mycobiome. The 32 participants went through public fungal ITS barcode sequences related to the built mycobiome for taxonomic and nomenclatural correctness, technical quality, and metadata availability. A total of 19,508 changes – including 4,783 name changes, 14,121 metadata annotations, and the removal of 99 technically compromised sequences – were implemented in the UNITE database for molecular identification of fungi (https://unite.ut.ee/) and shared with a range of other databases and downstream resources. Among the genera that saw the largest number of changes were Penicillium, Talaromyces, Cladosporium, Acremonium, and Alternaria, all of them of significant importance in both culture-based and culture-independent surveys of the built environment.

  18. Taxonomic annotation of public fungal ITS sequences from the built environment – a report from an April 10–11, 2017 workshop (Aberdeen, UK)

    Science.gov (United States)

    Nilsson, R. Henrik; Taylor, Andy F. S.; Adams, Rachel I.; Baschien, Christiane; Johan Bengtsson-Palme; Cangren, Patrik; Coleine, Claudia; Heide-Marie Daniel; Glassman, Sydney I.; Hirooka, Yuuri; Irinyi, Laszlo; Reda Iršėnaitė; Pedro M. Martin-Sanchez; Meyer, Wieland; Seung-Yoon Oh; Jose Paulo Sampaio; Seifert, Keith A.; Sklenář, Frantisek; Dirk Stubbe; Suh, Sung-Oui; Summerbell, Richard; Svantesson, Sten; Martin Unterseher; Cobus M. Visagie; Weiss, Michael; Woudenberg, Joyce HC; Christian Wurzbacher; den Wyngaert, Silke Van; Yilmaz, Neriman; Andrey Yurkov; Kõljalg, Urmas; Abarenkov, Kessy

    2018-01-01

    Abstract Recent DNA-based studies have shown that the built environment is surprisingly rich in fungi. These indoor fungi – whether transient visitors or more persistent residents – may hold clues to the rising levels of human allergies and other medical and building-related health problems observed globally. The taxonomic identity of these fungi is crucial in such pursuits. Molecular identification of the built mycobiome is no trivial undertaking, however, given the large number of unidentified, misidentified, and technically compromised fungal sequences in public sequence databases. In addition, the sequence metadata required to make informed taxonomic decisions – such as country and host/substrate of collection – are often lacking even from reference and ex-type sequences. Here we report on a taxonomic annotation workshop (April 10–11, 2017) organized at the James Hutton Institute/University of Aberdeen (UK) to facilitate reproducible studies of the built mycobiome. The 32 participants went through public fungal ITS barcode sequences related to the built mycobiome for taxonomic and nomenclatural correctness, technical quality, and metadata availability. A total of 19,508 changes – including 4,783 name changes, 14,121 metadata annotations, and the removal of 99 technically compromised sequences – were implemented in the UNITE database for molecular identification of fungi (https://unite.ut.ee/) and shared with a range of other databases and downstream resources. Among the genera that saw the largest number of changes were Penicillium, Talaromyces, Cladosporium, Acremonium, and Alternaria, all of them of significant importance in both culture-based and culture-independent surveys of the built environment. PMID:29559822

  19. CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics.

    Science.gov (United States)

    Gai, Xiaowu; Perin, Juan C; Murphy, Kevin; O'Hara, Ryan; D'arcy, Monica; Wenocur, Adam; Xie, Hongbo M; Rappaport, Eric F; Shaikh, Tamim H; White, Peter S

    2010-02-04

    Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated

  20. CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics

    Directory of Open Access Journals (Sweden)

    Rappaport Eric F

    2010-02-01

    Full Text Available Background: Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. Results: We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. Conclusions: To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and

  1. Forensic linguistics: Applications of forensic linguistics methods to anonymous letters

    OpenAIRE

    NOVÁKOVÁ, Veronika

    2011-01-01

    The title of my bachelor's thesis is “Forensic linguistics: Applications of forensic linguistics methods to anonymous letters”. Forensic linguistics is a young and not very well-known branch of applied linguistics. This thesis aims to introduce forensic linguistics and its methods. The thesis has two parts – theory and practice. The theoretical part covers forensic linguistics in general, its two basic aspects utilized in forensic science, and their respective methods. The practical part t...

  2. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Science.gov (United States)

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh
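
    One common way to derive a silver standard from several systems' output is simple voting, followed by scoring against a gold standard with precision, recall and F1. The sketch below assumes that approach for illustration; the abstract does not state which combination strategy the authors used. The system names, spans and concept identifiers are invented toy data, and annotations are assumed to match only when span and concept are identical.

```python
from collections import Counter


def build_silver_standard(system_outputs, min_votes=2):
    """Keep every annotation proposed by at least `min_votes` systems."""
    votes = Counter()
    for annotations in system_outputs.values():
        votes.update(set(annotations))  # each system votes once per annotation
    return {annotation for annotation, count in votes.items() if count >= min_votes}


def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 of a predicted set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Each annotation is (start offset, end offset, concept id); all values are made up.
SYSTEM_OUTPUTS = {
    "system_a": {(0, 5, "C1"), (10, 18, "C2"), (25, 30, "C3")},
    "system_b": {(0, 5, "C1"), (10, 18, "C2")},
    "system_c": {(0, 5, "C1"), (40, 45, "C4")},
    "system_d": {(25, 30, "C3"), (40, 45, "C9")},
}
GOLD = {(0, 5, "C1"), (10, 18, "C2"), (40, 45, "C4"), (50, 55, "C5")}

silver = build_silver_standard(SYSTEM_OUTPUTS, min_votes=2)
precision, recall, f1 = precision_recall_f1(silver, GOLD)
print(sorted(silver))
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

    Raising min_votes generally trades recall for precision, which is one way to picture the abstract's observation that adding systems can lower the F1-measure when recall is low.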

  3. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Directory of Open Access Journals (Sweden)

    Anika Oellrich

    Full Text Available Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content

  4. Applied Linguistics and the "Annual Review of Applied Linguistics."

    Science.gov (United States)

    Kaplan, Robert B.; Grabe, William

    2000-01-01

    Examines the complexities and differences involved in granting disciplinary status to the role of applied linguistics, discusses the role of the "Annual Review of Applied Linguistics" as a contributor to the development of applied linguistics, and highlights a set of publications for the future of applied linguistics. (Author/VWL)

  5. annot8r: GO, EC and KEGG annotation of EST datasets

    Directory of Open Access Journals (Sweden)

    Schmid Ralf

    2008-04-01

    Full Text Available Background: The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results: annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion: annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non
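
    The parse-and-look-up step described in this abstract can be sketched as follows. This is an illustration only and not annot8r's code: it assumes BLAST results in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score), and the sequence identifiers, annotation labels, cutoff value and helper names are all hypothetical.

```python
# Toy BLAST tabular output (tab-separated, 12 standard columns); values are made up.
BLAST_TABULAR = """\
est1\tsubjA\t98.0\t120\t2\t0\t1\t120\t5\t124\t1e-50\t210
est1\tsubjB\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-10\t90
est2\tsubjC\t60.0\t50\t20\t0\t1\t50\t1\t50\t0.5\t30
"""

# Stand-in for the reference database: subject id -> annotation labels (placeholders).
REFERENCE = {
    "subjA": ["GO:example1", "GO:example2"],
    "subjB": ["GO:example3"],
}


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: subject} keeping the highest bit score hit under the e-value cutoff."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit is too weak to transfer an annotation
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def annotate(hits, reference):
    """Look up the annotations of each query's best-hit subject."""
    return {query: reference.get(subject, []) for query, subject in hits.items()}


annotations = annotate(best_hits(BLAST_TABULAR), REFERENCE)
print(annotations)
```

    Queries with no hit under the cutoff (est2 in the toy data) simply receive no annotation in this sketch.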

  6. On Linguistic Abilities, Multilingualism, and Linguistic Justice

    Directory of Open Access Journals (Sweden)

    Iannàccaro Gabriele

    2016-10-01

    Full Text Available The notion of linguistic justice should be related to the concept of linguistic ease, by which we mean the full social and communicative freedom of concern of the speaker in a given social interaction involving the use of language(s) present in the society, according to the social norms of use. To acquire an acceptable degree of linguistic ease, the knowledge of at least one L2 is considered important. But the acquisition of an L2 is interfered with by the previous linguistic skills of the learner/speaker who, in many cases, does not have a suitable competence even in the languages of the society in which he/she lives.

  7. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general...... requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation...... systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations...

  8. The linguistically aware teacher and the teacher-aware linguist.

    Science.gov (United States)

    McCartney, Elspeth; Ellis, Sue

    2013-07-01

    This review evaluates issues of teacher linguistic knowledge relating to their work with children with speech, language and communication difficulties (SLCD). Information is from Ellis and McCartney [(2011a). Applied linguistics and primary school teaching. Cambridge: Cambridge University Press], a state-of-the-art text deriving from a British Association of Applied Linguistics/Cambridge University Press expert seminar series that details: linguistic research underpinning primary school curricula and pedagogy; the form of linguistic knowledge useful for teachers supporting children with SLCD in partnership with speech and language therapists; and how and when teachers acquire and learn to apply such knowledge. Critical analysis of the options presented for teacher learning indicates that policy enjoinders now include linguistic application as an expected part of teachers' professional knowledge, for all children including those with SLCD, but there is a large unmet learning need. It is concluded that there is a role for clinical linguists to disseminate useable knowledge to teachers in an accessible format. Ways of achieving this are considered.

  9. Linguistic Polyphony

    DEFF Research Database (Denmark)

    Nølke, Henning

    on the Scandinavian variant of polyphony, ScaPoLine. ScaPoLine is a formal linguistic theory whose main purpose is to specify the instructions conveyed through linguistic form for the creation of polyphonic meaning. The theoretical introduction is followed by polyphonic analyses of linguistic phenomena...

  10. A Fuzzy Linguistic Methodology to Deal With Unbalanced Linguistic Term Sets

    OpenAIRE

    Herrera, F.; Herrera-Viedma, Enrique; Martinez, L.

    2008-01-01

    Many real problems dealing with qualitative aspects use linguistic approaches to assess such aspects. In most of these problems, a uniform and symmetrical distribution of the linguistic term sets for linguistic modeling is assumed. However, there exist problems whose assessments need to be represented by means of unbalanced linguistic term sets, i.e., using term sets that are not uniformly and symmetrically distributed. The use of linguistic variables implies processes of computing with words...

  11. Corpus linguistics and statistics with R introduction to quantitative methods in linguistics

    CERN Document Server

    Desagulier, Guillaume

    2017-01-01

    This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and t...

  12. BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments.

    Science.gov (United States)

    López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F

    2013-07-01

    Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations.

  13. Chado controller: advanced annotation management with a community annotation system.

    Science.gov (United States)

    Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie

    2012-04-01

    We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system, and documentation is available at http://www.gnpannot.org/content/chado-controller-doc. The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form. Contact: valentin.guignon@cirad.fr; stephanie.sidibe-bocs@cirad.fr. Supplementary data are available at Bioinformatics online.

  14. Linguistic Structure Prediction

    CERN Document Server

    Smith, Noah A

    2011-01-01

    A major part of natural language processing now depends on the use of text data to build linguistic analyzers. We consider statistical, computational approaches to modeling linguistic structure. We seek to unify across many approaches and many kinds of linguistic structures. Assuming a basic understanding of natural language processing and/or machine learning, we seek to bridge the gap between the two fields. Approaches to decoding (i.e., carrying out linguistic structure prediction) and supervised and unsupervised learning of models that predict discrete structures as outputs are the focus. W

  15. Probabilistic linguistics

    NARCIS (Netherlands)

    Bod, R.; Heine, B.; Narrog, H.

    2010-01-01

    Probabilistic linguistics takes all linguistic evidence as positive evidence and lets statistics decide. It allows for accurate modelling of gradient phenomena in production and perception, and suggests that rule-like behaviour is no more than a side effect of maximizing probability. This chapter

  16. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  17. A New Hesitant Fuzzy Linguistic TOPSIS Method for Group Multi-Criteria Linguistic Decision Making

    Directory of Open Access Journals (Sweden)

    Fangling Ren

    2017-11-01

    Full Text Available Hesitant fuzzy linguistic decision making is a focus point in linguistic decision making, in which the main method is based on preference ordering. This paper develops a new hesitant fuzzy linguistic TOPSIS method for group multi-criteria linguistic decision making; the method is inspired by the TOPSIS method and the preference degree between two hesitant fuzzy linguistic term sets (HFLTSs). To this end, we first use the preference degree to define a pseudo-distance between two HFLTSs and analyze its properties. Then we present the positive (optimistic) and negative (pessimistic) information of each criterion provided by each decision maker and aggregate these by using weights of decision makers to obtain the hesitant fuzzy linguistic positive and negative ideal solutions. On the basis of the proposed pseudo-distance, we finally obtain the positive (negative) ideal separation matrix and a new relative closeness degree to rank alternatives. We also design an algorithm based on the provided method to carry out hesitant fuzzy linguistic decision making. An illustrative example shows the elaboration of the proposed method and comparison with the symbolic aggregation-based method, the hesitant fuzzy linguistic TOPSIS method and the hesitant fuzzy linguistic VIKOR method; it seems that the proposed method is a useful and alternative decision-making method.
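
    The ranking step described in this abstract builds on classical TOPSIS. As a rough illustration only, the sketch below implements the crisp (numeric) version of that ranking: it does not implement the paper's HFLTS pseudo-distance, and the function name, decision matrix and weights are made-up assumptions.

```python
import math

def topsis_closeness(matrix, weights, benefit):
    """Return the classical TOPSIS relative closeness of each alternative.

    matrix[i][j] is the score of alternative i on criterion j,
    weights[j] is the criterion weight, and benefit[j] is True when
    larger values are better for criterion j.
    Assumes no criterion column is all zeros and alternatives differ.
    """
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Positive and negative ideal solutions, criterion by criterion.
    cols = list(zip(*v))
    pos = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    neg = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    closeness = []
    for row in v:
        d_pos = math.dist(row, pos)  # separation from the positive ideal
        d_neg = math.dist(row, neg)  # separation from the negative ideal
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness

# Hypothetical data: three alternatives scored on two benefit criteria.
scores = topsis_closeness([[7, 9], [8, 7], [9, 6]], [0.5, 0.5], [True, True])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

    In the paper's setting the Euclidean distance would presumably be replaced by the pseudo-distance between HFLTSs; the closeness ratio and the final sort are the parts this sketch is meant to show.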

  18. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  19. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  20. Linguistic Imperialism

    DEFF Research Database (Denmark)

    Phillipson, Robert

    2013-01-01

    The study of linguistic imperialism focuses on how and why certain languages dominate internationally, and attempts to account for such dominance in a theoretically informed way.

  1. Pipeline to upgrade the genome annotations

    Directory of Open Access Journals (Sweden)

    Lijin K. Gopi

    2017-12-01

    Full Text Available The current era of functional genomics is enriched with good-quality draft genomes and annotations for many thousands of species and varieties, supported by advances in next-generation sequencing (NGS) technologies. Around 25,250 genomes, of organisms from various kingdoms, have been submitted to the NCBI genome resource to date. Each of these genomes was annotated using the tools and knowledge bases that were available at the time of annotation. These annotations can be expected to improve if the same genome is re-annotated using improved tools and knowledge bases. Here we present a new genome annotation pipeline, strengthened with various tools and knowledge bases, that is capable of producing better-quality annotations from the consensus of the predictions from different tools. This resource also performs various additional annotations, apart from the usual gene predictions and functional annotations, which involve SSRs, novel repeats, paralogs, proteins with transmembrane helices, signal peptides etc. This new annotation resource is trained to evaluate and integrate all the predictions together to resolve the overlaps and ambiguities of the boundaries. One of the important highlights of this resource is the capability of predicting the phylogenetic relations of the repeats using the evolutionary trace analysis and orthologous gene clusters. We also present a case study of the pipeline, in which we upgrade the genome annotation of Nelumbo nucifera (sacred lotus). It is demonstrated that this resource is capable of producing an improved annotation for a better understanding of the biology of various organisms.

  2. What Is Applied Linguistics?

    Science.gov (United States)

    James, Carl

    1993-01-01

    Ostensive and expository definitions of applied linguistics are assessed. It is suggested that the key to a meaningful definition lies in the dual articulation of applied linguistics: it is an interface between linguistics and practicality. Its role as an "expert system" is suggested. (45 references) (Author/LB)

  3. The Study of Critical Eco-Linguistic in Green Discourse: Prospective Eco-Linguistic Analysis

    Directory of Open Access Journals (Sweden)

    Tommi Yuniawan

    2017-10-01

    Full Text Available Eco-linguistic studies are influenced by another interdisciplinary science, namely critical discourse analysis. The combination of these two sciences is called critical eco-linguistic studies. Critical eco-linguistics examines discourse about the environment, the various forms that discourse takes, and the ideology it carries concerning people and the environment. Environmental discourse in all its manifestations (oral text, written text) is called green discourse. To that end, critical eco-linguistics dictates the linguistic aspects contained in the green discourse. Utilization of lingual units in green discourse will affect the sense and logic of people involved in the discourse, i.e. the writers and readers or the speakers and the listeners. What is recorded in their cognition will affect their attitudes and actions towards the environment. If green discourse is constructive, then their attitudes and actions towards the environment are constructive. Conversely, if green discourse is more destructive and exploitative, then their attitudes and actions towards the environment will also tend towards destruction and exploitation. For this reason, critical eco-linguistic studies in green discourse deserve to be given space as a form of prospective eco-linguistic analysis.

  4. Linguistic and Psycho-Linguistic Principles of Linguadidactics (theoretical interpretation)

    Directory of Open Access Journals (Sweden)

    Liudmila Mauzienė

    2011-04-01

    Full Text Available This article considers linguadidactics to be closely related to linguistics, psychology, psycholinguistics and didactics, whose theoretical statements and regularities it applies in its scientific studies. Methodology refers to linguistics, which investigates the language as a teaching subject. Methodology is linked to psychology in two ways. First of all, it is based on psychology, as the teaching process is an intellectual psychical act whose regularities need to be known. On the other hand, methodology applies rules of pedagogy that predict ways of learning and the development of language skills. The article emphasizes that sustainable work experience and analysis of scientific research show that the teaching process is more effective if consistent patterns of linguistics and psychology are appropriately applied.

  5. Semantic annotation of consumer health questions.

    Science.gov (United States)

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most
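
    The visible part of this abstract does not say which agreement statistic was used. As an assumption for illustration, the sketch below computes Cohen's kappa, one common chance-corrected measure for two annotators labelling the same items; the label values are invented and are not taken from the CHQA corpus.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical question-type labels from two annotators.
ann1 = ["treatment", "information", "treatment", "cause", "treatment"]
ann2 = ["treatment", "information", "cause", "cause", "treatment"]
kappa = cohen_kappa(ann1, ann2)
```

    Here the annotators agree on four of five items (0.8 observed agreement), but kappa is lower because part of that agreement is expected by chance.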

  6. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotations. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...

  7. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. Available at http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so that they are inevitably trapped into suboptimal performance of these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and the Hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems based on which a variety of loss functions with respect to objective-guided measures are defined. And then, we formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. According to the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper, we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
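
    The abstract's remark that macro-averaged measures are sensitive to infrequent keywords can be seen on a toy example. The sketch below is not the paper's framework; it only computes macro-F1, micro-F1, Hamming loss and example-based F1 on invented data, under the assumed convention that F1 is 0 when there are no true positives.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined here as 0.0 when there are no true positives."""
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def counts(true_set, pred_set):
    """Return (tp, fp, fn) for one pair of sets."""
    tp = len(true_set & pred_set)
    return tp, len(pred_set - true_set), len(true_set - pred_set)

def multilabel_measures(y_true, y_pred, labels):
    """y_true and y_pred are lists of label sets, one set per image."""
    # Per-label counts: treat each keyword as its own binary problem.
    per_label = []
    for lab in labels:
        t = {i for i, s in enumerate(y_true) if lab in s}
        p = {i for i, s in enumerate(y_pred) if lab in s}
        per_label.append(counts(t, p))
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(*c) for c in per_label) / len(labels)
    # Micro: pool the counts over all labels before computing F1.
    micro = f1(*(sum(col) for col in zip(*per_label)))
    # Hamming loss: fraction of wrong (image, label) decisions.
    wrong = sum(fp + fn for _, fp, fn in per_label)
    hamming = wrong / (len(y_true) * len(labels))
    # Example-based F1: score each image's label set, then average.
    example = sum(f1(*counts(t, p)) for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"macro_f1": macro, "micro_f1": micro,
            "hamming_loss": hamming, "example_f1": example}

# Hypothetical toy data: "sky" is frequent, "zebra" occurs once and is missed.
truth = [{"sky"}, {"sky"}, {"sky"}, {"sky", "zebra"}]
preds = [{"sky"}, {"sky"}, {"sky"}, {"sky"}]
scores = multilabel_measures(truth, preds, ["sky", "zebra"])
```

    With this data a single missed rare keyword drops macro-F1 to 0.5, while micro-F1 stays at 8/9 and example-based F1 at 11/12.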

  9. Forensic Linguistics: The Linguistic Analyst and Expert Witness of Language Evidence in Criminal Trials.

    Science.gov (United States)

    Jordan, Sherilynn Nidever

    Forensic linguistics (FL) provides consultation to lawyers through the analysis of language evidence during the pre-trial investigation. Evidence commonly analyzed by linguists in criminal cases includes transcripts of police interviews and language crimes (such as bribery) and anonymous or questioned texts. Forensic linguistic testimony is rarely…

  10. Sample Undergraduate Linguistics Courses. Linguistics in the Undergraduate Curriculum, Appendix 5.

    Science.gov (United States)

    Linguistic Society of America, Washington, DC.

    Thirty-six nontraditional undergraduate courses in linguistics are described. Course topics include: animal communication, bilingualism, sociolinguistics, introductory linguistics, language and formal reasoning, language and human conflict, language and power, language and sex, language and the brain, language planning, language typology and…

  11. Linguistics and the digital humanities

    DEFF Research Database (Denmark)

    Jensen, Kim Ebensgaard

    2014-01-01

    Corpus linguistics has been closely intertwined with digital technology since the introduction of university computer mainframes in the 1960s. Making use of both digitized data in the form of the language corpus and computational methods of analysis involving concordancers and statistics software......, corpus linguistics arguably has a place in the digital humanities. Still, it remains obscure and figures only sporadically in the literature on the digital humanities. This article provides an overview of the main principles of corpus linguistics and the role of computer technology in relation to data...... and method and also offers a bird's-eye view of the history of corpus linguistics with a focus on its intimate relationship with digital technology and how digital technology has impacted the very core of corpus linguistics and shaped the identity of the corpus linguist. Ultimately, the article is oriented...

  12. The Routledge Applied Linguistics Reader

    Science.gov (United States)

    Wei, Li, Ed.

    2011-01-01

    "The Routledge Applied Linguistics Reader" is an essential collection of readings for students of Applied Linguistics. Divided into five sections: Language Teaching and Learning, Second Language Acquisition, Applied Linguistics, Identity and Power and Language Use in Professional Contexts, the "Reader" takes a broad…

  13. Measuring Linguistic Empathy: An Experimental Approach to Connecting Linguistic and Social Psychological Notions of Empathy

    Science.gov (United States)

    Kann, Trevor

    2017-01-01

    This dissertation investigated the relationship between Linguistic Empathy and Psychological Empathy by implementing a psycholinguistic experiment that measured a person's acceptability ratings of sentences with violations of Linguistic Empathy and correlating them with a measure of the person's Psychological Empathy. Linguistic Empathy…

  14. LANGUE AND PAROLE IN AMERICAN LINGUISTICS.

    Science.gov (United States)

    LEVIN, SAMUEL R.

    THE PROBLEM OF THE NATURE OF LANGUAGE STRUCTURE IS CONSIDERED AND THE FORM WHICH ANY LINGUISTIC DESCRIPTION SHOULD TAKE. THE AUTHOR EXAMINES THE INFLUENCE OF THE SWISS LINGUIST, FERDINAND DE SAUSSURE, ON THE DEVELOPMENT OF AMERICAN LINGUISTICS. THE QUESTION OF "MENTALISM" IN LINGUISTICS IS REDUCED TO THE PROBLEM OF WHETHER LINGUISTIC…

  15. Concept annotation in the CRAFT corpus.

    Science.gov (United States)

    Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E

    2012-07-09

    Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

  16. Linguistic Communications 1.

    Science.gov (United States)

    Monash Univ., Clayton, Victoria (Australia).

    The present compilation of papers on linguistics is the result of joint efforts by the Classical Studies, French, Japanese, Linguistics, and Russian Departments of Monash University. Selections in the Pre-Prints and Articles section include: "For/Arabic Bilingualism in the Zalingei Area," by B. Jernudd; "Prosodic Problems in a Generative Phonology…

  17. Essential Requirements for Digital Annotation Systems

    Directory of Open Access Journals (Sweden)

    ADRIANO, C. M.

    2012-06-01

    Full Text Available Digital annotation systems are usually based on partial scenarios and arbitrary requirements. Accidental and essential characteristics are usually mixed in non-explicit models. Documents and annotations are linked together accidentally according to the current technology, which allows for the development of disposable prototypes but not for the support of non-functional requirements such as extensibility, robustness and interactivity. In this paper we perform a careful analysis of the concept of annotation, studying the scenarios supported by digital annotation tools. We also derive essential requirements based on a classification of annotation systems applied to existing tools. The analysis performed and the proposed classification can be applied and extended to other types of collaborative systems.

  18. Working Memory for Linguistic and Non-linguistic Manual Gestures: Evidence, Theory, and Application.

    Science.gov (United States)

    Rudner, Mary

    2018-01-01

    Linguistic manual gestures are the basis of sign languages used by deaf individuals. Working memory and language processing are intimately connected and thus when language is gesture-based, it is important to understand related working memory mechanisms. This article reviews work on working memory for linguistic and non-linguistic manual gestures and discusses theoretical and applied implications. Empirical evidence shows that there are effects of load and stimulus degradation on working memory for manual gestures. These effects are similar to those found for working memory for speech-based language. Further, there are effects of pre-existing linguistic representation that are partially similar across language modalities. But above all, deaf signers score higher than hearing non-signers on an n-back task with sign-based stimuli, irrespective of their semantic and phonological content, but not with non-linguistic manual actions. This pattern may be partially explained by recent findings relating to cross-modal plasticity in deaf individuals. It suggests that in linguistic gesture-based working memory, semantic aspects may outweigh phonological aspects when processing takes place under challenging conditions. The close association between working memory and language development should be taken into account in understanding and alleviating the challenges faced by deaf children growing up with cochlear implants as well as other clinical populations.

  19. LINGUISTICS AND SECOND LANGUAGE TEACHING: AN ...

    African Journals Online (AJOL)

    The relationship between linguistics and second language teaching has always been a controversial one. Many linguists have argued that linguistics has nothing to say to the teacher. Sampson (1980, p.10), for example, says: "I do not believe that linguistics has any contribution to make to the teaching of English or the.

  20. On the concept of a linguistic variable

    International Nuclear Information System (INIS)

    Kerre, E.

    1996-01-01

    The concept of a linguistic variable plays a crucial role in the representation of imprecise knowledge in information sciences. A variable is called linguistic as soon as its values are linguistic terms rather than numerical ones. The power of daily communication and common sense reasoning lies in the use of such linguistic values. Even when exact numerical values are available, experts tend to transform these values into linguistic ones. A physician will usually translate a numerical measurement of a blood pressure into linguistic specifications such as normal, very high, too low... Zadeh has argued that the set of values for a linguistic variable assumes a more-or-less fixed structure. Starting from an atomic value and its antonym all remaining values are constructed using logical connectives on the one hand and linguistic hedges on the other hand. In this paper we will describe how to represent the value set of a linguistic variable in general and of linguistic hedges in particular.
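
    The construction sketched in this abstract (an atomic value, its antonym, logical connectives and hedges) can be illustrated with Zadeh's classical operators: "very" squares a membership degree, "more or less" takes its square root, and "not", "and", "or" are complement, minimum and maximum. This is only a minimal sketch and not the representation proposed in the paper; the membership function for "high", the universe of 40 to 220 mmHg and the mirror-image antonym are all assumptions.

```python
import math

def high(x):
    """Hypothetical membership of systolic pressure x (mmHg) in 'high'."""
    return min(1.0, max(0.0, (x - 140) / 40))  # 0 at 140, rising to 1 at 180

def antonym(mu, lo=40, hi=220):
    """Assumed antonym: mirror the term across the universe [lo, hi]."""
    return lambda x: mu(lo + hi - x)

# Hedges modify a membership function (Zadeh's classical choices).
def very(mu):
    return lambda x: mu(x) ** 2        # concentration

def more_or_less(mu):
    return lambda x: math.sqrt(mu(x))  # dilation

# Logical connectives combine membership functions.
def not_(mu):
    return lambda x: 1.0 - mu(x)

def and_(mu1, mu2):
    return lambda x: min(mu1(x), mu2(x))

def or_(mu1, mu2):
    return lambda x: max(mu1(x), mu2(x))

# Composite values built from the atomic term and its antonym.
low = antonym(high)
very_high = very(high)
not_very_high_and_not_low = and_(not_(very_high), not_(low))

reading = 170
degrees = {
    "high": high(reading),
    "very high": very_high(reading),
    "more or less high": more_or_less(high)(reading),
    "low": low(reading),
}
```

    Because every value is a function built from `high`, changing the atomic term automatically changes all derived terms, which is the fixed structure the abstract attributes to Zadeh.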

  1. Clinical Linguistics--Retrospect and Prospect.

    Science.gov (United States)

    Grunwell, Pamela

    In the past 20 years, linguistics has gained a prominent position in speech and language pathology in Britain, evolving into a new field, clinical linguistics. It includes three related areas of activity: training of speech pathologists/therapists; professional practice; and research. Linguistics and speech/language pathology have developed as…

  2. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
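
    Memento (RFC 7089) selects an archived version by sending an `Accept-Datetime` header to a TimeGate for the resource. As a small sketch of the first step only, the code below builds that header from an annotation's creation time; the annotation record and its field names are hypothetical, using the creation time as the target datetime is an assumption about the approach, and no request is actually sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def accept_datetime_header(annotated_at):
    """Build the Memento Accept-Datetime header for an annotation timestamp."""
    if annotated_at.tzinfo is None:
        raise ValueError("annotation timestamp must be timezone-aware")
    # Memento expects an HTTP-date expressed in GMT.
    as_utc = annotated_at.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}

# Hypothetical annotation record: target URI plus creation time.
annotation = {
    "target": "http://example.org/page",
    "created": datetime(2010, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
}
headers = accept_datetime_header(annotation["created"])
```

    A client would send these headers in a GET request to the TimeGate of `annotation["target"]`, which is expected to redirect to the archived version closest to that time.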

  3. Semantic annotation in biomedicine: the current landscape.

    Science.gov (United States)

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  4. Etymology and Modern Linguistics

    Science.gov (United States)

    Malkiel, Yakov

    1975-01-01

    Discusses the estrangement between etymology and modern linguistics, and concludes that a reconciliation between spatio-temporal linguistics and etymology must occur, because without it, both disciplines are doomed to inanition. (Author/AM)

  5. Linguistic Corpora and Language Teaching.

    Science.gov (United States)

    Murison-Bowie, Simon

    1996-01-01

    Examines issues raised by corpus linguistics concerning the description of language. The article argues that it is necessary to start from correct descriptions of linguistic units and the contexts in which they occur. Corpus linguistics has joined with language teaching by sharing a recognition of the importance of a larger, schematic view of…

  6. Linguistic dating of biblical texts

    DEFF Research Database (Denmark)

    Young, Ian; Rezetko, Robert; Ehrensvärd, Martin Gustaf

    Since the beginning of critical scholarship biblical texts have been dated using linguistic evidence. In recent years this has become a controversial topic, especially with the publication of Ian Young (ed.), Biblical Hebrew: Studies in Chronology and Typology (2003). However, until now there has...... been no introduction and comprehensive study of the field. Volume 1 introduces the field of linguistic dating of biblical texts, particularly to intermediate and advanced students of biblical Hebrew who have a reasonable background in the language, having completed at least an introductory course...... in this volume are: What is it that makes Archaic Biblical Hebrew archaic, Early Biblical Hebrew early, and Late Biblical Hebrew late? Does linguistic typology, i.e. different linguistic characteristics, convert easily and neatly into linguistic chronology, i.e. different historical origins? A large amount...

  7. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and time-consuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such a task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameter calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results.
Finally

  8. Active learning reduces annotation time for clinical concept extraction.

    Science.gov (United States)

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
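
    The pool-based selection step that this abstract refers to as a "query strategy" can be sketched as follows. The abstract does not say which strategies were used, so the least-confidence criterion below is a common textbook choice swapped in for illustration; the sequence identifiers and probability values are made up.

```python
from typing import Dict, List


def least_confidence(class_probs: List[float]) -> float:
    """Uncertainty score: one minus the probability of the most likely label."""
    return 1.0 - max(class_probs)


def select_queries(pool: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k unlabeled sequence ids the model is least sure about.

    `pool` maps a (hypothetical) sequence id to the model's predicted
    label distribution for that sequence.
    """
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:k]


# Made-up predictions for four unlabeled sequences.
pool = {
    "seq-1": [0.95, 0.03, 0.02],
    "seq-2": [0.40, 0.35, 0.25],
    "seq-3": [0.70, 0.20, 0.10],
    "seq-4": [0.55, 0.40, 0.05],
}

if __name__ == "__main__":
    # The selected sequences would be sent to the annotator, optionally with
    # the model's own predictions attached as pre-annotations to review.
    print(select_queries(pool, k=2))
```

    In a full loop, the newly annotated sequences would be added to the training set and the model retrained before the next selection round.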

  9. Working Memory for Linguistic and Non-linguistic Manual Gestures: Evidence, Theory, and Application

    Directory of Open Access Journals (Sweden)

    Mary Rudner

    2018-05-01

    Linguistic manual gestures are the basis of sign languages used by deaf individuals. Working memory and language processing are intimately connected and thus when language is gesture-based, it is important to understand related working memory mechanisms. This article reviews work on working memory for linguistic and non-linguistic manual gestures and discusses theoretical and applied implications. Empirical evidence shows that there are effects of load and stimulus degradation on working memory for manual gestures. These effects are similar to those found for working memory for speech-based language. Further, there are effects of pre-existing linguistic representation that are partially similar across language modalities. But above all, deaf signers score higher than hearing non-signers on an n-back task with sign-based stimuli, irrespective of their semantic and phonological content, but not with non-linguistic manual actions. This pattern may be partially explained by recent findings relating to cross-modal plasticity in deaf individuals. It suggests that in linguistic gesture-based working memory, semantic aspects may outweigh phonological aspects when processing takes place under challenging conditions. The close association between working memory and language development should be taken into account in understanding and alleviating the challenges faced by deaf children growing up with cochlear implants as well as other clinical populations.

  10. The linguistic repudiation of Wundt.

    Science.gov (United States)

    Nerlich, B; Clarke, D D

    1998-08-01

    Wilhelm Wundt's influence on the development of linguistics and psychology was pervasive. The foundations for this web of influence on the sciences of mind and language were laid down in Wundt's own research program, which was quite different from other attempts at founding a new psychology, as it was deeply rooted in German philosophy. This resulted in certain gaps in Wundt's conception of mind and language. These gaps provoked a double repudiation of Wundt's theories, by linguists and psychologists. The psychological repudiation has been studied by historians of psychology, and the linguistic repudiation has been studied by historians of linguistics. The intent of this article is to bring the linguistic repudiation to the attention of historians of psychology, especially the one outlined by two important figures in the history of psychology: Karl Buhler and George Mead.

  11. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    Energy Technology Data Exchange (ETDEWEB)

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  12. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

    Science.gov (United States)

    Brettin, Thomas; Davis, James J; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Olsen, Gary J; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D; Shukla, Maulik; Thomason, James A; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  13. Stellenbosch Papers in Linguistics Plus: Journal Sponsorship

    African Journals Online (AJOL)

    Publisher. Stellenbosch Papers in Linguistics (SPiL) is published by the Department of General Linguistics of Stellenbosch University. Department of General Linguistics, Stellenbosch University. Sources of Support. The Department of General Linguistics acknowledges the financial support provided by the Fonds ...

  14. Peace linguistics for language teachers

    Directory of Open Access Journals (Sweden)

    Francisco GOMES DE MATOS

    2014-12-01

    This text presents the concept of Peace Linguistics, its origins and recent developments, as implemented in the author's ongoing work in that emerging branch of Applied Linguistics. Examples of possible applications are given, with a focus on language teaching and learning, and a checklist of topics for suggested linguistic-educational research, centered on communicative peace, is provided.

  15. Legal Linguistics as a Mutual Arena for Cooperation: Recent Developments in the Field of Applied Linguistics and Law

    Science.gov (United States)

    Engberg, Jan

    2013-01-01

    This article reports on some of the recent projects and individual works in the field of Legal Linguistics as examples of cooperation between Applied Linguistics and law. The article starts by discussing relevant prototypical concepts of Legal Linguistics. Legal Linguistics scrutinizes interactions between human beings in the framework of legal…

  16. Measuring the diffusion of linguistic change.

    Science.gov (United States)

    Nerbonne, John

    2010-12-12

    We examine situations in which linguistic changes have probably been propagated via normal contact as opposed to via conquest, recent settlement and large-scale migration. We proceed then from two simplifying assumptions: first, that all linguistic variation is the result of either diffusion or independent innovation, and, second, that we may operationalize social contact as geographical distance. It is clear that both of these assumptions are imperfect, but they allow us to examine diffusion via the distribution of linguistic variation as a function of geographical distance. Several studies in quantitative linguistics have examined this relation, starting with Séguy (Séguy 1971 Rev. Linguist. Romane 35, 335-357), and virtually all report a sublinear growth in aggregate linguistic variation as a function of geographical distance. The literature from dialectology and historical linguistics has mostly traced the diffusion of individual features, however, so that it is sensible to ask what sort of dynamic in the diffusion of individual features is compatible with Séguy's curve. We examine some simulations of diffusion in an effort to shed light on this question.

  17. Linguistic Dating of Biblical Texts

    DEFF Research Database (Denmark)

    Ehrensvärd, Martin Gustaf

    2003-01-01

    For two centuries, scholars have pointed to consistent differences in the Hebrew of certain biblical texts and interpreted these differences as reflecting the date of composition of the texts. Until the 1980s, this was quite uncontroversial as the linguistic findings largely confirmed the chronology of the texts established by other means: the Hebrew of Genesis-2 Kings was judged to be early and that of Esther, Daniel, Ezra, Nehemiah, and Chronicles to be late. In the current debate where revisionists have questioned the traditional dating, linguistic arguments in the dating of texts have...... come more into focus. The study critically examines some linguistic arguments adduced to support the traditional position, and reviewing the arguments it points to weaknesses in the linguistic dating of EBH texts to pre-exilic times. When viewing the linguistic evidence in isolation it will be clear...

  18. Computer systems for annotation of single molecule fragments

    Science.gov (United States)

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites, thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such a diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  19. Image annotation under X Windows

    Science.gov (United States)

    Pothier, Steven

    1991-08-01

    A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.

  20. Motion lecture annotation system to learn Naginata performances

    Science.gov (United States)

    Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko

    2013-12-01

    This paper describes a learning assistant system that uses motion capture data and annotation to teach "Naginata-jutsu" (a skill to practice Japanese halberd) performance. Some video annotation tools exist, such as that of YouTube. However, these video-based tools offer only a single angle of view. Our approach, which uses motion-captured data, allows the performance to be viewed from any angle. A lecturer can write annotations related to parts of the body. We compared the effectiveness of the annotation tool of YouTube and the proposed system. The experimental result showed that our system triggered more annotations than the annotation tool of YouTube.

  1. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results: The presence of many AMs and development of new ones introduce needs to: (a) compare different annotations for a single genome, and (b) generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. Conclusions: We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
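
    The idea of an "extended annotation" built by combining the outputs of several annotation methods can be sketched as below. This is not BEACON's algorithm, which the abstract does not describe; the merge rule (keep the first informative function seen for each gene), the list of uninformative labels, and the gene identifiers are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Labels treated as "no function assigned" (an assumed, incomplete list).
UNINFORMATIVE = {"", "hypothetical protein", "unknown function"}


def is_informative(function: str) -> bool:
    """True when the label says something about what the gene does."""
    return function.strip().lower() not in UNINFORMATIVE


def extend_annotation(annotations: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-method annotations; earlier methods win unless uninformative."""
    combined: Dict[str, str] = {}
    for per_method in annotations:
        for gene, function in per_method.items():
            current = combined.get(gene)
            if current is None or (not is_informative(current) and is_informative(function)):
                combined[gene] = function
    return combined


def count_informative(annotation: Dict[str, str]) -> int:
    """Number of genes carrying a putative function."""
    return sum(is_informative(f) for f in annotation.values())


# Made-up outputs of two annotation methods for the same genome.
am_a = {
    "g1": "DNA polymerase III subunit alpha",
    "g2": "hypothetical protein",
    "g3": "hypothetical protein",
}
am_b = {
    "g1": "DNA polymerase",
    "g2": "ABC transporter permease",
    "g4": "hypothetical protein",
}
extended = extend_annotation([am_a, am_b])

if __name__ == "__main__":
    print(extended)
    print(count_informative(am_a), "->", count_informative(extended))
```

    A real comparison would also have to match genes across methods by coordinates and reconcile differently worded labels for the same function, which this sketch sidesteps by assuming shared gene identifiers.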

  2. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  3. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with a high number of repeats, genome duplication and tandem duplication. Genes encode a wealth of information useful in studying an organism, and it is critical to have high-quality and stable gene annotation. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. To use these very large amounts of sequence data for gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for detail. Here we present genome annotation of JGI flagship green plants produced by this pipeline plus Arabidopsis and rice except for chlamy which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and accessible via JGI Phytozome portal whose URL and front page snapshot are shown below.

  4. Annotating temporal information in clinical narratives.

    Science.gov (United States)

    Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

    2013-12-01

    Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Linguistic Intuitions and Cognitive Penetrability

    Directory of Open Access Journals (Sweden)

    Michael Devitt

    2014-12-01

    Metalinguistic intuitions play a very large evidential role in both linguistics and philosophy. Linguists think that these intuitions are products of underlying linguistic competence. I call this view “the voice of competence” (“VoC”). Although many philosophers seem to think that metalinguistic intuitions are a priori, many may implicitly hold the more scientifically respectable VoC. According to VoC, I argue, these intuitions can be cognitively penetrated by the central processor. But, I have argued elsewhere, VoC is false. Instead, we should hold “the modest explanation” (“ME”) according to which these intuitions are fairly unreflective empirical theory-laden central-processor responses to phenomena. On ME, no question of cognitive penetration arises. ME has great methodological significance for the study of language. Insofar as we rely on intuitions as evidence we should prefer those of linguists and philosophers because they are more expert. But, more importantly, we should be seeking other evidence in linguistic usage.

  6. M.Yu. Lermontov’s linguistic/literary personality through perspective of linguistic personality perception by philologist V.V. Vinogradov

    Directory of Open Access Journals (Sweden)

    Larisa N. Kuznetsova

    2011-04-01

    The article considers M.Yu. Lermontov’s linguistic/literary personality from the perspective of how linguistic personality was understood by the great Russian philologist and linguist, Academician V.V. Vinogradov.

  7. Teaching Hispanic Linguistics: Strategies to Engage Learners

    Science.gov (United States)

    Knouse, Stephanie M.; Gupton, Timothy; Abreau, Laurel

    2015-01-01

    Even though many post-secondary institutions offer a variety of Hispanic linguistics classes (Hualde 2006; Lipski 2006), research on the pedagogy of Hispanic linguistics is an underdeveloped or non-existent area of the discipline. Courses in Hispanic linguistics can present not only linguistic challenges for non-native speakers of Spanish, but…

  8. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Background: Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that lets microarray researchers directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers time spent searching multiple GO databases for functional information. Results: We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion: Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies.
The GO annotation data generated will be available for public use via AgBase website and

  9. Mathematics and linguistics

    Energy Technology Data Exchange (ETDEWEB)

    Landauer, C.; Bellman, K.L.

    1996-12-31

    In this paper, we study foundational issues that we believe will help us develop a theoretically sound approach to constructing complex systems. The two theoretical approaches that have helped us understand and develop computational systems in the past are mathematics and linguistics. We describe some differences and strengths of the approaches, and propose a research program to combine the richness of linguistic reasoning with the precision of mathematics.

  10. Research workshop to research work: initial steps in establishing health research systems on Malaita, Solomon Islands

    Directory of Open Access Journals (Sweden)

    Kekuabata Esau

    2010-10-01

    Full Text Available Abstract Introduction Atoifi Adventist Hospital is a 90-bed general hospital in East Kwaio, Malaita, Solomon Islands providing services to the population of subsistence villagers of the region. Health professionals at the hospital and attached College of Nursing have considerable human capacity and willingness to undertake health research. However, they are constrained by limited research experience, training opportunities, research systems, physical infrastructure and access to resources. This brief commentary describes an 'Introduction to Health Research' workshop delivered at Atoifi Adventist Hospital in September 2009 and efforts to move from 'research workshop' to 'research work'. The Approach Using a participatory-action research approach underpinned by decolonising methodologies, staff from Atoifi Adventist Hospital and James Cook University (Queensland, Australia) collaboratively designed, implemented and evaluated a health research workshop. Basic health research principles and methods were presented using active learning methodologies. Following the workshop, Atoifi Adventist Hospital and Atoifi College of Nursing staff, other professionals and community members reported an increased awareness and understanding of health research. The formation of a local Research Committee, improved ethics review procedures and the identification of local research mentors followed the week-long workshop. The workshop has acted as a catalyst for research activity, increasing structural and human resource capacity for local health professionals and community leaders to engage in research. Discussion and Conclusions Participants from a variety of educational backgrounds participated in, and received benefit from, a responsive, culturally and linguistically accessible health research workshop.
Improving health research systems at a remote hospital and aligning these with local and national research agendas is establishing a base to strengthen public health

  11. Stellenbosch Papers in Linguistics

    African Journals Online (AJOL)

    Stellenbosch Papers in Linguistics (SPiL) is an annual/biannual open access, peer-reviewed international journal, published by the Department of General Linguistics, Stellenbosch University. The papers published in SPiL are ... Poetry in South African Sign Language: What is different?

  12. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  13. The effectiveness of annotated (vs. non-annotated) digital pathology slides as a teaching tool during dermatology and pathology residencies.

    Science.gov (United States)

    Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A

    2014-06-01

    With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  14. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements. The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine onset of head movements and to suggest which kind of head movement is taking place.

  15. The Dutch Linguistic Intraoperative Protocol: a valid linguistic approach to awake brain surgery.

    Science.gov (United States)

    De Witte, E; Satoer, D; Robert, E; Colle, H; Verheyen, S; Visch-Brink, E; Mariën, P

    2015-01-01

    Intraoperative direct electrical stimulation (DES) is increasingly used in patients operated on for tumours in eloquent areas. Although a positive impact of DES on postoperative linguistic outcome is generally advocated, information about the neurolinguistic methods applied in awake surgery is scarce. We developed for the first time a standardised Dutch linguistic test battery (measuring phonology, semantics, syntax) to reliably identify the critical language zones in detail. A normative study was carried out in a control group of 250 native Dutch-speaking healthy adults. In addition, the clinical application of the Dutch Linguistic Intraoperative Protocol (DuLIP) was demonstrated by means of anatomo-functional models and five case studies. A set of DuLIP tests was selected for each patient depending on the tumour location and degree of linguistic impairment. DuLIP is a valid test battery for pre-, intraoperative and postoperative language testing and facilitates intraoperative mapping of eloquent language regions that are variably located. Copyright © 2014 Elsevier Inc. All rights reserved.

  16. The Unbalanced Linguistic Aggregation Operator in Group Decision Making

    Directory of Open Access Journals (Sweden)

    Li Zou

    2012-01-01

    Full Text Available Many linguistic aggregation methods have been proposed and applied to linguistic decision-making problems. In practice, experts often need to assess more values on one side of the reference domain than on the other; that is, they use unbalanced linguistic values to express their evaluations. In this paper, we propose a new linguistic aggregation operator to deal with unbalanced linguistic values in group decision making. We adopt the 2-tuple representation model of linguistic values and linguistic hierarchies to express unbalanced linguistic values, and we present the unbalanced linguistic ordered weighted geometric operator to aggregate unbalanced linguistic evaluation values. A comparison example is given to show the advantage of our method.
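
    The 2-tuple representation mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly used balanced formulation, in which a numeric value beta in [0, g] is written as a term index plus a symbolic translation alpha in [-0.5, 0.5), and it applies a plain ordered weighted geometric (OWG) mean to those values. The function names and the five-term scale (g = 4) are hypothetical, and the sketch does not reproduce the paper's unbalanced operator or its linguistic hierarchies.

```python
import math

def to_two_tuple(beta, g):
    """Convert beta in [0, g] to (index, alpha) with alpha in [-0.5, 0.5).

    Assumption: balanced term set s_0..s_g, rounding halves upward.
    """
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    index = int(math.floor(beta + 0.5))  # nearest term index
    alpha = beta - index                 # symbolic translation
    return index, alpha

def from_two_tuple(index, alpha):
    """Inverse mapping: recover the numeric value beta."""
    return index + alpha

def owg(two_tuples, weights, g):
    """Ordered weighted geometric mean over 2-tuples (balanced case only).

    Values are sorted in descending order before the position weights
    are applied, which is what makes the operator 'ordered'.
    """
    if len(two_tuples) != len(weights):
        raise ValueError("need one weight per value")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    betas = sorted((from_two_tuple(i, a) for i, a in two_tuples), reverse=True)
    result = 1.0
    for b, w in zip(betas, weights):
        result *= b ** w
    return to_two_tuple(result, g)

if __name__ == "__main__":
    # Hypothetical assessments on a five-term scale (g = 4).
    print(to_two_tuple(2.8, 4))
    print(owg([(3, 0.0), (1, 0.0)], [0.5, 0.5], 4))
```

    Keeping alpha alongside the index is what lets an aggregated result be reported as a term of the original set without discarding the fractional part.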

  17. Quantitative Research in Systemic Functional Linguistics

    Science.gov (United States)

    He, Qingshun

    2018-01-01

    The research of Systemic Functional Linguistics has been quite in-depth in both theory and practice. However, many linguists hold that Systemic Functional Linguistics has no hypothesis testing or experiments and its research is only qualitative. Analyses of the corpus, intelligent computing and language evolution on the ideological background of…

  18. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search

  19. Linguistic fuzzy selection of liquid levelmeters in nuclear facilities

    International Nuclear Information System (INIS)

    Ghyym, S. H.

    1999-01-01

    In this work, a selection methodology of liquid levelmeters, especially, level sensors in non-nuclear category, to be installed in nuclear facilities is developed using a linguistic fuzzy approach. Depending on defuzzification techniques, the linguistic fuzzy methodology leads to either linguistic (exactly, fully-linguistic) or cardinal (i.e., semi-linguistic) evaluation. In the case of the linguistic method, for each alternative, fuzzy preference index is converted to linguistic utility value by means of a similarity measure determining the degree of similarity between fuzzy index and linguistic ratings. For the cardinal method, the index is translated to cardinal overall utility value. According to these values, alternatives of interest are linguistically or numerically evaluated and a suitable alternative can be selected. Under given selection criteria, the suitable selections out of some liquid levelmeters for nuclear facilities are dealt with using the linguistic fuzzy methodology proposed. Then, linguistic fuzzy evaluation results are compared with numerical results available in the literature. It is found that as to a suitable option the linguistic fuzzy selection is in agreement with the crisp numerical selection. In addition, this comparison shows that the fully-linguistic method facilitates linguistic interpretation regarding evaluation results.

  20. Linguistic fuzzy selection of liquid levelmeters in nuclear facilities

    Energy Technology Data Exchange (ETDEWEB)

    Ghyym, S. H. [KEPRI, Taejon (Korea, Republic of)]

    1999-10-01

    In this work, a selection methodology of liquid levelmeters, especially, level sensors in non-nuclear category, to be installed in nuclear facilities is developed using a linguistic fuzzy approach. Depending on defuzzification techniques, the linguistic fuzzy methodology leads to either linguistic (exactly, fully-linguistic) or cardinal (i.e., semi-linguistic) evaluation. In the case of the linguistic method, for each alternative, fuzzy preference index is converted to linguistic utility value by means of a similarity measure determining the degree of similarity between fuzzy index and linguistic ratings. For the cardinal method, the index is translated to cardinal overall utility value. According to these values, alternatives of interest are linguistically or numerically evaluated and a suitable alternative can be selected. Under given selection criteria, the suitable selections out of some liquid levelmeters for nuclear facilities are dealt with using the linguistic fuzzy methodology proposed. Then, linguistic fuzzy evaluation results are compared with numerical results available in the literature. It is found that as to a suitable option the linguistic fuzzy selection is in agreement with the crisp numerical selection. In addition, this comparison shows that the fully-linguistic method facilitates linguistic interpretation regarding evaluation results.

  1. Logic Programming for Linguistics

    DEFF Research Database (Denmark)

    Christiansen, Henning

    2010-01-01

    This article gives a short introduction on how to get started with logic programming in Prolog that does not require any previous programming experience. The presentation is aimed at students of linguistics, but it does not go deeper into linguistics than any student who has some ideas of what...

  2. Functional MR imaging of cerebral auditory cortex with linguistic and non-linguistic stimulation: preliminary study

    International Nuclear Information System (INIS)

    Kang, Su Jin; Kim, Jae Hyoung; Shin, Tae Min

    1999-01-01

    To obtain preliminary data for understanding the central auditory neural pathway by means of functional MR imaging (fMRI) of the cerebral auditory cortex during linguistic and non-linguistic auditory stimulation. In three right-handed volunteers we conducted fMRI of auditory cortex stimulation at 1.5 T using a conventional gradient-echo technique (TR/TE/flip angle: 80/60/40 deg). Using a pulsed tone of 1000 Hz and speech as non-linguistic and linguistic auditory stimuli, respectively, images, including those of the superior temporal gyrus of both hemispheres, were obtained in sagittal planes. Both stimuli were separately delivered binaurally or monaurally through a plastic earphone. Activation images were generated by processing with homemade software. In order to analyze patterns of auditory cortex activation according to type of stimulus and which ear was stimulated, the number and extent of activated pixels were compared between both temporal lobes. Binaural stimulation led to bilateral activation of the superior temporal gyrus, while monaural stimulation led to more activation in the contralateral temporal lobe than in the ipsilateral. A trend toward slight activation of the left (dominant) temporal lobe in ipsilateral stimulation, particularly with a linguistic stimulus, was observed. During both binaural and monaural stimulation, a linguistic stimulus produced more widespread activation than did a non-linguistic one. The superior temporal gyri of both temporal lobes are associated with acoustic-phonetic analysis, and the left (dominant) superior temporal gyrus is likely to play a dominant role in this processing. For better understanding of physiological and pathological central auditory pathways, further investigation is needed.

  3. The Generic Style Rules for Linguistics

    OpenAIRE

    Haspelmath, Martin

    2014-01-01

    The Generic Style Rules for Linguistics provide a style sheet that can be used by any linguistics journal or edited book, or for teaching purposes. They regulate aspects of text-structure style such as typographic highlighting, citation style, use of capitalization, and bibliographic style (based on the LSA's Unified Stylesheet for linguistics).

  4. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  5. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement standard lecture format provide a new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying.
Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  6. Displaying Annotations for Digitised Globes

    Science.gov (United States)

    Gede, Mátyás; Farbinger, Anna

    2018-05-01

    Thanks to the efforts of the various globe digitising projects, nowadays there are plenty of old globes that can be examined as 3D models on the computer screen. These globes usually contain a lot of interesting details that an average observer would not entirely discover at first viewing. The authors developed a website that can display annotations for such digitised globes. These annotations help observers of the globe to discover all the important, interesting details. Annotations consist of a plain text title, an HTML-formatted descriptive text and a corresponding polygon and are stored in KML format. The website is powered by the Cesium virtual globe engine.
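
    As a rough illustration of the annotation structure described in this abstract (plain text title, HTML description, polygon, stored as KML), the following sketch builds a single KML Placemark with Python's standard library. The function name, the sample text and the coordinates are hypothetical, and the authors' actual file layout is not given in the abstract, so this should be read only as one plausible encoding.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"

def annotation_to_kml(title, html_description, ring):
    """Return a KML document string holding one polygon annotation.

    ring is a list of (longitude, latitude) pairs; it is closed
    automatically if the last point differs from the first.
    """
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])

    ET.register_namespace("", KML_NS)  # serialize without a prefix
    kml = ET.Element(q("kml"))
    placemark = ET.SubElement(kml, q("Placemark"))
    ET.SubElement(placemark, q("name")).text = title
    # ElementTree escapes the markup; KML readers may render it as HTML.
    ET.SubElement(placemark, q("description")).text = html_description
    polygon = ET.SubElement(placemark, q("Polygon"))
    outer = ET.SubElement(polygon, q("outerBoundaryIs"))
    linear_ring = ET.SubElement(outer, q("LinearRing"))
    ET.SubElement(linear_ring, q("coordinates")).text = " ".join(
        f"{lon},{lat},0" for lon, lat in ring
    )
    return ET.tostring(kml, encoding="unicode")

if __name__ == "__main__":
    # Hypothetical annotation for a detail drawn on a globe.
    print(annotation_to_kml(
        "Sea monster",
        "<p>A <b>whale-like</b> figure drawn in the ocean.</p>",
        [(10.0, 20.0), (12.0, 20.0), (11.0, 22.0)],
    ))
```

    KML orders coordinates as longitude, latitude, altitude, which is easy to get backwards when the source data lists latitude first.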

  7. THE DIMENSIONS OF COMPOSITION ANNOTATION.

    Science.gov (United States)

    MCCOLLY, WILLIAM

    ENGLISH TEACHER ANNOTATIONS WERE STUDIED TO DETERMINE THE DIMENSIONS AND PROPERTIES OF THE ENTIRE SYSTEM FOR WRITING CORRECTIONS AND CRITICISMS ON COMPOSITIONS. FOUR SETS OF COMPOSITIONS WERE WRITTEN BY STUDENTS IN GRADES 9 THROUGH 13. TYPESCRIPTS OF THE COMPOSITIONS WERE ANNOTATED BY CLASSROOM ENGLISH TEACHERS. THEN, 32 ENGLISH TEACHERS JUDGED…

  8. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.

    Science.gov (United States)

    Arighi, Cecilia N; Carterette, Ben; Cohen, K Bretonnel; Krallinger, Martin; Wilbur, W John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy

    2013-01-01

    In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. The analysis of this survey highlights how important task completion is to the biocurators' overall experience of a system, regardless of the system's high score on design, learnability and

  9. Evaluation of three automated genome annotations for Halorhabdus utahensis.

    Directory of Open Access Journals (Sweden)

    Peter Bakke

    2009-07-01

    Full Text Available Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus ribosome-binding site. Additionally, we conducted laboratory experiments to test H. utahensis growth and enzyme activity. Current annotation practices need to improve in order to more accurately reflect a genome's biological potential. We make specific recommendations that could improve the quality of microbial annotation projects.

  10. MimoSA: a system for minimotif annotation

    Directory of Open Access Journals (Sweden)

    Kundeti Vamsi

    2010-06-01

    Full Text Available Abstract Background Minimotifs are short peptide sequences within one protein, which are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. There are reports of many minimotifs in the primary literature, which have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA enables the visualization, organization, selection and editing functions of minimotifs and their attributes in the MnM database. For the literature components, MimoSA provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach, which is based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high performance relational model of minimotif syntax. MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to

  11. Having Linguistic Rules and Knowing Linguistic Facts

    Directory of Open Access Journals (Sweden)

    Peter Ludlow

    2010-11-01

    Full Text Available

    'Knowledge' doesn't correctly describe our relation to linguistic rules. It is too thick a notion (for example, we don't believe linguistic rules). On the other hand, 'cognize', without further elaboration, is too thin a notion, which is to say that it is too thin to play a role in a competence theory. One advantage of the term 'knowledge' (and presumably Chomsky's original motivation for using it) is that knowledge would play the right kind of role in a competence theory: Our competence would consist in a body of knowledge which we have and which we may or may not act upon; our performance need not conform to the linguistic rules that we know.

    Is there a way out of the dilemma? I'm going to make the case that the best way to talk about grammatical rules is simply to say that we have them. That doesn't sound very deep, I know, but saying that we have individual rules leaves room for individual norm guidance in a way that 'cognize' does not. Saying we have a rule like subjacency is also thicker than merely saying we cognize it. Saying I have such a rule invites the interpretation that it is a rule for me, that is, that I am normatively guided by it. The competence theory thus becomes a theory of the rules that we have. Whether we follow those rules is another matter entirely.

  12. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease.

    Science.gov (United States)

    Sifrim, Alejandro; Van Houdt, Jeroen Kj; Tranchevent, Leon-Charles; Nowakowska, Beata; Sakai, Ryo; Pavlopoulos, Georgios A; Devriendt, Koen; Vermeesch, Joris R; Moreau, Yves; Aerts, Jan

    2012-01-01

    The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.

  13. Non-linguistic Conditions for Causativization as a Linguistic Attractor.

    Science.gov (United States)

    Nichols, Johanna

    2017-01-01

    An attractor, in complex systems theory, is any state that is more easily or more often entered or acquired than departed or lost; attractor states therefore accumulate more members than non-attractors, other things being equal. In the context of language evolution, linguistic attractors include sounds, forms, and grammatical structures that are prone to be selected when sociolinguistics and language contact make it possible for speakers to choose between competing forms. The reasons why an element is an attractor are linguistic (auditory salience, ease of processing, paradigm structure, etc.), but the factors that make selection possible and propagate selected items through the speech community are non-linguistic. This paper uses the consonants in personal pronouns to show what makes for an attractor and how selection and diffusion work, then presents a survey of several language families and areas showing that the derivational morphology of pairs of verbs like fear and frighten, or Turkish korkmak 'fear, be afraid' and korkutmak 'frighten, scare', or Finnish istua 'sit' and istutta 'seat (someone)', or Spanish sentarse 'sit down' and sentar 'seat (someone)' is susceptible to selection. Specifically, the Turkish and Finnish pattern, where 'seat' is derived from 'sit' by addition of a suffix, is an attractor and a favored target of selection. This selection occurs chiefly in sociolinguistic contexts of what is defined here as linguistic symbiosis, where languages mingle in speech, which in turn is favored by certain demographic, sociocultural, and environmental factors here termed frontier conditions. Evidence is surveyed from northern Eurasia, the Caucasus, North and Central America, and the Pacific and from both modern and ancient languages to raise the hypothesis that frontier conditions and symbiosis favor causativization.

  14. Annotated chemical patent corpus: a gold standard for text mining.

    Directory of Open Access Journals (Sweden)

    Saber A Akhondi

    Full Text Available Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.

  15. Georgetown University Round Table on Languages and Linguistics 2001. Linguistics, Language, and the Real World: Discourse and Beyond.

    Science.gov (United States)

    Tannen, Deborah, Ed.; Alatis, James E., Ed.

    This book contains papers from the 2001 Georgetown University Round Table on Languages and Linguistics, "Linguistics, Language, and the Real World: Discourse and Beyond." Papers include: "Introduction" (Deborah Tannen); "A Brief History of the Georgetown University Round Table on Languages and Linguistics" (James E.…

  16. Conversation Analysis and Applied Linguistics.

    Science.gov (United States)

    Schegloff, Emanuel A.; Koshik, Irene; Jacoby, Sally; Olsher, David

    2002-01-01

    Offers bibliographical guidance on several major areas of conversation-analytic work--turn-taking, repair, and word selection--and indicates past or potential points of contact with applied linguistics. Also discusses areas of applied linguistic work. (Author/VWL)

  17. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2017-11-09

    In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list of tags under a limited number (e.g., 3) that covers as much information about the image as possible. The tags in such a short list should be representative and diverse: they must not only correspond to the contents of the image but also differ from each other. To this end, we treat image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.

  18. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan; Jia, Fan; Liu, Wei; Ghanem, Bernard

    2017-01-01

    In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list of tags under a limited number (e.g., 3) that covers as much information about the image as possible. The tags in such a short list should be representative and diverse: they must not only correspond to the contents of the image but also differ from each other. To this end, we treat image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.
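
    The exclusion requirement described above (no two selected tags may be synonyms or share a semantic-hierarchy path) can be illustrated with a small sketch. This is not the conditional DPP sampling from the abstract; it is a plain greedy stand-in, and the function name, tag scores, and conflict pairs are all hypothetical.

```python
def pick_diverse_tags(scores, conflicts, k=3):
    """Greedily pick up to k tags, skipping tags that conflict with a chosen one.

    scores: dict mapping tag -> relevance score (hypothetical values).
    conflicts: set of frozenset({a, b}) pairs that are synonyms or lie on the
        same semantic-hierarchy path (hypothetical pairs).
    """
    chosen = []
    # Visit tags from most to least relevant; ties are broken alphabetically.
    for tag in sorted(scores, key=lambda t: (-scores[t], t)):
        if len(chosen) == k:
            break
        # Skip a tag that is redundant with something already selected.
        if any(frozenset((tag, c)) in conflicts for c in chosen):
            continue
        chosen.append(tag)
    return chosen


# Hypothetical candidate tags for one image.
scores = {"dog": 0.9, "puppy": 0.85, "animal": 0.8, "grass": 0.6, "ball": 0.5}
conflicts = {
    frozenset(("dog", "puppy")),
    frozenset(("dog", "animal")),
    frozenset(("puppy", "animal")),
}
print(pick_diverse_tags(scores, conflicts))
```

    A pure top-3 ranking would return three near-redundant tags here; the conflict check is what forces the short list to cover different content.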

  19. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  20. ANNOTATING INDIVIDUAL HUMAN GENOMES

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  1. Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.

    Science.gov (United States)

    Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of SWISS-PROT's. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to develop and apply computer methods to enhance the functional information attached to TREMBL entries.

  2. The GATO gene annotation tool for research laboratories

    Directory of Open Access Journals (Sweden)

    A. Fujita

    2005-11-01

    Full Text Available Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from anywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  3. Bats and wind energy: a literature synthesis and annotated bibliography

    Science.gov (United States)

    Ellison, Laura E.

    2012-01-01

    Turbines have been used to harness energy from wind for hundreds of years. However, with growing concerns about climate change, wind energy has only recently entered the mainstream of global electricity production. Since early on in the development of wind-energy production, concerns have arisen about the potential impacts of turbines to wildlife; these concerns have especially focused on the mortality of birds. Despite recent improvements to turbines that have resulted in reduced mortality of birds, there is clear evidence that bat mortality at wind turbines is of far greater conservation concern. Bats of certain species are dying by the thousands at turbines across North America, and the species consistently affected tend to be those that rely on trees as roosts and most migrate long distances. Turbine-related bat mortalities are now affecting nearly a quarter of all bat species occurring in the United States and Canada. Most documented bat mortality at wind-energy facilities has occurred in late summer and early fall and has involved tree bats, with hoary bats (Lasiurus cinereus) being the most prevalent among fatalities. This literature synthesis and annotated bibliography focuses on refereed journal publications and theses about bats and wind-energy development in North America (United States and Canada). Thirty-six publications and eight theses were found, and their key findings were summarized. These publications date from 1996 through 2011, with the bulk of publications appearing from 2007 to present, reflecting the relatively recent conservation concerns about bats and wind energy. The idea for this Open-File Report formed while organizing a joint U.S. Fish and Wildlife Service/U.S. Geological Survey "Bats and Wind Energy Workshop," on January 25-26, 2012. The purposes of the workshop were to develop a list of research priorities to support decision making concerning bats with respect to siting and operations of wind-energy facilities across the United

  4. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2010-09-14

    The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation. The annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".

  5. Lancaster Summer School in Corpus Linguistics

    Directory of Open Access Journals (Sweden)

    Jaka Čibej

    2016-11-01

    Full Text Available Between 12 and 15 July, the Lancaster Summer Schools in Corpus Linguistics and Other Digital Methods took place at Lancaster University. The summer school was organized by UCREL (University Centre for Computer Corpus Research on Language), ERC (European Research Council), CASS (ESRC Centre for Corpus Approaches to Social Science) and ESRC (Economic and Social Research Council), and was divided into six programmes tailored to different fields: Corpus Linguistics for Language Studies, Corpus Linguistics for Social Science, Corpus Linguistics for Humanities, Statistics for Corpus Linguistics, Geographical Information Systems for the Digital Humanities, and Corpus-based Natural Language Processing.

  6. Predicting panel scores by linguistic analysis

    Energy Technology Data Exchange (ETDEWEB)

    Van den Besselaar, P.; Stout, L.; Gou, X

    2016-07-01

    In this paper we explore the use of text analysis for deriving quality indicators of project proposals. We do full text analysis of 3030 review reports. After term extraction, we aggregate the term occurrences to linguistic categories. Using these linguistic categories as independent variables, we study how well these predict the grading by the review panels. Together, the different linguistic categories explain about 50% of the variance in the grading of the applications. The relative importance of the different linguistic categories informs us about the way the panels work. This can be used to develop altmetrics for the quality of the peer and panel review processes. (Author)

  7. English linguistic purism: history, development, criticism

    Directory of Open Access Journals (Sweden)

    Grishechko Ovsanna Savvichna

    2015-12-01

    Full Text Available Linguistic purism as an area of linguistic analysis describes the practices of identification and acknowledgement of a certain language variety as more structurally advanced than its other varieties. Linguistic protection is associated with preservation of some abstract, classical, conservative linguistic ideal and performs, above all, a regulatory function. The puristic approach to the development of the English language has been subjected to heated debate for several centuries and is reflected in both scientific research and literary texts. Supporters of purification of the English language champion the idea of protection of “pure language”. The idea, however, is actively criticized by opponents.

  8. Solar Tutorial and Annotation Resource (STAR)

    Science.gov (United States)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR” suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven

  9. Writing, Literacy, and Applied Linguistics.

    Science.gov (United States)

    Leki, Ilona

    2000-01-01

    Discusses writing and literacy in the domain of applied linguistics. Focus is on needs analysis for literacy acquisition; second language learner identity; longitudinal studies as extensions of identity work; and applied linguistics contributions to second language literacy research. (Author/VWL)

  10. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate

  11. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of the human genome. PMID:19594883
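
    The recall and precision figures quoted above follow the usual set-based definitions, which can be sketched as below. The gene and disease identifiers are hypothetical placeholders, not data from the study, and the helper name is an assumption made for illustration.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted annotations against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted annotations that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold annotations that were recovered.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (gene, disease) pairs standing in for a validation collection.
gold = {
    ("GENE_A", "disease_1"),
    ("GENE_A", "disease_2"),
    ("GENE_B", "disease_3"),
    ("GENE_C", "disease_4"),
}
predicted = {
    ("GENE_A", "disease_1"),
    ("GENE_B", "disease_3"),
    ("GENE_C", "disease_9"),
}
p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

    A source with few but reliable annotations scores high on precision and low on recall, which is the pattern the abstract reports for OMIM.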

  12. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Werner

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  13. Annotating non-coding regions of the genome.

    Science.gov (United States)

    Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B

    2010-08-01

    Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
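
    The sequence of steps described above (per-base signal, smoothing, segmentation into small blocks, grouping into larger annotations) can be sketched on a toy signal. This is a minimal illustration under stated assumptions, not the authors' pipeline: the moving-average smoother, the fixed threshold, the gap-based merge standing in for the clustering step, and all values are hypothetical.

```python
def smooth(signal, window=3):
    """Centered moving average; the window shrinks at the two ends."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def segment(signal, threshold):
    """Return half-open (start, end) blocks where signal >= threshold."""
    blocks, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            blocks.append((start, i))
            start = None
    if start is not None:
        blocks.append((start, len(signal)))
    return blocks


def merge(blocks, max_gap):
    """Join neighbouring blocks separated by at most max_gap positions."""
    merged = []
    for start, end in blocks:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Hypothetical per-base signal from a functional genomics experiment.
raw = [0, 0, 5, 6, 5, 0, 0, 4, 5, 4, 0, 0, 0, 0, 0, 6, 6, 0]
small_blocks = segment(smooth(raw), threshold=2.5)
large_blocks = merge(small_blocks, max_gap=2)
print(small_blocks, large_blocks)
```

    Real analyses would replace each stage with a method suited to the assay; the point here is only the order of operations.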

  14. Non-linguistic Conditions for Causativization as a Linguistic Attractor

    Directory of Open Access Journals (Sweden)

    Johanna Nichols

    2018-01-01

    Full Text Available An attractor, in complex systems theory, is any state that is more easily or more often entered or acquired than departed or lost; attractor states therefore accumulate more members than non-attractors, other things being equal. In the context of language evolution, linguistic attractors include sounds, forms, and grammatical structures that are prone to be selected when sociolinguistics and language contact make it possible for speakers to choose between competing forms. The reasons why an element is an attractor are linguistic (auditory salience, ease of processing, paradigm structure, etc.), but the factors that make selection possible and propagate selected items through the speech community are non-linguistic. This paper uses the consonants in personal pronouns to show what makes for an attractor and how selection and diffusion work, then presents a survey of several language families and areas showing that the derivational morphology of pairs of verbs like fear and frighten, or Turkish korkmak ‘fear, be afraid’ and korkutmak ‘frighten, scare’, or Finnish istua ‘sit’ and istutta ‘seat (someone)’, or Spanish sentarse ‘sit down’ and sentar ‘seat (someone)’ is susceptible to selection. Specifically, the Turkish and Finnish pattern, where ‘seat’ is derived from ‘sit’ by addition of a suffix, is an attractor and a favored target of selection. This selection occurs chiefly in sociolinguistic contexts of what is defined here as linguistic symbiosis, where languages mingle in speech, which in turn is favored by certain demographic, sociocultural, and environmental factors here termed frontier conditions. Evidence is surveyed from northern Eurasia, the Caucasus, North and Central America, and the Pacific and from both modern and ancient languages to raise the hypothesis that frontier conditions and symbiosis favor causativization.

  15. Stellenbosch Papers in Linguistics: Contact

    African Journals Online (AJOL)

    Mailing Address. Editors SPiL. Department of General Linguistics University of Stellenbosch Private Bag X1 Matieland, 7602. Stellenbosch South Africa. Principal Contact. Dr Kate Huddlestone Journal Manager Department of General Linguistics. University of Stellenbosch. Private Bag X1. Matieland, 7602. Stellenbosch.

  16. Systems Theory and Communication. Annotated Bibliography.

    Science.gov (United States)

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  17. The surplus value of semantic annotations

    NARCIS (Netherlands)

    Marx, M.

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,

  18. Linguistica matematica, statistica linguistica e linguistica applicata. Una nota storica sui lessici di frequenza e l'educazione linguistica (Mathematical Linguistics, Linguistic Statistics, and Applied Linguistics. An Historical Note on Word Frequencies and Linguistic Education)

    Science.gov (United States)

    Elia, Annibale

    1977-01-01

    This article traces the history of several themes in applied linguistics and shows the relationships between linguistic theory and the sciences concerned with the learning and teaching of languages. Interest in word frequency statistics is discussed in particular. (Text is in Italian.) (CFM)

  19. Annotation-based enrichment of Digital Objects using open-source frameworks

    Directory of Open Access Journals (Sweden)

    Marcus Emmanuel Barnes

    2017-07-01

    Full Text Available The W3C Web Annotation Data Model, Protocol, and Vocabulary unify approaches to annotations across the web, enabling their aggregation, discovery and persistence over time. In addition, new JavaScript libraries provide the ability for users to annotate multi-format content. In this paper, we describe how we have leveraged these developments to provide annotation features alongside Islandora’s existing preservation, access, and management capabilities. We also discuss our experience developing with the Web Annotation Model as an open web architecture standard, as well as our approach to integrating mature external annotation libraries. The resulting software (the Web Annotation Utility Module for Islandora) accommodates annotation across multiple formats. This solution can be used in various digital scholarship contexts.

  20. Preparing culturally and linguistically diverse preservice Early Childhood teachers for field experience

    Directory of Open Access Journals (Sweden)

    Melinda Miller

    2016-07-01

    This article reports on an action research project focussed on preparing culturally and linguistically diverse (CALD) preservice early childhood teachers for field experience. A series of targeted workshops delivered over one semester was designed to support the students to develop intercultural competence in relation to knowledge, attitude, skills and behaviours that contribute to success on field placement. Findings indicate that short-term initiatives targeted specifically to students’ identified needs and strengths can help to build intercultural competence for both students and teacher educators. For the participants, access to communication strategies, opportunities for rehearsal of teaching practice, and peer and academic support contributed to shifts in attitude, and the development of skills and new knowledge. New learnings for the teacher educators included challenging assumptions about CALD students’ sense of community and belonging in the university context.

  1. Genetic and linguistic coevolution in Northern Island Melanesia.

    Science.gov (United States)

    Hunley, Keith; Dunn, Michael; Lindström, Eva; Reesink, Ger; Terrill, Angela; Healy, Meghan E; Koki, George; Friedlaender, Françoise R; Friedlaender, Jonathan S

    2008-10-01

    Recent studies have detailed a remarkable degree of genetic and linguistic diversity in Northern Island Melanesia. Here we utilize that diversity to examine two models of genetic and linguistic coevolution. The first model predicts that genetic and linguistic correspondences formed following population splits and isolation at the time of early range expansions into the region. The second is analogous to the genetic model of isolation by distance, and it predicts that genetic and linguistic correspondences formed through continuing genetic and linguistic exchange between neighboring populations. We tested the predictions of the two models by comparing observed and simulated patterns of genetic variation, genetic and linguistic trees, and matrices of genetic, linguistic, and geographic distances. The data consist of 751 autosomal microsatellites and 108 structural linguistic features collected from 33 Northern Island Melanesian populations. The results of the tests indicate that linguistic and genetic exchange have erased any evidence of a splitting and isolation process that might have occurred early in the settlement history of the region. The correlation patterns are also inconsistent with the predictions of the isolation by distance coevolutionary process in the larger Northern Island Melanesian region, but there is strong evidence for the process in the rugged interior of the largest island in the region (New Britain). There we found some of the strongest recorded correlations between genetic, linguistic, and geographic distances. We also found that, throughout the region, linguistic features have generally been less likely to diffuse across population boundaries than genes. The results from our study, based on exceptionally fine-grained data, show that local genetic and linguistic exchange are likely to obscure evidence of the early history of a region, and that language barriers do not particularly hinder genetic exchange. In contrast, global patterns may

  2. Genetic and linguistic coevolution in Northern Island Melanesia.

    Directory of Open Access Journals (Sweden)

    Keith Hunley

    2008-10-01

    Recent studies have detailed a remarkable degree of genetic and linguistic diversity in Northern Island Melanesia. Here we utilize that diversity to examine two models of genetic and linguistic coevolution. The first model predicts that genetic and linguistic correspondences formed following population splits and isolation at the time of early range expansions into the region. The second is analogous to the genetic model of isolation by distance, and it predicts that genetic and linguistic correspondences formed through continuing genetic and linguistic exchange between neighboring populations. We tested the predictions of the two models by comparing observed and simulated patterns of genetic variation, genetic and linguistic trees, and matrices of genetic, linguistic, and geographic distances. The data consist of 751 autosomal microsatellites and 108 structural linguistic features collected from 33 Northern Island Melanesian populations. The results of the tests indicate that linguistic and genetic exchange have erased any evidence of a splitting and isolation process that might have occurred early in the settlement history of the region. The correlation patterns are also inconsistent with the predictions of the isolation by distance coevolutionary process in the larger Northern Island Melanesian region, but there is strong evidence for the process in the rugged interior of the largest island in the region (New Britain). There we found some of the strongest recorded correlations between genetic, linguistic, and geographic distances. We also found that, throughout the region, linguistic features have generally been less likely to diffuse across population boundaries than genes. The results from our study, based on exceptionally fine-grained data, show that local genetic and linguistic exchange are likely to obscure evidence of the early history of a region, and that language barriers do not particularly hinder genetic exchange. In contrast

  3. 77 FR 31371 - Public Workshop: Privacy Compliance Workshop

    Science.gov (United States)

    2012-05-25

    ... presentations, including the privacy compliance fundamentals, privacy and data security, and the privacy... DEPARTMENT OF HOMELAND SECURITY Office of the Secretary Public Workshop: Privacy Compliance... Homeland Security Privacy Office will host a public workshop, ``Privacy Compliance Workshop.'' DATES: The...

  4. Literacy in Somali: Linguistic Consequences.

    Science.gov (United States)

    Biber, Douglas; Hared, Mohamed

    1991-01-01

    Linguistic consequences of literacy in Somalia are examined in a review of the literature and through a study of five dimensions of variation among Somali registers and the expansion of linguistic variation in Somali resulting from the introduction of written registers. (36 references) (LB)

  5. Functional categories in comparative linguistics

    DEFF Research Database (Denmark)

    Rijkhoff, Jan

    , Roger M. 1979. Linguistic knowledge and cultural knowledge: some doubts and speculation. American Anthropologist 81-1, 14-36. Levinson, Stephen C. 1997. From outer to inner space: linguistic categories and non-linguistic thinking. In J. Nuyts and E. Pederson (eds.), Language and Conceptualization, 13......). Furthermore certain ‘ontological categories’ are language-specific (Malt 1995). For example, speakers of Kalam (New Guinea) do not classify the cassowary as a bird, because they believe it has a mythical kinship relation with humans (Bulmer 1967).       In this talk I will discuss the role of functional...

  6. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation, the process of transposing objects or events represented in a video or still image to the semantic level, may involve human interactions and computer-assisted solutions. Marine image annotation software (MIAS) have enabled over 500 publications to date. We review the functioning, application trends and developments, by comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input are basically a graphical user interface, with a video player or image browser that recognizes a specific time code or image code, allowing events to be logged in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software by the capability of integrating data associated to video collection, the simplest being the position coordinates of the video recording platform. MIAS have three main characteristics: annotating events in real time, annotating posteriorly, and interacting with a database. These range from simple annotation interfaces, to full onboard data management systems, with a variety of toolboxes. Advanced packages allow the input and display of data from multiple sensors or multiple annotators via intranet or internet. Posterior human-mediated annotation often includes tools for data display and image analysis, e.g. length, area, image segmentation, point count; and in a few cases the possibility of browsing and editing previous dive logs or analyzing the annotations. The interaction with a database allows the automatic integration of annotations from different surveys, repeated annotation and collaborative annotation of shared datasets, browsing and querying of data. Progress in the field of automated annotation is mostly in post processing, for stable platforms or still images

  7. PANNZER2: a rapid functional annotation web server.

    Science.gov (United States)

    Törönen, Petri; Medlar, Alan; Holm, Liisa

    2018-05-08

    The unprecedented growth of high-throughput sequencing has led to an ever-widening annotation gap in protein databases. While computational prediction methods are available to make up the shortfall, a majority of public web servers are hindered by practical limitations and poor performance. Here, we introduce PANNZER2 (Protein ANNotation with Z-scoRE), a fast functional annotation web server that provides both Gene Ontology (GO) annotations and free text description predictions. PANNZER2 uses SANSparallel to perform high-performance homology searches, making bulk annotation based on sequence similarity practical. PANNZER2 can output GO annotations from multiple scoring functions, enabling users to see which predictions are robust across predictors. Finally, PANNZER2 predictions scored within the top 10 methods for molecular function and biological process in the CAFA2 NK-full benchmark. The PANNZER2 web server is updated on a monthly schedule and is accessible at http://ekhidna2.biocenter.helsinki.fi/sanspanz/. The source code is available under the GNU Public Licence v3.

  8. Machine Learning and Applied Linguistics

    OpenAIRE

    Vajjala, Sowmya

    2018-01-01

    This entry introduces the topic of machine learning and provides an overview of its relevance for applied linguistics and language learning. The discussion will focus on giving an introduction to the methods and applications of machine learning in applied linguistics, and will provide references for further study.

  9. Ontological problems of contemporary linguistics

    Directory of Open Access Journals (Sweden)

    А В Бондаренко

    2009-03-01

    Full Text Available The article studies linguistic ontology problems such as evolution of essential-existential views of language, interrelation within Being-Language-Man triad, linguistics gnosiological principles, language essence localization, and «expression» as language metalinguistic unit as well as architectonics of language personality et alia.

  10. Concise Lexicon for Sign Linguistics

    NARCIS (Netherlands)

    dr. Jan Nijen Twilhaar; Dr. Beppie van den Bogaerde

    2016-01-01

    This extensive, well-researched and clearly formatted lexicon of a wide variety of linguistic terms is long overdue. It is an extremely welcome addition to the bookshelves of sign language teachers, interpreters, linguists, learners and other sign language users, and of course of the Deaf

  11. Heritage language and linguistic theory

    Science.gov (United States)

    Scontras, Gregory; Fuchs, Zuzanna; Polinsky, Maria

    2015-01-01

    This paper discusses a common reality in many cases of multilingualism: heritage speakers, or unbalanced bilinguals, simultaneous or sequential, who shifted early in childhood from one language (their heritage language) to their dominant language (the language of their speech community). To demonstrate the relevance of heritage linguistics to the study of linguistic competence more broadly defined, we present a series of case studies on heritage linguistics, documenting some of the deficits and abilities typical of heritage speakers, together with the broader theoretical questions they inform. We consider the reorganization of morphosyntactic feature systems, the reanalysis of atypical argument structure, the attrition of the syntax of relativization, and the simplification of scope interpretations; these phenomena implicate diverging trajectories and outcomes in the development of heritage speakers. The case studies also have practical and methodological implications for the study of multilingualism. We conclude by discussing more general concepts central to linguistic inquiry, in particular, complexity and native speaker competence. PMID:26500595

  12. Workshops som forskningsmetode (Workshops as a Research Method)

    OpenAIRE

    Ørngreen, Rikke; Levinsen, Karin Tweddell

    2017-01-01

    This paper contributes to knowledge on workshops as a research methodology, and specifically on how such workshops pertain to e-learning. A literature review illustrated that workshops are discussed according to three different perspectives: workshops as a means, workshops as practice, and workshops as a research methodology. Focusing primarily on the latter, this paper presents five studies on upper secondary and higher education teachers’ professional development and on teaching and learnin...

  13. The Extension of Quality Function Deployment Based on 2-Tuple Linguistic Representation Model for Product Design under Multigranularity Linguistic Environment

    Directory of Open Access Journals (Sweden)

    Ming Li

    2012-01-01

    Quality function deployment (QFD) is a customer-driven approach for product design and development. A QFD analysis process includes a series of subprocesses, such as determination of the importance of customer requirements (CRs), the correlation among engineering characteristics (ECs), and the relationship between CRs and ECs. Usually more than one group of decision makers is involved in the subprocesses to make the decision. In most decision making problems, they often provide their evaluation information in the linguistic form. Moreover, because of different knowledge, background, and discrimination ability, decision makers may express their linguistic preferences in multigranularity linguistic information. Therefore, an effective approach to deal with the multi-granularity linguistic information in QFD analysis process is highly needed. In this study, the QFD methodology is extended with 2-tuple linguistic representation model under multi-granularity linguistic environment. The extended QFD methodology can cope with multi-granularity linguistic evaluation information and avoid the loss of information. The applicability of the proposed approach is demonstrated with a numerical example.
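    The 2-tuple linguistic representation named in this abstract can be illustrated with a minimal sketch. It assumes the standard formulation in which a value beta on a term scale s_0..s_g is written as a pair (s_i, alpha), with i the index nearest to beta and alpha = beta - i. The function names and the three-term scale are hypothetical, and the multi-granularity unification step described in the record is not covered.

```python
import math

def to_two_tuple(beta, terms):
    """Convert a numeric value beta in [0, g] into a 2-tuple (term, alpha).

    terms is the ordered linguistic term set s_0..s_g (hypothetical labels).
    alpha is the symbolic translation, in [-0.5, 0.5).
    """
    g = len(terms) - 1
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    # Round half up; Python's built-in round() rounds half to even instead.
    i = int(math.floor(beta + 0.5))
    return terms[i], beta - i

def from_two_tuple(term, alpha, terms):
    """Inverse mapping: recover the numeric value beta from (term, alpha)."""
    return terms.index(term) + alpha

def two_tuple_mean(pairs, terms):
    """Arithmetic mean of several 2-tuples, returned again as a 2-tuple."""
    beta = sum(from_two_tuple(t, a, terms) for t, a in pairs) / len(pairs)
    return to_two_tuple(beta, terms)

if __name__ == "__main__":
    scale = ["low", "medium", "high"]  # illustrative term set, not from the paper
    print(to_two_tuple(1.25, scale))   # ('medium', 0.25)
    print(two_tuple_mean([("low", 0.0), ("high", 0.0), ("high", 0.0)], scale))
```

    Because alpha keeps the remainder that rounding to a label would discard, aggregating evaluations in this form does not lose information, which is the property the abstract refers to.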

  14. New Conceptualizations of Linguistic Giftedness

    Science.gov (United States)

    Biedron, Adriana; Pawlak, Miroslaw

    2016-01-01

    This state-of-the-art paper focuses on the issue of linguistic giftedness, somewhat neglected in the second language acquisition (SLA) literature, attempting to reconceptualize, expand and update this concept in response to latest developments in the fields of psychology, linguistics and neurology. It first discusses contemporary perspectives on…

  15. Critical and Alternative Directions in Applied Linguistics

    Science.gov (United States)

    Pennycook, Alastair

    2010-01-01

    Critical directions in applied linguistics can be understood in various ways. The term "critical" as it has been used in "critical applied linguistics," "critical discourse analysis," "critical literacy" and so forth, is now embedded as part of applied linguistic work, adding an overt focus on questions of power and inequality to discourse…

  16. MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences.

    Science.gov (United States)

    Zhidkov, Ilia; Nagar, Tal; Mishmar, Dan; Rubin, Eitan

    2011-11-01

    The use of Next-Generation Sequencing of mitochondrial DNA is becoming widespread in biological and clinical research. This, in turn, creates a need for a convenient tool that detects and analyzes heteroplasmy. Here we present MitoBamAnnotator, a user friendly web-based tool that allows maximum flexibility and control in heteroplasmy research. MitoBamAnnotator provides the user with a comprehensively annotated overview of mitochondrial genetic variation, allowing for an in-depth analysis with no prior knowledge in programming. Copyright © 2011 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

  17. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Directory of Open Access Journals (Sweden)

    Bert Ely

    Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%.

  18. On Redundancy in Describing Linguistic Systems

    Directory of Open Access Journals (Sweden)

    Vladimir Borissov Pericliev

    2015-12-01

    The notion of a system of linguistic elements figures prominently in most post-Saussurian linguistics up to the present. A “system” is the network of the contrastive (or distinctive) features each element in the system bears to the remaining elements. The meaning (valeur) of each element in the system is the set of features that are necessary and jointly sufficient to distinguish this element from all others. The paper addresses the problems of “redundancy”, i.e. the occurrence of features that are not strictly necessary in describing an element in a system. Redundancy is shown to creep into the description of linguistic systems, an infelicitous practice illustrated with some examples from the literature (e.g. the classical phonemic analysis of Russian by Cherry, Halle, and Jakobson, 1953). The logic and psychology of the occurrence of redundancy are briefly sketched and it is shown that, in addition to some other problems, redundancy leads to a huge and unresolvable ambiguity of descriptions of linguistic systems (the Buridan’s ass problem).

  19. Linguistic Theory and Actual Language.

    Science.gov (United States)

    Segerdahl, Par

    1995-01-01

    Examines Noam Chomsky's (1957) discussion of "grammaticalness" and the role of linguistics in the "correct" way of speaking and writing. It is argued that the concern of linguistics with the tools of grammar has resulted in confusion, with the tools becoming mixed up with the actual language, thereby becoming the central…

  20. The Perilous Life of a Linguistic Genre Convention

    DEFF Research Database (Denmark)

    Borchmann, Simon

    2014-01-01

    , the descriptions are more informative than the structures hitherto described by text linguistics. Secondly, as historical norms, they are a testimony to the development and change of language use. Thirdly, the descriptions contribute to language users’ awareness of the origin of standards, their understanding......The primary, theoretical aim of the article is to present a linguistic text analysis that differs from standard text linguistic approaches by being informative with regard to the linguistic choices and textual organisation that characterise a text as a social act. The analysis is exemplified...... by using texts of a relatively new Danish journalistic genre nyhedsanalyse (news analysis). The secondary, empirical aim of the article is to present a corpus-based, linguistic analysis of central elements of the genre nyhedsanalyse within the Danish system of newspaper genres. Text linguistics is based...

  1. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods...
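    Inter-encoder agreement of the kind mentioned in this abstract is often summarized with a chance-corrected coefficient. The sketch below computes Cohen's kappa for two annotators. This is an assumption for illustration, since the record does not say which agreement measure the authors used, and the sense labels are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected from each annotator's label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)

if __name__ == "__main__":
    # Hypothetical sense labels for four occurrences of a dot-type noun.
    annotator_1 = ["literal", "metonymic", "literal", "metonymic"]
    annotator_2 = ["literal", "metonymic", "metonymic", "metonymic"]
    print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```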

  2. PCAS – a precomputed proteome annotation database resource

    Directory of Open Access Journals (Sweden)

    Luo Jingchu

    2003-11-01

    Background: Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif or domain based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources. Results: We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptide and TM are predicted using SignalP and TMHMM respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains human IPI, mouse IPI, and rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteome. PCAS is available at http://pak.cbi.pku.edu.cn/proteome/gca.php. Conclusion: PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized query so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year. We will upgrade PCAS when new proteome annotation algorithms

  3. HTTR workshop (workshop on hydrogen production technology)

    International Nuclear Information System (INIS)

    Shiina, Yasuaki; Takizuka, Takakazu

    2004-12-01

    Various research and development efforts have been performed to solve the global energy and environmental problems caused by large consumption of fossil fuels. Research activities on advanced hydrogen production technology by the use of nuclear heat from high temperature gas cooled reactors, for example, have flourished in universities, research institutes and companies in many countries. The Department of HTTR Project and the Department of Advanced Nuclear Heat Technology of JAERI held the HTTR Workshop (Workshop on Hydrogen Production Technology) on July 5 and 6, 2004 to grasp the present status of R and D about the technology of HTGR and the nuclear hydrogen production in the world and to discuss the necessity of nuclear hydrogen production and technical problems for the future development of the technology. More than 110 participants attended the Workshop including foreign participants from USA, France, Korea, Germany, Canada and United Kingdom. In the Workshop, the presentations were made on such topics as R and D programs for nuclear energy and hydrogen production technologies by thermo-chemical or other processes. Also, the possibility of the nuclear hydrogen production in the future society was discussed. The workshop showed that the R and D for the hydrogen production by the thermo-chemical process has been performed in many countries. The workshop affirmed that nuclear hydrogen production could be one of the competitive suppliers of hydrogen in the future. The second HTTR Workshop will be held in the autumn next year. (author)

  4. Language Works. Linguistic Journal

    DEFF Research Database (Denmark)

    Hartling, Anna Sofie; Nørreby, Thomas Rørbeck; Skovse, Astrid Ravn

    2016-01-01

    Language works! – and with this initiative and this journal we want to give the opportunity to many more students to present their linguistic research to each other, to the scientific community and to all interested.

  5. A semi-automatic annotation tool for cooking video

    Science.gov (United States)

    Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe

    2013-03-01

    In order to create a cooking assistant application to guide the users in the preparation of the dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods of the cook. These videos present particular annotation challenges such as frequent occlusions, food appearance changes, etc. Manually annotating the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.

  6. Experiments with crowdsourced re-annotation of a POS tagging data set

    DEFF Research Database (Denmark)

    Hovy, Dirk; Plank, Barbara; Søgaard, Anders

    2014-01-01

    Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, these are annotations for non-sequential classification tasks. While there has been some work on crowdsourcing named entity annotations, researchers have assumed that syntactic tasks such as part-of-speech (POS) tagging cannot be crowdsourced. This paper shows that workers can actually annotate sequential data almost as well as experts. Further, we show that the models learned from crowdsourced annotations fare as well as the models learned from expert annotations in downstream tasks.
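    Collecting several annotations per item raises the question of how to merge them into one label sequence. A minimal sketch of per-token majority voting follows. This is a common baseline and an assumption here, since the abstract does not state which aggregation method the authors applied. The function name, tag set and tie-breaking rule are hypothetical.

```python
from collections import Counter

def majority_vote(annotations):
    """Merge several annotators' tag sequences into one by per-token majority.

    annotations is a list of tag lists, one per annotator, all covering the
    same tokens. Ties are broken alphabetically so the result is deterministic.
    """
    if not annotations:
        raise ValueError("need at least one annotation sequence")
    length = len(annotations[0])
    if any(len(tags) != length for tags in annotations):
        raise ValueError("all annotation sequences must have the same length")
    merged = []
    for token_tags in zip(*annotations):
        counts = Counter(token_tags)
        top = max(counts.values())
        merged.append(min(tag for tag, count in counts.items() if count == top))
    return merged

if __name__ == "__main__":
    # Three hypothetical crowd workers tagging a three-token sentence.
    workers = [
        ["DET", "NOUN", "VERB"],
        ["DET", "VERB", "VERB"],
        ["DET", "NOUN", "NOUN"],
    ]
    print(majority_vote(workers))  # ['DET', 'NOUN', 'VERB']
```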

  7. MPEG-7 based video annotation and browsing

    Science.gov (United States)

    Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens

    2003-11-01

    The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user-interface in order to provide content-based access to the video stream, but further for media browsing on a streaming server.

  8. Collective Variables in Applied Linguistics Research

    OpenAIRE

    ヘンスリー, ジョール; HENSLEY, Joel

    2011-01-01

    This paper focuses on the key dynamic(al) systems theory concept of collective variables as it relates to developmental research in applied linguistics. Dynamic(al) systems theory is becoming prevalent in linguistic research and in the past two decades has jumped to the forefront of cutting edge in the field. One key concept in dynamic(al) systems theory is that of collective variables. In order to help properly orient this concept in the field of applied linguistics, this paper discusses the ...

  9. Linguistics and the TEFL Teacher.

    Science.gov (United States)

    Fraser, Bruce

    This paper asserts the "unquestionable" relevance of linguistic insights in the training of and subsequent use by teachers of English as a foreign language. Although the author agrees with Chomsky's view that linguistics has nothing to offer the teacher in the form of specific proposals for language teaching methodology, he argues that linguistics…

  10. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...

  11. Propagating annotations of molecular networks using in silico fragmentation.

    Science.gov (United States)

    da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C

    2018-04-18

    The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.

  12. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool tolerates sequence errors and retains its capabilities when annotating highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained from metagenomic samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  13. Annotation of the Evaluative Language in a Dependency Treebank

    Directory of Open Access Journals (Sweden)

    Šindlerová Jana

    2017-12-01

    Full Text Available In the paper, we present our efforts to annotate evaluative language in the Prague Dependency Treebank 2.0. The project is a follow-up of the series of annotations of small plaintext corpora. It uses automatic identification of potentially evaluative nodes through mapping a Czech subjectivity lexicon to syntactically annotated data. These nodes are then manually checked by an annotator and either dismissed as standing in a non-evaluative context, or confirmed as evaluative. In the latter case, information about the polarity orientation, the source and target of evaluation is added by the annotator. The annotations unveiled several advantages and disadvantages of the chosen framework. The advantages involve a more structured and easy-to-handle environment for the annotator, visibility of the syntactic patterning of the evaluative state, effective solving of discontinuous structures, and a new perspective on the influence of good/bad news. The disadvantages include little capability of treating cases with evaluation spread among more syntactically connected nodes at once, little capability of treating metaphorical expressions, and disregard for the effects of negation and intensification in the current scheme.

  14. Linguistics and the Literary Text.

    Science.gov (United States)

    Ferrar, Madeleine

    1984-01-01

    Discusses the opposing viewpoints of the two most influential linguists of this century--Saussure and Chomsky--suggesting that while both are interested in form as opposed to substance, Saussure sees linguistics as a branch of semiotics and Chomsky sees it as part of cognitive psychology. Evaluates the relevance of these two viewpoints to the…

  15. The caBIG annotation and image Markup project.

    Science.gov (United States)

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Sepukar, Kastubh; Rubin, Daniel L

    2010-04-01

    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of meta-data about whom, where, and how the image was acquired, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotation to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with both of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.

  16. Interoperable Multimedia Annotation and Retrieval for the Tourism Sector

    NARCIS (Netherlands)

    Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.

    2015-01-01

    The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies.

  17. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

    Science.gov (United States)

    Cohen, K Bretonnel; Lanfranchi, Arrick; Choi, Miji Joo-Young; Bada, Michael; Baumgartner, William A; Panteleyeva, Natalya; Verspoor, Karin; Palmer, Martha; Hunter, Lawrence E

    2017-08-17

    Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. The corpus was manually annotated with coreference relations, including identity and appositives for all coreferring base noun phrases. The OntoNotes annotation guidelines, with minor adaptations, were used. Interannotator agreement ranges from 0.480 (entity-based CEAF) to 0.858 (Class-B3), depending on the metric that is used to assess it. The resulting corpus adds nearly 30,000 annotations to the previous release of the CRAFT corpus. Differences from related projects include a much broader definition of markables, connection to extensive annotation of several domain-relevant semantic classes, and connection to complete syntactic annotation. Tool performance was benchmarked on the data. A publicly available out-of-the-box, general-domain coreference resolution system achieved an F-measure of 0.14 (B3), while a simple domain-adapted rule-based system achieved an F-measure of 0.42. An ensemble of the two reached F of 0.46. Following the IDENTITY chains in the data would add 106,263 additional named entities in the full 97-paper corpus, for an increase of 76% in the semantic classes of the eight ontologies that have been annotated in earlier versions of the CRAFT corpus. The project produced a large data set for further investigation of coreference and coreference resolution in the scientific literature. The work raised issues in the phenomenon of reference in this domain and genre, and the paper proposes that many mentions that would be considered generic in the general domain are not

  18. A Novel Approach to Semantic and Coreference Annotation at LLNL

    Energy Technology Data Exchange (ETDEWEB)

    Firpo, M

    2005-02-04

    A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (6) A pilot study is recommended in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work is evaluated relative to that of their neuro-typical peers.

  19. Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae

    Directory of Open Access Journals (Sweden)

    Deng Jixin

    2009-02-01

    Full Text Available Background: Magnaporthe oryzae, the causal agent of blast disease of rice, is the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6 http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html. However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of M. oryzae genome assembly. Methods: A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results: In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins

  20. Clinical linguistics: its past, present and future.

    Science.gov (United States)

    Perkins, Michael R

    2011-11-01

    Historiography is a growing area of research within the discipline of linguistics, but so far the subfield of clinical linguistics has received virtually no systematic attention. This article attempts to rectify this by tracing the development of the discipline from its pre-scientific days up to the present time. As part of this, I include the results of a survey of articles published in Clinical Linguistics & Phonetics between 1987 and 2008 which shows, for example, a consistent primary focus on phonetics and phonology at the expense of grammar, semantics and pragmatics. I also trace the gradual broadening of the discipline from its roots in structural linguistics to its current reciprocal relationship with speech and language pathology and a range of other academic disciplines. Finally, I consider the scope of clinical linguistic research in 2011 and assess how the discipline seems likely to develop in the future.

  1. Combined evidence annotation of transposable elements in genome sequences.

    Directory of Open Access Journals (Sweden)

    Hadi Quesneville

    2005-07-01

    Full Text Available Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and from highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other

  2. NoGOA: predicting noisy GO annotations using evidences and sparse representation.

    Science.gov (United States)

    Yu, Guoxian; Lu, Chang; Wang, Jun

    2017-07-21

    Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Owing to limited resources, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that there are still considerable numbers of noisy (or incorrect) annotations. Given the wide application of annotations, identifying noisy annotations is an important but seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. First, NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Second, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived in different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA.

  3. Plann: A command-line application for annotating plastome sequences.

    Science.gov (United States)

    Huang, Daisie I; Cronk, Quentin C B

    2015-08-01

    Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
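
    The interval-shifting step described in this abstract can be illustrated with a minimal Python sketch. It is not Plann's code (Plann is a Perl script), and it assumes, purely for illustration, that a feature "matches" when its reference sequence occurs verbatim in the new plastome; the names Feature and transfer_features are hypothetical.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """A named interval on a sequence (0-based start, exclusive end)."""
    name: str
    start: int
    end: int


def transfer_features(ref_seq: str, ref_features: List[Feature],
                      new_seq: str) -> List[Feature]:
    """Shift each reference feature to where its sequence occurs in new_seq.

    Assumption: exact substring search stands in for whatever sequence
    comparison the real tool performs. Unmatched features are skipped.
    """
    transferred = []
    for feature in ref_features:
        fragment = ref_seq[feature.start:feature.end]
        if not fragment:
            continue  # ignore empty or out-of-range intervals
        pos = new_seq.find(fragment)
        if pos != -1:
            transferred.append(Feature(feature.name, pos, pos + len(fragment)))
    return transferred
```

    With this sketch, a feature at positions 2 to 8 of a reference moves to positions 4 to 10 when the new sequence carries two extra leading bases.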

  4. Applied linguistics - a science of culture?

    Directory of Open Access Journals (Sweden)

    Benke, Gertraud

    2003-01-01

    Full Text Available In this article, the status of applied linguistics as a discipline is questioned and problems of establishing it - and other newly formed scientific enterprises like cultural science - as disciplines are discussed. This discussion is contextualized using the author's own experience as an applied linguist working in (the institutional structure of) Austria. Secondly, applied linguistics is presented as complementing cultural science, with both exploring at times the same phenomena albeit under different perspectives and focussing on different levels of experience. Two examples of research involving such a joint interest with different foci are discussed.

  5. Linguistic fire and human cognitive powers

    DEFF Research Database (Denmark)

    Cowley, Stephen

    2012-01-01

    To view language as a cultural tool challenges much of what claims to be linguistic science while opening up a new people-centred linguistics. On this view, how we speak, think and act depends on, not just brains (or minds), but also cultural traditions. Yet, Everett is conservative: like others...... theory, bodily dynamics themselves act as cues to meaning. Linguistic exostructures resemble tools that constrain how people concert acting-perceiving bodies. The result is unending renewal of verbal structures: like artefacts and institutions, they function to sustain a species-specific cultural ecology...

  6. Semantator: annotating clinical narratives with semantic web ontologies.

    Science.gov (United States)

    Song, Dezhao; Chute, Christopher G; Tao, Cui

    2012-01-01

    To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manually annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfactory. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.

  7. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  8. Ethical Issues in Corpus Linguistics And Annotation: Pay Per Hit Does Not Affect Effective Hourly Rate For Linguistic Resource Development On Amazon Mechanical Turk.

    Science.gov (United States)

    Cohen, K Bretonnel; Fort, Karën; Adda, Gilles; Zhou, Sophia; Farri, Dimeji

    2016-05-01

    Ethical issues reported with paid crowdsourcing include unfairly low wages. It is assumed that such issues are under the control of the task requester. Can one control the amount that a worker earns by controlling the amount that one pays? 412 linguistic data development tasks were submitted to Amazon Mechanical Turk. The pay per HIT was manipulated through a range of values. We examined the relationship between the pay that is offered per HIT and the effective pay rate. There is no such relationship. Paying more per HIT does not cause workers to earn more: the higher the pay per HIT, the more time workers spend on them (R = 0.92). So, the effective hourly rate stays roughly the same. The finding has clear implications for language resource builders who want to behave ethically: other means must be found in order to compensate workers fairly. The findings of this paper should not be taken as an endorsement of unfairly low pay rates for crowdsourcing workers. Rather, the intention is to point out that additional measures, such as pre-calculating and communicating to the workers an average hourly, rather than per-task, rate must be found in order to ensure an ethical rate of pay.
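
    The arithmetic behind this finding can be shown with a short Python sketch. The function name and the pay and timing figures below are hypothetical illustrations, not data from the study; they only show why an hourly rate stays flat when time per HIT grows in proportion to pay per HIT.

```python
def effective_hourly_rate(pay_per_hit_usd: float, seconds_per_hit: float) -> float:
    """Return dollars earned per hour for a given pay and completion time per HIT."""
    if seconds_per_hit <= 0:
        raise ValueError("seconds_per_hit must be positive")
    return pay_per_hit_usd * 3600.0 / seconds_per_hit


# Hypothetical (pay in USD, seconds spent) pairs, NOT taken from the paper:
# each doubling of pay is matched by a doubling of time spent.
examples = [(0.05, 60), (0.10, 120), (0.20, 240)]
rates = [effective_hourly_rate(pay, secs) for pay, secs in examples]
```

    Under these assumed figures every entry of rates works out to about 3 dollars per hour, which is the pattern the abstract reports: raising the per-task pay alone does not raise the hourly wage.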

  9. Relations between Formal Linguistic Insecurity and the Perception of Linguistic Insecurity: A Quantitative Study in an Educational Environment at the Valencian Community (Spain)

    Science.gov (United States)

    Baldaqui Escandell, Josep M.

    2011-01-01

    What is the relationship between the awareness of linguistic prestige and the security or insecurity in the use of minoritised languages? Is formal linguistic insecurity (as initially described by Labov) the same as the speakers' perception of linguistic insecurity? Which are the variables related to the various types of linguistic insecurity in…

  10. Ontology modularization to improve semantic medical image annotation.

    Science.gov (United States)

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time-consuming and labor-intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.

  11. Psychological and Linguistic Portrait of Criminals. Introduction to Discussion

    Directory of Open Access Journals (Sweden)

    Jadwiga Stawnicka

    2017-01-01

    Full Text Available The article addresses one aspect of forensic linguistics: determining the authorship of questioned statements. This is done in collaboration between a linguist, who creates a linguistic profile of the author, and a psychologist, who creates a psychological profile. The cooperation of these specialists can be used in assessments prepared for investigations and legal proceedings. Expertise in the field of forensic linguistics (German: Forensische Linguistik) includes determining the author or performer of an utterance based on its spoken or written content (e.g., farewell letters, threatening letters, ransom demands); determining the authorship of anonymous texts on the Internet; determining the linguistic characteristics of stalkers and cyberstalkers that can identify the sender of a message; identifying the sender's country of origin; constructing a linguistic profile of an anonymous author; and constructing the linguistic profile of the author of a known text. It should be added that the content-and-language analysis of a document includes an emotional component, which draws on our knowledge of the linguistic means of expressing emotions, both negative and positive. An important element of the analysis is the psychological portrait of the sender (author and/or performer) of the text, based on the identified linguistic features.

  12. [Prescription annotations in Welfare Pharmacy].

    Science.gov (United States)

    Han, Yi

    2018-03-01

    Welfare Pharmacy contains medical formulas documented by the government and official prescriptions used by the official pharmacy in the pharmaceutical process. In the last years of the Southern Song Dynasty, anonymous authors added many prescription annotations, carried out textual research on the name, source, composition and origin of the prescriptions, and supplemented important historical data on medical cases and researched historical facts. The annotations of Welfare Pharmacy gathered the essence of medical theory and can be used as precious materials for correctly understanding the syndrome differentiation, compatibility regularity and clinical application of prescriptions. This article investigated in depth the style and form of the prescription annotations in Welfare Pharmacy; the names of prescriptions and the evolution of terminology; the major functions of the prescriptions; processing methods; instructions for taking medicine and taboos of prescriptions; the medical cases and clinical efficacy of prescriptions; and the backgrounds, sources, composition and cultural meanings of prescriptions. It proposed that the prescription annotations played an active role in the textual dissemination, patent medicine production and clinical diagnosis and treatment of Welfare Pharmacy. This not only helps in understanding the changes in the names and terms of traditional Chinese medicines in Welfare Pharmacy, but also provides a basis for understanding the knowledge sources, compatibility regularity, important drug innovations and clinical medications of prescriptions in Welfare Pharmacy. Copyright© by the Chinese Pharmaceutical Association.

  13. A framework for annotating human genome in disease context.

    Science.gov (United States)

    Xu, Wei; Wang, Huisong; Cheng, Wenqing; Fu, Dong; Xia, Tian; Kibbe, Warren A; Lin, Simon M

    2012-01-01

    Identification of gene-disease association is crucial to understanding disease mechanism. A rapid increase in biomedical literature, led by advances in genome-scale technologies, poses a challenge for manually curated annotation databases to characterize gene-disease associations effectively and in a timely manner. We propose an automatic method, the Disease Ontology Annotation Framework (DOAF), to provide a comprehensive annotation of the human genome using the computable Disease Ontology (DO), the NCBO Annotator service and NCBI Gene Reference Into Function (GeneRIF). DOAF can keep the resulting knowledgebase current by periodically executing an automatic pipeline to re-annotate the human genome using the latest DO and GeneRIF releases at any frequency, such as daily or monthly. Further, DOAF provides a computable and programmable environment which enables large-scale and integrative analysis by working with external analytic software or online service platforms. A user-friendly web interface (doa.nubic.northwestern.edu) is implemented to allow users to efficiently query, download, and view disease annotations and the underlying evidence.

  14. What can literature do for linguistics?

    DEFF Research Database (Denmark)

    Nørgaard, Nina

    2007-01-01

      Through analyses of selected passages from James Joyce's Ulysses, this article demonstrates how the challenging of the boundaries between linguistics and literary studies can be more than a one-way process aimed at uncovering linguistic patterns of literary texts. The theoretical basis...

  15. Annotated bibliography

    International Nuclear Information System (INIS)

    1997-08-01

    Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography which contains almost 100 citations of articles/books/resources involving topics related to communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators, Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted past years' proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners

  16. Supporting Keyword Search for Image Retrieval with Integration of Probabilistic Annotation

    Directory of Open Access Journals (Sweden)

    Tie Hua Zhou

    2015-05-01

    Full Text Available The ever-increasing quantities of digital photo resources are annotated with enriching vocabularies to form semantic annotations. Photo-sharing social networks have boosted the need for efficient and intuitive querying to respond to user requirements in large-scale image collections. In order to help users formulate efficient and effective image retrieval, we present a novel integration of a probabilistic model based on keyword query architecture that models the probability distribution of image annotations: allowing users to obtain satisfactory results from image retrieval via the integration of multiple annotations. We focus on the annotation integration step in order to specify the meaning of each image annotation, thus leading to the most representative annotations of the intent of a keyword search. For this demonstration, we show how a probabilistic model has been integrated to semantic annotations to allow users to intuitively define explicit and precise keyword queries in order to retrieve satisfactory image results distributed in heterogeneous large data sources. Our experiments on the SBU (collected by Stony Brook University) database show that (i) our integrated annotation contains higher quality representatives and semantic matches; and (ii) the results indicate that annotation integration can indeed improve image search result quality.

  17. Towards a theoretical framework for analyzing complex linguistic networks

    CERN Document Server

    Lücking, Andy; Banisch, Sven; Blanchard, Philippe; Job, Barbara

    2016-01-01

    The aim of this book is to advocate and promote network models of linguistic systems that are both based on thorough mathematical models and substantiated in terms of linguistics. In this way, the book contributes first steps towards establishing a statistical network theory as a theoretical basis of linguistic network analysis at the border of the natural sciences and the humanities. This book addresses researchers who want to get familiar with theoretical developments, computational models and their empirical evaluation in the field of complex linguistic networks. It is intended for all those who are interested in statistical models of linguistic systems from the point of view of network research. This includes all relevant areas of linguistics ranging from phonological, morphological and lexical networks on the one hand and syntactic, semantic and pragmatic networks on the other. In this sense, the volume concerns readers from many disciplines such as physics, linguistics, computer science and information scien...

  18. Quick Pad Tagger : An Efficient Graphical User Interface for Building Annotated Corpora with Multiple Annotation Layers

    OpenAIRE

    Marc Schreiber; Kai Barkschat; Bodo Kraft; Albert Zundorf

    2015-01-01

    More and more domain specific applications in the internet make use of Natural Language Processing (NLP) tools (e.g. Information Extraction systems). The output quality of these applications relies on the output quality of the used NLP tools. Often, the quality can be increased by annotating a domain specific corpus. However, annotating a corpus is a time consuming and exhaustive task. To reduce the annotation time we present...

  19. LINGUISTIC DIVERSITY AT PORTUGUESE TEXTBOOK: SOME CONSIDERATIONS

    Directory of Open Access Journals (Sweden)

    Paula Gaida Winch

    2013-12-01

    Full Text Available It is analyzed how linguistic diversity is dealt with in a Portuguese textbook, where two chapters are designated to it. In these, it is pointed out that speaker ethnic origin can be manifested differently by: morphological changes; use of foreign expressions; accent in oral language. In synthesis, the linguistic diversity is dealt with through activities of identification and reproduction of linguistic varieties to be carried out by the students.

  20. Cognitive linguistics.

    Science.gov (United States)

    Evans, Vyvyan

    2012-03-01

    Cognitive linguistics is one of the fastest growing and most influential perspectives on the nature of language, the mind, and their relationship with sociophysical (embodied) experience. It is a broad theoretical and methodological enterprise, rather than a single, closely articulated theory. Its primary commitments are outlined. These are the Cognitive Commitment, a commitment to providing a characterization of language that accords with what is known about the mind and brain from other disciplines, and the Generalization Commitment, which represents a dedication to characterizing general principles that apply to all aspects of human language. The article also outlines the assumptions and worldview that arise from these commitments, as represented in the work of leading cognitive linguists. WIREs Cogn Sci 2012, 3:129-141. doi: 10.1002/wcs.1163 For further resources related to this article, please visit the WIREs website. Copyright © 2012 John Wiley & Sons, Inc.

  1. Supplementary Material for: BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    Abstract Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
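    The idea of an "extended annotation" built by combining the outputs of several annotation methods can be illustrated with a minimal sketch. This is not BEACON's actual algorithm or data format; the function name `extend_annotation`, the `UNKNOWN` placeholder string, and the dictionary layout are all assumptions made for illustration. The sketch simply collects, for each gene, every distinct putative function that any AM assigned.

```python
# Hypothetical placeholder that an annotation method (AM) emits when it
# cannot assign a function; the real tools may use different wording.
UNKNOWN = "hypothetical protein"


def extend_annotation(annotations):
    """Combine per-AM annotations into one extended annotation.

    annotations: dict mapping AM name -> dict mapping gene id -> function.
    Returns a dict mapping gene id -> list of distinct known functions,
    in AM-name order. An empty list means no AM assigned a function.
    """
    genes = set()
    for per_gene in annotations.values():
        genes.update(per_gene)

    extended = {}
    for gene in sorted(genes):
        functions = []
        for am in sorted(annotations):
            function = annotations[am].get(gene, UNKNOWN)
            if function != UNKNOWN and function not in functions:
                functions.append(function)
        extended[gene] = functions
    return extended


def count_annotated(extended):
    """Number of genes with at least one putative function."""
    return sum(1 for functions in extended.values() if functions)
```

    With two toy AMs where only one assigns a function to a given gene, the extended annotation covers more genes than either input alone, which is the effect the abstract reports.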

  2. Linguistic Characteristics of Advertising English

    Institute of Scientific and Technical Information of China (English)

    易高燕

    2010-01-01

    Advertising language takes form under the influence of linguistics, psychology and sociology, etc., and its way of choosing words and building sentences is quite different from normal English. As a practical language, advertising English has its specific functions; it has been distinguished from normal English as an independent language, and it has plentiful values. This paper aims to discuss some linguistic characteristics of advertising English.

  3. Extending eScience Provenance with User-Submitted Semantic Annotations

    Science.gov (United States)

    Michaelis, J.; Zednik, S.; West, P.; Fox, P. A.; McGuinness, D. L.

    2010-12-01

    eScience based systems generate provenance of their data products, related to such things as: data processing, data collection conditions, expert evaluation, and data product quality. Recent advances in web-based technology offer users the possibility of making annotations to both data products and steps in accompanying provenance traces, thereby expanding the utility of such provenance for others. These contributing users may have varying backgrounds, ranging from system experts to outside domain experts to citizen scientists. Furthermore, such users may wish to make varying types of annotations - ranging from documenting the purpose of a provenance step to raising concerns about the quality of data dependencies. Semantic Web technologies allow for such kinds of rich annotations to be made to provenance through the use of ontology vocabularies for (i) organizing provenance, and (ii) organizing user/annotation classifications. Furthermore, through Linked Data practices, Semantic linkages may be made from provenance steps to external data of interest. A desire for Semantically-annotated provenance has been motivated by data management issues in the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photometer-based readings are taken of solar activity and subsequently processed into final data products consumable by end users. At intermediate stages of ACOS processing, factors such as evaluations by human experts and weather conditions are logged, which could impact data product quality. If such factors are linked via user-submitted annotations to provenance, it could be significantly beneficial for other users. Likewise, the background of a user could impact the credibility of their annotations. For example, an annotation made by a citizen scientist describing the purpose of a provenance step may not be as reliable as a similar annotation made by an ACOS project member.
For this work, we have developed a software package that

  4. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  5. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

    We present two new NER datasets for Twitter; a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets, (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible...
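    The abstract reports inter-annotator agreement as a kappa value without saying which variant was used. As a minimal sketch, assuming Cohen's kappa for two annotators labelling the same tokens, agreement can be computed as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy labels are illustrative only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items.

    Assumption: two annotators and nominal labels; the cited work may
    have used a different agreement statistic.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

    For example, `cohen_kappa(["PER", "LOC", "O", "O"], ["PER", "O", "O", "O"])` gives p_o = 0.75 and p_e = 7/16, so kappa = 5/9.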

  6. Are Prospective English Teachers Linguistically Intelligent?

    Science.gov (United States)

    Tezel, Kadir Vefa

    2017-01-01

    Language is normally associated with linguistic capabilities of individuals. In the theory of multiple intelligences, language is considered to be related primarily to linguistic intelligence. Using the theory of Multiple Intelligences as its starting point, this descriptive survey study investigated to what extent prospective English teachers'…

  7. Linguistic Recycling and the Open Community.

    Science.gov (United States)

    Dasgupta, Probal

    2001-01-01

    Examines linguistic recycling in the context of domestic Esperanto use. Argues that word-meaning recycling reflects the same fundamental principles as sentential recursion, and that a linguistics theoretically sensitive to these principles strengthens practical efforts towards the social goal of an open speech community. (Author/VWL)

  8. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Abstract Background A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which may be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer) that mediates automatic conversion of sequence UIDs (associated with multiple alignment or phylogenetic tree, or supplied as plain text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation, and efficiency of communication and knowledge dissemination among researchers.
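    Template-directed conversion of UIDs into readable names can be sketched in a few lines. This is not SNAD's implementation or template syntax; the function `rename_uids`, the field names `organism_acronym` and `protein`, and the identifiers `UID0001`/`UID0002` are hypothetical and stand in for annotation fetched from an external database entry.

```python
def rename_uids(uids, annotations, template="{organism_acronym}_{protein}"):
    """Map each UID to a name built from its annotation record.

    uids: iterable of identifier strings.
    annotations: dict mapping UID -> dict of annotation fields.
    template: a str.format template naming the fields to use.
    UIDs without an annotation record are left unchanged.
    """
    names = {}
    for uid in uids:
        record = annotations.get(uid)
        names[uid] = template.format(**record) if record else uid
    return names


# Toy annotation records (hypothetical identifiers and field names).
toy_annotations = {
    "UID0001": {"organism_acronym": "VirA", "protein": "RdRp"},
}
toy_names = rename_uids(["UID0001", "UID0002"], toy_annotations)
```

    Here `toy_names` maps `UID0001` to `VirA_RdRp` and leaves the unannotated `UID0002` as it was. A template that names a field missing from a record raises `KeyError`, which makes incomplete annotation visible rather than silently producing a partial name.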

  9. An open annotation ontology for science on web 3.0.

    Science.gov (United States)

    Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim

    2011-05-17

    There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/ .
The Annotation Ontology meets critical requirements for

  10. ACID: annotation of cassette and integron data

    Directory of Open Access Journals (Sweden)

    Stokes Harold W

    2009-04-01

    Full Text Available Abstract Background Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms. Description By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data) can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided. Conclusion ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.

  11. Proceedings of the IDA Workshop on Formal Specification and Verification of Ada (Trade Name) (1st) Held in Alexandria, Virginia on 18-20 March 1985.

    Science.gov (United States)

    1985-12-01

    on the third day. 5 ADA VERIFICATION WORKSHOP MARCH 18-20, 1985 LIST OF PARTICIPANTS Bernard Abrams ABRAMS@ADA20 Grumman Aerospace Corporation Mail...20301-3081 (202) 694-0211 Mark R. Cornwell CORNWELL @NRL-CSS Code 7590 Naval Research Lab Washington, D.C. 20375 (202) 767-3365 Jeff Facemire FACEMIRE...accompanied by descriptions of their purpose in English, to LUCKHAM@SAIL for annotation. - X-2 DISTRIBUTION LIST FOR M-146 Bernard Abrams ABRAMS@USC-ECLB

  12. Use of Annotations for Component and Framework Interoperability

    Science.gov (United States)

    David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.

    2009-12-01

    The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0 framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively such as implicit multithreading, and auto-documenting capabilities while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to framework by the use of specific APIs and/or data types they can more easily be reused both within the framework as well as outside of it. To study the effectiveness of an annotation based framework approach with other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. In a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the

  13. Protein linguistics - a grammar for modular protein assembly?

    Science.gov (United States)

    Gimona, Mario

    2006-01-01

    The correspondence between biology and linguistics at the level of sequence and lexical inventories, and of structure and syntax, has fuelled attempts to describe genome structure by the rules of formal linguistics. But how can we define protein linguistic rules? And how could compositional semantics improve our understanding of protein organization and functional plasticity?

  14. On Norms and Linguistic Categories in Linguistic Diversity Management

    NARCIS (Netherlands)

    Marácz, L.

    2014-01-01

    Due to globalization there is an increase in the appearances of languages in the multilingual linguistic landscape in urban spaces. Commentators have described this state of affairs as super-, mega- or complex diversity. Mainstream sociolinguists have argued that languages have no fixed boundaries

  15. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation...

  16. Stellenbosch Papers in Linguistics: Journal Sponsorship

    African Journals Online (AJOL)

    Publisher. Stellenbosch Papers in Linguistics (SPiL) is published by the Department of General Linguistics of Stellenbosch University. Publisher contact person: Mrs Christine Smit. Email: linguis@sun.ac.za. Phone: 021 808 2052. Fax: 021 808 2009. Mailing address: Private Bag X1, Matieland, 7602. Department of General ...

  17. Applied Linguistics in Its Disciplinary Context

    Science.gov (United States)

    Liddicoat, Anthony J.

    2010-01-01

    Australia's current attempt to develop a process to evaluate the quality of research (Excellence in Research for Australia--ERA) places a central emphasis on the disciplinary organisation of academic work. This disciplinary focus poses particular problems for Applied Linguistics in Australia. This paper will examine Applied Linguistics in relation…

  18. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  19. Ninth international conference on computational linguistics Coling 82

    Energy Technology Data Exchange (ETDEWEB)

    1983-01-01

    This paper presents the summary reports presented at the concluding session and evaluating the state of the art, trends and perspectives as reflected in the papers presented at Coling 82 in six domains: machine translation, grammatico-semantic analysis, linguistics in its relations to computational linguistics, question answering, artificial intelligence and knowledge representation, and information retrieval and linguistic data bases.

  20. A phylogenetic and cognitive perspective on linguistic complexity ...

    African Journals Online (AJOL)

    In recent years a growing interest in the nature of linguistic complexity has emerged in linguistic circles. A striking feature of this interest is that linguistic complexity is taken to be a phenomenon in its own right. In fact, an extreme construal of the inherent complexity of language is represented in the notion of universal ...

  1. Ideologeme "Order" in Modern American Linguistic World Image

    Science.gov (United States)

    Ibatova, Aygul Z.; Vdovichenko, Larisa V.; Ilyashenko, Lubov K.

    2016-01-01

    The paper studies the topic of modern American linguistic world image. It is known that any language is the most important instrument of cognition of the world by a person but there is also no doubt that any language is the way of perception and conceptualization of this knowledge about the world. In modern linguistics linguistic world image is…

  2. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  3. A Survey on the Exchange of Linguistic Resources: Publishing Linguistic Linked Open Data on the Web

    Science.gov (United States)

    Lezcano, Leonardo; Sanchez-Alonso, Salvador; Roa-Valverde, Antonio J.

    2013-01-01

    Purpose: The purpose of this paper is to provide a literature review of the principal formats and frameworks that have been used in the last 20 years to exchange linguistic resources. It aims to give special attention to the most recent approaches to publishing linguistic linked open data on the Web. Design/methodology/approach: Research papers…

  4. Developing workshop module of realistic mathematics education: Follow-up workshop

    Science.gov (United States)

    Palupi, E. L. W.; Khabibah, S.

    2018-01-01

    Realistic Mathematics Education (RME) is a learning approach which fits the aim of the curriculum. The success of RME in teaching mathematics concepts, triggering students’ interest in mathematics and teaching higher-order thinking skills to students motivates teachers to learn RME. Hence, RME workshops are often offered and held. This study applied the development model proposed by Plomp. Based on the study by the RME team, there are three kinds of RME workshop: start-up workshop, follow-up workshop, and quality boost. However, there is no standardized or validated module used in those workshops. This study aims to develop a module for the RME follow-up workshop which is valid and can be used. Plomp’s developmental model includes materials analysis, design, realization, implementation, and evaluation. Based on the validation, the developed module is valid, while the field test shows that the module can be used effectively.

  5. Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

    Directory of Open Access Journals (Sweden)

    Merchant Sabeeha S

    2011-07-01

    Full Text Available Abstract Background Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. Description The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of

  6. MODERN LINGUISTICS, ITS DEVELOPMENT AND SCOPE.

    Science.gov (United States)

    LEVIN, SAMUEL R.

    THE DEVELOPMENT OF MODERN LINGUISTICS STARTED WITH JONES' DISCOVERY IN 1786 THAT SANSKRIT IS CLOSELY RELATED TO THE CLASSICAL, GERMANIC, AND CELTIC LANGUAGES, AND HAS ADVANCED TO INCLUDE THE APPLICATION OF COMPUTERS IN LANGUAGE ANALYSIS. THE HIGHLIGHTS OF LINGUISTIC RESEARCH HAVE BEEN DE SAUSSURE'S DISTINCTION BETWEEN THE DIACHRONIC AND THE…

  7. Applied Linguistics: The Challenge of Theory

    Science.gov (United States)

    McNamara, Tim

    2015-01-01

    Language has featured prominently in contemporary social theory, but the relevance of this fact to the concerns of Applied Linguistics, with its necessary orientation to practical issues of language in context, represents an ongoing challenge. This article supports the need for a greater engagement with theory in Applied Linguistics. It considers…

  8. Automated Linguistic Personality Description and Recognition Methods

    Directory of Open Access Journals (Sweden)

    Danylyuk Illya

    2016-12-01

    Full Text Available Background: The relevance of our research is, above all, theoretically motivated by the growing scientific and practical interest in the possibilities of language processing of the huge amount of data generated by people in everyday professional and personal life in electronic forms of communication (e-mail, sms, voice, audio and video blogs, social networks, etc.). Purpose: The purpose of the article is to describe the theoretical and practical framework of the project "Communicative-pragmatic and discourse-grammatical lingvopersonology: structuring linguistic identity and computer modeling". A description of key techniques is given, such as machine learning for language modeling, speech synthesis, and handwriting simulation. Results: Lingvopersonology has developed solid theoretical foundations; its methods, tools, and significant achievements let us predict that the newest promising trend is linguistic identity modeling by means of information technology, including language. We see three aspects of the modeling: 1) modeling the semantic level of linguistic identity – by means of corpus linguistics; 2) formal modeling of the sound level of linguistic identity – with the help of speech synthesis; 3) formal modeling of the graphic level of linguistic identity – with the help of image synthesis (handwriting). For the first case, we propose to use machine learning techniques and a vector-space (word2vec) algorithm for textual speech modeling. The hybrid CUTE method for personality speech modeling will be applied to the second case. Finally, a neural network trained on images of the person's handwriting can be an instrument for the last case. Discussion: The project "Communicative-pragmatic, discourse, and grammatical lingvopersonology: structuring linguistic identity and computer modeling", which is being implemented by the Department of General and Applied Linguistics and Slavonic philology, selected the task of modeling Yuriy Shevelyov (Sherekh

  9. Creating Fantastic PI Workshops

    Energy Technology Data Exchange (ETDEWEB)

    Biedermann, Laura B. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Clark, Blythe G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Colbert, Rachel S. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Dagel, Amber Lynn [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Gupta, Vipin P. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hibbs, Michael R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Perkins, David Nikolaus [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); West, Roger Derek [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-10-01

    The goal of this SAND report is to provide guidance for other groups hosting workshops and peer-to-peer learning events at Sandia. Thus this SAND report provides detail about our team structure, how we brainstormed workshop topics and developed the workshop structure. A Workshop “Nuts and Bolts” section provides our timeline and check-list for workshop activities. The survey section provides examples of the questions we asked and how we adapted the workshop in response to the feedback.

  10. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  11. Fluid Annotations in an Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required.

  12. Ghana Journal of Linguistics: Editorial Policies

    African Journals Online (AJOL)

    Focus and Scope. The Ghana Journal of Linguistics is a peer-reviewed scholarly journal appearing twice a year, published by the Linguistics Association of Ghana. Beginning with Volume 2 (2013) it is published in electronic format only, open access, at www.ajol.info. However print-on-demand copies can be made ...

  13. Physical Linguistics.

    Science.gov (United States)

    Tice, Bradley S.

    Physical linguistics is defined as the use of treatments from the field of speech pathology to enhance first and second language production in healthy individuals, resulting in increased quality and strength of phonation and articulation. A series of exercises for treating dysarthria (weakness, paralysis, discoordination, primary and secondary…

  14. Dissociating linguistic and non-linguistic gesture processing: electrophysiological evidence from American Sign Language.

    Science.gov (United States)

    Grosvald, Michael; Gutierrez, Eva; Hafer, Sarah; Corina, David

    2012-04-01

    A fundamental advance in our understanding of human language would come from a detailed account of how non-linguistic and linguistic manual actions are differentiated in real time by language users. To explore this issue, we targeted the N400, an ERP component known to be sensitive to semantic context. Deaf signers saw 120 American Sign Language sentences, each consisting of a "frame" (a sentence without the last word; e.g. BOY SLEEP IN HIS) followed by a "last item" belonging to one of four categories: a high-cloze-probability sign (a "semantically reasonable" completion to the sentence; e.g. BED), a low-cloze-probability sign (a real sign that is nonetheless a "semantically odd" completion to the sentence; e.g. LEMON), a pseudo-sign (phonologically legal but non-lexical form), or a non-linguistic grooming gesture (e.g. the performer scratching her face). We found significant N400-like responses in the incongruent and pseudo-sign contexts, while the gestures elicited a large positivity. Copyright © 2012 Elsevier Inc. All rights reserved.

  15. LINGUISTIC AND CULTURAL STUDIES: THE QUEST FOR NEW IDEAS

    Directory of Open Access Journals (Sweden)

    Vitalii Kononenko

    2015-05-01

    Full Text Available The article highlights the principles of researching text from an interdisciplinary linguistic and cultural perspective. Cognitological analysis of linguistic and extralinguistic cultural meanings reveals that there exist specific linguistic and aesthetic formations best presented through the ‘language – culture – identity’ triad. One of the components of literary discourse is the monocultural layer, which secures the continuity of national cultural tradition; in researching it, one should take into account mental and historical, psycholinguistic, sociolinguistic and other factors. Linguistic and aesthetic analysis helps to establish the system of linguistic and cultural means (metaphorization, imagery, verbal symbols, linguistic conceptualization, connotative meanings), which reveals its potential in literary texts. The lingual identity as a general notional category shows its nationally-oriented characteristics through the dichotomies of ‘addresser-addressee’, ‘author-reader’, ‘narrator-narratee’ and is presented in the author’s idiolect.

  16. Model and Interoperability using Meta Data Annotations

    Science.gov (United States)

    David, O.

    2011-12-01

    Software frameworks and architectures are in need of meta data to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format, consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing meta data, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings; however, the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach for meta data representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programming language features known as Annotations or Attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interface (API) calls, is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to its originating code. Since models and

  17. Black English Annotations for Elementary Reading Programs.

    Science.gov (United States)

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  18. Role of linguistic skills in fifth-grade mathematics.

    Science.gov (United States)

    Kleemans, Tijs; Segers, Eliane; Verhoeven, Ludo

    2018-03-01

    The current study investigated the direct and indirect relations between basic linguistic skills (i.e., phonological skills and grammatical ability) and advanced linguistic skills (i.e., academic vocabulary and verbal reasoning), on the one hand, and fifth-grade mathematics (i.e., arithmetic, geometry, and fractions), on the other, taking working memory and general intelligence into account and controlling for socioeconomic status, age, and gender. The results showed the basic linguistic representations of 167 fifth graders to be indirectly related to their geometric and fraction skills via arithmetic. Furthermore, advanced linguistic skills were found to be directly related to geometry and fractions after controlling for arithmetic. It can be concluded that linguistic skills directly and indirectly relate to mathematical ability in the upper grades of primary education, which highlights the importance of paying attention to such skills in the school curriculum. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Aspects of conversational style: linguistic versus behavioral analysis.

    Science.gov (United States)

    Hall, G A

    1992-01-01

    Skinner's functional analysis of verbal behavior has been contrasted with formal linguistic analysis which studies the grammatical structure and "meaning" of verbal response-products, regardless of the circumstances under which they are produced. Nevertheless, it appears that certain areas of linguistic analysis are not entirely structural. In her recent books That's Not What I Meant (1986) and You Just Don't Understand (1990), the linguist Deborah Tannen purports to explain how people exhibit different "conversation styles", that is, how they speak and achieve effects on listeners in different ways. There are indications, however, that the linguistic model may not be the most functional and precise one that could be used in analyzing conversational style. This paper takes concepts presented in Deborah Tannen's book That's Not What I Meant (1986), analyzes them from a linguistic and a behavioral perspective, and compares the relative utility of the two approaches.

  20. Wittgenstein and the linguistic turn in social theory

    DEFF Research Database (Denmark)

    Hermansen, Jens Christian

    of Winch in social theory, the wider and more recent influence of Wittgenstein in areas such as technology and science studies, social theory, feminist and gender studies and conversation and discourse analysis is also considered. Historically, the readings of Wittgenstein in the social sciences have taken...... of the linguistic turn in social theory, the linguistic turn is a double-edged sword of both profound insights and limits; the claim is that the limits of the linguistic turn are the strengths of functionalist, structuralist and materialist approaches to the social sciences. The approach of the critical turn...... is to develop a more comprehensive social theory that is sensitive to these strengths and thus supersedes the limits of the linguistic turn. This paper suggests a different approach. Against the critical turn, the paper argues that the limits of the linguistic turn are identical with the very assumptions...

  1. Special Issue: Annotated Bibliography for Volumes XIX-XXXII.

    Science.gov (United States)

    Pullin, Richard A.

    1998-01-01

    This annotated bibliography lists 310 articles from the "Journal of Cooperative Education" from Volumes XIX-XXXII, 1983-1997. Annotations are presented in the order they appear in the journal; author and subject indexes are provided. (JOW)

  2. Statistical Measures for Usage-Based Linguistics

    Science.gov (United States)

    Gries, Stefan Th.; Ellis, Nick C.

    2015-01-01

    The advent of usage-/exemplar-based approaches has resulted in a major change in the theoretical landscape of linguistics, but also in the range of methodologies that are brought to bear on the study of language acquisition/learning, structure, and use. In particular, methods from corpus linguistics are now frequently used to study distributional…

  3. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Directory of Open Access Journals (Sweden)

    Gustavo Arango-Argoty

    Full Text Available Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  4. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  5. MIPS: analysis and annotation of genome information in 2007.

    Science.gov (United States)

    Mewes, H W; Dietmann, S; Frishman, D; Gregory, R; Mannhaupt, G; Mayer, K F X; Münsterkötter, M; Ruepp, A; Spannagl, M; Stümpflen, V; Rattei, T

    2008-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. Coping with the task of global in-depth genome annotation has become unfeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  6. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  7. Improving Microbial Genome Annotations in an Integrated Database Context

    Science.gov (United States)

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2013-01-01

    Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/. PMID:23424620
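
    The rule-based validation described in this abstract can be illustrated with a minimal sketch. It is not the IMG implementation: the rule table, the pathway names, and the function names below are all hypothetical, and the only idea taken from the abstract is that a phenotype is inferred from the presence of certain pathways and then compared with what was observed experimentally.

```python
# Hypothetical rules: a phenotype is predicted only when every listed
# pathway is present in the genome's annotation. Names are invented.
RULES = {
    "prototrophic for histidine": {"histidine biosynthesis"},
    "nitrogen fixer": {"nitrogenase complex", "molybdenum cofactor biosynthesis"},
}


def predict_phenotypes(annotated_pathways, rules=RULES):
    """Return the phenotypes whose required pathways are all annotated."""
    present = set(annotated_pathways)
    return {name for name, required in rules.items() if required <= present}


def find_discrepancies(annotated_pathways, observed_phenotypes, rules=RULES):
    """Compare predicted phenotypes with observed ones covered by the rules."""
    predicted = predict_phenotypes(annotated_pathways, rules)
    observed = set(observed_phenotypes) & set(rules)
    return {
        # Predicted from annotation but not seen experimentally.
        "predicted_not_observed": sorted(predicted - observed),
        # Seen experimentally but a required pathway is missing: a hint
        # that the annotation may be incomplete.
        "observed_not_predicted": sorted(observed - predicted),
    }


report = find_discrepancies(
    annotated_pathways=["histidine biosynthesis", "nitrogenase complex"],
    observed_phenotypes=["nitrogen fixer"],
)
```

    In this toy case the organism is observed to fix nitrogen but one required pathway is not annotated, so the mismatch is reported as a candidate annotation gap.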

  8. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.

  9. Educational language planning and linguistic identity

    Science.gov (United States)

    Sutton, Peter

    1991-03-01

    There are cases in which a "high" form of a language is taught and used in formal situations, but linguistic variation is also caused by geography, ethnicity and socioeconomic class. Certain variants are regarded as inferior and restricted in expressive capacity, and are disadvantageous. The paper suggests that it is possible to map each person's linguistic identity in two dimensions: the number of languages spoken, and the situation-specific variants of each language. Further, it is argued that the distance between a "low" variant and a "high" standard form of a language may present to the "low" learner of a standardized mother tongue a barrier just as great as that posed by the learning of a related foreign language to a speaker of the high variant. It is proposed that greater tolerance be exercised in acceptance of variation and in recognition of linguistic identity, so that this can be built on in the necessary and desirable expansion of linguistic competence, rather than being devalued. The relevance of the communicative approach to language teaching is touched on.

  10. The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

    Science.gov (United States)

    Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

    2017-07-03

    BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated to a hidden Markov model that allows building template-target alignment suitable for structural modeling. Some other 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
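
    As a rough illustration of graph-based clustering under the pairwise criteria quoted above (sequence identity ≥40% with alignment coverage ≥90%), the sketch below links two sequences when a hit meets both thresholds and reports connected components as clusters. Using connected components with a union-find structure is an assumption made for illustration; the abstract does not say how BAR 3.0 builds its clusters, and the sequence identifiers and hit values are invented.

```python
from collections import defaultdict

# Thresholds quoted in the abstract; everything else is illustrative.
MIN_IDENTITY = 40.0  # percent sequence identity
MIN_COVERAGE = 90.0  # percent alignment coverage


def cluster_sequences(sequence_ids, pairwise_hits):
    """Group sequences into connected components of the similarity graph.

    pairwise_hits is an iterable of (id_a, id_b, identity_pct, coverage_pct).
    Sequences with no qualifying hit come back as singletons.
    """
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in pairwise_hits:
        if identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for s in sequence_ids:
        groups[find(s)].append(s)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


hits = [
    ("P1", "P2", 55.0, 95.0),  # passes both thresholds
    ("P2", "P3", 42.0, 91.0),  # passes both thresholds
    ("P3", "P4", 80.0, 60.0),  # coverage too low
    ("P4", "P5", 35.0, 99.0),  # identity too low
]
clusters = cluster_sequences(["P1", "P2", "P3", "P4", "P5"], hits)
```

    Here P1, P2 and P3 end up in one cluster, while P4 and P5 remain singletons because each of their hits fails one of the two thresholds.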

  11. BAT: An open-source, web-based audio events annotation tool

    OpenAIRE

    Blai Meléndez-Catalan, Emilio Molina, Emilia Gómez

    2017-01-01

    In this paper we present BAT (BMAT Annotation Tool), an open-source, web-based tool for the manual annotation of events in audio recordings developed at BMAT (Barcelona Music and Audio Technologies). The main feature of the tool is that it provides an easy way to annotate the salience of simultaneous sound sources. Additionally, it allows users to define multiple ontologies to adapt to multiple tasks and offers the possibility to cross-annotate audio data. Moreover, it is easy to install and deploy...

  12. Annotating images by mining image search results.

    Science.gov (United States)

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.

  13. Collaborative Paper-Based Annotation of Lecture Slides

    Science.gov (United States)

    Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max

    2009-01-01

    In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…

  14. Music journals in South Africa 1854-2010: an annotated bibliography

    African Journals Online (AJOL)

    Music journals in South Africa 1854-2010: an annotated bibliography. ... The article focuses on presenting an annotated bibliography of music journalism in South Africa from as early as 1854 until 2010. Most of ... Key words: annotated bibliography, electronic journals, music journals, periodicals, South African music history ...

  15. Plenary Speeches: Applied Linguists without Borders

    Science.gov (United States)

    Tarone, Elaine

    2013-01-01

    Until 1989, the American Association for Applied Linguistics (AAAL) could have been viewed as an interest group of the Linguistics Society of America (LSA); AAAL met in two designated meeting rooms as a subsection of the LSA conference. In 1991, I was asked to organize the first independent meeting of AAAL in New York City, with the help of…

  16. Educational Linguistics and College English Syllabus Design

    Institute of Scientific and Technical Information of China (English)

    LIU Ji-xin

    2016-01-01

    The direct application of linguistic theories to syllabus design has given rise to frequent changes of syllabus type in the history of syllabus development, which language teachers find difficult to adapt to, adopt and implement. The recognition and popularization of the newly established discipline of educational linguistics serves as a way to ease the situation, especially in college English syllabus design in China. The development and application of the achievements of educational linguistics are bound to provide us with a more scientific approach to syllabus design in the future.

  17. Translating Linguistic Jokes for Dubbing

    Directory of Open Access Journals (Sweden)

    Elena ALEKSANDROVA

    2012-01-01

    Full Text Available This study has attempted to establish the possible ways of translating linguistic jokes when dubbing. The study is also intended to identify the most problematic cases of screen translation and the factors which cause these problems. In order to support such an approach a corpus of 7 American and British films has been compiled, including as many as 16 of their various dubbing translations into Russian. In the films, almost 12 instances of original linguistic jokes have been identified.

  18. An annotated corpus with nanomedicine and pharmacokinetic parameters.

    Science.gov (United States)

    Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T

    2017-01-01

    A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration's Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided.

  19. Plann: A command-line application for annotating plastome sequences

    Science.gov (United States)

    Huang, Daisie I.; Cronk, Quentin C. B.

    2015-01-01

    Premise of the study: Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Methods and Results: Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann’s output can be used in the National Center for Biotechnology Information’s tbl2asn to create a Sequin file for GenBank submission. Conclusions: Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved. PMID:26312193
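
    The interval-shifting idea described above can be sketched as follows. This is not Plann itself (which the abstract describes as a Perl script): exact substring matching stands in for whatever sequence comparison Plann actually performs, and the function name `shift_features` and the tuple layout are hypothetical.

```python
def shift_features(reference, features, target):
    """Relocate reference features onto a new sequence.

    `features` is a list of (name, start, end) tuples using 0-based,
    end-exclusive coordinates on `reference`. A feature is transferred
    only if its exact sequence occurs in `target` (an assumption made
    here to keep the sketch short); unmatched features are skipped.
    """
    shifted = []
    for name, start, end in features:
        fragment = reference[start:end]
        position = target.find(fragment)
        if position != -1:
            # Same feature, new coordinates on the target sequence.
            shifted.append((name, position, position + len(fragment)))
    return shifted


reference = "AAATGCCCTAAGG"
features = [("geneA", 2, 10), ("geneB", 10, 13)]
target = "GGGGATGCCCTAAC"
shifted = shift_features(reference, features, target)
```

    Here geneA is found four bases into the target and its interval is shifted accordingly, while geneB has no match and is left unannotated.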

  20. COGNITIVE METAPHOR IN MODERN LINGUISTICS

    Directory of Open Access Journals (Sweden)

    Antonina KARTASHOVA

    2010-11-01

    Full Text Available The article outlines the basic notions connected with cognitive metaphor which has lately undergone a thorough examination. The contribution made by linguists resulted in the rise of cognitive linguistics. This science regards metaphor not as a linguistic phenomenon but as a mental one that establishes connection between language and mind in the form of understanding new notions in terms of notions and categories known due to the previously gained experience. The interaction of new and previous experience can generate three main types of metaphors: structural metaphors which imply the structuring of target domain in terms of source domain, ontological metaphors which view abstract notions as concrete objects with clear outlines and orientational metaphors which represent the ways to fix the experience of spatial orientation. The classification of metaphors complemented with examples is presented below along with some controversial cases of determining the type of metaphor.

  1. The Transition from Animal to Linguistic Communication

    NARCIS (Netherlands)

    Smit, Harry

    2016-01-01

    Darwin's theory predicts that linguistic behavior gradually evolved out of animal forms of communication (signaling). However, this prediction is confronted by the conceptual problem that there is an essential difference between signaling and linguistic behavior: using words is a normative practice.

  2. Evaluation of web-based annotation of ophthalmic images for multicentric clinical trials.

    Science.gov (United States)

    Chalam, K V; Jain, P; Shah, V A; Shah, Gaurav Y

    2006-06-01

    An Internet browser-based annotation system can be used to identify and describe features in digitalized retinal images, in multicentric clinical trials, in real time. In this web-based annotation system, the user employs a mouse to draw and create annotations on a transparent layer, that encapsulates the observations and interpretations of a specific image. Multiple annotation layers may be overlaid on a single image. These layers may correspond to annotations by different users on the same image or annotations of a temporal sequence of images of a disease process, over a period of time. In addition, geometrical properties of annotated figures may be computed and measured. The annotations are stored in a central repository database on a server, which can be retrieved by multiple users in real time. This system facilitates objective evaluation of digital images and comparison of double-blind readings of digital photographs, with an identifiable audit trail. Annotation of ophthalmic images allowed clinically feasible and useful interpretation to track properties of an area of fundus pathology. This provided an objective method to monitor properties of pathologies over time, an essential component of multicentric clinical trials. The annotation system also allowed users to view stereoscopic images that are stereo pairs. This web-based annotation system is useful and valuable in monitoring patient care, in multicentric clinical trials, telemedicine, teaching and routine clinical settings.

  3. Aspects of conversational style—linguistic versus behavioral analysis

    Science.gov (United States)

    Hall, Genae A.

    1992-01-01

    Skinner's functional analysis of verbal behavior has been contrasted with formal linguistic analysis which studies the grammatical structure and “meaning” of verbal response-products, regardless of the circumstances under which they are produced. Nevertheless, it appears that certain areas of linguistic analysis are not entirely structural. In her recent books That's Not What I Meant (1986) and You Just Don't Understand (1990), the linguist Deborah Tannen purports to explain how people exhibit different “conversation styles”—that is, how they speak and achieve effects on listeners in different ways. There are indications, however, that the linguistic model may not be the most functional and precise one that could be used in analyzing conversational style. This paper takes concepts presented in Deborah Tannen's book That's Not What I Meant (1986), analyzes them from a linguistic and a behavioral perspective, and compares the relative utility of the two approaches. PMID:22477048

  4. Essential Annotation Schema for Ecology (EASE)-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Science.gov (United States)

    Pfaff, Claas-Thido; Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  5. Essential Annotation Schema for Ecology (EASE)-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Directory of Open Access Journals (Sweden)

    Claas-Thido Pfaff

    Full Text Available Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  6. Citation Analysis and Authorship Patterns of Two Linguistics Journals

    Science.gov (United States)

    Ezema, Ifeanyi J.; Asogwa, Brendan E.

    2014-01-01

    This article analyzes the sources cited in articles published in two linguistics journals, "Applied Linguistics" and "Journal of Linguistics," from 2001 to 2010. A retrospective descriptive study was conducted using bibliometric indicators, such as types of cited sources, timeliness of cited sources, authorship patterns, rank lists of the…

  7. Australia and New Zealand Applied Linguistics (ANZAL): Taking Stock

    Science.gov (United States)

    Kleinsasser, Robert C.

    2004-01-01

    This paper reviews some emerging trends in applied linguistics in both Australia and New Zealand. It sketches the current scene of (selected) postgraduate applied linguistics programs in higher education and considers how various university programs define applied linguistics through the classes (titles) they have postgraduate students complete to…

  8. Workshops as a Research Methodology

    Science.gov (United States)

    Ørngreen, Rikke; Levinsen, Karin

    2017-01-01

    This paper contributes to knowledge on workshops as a research methodology, and specifically on how such workshops pertain to e-learning. A literature review illustrated that workshops are discussed according to three different perspectives: workshops as a means, workshops as practice, and workshops as a research methodology. Focusing primarily on…

  9. Historical Trajectory of the Quechuan Linguistic Family and its Relations to the Aimaran Linguistic Family

    OpenAIRE

    Adelaar, Willem

    2012-01-01

    This article seeks to present the principal stages of the prehistory and history of the Quechuan language family in its interaction with the Aimaran family. It reconstructs a plausible scenario for a unique, intensive process of linguistic convergence that underlies the protolanguages of both families. From there on, it traces the principal developments that characterize the history of the Quechuan linguistic family, such as the initial split in two main branches, Quechua I and Quechua II (fo...

  10. Annotating Logical Forms for EHR Questions.

    Science.gov (United States)

    Roberts, Kirk; Demner-Fushman, Dina

    2016-05-01

    This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.

  11. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  12. Managing and Querying Image Annotation and Markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167

  13. Expressed Peptide Tags: An additional layer of data for genome annotation

    Energy Technology Data Exchange (ETDEWEB)

    Savidor, Alon [ORNL; Donahoo, Ryan S [ORNL; Hurtado-Gonzales, Oscar [University of Tennessee, Knoxville (UTK); Verberkmoes, Nathan C [ORNL; Shah, Manesh B [ORNL; Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL

    2006-01-01

    While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller sub-databases to be searched against, and definition of EPT criteria that accommodates the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While ~77% of Phytophthora EPTs supported the current annotation, a portion of them (7.2% and 12.6% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
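
    The six-frame translation used above to build the search database can be illustrated with a short sketch. It assumes the standard genetic code and a plain DNA string; the function names are illustrative and are not taken from the authors' pipeline.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three reading frames on each strand of the sequence."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

    Each of the six resulting protein strings would then be a candidate entry in the database that mass spectra are searched against, which is what lets peptides be identified without relying on existing gene calls.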

  14. Implicit learning of non-linguistic and linguistic regularities in children with dyslexia.

    Science.gov (United States)

    Nigro, Luciana; Jiménez-Fernández, Gracia; Simpson, Ian C; Defior, Sylvia

    2016-07-01

    One of the hallmarks of dyslexia is the failure to automatise written patterns despite repeated exposure to print. Although many explanations have been proposed to explain this problem, researchers have recently begun to explore the possibility that an underlying implicit learning deficit may play a role in dyslexia. This hypothesis has been investigated through non-linguistic tasks exploring implicit learning in a general domain. In this study, we examined the abilities of children with dyslexia to implicitly acquire positional regularities embedded in both non-linguistic and linguistic stimuli. In experiment 1, 42 children (21 with dyslexia and 21 typically developing) were exposed to rule-governed shape sequences; whereas in experiment 2, a new group of 42 children were exposed to rule-governed letter strings. Implicit learning was assessed in both experiments via a forced-choice task. Experiments 1 and 2 showed a similar pattern of results. ANOVA analyses revealed no significant differences between the dyslexic and the typically developing group, indicating that children with dyslexia are not impaired in the acquisition of simple positional regularities, regardless of the nature of the stimuli. However, within group t-tests suggested that children from the dyslexic group could not transfer the underlying positional rules to novel instances as efficiently as typically developing children.

  15. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    Directory of Open Access Journals (Sweden)

    Jianfang Cao

    2015-01-01

    Full Text Available With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.

  16. Aspects of conversational style—linguistic versus behavioral analysis

    OpenAIRE

    Hall, Genae A.

    1992-01-01

    Skinner's functional analysis of verbal behavior has been contrasted with formal linguistic analysis which studies the grammatical structure and “meaning” of verbal response-products, regardless of the circumstances under which they are produced. Nevertheless, it appears that certain areas of linguistic analysis are not entirely structural. In her recent books That's Not What I Meant (1986) and You Just Don't Understand (1990), the linguist Deborah Tannen purports to explain how people exhibi...

  17. Multiview Hessian regularization for image annotation.

    Science.gov (United States)

    Liu, Weifeng; Tao, Dacheng

    2013-07-01

    The rapid development of computer hardware and Internet technology makes large scale data dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) therefore received intensive attention in recent years and was successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smoothes the conditional distribution for classification along the manifold encoded in the graph Laplacian; however, it is observed that LR biases the classification function toward a constant function that possibly results in poor generalization. In addition, LR is developed to handle uniformly distributed data (or single-view data), although instances or objects, such as images and videos, are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address the above two problems in LR-based image annotation. In particular, mHR optimally combines multiple HR, each of which is obtained from a particular view of instances, and steers the classification function that varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.

  18. MOOC Design Workshop

    DEFF Research Database (Denmark)

    Nørgård, Rikke Toft; Mor, Yishay; Warburton, Steven

    2016-01-01

    For the last two years we have been running a series of successful MOOC design workshops. These workshops build on previous work in learning design and MOOC design patterns. The aim of these workshops is to aid practitioners in defining and conceptualising educational innovations (predominantly, but not exclusively, MOOCs) which are based on an empathic user-centered view of the target learners and teachers. In this paper, we share the main principles, patterns and resources of our workshops and present some initial results for their effectiveness...

  19. Ten steps to get started in Genome Assembly and Annotation

    Science.gov (United States)

    Dominguez Del Angel, Victoria; Hjerde, Erik; Sterck, Lieven; Capella-Gutierrez, Salvadors; Notredame, Cederic; Vinnere Pettersson, Olga; Amselem, Joelle; Bouri, Laurent; Bocs, Stephanie; Klopp, Christophe; Gibrat, Jean-Francois; Vlasova, Anna; Leskosek, Brane L.; Soler, Lucile; Binzer-Panchal, Mahesh; Lantz, Henrik

    2018-01-01

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). PMID:29568489

  20. Sharing Map Annotations in Small Groups: X Marks the Spot

    Science.gov (United States)

    Congleton, Ben; Cerretani, Jacqueline; Newman, Mark W.; Ackerman, Mark S.

    Advances in location-sensing technology, coupled with an increasingly pervasive wireless Internet, have made it possible (and increasingly easy) to access and share information with context of one’s geospatial location. We conducted a four-phase study, with 27 students, to explore the practices surrounding the creation, interpretation and sharing of map annotations in specific social contexts. We found that annotation authors consider multiple factors when deciding how to annotate maps, including the perceived utility to the audience and how their contributions will reflect on the image they project to others. Consumers of annotations value the novelty of information, but must be convinced of the author’s credibility. In this paper we describe our study, present the results, and discuss implications for the design of software for sharing map annotations.

  1. Annotation-based feature extraction from sets of SBML models.

    Science.gov (United States)

    Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron

    2015-01-01

    Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow additional features to be identified for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three of the four methods are suitable to determine characteristic features for arbitrary sets of models: The selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map on concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison, or model-to-model comparison.

  2. Social network size can influence linguistic malleability and the propagation of linguistic change.

    Science.gov (United States)

    Lev-Ari, Shiri

    2018-07-01

    We learn language from our social environment, but the more sources we have, the less informative each source is, and therefore, the less weight we ascribe to its input. According to this principle, people with larger social networks should give less weight to new incoming information, and should therefore be less susceptible to the influence of new speakers. This paper tests this prediction, and shows that speakers with smaller social networks indeed have more malleable linguistic representations. In particular, they are more likely to adjust their lexical boundary following exposure to a new speaker. Experiment 2 uses computational simulations to test whether this greater malleability could lead people with smaller social networks to be important for the propagation of linguistic change despite the fact that they interact with fewer people. The results indicate that when innovators were connected with people with smaller rather than larger social networks, the population exhibited greater and faster diffusion. Together these experiments show that the properties of people's social networks can influence individuals' learning and use as well as linguistic phenomena at the community level. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Roadmap for annotating transposable elements in eukaryote genomes.

    Science.gov (United States)

    Permal, Emmanuelle; Flutre, Timothée; Quesneville, Hadi

    2012-01-01

    Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, the annotation process now represents a bottleneck to genome analysis, especially when dealing with transposable elements (TE). Combined approaches, using both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. At each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools and good practices to be used.

  4. 77 FR 12313 - Food Labeling Workshop; Public Workshop

    Science.gov (United States)

    2012-02-29

    ... DEPARTMENT OF HEALTH AND HUMAN SERVICES Food and Drug Administration [Docket No. FDA-2012-N-0001] Food Labeling Workshop; Public Workshop AGENCY: Food and Drug Administration, HHS. ACTION: Notice of... District Office (DALDO), in collaboration with Oklahoma State University (OSU), Robert M. Kerr Food...

  5. 75 FR 29775 - Food Labeling Workshop; Public Workshop

    Science.gov (United States)

    2010-05-27

    ... DEPARTMENT OF HEALTH AND HUMAN SERVICES [Docket No. FDA-2010-N-0001] Food and Drug Administration Food Labeling Workshop; Public Workshop AGENCY: Food and Drug Administration, HHS. ACTION: Notice of...: Institute of Food Science & Engineering, University of Arkansas, 2650 North Young Ave., Fayetteville, AR...

  6. Systems Engineering Workshops | Wind | NREL

    Science.gov (United States)

    The Wind Energy Systems Engineering Workshop is a biennial topics relevant to systems engineering and the wind industry. The presentations and agendas are available for all of the Systems Engineering Workshops: The 1st NREL Wind Energy Systems Engineering Workshop

  7. A lattice-valued linguistic decision model for nuclear safeguards applications

    International Nuclear Information System (INIS)

    Ruan, D.; Liu, J.; Carchon, R.

    2001-01-01

    In this study, we focus our attention on decision making models to process uncertainty-based information directly without transforming them into any particular membership function, i.e., directly using linguistic information (linguistic values) instead of numbers (numerical values). By analyzing the feature of linguistic values ordered by their means of common usage, we argue that the set of linguistic values should be characterized by a lattice structure. We propose the lattice structure based on a logical algebraic structure i.e., lattice implication algebra. Finally, we obtain a multi-objective decision-making model by extending Yager's multi-objective model from the following aspects: (1) extension of linguistic information: from a set of linear ordered linguistic labels (values) to that of lattice-valued linguistic labels; (2) extension of the combination function M, which is used to combine the individual ratings with the weights of criteria. We propose an implication operation form of M. The implication operation can be drawn from lattice implication algebra. As an illustration, we will finally apply this decision model to the evaluation problem in safeguard relevant information. (orig.)

  8. The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning

    Directory of Open Access Journals (Sweden)

    Saeideh Ahangari

    2010-05-01

    In our modern technological world, Computer-Assisted Language Learning (CALL) is a new realm towards learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotations, dynamic picture annotations, and written annotations) on L2 vocabulary learning. To fulfill this objective, the researchers selected sixty-four EFL learners as the participants of this study. The participants were randomly assigned to one of four groups: a control group that received no annotations and three experimental groups that received still picture annotations, dynamic picture annotations, and written annotations, respectively. Each participant was required to take a pre-test. A vocabulary post-test was also designed and administered to the participants in order to assess the efficacy of each annotation. First, for each group a paired t-test was conducted between their pre- and post-test scores in order to observe their improvement; then, through an ANCOVA test, the performance of the four groups was compared. The results showed that using multimedia annotations resulted in a significant difference in the participants’ vocabulary learning. Based on the results of the present study, multimedia annotations are suggested as a vocabulary teaching strategy.

  9. Chomsky and Wittgenstein on Linguistic Competence

    Directory of Open Access Journals (Sweden)

    Thomas McNally

    2012-11-01

    In his Wittgenstein on Rules and Private Language, Saul Kripke presents his influential reading of Wittgenstein’s later writings on language. One of the largely unexplored features of that reading is that Kripke makes a small number of suggestive remarks concerning the possible threat that Wittgenstein’s arguments pose for Chomsky’s linguistic project. In this paper, we attempt to characterise the relevance of Wittgenstein’s later work on meaning and rule-following for transformational linguistics, and in particular to identify the potentially negative impact it has on that project. Although we use Kripke’s remarks to articulate some of the pertinent issues, we return to Wittgenstein’s later writings to address them. We argue that Wittgenstein’s main target in the relevant sections of the Philosophical Investigations is the notion of ‘logical compulsion’, which involves assuming that there is more to applying a word or rule than how we are naturally or “psychologically” compelled to apply. We characterise two of the main lines of argument in the Investigations in terms of the rejection of logical compulsion. We thus propose to address the relevance of Wittgenstein’s writings for Chomsky by considering whether Chomsky’s linguistics presupposes the targeted notion of logical compulsion. We argue that Chomsky’s conception of linguistic competence in terms of successive states of the “language faculty” (containing the principles of universal grammar) does presuppose this problematic notion. Chomsky responded to Kripke by devoting a chapter of his Knowledge of Language to defending this conception of linguistic competence against the Wittgensteinian arguments. We evaluate his response and argue that he has misidentified the threat to his linguistic project as consisting in the attack on its ‘individual psychology’ standpoint, rather than its commitment to logical compulsion. We conclude by arguing that Chomsky

  10. Linguistic complex networks as a young field of quantitative linguistics. Comment on "Approaching human language with complex networks" by J. Cong and H. Liu

    Science.gov (United States)

    Köhler, Reinhard

    2014-12-01

    We have long been used to the domination of qualitative methods in modern linguistics. Indeed, qualitative methods have advantages such as ease of use and wide applicability to many types of linguistic phenomena. However, this shall not overshadow the fact that a great part of human language is amenable to quantification. Moreover, qualitative methods may lead to over-simplification by employing the rigid yes/no scale. When variability and vagueness of human language must be taken into account, qualitative methods will prove inadequate and give way to quantitative methods [1, p. 11]. In addition to such advantages as exactness and precision, quantitative concepts and methods make it possible to find laws of human language which are just like those in natural sciences. These laws are fundamental elements of linguistic theories in the spirit of the philosophy of science [2,3]. Theorization effort of this type is what quantitative linguistics [1,4,5] is devoted to. The review of Cong and Liu [6] has provided an informative and insightful survey of linguistic complex networks as a young field of quantitative linguistics, including the basic concepts and measures, the major lines of research with linguistic motivation, and suggestions for future research.

  11. Evaluating Functional Annotations of Enzymes Using the Gene Ontology.

    Science.gov (United States)

    Holliday, Gemma L; Davidson, Rebecca; Akiva, Eyal; Babbitt, Patricia C

    2017-01-01

    The Gene Ontology (GO) (Ashburner et al., Nat Genet 25(1):25-29, 2000) is a powerful tool in the informatics arsenal of methods for evaluating annotations in a protein dataset. From identifying the nearest well annotated homologue of a protein of interest to predicting where misannotation has occurred to knowing how confident you can be in the annotations assigned to those proteins is critical. In this chapter we explore what makes an enzyme unique and how we can use GO to infer aspects of protein function based on sequence similarity. These can range from identification of misannotation or other errors in a predicted function to accurate function prediction for an enzyme of entirely unknown function. Although GO annotation applies to any gene products, we focus here on describing our approach for hierarchical classification of enzymes in the Structure-Function Linkage Database (SFLD) (Akiva et al., Nucleic Acids Res 42(Database issue):D521-530, 2014) as a guide for informed utilisation of annotation transfer based on GO terms.

  12. [An essay about science and linguistics].

    Science.gov (United States)

    Cugini, P

    2011-01-01

    Both the methodology and epistemology of science provided the criteria by which the scientific research can describe and interpret data and results of its observational or experimental studies. When the scientist approaches the conclusive inference, it is mandatory to think that both the knowledge and truth imply the use of words semantically and etymologically (semiologically) appropriate, especially if neologisms are required. Lacking a vocabulary, there will be the need of popularizing the inference to the linguistics of the context to which the message is addressed. This could imply a discrepancy among science, knowledge, truth and linguistics, that can be defined "semiologic bias". To avoid this linguistic error, the scientist must feel the responsibility to provide the scientific community with the new words that are semantically and etymologically coherent with what has been scientifically discovered.

  13. Political Liberalism, Linguistic Diversity and Equal Treatment

    Science.gov (United States)

    Bonotti, Matteo

    2017-01-01

    This article explores the implications of John Rawls' political liberalism for linguistic diversity and language policy, by focusing on the following question: what kind(s) of equality between speakers of different languages and with different linguistic identities should the state guarantee under political liberalism? The article makes three…

  14. Child Participant Roles in Applied Linguistics Research

    Science.gov (United States)

    Pinter, Annamaria

    2014-01-01

    Children's status as research participants in applied linguistics has been largely overlooked even though unique methodological and ethical concerns arise in projects where children, rather than adults, are involved. This article examines the role of children as research participants in applied linguistics and discusses the limitations of…

  15. Term Bases and Linguistic Linked Open Data

    DEFF Research Database (Denmark)

    for pursuing their work. The theme of this year’s TKE is ‘Term Bases and Linguistic Linked Open Data’. Mono- and multi-lingual term bases, which contain information about concepts (terms, definitions, examples of use, references, comments on equivalence etc.), have always made up valuable linguistic resources...

  16. Applied antineutrino physics workshop

    International Nuclear Information System (INIS)

    Lund, James C.

    2008-01-01

    This workshop is the fourth one of a series that includes the Neutrino Geophysics Conference at Honolulu, Hawaii, which I attended in 2005. This workshop was organized by the Astro-Particle and Cosmology laboratory in the recently opened Condorcet building of the University of Paris. More information, including copies of the presentations, on the workshop is available on the website: www.apc.univ-paris7.fr/AAP2007/. The workshop aims at opening neutrino physics to various fields such that it can be applied in geosciences, nuclear industry (reactor and spent fuel monitoring) and non-proliferation. The workshop was attended by over 60 people from Europe, USA, Asia and Brazil. The meeting was also attended by representatives of the Comprehensive Nuclear-Test-Ban Treaty (CTBT) and the International Atomic Energy Agency (IAEA). The workshop also included a workshop dinner on board a river boat sailing the Seine river.

  17. AutoFACT: An Automatic Functional Annotation and Classification Tool

    Directory of Open Access Journals (Sweden)

    Lang B Franz

    2005-06-01

    Background: Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results: We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only between 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion: AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm.

  18. LeARN: a platform for detecting, clustering and annotating non-coding RNAs

    Directory of Open Access Journals (Sweden)

    Schiex Thomas

    2008-01-01

    Background: In the last decade, sequencing projects have led to the development of a number of annotation systems dedicated to the structural and functional annotation of protein-coding genes. These annotation systems manage the annotation of the non-protein coding genes (ncRNAs) in a very crude way, allowing neither the editing of the secondary structures nor the clustering of ncRNA genes into families, both of which are crucial for appropriate annotation of these molecules. Results: LeARN is a flexible software package which handles the complete process of ncRNA annotation by integrating the layers of automatic detection and human curation. Conclusion: This software provides the infrastructure to deal properly with ncRNAs in the framework of any annotation project. It fills the gap between existing prediction software, which detects independent ncRNA occurrences, and public ncRNA repositories, which do not offer the flexibility and interactivity required for annotation projects. The software is freely available from the download section of the website http://bioinfo.genopole-toulouse.prd.fr/LeARN

  19. Essential Annotation Schema for Ecology (EASE)—A framework supporting the efficient data annotation and faceted navigation in ecology

    Science.gov (United States)

    Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data-intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation provides a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows users to refine and improve the results towards more relevance. We developed a framework for efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines. PMID:29023519
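
    The filter mechanism described above, a full text search narrowed by categorical and numeric facets, can be sketched as follows. This is an illustration under stated assumptions rather than the EASE implementation: the framework stores annotations in an XML schema, whereas the sketch uses plain dictionaries, and the function name `facet_filter`, the field names, and the sample records are all hypothetical.

```python
def facet_filter(records, text="", **facets):
    """Match `text` against each record title, then narrow by facets.

    A categorical facet is passed as a single value (biome="forest");
    a numeric facet is passed as an inclusive range (year=(2013, 2016)).
    Returns the titles of the records that pass every filter.
    """
    hits = []
    for rec in records:
        # Full text step: case-insensitive substring match on the title.
        if text.lower() not in rec["title"].lower():
            continue
        matches = True
        for key, wanted in facets.items():
            value = rec.get(key)
            if isinstance(wanted, tuple):
                low, high = wanted
                matches = value is not None and low <= value <= high
            else:
                matches = value == wanted
            if not matches:
                break
        if matches:
            hits.append(rec["title"])
    return hits


# Hypothetical annotated search objects.
records = [
    {"title": "Leaf litter decomposition", "biome": "forest", "year": 2012},
    {"title": "Leaf traits in grassland plots", "biome": "grassland", "year": 2015},
    {"title": "Soil carbon stocks", "biome": "forest", "year": 2015},
]

print(facet_filter(records, "leaf"))
print(facet_filter(records, "leaf", biome="forest"))
print(facet_filter(records, year=(2013, 2016)))
```

    Each additional facet can only shrink the result list, which is the property that lets a user refine a broad full text query step by step.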

  20. MicroScope: a platform for microbial genome annotation and comparative genomics.

    Science.gov (United States)

    Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  1. Linguistic approaches to the study of Persian Literature

    OpenAIRE

    محمد امین ناصح

    2010-01-01

    Since the start of the last century, along with those literary men who took a literary approach to the study of literary texts, there has been another group who has taken a linguistic approach. The ancient and ever flourishing tradition of literary studies of translation can no doubt take benefit from linguistic methods and tools in the investigations of literary texts. As a result of this we come across linguistic terms in three high school textbooks of Persian language and literature. In f...

  2. Ideology, Linguistic Capital and the Medium of Instruction in Hong Kong.

    Science.gov (United States)

    Morrison, Keith; Lui, Icy

    2000-01-01

    Examines the links between linguistic capital, cultural capital, linguistic imperialism, and the use of English as the medium of instruction (MOI) in Hong Kong. Suggests that the notion of linguistic imperialism in Hong Kong is superseded by the notion of linguistic capital, although neither presents a complete analysis of the MOI issue in Hong…

  3. Words Get in the Way: Linguistic Effects on Talker Discrimination.

    Science.gov (United States)

    Narayan, Chandan R; Mak, Lorinda; Bialystok, Ellen

    2017-07-01

    A speech perception experiment provides evidence that the linguistic relationship between words affects the discrimination of their talkers. Listeners discriminated two talkers' voices with various linguistic relationships between their spoken words. Listeners were asked whether two words were spoken by the same person or not. Word pairs varied with respect to the linguistic relationship between the component words, forming either: phonological rhymes, lexical compounds, reversed compounds, or unrelated pairs. The degree of linguistic relationship between the words affected talker discrimination in a graded fashion, revealing biases listeners have regarding the nature of words and the talkers that speak them. These results indicate that listeners expect a talker's words to be linguistically related, and more generally, indexical processing is affected by linguistic information in a top-down fashion even when listeners are not told to attend to it. Copyright © 2016 Cognitive Science Society, Inc.

  4. Exploring Linguistic Identity in Young Multilingual Learners

    Science.gov (United States)

    Dressler, Roswita

    2014-01-01

    This article explores the linguistic identity of young multilingual learners through the use of a Language Portrait Silhouette. Examples from a research study of children aged 6-8 years in a German bilingual program in Canada provide teachers with an understanding that linguistic identity comprises expertise, affiliation, and inheritance. This…

  5. Youth Culture, Language Endangerment and Linguistic Survivance

    Science.gov (United States)

    Wyman, Leisy

    2012-01-01

    Detailing a decade of life and language use in a remote Alaskan Yup'ik community, Youth Culture, Language Endangerment and Linguistic Survivance provides rare insight into young people's language brokering and Indigenous people's contemporary linguistic ecologies. This book examines how two consecutive groups of youth in a Yup'ik village…

  6. Improving English Instruction through Neuro-Linguistic Programming

    Science.gov (United States)

    Helm, David Jay

    2009-01-01

    This study examines the background information and numerous applications of neuro-linguistic programming as it applies to improving English instruction. In addition, the N.L.P. modalities of eye movement, the use of predicates, and posturing are discussed. Neuro-linguistic programming presents all students of English an opportunity to reach their…

  7. Second international tsunami workshop on the technical aspects of tsunami warning systems, tsunami analysis, preparedness, observation and instrumentation

    International Nuclear Information System (INIS)

    1989-01-01

    The Second Workshop on the Technical Aspects of Tsunami Warning Systems, Tsunami Analysis, Preparedness, Observation, and Instrumentation, sponsored and convened by the Intergovernmental Oceanographic Commission (IOC), was held on 1-2 August 1989, in the modern and attractive research town of Academgorodok, which is located 20 km south from downtown Novosibirsk, the capital of Siberia, USSR. The Program was arranged in eight major areas of interest covering the following: Opening and Introduction; Survey of Existing Tsunami Warning Centers - present status, results of work, plans for future development; Survey of some existing seismic data processing systems and future projects; Methods for fast evaluation of Tsunami potential and perspectives of their implementation; Tsunami data bases; Tsunami instrumentation and observations; Tsunami preparedness; and finally, a general discussion and adoption of recommendations. The Workshop presentations not only addressed the conceptual improvements that have been made, but focused on the inner workings of the Tsunami Warning System, as well, including computer applications, on-line processing and numerical modelling. Furthermore, presentations reported on the progress that has been made in the last few years on data telemetry, instrumentation and communications. Emphasis was placed on new concepts and their application into operational techniques that can result in improvements in data collection, rapid processing of the data, and in analysis and prediction. A Summary Report on the Second International Tsunami Workshop, containing abstracted and annotated proceedings, has been published as a separate report. The present Report is a Supplement to the Summary Report and contains the full text of the papers presented at this Workshop. Refs, figs and tabs

  8. Linguistic Theory in the Practical Lexicography of the African Languages

    Directory of Open Access Journals (Sweden)

    Emmanuel Chabata

    2011-10-01

    Abstract: In this article, we look at the relationship between linguistics and lexicography. We specifically look at the relevance of data derived from theoretical linguistic investigations to the compilation of dictionaries in African languages. Our point of departure is that since it is language description that lies at the core of both lexicography and linguistic theory, lexicographers can improve their work by using insights from theoretically-guided linguistic investigations. Our view is that as long as lexicographers focus on words and their existence in the linguistic system, they cannot work effectively without referring to linguistic theory, consciously or unconsciously. Lexicography is not only concerned with dictionary creation, that is, with the collection of lexical units and their proper description in dictionary entries, but also with the theoretical aspects concerning the lexicon. It is necessary for dictionaries to capture all lexical interrelationships of a phonetic, morphological, syntactic or semantic nature. Drawing examples from a few dictionaries on African languages, we try to show how dictionary compilers have benefited from specific theoretical investigations in general linguistics. We look at how the different linguistic theories have contributed to the improvement in the quality of the contents of some dictionaries of African languages. Our conclusion is that there is a stronger bond between linguistic theory and lexicographic practice than is generally assumed. Ways must therefore be found to understand the various links between the two disciplines. There should be a deliberate move from mutual neglect to collaboration between the two disciplines.

  9. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  10. ONEMercury: Towards Automatic Annotation of Earth Science Metadata

    Science.gov (United States)

    Tuarob, S.; Pouchard, L. C.; Noy, N.; Horsburgh, J. S.; Palanisamy, G.

    2012-12-01

    Earth sciences have become more data-intensive, requiring access to heterogeneous data collected from multiple places, times, and thematic scales. For example, research on climate change may involve exploring and analyzing observational data such as the migration of animals and temperature shifts across the earth, as well as various model-observation inter-comparison studies. Recently, DataONE, a federated data network built to facilitate access to and preservation of environmental and ecological data, has come to exist. ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for discovering and accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple data repositories and makes it searchable via a common search interface built upon cutting edge search engine technology, allowing users to interact with the system, intelligently filter the search results on the fly, and fetch the data from distributed data sources. Linking data from heterogeneous sources always has a cost. A problem that ONEMercury faces is the different levels of annotation in the harvested metadata records. Poorly annotated records tend to be missed during the search process as they lack meaningful keywords. Furthermore, such records would not be compatible with the advanced search functionality offered by ONEMercury as the interface requires a metadata record be semantically annotated. The explosion of the number of metadata records harvested from an increasing number of data repositories makes it impossible to annotate the harvested records manually, urging the need for a tool capable of automatically annotating poorly curated metadata records. In this paper, we propose a topic-model (TM) based approach for automatic metadata annotation. Our approach mines topics in the set of well annotated records and suggests keywords for poorly annotated records based on topic similarity. We utilize the

  11. "New linguistic issues", by Pier Pasolini, is causing scandal among linguists, philologists, writers, critics and intellectuals

    Directory of Open Access Journals (Sweden)

    Teodoro Negri

    1993-12-01

    Full Text Available Pasolini departs from the diagnosis of a problem: the critical quest stage in contemporary literature, centered on the 1950s; he points out the author's inability to create the design for a national language. He goes on to analyze the deep mutation in Italian society, which determined a new socio-linguistic outlook; to wit, a language clearly marked by strong technicality and instrumentation. Drawing examples from newspapers, TV features, official political speeches and commercials, Pasolini demonstrates that factual communication takes precedence over formal expression. This is ascribed to one principle which sets both rules and approvals for all forms of national language. This fact, according to Pasolini, is the result of an industrial and technological transformation process, which would permit the advent of a new linguistic bourgeoisie. The linguistic unification caused by such an approving principle would, therefore, imply the social manifestation of the bourgeoisie.

  12. Design Features for Linguistically-Mediated Meaning Construction: The Relative Roles of the Linguistic and Conceptual Systems in Subserving the Ideational Function of Language.

    Science.gov (United States)

    Evans, Vyvyan

    2016-01-01

    Recent research in language and cognitive science proposes that the linguistic system evolved to provide an "executive" control system on the evolutionarily more ancient conceptual system (e.g., Barsalou et al., 2008; Evans, 2009, 2015a,b; Bergen, 2012). In short, the claim is that embodied representations in the linguistic system interface with non-linguistic representations in the conceptual system, facilitating rich meanings, or simulations, enabling linguistically mediated communication. In this paper I build on these proposals by examining the nature of what I identify as design features for this control system. In particular, I address how the ideational function of language (our ability to deploy linguistic symbols to convey meanings of great complexity) is facilitated. The central proposal of this paper is as follows. The linguistic system of any given language user, of any given linguistic system (spoken or signed), facilitates access to knowledge representation (concepts) in the conceptual system, which subserves this ideational function. In the most general terms, the human meaning-making capacity is underpinned by two distinct, although tightly coupled representational systems: the conceptual system and the linguistic system. Each system contributes to meaning construction in qualitatively distinct ways. This leads to the first design feature: given that the two systems are representational (they are populated by semantic representations), the nature and function of the representations are qualitatively different. This proposed design feature I term the bifurcation in semantic representation. After all, it stands to reason that if a linguistic system has a different function, vis-à-vis the conceptual system, which is of far greater evolutionary antiquity, then the semantic representations will be complementary, and as such, qualitatively different, reflecting the functional distinctions of the two systems, in collectively giving rise to meaning. I consider the

  13. A Selected Annotated Bibliography on Work Time Options.

    Science.gov (United States)

    Ivantcho, Barbara

    This annotated bibliography is divided into three sections. Section I contains annotations of general publications on work time options. Section II presents resources on flexitime and the compressed work week. In Section III are found resources related to these reduced work time options: permanent part-time employment, job sharing, voluntary…

  14. Peer-Review Writing Workshops in College Courses: Students’ Perspectives about Online and Classroom Based Workshops

    Directory of Open Access Journals (Sweden)

    Erin B. Jensen

    2016-11-01

    Full Text Available Peer-review workshops are commonly used in writing courses as a way for students to give their peers feedback as well as help their own writing. Most of the research on peer-review workshops focuses on workshops held in traditional in-person courses, with less research on peer-review workshops held online. Students in a freshman writing course experienced both a classroom-based writing workshop and an online workshop and then took a survey about their experiences. The majority of the students preferred the online writing workshop because of its convenience and the ability to post anonymous reviews. Students who preferred the traditional in-person writing workshop liked being able to talk with their peers about their papers. This research article focuses on the students’ responses and experiences with traditional and online peer reviews.

  15. Genre Analysis in the Frame of Systemic Functional Linguistics

    Directory of Open Access Journals (Sweden)

    Najih Imtihani

    2012-08-01

    Full Text Available Systemic Functional Linguistics is a linguistic approach which considers not only the structure of the language but also its social context. In Systemic Functional Linguistics the concept of genre is defined as a step-by-step activity to reach a goal. The concept of genre is used to describe the cultural context in a language. According to this view, a text should be seen and observed in its interaction with the context and social background. For that reason, genre analysis will constantly involve the linguistic social context in the forms of field, tenor, mode, schematic structure and its realization in the text.

  16. ICP-MS Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Carman, April J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Eiden, Gregory C. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2014-11-01

    This is a short document that explains the materials that will be transmitted to LLNL and DNN HQ regarding the ICP-MS Workshop held at PNNL June 17-19th. The goal of the information is to pass on to LLNL information regarding the planning and preparations for the Workshop at PNNL in preparation of the SIMS workshop at LLNL.

  17. Using the Linguistic Landscape to Bridge Languages

    Science.gov (United States)

    Mari, Vanessa

    2018-01-01

    In this article Vanessa Mari describes how she uses the linguistic landscape to bridge two or more languages with students learning English. The linguistic landscape is defined by Landry and Bourhis (1997, 25) as "the language of public road signs, advertising billboards, street names, place names, commercial shop signs, and public signs on…

  18. 78 FR 33849 - Battery-Powered Medical Devices Workshop: Challenges and Opportunities; Public Workshop; Request...

    Science.gov (United States)

    2013-06-05

    ... after the public workshop on the Internet at http://www.fda.gov/MedicalDevices/NewsEvents/Workshops..., compact, and mobile, the number of battery-powered medical devices will continue to increase. While many...] Battery-Powered Medical Devices Workshop: Challenges and Opportunities; Public Workshop; Request for...

  19. How Far Is Stanford from Prague (and vice versa)? Comparing Two Dependency-based Annotation Schemes by Network Analysis

    Directory of Open Access Journals (Sweden)

    Marco Passarotti

    2016-07-01

    Full Text Available The paper evaluates the differences between two currently leading annotation schemes for dependency treebanks. By relying on four treebanks, we demonstrate that the treatment of conjunctions and adpositions represents the core difference between the two schemes and that this impacts the topological properties of the linguistic networks induced from the treebanks. We also show that such properties are reflected in the performances of four probabilistic dependency parsers trained on the treebanks.

  20. Can delusions be understood linguistically?

    Science.gov (United States)

    Hinzen, Wolfram; Rosselló, Joana; McKenna, Peter

    2016-07-01

    Delusions are widely believed to reflect disturbed cognitive function, but the nature of this remains elusive. The "un-Cartesian" cognitive-linguistic hypothesis maintains (a) that there is no thought separate from language, that is, there is no distinct mental space removed from language where "thinking" takes place; and (b) that a somewhat broadened concept of grammar is responsible for bestowing meaning on propositions, and this among other things gives them their quality of being true or false. It is argued that a loss of propositional meaning explains why delusions are false, impossible and sometimes fantastic. A closely related abnormality, failure of linguistic embedding, can additionally account for why delusions are held with fixed conviction and are not adequately justified by the patient. The un-Cartesian linguistic approach to delusions has points of contact with Frith's theory that inability to form meta-representations underlies a range of schizophrenic symptoms. It may also be relevant to the nature of the "second factor" in monothematic delusions in neurological disease. Finally, it can inform the current debate about whether or not delusions really are beliefs.

  1. Prepare-Participate-Connect: Active Learning with Video Annotation

    Science.gov (United States)

    Colasante, Meg; Douglas, Kathy

    2016-01-01

    Annotation of video provides students with the opportunity to view and engage with audiovisual content in an interactive and participatory way rather than in passive-receptive mode. This article discusses research into the use of video annotation in four vocational programs at RMIT University in Melbourne, which allowed students to interact with…

  2. Formal monkey linguistics

    NARCIS (Netherlands)

    Schlenker, Philippe; Chemla, Emmanuel; Schel, Anne M.; Fuller, James; Gautier, Jean Pierre; Kuhn, Jeremy; Veselinović, Dunja; Arnold, Kate; Cäsar, Cristiane; Keenan, Sumir; Lemasson, Alban; Ouattara, Karim; Ryder, Robin; Zuberbühler, Klaus

    2016-01-01

    We argue that rich data gathered in experimental primatology in the last 40 years can benefit from analytical methods used in contemporary linguistics. Focusing on the syntactic and especially semantic side, we suggest that these methods could help clarify five questions: (i) what morphology and

  3. Re-annotation and re-analysis of the Campylobacter jejuni NCTC11168 genome sequence

    Directory of Open Access Journals (Sweden)

    Dorrell Nick

    2007-06-01

    Full Text Available Background: Campylobacter jejuni is the leading bacterial cause of human gastroenteritis in the developed world. To improve our understanding of this important human pathogen, the C. jejuni NCTC11168 genome was sequenced and published in 2000. The original annotation was a milestone in Campylobacter research, but is outdated. We now describe the complete re-annotation and re-analysis of the C. jejuni NCTC11168 genome using current database information, novel tools and annotation techniques not used during the original annotation. Results: Re-annotation was carried out using sequence database searches such as FASTA, along with programs such as TMHMM for additional support. The re-annotation also utilises sequence data from additional Campylobacter strains and species not available during the original annotation. Re-annotation was accompanied by a full literature search that was incorporated into the updated EMBL file [EMBL: AL111168]. The C. jejuni NCTC11168 re-annotation reduced the total number of coding sequences from 1654 to 1643, of which 90.0% have additional information regarding the identification of new motifs and/or relevant literature. Re-annotation has led to 18.2% of coding sequence product functions being revised. Conclusions: Major updates were made to genes involved in the biosynthesis of important surface structures such as lipooligosaccharide, capsule and both O- and N-linked glycosylation. This re-annotation will be a key resource for Campylobacter research and will also provide a prototype for the re-annotation and re-interpretation of other bacterial genomes.

  4. Comparison of concept recognizers for building the Open Biomedical Annotator

    Directory of Open Access Journals (Sweden)

    Rubin Daniel

    2009-09-01

    Full Text Available The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers – NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvement for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.
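
    The precision and recall measures mentioned in this abstract can be illustrated with a minimal sketch. It assumes annotations are represented as (document, concept) pairs and that a manually curated gold set exists; the function name and the sample data are hypothetical and are not taken from the paper, MetaMap, or Mgrep.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of annotations.

    Each annotation is assumed to be a hashable item, for example a
    (document_id, concept) pair. This representation is an assumption
    made for illustration only.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold-standard annotations and recognizer output.
gold = {("doc1", "melanoma"), ("doc1", "skin"), ("doc2", "liver")}
predicted = {
    ("doc1", "melanoma"),
    ("doc2", "liver"),
    ("doc2", "kidney"),
    ("doc3", "lung"),
}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

    Here two of the four predicted pairs are correct, and two of the three gold pairs are found, so precision and recall differ even though the number of true positives is the same.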

  5. Effects of Reviewing Annotations and Homework Solutions on Math Learning Achievement

    Science.gov (United States)

    Hwang, Wu-Yuin; Chen, Nian-Shing; Shadiev, Rustam; Li, Jin-Sing

    2011-01-01

    Previous studies have demonstrated that making annotations can be a meaningful and useful learning method that promotes metacognition and enhances learning achievement. A web-based annotation system, Virtual Pen (VPEN), which provides for the creation and review of annotations and homework solutions, has been developed to foster learning process…

  6. LocusTrack: Integrated visualization of GWAS results and genomic annotation.

    Science.gov (United States)

    Cuellar-Partida, Gabriel; Renteria, Miguel E; MacGregor, Stuart

    2015-01-01

    Genome-wide association studies (GWAS) are an important tool for the mapping of complex traits and diseases. Visual inspection of genomic annotations may be used to generate insights into the biological mechanisms underlying GWAS-identified loci. We developed LocusTrack, a web-based application that annotates and creates plots of regional GWAS results and incorporates user-specified tracks that display annotations such as linkage disequilibrium (LD), phylogenetic conservation, chromatin state, and other genomic and regulatory elements. Currently, LocusTrack can integrate annotation tracks from the UCSC genome-browser as well as from any tracks provided by the user. LocusTrack is an easy-to-use application and can be accessed at the following URL: http://gump.qimr.edu.au/general/gabrieC/LocusTrack/. Users can upload and manage GWAS results and select from and/or provide annotation tracks using simple and intuitive menus. LocusTrack scripts and associated data can be downloaded from the website and run locally.

  7. "Annotated Lectures": Student-Instructor Interaction in Large-Scale Global Education

    Directory of Open Access Journals (Sweden)

    Roger Diehl

    2009-10-01

    Full Text Available We describe an "Annotated Lectures" system, which will be used in a global virtual teaching and student collaboration event on embodied intelligence presented by the University of Zurich. The lectures will be broadcast via video-conference to lecture halls of different universities around the globe. Among other collaboration features, an "Annotated Lectures" system will be implemented in a 3D collaborative virtual environment and used by the participating students to make annotations to the video-recorded lectures, which will be sent to and answered by their supervisors, and forwarded to the lecturers in an aggregated way. The "Annotated Lectures" system aims to overcome the issues of limited student-instructor interaction in large-scale education, and to foster an intercultural and multidisciplinary discourse among students who review the lectures in a group. After presenting the concept of the "Annotated Lectures" system, we discuss a prototype version including a description of the technical components and its expected benefit for large-scale global education.

  8. Annotation an effective device for student feedback: a critical review of the literature.

    Science.gov (United States)

    Ball, Elaine C

    2010-05-01

    The paper examines hand-written annotation, its many features, difficulties and strengths as a feedback tool. It extends and clarifies what modest evidence is in the public domain and offers an evaluation of how to use annotation effectively in the support of student feedback [Marshall, C.M., 1998a. The Future of Annotation in a Digital (paper) World. Presented at the 35th Annual GLSLIS Clinic: Successes and Failures of Digital Libraries, June 20-24, University of Illinois at Urbana-Champaign, March 24, pp. 1-20; Marshall, C.M., 1998b. Toward an ecology of hypertext annotation. Hypertext. In: Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, June 20-24, Pittsburgh Pennsylvania, US, pp. 40-49; Wolfe, J.L., Nuewirth, C.M., 2001. From the margins to the centre: the future of annotation. Journal of Business and Technical Communication, 15(3), 333-371; Diyanni, R., 2002. One Hundred Great Essays. Addison-Wesley, New York; Wolfe, J.L., 2002. Marginal pedagogy: how annotated texts affect writing-from-source texts. Written Communication, 19(2), 297-333; Liu, K., 2006. Annotation as an index to critical writing. Urban Education, 41, 192-207; Feito, A., Donahue, P., 2008. Minding the gap annotation as preparation for discussion. Arts and Humanities in Higher Education, 7(3), 295-307; Ball, E., 2009. A participatory action research study on handwritten annotation feedback and its impact on staff and students. Systemic Practice and Action Research, 22(2), 111-124; Ball, E., Franks, H., McGrath, M., Leigh, J., 2009. Annotation is a valuable tool to enhance learning and assessment in student essays. Nurse Education Today, 29(3), 284-291]. Although a significant number of studies examine annotation, this is largely related to on-line tools and computer mediated communication and not hand-written annotation as comment, phrase or sign written on the student essay to provide critique. 
Little systematic research has been conducted to consider how this latter form

  9. BioCause: Annotating and analysing causality in the biomedical domain.

    Science.gov (United States)

    Mihăilă, Claudiu; Ohta, Tomoko; Pyysalo, Sampo; Ananiadou, Sophia

    2013-01-16

    Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems. However, bio-event annotation alone cannot cater for all the needs of biologists. Unlike work on relation and event extraction, most of which focusses on specific events and named entities, we aim to build a comprehensive resource, covering all statements of causal association present in discourse. Causality lies at the heart of biomedical knowledge, such as diagnosis, pathology or systems biology, and, thus, automatic causality recognition can greatly reduce the human workload by suggesting possible causal connections and aiding in the curation of pathway models. A biomedical text corpus annotated with such relations is, hence, crucial for developing and evaluating biomedical text mining. We have defined an annotation scheme for enriching biomedical domain corpora with causality relations. This schema has subsequently been used to annotate 851 causal relations to form BioCause, a collection of 19 open-access full-text biomedical journal articles belonging to the subdomain of infectious diseases. These documents have been pre-annotated with named entity and event information in the context of previous shared tasks. We report an inter-annotator agreement rate of over 60% for triggers and of over 80% for arguments using an exact match constraint. These increase significantly using a relaxed match setting. Moreover, we analyse and describe the causality relations in BioCause from various points of view. This information can then be leveraged for the training of automatic causality detection systems. Augmenting named entity and event annotations with information about causal discourse relations could benefit the development of more sophisticated IE systems. These will further influence the development of multiple tasks, such as enabling textual inference to detect entailments, discovering new facts and providing new

  10. Automatic Function Annotations for Hoare Logic

    Directory of Open Access Journals (Sweden)

    Daniel Matichuk

    2012-11-01

    Full Text Available In systems verification we are often concerned with multiple, inter-dependent properties that a program must satisfy. To prove that a program satisfies a given property, the correctness of intermediate states of the program must be characterized. However, this intermediate reasoning is not always phrased such that it can be easily re-used in the proofs of subsequent properties. We introduce a function annotation logic that extends Hoare logic in two important ways: (1) when proving that a function satisfies a Hoare triple, intermediate reasoning is automatically stored as function annotations, and (2) these function annotations can be exploited in future Hoare logic proofs. This reduces duplication of reasoning between the proofs of different properties, whilst serving as a drop-in replacement for traditional Hoare logic to avoid the costly process of proof refactoring. We explain how this was implemented in Isabelle/HOL and applied to an experimental branch of the seL4 microkernel to significantly reduce the size and complexity of existing proofs.
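
    For readers less familiar with the notation, the standard Hoare triple and the sequencing rule can be written as below. This is textbook Hoare logic only, not the function annotation logic that the abstract introduces; the symbols P, Q, R, c, c_1 and c_2 are generic placeholders chosen for illustration.

```latex
% A Hoare triple: if precondition P holds before command c runs
% (and c terminates), then postcondition Q holds afterwards.
\[
  \{P\}\; c \;\{Q\}
\]

% Sequencing rule: the intermediate assertion Q links the two commands.
\[
  \frac{\{P\}\; c_1 \;\{Q\} \qquad \{Q\}\; c_2 \;\{R\}}
       {\{P\}\; c_1 ;\, c_2 \;\{R\}}
\]
```

    The intermediate assertion Q in the sequencing rule is an example of the kind of intermediate reasoning that, in plain Hoare logic, has to be restated in each new proof.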

  11. Linguistic Variability and Intellectual Development. Miami Linguistics Series No. 9.

    Science.gov (United States)

    von Humboldt, Wilhelm

    Although this edition of Wilhelm von Humboldt's "Linguistic Variability and Intellectual Development" is based entirely on the original German edition, the translators (George C. Buck and Frithjof A. Raven) and the publisher have attempted to clarify certain aspects of this work for the modern-day reader. These features include the addition of…

  12. Automatically Annotated Mapping for Indoor Mobile Robot Applications

    DEFF Research Database (Denmark)

    Özkil, Ali Gürcan; Howard, Thomas J.

    2012-01-01

    This paper presents a new and practical method for mapping and annotating indoor environments for mobile robot use. The method makes use of 2D occupancy grid maps for metric representation, and topology maps to indicate the connectivity of the ‘places-of-interests’ in the environment. Novel use...... localization and mapping in topology space, and fuses camera and robot pose estimations to build an automatically annotated global topo-metric map. It is developed as a framework for a hospital service robot and tested in a real hospital. Experiments show that the method is capable of producing globally...... consistent, automatically annotated hybrid metric-topological maps that are needed by mobile service robots....

  13. Workshop report

    African Journals Online (AJOL)

    abp

    2017-09-14

    Sep 14, 2017 ... health: report of first EQUIST training workshop in Nigeria .... The difference between the before and after measurements was ... After the administration of the pre-workshop questionnaire the ... represent Likert rating scale of 1-5 points, where 1point = grossly .... Procedures Manual for the "Evaluating.

  14. Mathematical Approaches to Cognitive Linguistics

    Directory of Open Access Journals (Sweden)

    Chuluundorj Begz

    2013-05-01

    Full Text Available Cognitive linguistics and the neuro-cognitive and psychological analysis of human verbal cognition present an important area of multidisciplinary research. Mathematical methods and models have been introduced in a number of publications, with increasing attention to these theories. In this paper we describe some possible applications of mathematical methods to cognitive linguistics. Human verbal perception and verbal mapping deal with dissipative mental structures and symmetric/asymmetric relationships between objects of perception and deep (also surface) structures of language. In this way, methods of tensor analysis are an ambitious candidate to be applied to the analysis of human verbal thinking and mental space.

  15. Linguistics, human communication and psychiatry.

    Science.gov (United States)

    Thomas, P; Fraser, W

    1994-11-01

    Psycholinguistics and sociolinguistics have extended our understanding of the abnormal communication seen in psychosis, as well as that of people with autism and Asperger's syndrome. Psycholinguistics has the potential to increase the explanatory power of cognitive and neuropsychological approaches to psychosis and new methods of assessment and therapy are now being developed, based on linguistic theory. A MEDLINE literature search was used. Of 205 relevant articles identified, 65 were selected for review. Greater familiarity with linguistic theory could improve psychiatrists' assessment skills and their understanding of the relevance of human communication to the new cognitive models of psychosis.

  16. Untangling Linguistic Salience

    NARCIS (Netherlands)

    Boswijk, Vincent; Coler, Matt; Loerts, Hanneke; Hilton, Nanna

    2018-01-01

    The concept of linguistic salience is broadly used within sociolinguistics to account for processes as diverse as language change (Kerswill & Williams, 2002) and language acquisition (Ellis, 2016) in that salient forms are e.g. more likely to undergo change, or are often acquired earlier than other

  17. Linguistics and Literacy.

    Science.gov (United States)

    Kindell, Gloria

    1983-01-01

    Discusses four general areas of linguistics studies that are particularly relevant to literacy issues: (1) discourse analysis, including text analysis, spoken and written language, and home and school discourse; (2) relationships between speech and writing, the distance between dialects and written norms, and developmental writing; (3)…

  18. Saussure and Linguistic Geography.

    Science.gov (United States)

    Harris, Roy

    1993-01-01

    Discusses Saussure's "Cours de linguistique generale," which was published in 1916, and devotes specific attention to the significance of Part VI, which is devoted to linguistic geography. (16 references) (Author/VWL)

  19. AGORA: Organellar genome annotation from the amino acid and nucleotide references.

    Science.gov (United States)

    Jung, Jaehee; Kim, Jong Im; Jeong, Young-Sik; Yi, Gangman

    2018-03-29

    Next-generation sequencing (NGS) technologies have led to the accumulation of high-throughput sequence data from various organisms in biology. To apply gene annotation of organellar genomes for various organisms, more optimized tools for functional gene annotation are required. Almost all gene annotation tools are mainly focused on the chloroplast genome of land plants or the mitochondrial genome of animals. We have developed a web application, AGORA, for the fast, user-friendly, and improved annotation of organellar genomes. AGORA annotates genes based on a BLAST-based homology search and clustering with selected reference sequences from the NCBI database or user-defined uploaded data. AGORA can annotate the functional genes in almost all mitochondrion and plastid genomes of eukaryotes. The gene annotation of a genome with an exon-intron structure within a gene or inverted repeat region is also available. It provides information on the start and end positions of each gene, BLAST results compared with the reference sequence, and visualization of the gene map by OGDRAW. Users can freely use the software, and the accessible URL is https://bigdata.dongguk.edu/gene_project/AGORA/. The main module of the tool is implemented in Python and PHP, and the web page is built with HTML and CSS to support all browsers. Contact: gangman@dongguk.edu.

  20. What Does Corpus Linguistics Have to Offer to Language Assessment?

    Science.gov (United States)

    Xi, Xiaoming

    2017-01-01

    In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they…

  1. Workshops as a Research Methodology

    DEFF Research Database (Denmark)

    Ørngreen, Rikke; Levinsen, Karin Tweddell

    2017-01-01

    This paper contributes to knowledge on workshops as a research methodology, and specifically on how such workshops pertain to e-learning. A literature review illustrated that workshops are discussed according to three different perspectives: workshops as a means, workshops as practice, and workshops as a research methodology. Focusing primarily on the latter, this paper presents five studies on upper secondary and higher education teachers’ professional development and on teaching and learning through video conferencing. Through analysis and discussion of these studies’ findings, we argue that workshops provide a platform that can aid researchers in identifying and exploring relevant factors in a given domain by providing means for understanding complex work and knowledge processes that are supported by technology (for example, e-learning). The approach supports identifying factors...

  2. An annotated corpus with nanomedicine and pharmacokinetic parameters

    Directory of Open Access Journals (Sweden)

    Lewinski NA

    2017-10-01

    Full Text Available Nastassja A Lewinski (1), Ivan Jimenez (1), Bridget T McInnes (2). (1) Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA; (2) Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA. Abstract: A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. Keywords: nanotechnology, informatics, natural language processing, text mining, corpora

  3. Annotated Tsunami bibliography: 1962-1976

    International Nuclear Information System (INIS)

    Pararas-Carayannis, G.; Dong, B.; Farmer, R.

    1982-08-01

    This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts

  4. Assessment of community-submitted ontology annotations from a novel database-journal partnership.

    Science.gov (United States)

    Berardini, Tanya Z; Li, Donghui; Muller, Robert; Chetty, Raymond; Ploetz, Larry; Singh, Shanker; Wensel, April; Huala, Eva

    2012-01-01

    As the scientific literature grows, leading to an increasing volume of published experimental data, so does the need to access and analyze this data using computational tools. The most commonly used method to convert published experimental data on gene function into controlled vocabulary annotations relies on a professional curator, employed by a model organism database or a more general resource such as UniProt, to read published articles and compose annotation statements based on the articles' contents. A more cost-effective and scalable approach capable of capturing gene function data across the whole range of biological research organisms in computable form is urgently needed. We have analyzed a set of ontology annotations generated through collaborations between the Arabidopsis Information Resource and several plant science journals. Analysis of the submissions entered using the online submission tool shows that most community annotations were well supported and the ontology terms chosen were at an appropriate level of specificity. Of the 503 individual annotations that were submitted, 97% were approved and community submissions captured 72% of all possible annotations. This new method for capturing experimental results in a computable form provides a cost-effective way to greatly increase the available body of annotations without sacrificing annotation quality. Database URL: www.arabidopsis.org.

  5. Linguistic Policies, Linguistic Planning, and Brazilian Sign Language in Brazil

    Science.gov (United States)

    de Quadros, Ronice Muller

    2012-01-01

    This article explains the consolidation of Brazilian Sign Language in Brazil through a linguistic plan that arose from the Brazilian Sign Language Federal Law 10.436 of April 2002 and the subsequent Federal Decree 5695 of December 2005. Two concrete facts that emerged from this existing language plan are discussed: the implementation of bilingual…

  6. Hemispheric lateralization of linguistic prosody recognition in comparison to speech and speaker recognition.

    Science.gov (United States)

    Kreitewolf, Jens; Friederici, Angela D; von Kriegstein, Katharina

    2014-11-15

    Hemispheric specialization for linguistic prosody is a controversial issue. While it is commonly assumed that linguistic prosody and emotional prosody are preferentially processed in the right hemisphere, neuropsychological work directly comparing processes of linguistic prosody and emotional prosody suggests a predominant role of the left hemisphere for linguistic prosody processing. Here, we used two functional magnetic resonance imaging (fMRI) experiments to clarify the role of left and right hemispheres in the neural processing of linguistic prosody. In the first experiment, we sought to confirm previous findings showing that linguistic prosody processing compared to other speech-related processes predominantly involves the right hemisphere. Unlike previous studies, we controlled for stimulus influences by employing a prosody and speech task using the same speech material. The second experiment was designed to investigate whether a left-hemispheric involvement in linguistic prosody processing is specific to contrasts between linguistic prosody and emotional prosody or whether it also occurs when linguistic prosody is contrasted against other non-linguistic processes (i.e., speaker recognition). Prosody and speaker tasks were performed on the same stimulus material. In both experiments, linguistic prosody processing was associated with activity in temporal, frontal, parietal and cerebellar regions. Activation in temporo-frontal regions showed differential lateralization depending on whether the control task required recognition of speech or speaker: recognition of linguistic prosody predominantly involved right temporo-frontal areas when it was contrasted against speech recognition; when contrasted against speaker recognition, recognition of linguistic prosody predominantly involved left temporo-frontal areas. 
    The results show that linguistic prosody processing involves functions of both hemispheres and suggest that recognition of linguistic prosody is based on

  7. Guatemalan Linguistics Project

    Science.gov (United States)

    Linguistic Reporter, 1974

    1974-01-01

    The general goals of the Guatemalan technical institution, the Proyecto Linguistico Francisco Marroquin, are to: create a national technical resource institution in linguistics and Mayan languages; enable Indians to influence programs for their communities; and stimulate the study of Mayan languages and their use as communication medium. (SW)

  8. Risk Management Techniques and Practice Workshop Workshop Report

    Energy Technology Data Exchange (ETDEWEB)

    Quinn, T; Zosel, M

    2008-12-02

    At the request of the Department of Energy (DOE) Office of Science (SC), Lawrence Livermore National Laboratory (LLNL) hosted a two-day Risk Management Techniques and Practice (RMTAP) workshop held September 18-19 at the Hotel Nikko in San Francisco. The purpose of the workshop, which was sponsored by the SC/Advanced Scientific Computing Research (ASCR) program and the National Nuclear Security Administration (NNSA)/Advanced Simulation and Computing (ASC) program, was to assess current and emerging techniques, practices, and lessons learned for effectively identifying, understanding, managing, and mitigating the risks associated with acquiring leading-edge computing systems at high-performance computing centers (HPCCs). Representatives from fifteen high-performance computing (HPC) organizations, four HPC vendor partners, and three government agencies attended the workshop. The overall workshop findings were: (1) Standard risk management techniques and tools are in the aggregate applicable to projects at HPCCs and are commonly employed by the HPC community; (2) HPC projects have characteristics that necessitate a tailoring of the standard risk management practices; (3) All HPCC acquisition projects can benefit by employing risk management, but the specific choice of risk management processes and tools is less important to the success of the project; (4) The special relationship between the HPCCs and HPC vendors must be reflected in the risk management strategy; (5) Best practices findings include developing a prioritized risk register with special attention to the top risks, establishing a practice of regular meetings and status updates with the platform partner, supporting regular and open reviews that engage the interests and expertise of a wide range of staff and stakeholders, and documenting and sharing the acquisition/build/deployment experience; and (6) Top risk categories include system scaling issues, request for proposal/contract and acceptance testing, 
and

  9. Cross-Linguistic Transfer among Iranian Learners of English as a Foreign Language

    Science.gov (United States)

    Talebi, Seyed Hassan

    2014-01-01

    Cross-linguistic transfer studies began from linguistic aspects of language learning and moved to non-linguistic aspects. The intriguing question is whether students are aware of the nature of these cross-linguistic interactions in their minds. For this purpose, a semi-structured interview was conducted with four Iranian university students. It…

  10. Functional Grammar in the Context of Linguistic Applications in Turkish Language Teaching

    Science.gov (United States)

    Epcacan, Cahit

    2013-01-01

    In the last century, language researches adopted the scientific method and linguistics became an autonomous discipline. Linguistics is a framework concept that analyzes all languages in the world in various contexts according to its own rules and draws conclusions using the systematic approach. Functional linguistics is a linguistic trend that…

  11. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    Genomic selection is widely used in both animal and plant species, however, it is performed with no input from known genomic or biological role of genetic variants and therefore is a black box approach in a genomic era. This study investigated the role of different genomic regions and detected QTLs...... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP...

  12. The Actualization of Literary Learning Model Based on Verbal-Linguistic Intelligence

    Directory of Open Access Journals (Sweden)

    Nur Ihsan Halil

    2017-10-01

    Full Text Available This article is inspired by Howard Gardner's concept of linguistic intelligence and also by some authors' previous writings. All of them became the authors' reference in developing ideas on constructing a literary learning model based on linguistic intelligence. The writing of this article is not done by collecting data empirically, but by developing and constructing an existing concept, namely the concept of linguistic intelligence, which is disseminated into a literature-based learning of verbal-linguistic intelligence. The purpose of this paper is to answer the question of how to apply the literary learning model based on verbal-linguistic intelligence. Then, regarding Gardner's concept, the author formulated a literary learning model based on verbal-linguistic intelligence through a story-telling learning model with five steps, namely arguing, discussing, interpreting, speaking, and writing about literary works. In short, the writer draws the conclusion that learning models based on verbal-linguistic intelligence can be designed with attention to five components, namely (1) definition, (2) characteristics, (3) teaching strategy, (4) final learning outcomes, and (5) figures.

  13. A General Overview of Motivation in Linguistics

    Institute of Scientific and Technical Information of China (English)

    王航

    2014-01-01

    In recent years, the term motivation in linguistic study has aroused the interest of scholars. Different studies of motivation have been produced by different scholars. In this paper, the writer organizes the recent studies on motivation in linguistics. The paper is divided into three parts: the introduction of the term motivation, different types of motivation, and theories of motivation.

  14. A note on statistical methods in comparative linguistics

    NARCIS (Netherlands)

    Cowan, H.K.J.

    1959-01-01

    It is desirable to distinguish between lexicostatistics as a means of proving relationships between languages or linguistic groups not previously known to be related, and glottochronology as a means of measuring the time depths of separations between languages or linguistic groups already known to

  15. Genre and Literacy-Modeling Context in Educational Linguistics.

    Science.gov (United States)

    Martin, James R.

    1992-01-01

    Complements review in previous volume concerning Australian literacy (in first- and second-language) initiatives that drew on systemic functional linguistics, highlights ongoing research within the same theoretical framework, and focuses on the question of modeling context in educational linguistics. The discussion includes modeling context as…

  16. Linguistic Ethnography, Literacy Teaching and Teacher Training

    DEFF Research Database (Denmark)

    Dolmer, Grete; Nielsen, Henrik Balle

    in current attempts to research-base teacher education. Lefstein, A. & J. Snell. 2014. Better than best practice. Developing teaching and learning through dialogue. London: Routledge. Keywords: literacy teaching classroom dialogue teacher feedback linguistic ethnography research-based teacher education...... material consists of field notes and video observations from the literacy classroom combined with reflective interviews with the literacy teacher and analyses of pupils’ oral and written texts. Taking a linguistic ethnographic approach, the case study investigates the interplay between teacher, pupil...... eclecticism, openness and systematicity characteristic of a linguistic ethnographic analysis (Lefstein & Snell 2014, 185-86). In the poster, we will focus on emergent data analysis. Our main points of interest are 1) the classroom dialogue between teacher and pupils and 2) the literacy teacher’s assessment...

  17. Linguistic Culture and Essentialism in South Africa

    Directory of Open Access Journals (Sweden)

    Stephanie Rudwick

    2008-10-01

    Full Text Available This paper explores how language and culture are intertwined and often regarded as “invariable fixed properties” in contemporary South Africa by focusing on one particular indigenous African language group, i.e. isiZulu-speakers. Drawing from general theoretical sociolinguistic approaches to language and culture and considering South Africa’s socio-political history, the paper demonstrates the significance and saliency of Zulu linguistic culture to Zulu people in the post-apartheid state. The paper examines how Zulu linguistic culture is regarded as a resource in the isiZulu-speaking community and as one of the most salient tools of in-group identification in the larger contemporary South African society. Zulu people’s culture is profoundly language-embedded, and Zulu linguistic culture is often based on essentialism.

  18. Workshop meeting

    International Nuclear Information System (INIS)

    Veland, Oeystein

    2004-04-01

    On 1-2 September 2003, the Halden Project arranged a workshop on 'Innovative Human-System Interfaces and their Evaluation'. This topic is new in the HRP 2003-2005 programme, and it is important to get feedback from member organizations on the work that is being performed in Halden. It is also essential that relevant activities and experiences in this area from the member organizations are shared with the Halden staff and other HRP members. Altogether 25 people attended the workshop. The workshop had a mixture of presentations and discussions, and was chaired by Dominique Pirus of EDF, France. Day one focused on the HRP/IFE activities on Human-System Interface design, including Function-oriented displays, Ecological Interface Design, Task-oriented displays, as well as work on innovative display solutions for the oil and gas domain. There were also presentations of relevant work in France, Japan and the Czech Republic. The main focus of day two was the verification and validation of human-system interfaces, with presentations of work at HRP on Human-Centered Validation, Criteria-Based System Validation, and Control Room Verification and Validation. The chairman concluded that it was a successful workshop, although more time could have been allowed for discussions. The Halden Project got valuable feedback and viewpoints on this new topic during the workshop, and will consider all recommendations related to the future work in this area. (Author)

  19. A responsible agenda for applied linguistics: Confessions of a philosopher

    Directory of Open Access Journals (Sweden)

    Albert Weideman

    2011-08-01

    Full Text Available When we undertake academic, disciplinary work, we rely on philosophical starting points. Several straightforward illustrations of this can be found in the history of applied linguistics. It is evident from the history of our field that various historically influential approaches to our discipline base themselves upon different academic confessions. This paper examines the effects of basing our applied linguistic work on the idea that applied linguistics is a discipline concerned with design. Such a characterisation does justice to both modernist and postmodernist emphases in applied linguistics. Conceptualisations of applied linguistics that came with the proposals for communicative language teaching (CLT) some thirty to forty years ago propelled the discipline squarely into postmodern times. To account for this, we need to develop a theory of applied linguistics which shows what constitutive and regulative conditions exist for doing applied linguistic designs. A responsible agenda for applied linguistics today has as its first responsibility to free the users of its designs from toil and drudgery, as well as from becoming victims of fashion, ideology or theory. Secondly, it should design solutions to language problems in such a way that the technical imagination of the designer is not restricted but supported by theory and empirical investigation, and that the productive pedagogical fantasy of the implementers of such plans is set free. Thirdly, it must seek to become accountable by designing theoretically and socially defensible solutions to language problems, solutions that relieve some of the suffering, pain, poverty and injustice in our world.

  20. Creating New Medical Ontologies for Image Annotation A Case Study

    CERN Document Server

    Stanescu, Liana; Brezovan, Marius; Mihai, Cristian Gabriel

    2012-01-01

    Creating New Medical Ontologies for Image Annotation focuses on the problem of the medical images automatic annotation process, which is solved in an original manner by the authors. All the steps of this process are described in detail with algorithms, experiments and results. The original algorithms proposed by authors are compared with other efficient similar algorithms. In addition, the authors treat the problem of creating ontologies in an automatic way, starting from Medical Subject Headings (MESH). They have presented some efficient and relevant annotation models and also the basics of the annotation model used by the proposed system: Cross Media Relevance Models. Based on a text query the system will retrieve the images that contain objects described by the keywords.

  1. Elucidating high-dimensional cancer hallmark annotation via enriched ontology.

    Science.gov (United States)

    Yan, Shankai; Wong, Ka-Chun

    2017-09-01

    Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  3. Image schemas and mimetic schemas in cognitive linguistics and gesture studies

    NARCIS (Netherlands)

    Cienki, A.J.

    2013-01-01

    Image schemas have been a fundamental construct in cognitive linguistics, providing grounds for psychological, philosophical, as well as linguistic research. Given the focus in cognitive linguistics on embodied experience as a fundamental basis for language structure and meaning, the employment of

  4. Linguistic Barriers and Bridges

    DEFF Research Database (Denmark)

    Thuesen, Frederik

    2016-01-01

    The influence of language on social capital in low-skill and ethnically diverse workplaces has thus far received very limited attention within the sociology of work. As the ethnically diverse workplace is an important social space for the construction of social relations bridging different social...... groups, the sociology of work needs to develop a better understanding of the way in which linguistic diversity influences the formation of social capital, i.e. resources such as the trust and reciprocity inherent in social relations in such workplaces. Drawing on theories about intergroup contact...... and intercultural communication, this article analyses interviews with 31 employees from two highly ethnically diverse Danish workplaces. The article shows how linguistic barriers such as different levels of majority language competence and their consequent misunderstandings breed mistrust and hostility, whilst...

  5. Intra-species sequence comparisons for annotating genomes

    Energy Technology Data Exchange (ETDEWEB)

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.

  6. Can delusions be understood linguistically?

    Science.gov (United States)

    Hinzen, Wolfram; Rosselló, Joana; McKenna, Peter

    2016-01-01

    ABSTRACT Delusions are widely believed to reflect disturbed cognitive function, but the nature of this remains elusive. The “un-Cartesian” cognitive-linguistic hypothesis maintains (a) that there is no thought separate from language, that is, there is no distinct mental space removed from language where “thinking” takes place; and (b) that a somewhat broadened concept of grammar is responsible for bestowing meaning on propositions, and this among other things gives them their quality of being true or false. It is argued that a loss of propositional meaning explains why delusions are false, impossible and sometimes fantastic. A closely related abnormality, failure of linguistic embedding, can additionally account for why delusions are held with fixed conviction and are not adequately justified by the patient. The un-Cartesian linguistic approach to delusions has points of contact with Frith’s theory that inability to form meta-representations underlies a range of schizophrenic symptoms. It may also be relevant to the nature of the “second factor” in monothematic delusions in neurological disease. Finally, it can inform the current debate about whether or not delusions really are beliefs. PMID:27322493

  7. Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome

    Directory of Open Access Journals (Sweden)

    McCarthy Fiona M

    2007-11-01

    Full Text Available Abstract Background The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. Moreover, information from one genome can be used to improve the annotation of other genomes and

  8. Automatically annotating topics in transcripts of patient-provider interactions via machine learning.

    Science.gov (United States)

    Wallace, Byron C; Laws, M Barton; Small, Kevin; Wilson, Ira B; Trikalinos, Thomas A

    2014-05-01

    Annotated patient-provider encounters can provide important insights into clinical communication, ultimately suggesting how it might be improved to effect better health outcomes. But annotating outpatient transcripts with Roter or General Medical Interaction Analysis System (GMIAS) codes is expensive, limiting the scope of such analyses. We propose automatically annotating transcripts of patient-provider interactions with topic codes via machine learning. We use a conditional random field (CRF) to model utterance topic probabilities. The model accounts for the sequential structure of conversations and the words comprising utterances. We assess predictive performance via 10-fold cross-validation over GMIAS-annotated transcripts of 360 outpatient visits (>230,000 utterances). We then use automated in place of manual annotations to reproduce an analysis of 116 additional visits from a randomized trial that used GMIAS to assess the efficacy of an intervention aimed at improving communication around antiretroviral (ARV) adherence. With respect to 6 topic codes, the CRF achieved a mean pairwise kappa compared with human annotators of 0.49 (range: 0.47-0.53) and a mean overall accuracy of 0.64 (range: 0.62-0.66). With respect to the RCT reanalysis, results using automated annotations agreed with those obtained using manual ones. According to the manual annotations, the median number of ARV-related utterances without and with the intervention was 49.5 versus 76, respectively (paired sign test P = 0.07). When automated annotations were used, the respective numbers were 39 versus 55 (P = 0.04). While moderately accurate, the predicted annotations are far from perfect. Conversational topics are intermediate outcomes, and their utility is still being researched. This foray into automated topic inference suggests that machine learning methods can classify utterances comprising patient-provider interactions into clinically relevant topics with reasonable accuracy.
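
    The agreement figures quoted in this abstract (kappa and overall accuracy) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy topic labels are hypothetical, and the sketch assumes the standard two-rater Cohen's kappa, which may differ in detail from the pairwise procedure used in the study.

```python
from collections import Counter


def accuracy(labels_a, labels_b):
    """Fraction of utterances on which the two label sequences agree."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    observed = accuracy(labels_a, labels_b)
    n = len(labels_a)
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: probability that both raters pick the same label
    # if each labels independently according to its own label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical topic codes for four utterances (not GMIAS data).
human = ["biomedical", "biomedical", "logistics", "logistics"]
model = ["biomedical", "logistics", "logistics", "logistics"]
print(accuracy(human, model), cohen_kappa(human, model))
```

    In this toy case the raters agree on three of four utterances, and kappa is lower than raw accuracy because part of that agreement is expected by chance.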

  9. Linguistic Features of Humor in Academic Writing

    Directory of Open Access Journals (Sweden)

    Stephen Skalicky

    2016-06-01

    Full Text Available A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using Factor Analysis to provide an overall Humor component score for each essay in the corpus. In addition, the essays were also scored for overall writing quality by human raters, which correlated (r = .195) with the humor component score. Correlations between the humor component scores and linguistic features were examined. To investigate the potential for linguistic features to predict the Humor component scores, regression analysis identified four linguistic indices that accounted for approximately 17.5% of the variance in humor scores. These indices were related to text descriptiveness (i.e., more adjective and adverb use), lower cohesion (i.e., less paragraph-to-paragraph similarity), and lexical sophistication (lower word frequency). The findings suggest that humor can be partially predicted by linguistic features in the text. Furthermore, there was a small but significant correlation between the humor and essay quality scores, suggesting a positive relation between humor and writing quality. Keywords: humor, academic writing, text analysis, essay score, human rating

  10. Probabilistic Linguistic Power Aggregation Operators for Multi-Criteria Group Decision Making

    Directory of Open Access Journals (Sweden)

    Agbodah Kobina

    2017-12-01

    Full Text Available As an effective aggregation tool, the power average (PA) allows the input arguments being aggregated to support and reinforce each other, which provides more versatility in the information aggregation process. Under the probabilistic linguistic term environment, we investigate new power aggregation (PA) operators for fusing probabilistic linguistic term sets (PLTSs). In this paper, we first develop the probabilistic linguistic power average (PLPA) and the weighted probabilistic linguistic power average (WPLPA) operators, and the probabilistic linguistic power geometric (PLPG) and the weighted probabilistic linguistic power geometric (WPLPG) operators. We also analyze the properties of these new aggregation operators. With the aid of the WPLPA and WPLPG operators, we further design approaches for multi-criteria group decision-making (MCGDM) with PLTSs. Finally, we use an illustrative example to expound our proposed methods and verify their performance.
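
How arguments "support and reinforce each other" in a power average can be shown with a minimal sketch of the classical operator on plain numbers. It is not the PLPA or WPLPA operator of the paper, which act on PLTSs. The support function `sup(a, b) = 1 - |a - b|` for values in [0, 1] is an assumption chosen for illustration.

```python
def power_average(values, sup=lambda a, b: 1.0 - abs(a - b)):
    """Classical power average: sum((1 + T_i) * a_i) / sum(1 + T_i).

    T_i is the total support that a_i receives from every other argument.
    The default support function is an illustrative assumption for values in [0, 1].
    """
    n = len(values)
    t = [sum(sup(values[i], values[j]) for j in range(n) if j != i)
         for i in range(n)]
    weights = [1.0 + ti for ti in t]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

scores = [0.2, 0.3, 0.9]          # 0.9 is far from the other two
print(power_average(scores), sum(scores) / len(scores))
```

Because 0.9 receives less support than the two clustered values, it gets a smaller weight, and the power average lands below the arithmetic mean.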

  11. Annotating risk factors for heart disease in clinical narratives for diabetic patients.

    Science.gov (United States)

    Stubbs, Amber; Uzuner, Özlem

    2015-12-01

    The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 296 patients for risk factors and the times they were present. We designed the annotation task for this track with the goal of balancing annotation load and time with quality, so as to generate a gold standard corpus that can benefit a clinically-relevant task. We applied light annotation procedures and determined the gold standard using majority voting. On average, the agreement of annotators with the gold standard was above 0.95, indicating high reliability. The resulting document-level annotations generated for each record in each longitudinal EMR in this corpus provide information that can support studies of progression of heart disease risk factors in the included patients over time. These annotations were used in the Risk Factor track of the 2014 i2b2/UTHealth shared task. Participating systems achieved a mean micro-averaged F1 measure of 0.815 and a maximum F1 measure of 0.928 for identifying these risk factors in patient records. Copyright © 2015 Elsevier Inc. All rights reserved.
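
Deriving a gold standard by majority voting and then measuring each annotator's agreement with it can be sketched as follows. The three annotators, the "present"/"absent" labels and the tie handling are hypothetical; the abstract does not describe how ties were resolved.

```python
from collections import Counter

def majority_vote(labels):
    """Most common label, or None when the top two labels are tied."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

def agreement_with_gold(annotator, gold):
    """Share of items where the annotator matches the gold label (ties skipped)."""
    pairs = [(a, g) for a, g in zip(annotator, gold) if g is not None]
    return sum(a == g for a, g in pairs) / len(pairs)

# Hypothetical document-level labels from three annotators for four records.
annotators = [
    ["present", "absent", "present", "present"],
    ["present", "absent", "absent", "present"],
    ["present", "present", "absent", "present"],
]
gold = [majority_vote(column) for column in zip(*annotators)]
print(gold, [agreement_with_gold(a, gold) for a in annotators])
```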

  12. A Python Library for Historical Comparative Linguistics

    OpenAIRE

    Moran, Steven; List, Johann-Mattis

    2012-01-01

    Awarded best paper award; International audience; In this talk we will discuss a European Research Council funded collaborative effort to build a Python library for undertaking academic research in historical-comparative linguistics. Our aim of implementing quantitative methods, specifically in Python, is to transform historical-comparative linguistics from a primarily handcrafted scientific scholarly endeavor, performed by individual researchers, into a quantitative and collaborative field o...

  13. Linguistic relativity.

    Science.gov (United States)

    Wolff, Phillip; Holmes, Kevin J

    2011-05-01

    The central question in research on linguistic relativity, or the Whorfian hypothesis, is whether people who speak different languages think differently. The recent resurgence of research on this question can be attributed, in part, to new insights about the ways in which language might impact thought. We identify seven categories of hypotheses about the possible effects of language on thought across a wide range of domains, including motion, color, spatial relations, number, and false belief understanding. While we do not find support for the idea that language determines the basic categories of thought or that it overwrites preexisting conceptual distinctions, we do find support for the proposal that language can make some distinctions difficult to avoid, as well as for the proposal that language can augment certain types of thinking. Further, we highlight recent evidence suggesting that language may induce a relatively schematic mode of thinking. Although the literature on linguistic relativity remains contentious, there is growing support for the view that language has a profound effect on thought. WIREs Cogni Sci 2011 2 253-265 DOI: 10.1002/wcs.104 Copyright © 2010 John Wiley & Sons, Ltd.

  14. Annotating Emotions in Meetings

    NARCIS (Netherlands)

    Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.

    We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete

  15. Microtask crowdsourcing for disease mention annotation in PubMed abstracts.

    Science.gov (United States)

    Good, Benjamin M; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2015-01-01

    Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses. Many biological natural language processing (BioNLP) projects attempt to address this challenge, but the state of the art still leaves much room for improvement. Progress in BioNLP research depends on large, annotated corpora for evaluating information extraction systems and training machine learning models. Traditionally, such corpora are created by small numbers of expert annotators often working over extended periods of time. Recent studies have shown that workers on microtask crowdsourcing platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of the AMT in capturing disease mentions in PubMed abstracts. We used the NCBI Disease corpus as a gold standard for refining and benchmarking our crowdsourcing protocol. After several iterations, we arrived at a protocol that reproduced the annotations of the 593 documents in the 'training set' of this gold standard with an overall F measure of 0.872 (precision 0.862, recall 0.883). The output can also be tuned to optimize for precision (max = 0.984 when recall = 0.269) or recall (max = 0.980 when precision = 0.436). Each document was completed by 15 workers, and their annotations were merged based on a simple voting method. In total 145 workers combined to complete all 593 documents in the span of 9 days at a cost of $.066 per abstract per worker. The quality of the annotations, as judged with the F measure, increases with the number of workers assigned to each task; however minimal performance gains were observed beyond 8 workers per task. These results add further evidence that microtask crowdsourcing can be a valuable tool for generating well-annotated corpora in BioNLP. Data produced for this analysis are available at http://figshare.com/articles/Disease_Mention_Annotation_with_Mechanical_Turk/1126402.
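
The reported F measure can be checked from the stated precision and recall, and the "simple voting method" can be sketched as a vote threshold. The threshold-based merge below is an assumption about what such a method might look like, not the authors' protocol; the character spans are hypothetical.

```python
from collections import Counter

def f_measure(precision, recall):
    """Balanced F measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def merge_by_votes(worker_annotations, min_votes):
    """Keep a span if at least `min_votes` workers marked it (assumed merge rule)."""
    votes = Counter(span for spans in worker_annotations for span in set(spans))
    return {span for span, n in votes.items() if n >= min_votes}

# Figures from the abstract: precision 0.862, recall 0.883.
print(round(f_measure(0.862, 0.883), 3))

# Hypothetical (start, end) spans marked by three workers on one abstract.
workers = [[(0, 5)], [(0, 5), (10, 14)], [(0, 5)]]
print(merge_by_votes(workers, min_votes=2))
```

Raising `min_votes` trades recall for precision, which matches the abstract's remark that the output can be tuned toward either one.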

  16. The Linguistic and Embodied Nature of Conceptual Processing

    Science.gov (United States)

    Louwerse, Max M.; Jeuniaux, Patrick

    2010-01-01

    Recent theories of cognition have argued that embodied experience is important for conceptual processing. Embodiment can be contrasted with linguistic factors such as the typical order in which words appear in language. Here, we report four experiments that investigated the conditions under which embodiment and linguistic factors determine…

  17. Breaking Classroom Silences: A View from Linguistic Ethnography

    Science.gov (United States)

    Rampton, Ben; Charalambous, Constadina

    2016-01-01

    This paper addresses potentially problematic classroom episodes in which someone foregrounds a social division that is normally taken for granted. It illustrates the way in which linguistic ethnography can unpack the layered processes that collide in the breaking of silence, showing how linguistic form and practice, individual positioning, local…

  18. Gradual linguistic summaries

    NARCIS (Netherlands)

    Wilbik, A.M.; Kaymak, U.; Laurent, A.; Strauss, O.; Bouchon-Meunier, xx

    2014-01-01

    In this paper we propose a new type of protoform-based linguistic summary: the gradual summary. This new type of summary aims to capture change over some time span. Such summaries can be useful in many domains, for instance in economics, e.g., "prices of X are getting smaller" in eldercare,

  19. Perspectives in Linguistics.

    Science.gov (United States)

    Waterman, John T.

    Intended for the student of linguistics or the structural grammarian, who must develop an awareness of their intellectual heritage, the present work surveys the study of language in ancient times, the medieval and early modern periods, the nineteenth century, and the twentieth century to 1950. (This second edition includes additional material on…

  20. Consumer energy research: an annotated bibliography. Vol. 3

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, D.C.; McDougall, G.H.G.

    1983-04-01

    This annotated bibliography attempts to provide a comprehensive package of existing information in consumer related energy research. A concentrated effort was made to collect unpublished material as well as material from journals and other sources, including governments, utilities, research institutes and private firms. A deliberate effort was made to include agencies outside North America. For the most part the bibliography is limited to annotations of empirical studies. However, it includes a number of descriptive reports which appear to make a significant contribution to understanding consumers and energy use. The format of the annotations displays the author, date of publication, title and source of the study. Annotations of empirical studies are divided into four parts: objectives, methods, variables and findings/implications. Care was taken to provide a reasonable amount of detail in the annotations to enable the reader to understand the methodology, the results and the degree to which the implications of the study can be generalized to other situations. Studies are arranged alphabetically by author. The content of the studies reviewed is classified in a series of tables which are intended to provide a summary of sources, types and foci of the various studies. These tables are intended to aid researchers interested in specific topics to locate those studies most relevant to their work. The studies are categorized using a number of different classification criteria, for example, methodology used, type of energy form, type of policy initiative, and type of consumer activity. A general overview of the studies is also presented. 17 tabs.

  1. Cross-linguistic perspectives on speech assessment in cleft palate

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth; Henningsson, Gunilla

    2012-01-01

    This chapter deals with cross linguistic perspectives that need to be taken into account when comparing speech assessment and speech outcome obtained from cleft palate speakers of different languages. Firstly, an overview of consonants and vowels vulnerable to the cleft condition is presented. Then, consequences for assessment of cleft palate speech by native versus non-native speakers of a language are discussed, as well as the use of phonemic versus phonetic transcription in cross linguistic studies. Specific recommendations for the construction of speech samples in cross linguistic studies are given. Finally, the influence of different languages on some aspects of language acquisition in young children with cleft palate is presented and discussed. Until recently, not much has been written about cross linguistic perspectives when dealing with cleft palate speech. Most literature about assessment...

  2. A kindergarten experiment in linguistic e-learning

    DEFF Research Database (Denmark)

    Valente, Andrea; Marchetti, Emanuela

    2006-01-01

    As part of the BlaSq project, we are developing a set of linguistic games to be used in kindergartens. The first of these games is Crazipes, that we are currently testing in a Danish kindergarten, with the support of the local teachers. Here we discuss the architecture of the game, its potentials as a linguistic e-learning tool, together with the design and methodology adopted for the study. Some early results are also discussed.

  3. A Brief History of the 19th-century Historical and Comparative Linguistics

    Institute of Scientific and Technical Information of China (English)

    郭丽娟

    2016-01-01

    In a broad sense, linguistics boasts a history as long as the history of writing. Knowledge of linguistics involves its history, and a history of linguistics is related to the origin of human language. Language is one of the most wonderful phenomena in human social life. This paper introduces a brief history of historical and comparative linguistics in the 19th century.

  4. An Analysis of Social Class Classification Based on Linguistic Variables

    Institute of Scientific and Technical Information of China (English)

    QU Xia-sha

    2016-01-01

    Since language is an influential tool in social interaction, the relationship between speech and social factors such as social class, gender, and even age is worth studying. People employ different linguistic variables to imply their social class, status and identity in social interaction. Linguistic variation thus involves vocabulary, sounds, grammatical constructions, dialects and so on. As a result, the classification of social class draws people's attention. Linguistic variables in speech interactions indicate the social relationship between people. This paper attempts to illustrate three main linguistic variables that are related to social class and that deserve more attention in further sociolinguistic studies.

  5. Building "Applied Linguistic Historiography": Rationale, Scope, and Methods

    Science.gov (United States)

    Smith, Richard

    2016-01-01

    In this article I argue for the establishment of "Applied Linguistic Historiography" (ALH), that is, a new domain of enquiry within applied linguistics involving a rigorous, scholarly, and self-reflexive approach to historical research. Considering issues of rationale, scope, and methods in turn, I provide reasons why ALH is needed and…

  6. Image annotation based on positive-negative instances learning

    Science.gov (United States)

    Zhang, Kai; Hu, Jiwei; Liu, Quan; Lou, Ping

    2017-07-01

    Automatic image annotation remains a difficult task in computer vision; its main purpose is to help manage the massive number of images on the Internet and to assist intelligent retrieval. This paper designs a new image annotation model based on a visual bag of words, using low-level features such as color and texture information as well as mid-level features such as SIFT, and combining the pic2pic, label2pic and label2label correlations to measure the degree of correlation between labels and images. We aim to prune the specific features for each single label and formalize the annotation task as a learning process based on Positive-Negative Instances Learning. Experiments performed on the Corel5K dataset give quite promising results compared with other existing methods.

  7. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.

  8. A Linguistic Analysis of Suicide-Related Twitter Posts.

    Science.gov (United States)

    O'Dea, Bridianne; Larsen, Mark E; Batterham, Philip J; Calear, Alison L; Christensen, Helen

    2017-09-01

    Suicide is a leading cause of death worldwide. Identifying those at risk and delivering timely interventions is challenging. Social media site Twitter is used to express suicidality. Automated linguistic analysis of suicide-related posts may help to differentiate those who require support or intervention from those who do not. This study aims to characterize the linguistic profiles of suicide-related Twitter posts. Using a dataset of suicide-related Twitter posts previously coded for suicide risk by experts, Linguistic Inquiry and Word Count (LIWC) and regression analyses were conducted to determine differences in linguistic profiles. When compared with matched non-suicide-related Twitter posts, strongly concerning suicide-related posts were characterized by a higher word count, increased use of first-person pronouns, and more references to death. When compared with safe-to-ignore suicide-related posts, strongly concerning suicide-related posts were characterized by increased use of first-person pronouns, greater anger, and increased focus on the present. Other differences were found. The predictive validity of the identified features needs further testing before these results can be used for interventional purposes. This study demonstrates that strongly concerning suicide-related Twitter posts have unique linguistic profiles. The examination of Twitter data for the presence of such features may help to validate online risk assessments and determine those in need of further support or intervention.

  9. Backward Dependencies and in-Situ wh-Questions as Test Cases on How to Approach Experimental Linguistics Research That Pursues Theoretical Linguistics Questions

    Science.gov (United States)

    Pablos, Leticia; Doetjes, Jenny; Cheng, Lisa L.-S.

    2018-01-01

    The empirical study of language is a young field in contemporary linguistics. This being the case, and following a natural development process, the field is currently at a stage where different research methods and experimental approaches are being put into question in terms of their validity. Without pretending to provide an answer with respect to the best way to conduct linguistics related experimental research, in this article we aim at examining the process that researchers follow in the design and implementation of experimental linguistics research with a goal to validate specific theoretical linguistic analyses. First, we discuss the general challenges that experimental work faces in finding a compromise between addressing theoretically relevant questions and being able to implement these questions in a specific controlled experimental paradigm. We discuss the Granularity Mismatch Problem (Poeppel and Embick, 2005) which addresses the challenges that research that is trying to bridge the representations and computations of language and their psycholinguistic/neurolinguistic evidence faces, and the basic assumptions that interdisciplinary research needs to consider due to the different conceptual granularity of the objects under study. To illustrate the practical implications of the points addressed, we compare two approaches to perform linguistic experimental research by reviewing a number of our own studies strongly grounded on theoretically informed questions. First, we show how linguistic phenomena similar at a conceptual level can be tested within the same language using measurement of event-related potentials (ERP) by discussing results from two ERP experiments on the processing of long-distance backward dependencies that involve coreference and negative polarity items respectively in Dutch. Second, we examine how the same linguistic phenomenon can be tested in different languages using reading time measures by discussing the outcome of four self

  10. Backward Dependencies and in-Situ wh-Questions as Test Cases on How to Approach Experimental Linguistics Research That Pursues Theoretical Linguistics Questions.

    Science.gov (United States)

    Pablos, Leticia; Doetjes, Jenny; Cheng, Lisa L-S

    2017-01-01

    The empirical study of language is a young field in contemporary linguistics. This being the case, and following a natural development process, the field is currently at a stage where different research methods and experimental approaches are being put into question in terms of their validity. Without pretending to provide an answer with respect to the best way to conduct linguistics related experimental research, in this article we aim at examining the process that researchers follow in the design and implementation of experimental linguistics research with a goal to validate specific theoretical linguistic analyses. First, we discuss the general challenges that experimental work faces in finding a compromise between addressing theoretically relevant questions and being able to implement these questions in a specific controlled experimental paradigm. We discuss the Granularity Mismatch Problem (Poeppel and Embick, 2005) which addresses the challenges that research that is trying to bridge the representations and computations of language and their psycholinguistic/neurolinguistic evidence faces, and the basic assumptions that interdisciplinary research needs to consider due to the different conceptual granularity of the objects under study. To illustrate the practical implications of the points addressed, we compare two approaches to perform linguistic experimental research by reviewing a number of our own studies strongly grounded on theoretically informed questions. First, we show how linguistic phenomena similar at a conceptual level can be tested within the same language using measurement of event-related potentials (ERP) by discussing results from two ERP experiments on the processing of long-distance backward dependencies that involve coreference and negative polarity items respectively in Dutch. Second, we examine how the same linguistic phenomenon can be tested in different languages using reading time measures by discussing the outcome of four self

  11. Backward Dependencies and in-Situ wh-Questions as Test Cases on How to Approach Experimental Linguistics Research That Pursues Theoretical Linguistics Questions

    Directory of Open Access Journals (Sweden)

    Leticia Pablos

    2018-01-01

    Full Text Available The empirical study of language is a young field in contemporary linguistics. This being the case, and following a natural development process, the field is currently at a stage where different research methods and experimental approaches are being put into question in terms of their validity. Without pretending to provide an answer with respect to the best way to conduct linguistics related experimental research, in this article we aim at examining the process that researchers follow in the design and implementation of experimental linguistics research with a goal to validate specific theoretical linguistic analyses. First, we discuss the general challenges that experimental work faces in finding a compromise between addressing theoretically relevant questions and being able to implement these questions in a specific controlled experimental paradigm. We discuss the Granularity Mismatch Problem (Poeppel and Embick, 2005) which addresses the challenges that research that is trying to bridge the representations and computations of language and their psycholinguistic/neurolinguistic evidence faces, and the basic assumptions that interdisciplinary research needs to consider due to the different conceptual granularity of the objects under study. To illustrate the practical implications of the points addressed, we compare two approaches to perform linguistic experimental research by reviewing a number of our own studies strongly grounded on theoretically informed questions. First, we show how linguistic phenomena similar at a conceptual level can be tested within the same language using measurement of event-related potentials (ERP) by discussing results from two ERP experiments on the processing of long-distance backward dependencies that involve coreference and negative polarity items respectively in Dutch. Second, we examine how the same linguistic phenomenon can be tested in different languages using reading time measures by discussing the outcome of

  12. Developing a tagset and tagger for the African languages of South ...

    African Journals Online (AJOL)

    annotations in the form of linguistic tags and annotations. That is, the annotations are used to direct the searches to specific grammatical and lexical phenomena in a corpus. In this article, we propose a corpus-based approach and a tagset to be used on a corpus of spoken language for the African languages of South Africa.

  13. Graph-based sequence annotation using a data integration approach

    Directory of Open Access Journals (Sweden)

    Pesch Robert

    2008-06-01

    Full Text Available The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database AraCyc), which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation.

  14. EST-PAC a web package for EST annotation and protein sequence prediction

    Directory of Open Access Journals (Sweden)

    Strahm Yvan

    2006-10-01

    Full Text Available With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequences tags (EST) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC, a web oriented multi-platform software package for expressed sequences tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1) searching local or remote biological databases for sequence similarities using Blast services, 2) predicting protein coding sequence from EST data and 3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results, and management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics.

  15. 2014 Penn State Bioinorganic Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Golbeck, John [Pennsylvania State Univ., State College, PA (United States)

    2015-10-01

    The 3rd Penn State Bioinorganic Workshop took place in early June 2014 and was combined with the 3rd Penn State Frontiers in Metallobiochemistry Symposium. The workshop was even larger than the 2nd Penn State Bioinorganic Workshop we offered in 2012. It had even more participants (162 rather than 123 in 2012). Like the 2012 workshop, the 2014 workshop had three parts. The first part consisted of 16 90-minute lectures presented by faculty experts on the topic of their expertise (see below). Based on the suggestions from the 2012 workshop, we have recorded all 16 lectures professionally and make them available to the entire bioinorganic community via online streaming. In addition, hard copies of the recordings are available as backup.

  16. Knowing linguistic conventions | Robinson | South African Journal of ...

    African Journals Online (AJOL)

    These are three standard accounts of the epistemic status of linguistic conventions, which all play into the first camp: (1) knowledge by intuition, (2) inferential a priori knowledge and (3) a posteriori knowledge. I give reasons why these accounts should be rejected. I then argue that linguistic conventions, if conceived of as ...

  17. Resources on Quantitative/Statistical Research for Applied Linguists

    Science.gov (United States)

    Brown, James Dean

    2004-01-01

    The purpose of this review article is to survey and evaluate existing books on quantitative/statistical research in applied linguistics. The article begins by explaining the types of texts that will not be reviewed, then it briefly describes nine books that address how to do quantitative/statistical applied linguistics research. The review then…

  18. Abnormal motor cortex excitability during linguistic tasks in adductor-type spasmodic dysphonia.

    Science.gov (United States)

    Suppa, A; Marsili, L; Giovannelli, F; Di Stasio, F; Rocchi, L; Upadhyay, N; Ruoppolo, G; Cincotta, M; Berardelli, A

    2015-08-01

    In healthy subjects (HS), transcranial magnetic stimulation (TMS) applied during 'linguistic' tasks discloses excitability changes in the dominant hemisphere primary motor cortex (M1). We investigated 'linguistic' task-related cortical excitability modulation in patients with adductor-type spasmodic dysphonia (ASD), a speech-related focal dystonia. We studied 10 ASD patients and 10 HS. Speech examination included voice cepstral analysis. We investigated the dominant/non-dominant M1 excitability at baseline, during 'linguistic' (reading aloud/silent reading/producing simple phonation) and 'non-linguistic' tasks (looking at non-letter strings/producing oral movements). Motor evoked potentials (MEPs) were recorded from the contralateral hand muscles. We measured the cortical silent period (CSP) length and tested MEPs in HS and patients performing the 'linguistic' tasks with different voice intensities. We also examined MEPs in HS and ASD during hand-related 'action-verb' observation. Patients were studied under and not-under botulinum neurotoxin-type A (BoNT-A). In HS, TMS over the dominant M1 elicited larger MEPs during 'reading aloud' than during the other 'linguistic'/'non-linguistic' tasks. Conversely, in ASD, TMS over the dominant M1 elicited increased-amplitude MEPs during 'reading aloud' and 'syllabic phonation' tasks. CSP length was shorter in ASD than in HS and remained unchanged in both groups performing 'linguistic'/'non-linguistic' tasks. In HS and ASD, 'linguistic' task-related excitability changes were present regardless of the different voice intensities. During hand-related 'action-verb' observation, MEPs decreased in HS, whereas in ASD they increased. In ASD, BoNT-A improved speech, as demonstrated by cepstral analysis and restored the TMS abnormalities. ASD reflects dominant hemisphere excitability changes related to 'linguistic' tasks; BoNT-A returns these excitability changes to normal. 

  19. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

    Science.gov (United States)

    Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

    2010-07-02

    An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. It is therefore of significant value not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate the reads. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
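
    The tag-counting step described above (counting how many sequenced reads fall within the genomic ranges of functional annotations) can be illustrated with a minimal Python sketch. This is not TASE's implementation, which the abstract says is written in Java against a SQL Server backend; the annotation ranges, gene labels, read positions, and the function name `count_tags` are all hypothetical, and the ranges are assumed to be non-overlapping.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical annotations: (start, end, label), inclusive and non-overlapping.
ANNOTATIONS = [
    (100, 499, "geneA"),
    (800, 1199, "geneB"),
    (2000, 2599, "geneC"),
]

def count_tags(read_positions, annotations):
    """Count reads whose start position falls inside each annotated range."""
    ordered = sorted(annotations)
    starts = [a[0] for a in ordered]
    counts = Counter()
    for pos in read_positions:
        # Index of the last range that starts at or before this read.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= ordered[i][1]:
            counts[ordered[i][2]] += 1
    return counts

# Invented read start positions; some fall outside every annotated range.
reads = [120, 130, 499, 500, 805, 2600, 2599, 50]
tag_counts = count_tags(reads, ANNOTATIONS)
for gene, n in sorted(tag_counts.items()):
    print(gene, n)
```

    Sorting the ranges once and using binary search keeps each lookup logarithmic in the number of annotations, which matters when the number of reads is large.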

  20. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  1. INDICO Workshop

    CERN Multimedia

    CERN. Geneva; Fabbrichesi, Marco

    2004-01-01

    The INtegrated DIgital COnferencing EU project has finished building a complete software solution to facilitate the MANAGEMENT OF CONFERENCES, workshops, schools or simple meetings from their announcement to their archival. Everybody involved in the organization of events is welcome to join this workshop, in order to understand the scope of the project and to see demonstrations of the various features.

  2. LINGUISTIC FEATURES ANALYSIS OF THE ENGLISH ELECTRONIC COMMERCE WEBSITES

    Directory of Open Access Journals (Sweden)

    Siti Nurani

    2014-06-01

    This research aims to identify the linguistic features used in English electronic commerce websites in relation to the field, tenor and mode of discourse, as parts of the Systemic Functional Linguistics (SFL) approach. Findings show that, in the field of discourse, the linguistic features appear largely in the experiential domain analysis, which shows that all register terms function as technical terms, of which nouns and verbs were the two most frequent categories. The goal orientation is considered to be long term and the social activity is exchange. In the tenor of discourse, the linguistic features appear chiefly in the social distance analysis, which shows that the social distance between participants is considered minimal. The agentive role is said to be equal and the social role is considered non-hierarchic. In the mode of discourse, the linguistic features occur most often in the language role analysis, in which the language role is both constitutive and ancillary in equal measure. The channel is graphic. The medium is written, with visual contact as its device.

  3. The Linguistic Landscape as a Learning Space for Contextual Language Learning

    Science.gov (United States)

    Aladjem, Ruthi; Jou, Bibiana

    2016-01-01

    One of the challenges of teaching and learning a foreign language is that students are not being sufficiently exposed to the target language. However, it is quite common to find linguistic and cultural exponents of different foreign languages in authentic contexts (termed the "Linguistic landscape"). Using the Linguistic landscape as a…

  4. Mapping the Linguistic Landscape of Athens: The Case of Shop Signs

    Science.gov (United States)

    Nikolaou, Alexander

    2017-01-01

    This paper focuses on the linguistic composition of commercial signs in the linguistic landscape (LL) of Athens, Greece. Previous studies have mainly been carried out in officially multilingual and multi-ethnic areas [Ben-Rafael, E., Shohamy, E., Amara, M. H., & Trumper-Hecht, N. (2006). "Linguistic landscape as symbolic construction of…

  5. Early Detection of Cognitive-Linguistic Change Associated with Mild Cognitive Impairment

    Science.gov (United States)

    Fleming, Valarie B.

    2014-01-01

    Individuals with mild cognitive impairment (MCI) may present with subtle declines in linguistic ability that go undetected by tasks not challenging enough to tax a relatively intact cognitive-linguistic system. This study was designed to replicate and extend a previous study of cognitive-linguistic ability in MCI using a complex discourse…

  6. A primer in macromolecular linguistics.

    Science.gov (United States)

    Searls, David B

    2013-03-01

    Polymeric macromolecules, when viewed abstractly as strings of symbols, can be treated in terms of formal language theory, providing a mathematical foundation for characterizing such strings both as collections and in terms of their individual structures. In addition this approach offers a framework for analysis of macromolecules by tools and conventions widely used in computational linguistics. This article introduces the ways that linguistics can be and has been applied to molecular biology, covering the relevant formal language theory at a relatively nontechnical level. Analogies between macromolecules and human natural language are used to provide intuitive insights into the relevance of grammars, parsing, and analysis of language complexity to biology. Copyright © 2012 Wiley Periodicals, Inc.
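
    As a small illustration of treating a macromolecule as a string generated by a grammar, consider an RNA stem-loop. Its nested base pairs can be described by context-free rules of the form S -> x S y (for complementary bases x, y) and S -> loop. The Python sketch below is an assumption-laden toy, not taken from the article: it allows only Watson-Crick pairs, and the function names and the minimum stem and loop lengths are hypothetical choices.

```python
# Watson-Crick pairs only; wobble pairs such as g-u are deliberately left out.
PAIRS = {("a", "u"), ("u", "a"), ("g", "c"), ("c", "g")}

def stem_length(rna, min_loop=3):
    """Count nested base pairs peeled from both ends of the string.

    Mirrors the rule S -> x S y: each step consumes one complementary
    pair, stopping when fewer than min_loop unpaired bases would remain.
    """
    i, j = 0, len(rna) - 1
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in PAIRS:
        i += 1
        j -= 1
    return i

def is_hairpin(rna, min_stem=2, min_loop=3):
    """True if the string is a stem of at least min_stem pairs around a loop."""
    return stem_length(rna.lower(), min_loop) >= min_stem

print(is_hairpin("GCGAAAACGC"))  # stem g-c, c-g, g-c around the loop aaaa
print(is_hairpin("GCGAAAAGCG"))  # ends do not pair, so no stem
```

    The nesting of paired positions is what places such structures beyond regular expressions and makes a context-free description the natural fit.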

  7. The QED Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Pieper, G.W.

    1994-07-01

    On May 18--20, 1994, Argonne National Laboratory hosted the QED Workshop. The workshop was supported by special funding from the Office of Naval Research. The purpose of the workshop was to assemble a group of researchers to consider whether it is desirable and feasible to build a proof-checked encyclopedia of mathematics, with an associated facility for theorem proving and proof checking. Among the projects represented were Coq, Eves, HOL, ILF, Imps, MathPert, Mizar, NQTHM, NuPrl, OTTER, Proof Pad, Qu-Prolog, and RRL. Although the content of the QED project is highly technical--rigorously proof-checked mathematics of all sorts--the discussions at the workshop were rarely technical. No prepared talks or papers were given. Instead, the discussions focused primarily on political, sociological, practical, and aesthetic questions, such as Why do it? Who are the customers? How can one get mathematicians interested? What sort of interfaces are desirable? The most important conclusion of the workshop was that QED is an idea worth pursuing, a statement with which virtually all the participants agreed. In this document, the authors capture some of the discussions and outline suggestions for the start of a QED scientific community.

  8. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications.

    Science.gov (United States)

    Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart

    2013-01-01

    Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
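
    The indirect relationships that LSI can surface are easiest to see on a toy example. The Python sketch below is an illustration only: it assumes the usual formulation of LSI as a truncated singular value decomposition of a term-document matrix, and the five terms, three documents, and the function name `lsi_similarities` are invented. Document 1 never contains the query term "insulin" but shares "glucose" with document 0, which does.

```python
import numpy as np

terms = ["insulin", "glucose", "diabetes", "neuron", "synapse"]

# Invented term-document count matrix (rows: terms, columns: documents).
A = np.array([
    [1, 0, 0],  # insulin
    [1, 1, 0],  # glucose
    [0, 1, 0],  # diabetes
    [0, 0, 1],  # neuron
    [0, 0, 1],  # synapse
], dtype=float)

def lsi_similarities(A, query_vec, k):
    """Cosine similarity between a query and every document in a rank-k latent space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    docs_k = (s_k[:, None] * Vt_k).T   # one row per document, in latent coordinates
    q_k = U_k.T @ query_vec            # query projected onto the same axes
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    return docs_k @ q_k / norms

query = np.zeros(len(terms))
query[terms.index("insulin")] = 1.0

sims = lsi_similarities(A, query, k=2)
for doc_id, sim in enumerate(sims):
    print(doc_id, round(float(sim), 3))
```

    A Boolean search for "insulin" would return only document 0; in the rank-2 latent space document 1 scores just as highly because the two documents collapse onto the same latent direction, while the unrelated document 2 stays orthogonal to the query.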

  9. Reflections on Mixing Methods in Applied Linguistics Research

    Science.gov (United States)

    Hashemi, Mohammad R.

    2012-01-01

    This commentary advocates the use of mixed methods research--that is the integration of qualitative and quantitative methods in a single study--in applied linguistics. Based on preliminary findings from a research project in progress, some reflections on the current practice of mixing methods as a new trend in applied linguistics are put forward.…

  10. Linguistic Error Analysis on Students' Thesis Proposals

    Science.gov (United States)

    Pescante-Malimas, Mary Ann; Samson, Sonrisa C.

    2017-01-01

    This study identified and analyzed the common linguistic errors encountered by Linguistics, Literature, and Advertising Arts majors in their Thesis Proposal classes in the First Semester 2016-2017. The data were the drafts of the thesis proposals of the students from the three different programs. A total of 32 manuscripts were analyzed which was…

  11. Automatic medical image annotation and keyword-based image retrieval using relevance feedback.

    Science.gov (United States)

    Ko, Byoung Chul; Lee, JiHyeon; Nam, Jae-Yeal

    2012-08-01

    This paper presents a novel multiple-keyword annotation method for medical images, keyword-based medical image retrieval, and a relevance feedback method for enhancing image retrieval performance. For semantic keyword annotation, this study proposes a novel medical image classification method combining local wavelet-based center symmetric-local binary patterns with random forests. For keyword-based image retrieval, our retrieval system uses the confidence score that is assigned to each annotated keyword by combining probabilities of random forests with a predefined body relation graph. To overcome the limitations of keyword-based image retrieval, we combine our image retrieval system with a relevance feedback mechanism based on visual features and a pattern classifier. Compared with other annotation and relevance feedback algorithms, the proposed method shows both improved annotation performance and accurate retrieval results.

  12. Linguistic Legitimation of Political Events in Newspaper Discourse

    Directory of Open Access Journals (Sweden)

    Marwah Kareem Ali

    2016-08-01

    This paper examines the discursive structures employed in legitimizing the event of U.S. forces' withdrawal from Iraq and identifies them in relation to linguistic features. It attempts to describe the relation between language use and legitimation discursive structures in depicting political events. The paper focuses on the political event of U.S. forces’ withdrawal from Iraq in the English newspaper issued in Iraq. The study shows the way in which journalists express their values and attitudes concerning this critical event. Consequently, this requires a critical discourse analysis (henceforth, CDA) to analyse news articles in the Iraqi English newspaper: The Kurdish Globe (henceforth, KG) newspaper. Accordingly, the study presents a qualitative content analysis of newspaper articles to identify the legitimation discursive structures and their linguistic features. It is found that the main discursive structures of legitimation employed in the KG newspaper are: authorization, rationalization, and moral evaluation. Besides, there were five verb processes used to represent this legitimation, including material, verbal, relational, mental, and existential. Keywords: Critical discourse analysis, legitimation discursive structures, linguistic features, newspaper discourse, systemic functional linguistics

  13. OCCASIONAL ADNOMINAL IDIOM MODIFICATION - A COGNITIVE LINGUISTIC APPROACH

    Directory of Open Access Journals (Sweden)

    Andreas Langlotz

    2006-06-01

    From a cognitive-linguistic perspective, this paper explores alternative types of adnominal modification in occasional variants of English verbal idioms. Being discussed against data extracted from the British National Corpus (BNC), the model claims that in idiom-production idiomatic constructions are activated as complex linguistic schemas to code a context-specific target-conceptualisation. Adnominal pre- and postmodifications are one specific form of creative alteration to adapt the idiom for this purpose. Semantically, idiom-internal NP-extension is not a uniform process. It is necessary to distinguish two systematic types of adnominal modification: external and internal modification (Ernst 1981). While external NP-modification has adverbial function, i.e. it modifies the idiom as a unit, internal modification directly applies to the head-noun and thus depends on the degree of motivation and analysability of a given idiom. Following the cognitive-linguistic framework, these dimensions of idiom-transparency result from the language user's ability to remotivate the bipartite semantic structure by conceptual metaphors and metonymies.

  14. The Effects of Linguistic Labels Related to Abstract Scenes on Memory

    Directory of Open Access Journals (Sweden)

    Kentaro Inomata

    2011-10-01

    Boundary extension is a false memory for areas beyond the actual boundary of a picture scene. Gagnier (2011) suggested that a linguistic label has no effect on the magnitude of boundary extension. Although she controlled the timing of the presentation or information of the linguistic label, the information of the stimulus was not changed. In the present study, the depiction of the main object was controlled in order to change the contextual information of a scene. In the experiment, 68 participants were shown 12 pictures. The stimuli consisted of pictures that either depicted or did not depict the main object, and half of them were presented with a linguistic description. Participants rated the object-less pictures more closely than the original pictures when the former were presented with linguistic labels. However, when they were presented without linguistic labels, boundary extension did not occur. There was no effect of labels on the pictures that depicted the main objects. On the basis of these results, the linguistic label enhances the representation of an abstract scene such as a homogeneous field or a wall. This finding suggests that boundary extension may be affected not only by visual information but also by other sensory information mediated by linguistic representation.

  15. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that function annotation of the human genome at the molecular level and the phenotype level is very important for systematic analysis of genes. In this study, we present a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs can be annotated by a text mining tool, Open Biomedical Annotator (OBA), and each Entrez gene can be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of the human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e-16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e-14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  16. Opening Minds or Changing Them? Some Observations on Teaching Introductory Linguistics.

    Science.gov (United States)

    Milambiling, Joyce

    2001-01-01

    Analyzes the teaching of linguistics and ethics of representing linguistic issues in a persuasive way, examining tensions between telling it like it is and telling it in a way that students will listen. The paper highlights persuasion, the introduction of linguistic concepts, the importance of understanding dialects and their role in education,…

  17. Ling An: Linguistic analysis of NPP instructions

    Energy Technology Data Exchange (ETDEWEB)

    Karlsson, F.; Salo, L. (Helsingfors Univ., Institutionen foer allmaen spraakvetenskap (Finland)); Wahlstroem, B. (VTT (Finland))

    2008-07-15

    The project consists of two sub-projects, 1) to find out whether the available linguistic method SWECG (Swedish Constraint Grammar) might be used for analyzing the safety manuals for Forsmark nuclear power plant, and 2) to find out whether it is possible to create a working system based on the SWECG method. The conclusion of the project is that an applicable linguistic analysis system may be realized by the company Lingsoft Inc., Aabo, Finland. (ln)

  18. A kindergarten experiment in linguistic e-learning

    DEFF Research Database (Denmark)

    Valente, Andrea; Marchetti, Emanuela

    2006-01-01

    As part of the BlaSq project, we are developing a set of linguistic games to be used in kindergartens. The first of these games is Crazipes, that we are currently testing in a Danish kindergarten, with the support of the local teachers. Here we discuss the architecture of the game, its potentials...... as a linguistic e-learning tool, together with the design and methodology adopted for the study. Some early results are also discussed....

  19. Ling An: LINGUISTIC ANALYSIS OF NPP INSTRUCTIONS

    International Nuclear Information System (INIS)

    Karlsson, F.; Salo, L.; Wahlstroem, B.

    2008-07-01

    The project consists of two sub-projects, 1) to find out whether the available linguistic method SWECG (Swedish Constraint Grammar) might be used for analyzing the safety manuals for Forsmark nuclear power plant, and 2) to find out whether it is possible to create a working system based on the SWECG method. The conclusion of the project is that an applicable linguistic analysis system may be realized by the company Lingsoft Inc., Aabo, Finland. (ln)

  20. Neurological evidence linguistic processes precede perceptual simulation in conceptual processing.

    Science.gov (United States)

    Louwerse, Max; Hutchinson, Sterling

    2012-01-01

    There is increasing evidence from response time experiments that language statistics and perceptual simulations both play a role in conceptual processing. In an EEG experiment we compared neural activity in cortical regions commonly associated with linguistic processing and visual perceptual processing to determine to what extent symbolic and embodied accounts of cognition applied. Participants were asked to determine the semantic relationship of word pairs (e.g., sky - ground) or to determine their iconic relationship (i.e., if the presentation of the pair matched their expected physical relationship). A linguistic bias was found toward the semantic judgment task and a perceptual bias was found toward the iconicity judgment task. More importantly, conceptual processing involved activation in brain regions associated with both linguistic and perceptual processes. When comparing the relative activation of linguistic cortical regions with perceptual cortical regions, the effect sizes for linguistic cortical regions were larger than those for the perceptual cortical regions early in a trial with the reverse being true later in a trial. These results map upon findings from other experimental literature and provide further evidence that processing of concept words relies both on language statistics and on perceptual simulations, whereby linguistic processes precede perceptual simulation processes.

  1. GEOLINGUISTICS: THE LINGUISTIC ATLAS OF PARANÁ

    Directory of Open Access Journals (Sweden)

    Rosa Evangelina de Santana BELLI RODRIGUES

    2015-06-01

    The objective of this work is to analyze the methodology adopted by the Linguistic Atlas of Paraná – ALPR (AGUILERA, 1990) and to describe its results in relation to other Brazilian atlases. To meet this objective, we first present the modifications, mainly methodological, undergone by Geolinguistics towards a more complete and in-depth description of linguistic variation. The Pluridimensional Geolinguistics and Contractual model of Harald Thun (1998) and the Linguistic Atlas of Brazil – ALiB (CARDOSO et al., 2014), published in October 2014, are presented. It was also necessary to describe, although briefly, the most traditional Geolinguistics research method, characteristic of the ALPR, before referring the text back to Aguilera’s Atlas. After discussing the criteria on which the ALPR was constructed, from choice of informers to the Geolinguistics charts that compose it, as well as its complementation by the ALPR II (ALTINO, 2007), it was possible to analyze the results and relate them to the hypotheses posed by the thesis which gave origin to it.

  2. Music playschool enhances children's linguistic skills.

    Science.gov (United States)

    Linnavalli, Tanja; Putkinen, Vesa; Lipsanen, Jari; Huotilainen, Minna; Tervaniemi, Mari

    2018-06-08

    Several studies have suggested that intensive musical training enhances children's linguistic skills. Such training, however, is not available to all children. We studied in a community setting whether a low-cost, weekly music playschool provided to 5-6-year-old children in kindergartens could already affect their linguistic abilities. Children (N = 66) were tested four times over two school-years with Phoneme processing and Vocabulary subtests, along with tests for Perceptual reasoning skills and Inhibitory control. We compared the development of music playschool children to their peers either attending to similarly organized dance lessons or not attending to either activity. Music playschool significantly improved the development of children's phoneme processing and vocabulary skills. No such improvements on children's scores for non-verbal reasoning and inhibition were obtained. Our data suggest that even playful group music activities - if attended to for several years - have a positive effect on pre-schoolers' linguistic skills. Therefore we promote the concept of implementing regular music playschool lessons given by professional teachers in early childhood education.

  3. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

    Science.gov (United States)

    Holt, Carson; Yandell, Mark

    2011-12-22

    Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

  4. Grammatical Gender Trouble and Hungarian Gender[lessness]. Part I: Comparative Linguistic Gender

    Directory of Open Access Journals (Sweden)

    Louise O. Vasvári

    2011-01-01

    The aim of this study is to define linguistic gender[lessness], with particular reference in the latter part of the article to Hungarian, and to show why it is a feminist issue. I will discuss the [socio]linguistics of linguistic gender in three types of languages: those, like German and the Romance languages, among others, which possess grammatical gender; languages such as English, with only pronominal gender (sometimes misnamed ‘natural gender’); and languages such as Hungarian and other Finno-Ugric languages, as well as many other languages in the world, such as Turkish and Chinese, which have no linguistic or pronominal gender but, like all languages, can make lexical gender distinctions. While in a narrow linguistic sense linguistic gender can be said to be afunctional, this does not take into account the ideological ramifications in gendered languages of the “leakage” between gender and sex[ism], while at the same time so-called genderless languages can express societal sexist assumptions linguistically through, for example, lexical gender, semantic derogation of women, and naming conventions. Thus, both languages with overt grammatical gender and those with gender-related asymmetries of a more covert nature show language to represent traditional cultural expectations, illustrating that linguistic gender is a feminist issue.

  5. Annotation: The Savant Syndrome

    Science.gov (United States)

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  6. Introduction

    Directory of Open Access Journals (Sweden)

    Merete Anderssen

    2008-02-01

    The third volume of the 34th edition of Nordlyd contains the proceedings for a workshop on language acquisition that took place at the Twenty Second Scandinavian Conference of Linguistics (SCL 22). The overall theme of the conference was 'brain, mind and language,' and the workshop invited abstracts in first, second and bilingual acquisition related to this theme. SCL 22 was held at the Centre For Linguistics (CFL) at the University of Aalborg in Denmark on June 19–22, 2006, under the auspices of the Nordic Association of Linguists (NAL).

  7. Analysis of LYSA-calculus with explicit confidentiality annotations

    DEFF Research Database (Denmark)

    Gao, Han; Nielson, Hanne Riis

    2006-01-01

    Recently there has been an increased research interest in applying process calculi in the verification of cryptographic protocols due to their ability to formally model protocols. This work presents LYSA with explicit confidentiality annotations for indicating the expected behavior of target...... malicious activities performed by attackers as specified by the confidentiality annotations. The proposed analysis approach is fully automatic without the need of human intervention and has been applied successfully to a number of protocols....

  8. A responsible agenda for applied linguistics: Confessions of a philosopher

    OpenAIRE

    Albert Weideman

    2011-01-01

    When we undertake academic, disciplinary work, we rely on philosophical starting points. Several straightforward illustrations of this can be found in the history of applied linguistics. It is evident from the history of our field that various historically influential approaches to our discipline base themselves upon different academic confessions. This paper examines the effects of basing our applied linguistic work on the idea that applied linguistics is a discipline concerned with design. ...

  9. Lexicography and Linguistic Creativity*

    African Journals Online (AJOL)

    rbr

    It could be argued that lexicography has little business with linguistic creativ- ...... The forms in which traditional proverbs are found can also vary greatly: many ... BoE has examples of the proverb every cloud has a silver lining but many more ...

  10. On possible linguistic correlates to brain lateralization

    Directory of Open Access Journals (Sweden)

    Tania Kouteva/Kuteva

    2014-04-01

    The present paper compares the two modes of processing proposed by Van Lancker Sidtis (2009) in her dual process model and the two domains of discourse organization distinguished in the framework of Discourse Grammar (Heine et al. 2013; Kaltenböck et al. 2011). These two frameworks were developed on different kinds of data. In the dual process model it is observations on patients with left or right hemisphere damage that marked the starting point of analysis. Central to the dual process model is the distinction between novel speech (or novel language, or newly created language, or propositional speech) and formulaic speech (or formulaic expressions or automatic speech). Easily identified instances of formulaic speech are swear words, interjections, pause fillers, discourse elements, non-literal lexical meanings for idioms, and proverbs. Unlike the dual process model, in the Discourse Grammar model it is linguistic discontinuities that provided the basis of analysis. Discourse grammar in this model is understood as all the linguistic resources that are available for constructing spoken and written (and signed) texts. We argue that Discourse Grammar can be divided into two distinct domains, namely Sentence Grammar and Thetical Grammar. Whereas Sentence Grammar has been at the centre of interest in mainstream linguistics, Thetical Grammar encompasses linguistic phenomena – such as formulae of social exchange, imperatives, vocatives, interjections, including hesitation markers and pause fillers and what is traditionally known as “parenthetical” constructions – that pose a problem to orthodox grammatical analysis. We show that the findings made within the two frameworks are largely compatible with one another: both models converge on claiming that there is a significant correlation between linguistic categorization and hemisphere-based brain activity. In the dual process model it is hypothesized that there is a significant correlation between certain kinds of speech

  11. First generation annotations for the fathead minnow (Pimephales promelas) genome

    Science.gov (United States)

    Ab initio gene prediction and evidence alignment were used to produce the first annotations for the fathead minnow SOAPdenovo genome assembly. Additionally, a genome browser hosted at genome.setac.org provides simplified access to the annotation data in context with fathead minno...

  12. Translating extra-linguistic culture-bound concepts in Mofolo: a ...

    African Journals Online (AJOL)

    Academic work on translation issues abounds, carried out by linguists and trans- .... to linguistics as they are cultural, social, literary, etc. (615–6). In other ..... places the term in italics on page 95 and makes a brief translator's note, “council.

  13. Extracting Cross-Ontology Weighted Association Rules from Gene Ontology Annotations.

    Science.gov (United States)

    Agapito, Giuseppe; Milano, Marianna; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-01-01

    Gene Ontology (GO) is a structured repository of concepts (GO Terms) that are associated with one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. Among the different approaches to analysis, association rules (AR) provide useful knowledge by discovering previously unknown, biologically relevant associations between GO terms. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. We here adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state of the art approaches.
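
    The idea of a weighted association rule over annotations can be illustrated with a minimal sketch. It is not the GO-WAR algorithm, whose weighting scheme is not given in this record. The function name, the term labels and the choice of weighting each gene by the mean weight of its terms are all assumptions made for illustration.

```python
from itertools import combinations

def weighted_rules(annotations, weights, min_support=0.5, min_confidence=0.6):
    """Mine pairwise rules lhs -> rhs from a gene -> set-of-terms mapping.

    Assumption (not from GO-WAR): each gene counts with the mean weight of
    its terms, and terms missing from `weights` default to 1.0.
    Every gene is expected to carry at least one term.
    """
    def gene_weight(terms):
        return sum(weights.get(t, 1.0) for t in terms) / len(terms)

    total = sum(gene_weight(terms) for terms in annotations.values())

    def weighted_support(itemset):
        covered = sum(gene_weight(terms)
                      for terms in annotations.values() if itemset <= terms)
        return covered / total

    all_terms = sorted(set().union(*annotations.values()))
    rules = []
    for a, b in combinations(all_terms, 2):
        pair_support = weighted_support({a, b})
        if pair_support < min_support:
            continue
        # Test the rule in both directions; confidence is asymmetric.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / weighted_support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, round(pair_support, 3), round(confidence, 3)))
    return rules
```

    A rule is cross-ontology when its two sides come from different sub-ontologies, which in this sketch would only require tagging each term label with its sub-ontology and filtering the returned pairs.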

  14. Text Linguistics in the Context of the Communication Sciences

    Directory of Open Access Journals (Sweden)

    Silviu Serban

    2011-05-01

    Full Text Available This paper tries to analyse the conditions under which text linguistics emerged, taking into consideration the roots of the preoccupations in its domain, which originated in the framework of communication studies. Thus, the change of perspective on communication, from mechanistic transmission to interactivity and the exchange of meanings, led to the pragmatic orientation of linguistic research, directed not just at the message itself but also at the elements of the communicative act and the context where the exchange of meanings takes place. As a result, text linguistics defines the text as a communicational occurrence, involving both the members of the communication and the conditions of the production and reception of the message, unlike conventional linguistics, which studies the text in abstracto, just the message itself, ignoring the world that the text refers to and the users of the message, the transmitter and the receiver.

  15. Non-linguistic Conditions for Causativization as a Linguistic Attractor

    OpenAIRE

    Johanna Nichols

    2018-01-01

    An attractor, in complex systems theory, is any state that is more easily or more often entered or acquired than departed or lost; attractor states therefore accumulate more members than non-attractors, other things being equal. In the context of language evolution, linguistic attractors include sounds, forms, and grammatical structures that are prone to be selected when sociolinguistics and language contact make it possible for speakers to choose between competing forms. The reasons why an e...

  16. Alternate fusion fuels workshop

    International Nuclear Information System (INIS)

    1981-06-01

    The workshop was organized to focus on a specific confinement scheme: the tokamak. The workshop was divided into two parts: systems and physics. The topics discussed in the systems session were narrowly focused on systems and engineering considerations in the tokamak geometry. The workshop participants reviewed the status of system studies, trade-offs between d-t and d-d based reactors and engineering problems associated with the design of a high-temperature, high-field reactor utilizing advanced fuels. The physics session dealt with high-beta stability, synchrotron losses and transport in alternate fuel systems. The agenda for the workshop is attached.

  17. Jannovar: a java library for exome annotation.

    Science.gov (United States)

    Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N

    2014-05-01

    Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.
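
    The interval-tree lookup mentioned in this abstract can be sketched in a few lines. This is an illustration of the general technique only, not Jannovar's Java implementation or API; the transcript ids and coordinates in the usage note are invented, and intervals are treated as closed on both ends.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[int, int, str]  # (start, end, transcript_id), end inclusive

@dataclass
class Node:
    interval: Interval
    max_end: int                      # largest end coordinate in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(intervals: List[Interval]) -> Optional[Node]:
    """Build a balanced tree keyed on start, storing the max end per subtree."""
    def rec(items):
        if not items:
            return None
        mid = len(items) // 2
        node = Node(items[mid], items[mid][1], rec(items[:mid]), rec(items[mid + 1:]))
        for child in (node.left, node.right):
            if child is not None:
                node.max_end = max(node.max_end, child.max_end)
        return node
    return rec(sorted(intervals))

def overlapping(node: Optional[Node], start: int, end: int) -> List[str]:
    """Return ids of all stored intervals that overlap [start, end]."""
    if node is None or node.max_end < start:
        return []                     # nothing in this subtree reaches the query
    hits = overlapping(node.left, start, end)
    s, e, name = node.interval
    if s <= end and start <= e:
        hits.append(name)
    if s <= end:                      # right subtree only holds starts >= s
        hits.extend(overlapping(node.right, start, end))
    return hits
```

    For example, with `tree = build([(100, 500, "tx1"), (450, 900, "tx2"), (1000, 1200, "tx3")])`, a single-base variant at position 460 is looked up with `overlapping(tree, 460, 460)` and touches both `tx1` and `tx2`.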

  18. The Third ATLAS ROD Workshop

    CERN Multimedia

    Poggioli, L.

    A new-style Workshop After two successful ATLAS ROD Workshops dedicated to the ROD hardware and held at the Geneva University in 1998 and in 2000, a new style Workshop took place at LAPP in Annecy on November 14-15, 2002. This time the Workshop was fully dedicated to the ROD-TDAQ integration and software in view of the near future integration activities of the final RODs for the detector assembly and commissioning. More precisely, the aim of this workshop was to get from the sub-detectors the parameters needed for T-DAQ, as well as status and plans from ROD builders. On the other hand, what was decided and assumed had to be stated (like EB decisions and URDs), and also support plans. The Workshop gathered about 70 participants from all ATLAS sub-detectors and the T-DAQ community. The quite dense agenda allowed nevertheless for many lively discussions, and for a dinner in the old town of Annecy. The Sessions The Workshop was organized in five main sessions: Assumptions and recommendations Sub-de...

  19. Graph-based sequence annotation using a data integration approach.

    Science.gov (United States)

    Pesch, Robert; Lysenko, Artem; Hindle, Matthew; Hassani-Pak, Keywan; Thiele, Ralf; Rawlings, Christopher; Köhler, Jacob; Taubert, Jan

    2008-08-25

    The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara-Cyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.

  20. Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator

    Science.gov (United States)

    Seyed, P.; Chastain, K.; McGuinness, D. L.

    2013-12-01

    Use of Semantic Web technologies for data management in the Earth sciences (and beyond) has great potential but is still in its early stages, since the challenges of translating data into a more explicit or semantic form for immediate use within applications have not been fully addressed. In this abstract we help address this challenge by introducing the SemantEco Annotator, which enables anyone, regardless of expertise, to semantically annotate tabular Earth Science data and translate it into linked data format, while applying the logic inherent in community-standard vocabularies to guide the process. The Annotator was conceived out of a desire to unify dataset content from a variety of sources under common vocabularies, for use in semantically-enabled web applications. Our current use case employs linked data generated by the Annotator for use in the SemantEco environment, which utilizes semantics to help users explore, search, and visualize water or air quality measurement and species occurrence data through a map-based interface. The generated data can also be used immediately to facilitate discovery and search capabilities within 'big data' environments. The Annotator provides a method for taking information about a dataset that may only be known to its maintainers and making it explicit, in a uniform and machine-readable fashion, such that a person or information system can more easily interpret the underlying structure and meaning. Its primary mechanism is to enable a user to formally describe how columns of a tabular dataset relate and/or describe entities. For example, if a user identifies columns for latitude and longitude coordinates, we can infer the data refers to a point that can be plotted on a map. Further, it can be made explicit that measurements of 'nitrate' and 'NO3-' are of the same entity through vocabulary assignments, thus more easily utilizing data sets that use different nomenclatures. The Annotator provides an extensive and searchable
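
    The column-annotation mechanism described in this abstract can be sketched roughly as follows. This is not the SemantEco Annotator's code or data model; the vocabulary identifier `chem:Nitrate`, the role names and the record layout are hypothetical and chosen only to show how two differently labeled columns resolve to one concept and how latitude and longitude columns yield a point.

```python
# Hypothetical vocabulary: different column labels resolve to one shared concept.
VOCABULARY = {"nitrate": "chem:Nitrate", "no3-": "chem:Nitrate"}

def annotate_rows(rows, column_roles):
    """Turn tabular rows into simple linked-data-style records.

    `column_roles` maps a column name to 'latitude', 'longitude' or
    'measurement' (role names are assumptions made for this sketch).
    """
    records = []
    for row in rows:
        record = {"measurements": {}}
        lat = lon = None
        for column, role in column_roles.items():
            value = float(row[column])
            if role == "latitude":
                lat = value
            elif role == "longitude":
                lon = value
            elif role == "measurement":
                # Fall back to the raw column name when no concept is known.
                concept = VOCABULARY.get(column.strip().lower(), column)
                record["measurements"][concept] = value
        if lat is not None and lon is not None:
            record["point"] = (lat, lon)  # inferred: the row describes a map point
        records.append(record)
    return records
```

    Two datasets that label the same quantity as `Nitrate` and `NO3-` would both end up with a `chem:Nitrate` measurement, which is what makes them comparable afterwards.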

  1. Emergency response workers workshop

    International Nuclear Information System (INIS)

    Agapeev, S.A.; Glukhikh, E.N.; Tyurin, R.L.

    2012-01-01

    A training workshop entitled "Current issues and potential improvements in Rosatom Corporation emergency prevention and response system" was held in May-June, 2012. The workshop combined theoretical training with a full-scale practical exercise that demonstrated the existing innovative capabilities for radiation reconnaissance, diving equipment and robotics, aircraft, emergency response and rescue hardware and machinery. This paper describes the activities carried out during the workshop [ru]

  2. Corpus methods and their reflection in linguistic theories of the 20th century

    Directory of Open Access Journals (Sweden)

    Simon Krek

    2013-05-01

    Full Text Available In the 20th century structuralism established itself as the central linguistic theory, in the first half mainly through its originator Ferdinand de Saussure, and in the second half with the figure of Noam Chomsky. The latter consistently refused to acknowledge the analysis of large quantities of text as a valuable method, and favoured the linguistic intuition of a native speaker instead. In parallel with structuralism other trends in linguistics emerged which pointed to the inadequacy of the prevailing linguistic paradigm and to theoretical insights which were only possible after the systematic analysis of large quantities of texts. The paper discusses some of the dilemmas stemming from this dichotomy and places corpus linguistics in a broader linguistic context.

  3. Annotation of phenotypic diversity: decoupling data curation and ontology curation using Phenex.

    Science.gov (United States)

    Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J

    2014-01-01

    Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.

  4. Automated evaluation of annotators for museum collections using subjective logic

    NARCIS (Netherlands)

    Ceolin, D.; Nottamkandath, A.; Fokkink, W.J.; Dimitrakos, Th.; Moona, R.; Patel, Dh.; Harrison McKnight, D.

    2012-01-01

    Museums are rapidly digitizing their collections, and face a huge challenge to annotate every digitized artifact in store. Therefore they are opening up their archives for receiving annotations from experts world-wide. This paper presents an architecture for choosing the most eligible set of

  5. 2015 Inverter Workshop | Photovoltaic Research | NREL

    Science.gov (United States)

    Inverter Workshop 2015 Inverter Workshop Wednesday, February 25, 2015 Chair: Jack Flicker In about inverters. This workshop represented a follow-on to the inverter workshops that Sandia National conversations between module and inverter experts. Agenda For a detailed schedule of the day's events, access

  6. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...... and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross...

  7. Text genres and registers: the computation of linguistic features

    CERN Document Server

    Fang, Chengyu Alex

    2015-01-01

    This book is a description of some of the most recent advances in text classification as part of a concerted effort to achieve computer understanding of human language. In particular, it addresses state-of-the-art developments in the computation of higher-level linguistic features, ranging from etymology to grammar and syntax for the practical task of text classification according to genres, registers and subject domains. Serving as a bridge between computational methods and sophisticated linguistic analysis, this book will be of particular interest to academics and students of computational linguistics as well as professionals in natural language engineering.

  8. Phonetics, Phonology, and Applied Linguistics.

    Science.gov (United States)

    Nadasdy, Adam

    1995-01-01

    Examines recent trends in phonetics and phonology and their influence on second language instruction, specifically grammar and lexicography. An annotated bibliography discusses nine important works in the field. (99 references) (MDM)

  9. Interface of Linguistic and Visual Information During Audience Design.

    Science.gov (United States)

    Fukumura, Kumiko

    2015-08-01

    Evidence suggests that speakers can take account of the addressee's needs when referring. However, what representations drive the speaker's audience design has been less clear. This study aims to go beyond previous studies by investigating the interplay between the visual and linguistic context during audience design. Speakers repeated subordinate descriptions (e.g., firefighter) given in the prior linguistic context less and used basic-level descriptions (e.g., man) more when the addressee did not hear the linguistic context than when s/he did. But crucially, this effect happened only when the referent lacked the visual attributes associated with the expressions (e.g., the referent was in plain clothes rather than in a firefighter uniform), so there was no other contextual cue available for the identification of the referent. This suggests that speakers flexibly use different contextual cues to help their addressee map the referring expression onto the intended referent. In addition, speakers used fewer pronouns when the addressee did not hear the linguistic antecedent than when s/he did. This suggests that although speakers may be egocentric during anaphoric reference (Fukumura & Van Gompel, 2012), they can cooperatively avoid pronouns when the linguistic antecedents were not shared with their addressee during initial reference. © 2014 Cognitive Science Society, Inc.

  10. First International Workshop on Grid Simulator Testing of Wind Turbine Drivetrains: Workshop Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Gevorgian, V.; Link, H.; McDade, M.; Mander, A.; Fox, J. C.; Rigas, N.

    2013-11-01

    This report summarizes the proceedings of the First International Workshop on Grid Simulator Testing of Wind Turbine Drivetrains, held from June 13 to 14, 2013, at the National Renewable Energy Laboratory's National Wind Technology Center, located south of Boulder, Colorado. The workshop was sponsored by the U.S. Department of Energy and cohosted by the National Renewable Energy Laboratory and Clemson University under ongoing collaboration via a cooperative research and development agreement. The purpose of the workshop was to provide a forum to discuss the research, testing needs, and state-of-the-art apparatuses involved in grid compliance testing of utility-scale wind turbine generators. This includes both dynamometer testing of wind turbine drivetrains ('ground testing') and field testing of grid-connected wind turbines. The workshop comprised four sessions followed by discussions in which all attendees were encouraged to participate.

  11. Genome Annotation and Transcriptomics of Oil-Producing Algae

    Science.gov (United States)

    2015-03-16

    AFRL-OSR-VA-TR-2015-0103. Genome Annotation and Transcriptomics of Oil-Producing Algae. Sabeeha Merchant, University of California Los Angeles. Final...2010 to 12-31-2014. Contract number: FA9550-10-1-0095. Abstract: Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some

  12. Paradigm Changes in Linguistics: From Reductionism to Holism

    Science.gov (United States)

    Weigand, Edda

    2011-01-01

    Linguistics like any science has undergone a series of paradigm changes, one of the most important being the change from the view of language as a sign system to that of language-in-use. In the face of the progress made in the natural and social sciences in recent years, from biology to economics, the question has to be posed where linguistics and…

  13. SCHOOL LINGUISTIC CREATIVITY BASED ON SCIENTIFIC GEOGRAPHICAL TEXTS

    OpenAIRE

    VIORICA BLÎNDĂ

    2012-01-01

    The analysis and observation of the natural environment and of the social and economic one, observing phenomena, objects, beings, and geographical events are at the basis of producing geographical scientific texts. The symbols of iconotexts and cartotexts are another source of inspiration for linguistic interpretation. The linguistic creations that we selected for our study are the scientific analysis, the commentary, the characteriz...

  14. Resources on quantitative/statistical research for applied linguists

    OpenAIRE

    Brown, James Dean

    2004-01-01

    The purpose of this review article is to survey and evaluate existing books on quantitative/statistical research in applied linguistics. The article begins by explaining the types of texts that will not be reviewed, then it briefly describes nine books that address how to do quantitative/statistical applied linguistics research. The review then compares (in prose and tables) the general characteris...

  15. Annotating smart environment sensor data for activity learning.

    Science.gov (United States)

    Szewcyzk, S; Dwan, K; Minor, B; Swedlove, B; Cook, D

    2009-01-01

    The pervasive sensing technologies found in smart homes offer unprecedented opportunities for providing health monitoring and assistance to individuals experiencing difficulties living independently at home. In order to monitor the functional health of smart home residents, we need to design technologies that recognize and track the activities that people perform at home. Machine learning techniques can perform this task, but the software algorithms rely upon large amounts of sample data that is correctly labeled with the corresponding activity. Labeling, or annotating, sensor data with the corresponding activity can be time consuming, may require input from the smart home resident, and is often inaccurate. Therefore, in this paper we investigate four alternative mechanisms for annotating sensor data with a corresponding activity label. We evaluate the alternative methods along the dimensions of annotation time, resident burden, and accuracy using sensor data collected in a real smart apartment.

  16. BEASTling: A software tool for linguistic phylogenetics using BEAST 2

    Science.gov (United States)

    Forkel, Robert; Kaiping, Gereon A.; Atkinson, Quentin D.

    2017-01-01

    We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts. PMID:28796784
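
    The general idea of expanding a short, human-readable configuration into verbose XML can be shown with a toy converter. This is not BEASTling's configuration syntax or the BEAST 2 XML schema; the section name, the option keys and the XML element names below are hypothetical and serve only to illustrate the transformation.

```python
import configparser
import xml.etree.ElementTree as ET

def config_to_xml(config_text: str) -> str:
    """Convert an INI-style configuration into a flat XML document.

    Element names ('analysis', 'section', 'option') are invented for this
    sketch and do not correspond to any real BEAST 2 schema.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    root = ET.Element("analysis")
    for section in parser.sections():
        element = ET.SubElement(root, "section", name=section)
        for key, value in parser[section].items():
            ET.SubElement(element, "option", name=key, value=value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical configuration: filter by language family, constrain monophyly.
EXAMPLE = """
[languages]
families = Austronesian
monophyly = true
"""
```

    Calling `config_to_xml(EXAMPLE)` yields one `section` element with two `option` children; a real tool would add the model, prior and logging elements that make hand-written analysis files long.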

  17. BEASTling: A software tool for linguistic phylogenetics using BEAST 2.

    Directory of Open Access Journals (Sweden)

    Luke Maurits

    Full Text Available We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts.

  18. Formation of new linguistic competences in education space: naming examination

    Directory of Open Access Journals (Sweden)

    Remchukova Elena

    2016-01-01

    Full Text Available The naming examination is a new kind of linguistic examination. The article deals with the linguistic aspects of teaching this course in higher education for the special training of experts. To form professional competence in naming examination, the teaching process pays special attention to the theory of nomination and onomastics, to the examination of language units from the point of view of component analysis, semantic-stylistic analysis and others, and to the formation of skills in working with different lexicographic sources, digital resources and databases. In the laboratory course “Applied and mathematical linguistics,” the skills of lexico-semantic, morphological, etymological, morphemic, word-formation and phonetic analysis of concrete names are practiced. We focus on the study of artificial naming patterns, including advertising names, which bring out the creative potential of the Russian language. Creative trends dominate in this area of nomination. Naming examination as a new kind of forensic linguistic examination is taught within the course “Forensic linguistic examination”, which provides the technical education of students.

  19. Linguistic adaptation between mothers and children in ASD

    DEFF Research Database (Denmark)

    Fusaroli, Riccardo; Weed, Ethan; Fein, Deborah

    We investigate mother-child linguistic adaptation in 33 ASD and 33 matched TD children at two time-scales: conversational match and longitudinal development. We employ a longitudinal corpus (6 visits over 2 years) consisting of controlled playful activities between mothers and their children...... (Goodwin et al. 2012). We quantified amount (number of words and utterances) and complexity (lexical repertoire and utterance length) of linguistic behavior in both mother and child. We used mixed-effects growth curve models to quantify i) match within-conversation and ii) longitudinal impact between visits....... Child and mother are strongly correlated in their linguistic behaviors (R2 between .07 and .62, pMother-child pairs in the ASD group, however, show a shallower increase in match. Amount and complexity...

  20. Annotation Method (AM): SE7_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE7_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...
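
    The two-stage lookup described in this record (primary databases first, secondary databases only for peaks with no primary hit) can be sketched as below. This is not PowerGet's implementation. Matching peaks by mass within a fixed tolerance is an assumption, and the database contents in the usage note are placeholders rather than real entries.

```python
def annotate_peaks(peaks, primary_dbs, secondary_dbs, tolerance=0.01):
    """Annotate peaks with a primary search, then a secondary fallback.

    `peaks` maps peak id -> observed mass; each database maps a compound
    label -> reference mass. Mass matching within `tolerance` is an
    assumption of this sketch, not a documented PowerGet behaviour.
    """
    def search(mass, databases):
        return [(db_name, compound)
                for db_name, entries in databases.items()
                for compound, ref_mass in entries.items()
                if abs(ref_mass - mass) <= tolerance]

    results = {}
    for peak_id, mass in peaks.items():
        hits = search(mass, primary_dbs)
        stage = "primary"
        if not hits:
            # Only peaks without a primary hit go to the secondary databases.
            hits = search(mass, secondary_dbs)
            stage = "secondary"
        results[peak_id] = (stage, hits)
    return results
```

    With `primary_dbs = {"KEGG": {"compound_A": 100.0}}` and `secondary_dbs = {"exactMassDB": {"formula_B": 250.002}}`, a peak at mass 100.005 is resolved in the primary stage and a peak at 250.0 falls through to the secondary stage.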

  1. Annotation Method (AM): SE36_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE36_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  2. Annotation Method (AM): SE14_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE14_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  3. Annotation Method (AM): SE33_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE33_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  4. Annotation Method (AM): SE12_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE12_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  5. Annotation Method (AM): SE20_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE20_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  6. Annotation Method (AM): SE2_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE2_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  7. Annotation Method (AM): SE28_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE28_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary database search. Peaks with no hit to these databases are then selected to secondary search using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  8. Annotation Method (AM): SE11_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE11_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  9. Annotation Method (AM): SE17_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE17_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  10. Annotation Method (AM): SE10_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE10_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  11. Annotation Method (AM): SE4_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE4_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  12. Annotation Method (AM): SE9_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE9_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  13. Annotation Method (AM): SE3_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE3_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  14. Annotation Method (AM): SE25_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE25_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  15. Annotation Method (AM): SE30_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE30_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  16. Annotation Method (AM): SE16_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE16_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  17. Annotation Method (AM): SE29_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE29_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  18. Annotation Method (AM): SE35_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE35_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...

  19. Annotation Method (AM): SE6_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE6_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  20. Annotation Method (AM): SE1_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE1_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit in these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...
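
    The two-stage search quoted in the records above (primary databases first, then a secondary search only for peaks with no primary hit) can be sketched as follows. This is a minimal illustration, not the PowerGet implementation: it assumes that a "hit" means a reference mass within a ppm tolerance of the peak mass, and the function names, the 5 ppm default, the compound names, and all mass values are hypothetical placeholders.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in: a database maps a compound name to a reference mass.
Database = Dict[str, float]
Hit = Tuple[str, str]  # (database name, compound name)


def search(mass: float, dbs: Dict[str, Database], tol_ppm: float) -> List[Hit]:
    """Return every (database, compound) whose mass is within tol_ppm of `mass`."""
    hits = []
    for db_name, entries in dbs.items():
        for compound, ref_mass in entries.items():
            if abs(mass - ref_mass) / ref_mass * 1e6 <= tol_ppm:
                hits.append((db_name, compound))
    return hits


def annotate(peaks: List[float], primary: Dict[str, Database],
             secondary: Dict[str, Database],
             tol_ppm: float = 5.0) -> Dict[float, Tuple[str, List[Hit]]]:
    """Search primary databases first; fall back to secondary only on no hit."""
    annotations = {}
    for mass in peaks:
        stage, hits = "primary", search(mass, primary, tol_ppm)
        if not hits:
            stage, hits = "secondary", search(mass, secondary, tol_ppm)
        annotations[mass] = (stage if hits else "unannotated", hits)
    return annotations


# Placeholder data: the database names come from the records, the entries do not.
primary_dbs = {
    "KEGG": {"compound_A": 180.0634},
    "KNApSAcK": {"compound_B": 302.0427},
    "LipidMAPS": {"compound_C": 256.2402},
}
secondary_dbs = {
    "exactMassDB": {"formula_X": 196.0736},
    "Pep1000": {"peptide_Y": 132.0535},
}

result = annotate([180.0634, 132.0535, 500.0], primary_dbs, secondary_dbs)
```

    With this placeholder data, the first peak is resolved in the primary stage, the second falls through to the secondary stage, and the third stays unannotated. How hits from the different databases are combined afterwards is truncated in the records, so the sketch stops at collecting them.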