WorldWideScience

Sample records for handwritten textual annotations

  1. The Effects of Visual and Textual Annotations on Spanish Listening Comprehension, Vocabulary Acquisition and Cognitive Load

    Science.gov (United States)

    Cottam, Michael Evan

    2010-01-01

    The purpose of this experimental study was to investigate the effects of textual and visual annotations on Spanish listening comprehension and vocabulary acquisition in the context of an online multimedia listening activity. Ninety-five students who were enrolled in different sections of first-year Spanish classes at a community college and a large…

  2. The Generator of the Event Structure Lexicon (GESL): Automatic Annotation of Event Structure for Textual Inference Tasks

    Science.gov (United States)

    Im, Seohyun

    2013-01-01

    This dissertation aims to develop the Generator of the Event Structure Lexicon (GESL), a tool that automates annotation of the event structure of verbs in text to support textual inference tasks related to lexically entailed subevents. The output of the GESL is the Event Structure Lexicon (ESL), a lexicon of verbs in text which includes…

  3. Online handwritten mathematical expression recognition

    Science.gov (United States)

    Büyükbayrak, Hakan; Yanikoglu, Berrin; Erçil, Aytül

    2007-01-01

    We describe a system for recognizing online, handwritten mathematical expressions. The system is designed with a user interface for writing scientific articles, supporting the recognition of basic mathematical expressions as well as integrals, summations, matrices, etc. A feed-forward neural network recognizes symbols, which are assumed to be single-stroke, and a recursive algorithm parses the expression by combining the neural network output with the structure of the expression. Preliminary results show that writer-dependent recognition rates are very high (99.8%) while writer-independent symbol recognition rates are lower (75%). The interface associated with the proposed system integrates the built-in recognition capabilities of Microsoft's Tablet PC API for recognizing textual input and supports conversion of hand-drawn figures into PNG format. This enables the user to enter text, mathematics and figures in a single interface. After recognition, all output is combined into a single LaTeX source and compiled into a PDF file.
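
    A minimal sketch of the structural-parsing idea the abstract describes: symbol labels (which the paper obtains from the neural network) are combined with bounding-box geometry to build a LaTeX string. The Symbol class, the single superscript rule, and its thresholds are illustrative assumptions, not the paper's actual recursive algorithm.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Symbol:
        label: str   # classifier output for one single-stroke symbol
        x: float     # left edge of the bounding box
        y: float     # vertical center (smaller = higher on the page)
        h: float     # bounding-box height

    def to_latex(symbols: list[Symbol]) -> str:
        symbols = sorted(symbols, key=lambda s: s.x)   # left-to-right order
        out, prev = [], None
        for cur in symbols:
            # Raised and clearly smaller than its left neighbour => superscript.
            if prev and cur.y < prev.y - 0.5 * prev.h and cur.h < 0.8 * prev.h:
                out.append("^{" + cur.label + "}")
            else:
                out.append(cur.label)
            prev = cur
        return "".join(out)

    # An "x" with a raised, smaller "2" parses to x^{2}.
    print(to_latex([Symbol("x", 0, 10, 10), Symbol("2", 8, 3, 5)]))
    ```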

  4. The surplus value of semantic annotations

    NARCIS (Netherlands)

    Marx, M.

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,

  5. Use of Splines in Handwritten Character Recognition

    OpenAIRE

    Sunil Kumar; Gopinath S,; Satish Kumar; Rajesh Chhikara

    2010-01-01

    Handwritten character recognition software identifies handwritten characters and receives and interprets intelligible handwritten input from sources such as manuscript documents. The past several years have seen the development of many systems able to simulate human brain actions. Among the many paradigms used, neural networks and artificial intelligence are the two most important. In this paper we propose a new algorithm for recognition of handwritten t...

  6. Collaborative Paper-Based Annotation of Lecture Slides

    Science.gov (United States)

    Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max

    2009-01-01

    In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…

  7. Exploring textual data

    CERN Document Server

    Lebart, Ludovic; Berry, Lisette

    1998-01-01

    Researchers in a number of disciplines deal with large text sets requiring both text management and text analysis. Faced with a large amount of textual data collected in marketing surveys, literary investigations, historical archives and documentary data bases, these researchers require assistance with organizing, describing and comparing texts. Exploring Textual Data demonstrates how exploratory multivariate statistical methods such as correspondence analysis and cluster analysis can be used to help investigate, assimilate and evaluate textual data. The main text does not contain any strictly mathematical demonstrations, making it accessible to a large audience. This book is very user-friendly with proofs abstracted in the appendices. Full definitions of concepts, implementations of procedures and rules for reading and interpreting results are fully explored. A succession of examples is intended to allow the reader to appreciate the variety of actual and potential applications and the complementary processin...

  8. Connecting textual segments

    DEFF Research Database (Denmark)

    Brügger, Niels

    2017-01-01

    In “Connecting textual segments: A brief history of the web hyperlink” Niels Brügger investigates the history of one of the most fundamental features of the web: the hyperlink. Based on the argument that the web hyperlink is best understood if it is seen as another step in a much longer and broader history than just the years of the emergence of the web, the chapter traces the history of how segments of text have deliberately been connected to each other by the use of specific textual and media features, from clay tablets, manuscripts on parchment, and print, among others, to hyperlinks on stand...

  9. DATABASES FOR RECOGNITION OF HANDWRITTEN ARABIC CHEQUES

    NARCIS (Netherlands)

    Alohali, Y.; Cheriet, M.; Suen, C.Y.

    2004-01-01

    This paper describes an effort toward building Arabic cheque databases for research in recognition of handwritten Arabic cheques. Databases of Arabic legal amounts, Arabic subwords, courtesy amounts, Indian digits, and Arabic cheques are provided. This paper highlights the characteristics of the

  10. Beyond OCR: Handwritten manuscript attribute understanding

    NARCIS (Netherlands)

    He, Sheng

    2017-01-01

    Knowing the author, date and location of handwritten historical documents is very important for historians to completely understand and reveal the valuable information they contain. In this thesis, three attributes, namely writer, date and geographical location, are studied by analyzing the

  11. RECOGNITION AND VERIFICATION OF TOUCHING HANDWRITTEN NUMERALS

    NARCIS (Netherlands)

    Zhou, J.; Kryzak, A.; Suen, C.Y.

    2004-01-01

    In the field of financial document processing, recognition of touching handwritten numerals has been limited by lack of good benchmarking databases and low reliability of algorithms. This paper addresses the efforts toward solving the two problems. Two databases, IRIS-Bell'98 and TNIST, are

  12. A NEW APPROACH TO SEGMENT HANDWRITTEN DIGITS

    NARCIS (Netherlands)

    Oliveira, L.S.; Lethelier, E.; Bortolozzi, F.; Sabourin, R.

    2004-01-01

    This article presents a new segmentation approach applied to unconstrained handwritten digits. The novelty of the proposed algorithm is based on the combination of two types of structural features in order to provide the best segmentation path between connected entities. In this article, we first

  13. Do handwritten words magnify lexical effects in visual word recognition?

    Science.gov (United States)

    Perea, Manuel; Gil-López, Cristina; Beléndez, Victoria; Carreiras, Manuel

    2016-01-01

    An examination of how the word recognition system is able to process handwritten words is fundamental to formulate a comprehensive model of visual word recognition. Previous research has revealed that the magnitude of lexical effects (e.g., the word-frequency effect) is greater with handwritten words than with printed words. In the present lexical decision experiments, we examined whether the quality of handwritten words moderates the recruitment of top-down feedback, as reflected in word-frequency effects. Results showed a reading cost for difficult-to-read and easy-to-read handwritten words relative to printed words. But the critical finding was that difficult-to-read handwritten words, but not easy-to-read handwritten words, showed a greater word-frequency effect than printed words. Therefore, the inherent physical variability of handwritten words does not necessarily boost the magnitude of lexical effects.

  14. Cryptographic key generation using handwritten signature

    OpenAIRE

    Freire, Manuel R.; Fiérrez, Julián; Ortega-García, Javier

    2006-01-01

    M. Freire-Santos ; J. Fierrez-Aguilar ; J. Ortega-Garcia; "Cryptographic key generation using handwritten signature", Biometric Technology for Human Identification III, Proc. SPIE 6202 (April 17, 2006); doi:10.1117/12.665875. Copyright 2006 Society of Photo‑Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of...

  15. Font generation of personal handwritten Chinese characters

    Science.gov (United States)

    Lin, Jeng-Wei; Wang, Chih-Yin; Ting, Chao-Lung; Chang, Ray-I.

    2014-01-01

    Today, digital multimedia messages have drawn more and more attention due to the great achievements of computer and network techniques. Nevertheless, text is still the most popular medium for people to communicate with others. Many fonts have been developed so that product designers can choose unique fonts to demonstrate their ideas gracefully. It is commonly believed that handwriting can reflect one's personality, emotion, feeling, education level, and so on. This is especially true in Chinese calligraphy. However, it is not easy for ordinary users to customize a font of their personal handwriting. In this study, we performed a process reengineering in font generation. We present a new method to create fonts in a batch mode. Rather than creating glyphs of characters one by one according to their codepoints, people create glyphs incrementally in an on-demand manner. A Java implementation is developed to read a document image of user-handwritten Chinese characters and make a vector font of these handwritten Chinese characters. Preliminary experimental results show that the proposed method can help ordinary users create their personal handwritten fonts easily and quickly.

  16. Handwritten Digits Recognition Using Neural Computing

    Directory of Open Access Journals (Sweden)

    Călin Enăchescu

    2009-12-01

    Full Text Available In this paper we present a method for the recognition of handwritten digits and a practical implementation of this method for real-time recognition. A theoretical framework for the neural networks used to classify the handwritten digits is also presented. The classification task is performed using a Convolutional Neural Network (CNN). CNN is a special type of multi-layer neural network, trained with an optimized version of the back-propagation learning algorithm. CNN is designed to recognize visual patterns directly from pixel images with minimal preprocessing, and is capable of recognizing patterns with extreme variability (such as handwritten characters) and with robustness to distortions and simple geometric transformations. The main contributions of this paper are the original methods for increasing the efficiency of the learning algorithm by preprocessing the images before the learning process, and a method for increasing precision and performance in real-time applications by removing non-useful information from the background. By combining these strategies we have obtained an accuracy of 96.76%, using the NIST (National Institute of Standards and Technology) database as the training set.
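
    For readers who want a concrete starting point, below is a minimal LeNet-style CNN for handwritten digit classification in PyTorch. It only illustrates the technique named in the abstract; the authors' exact architecture, preprocessing, and optimized back-propagation variant are not reproduced.

    ```python
    import torch
    import torch.nn as nn

    class DigitCNN(nn.Module):
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 28x28 -> 28x28
                nn.ReLU(),
                nn.MaxPool2d(2),                            # -> 14x14
                nn.Conv2d(6, 16, kernel_size=5),            # -> 10x10
                nn.ReLU(),
                nn.MaxPool2d(2),                            # -> 5x5
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
                nn.Linear(120, 84), nn.ReLU(),
                nn.Linear(84, num_classes),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x))

    model = DigitCNN()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    batch = torch.randn(32, 1, 28, 28)    # stand-in for preprocessed digit images
    labels = torch.randint(0, 10, (32,))
    loss = loss_fn(model(batch), labels)  # one back-propagation step
    loss.backward()
    optimizer.step()
    ```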

  17. Assessment of legibility and completeness of handwritten and electronic prescriptions.

    Science.gov (United States)

    Albarrak, Ahmed I; Al Rashidi, Eman Abdulrahman; Fatani, Rwaa Kamil; Al Ageel, Shoog Ibrahim; Mohammed, Rafiuddin

    2014-12-01

    To assess the legibility and completeness of handwritten prescriptions and compare them with an electronic prescription system for medication errors. Prospective study. King Khalid University Hospital (KKUH), Riyadh, Saudi Arabia. Handwritten prescriptions were received from the clinical units of the Medicine Outpatient Department (MOPD), Primary Care Clinic (PCC) and Surgery Outpatient Department (SOPD), whereas electronic prescriptions were collected from the pediatric ward. The handwritten prescriptions were assessed for completeness using a checklist designed according to the hospital prescription, and evaluated for legibility by two pharmacists. The comparison between handwritten and electronic prescription errors was based on a validated checklist adopted from previous studies. Legibility and completeness of prescriptions. 398 prescriptions (199 handwritten and 199 e-prescriptions) were assessed. Errors were identified in 71 (35.7%) of the handwritten and 5 (2.5%) of the electronic prescriptions. A statistically significant difference between handwritten and electronic prescriptions was observed in the omitted dose and omitted route of administration categories of error distribution. The rate of completeness of patient identification in handwritten prescriptions was 80.97% in MOPD, 76.36% in PCC and 85.93% in SOPD. Completeness of the medication prescription was 91.48% in MOPD, 88.48% in PCC, and 89.28% in SOPD. This study revealed a high incidence of prescribing errors in handwritten prescriptions. The use of an e-prescription system showed a significant decline in the incidence of errors. The legibility of handwritten prescriptions was relatively good, whereas the level of completeness was very low.

  18. Big Textual Data in Transportation

    DEFF Research Database (Denmark)

    Beheshti-Kashi, Samaneh; Buch, Rasmus Brødsgaard; Lachaize, Maxime

    2018-01-01

    …applications have been converted into utilizable and meaningful insights. However, prior to this, the availability of textual sources relevant for logistics and transportation has to be examined. Accordingly, the identification of potential textual sources and their evaluation in terms of extraction barriers...

  19. Eye movements when reading sentences with handwritten words.

    Science.gov (United States)

    Perea, Manuel; Marcet, Ana; Uixera, Beatriz; Vergara-Martínez, Marta

    2016-10-17

    The examination of how we read handwritten words (i.e., the original form of writing) has typically been disregarded in the literature on reading. Previous research using word recognition tasks has shown that lexical effects (e.g., the word-frequency effect) are magnified when reading difficult handwritten words. To examine this issue in a more ecological scenario, we registered the participants' eye movements when reading handwritten sentences that varied in the degree of legibility (i.e., sentences composed of words in easy vs. difficult handwritten style). For comparison purposes, we included a condition with printed sentences. Results showed a larger reading cost for sentences with difficult handwritten words than for sentences with easy handwritten words, which in turn showed a reading cost relative to the sentences with printed words. Critically, the effect of word frequency was greater for difficult handwritten words than for easy handwritten words or printed words in the total times on a target word, but not on first-fixation durations or gaze durations. We examine the implications of these findings for models of eye movement control in reading.

  20. Slant correction for handwritten English documents

    Science.gov (United States)

    Shridhar, Malayappan; Kimura, Fumitaka; Ding, Yimei; Miller, John W. V.

    2004-12-01

    Optical character recognition of machine-printed documents is an effective means for extracting textual material. While the level of effectiveness for handwritten documents is much poorer, progress is being made in more constrained applications such as personal checks and postal addresses. In these applications a series of steps is performed for recognition, beginning with removal of skew and slant. Slant, the amount by which characters are tilted from vertical, is a characteristic of the individual writer and varies from writer to writer. The second attribute is skew, which arises from the inability of the writer to write on a horizontal line. Several methods have been proposed and discussed for average slant estimation and correction in earlier papers. However, analysis of many handwritten documents reveals that slant is a local property, and slant varies even within a word. The use of an average slant for the entire word often results in overestimation or underestimation of the local slant. This paper describes three methods for local slant estimation, namely the simple iterative method, the high-speed iterative method, and the 8-directional chain code method. The experimental results show that the proposed methods can estimate and correct local slant more effectively than average slant correction.
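
    A hedged sketch of the chain-code idea: accumulate the near-vertical 8-directional chain-code elements along stroke contours and take the angle of their resultant as the slant estimate. This is a global (average) variant, assuming a binary uint8 word image; the paper's point is precisely that local, per-region estimates do better.

    ```python
    import cv2
    import numpy as np

    def slant_from_chain_codes(binary: np.ndarray) -> float:
        """binary: uint8 image, ink = 255. Returns slant in degrees (0 = upright)."""
        contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
        dx_sum, dy_sum = 0.0, 0.0
        for contour in contours:
            pts = contour.reshape(-1, 2)
            for dx, dy in np.diff(pts, axis=0):     # unit chain-code moves
                if abs(dy) >= abs(dx) and dy != 0:  # keep near-vertical elements
                    if dy > 0:                      # orient every element upward
                        dx, dy = -dx, -dy
                    dx_sum += dx
                    dy_sum += -dy                   # image y-axis points down
        return float(np.degrees(np.arctan2(dx_sum, dy_sum))) if dy_sum else 0.0
    ```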

  1. A Proposed Arabic Handwritten Text Normalization Method

    Directory of Open Access Journals (Sweden)

    Tarik Abu-Ain

    2014-11-01

    Full Text Available Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, including slope correction, text padding, skew correction, and straightening of the writing line. In this regard, text normalization plays an important role in many procedures such as text segmentation, feature extraction and character recognition. In the present article, a new method for text baseline detection, straightening, and slant correction for Arabic handwritten texts is proposed. The method comprises a set of sequential steps: first, component segmentation is done, followed by component thinning; then, the direction features of the skeletons are extracted, and the candidate baseline regions are determined. After that, the correct baseline region is selected, and finally, the baselines of all components are aligned with the writing line. The experiments are conducted on the IFN/ENIT benchmark Arabic dataset. The results show that the proposed method has a promising and encouraging performance.
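
    For illustration only, a much simpler baseline locator than the skeleton-direction method described above: the horizontal projection profile peaks where the writing line concentrates ink. The function names and the toy image are assumptions for the sketch.

    ```python
    import numpy as np

    def baseline_row(binary: np.ndarray) -> int:
        """binary: 2-D array with ink pixels = 1. Returns the baseline row index."""
        profile = binary.sum(axis=1)      # horizontal projection profile
        return int(np.argmax(profile))    # the densest row approximates the baseline

    word = np.zeros((40, 120), dtype=np.uint8)
    word[25, :] = 1                       # toy horizontal stroke on row 25
    print(baseline_row(word))             # -> 25
    ```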

  2. Ensemble methods for handwritten digit recognition

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Liisberg, Christian; Salamon, P.

    1992-01-01

    Neural network ensembles are applied to handwritten digit recognition. The individual networks of the ensemble are combinations of sparse look-up tables (LUTs) with random receptive fields. It is shown that the consensus of a group of networks outperforms the best individual of the ensemble. It is further shown that it is possible to estimate the ensemble performance as well as the learning curve on a medium-size database. In addition the authors present preliminary analysis of experiments on a large database and show that state-of-the-art performance can be obtained using the ensemble approach by optimizing the receptive fields. It is concluded that it is possible to improve performance significantly by introducing moderate-size ensembles; in particular, a 20-25% improvement has been found. The ensemble random LUTs, when trained on a medium-size database, reach a performance (without rejects) of 94...
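
    An illustrative re-implementation, not the authors' code, of the core idea: each "network" is a set of look-up tables indexed through random receptive fields over binarized pixels, and the ensemble answers by consensus (majority vote). Table sizes and field widths are arbitrary choices for the sketch.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    class LUTNetwork:
        def __init__(self, n_pixels, n_luts=50, bits=8, n_classes=10):
            # Each LUT observes a random subset of `bits` pixel positions.
            self.fields = rng.choice(n_pixels, size=(n_luts, bits))
            self.tables = np.zeros((n_luts, 2 ** bits, n_classes))
            self.weights = 2 ** np.arange(bits)

        def _addresses(self, x):
            # Binary pixel values under each receptive field form a table address.
            return (x[self.fields] * self.weights).sum(axis=1)

        def train(self, X, y):
            rows = np.arange(len(self.fields))
            for x, label in zip(X, y):
                self.tables[rows, self._addresses(x), label] += 1

        def predict(self, x):
            votes = self.tables[np.arange(len(self.fields)), self._addresses(x)]
            return int(votes.sum(axis=0).argmax())

    def ensemble_predict(networks, x):
        """Consensus of the ensemble: majority vote over individual networks."""
        preds = [net.predict(x) for net in networks]
        return max(set(preds), key=preds.count)

    # Toy usage with random binary "images" of 784 pixels.
    X = rng.integers(0, 2, size=(200, 784))
    y = rng.integers(0, 10, size=200)
    networks = [LUTNetwork(784) for _ in range(5)]
    for net in networks:
        net.train(X, y)
    print(ensemble_predict(networks, X[0]))
    ```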

  3. [Prescription annotations in Welfare Pharmacy].

    Science.gov (United States)

    Han, Yi

    2018-03-01

    Welfare Pharmacy contains medical formulas documented by the government and official prescriptions used by the official pharmacy in the pharmaceutical process. In the last years of the Southern Song Dynasty, anonymous authors added a large number of prescription annotations, conducted textual research on the names, sources, composition and origins of the prescriptions, and supplemented important historical data on medical cases and researched historical facts. The annotations of Welfare Pharmacy gathered the essence of medical theory and can be used as precious material for correctly understanding the syndrome differentiation, compatibility regularity and clinical application of prescriptions. This article investigated in depth the style and form of the prescription annotations in Welfare Pharmacy; the names of prescriptions and the evolution of terminology; the major functions of the prescriptions; the processing methods, instructions for taking medicine and taboos of prescriptions; the medical cases and clinical efficacy of prescriptions; and the backgrounds, sources, composition and cultural meanings of prescriptions. It proposes that the prescription annotations played an active role in the textual dissemination, patent medicine production and clinical diagnosis and treatment of Welfare Pharmacy. This not only helps in understanding the changes in the names and terms of traditional Chinese medicines in Welfare Pharmacy, but also provides a basis for understanding the knowledge sources, compatibility regularity, important drug innovations and clinical medications of prescriptions in Welfare Pharmacy. Copyright© by the Chinese Pharmaceutical Association.

  4. Formal Verification of Annotated Textual Use-Cases

    Czech Academy of Sciences Publication Activity Database

    Šimko, V.; Hauzar, D.; Hnětynka, P.; Bureš, Tomáš; Plášil, F.

    2015-01-01

    Roč. 58, č. 7 (2015), s. 1495-1529 ISSN 0010-4620 Grant - others:GA AV ČR(CZ) GAP103/11/1489 Institutional support: RVO:67985807 Keywords : specification * use-cases * behavior modeling * verification * temporal logic * formalization Subject RIV: JC - Computer Hardware ; Software Impact factor: 1.000, year: 2015

  5. A novel handwritten character recognition system using gradient ...

    Indian Academy of Sciences (India)

    The issues faced by handwritten character recognition systems are the similarity … statistical/structural features have also been successfully used in character … The coordinates (xc, yc) of the centroid are calculated by equations (4) and (5).

  6. Features fusion based approach for handwritten Gujarati character recognition

    Directory of Open Access Journals (Sweden)

    Ankit Sharma

    2017-02-01

    Full Text Available Handwritten character recognition is a challenging area of research. Lots of research activities in the area of character recognition are already done for Indian languages such as Hindi, Bangla, Kannada, Tamil and Telugu. A literature review on handwritten character recognition indicates that, in comparison with other Indian scripts, research activities on Gujarati handwritten character recognition are very limited. This paper aims to bring Gujarati character recognition to attention. Recognition of isolated Gujarati handwritten characters is proposed using three different kinds of features and their fusion. Chain code based, zone based and projection profile based features are utilized as individual features. One of the significant contributions of the proposed work is the generation of a large and representative dataset of 88,000 handwritten Gujarati characters. Experiments are carried out on this developed dataset. Artificial Neural Network (ANN), Support Vector Machine (SVM) and Naive Bayes (NB) classifier based methods are implemented for handwritten Gujarati character recognition. Experimental results show substantial enhancement over the state-of-the-art and authenticate our proposals.
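
    A sketch of two of the three feature families mentioned above, zone-based features and projection profiles, fused by simple concatenation. The grid size and normalizations are assumptions for the sketch; the paper's chain-code features are omitted.

    ```python
    import numpy as np

    def zone_features(img: np.ndarray, grid=(4, 4)) -> np.ndarray:
        """Ink density in each cell of a grid laid over the character image."""
        feats = [cell.mean()
                 for band in np.array_split(img, grid[0], axis=0)
                 for cell in np.array_split(band, grid[1], axis=1)]
        return np.array(feats)

    def projection_features(img: np.ndarray) -> np.ndarray:
        """Horizontal and vertical projection profiles, normalized by total ink."""
        total = max(img.sum(), 1)
        return np.concatenate([img.sum(axis=1) / total, img.sum(axis=0) / total])

    def fused_features(img: np.ndarray) -> np.ndarray:
        """Fusion by concatenation, ready for an ANN/SVM/NB classifier."""
        return np.concatenate([zone_features(img), projection_features(img)])

    char = np.zeros((32, 32), dtype=np.uint8)
    char[8:24, 14:18] = 1                       # toy vertical stroke
    print(fused_features(char).shape)           # 16 zones + 32 + 32 = (80,)
    ```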

  7. La busqueda textual por computadora (Textual Search by Computer)

    Science.gov (United States)

    Davison, Ned J.

    1977-01-01

    Describes the use of the computer program EDIT for textual searches to locate a certain programmed word or word root. In the examples explained here, the vocabulary search is performed on poetry and allows examination of the metaphorical and conceptual poetic atmosphere achieved through word use. (Text is in Spanish.) (CHK)

  8. Handwritten Sindhi Character Recognition Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shafique Ahmed Awan

    2018-01-01

    Full Text Available OCR (Optical Character Recognition) is a technology by which machines read and interpret text from images. Work on languages containing isolated characters, such as German, English and French, is at its peak. OCR and ICR (Intelligent Character Recognition) research on Sindhi script is currently in its starting stages, and little work has been reported in this area even though the Sindhi language is rich in culture and history. This paper presents one of the initial steps in recognizing Sindhi handwritten characters. Isolated characters of Sindhi script written by the subjects have been recognized. The subjects were asked to write Sindhi characters in unconstrained form, and the written samples were collected and scanned through a flatbed scanner. The scanned documents were preprocessed with binary conversion, removal of salt-and-pepper noise, and line segmentation using the horizontal profile technique. The segmented lines were used to extract characters from the scanned pages; this character segmentation was done by vertical projection. Features were then extracted from the segmented characters so that they could be classified easily, with zoning used as the feature extraction technique. For classification, a neural network has been used. The recognized characters were converted into editable text with an average accuracy of 85%.

  9. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching", with the advantage of locating approximate
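
    A toy illustration of what a textual "extraction pattern" can look like: a surface pattern anchored on a gene name yields a candidate annotation phrase. The pattern, sentence, and printed mapping are invented for the sketch; GEANN's actual patterns and its WordNet-based semantic matching are more elaborate.

    ```python
    import re

    PATTERN = re.compile(
        r"(?P<gene>\b[A-Z][A-Za-z0-9-]+\b) is involved in (?P<process>[A-Za-z ]+)"
    )

    sentence = "BRCA1 is involved in DNA repair and cell cycle regulation."
    match = PATTERN.search(sentence)
    if match:
        # A real system would map the phrase to a GO concept via semantic matching.
        print(match.group("gene"), "->", match.group("process"))
    ```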

  10. Sunspot drawings handwritten character recognition method based on deep learning

    Science.gov (United States)

    Zheng, Sheng; Zeng, Xiangyun; Lin, Ganghua; Zhao, Cui; Feng, Yongli; Tao, Jinping; Zhu, Daoyuan; Xiong, Li

    2016-05-01

    High-accuracy recognition of the handwritten characters on scanned sunspot drawings is critically important for analyzing sunspot movement and storing the drawings in a database. This paper presents a robust deep learning method for recognizing the handwritten characters on scanned sunspot drawings. The convolutional neural network (CNN) is a deep learning algorithm that has been truly successful in training multi-layer network structures. CNN is used to train a recognition model on handwritten character images extracted from the original sunspot drawings. We demonstrate the advantages of the proposed method on sunspot drawings provided by the Yunnan Observatory of the Chinese Academy of Sciences and obtain the daily full-disc sunspot numbers and sunspot areas from the drawings. The experimental results show that the proposed method achieves a high recognition accuracy.

  11. Handwritten recognition of Tamil vowels using deep learning

    Science.gov (United States)

    Ram Prashanth, N.; Siddarth, B.; Ganesh, Anirudh; Naveen Kumar, Vaegae

    2017-11-01

    We come across a large volume of handwritten text in our daily lives, and handwritten character recognition has long been an important area of research in pattern recognition. The complexity of the task varies among languages, largely due to the similarity between characters, distinct shapes, and the number of characters, which are all language-specific properties. There have been numerous works on character recognition of English alphabets with laudable success, but regional languages have been dealt with less frequently and with lower accuracy. In this paper, we explored the performance of Deep Belief Networks in the classification of handwritten Tamil vowels, and conclusively compared the results obtained. The proposed method has shown satisfactory recognition accuracy in light of the difficulties faced with regional languages, such as similarity between characters and the minute nuances that differentiate them. We can further extend this to all Tamil characters.

  12. Handwritten Word Recognition Using Multi-view Analysis

    Science.gov (United States)

    de Oliveira, J. J.; de A. Freitas, C. O.; de Carvalho, J. M.; Sabourin, R.

    This paper contributes to the problem of efficiently recognizing handwritten words from a limited-size lexicon. For that, a multiple-classifier system has been developed that analyzes the words at three different approximation levels, in order to obtain a computational approach inspired by the human reading process. For each approximation level, a three-module architecture composed of a zoning mechanism (pseudo-segmenter), a feature extractor and a classifier is defined. The proposed application is the recognition of the Portuguese handwritten names of the months, for which a best recognition rate of 97.7% was obtained using classifier combination.

  13. Handwritten document age classification based on handwriting styles

    Science.gov (United States)

    Ramaiah, Chetan; Kumar, Gaurav; Govindaraju, Venu

    2012-01-01

    Handwriting styles are constantly changing over time. We approach the novel problem of estimating the approximate age of Historical Handwritten Documents using Handwriting styles. This system will have many applications in handwritten document processing engines where specialized processing techniques can be applied based on the estimated age of the document. We propose to learn a distribution over styles across centuries using Topic Models and to apply a classifier over weights learned in order to estimate the approximate age of the documents. We present a comparison of different distance metrics such as Euclidean Distance and Hellinger Distance within this application.
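
    The Hellinger distance mentioned in the abstract is straightforward to compute over per-document topic proportions; a small sketch with toy vectors follows.

    ```python
    import numpy as np

    def hellinger(p: np.ndarray, q: np.ndarray) -> float:
        """Distance between two probability vectors (each sums to 1)."""
        return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

    doc_a = np.array([0.7, 0.2, 0.1])   # toy per-document topic proportions
    doc_b = np.array([0.1, 0.3, 0.6])
    print(hellinger(doc_a, doc_b))      # 0 = identical styles, 1 = disjoint
    ```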

  14. WORD LEVEL DISCRIMINATIVE TRAINING FOR HANDWRITTEN WORD RECOGNITION

    NARCIS (Netherlands)

    Chen, W.; Gader, P.

    2004-01-01

    Word level training refers to the process of learning the parameters of a word recognition system based on word level criteria functions. Previously, researchers trained lexicon-driven handwritten word recognition systems at the character level individually. These systems generally use statistical

  15. Interpreting Chicken-Scratch: Lexical Access for Handwritten Words

    Science.gov (United States)

    Barnhart, Anthony S.; Goldinger, Stephen D.

    2010-01-01

    Handwritten word recognition is a field of study that has largely been neglected in the psychological literature, despite its prevalence in society. Whereas studies of spoken word recognition almost exclusively employ natural, human voices as stimuli, studies of visual word recognition use synthetic typefaces, thus simplifying the process of word…

  16. Beyond OCR : Multi-faceted understanding of handwritten document characteristics

    NARCIS (Netherlands)

    He, Sheng; Schomaker, Lambert

    Handwritten document understanding is a fundamental research problem in pattern recognition, and it relies on effective features. In this paper, we propose a joint feature distribution (JFD) principle to design novel discriminative features, which could be the joint distribution of features on

  17. ADAPTIVE CONTEXT PROCESSING IN ON-LINE HANDWRITTEN CHARACTER RECOGNITION

    NARCIS (Netherlands)

    Iwayama, N.; Ishigaki, K.

    2004-01-01

    We propose a new approach to context processing in on-line handwritten character recognition (OLCR). Based on the observation that writers often repeat the strings that they input, we take the approach of adaptive context processing (ACP). In ACP, the strings input by a writer are automatically

  18. Handwritten-word spotting using biologically inspired features

    NARCIS (Netherlands)

    van der Zant, Tijn; Schomaker, Lambert; Haak, Koen

    For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language, and collection. We propose a biologically inspired

  19. Recognition of handwritten characters using local gradient feature descriptors

    NARCIS (Netherlands)

    Surinta, Olarik; Karaaba, Mahir F.; Schomaker, Lambert R.B.; Wiering, Marco A.

    2015-01-01

    In this paper we propose to use local gradient feature descriptors, namely the scale invariant feature transform keypoint descriptor and the histogram of oriented gradients, for handwritten character recognition. The local gradient feature descriptors are used to extract feature vectors

  20. Where are the Search Engines for Handwritten Documents?

    NARCIS (Netherlands)

    van der Zant, Tijn; Schomaker, Lambert; Zinger, Svitlana; van Schie, Henny

    Although the problems of optical character recognition for contemporary printed text have been resolved, for historical printed and handwritten connected cursive text (i.e. western style writing), they have not. This does not mean that scanning historical documents is not useful. This article

  1. Where are the search engines for handwritten documents?

    NARCIS (Netherlands)

    Zant, T.; Schomaker, L.; Zinger, S.; Schie, H.

    2009-01-01

    Although the problems of optical character recognition for contemporary printed text have been resolved, for historical printed and handwritten connected cursive text (i.e. western style writing), they have not. This does not mean that scanning historical documents is not useful. This article

  2. Integrating Ontological Knowledge and Textual Evidence in Estimating Gene and Gene Product Similarity

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Posse, Christian; Gopalan, Banu; Tratz, Stephen C.; Gregory, Michelle L.

    2006-06-08

    With the rising influence of the Gene Ontology, new approaches have emerged where the similarity between genes or gene products is obtained by comparing Gene Ontology code annotations associated with them. So far, these approaches have solely relied on the knowledge encoded in the Gene Ontology and the gene annotations associated with the Gene Ontology database. The goal of this paper is to demonstrate that improvements to these approaches can be obtained by integrating textual evidence extracted from relevant biomedical literature.

  3. Annotating Document Changes

    NARCIS (Netherlands)

    Spadini, E.

    2015-01-01

    Textual scholars use collation for creating critical and genetic editions, or for studying textual transmission. Collation tools make it possible to compare the sources and detect the presence of textual variation, but they do not take into account the kind of variation involved. In this paper, we aim at

  4. A Study of Moment Based Features on Handwritten Digit Recognition

    Directory of Open Access Journals (Sweden)

    Pawan Kumar Singh

    2016-01-01

    Full Text Available Handwritten digit recognition plays a significant role in many user authentication applications in the modern world. Handwritten digits are not of the same size, thickness, style, and orientation, and these challenges must be addressed to solve the problem. A lot of work has been done for various non-Indic scripts, particularly Roman, but for Indic scripts the research is limited. This paper presents a script-invariant handwritten digit recognition system for identifying digits written in five popular scripts of the Indian subcontinent, namely Indo-Arabic, Bangla, Devanagari, Roman, and Telugu. A 130-element feature set, which is basically a combination of six different types of moments, namely geometric moment, moment invariant, affine moment invariant, Legendre moment, Zernike moment, and complex moment, has been estimated for each digit sample. Finally, the technique is evaluated on the CMATER and MNIST databases using multiple classifiers and, after performing statistical significance tests, it is observed that the Multilayer Perceptron (MLP) classifier outperforms the others. Satisfactory recognition accuracies are attained for all five scripts.
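
    As a concrete example of one moment family from the 130-element feature set, the sketch below computes geometric moments and the derived Hu moment invariants with OpenCV. The other families (affine, Legendre, Zernike, complex moments) are not shown, and the toy digit image is an assumption.

    ```python
    import cv2
    import numpy as np

    def moment_features(binary_digit: np.ndarray) -> np.ndarray:
        m = cv2.moments(binary_digit, binaryImage=True)   # geometric moments
        hu = cv2.HuMoments(m).flatten()                   # 7 Hu invariants
        # Log-scale the invariants, which span many orders of magnitude.
        return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

    digit = np.zeros((28, 28), dtype=np.uint8)
    digit[6:22, 12:16] = 1                                # toy "1"-like stroke
    print(moment_features(digit))                         # 7-element feature vector
    ```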

  5. A computational framework for converting textual clinical diagnostic criteria into the quality data model.

    Science.gov (United States)

    Hong, Na; Li, Dingcheng; Yu, Yue; Xiu, Qiongying; Liu, Hongfang; Jiang, Guoqian

    2016-10-01

    Constructing standard and computable clinical diagnostic criteria is an important but challenging research field in the clinical informatics community. The Quality Data Model (QDM) is emerging as a promising information model for standardizing clinical diagnostic criteria. To develop and evaluate automated methods for converting textual clinical diagnostic criteria into a structured format using QDM. We used a clinical Natural Language Processing (NLP) tool known as cTAKES to detect sentences and annotate events in diagnostic criteria. We developed a rule-based approach for assigning the QDM datatype(s) to an individual criterion, whereas we invoked a machine learning algorithm based on Conditional Random Fields (CRFs) for annotating attributes belonging to each particular QDM datatype. We manually developed an annotated corpus as the gold standard and used standard measures (precision, recall and f-measure) for the performance evaluation. We harvested 267 individual criteria with the datatypes of Symptom and Laboratory Test from 63 textual diagnostic criteria. We manually annotated attributes and values in 142 individual Laboratory Test criteria. The average performance of our rule-based approach was 0.84 precision, 0.86 recall, and 0.85 f-measure; the performance of the CRFs-based classification was 0.95 precision, 0.88 recall and 0.91 f-measure. We also implemented a web-based tool that automatically translates textual Laboratory Test criteria into the QDM XML template format. The results indicate that our approaches leveraging cTAKES and CRFs are effective in facilitating diagnostic criteria annotation and classification. Our NLP-based computational framework is a feasible and useful solution for developing diagnostic criteria representation and computerization. Copyright © 2016 Elsevier Inc. All rights reserved.
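
    A toy sketch of the rule-based step described above, assigning a QDM datatype to an individual criterion from lexical cues. The keyword lists and labels are illustrative assumptions, not the paper's actual rules.

    ```python
    import re

    RULES = [
        (re.compile(r"\b(serum|plasma|level|mg/dL|mmol/L|count)\b", re.I),
         "Laboratory Test"),
        (re.compile(r"\b(pain|fever|nausea|fatigue|headache)\b", re.I),
         "Symptom"),
    ]

    def assign_qdm_datatype(criterion: str) -> str:
        for pattern, datatype in RULES:
            if pattern.search(criterion):
                return datatype
        return "Unclassified"

    print(assign_qdm_datatype("Serum creatinine level > 1.5 mg/dL"))  # Laboratory Test
    print(assign_qdm_datatype("Persistent fever for 5 days"))         # Symptom
    ```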

  6. The Instructional Text as a Textual Genre

    Directory of Open Access Journals (Sweden)

    Adiane Fogali Marinello

    2011-07-01

    Full Text Available This article analyses the instructional text as a textual genre and is part of the research project Reading and text production from the textual genre perspective, conducted at Universidade de Caxias do Sul, Campus Universitário da Região dos Vinhedos. Firstly, some theoretical assumptions about textual genre are presented; then the instructional text is characterized. After that, an instructional text is analyzed and, finally, some activities related to reading and writing in this genre, directed at high school and university students, are suggested.

  7. Annotated bibliography

    International Nuclear Information System (INIS)

    1997-08-01

    Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography, which contains almost 100 citations of articles/books/resources on topics related to the communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators, Public Relations Society of America, Chemical Manufacturers Association), consulted past years' proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports and case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners.

  8. Inventions on presenting textual items in Graphical User Interface

    OpenAIRE

    Mishra, Umakant

    2014-01-01

    Although a GUI largely replaces textual descriptions by graphical icons, the textual items are not completely removed. The textual items are inevitably used in window titles, message boxes, help items, menu items and popup items. Textual items are necessary for communicating messages that are beyond the limitation of graphical messages. However, it is necessary to harness the textual items on the graphical interface in such a way that they complement each other to produce the best effect. One...

  9. Incremental Tensor Principal Component Analysis for Handwritten Digit Recognition

    Directory of Open Access Journals (Sweden)

    Chang Liu

    2014-01-01

    Full Text Available To overcome the shortcomings of traditional dimensionality reduction algorithms, an incremental tensor principal component analysis (ITPCA) algorithm based on an updated-SVD technique is proposed in this paper. This paper proves the relationship between PCA, 2DPCA, MPCA, and the graph embedding framework theoretically, and derives in detail the incremental learning procedures for adding a single sample and multiple samples. The experiments on handwritten digit recognition have demonstrated that ITPCA achieves better recognition performance than vector-based principal component analysis (PCA), incremental principal component analysis (IPCA), and multilinear principal component analysis (MPCA) algorithms. At the same time, ITPCA also has lower time and space complexity.
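
    The incremental-learning idea can be illustrated with scikit-learn's IncrementalPCA, the vector-based analogue of the tensor method in the abstract; the paper's ITPCA instead updates an SVD over tensor modes. Batch sizes and dimensions below are arbitrary.

    ```python
    import numpy as np
    from sklearn.decomposition import IncrementalPCA

    rng = np.random.default_rng(0)
    ipca = IncrementalPCA(n_components=16)

    # Samples arrive in batches; partial_fit updates the learned subspace
    # without revisiting earlier data, which is the point of incremental PCA.
    for _ in range(10):
        batch = rng.random((64, 28 * 28))   # stand-in for flattened digit images
        ipca.partial_fit(batch)

    features = ipca.transform(rng.random((5, 28 * 28)))
    print(features.shape)                    # (5, 16) feature vectors
    ```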

  10. Multi-script handwritten character recognition : Using feature descriptors and machine learning

    NARCIS (Netherlands)

    Surinta, Olarik

    2016-01-01

    Handwritten character recognition plays an important role in transforming raw visual image data obtained from handwritten documents using for example scanners to a format which is understandable by a computer. It is an important application in the field of pattern recognition, machine learning and

  11. IMPROVEMENT IN HANDWRITTEN NUMERAL STRING RECOGNITION BY SLANT NORMALIZATION AND CONTEXTUAL INFORMATION

    NARCIS (Netherlands)

    Britto jr., A. de S.; Sabourin, R.; Lethelier, E.; Bortolozzi, F.; Suen, C.Y.

    2004-01-01

    This work describes a way of enhancing handwritten numeral string recognition by considering slant normalization and contextual information to train an implicit segmentation-based system. A word slant normalization method is modified in order to improve the results for handwritten numeral strings.

  12. Pictorial, Textual, and Picto-Textual Glosses in E-Reading: A Comparative Study

    Science.gov (United States)

    Shalmani, Hamed Babaie; Sabet, Masoud Khalili

    2010-01-01

    This research explored the effects of three types of multimedia glosses on the reading comprehension of learners in an EFL context. From among the three experimental groups under study, one received treatment on five academic reading passages through picto-textual glosses where both textual definitions and relevant images of words popped up, thus…

  13. Second Language Incidental Vocabulary Learning: The Effect of Online Textual, Pictorial, and Textual Pictorial Glosses

    Science.gov (United States)

    Shahrokni, Seyyed Abdollah

    2009-01-01

    This empirical study investigates the effect of online textual, pictorial, and textual pictorial glosses on the incidental vocabulary learning of 90 adult elementary Iranian EFL learners. The participants were selected from a pool of 140 volunteers based on their performance on an English placement test as well as a knowledge test of the target…

  14. MPEG-7 based video annotation and browsing

    Science.gov (United States)

    Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens

    2003-11-01

    The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user interface in order to provide content-based access to the video stream, and also for media browsing on a streaming server.
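
    As a stand-in for the MPEG-7 visual descriptors the authors use for cut detection, the sketch below applies the classic frame-to-frame color-histogram difference; the descriptor choice, the threshold, and the file name are assumptions.

    ```python
    import cv2

    def detect_cuts(video_path: str, threshold: float = 0.5) -> list[int]:
        """Return frame indices where consecutive histograms decorrelate."""
        cap = cv2.VideoCapture(video_path)
        cuts, prev_hist, idx = [], None, 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                                [0, 256, 0, 256, 0, 256])
            hist = cv2.normalize(hist, hist).flatten()
            if prev_hist is not None:
                # Low correlation between consecutive histograms => likely cut.
                if cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < threshold:
                    cuts.append(idx)
            prev_hist, idx = hist, idx + 1
        cap.release()
        return cuts

    # cuts = detect_cuts("equestrian.mp4")  # hypothetical input video
    ```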

  15. Modeling the lexical morphology of Western handwritten signatures.

    Directory of Open Access Journals (Sweden)

    Moises Diaz-Cabrera

    Full Text Available A handwritten signature is the final response to a complex cognitive and neuromuscular process which is the result of the learning process. Because of the many factors involved in signing, it is possible to study the signature from many points of view: graphologists, forensic experts, neurologists and computer vision experts have all examined them. Researchers study written signatures for psychiatric, penal, health and automatic verification purposes. As a potentially useful, multi-purpose study, this paper is focused on the lexical morphology of handwritten signatures. This we understand to mean the identification, analysis, and description of the signature structures of a given signer. In this work we analyze different public datasets involving 1533 signers from different Western geographical areas. Some relevant characteristics of signature lexical morphology have been selected, examined in terms of their probability distribution functions and modeled through a General Extreme Value distribution. This study suggests some useful models for multi-disciplinary sciences which depend on handwriting signatures.
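
    Fitting a Generalized Extreme Value distribution to a morphological measurement, as the abstract proposes, is a one-liner with SciPy; the stroke-length data below are synthetic stand-ins.

    ```python
    import numpy as np
    from scipy.stats import genextreme

    rng = np.random.default_rng(0)
    stroke_lengths = rng.gumbel(loc=5.0, scale=1.2, size=1533)  # toy morphology data

    shape, loc, scale = genextreme.fit(stroke_lengths)
    print(f"GEV fit: shape={shape:.3f}, loc={loc:.3f}, scale={scale:.3f}")

    # The fitted model then answers questions such as: how likely is a
    # stroke longer than 8 units?
    print(1.0 - genextreme.cdf(8.0, shape, loc=loc, scale=scale))
    ```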

  16. Script-independent text line segmentation in freestyle handwritten documents.

    Science.gov (United States)

    Li, Yi; Zheng, Yefeng; Doermann, David; Jaeger, Stefan

    2008-08-01

    Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component based methods ([1], [2], for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise.
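
    A simplified sketch of the first stage described above: estimate a text-line density map by anisotropic smoothing of the ink image, then label line regions. A plain threshold stands in for the paper's level set evolution, purely for illustration.

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter, label

    def text_line_regions(binary: np.ndarray):
        """binary: 2-D array, ink = 1. Returns (label map, number of lines)."""
        # Anisotropic smoothing: wide along rows, narrow across them, so ink
        # of one line merges while neighbouring lines stay separated.
        density = gaussian_filter(binary.astype(float), sigma=(3, 25))
        mask = density > density.mean()
        return label(mask)

    page = np.zeros((100, 300), dtype=np.uint8)
    page[20, :] = page[60, :] = 1              # two toy "text lines"
    labels, n_lines = text_line_regions(page)
    print(n_lines)                              # -> 2
    ```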

  17. Spatio-textual search on Spark

    OpenAIRE

    Kloster, Tord

    2017-01-01

    The amount of spatially aware data is growing at a rapid rate, and the challenges of both processing and organizing such data are in focus in the scientific world and in industry. But spatial data seldom exists alone; it is usually accompanied by some form of textual property. The challenges increase as we attempt to process the spatio-textual documents that are created, and the use of Big Data platforms becomes a necessity. This paper provides an insight into different approaches on how to meet the spat...

  18. Automating generation of textual class definitions from OWL to English.

    Science.gov (United States)

    Stevens, Robert; Malone, James; Williams, Sandra; Power, Richard; Third, Allan

    2011-05-17

    …developers of EFO are sufficiently satisfied with the output that the generated definitions have been incorporated into EFO. Whilst not a substitute for hand-written textual definitions, our generated definitions are a useful starting point. An on-line version of the NLG text definition tool can be found at http://swat.open.ac.uk/tools/. The questionnaire and sample generated text definitions may be found at http://mcs.open.ac.uk/nlg/SWAT/bio-ontologies.html.

  19. Boosting bonsai trees for handwritten/printed text discrimination

    Science.gov (United States)

    Ricquebourg, Yann; Raymond, Christian; Poirriez, Baptiste; Lemaitre, Aurélie; Coüasnon, Bertrand

    2013-12-01

    Boosting over decision stumps has proved its efficiency in Natural Language Processing, essentially with symbolic features, and its good properties (fast; few, non-critical parameters; not sensitive to over-fitting) could be of great interest in the numeric world of pixel images. In this article we investigated the use of boosting over small decision trees, in image classification processing, for the discrimination of handwritten/printed text. We then conducted experiments comparing it to the usual SVM-based classification, revealing convincing results: very close performance, but with faster predictions, and behaving far less as a black box. These promising results encourage the use of this classifier in more complex recognition tasks such as multiclass problems.
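
    Boosting over small ("bonsai") trees is easy to reproduce with scikit-learn's AdaBoost on shallow decision trees; the random features and two-class labels below are placeholders for the paper's pixel-image features.

    ```python
    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((500, 64))       # placeholder features for text patches
    y = rng.integers(0, 2, 500)     # 0 = printed, 1 = handwritten (toy labels)

    clf = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=2),  # a small "bonsai" tree
        n_estimators=200,           # (older scikit-learn: base_estimator=...)
    )
    clf.fit(X, y)
    print(clf.predict(X[:5]))
    ```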

  20. A Novel Handwritten Letter Recognizer Using Enhanced Evolutionary Neural Network

    Science.gov (United States)

    Mahmoudi, Fariborz; Mirzashaeri, Mohsen; Shahamatnia, Ehsan; Faridnia, Saed

    This paper introduces a novel design for handwritten letter recognition that employs a hybrid back-propagation neural network with an enhanced evolutionary algorithm. The network is fed using a new approach that is invariant to translation, rotation, and scaling of input letters. The evolutionary algorithm is used for the global search of the search space and the back-propagation algorithm is used for the local search. The results have been computed by implementing this approach for recognizing 26 English capital letters in the handwriting of different people. The computational results show that the neural network reaches very satisfying results with relatively scarce input data, and a promising improvement in the convergence of the hybrid evolutionary back-propagation algorithm is exhibited.

  1. Ancient administrative handwritten documents: X-ray analysis and imaging

    International Nuclear Information System (INIS)

    Albertin, F.; Astolfo, A.; Stampanoni, M.; Peccenini, Eva; Hwu, Y.; Kaplan, F.; Margaritondo, G.

    2015-01-01

    The heavy-element content of ink in ancient administrative documents makes it possible to detect the characters with different synchrotron imaging techniques, based on attenuation or refraction. This is the first step in the direction of non-interactive virtual X-ray reading. Handwritten characters in administrative antique documents from three centuries have been detected using different synchrotron X-ray imaging techniques. Heavy elements in ancient inks, present even for everyday administrative manuscripts as shown by X-ray fluorescence spectra, produce attenuation contrast. In most cases the image quality is good enough for tomography reconstruction in view of future applications to virtual page-by-page ‘reading’. When attenuation is too low, differential phase contrast imaging can reveal the characters from refractive index effects. The results are potentially important for new information harvesting strategies, for example from the huge Archivio di Stato collection, objective of the Venice Time Machine project

  2. Ancient administrative handwritten documents: X-ray analysis and imaging

    Energy Technology Data Exchange (ETDEWEB)

    Albertin, F., E-mail: fauzia.albertin@epfl.ch [Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland); Astolfo, A. [Paul Scherrer Institut (PSI), Villigen (Switzerland); Stampanoni, M. [Paul Scherrer Institut (PSI), Villigen (Switzerland); ETHZ, Zürich (Switzerland); Peccenini, Eva [University of Ferrara (Italy); Technopole of Ferrara (Italy); Hwu, Y. [Academia Sinica, Taipei, Taiwan (China); Kaplan, F. [Ecole Polytechnique Fédérale de Lausanne (EPFL) (Switzerland); Margaritondo, G. [Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland)

    2015-01-30

    The heavy-element content of ink in ancient administrative documents makes it possible to detect the characters with different synchrotron imaging techniques, based on attenuation or refraction. This is the first step in the direction of non-interactive virtual X-ray reading. Handwritten characters in administrative antique documents from three centuries have been detected using different synchrotron X-ray imaging techniques. Heavy elements in ancient inks, present even for everyday administrative manuscripts as shown by X-ray fluorescence spectra, produce attenuation contrast. In most cases the image quality is good enough for tomography reconstruction in view of future applications to virtual page-by-page ‘reading’. When attenuation is too low, differential phase contrast imaging can reveal the characters from refractive index effects. The results are potentially important for new information harvesting strategies, for example from the huge Archivio di Stato collection, objective of the Venice Time Machine project.

  3. Textual Enhancement of Input: Issues and Possibilities

    Science.gov (United States)

    Han, ZhaoHong; Park, Eun Sung; Combs, Charles

    2008-01-01

    The input enhancement hypothesis proposed by Sharwood Smith (1991, 1993) has stimulated considerable research over the last 15 years. This article reviews the research on textual enhancement of input (TE), an area where the majority of input enhancement studies have aggregated. Methodological idiosyncrasies are the norm of this body of research.…

  4. Transcription of Spanish Historical Handwritten Documents with Deep Neural Networks

    Directory of Open Access Journals (Sweden)

    Emilio Granell

    2018-01-01

    Full Text Available The digitization of historical handwritten document images is important for the preservation of cultural heritage. Moreover, the transcription of text images obtained from digitization is necessary to provide efficient information access to the content of these documents. Handwritten Text Recognition (HTR) has become an important research topic in the areas of image and computational language processing that allows us to obtain transcriptions from text images. State-of-the-art HTR systems are, however, far from perfect. One difficulty is that they have to cope with image noise and handwriting variability. Another difficulty is the presence of a large amount of Out-Of-Vocabulary (OOV) words in ancient historical texts. A solution to this problem is to use external lexical resources, but such resources might be scarce or unavailable given the nature and the age of such documents. This work proposes a solution to avoid this limitation. It consists of associating a powerful optical recognition system that will cope with image noise and variability, with a language model based on sub-lexical units that will model OOV words. Such a language modeling approach reduces the size of the lexicon while increasing the lexicon coverage. Experiments are first conducted on the publicly available Rodrigo dataset, which contains the digitization of an ancient Spanish manuscript, with a recognizer based on Hidden Markov Models (HMMs). They show that sub-lexical units outperform word units in terms of Word Error Rate (WER), Character Error Rate (CER) and OOV word accuracy rate. This approach is then applied to deep net classifiers, namely Bi-directional Long Short-Term Memory (BLSTMs) and Convolutional Recurrent Neural Nets (CRNNs). Results show that CRNNs outperform HMMs and BLSTMs, reaching the lowest WER and CER for this image dataset and significantly improving OOV recognition.
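
    The error metrics reported above are edit-distance ratios, so they are easy to make concrete. Below is a minimal Python sketch of WER and CER computation, for illustration only; the toy Spanish strings are invented and not drawn from the Rodrigo dataset.

```python
# Minimal sketch: computing WER and CER with Levenshtein edit distance,
# the metrics used to compare HTR hypotheses against reference transcripts.
def edit_distance(ref, hyp):
    """Dynamic-programming Levenshtein distance between two sequences."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)]

def wer(reference, hypothesis):
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    return edit_distance(reference, hypothesis) / len(reference)

print(wer("en el nombre de dios", "en el nonbre de dios"))  # 0.2
print(cer("en el nombre de dios", "en el nonbre de dios"))  # 0.05
```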

  5. Annotating images by mining image search results.

    Science.gov (United States)

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
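
    The abstract mentions mapping high-dimensional visual features into hash codes for real-time search, but does not specify the hashing scheme. The sketch below assumes one common choice, random-hyperplane hashing for cosine similarity, purely to illustrate the idea; the dimensions and bit count are invented.

```python
# Sketch of random-hyperplane hashing: mapping high-dimensional visual
# feature vectors to short binary codes so that similar images collide.
import numpy as np

rng = np.random.default_rng(0)
DIM, BITS = 128, 32                        # feature size, code length (assumed)
planes = rng.standard_normal((BITS, DIM))  # random hyperplanes

def hash_code(feature):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return (planes @ feature > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

query = rng.standard_normal(DIM)
near = query + 0.1 * rng.standard_normal(DIM)   # visually similar image
far = rng.standard_normal(DIM)                  # unrelated image
print(hamming(hash_code(query), hash_code(near)))  # small
print(hamming(hash_code(query), hash_code(far)))   # around BITS / 2
```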

  6. Annotation: an effective device for student feedback: a critical review of the literature.

    Science.gov (United States)

    Ball, Elaine C

    2010-05-01

    The paper examines hand-written annotation, its many features, difficulties and strengths as a feedback tool. It extends and clarifies what modest evidence is in the public domain and offers an evaluation of how to use annotation effectively in the support of student feedback [Marshall, C.M., 1998a. The Future of Annotation in a Digital (paper) World. Presented at the 35th Annual GLSLIS Clinic: Successes and Failures of Digital Libraries, June 20-24, University of Illinois at Urbana-Champaign, March 24, pp. 1-20; Marshall, C.M., 1998b. Toward an ecology of hypertext annotation. Hypertext. In: Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, June 20-24, Pittsburgh Pennsylvania, US, pp. 40-49; Wolfe, J.L., Nuewirth, C.M., 2001. From the margins to the centre: the future of annotation. Journal of Business and Technical Communication, 15(3), 333-371; Diyanni, R., 2002. One Hundred Great Essays. Addison-Wesley, New York; Wolfe, J.L., 2002. Marginal pedagogy: how annotated texts affect writing-from-source texts. Written Communication, 19(2), 297-333; Liu, K., 2006. Annotation as an index to critical writing. Urban Education, 41, 192-207; Feito, A., Donahue, P., 2008. Minding the gap annotation as preparation for discussion. Arts and Humanities in Higher Education, 7(3), 295-307; Ball, E., 2009. A participatory action research study on handwritten annotation feedback and its impact on staff and students. Systemic Practice and Action Research, 22(2), 111-124; Ball, E., Franks, H., McGrath, M., Leigh, J., 2009. Annotation is a valuable tool to enhance learning and assessment in student essays. Nurse Education Today, 29(3), 284-291]. Although a significant number of studies examine annotation, this is largely related to on-line tools and computer mediated communication and not hand-written annotation as comment, phrase or sign written on the student essay to provide critique. Little systematic research has been conducted to consider how this latter form

  7. A GRU-based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition

    OpenAIRE

    Zhang, Jianshu; Du, Jun; Dai, Lirong

    2017-01-01

    In this study, we present a novel end-to-end approach based on the encoder-decoder framework with the attention mechanism for online handwritten mathematical expression recognition (OHMER). First, the input two-dimensional ink trajectory information of handwritten expression is encoded via the gated recurrent unit based recurrent neural network (GRU-RNN). Then the decoder is also implemented by the GRU-RNN with a coverage-based attention model. The proposed approach can simultaneously accompl...

  8. On writing legibly: Processing fluency systematically biases evaluations of handwritten material

    OpenAIRE

    Greifeneder, Rainer; Alt, Alexander; Bottenberg, Konstantin; Seele, Tim; Zelt, Sarah; Wagener, Dietrich

    2010-01-01

    Evaluations of handwritten essays or exams are often suspected of being biased, such as by mood states or individual predilections. Although most of these influences are unsystematic, at least one bias is problematic because it systematically affects evaluations of handwritten materials. Three experiments revealed that essays in legible as compared to less legible handwriting were evaluated more positively. This robust finding was related to a basic judgmental mechanism that builds on the flu...

  9. Segmentation-Based And Segmentation-Free Methods for Spotting Handwritten Arabic Words

    OpenAIRE

    Ball , Gregory R.; Srihari , Sargur N.; Srinivasan , Harish

    2006-01-01

    Given a set of handwritten documents, a common goal is to search for a relevant subset. Attempting to find a query word or image in such a set of documents is called word spotting. Spotting handwritten words in documents written in the Latin alphabet, and more recently in Arabic, has received considerable attention. One issue is generating candidate word regions on a page. Attempting to definitely segment the document into such regions (automatic segmentation) can mee...

  10. Comparison of concept recognizers for building the Open Biomedical Annotator

    Directory of Open Access Journals (Sweden)

    Rubin Daniel

    2009-09-01

    Full Text Available Abstract The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers – NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service oriented applications. Based on our analysis we also suggest areas of potential improvements for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.

  11. Emotion models for textual emotion classification

    Science.gov (United States)

    Bruna, O.; Avetisyan, H.; Holub, J.

    2016-11-01

    This paper deals with textual emotion classification, which has gained attention in recent years. Emotion classification is used in user experience, product evaluation, national security, and tutoring applications. It attempts to detect the emotional content in the input text and, based on different approaches, to establish what kind of emotional content is present, if any. Textual emotion classification is the most difficult to handle, since it relies mainly on linguistic resources and it introduces many challenges to the assignment of text to an emotion represented by a proper model. A crucial part of each emotion detector is its emotion model. The focus of this paper is to introduce the emotion models used for classification. Categorical and dimensional models of emotion are explained and some more advanced approaches are mentioned.

  12. The Proximate Unit in Chinese Handwritten Character Production

    Directory of Open Access Journals (Sweden)

    Jenn-Yeu Chen

    2013-08-01

    Full Text Available In spoken word production, a proximate unit is the first phonological unit at the sublexical level that is selectable for production (O’Seaghdha, Chen, & Chen, 2010). The present study investigated whether the proximate unit in Chinese handwritten word production is the stroke, the radical, or something in between. A written version of the form preparation task was adopted. Chinese participants learned sets of two-character words, later were cued with the first character of each word, and had to write down the second character (the target). Response times were measured from the onset of a cue character to the onset of a written response. In Experiment 1, the target characters within a block shared (homogeneous) or did not share (heterogeneous) the first stroke. In Experiment 2, the first two strokes were shared in the homogeneous blocks. Response times in the homogeneous blocks and in the heterogeneous blocks were comparable in both experiments (Exp. 1: 687 ms vs. 684 ms; Exp. 2: 717 vs. 716). In Experiments 3 and 4, the target characters within a block shared or did not share the first radical. Response times in the homogeneous blocks were significantly faster than those in the heterogeneous blocks (Exp. 3: 685 vs. 704; Exp. 4: 594 vs. 650). In Experiments 5 and 6, the shared component was a Gestalt-like form that is more than a stroke, constitutes a portion of the target character, can be a stand-alone character itself, and can be a radical of another character but is not a radical of the target character (e.g., 士 in 聲, 鼓, 穀, 款); this component is called a logographeme. Response times in the homogeneous blocks were significantly faster than those in the heterogeneous blocks (Exp. 5: 576 vs. 625; Exp. 6: 586 vs. 620). These results suggest a model of Chinese handwritten character production in which the stroke is not a functional unit, the radical plays the role of a morpheme, and the logographeme is the proximate unit.

  13. ASM Based Synthesis of Handwritten Arabic Text Pages

    Directory of Open Access Journals (Sweden)

    Laslo Dinges

    2015-01-01

    Full Text Available Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods that have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM based representations are composed into words and text pages, smoothed by B-Spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis related methods on synthetic samples whenever no sufficient natural ground truthed data is available.

  14. ASM Based Synthesis of Handwritten Arabic Text Pages.

    Science.gov (United States)

    Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed

    2015-01-01

    Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods that have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM based representations are composed into words and text pages, smoothed by B-Spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis related methods on synthetic samples whenever no sufficient natural ground truthed data is available.
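
    Both records above name B-Spline interpolation as the smoothing step before rendering. The following is a hedged SciPy sketch of that operation, using an invented polyline in place of an ASM-composed stroke and an assumed smoothing factor.

```python
# Sketch: smoothing a synthesized pen trajectory with a B-spline, as the
# synthesis pipeline does before rendering. Points and the smoothing
# factor are illustrative, not taken from the paper.
import numpy as np
from scipy.interpolate import splprep, splev

# A jagged polyline standing in for an ASM-composed character contour.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.2, 0.8, 1.5, 0.9, 1.1])

tck, u = splprep([x, y], s=0.5)     # fit a smoothing cubic B-spline
u_fine = np.linspace(0, 1, 100)     # resample the curve densely
x_s, y_s = splev(u_fine, tck)       # smooth trajectory ready for rendering
print(len(x_s), x_s[0], y_s[0])
```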

  15. Optical character recognition of handwritten Arabic using hidden Markov models

    Science.gov (United States)

    Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.

    2011-04-01

    The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm, which finds the most probable sequence of sub-words. The model was implemented to recognize sub-word units of Arabic text, so that the recognition rate is no longer tied to the worst recognition rate of any single character but reflects the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
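
    The recognition step hinges on Viterbi decoding over the HMM. Below is a minimal tabular Viterbi sketch; the toy transition and emission tables merely stand in for the corpus-derived and feature-derived matrices described above.

```python
# Minimal Viterbi decoder over an HMM, of the kind used here to pick the
# most probable sub-word sequence. Tables are toy values for illustration.
import numpy as np

states = ["sw1", "sw2"]               # hypothetical sub-word states
start = np.log([0.6, 0.4])
trans = np.log([[0.7, 0.3],
                [0.4, 0.6]])          # transition matrix (from a corpus)
emit = np.log([[0.9, 0.1],            # emission matrix (from features)
               [0.2, 0.8]])
obs = [0, 1, 0]                       # observed feature symbols

v = start + emit[:, obs[0]]
back = []
for o in obs[1:]:
    scores = v[:, None] + trans       # every previous-to-current move
    back.append(scores.argmax(axis=0))
    v = scores.max(axis=0) + emit[:, o]

path = [int(v.argmax())]
for bp in reversed(back):
    path.append(int(bp[path[-1]]))
path.reverse()
print([states[s] for s in path])      # most probable state sequence
```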

  16. WATERSHED ALGORITHM BASED SEGMENTATION FOR HANDWRITTEN TEXT IDENTIFICATION

    Directory of Open Access Journals (Sweden)

    P. Mathivanan

    2014-02-01

    Full Text Available In this paper we develop a system for writer identification which involves four processing steps: preprocessing, segmentation, feature extraction, and writer identification using a neural network. In the preprocessing phase the handwritten text is subjected to a slant removal process for segmentation and feature extraction. After this step the text image enters the process of noise removal and gray level conversion. The preprocessed image is further segmented by using a morphological watershed algorithm, where the text lines are segmented into single words and then into single letters. Features are extracted from the segmented image by the Daubechies 5/3 integer wavelet transform to reduce training complexity [1, 6]. This process is lossless and reversible [10], [14]. These extracted features are given as input to our neural network for the writer identification process, and a target image is selected for each training process in the 2-layer neural network. The several trained outputs obtained from the different targets help in text identification. The system performs multilingual text analysis and provides simple and efficient text segmentation.
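
    The watershed step above can be illustrated with scikit-image. This is a hedged sketch on a synthetic binary image; the marker-selection heuristics and parameters are illustrative, not the paper's.

```python
# Sketch of watershed-based segmentation of a binarized text image into
# regions, in the spirit of the morphological step described above.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

# Synthetic binary image: two "letters" as filled blobs.
img = np.zeros((60, 120), dtype=bool)
img[20:40, 10:40] = True
img[20:40, 70:110] = True

distance = ndi.distance_transform_edt(img)   # distance to the background
coords = peak_local_max(distance, min_distance=20, labels=ndi.label(img)[0])
markers = np.zeros(img.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
labels = watershed(-distance, markers, mask=img)  # flood from the peaks
print(labels.max(), "segments found")             # expect 2 for these blobs
```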

  17. Handwritten Javanese Character Recognition Using Several Artificial Neural Network Methods

    Directory of Open Access Journals (Sweden)

    Gregorius Satia Budhi

    2015-07-01

    Full Text Available Javanese characters are traditional characters that are used to write the Javanese language. The Javanese language is a language used by many people on the island of Java, Indonesia. The use of Javanese characters is diminishing more and more because of the difficulty of studying the Javanese characters themselves. The Javanese character set consists of basic characters, numbers, complementary characters, and so on. In this research we have developed a system to recognize Javanese characters. Input for the system is a digital image containing several handwritten Javanese characters. Preprocessing and segmentation are performed on the input image to get each character. For each character, feature extraction is done using the ICZ-ZCZ method. The output from feature extraction will become input for an artificial neural network. We used several artificial neural networks, namely a bidirectional associative memory network, a counterpropagation network, an evolutionary network, a backpropagation network, and a backpropagation network combined with chi2. From the experimental results it can be seen that the combination of chi2 and backpropagation achieved better recognition accuracy than the other methods.
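
    ICZ-ZCZ zoning features are commonly described as averaging, per zone, the distances from the image centroid (ICZ) and from the zone's own centroid (ZCZ) to the zone's foreground pixels. The sketch below follows that common description as an assumption, with an assumed 4x4 zone grid; it is not the paper's exact implementation.

```python
# Hedged sketch of ICZ-ZCZ feature extraction for a binary character image.
import numpy as np

def icz_zcz(binary_img, grid=(4, 4)):
    h, w = binary_img.shape
    ys, xs = np.nonzero(binary_img)
    img_cy, img_cx = ys.mean(), xs.mean()          # image centroid
    feats = []
    zh, zw = h // grid[0], w // grid[1]
    for i in range(grid[0]):
        for j in range(grid[1]):
            zone = binary_img[i*zh:(i+1)*zh, j*zw:(j+1)*zw]
            zy, zx = np.nonzero(zone)
            if len(zy) == 0:
                feats += [0.0, 0.0]                # empty zone
                continue
            zy, zx = zy + i*zh, zx + j*zw          # global coordinates
            icz = np.hypot(zy - img_cy, zx - img_cx).mean()
            zcz = np.hypot(zy - zy.mean(), zx - zx.mean()).mean()
            feats += [icz, zcz]
    return np.array(feats)                         # two features per zone

img = np.zeros((32, 32), dtype=np.uint8)
img[8:24, 14:18] = 1                               # a vertical stroke
print(icz_zcz(img).shape)                          # (32,) for a 4x4 grid
```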

  18. Recognizing textual entailment models and applications

    CERN Document Server

    Dagan, Ido; Sammons, Mark

    2013-01-01

    In the last few years, a number of NLP researchers have developed and participated in the task of Recognizing Textual Entailment (RTE). This task encapsulates Natural Language Understanding capabilities within a very simple interface: recognizing when the meaning of a text snippet is contained in the meaning of a second piece of text. This simple abstraction of an exceedingly complex problem has broad appeal partly because it can be conceived also as a component in other NLP applications, from Machine Translation to Semantic Search to Information Extraction. It also avoids commitment to any sp

  19. Beyond Sleep – a case study on textual instability and challenges for textual scholarship

    NARCIS (Netherlands)

    Kegel, P.W.

    2016-01-01

    Although W.F. Hermans’ novel Nooit meer slapen (Beyond Sleep) has been published within the multi-volume Complete Works edition already in 2010, the work remains a challenge for textual scholarship. First published in 1966, it has been revised by the author many times. Next to that, the actual

  20. Annotating image ROIs with text descriptions for multimodal biomedical document retrieval

    Science.gov (United States)

    You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-01-01

    Regions of interest (ROIs) that are pointed to by overlaid markers (arrows, asterisks, etc.) in biomedical images are expected to contain more important and relevant information than other regions for biomedical article indexing and retrieval. We have developed several algorithms that localize and extract the ROIs by recognizing markers on images. Cropped ROIs then need to be annotated with contents describing them best. In most cases accurate textual descriptions of the ROIs can be found from figure captions, and these need to be combined with image ROIs for annotation. The annotated ROIs can then be used to, for example, train classifiers that separate ROIs into known categories (medical concepts), or to build visual ontologies, for indexing and retrieval of biomedical articles. We propose an algorithm that pairs visual and textual ROIs that are extracted from images and figure captions, respectively. This algorithm based on dynamic time warping (DTW) clusters recognized pointers into groups, each of which contains pointers with identical visual properties (shape, size, color, etc.). Then a rule-based matching algorithm finds the best matching group for each textual ROI mention. Our method yields a precision and recall of 96% and 79%, respectively, when ground truth textual ROI data is used.
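
    The clustering step rests on a dynamic time warping distance between pointer profiles. A minimal DTW sketch follows; the 1-D toy sequences stand in for the visual-property profiles of recognized pointers.

```python
# Minimal dynamic time warping distance, the measure used above to group
# pointers with similar visual properties.
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

arrow_a = [0, 1, 3, 4, 3, 1, 0]      # one pointer's contour profile
arrow_b = [0, 1, 2, 4, 4, 2, 1, 0]   # a similar pointer, different length
asterisk = [2, 0, 2, 0, 2, 0, 2]
print(dtw(arrow_a, arrow_b))         # small: same cluster
print(dtw(arrow_a, asterisk))        # larger: different cluster
```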

  1. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations.

  2. Comparison of crisp and fuzzy character networks in handwritten word recognition

    Science.gov (United States)

    Gader, Paul; Mohamed, Magdi; Chiang, Jung-Hsien

    1992-01-01

    Experiments involving handwritten word recognition on words taken from images of handwritten address blocks from the United States Postal Service mailstream are described. The word recognition algorithm relies on the use of neural networks at the character level. The neural networks are trained using crisp and fuzzy desired outputs. The fuzzy outputs were defined using a fuzzy k-nearest neighbor algorithm. The crisp networks slightly outperformed the fuzzy networks at the character level but the fuzzy networks outperformed the crisp networks at the word level.
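
    The fuzzy desired outputs mentioned above come from a fuzzy k-nearest-neighbor rule. Below is a hedged sketch of the classic Keller-style membership computation; the data, k, and the fuzzifier m are invented for illustration.

```python
# Hedged sketch of fuzzy k-NN class memberships, the kind of soft target
# the fuzzy character networks were trained against.
import numpy as np

def fuzzy_knn_membership(x, X_train, y_train, n_classes, k=3, m=2.0):
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]                        # k nearest neighbors
    w = 1.0 / np.maximum(d[idx], 1e-9) ** (2.0 / (m - 1.0))
    u = np.zeros(n_classes)
    for i, wi in zip(idx, w):
        u[y_train[i]] += wi                        # weight vote by closeness
    return u / u.sum()                             # soft membership per class

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(fuzzy_knn_membership(np.array([1.5, 1.5]), X, y, n_classes=2))
```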

  3. Detection of Text Lines of Handwritten Arabic Manuscripts using Markov Decision Processes

    Directory of Open Access Journals (Sweden)

    Youssef Boulid

    2016-09-01

    Full Text Available In character recognition systems, the segmentation phase is critical, since the accuracy of the recognition depends strongly on it. In this paper we present an approach based on Markov Decision Processes to extract text lines from binary images of Arabic handwritten documents. The proposed approach detects the connected components belonging to the same line by making use of knowledge about the features and arrangement of those components. The initial results show that the system is promising for extracting Arabic handwritten lines.

  4. Dealing with behavioral ambiguity in textual process descriptions

    NARCIS (Netherlands)

    van der Aa, Han; Leopold, Henrik; Reijers, Hajo A.

    2016-01-01

    Textual process descriptions are widely used in organizations since they can be created and understood by virtually everyone. The inherent ambiguity of natural language, however, impedes the automated analysis of textual process descriptions. While human readers can use their context knowledge to

  5. Textual Condensation in Printed Dictionaries. A Theoretical Draft ...

    African Journals Online (AJOL)

    This article presents an excerpt from a theory of lexicographic texts which deals particularly with dictionary articles. Almost all characteristics of dictionary articles considered as typically lexicographic may be regarded as results of textual condensation processes. A theory of textual condensation in lexicography thus makes it ...

  6. Towards a textual history of La Galatea

    Directory of Open Access Journals (Sweden)

    Juan Montero

    2010-12-01

    Full Text Available Studying the different printings and editions of La Galatea allows one to establish the work’s textual history. The present study seeks to accomplish this, following this pattern: (a) the princeps of 1585; (b) early printings from 1590-1618 and from the book’s editorial resurgence from 1736-1846; (c) the editions from 1863 until today and the foundations of a critical edition. The following conclusions are proposed: (a) the princeps is the closest text to the author’s intention, but contains errors that any editor should identify and correct, if possible; (b) the editions that date from 1590-1846 include useful readings in that endeavor; (c) modern editions of La Galatea have erred in general in an exaggerated fidelity to the princeps, with the exception of the one by C. Rosell in 1863, in which other printings were taken into account and the text was emended, even excessively.

  7. Antiabecedarian Desires: Odd Narratology and Digital Textuality

    Directory of Open Access Journals (Sweden)

    Asunción López-Varela Azcárate

    2014-05-01

    Full Text Available Writing systems break temporal barriers and make it possible to share and preserve knowledge. As if they were living organisms, the narratological structures that shape textual communication grow from principles of order and forms of encoding whose roots reach back to a proto-alphabet of Semitic origin. Literary history, however, includes many examples that, like viruses, have sought to break down the body of alphabetic textuality. This essay studies three key artists, James Joyce, Jorge Luis Borges, and William Burroughs, together with several contemporary works of electronic literature. All of them call alphabetic organizing principles into question and anticipate the debate on whether linear structures matter in systems of representation.

  8. Device of Definition of Hand-Written Documents Belonging to One Executor

    Directory of Open Access Journals (Sweden)

    S. D. Kulik

    2012-03-01

    Full Text Available Results are presented from the development of a device that determines whether handwritten documents in Russian belong to a given writer. The device is intended to automate the work of experts and helps to solve problems of information security and the search for criminals.

  9. Students' Perceived Preference for Visual and Auditory Assessment with E-Handwritten Feedback

    Science.gov (United States)

    Crews, Tena B.; Wilkinson, Kelly

    2010-01-01

    Undergraduate business communication students were surveyed to determine their perceived most effective method of assessment on writing assignments. The results indicated students' preference for a process that incorporates visual, auditory, and e-handwritten presentation via a tablet PC. Students also identified this assessment process would…

  10. Comparing Postsecondary Marketing Student Performance on Computer-Based and Handwritten Essay Tests

    Science.gov (United States)

    Truell, Allen D.; Alexander, Melody W.; Davis, Rodney E.

    2004-01-01

    The purpose of this study was to determine if there were differences in postsecondary marketing student performance on essay tests based on test format (i.e., computer-based or handwritten). Specifically, the variables of performance, test completion time, and gender were explored for differences based on essay test format. Results of the study…

  11. An adaptive deep Q-learning strategy for handwritten digit recognition.

    Science.gov (United States)

    Qiao, Junfei; Wang, Gongming; Li, Wenjing; Chen, Min

    2018-02-22

    Handwritten digit recognition has been a challenging problem in recent years. Although many deep learning-based classification algorithms have been studied for handwritten digit recognition, the recognition accuracy and running time still need to be further improved. In this paper, an adaptive deep Q-learning strategy is proposed to improve accuracy and shorten running time for handwritten digit recognition. The adaptive deep Q-learning strategy combines the feature-extracting capability of deep learning and the decision-making of reinforcement learning to form an adaptive Q-learning deep belief network (Q-ADBN). First, Q-ADBN extracts the features of original images using an adaptive deep auto-encoder (ADAE), and the extracted features are considered as the current states of the Q-learning algorithm. Second, Q-ADBN receives a Q-function (reward signal) during recognition of the current states, and the final handwritten digit recognition is implemented by maximizing the Q-function using the Q-learning algorithm. Finally, experimental results from the well-known MNIST dataset show that the proposed Q-ADBN has a superiority over other similar methods in terms of accuracy and running time. Copyright © 2018 Elsevier Ltd. All rights reserved.
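
    The decision-making component rests on the standard Q-learning update. The sketch below shows that update in plain tabular form on a toy chain environment; the paper itself maximizes a Q-function over auto-encoder features rather than table entries.

```python
# The Q-learning update rule in its plain tabular form, for intuition only.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

def step(s, a):
    """Toy environment: action 1 moves right; being in state 4 pays 1."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, 1.0 if s2 == n_states - 1 else 0.0

for _ in range(500):
    s = 0
    for _ in range(20):
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2, r = step(s, a)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
print(Q.argmax(axis=1))   # learned policy: move right everywhere
```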

  12. Textual emotion recognition for enhancing enterprise computing

    Science.gov (United States)

    Quan, Changqin; Ren, Fuji

    2016-05-01

    The growing interest in affective computing (AC) brings many valuable research topics that can meet different application demands in enterprise systems. The present study explores a sub-area of AC techniques: textual emotion recognition for enhancing enterprise computing. Multi-label emotion recognition in text is able to provide a more comprehensive understanding of emotions than single-label emotion recognition. A representation of 'emotion state in text' is proposed to encompass the multidimensional emotions in text. It ensures a formal description of the configurations of basic emotions as well as of the relations between them. Our method allows recognition of emotions for words that bear indirect emotions, for emotion ambiguity, and for multiple emotions. We further investigate the effect of word order on emotional expression by comparing the performance of a bag-of-words model and a sequence model for multi-label sentence emotion recognition. The experiments show that the classification results under the sequence model are better than under the bag-of-words model, and a homogeneous Markov model showed promising results for multi-label sentence emotion recognition. This emotion recognition system provides a convenient way to acquire valuable emotion information and to improve enterprise competitive ability in many respects.

  13. Extracting meronomy relations from domain-specific, textual corporate databases

    NARCIS (Netherlands)

    Ittoo, R.A.; Bouma, G.; Maruster, L.; Wortmann, J.C.; Hopfe, C.J.; Rezgui, Y.; Métais, E.; Preece, A.; Li, H.

    2010-01-01

    Various techniques for learning meronymy relationships from open-domain corpora exist. However, extracting meronymy relationships from domain-specific, textual corporate databases has been overlooked, despite numerous application opportunities particularly in domains like product development and/or

  14. The Effects of Handwritten Feedback on Paper and Tablet PC in Learning Japanese Writing

    Directory of Open Access Journals (Sweden)

    Kai LI

    2007-12-01

    Full Text Available This paper compares the effect of paper-based handwritten feedback (PBHF) and that of Tablet PC-based handwritten feedback (TBHF) in learning Japanese writing. The study contributes to the research on motivation, usability and presence when learners are given different media-based handwritten error feedback. The results indicated that there was little difference in the effect of the two media on motivation and usability factors. However, PBHF showed a more positive effect on the presence factor than TBHF. Also, there was little difference in proficiency improvement after the students reviewed different media-based handwritten feedback. The results of this study suggest that language teachers should not use ICT with traditional strategies, but in an innovative way to improve their writing instruction and enhance learners’ writing proficiency.

  15. Segmentation of Arabic Handwritten Documents into Text Lines using Watershed Transform

    Directory of Open Access Journals (Sweden)

    Abdelghani Souhar

    2017-12-01

    Full Text Available A crucial task in character recognition systems is the segmentation of the document into text lines, especially if it is handwritten. When dealing with a non-Latin document such as Arabic, the challenge becomes greater since, in addition to the variability of writing, the presence of diacritical points and the high number of ascender and descender characters complicates the segmentation process further. To remedy this complexity, and even to turn this difficulty into an advantage given that the focus is on the Arabic language, which is semi-cursive in nature, a method based on the Watershed Transform technique is proposed. Tested on the «Handwritten Arabic Proximity Datasets», a segmentation rate of 93% is achieved for a matching score of 95%.

  16. Marker Registration Technique for Handwritten Text Marker in Augmented Reality Applications

    Science.gov (United States)

    Thanaborvornwiwat, N.; Patanukhom, K.

    2018-04-01

    Marker registration is a fundamental process for estimating camera poses in marker-based Augmented Reality (AR) systems. We developed an AR system that creates corresponding virtual objects on handwritten text markers. This paper presents a new registration method that is robust to low-content text markers, variation of camera poses, and variation of handwriting styles. The proposed method uses Maximally Stable Extremal Regions (MSER) and polygon simplification for feature point extraction. The experiment shows that we need to extract only five feature points per image, which provides the best registration results. An exhaustive search is used to find the best matching pattern of the feature points in two images. We also compared the performance of the proposed method to some existing registration methods and found that the proposed method provides better accuracy and time efficiency.
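
    The two building blocks named above, MSER detection and polygon simplification, are both available in OpenCV. A hedged sketch follows; the five-point selection and exhaustive matching of the paper are not reproduced, and the rendered text is a synthetic stand-in for a handwritten marker.

```python
# Sketch: MSER region detection plus polygon simplification with OpenCV.
import cv2
import numpy as np

# Synthetic "handwritten marker": dark strokes on a light background.
img = np.full((120, 200), 255, dtype=np.uint8)
cv2.putText(img, "note", (20, 80), cv2.FONT_HERSHEY_SIMPLEX, 2, 0, 5)

mser = cv2.MSER_create()
regions, _ = mser.detectRegions(img)          # stable extremal regions

for pts in regions[:3]:
    hull = cv2.convexHull(pts.reshape(-1, 1, 2))
    eps = 0.02 * cv2.arcLength(hull, True)    # simplification tolerance
    poly = cv2.approxPolyDP(hull, eps, True)  # fewer, more stable vertices
    print(len(pts), "pixels ->", len(poly), "polygon points")
```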

  17. Recognition of Handwritten Arabic words using a neuro-fuzzy network

    International Nuclear Information System (INIS)

    Boukharouba, Abdelhak; Bennia, Abdelhak

    2008-01-01

    We present a new method for the recognition of handwritten Arabic words based on a neuro-fuzzy hybrid network. As a first step, connected components (CCs) of black pixels are detected. Then the system determines which CCs are sub-words and which are stress marks. The stress marks are then isolated and identified separately, and the sub-words are segmented into graphemes. Each grapheme is described by topological and statistical features. Fuzzy rules are extracted from training examples by a hybrid learning scheme comprising two phases: a rule generation phase from data using fuzzy c-means, and a rule parameter tuning phase using gradient descent learning. After learning, the network encodes in its topology the essential design parameters of a fuzzy inference system. The contribution of this technique is shown through the significant tests performed on a handwritten Arabic words database.

  18. Handwritten dynamics assessment through convolutional neural networks: An application to Parkinson's disease identification.

    Science.gov (United States)

    Pereira, Clayton R; Pereira, Danilo R; Rosa, Gustavo H; Albuquerque, Victor H C; Weber, Silke A T; Hook, Christian; Papa, João P

    2018-04-16

    Parkinson's disease (PD) is considered a degenerative disorder that affects the motor system, which may cause tremors, micrography, and the freezing of gait. Although PD is related to the lack of dopamine, the triggering process of its development is not fully understood yet. In this work, we introduce convolutional neural networks to learn features from images produced by handwritten dynamics, which capture different information during the individual's assessment. Additionally, we make available a dataset composed of images and signal-based data to foster the research related to computer-aided PD diagnosis. The proposed approach was compared against raw data and texture-based descriptors, showing suitable results, mainly in the context of early-stage detection, with results near 95%. The analysis of handwritten dynamics using deep learning techniques proved useful for automatic Parkinson's disease identification, and it can outperform handcrafted features. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters

    Directory of Open Access Journals (Sweden)

    Mithun Biswas

    2017-06-01

    Full Text Available BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. This dataset contains 84 different characters comprising 50 Bangla basic characters, 10 Bangla numerals and 24 selected compound characters. 2000 handwriting samples for each of the 84 characters were collected, digitized and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and the gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.

  20. Handwritten Character Recognition Based on the Specificity and the Singularity of the Arabic Language

    Directory of Open Access Journals (Sweden)

    Youssef Boulid

    2017-08-01

    Full Text Available A good Arabic handwritten recognition system must consider the characteristics of Arabic letters, which can be explicit, such as the presence of diacritics, or implicit, such as the baseline information (a virtual line on which cursive text is aligned and/or joined). In order to find an adequate method of feature extraction, we have taken into consideration the nature of Arabic characters. The paper investigates two methods based on two different visions: one describes the image in terms of the distribution of pixels, and the other describes it in terms of local patterns. Spatial Distribution of Pixels (SDP) is used according to the first vision, whereas Local Binary Patterns (LBP) are used for the second one. Tested on the Arabic portion of the Isolated Farsi Handwritten Character Database (IFHCDB) and using neural networks as a classifier, SDP achieves a recognition rate of around 94% while LBP achieves a recognition rate of about 96%.
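
    A uniform LBP histogram is the usual way such "local pattern" descriptors are built. The sketch below uses scikit-image with common parameter choices (P=8, R=1), which are assumptions and not necessarily those of the paper; the random image stands in for a real character glyph.

```python
# Sketch of a uniform LBP histogram descriptor for a character image.
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
char_img = rng.integers(0, 256, (32, 32)).astype(np.uint8)  # stand-in glyph

P, R = 8, 1                                  # neighbors, radius (assumed)
lbp = local_binary_pattern(char_img, P, R, method="uniform")
hist, _ = np.histogram(lbp, bins=np.arange(P + 3), density=True)
print(hist.shape)   # (P + 2,) = a 10-bin descriptor fed to the classifier
```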

  1. Contribution to automatic handwritten characters recognition. Application to optical moving characters recognition

    International Nuclear Information System (INIS)

    Gokana, Denis

    1986-01-01

    This paper describes a research work on computer-aided vision relating to the design of a vision system which can recognize isolated handwritten characters written on a mobile support. We use a technique which consists in analyzing the information contained in the contours of the polygon circumscribed to the character's shape. These contours are segmented and labelled to give a new set of features constituted by right and left 'profiles' and by topological and algebraic invariant properties. A new method of character recognition induced from this representation, based on a multilevel hierarchical technique, is then described. At the primary level, we use a fuzzy classification with a dynamic programming technique using 'profiles'. The other levels adjust the recognition by using the topological and algebraic invariant properties. Several results are presented and an accuracy of 99% was reached for handwritten numeral characters, thereby attesting the robustness of our algorithm. (author) [fr]

  2. A study of symbol segmentation method for handwritten mathematical formula recognition using mathematical structure information

    OpenAIRE

    Toyozumi, Kenichi; Yamada, Naoya; Kitasaka, Takayuki; Mori, Kensaku; Suenaga, Yasuhito; Mase, Kenji; Takahashi, Tomoichi

    2004-01-01

    Symbol segmentation is very important in handwritten mathematical formula recognition, since it is the very first portion of the recognition process. This paper proposes a new symbol segmentation method using mathematical structure information. The base technique of symbol segmentation employed in the existing methods is dynamic programming, which optimizes the overall results of individual symbol recognition. The new method we propose here...

  3. HWNet v2: An Efficient Word Image Representation for Handwritten Documents

    OpenAIRE

    Krishnan, Praveen; Jawahar, C. V.

    2018-01-01

    We present a framework for learning an efficient holistic representation for handwritten word images. The proposed method uses a deep convolutional neural network with a traditional classification loss. The major strengths of our work lie in: (i) the efficient usage of synthetic data to pre-train a deep network, (ii) an adapted version of the ResNet-34 architecture with region of interest pooling (referred to as HWNet v2) which learns discriminative features with variable sized word images, and (iii) rea...

  4. [About da tai - abortion in old Chinese folk medicine handwritten manuscripts].

    Science.gov (United States)

    Zheng, Jinsheng

    2013-01-01

    Of 881 Chinese handwritten volumes with medical texts of the 17th through mid-20th century held by Staatsbibliothek zu Berlin and Ethnologisches Museum Berlin-Dahlem, 48 volumes include prescriptions for induced abortion. A comparison shows that these records are significantly different from references to abortion in Chinese printed medical texts of pre-modern times. For example, the percentage of recipes recommended for artificial abortions in handwritten texts is significantly higher than in printed medical books. Authors of handwritten texts used 25 terms to designate artificial abortion, with the term da tai [see text], lit.: "to strike the fetus", occurring most frequently. Its meaning is well defined, in contrast to other terms used, such as duo tai [see text], lit.: "to make a fetus fall", xia tai [see text], lit.: "to bring a fetus down", and duan chan [see text], lit.: "to interrupt birthing", which is mostly used to indicate a temporary or permanent sterilization. Pre-modern Chinese medicine has not generally abstained from inducing abortions; physicians showed a differentiating attitude. While abortions were described as "things a [physician with an attitude of] humaneness will not do", in case a pregnancy was seen as too risky for a woman she was offered medication to terminate this pregnancy. The commercial application of abortifacients has been recorded in China since ancient times. Demand for such services has continued over time for various reasons, including so-called illegitimate pregnancies, and those of nuns, widows and prostitutes. In general, recipes to induce abortions documented in printed medical literature have mild effects and are to be ingested orally. In comparison, those recommended in handwritten texts are rather toxic. Possibly to minimize the negative side-effects of such medication, practitioners of folk medicine developed mechanical devices to perform "external", i.e., vaginal, approaches.

  5. Spotting handwritten words and REGEX using a two stage BLSTM-HMM architecture

    Science.gov (United States)

    Bideault, Gautier; Mioulet, Luc; Chatelain, Clément; Paquet, Thierry

    2015-01-01

    In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of the state-of-the-art BLSTM (Bidirectional Long Short-Term Memory) neural network for recognizing and segmenting characters, coupled with an HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.

  6. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  7. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  8. Multi-digit handwritten sindhi numerals recognition using som neural network

    International Nuclear Information System (INIS)

    Chandio, A.A.; Jalbani, A.H.; Awan, S.A.

    2017-01-01

    In this research paper a multi-digit Sindhi handwritten numerals recognition system using a SOM Neural Network is presented. Handwritten digit recognition is one of the challenging tasks, and a lot of research has been carried out over many years. Remarkable work has been done on the recognition of isolated handwritten characters as well as digits in many languages like English, Arabic, Devanagari, Chinese, Urdu and Pashto. However, the literature reviewed does not show any remarkable work done for Sindhi numerals recognition. The recognition of Sindhi digits is a difficult task due to the various writing styles and different font sizes. Therefore, SOM (Self-Organizing Map), a NN (Neural Network) method, is used, which can recognize digits with various writing styles and different font sizes. Only one sample is required to train the network for each pair of multi-digit numerals. A database consisting of 4000 samples of multi-digits, each containing only two digits from 10-50, and other matching numerals, was collected from 50 users, and the experimental results of the proposed method show that an accuracy of 86.89% is achieved. (author)
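
    For intuition, here is a minimal sketch of the SOM update loop. The random vectors stand in for pixel features of multi-digit numeral images, and the grid size and decay schedules are invented, not taken from the paper.

```python
# Minimal self-organizing map training loop.
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 6, 6, 64                  # map size, input dimension
W = rng.random((grid_h, grid_w, dim))           # codebook vectors
ys, xs = np.mgrid[0:grid_h, 0:grid_w]

X = rng.random((200, dim))                      # stand-in numeral features
for t, x in enumerate(np.tile(X, (5, 1))):
    lr = 0.5 * np.exp(-t / 500)                 # decaying learning rate
    sigma = 3.0 * np.exp(-t / 500)              # shrinking neighborhood
    d = np.linalg.norm(W - x, axis=2)
    by, bx = np.unravel_index(d.argmin(), d.shape)   # best-matching unit
    h = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
    W += lr * h[..., None] * (x - W)            # pull neighborhood toward x

d = np.linalg.norm(W - X[0], axis=2)
print("BMU of first sample:", np.unravel_index(d.argmin(), d.shape))
```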

  9. Development of an optical character recognition pipeline for handwritten form fields from an electronic health record.

    Science.gov (United States)

    Rasmussen, Luke V; Peissig, Peggy L; McCarty, Catherine A; Starren, Justin

    2012-06-01

    Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms. We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms. The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%. While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
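
    The reported figures follow directly from counting accepted fields. Below is a sketch of the parallel-engine idea, accepting a field only when both engines agree and scoring with PPV and sensitivity; the engine outputs are fabricated stand-ins, not actual results from Nuance or LEADTOOLS.

```python
# Sketch: agreement-based acceptance of parallel OCR outputs, scored with
# positive predictive value (PPV) and sensitivity.
truth =    ["OD 20/40", "OS 20/20", "IOP 14", "OD 20/60", "OS 20/25"]
engine_a = ["OD 20/40", "OS 20/28", "IOP 14", None,       "OS 20/25"]
engine_b = ["OD 20/40", "OS 20/20", "IOP 14", None,       "OS 20/26"]

accepted = [(a, t) for a, b, t in zip(engine_a, engine_b, truth)
            if a is not None and a == b]        # both engines agree
tp = sum(1 for a, t in accepted if a == t)
fp = len(accepted) - tp
ppv = tp / (tp + fp)                            # correctness of accepted fields
sensitivity = tp / len(truth)                   # coverage of all fields
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f}")
```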

  10. Segmentation of Handwritten Chinese Character Strings Based on improved Algorithm Liu

    Directory of Open Access Journals (Sweden)

    Zhihua Cai

    2014-09-01

    Full Text Available Algorithm Liu attracts high attention because of its high accuracy in the segmentation of Japanese postal addresses. But its disadvantages, such as complexity and difficult implementation, have an adverse effect on its popularization and application. In this paper, based on a deep study of algorithm Liu, the author applies its principles to handwritten Chinese character segmentation according to the characteristics of handwritten Chinese characters. At the same time, the author puts forward judgment criteria for segmentation block classification and for the adhering modes of handwritten Chinese characters. In the process of segmentation, text images are seen as a sequence of Connected Components (CCs), where a connected component consists of several horizontal runs of black pixels in the image. The author determines whether these parts should be merged into one segment by analyzing the connected components, and then performs image segmentation by adhering mode based on the analysis of outline edges. Finally, the text images are cut into character segments. Experimental results show that the improved algorithm Liu obtains high segmentation accuracy and produces a satisfactory segmentation result.
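
    The first step described above, viewing the text image as a sequence of connected components, is easy to illustrate with SciPy; the merging and adhering-mode analysis of the full algorithm are not reproduced here.

```python
# Sketch: extracting connected components from a binary text image.
import numpy as np
from scipy import ndimage as ndi

img = np.zeros((20, 60), dtype=np.uint8)
img[5:15, 5:15] = 1                    # first "character"
img[5:15, 25:40] = 1                   # second, wider component
img[8:12, 50:55] = 1                   # a small fragment

labels, n = ndi.label(img)             # label 4-connected components
boxes = ndi.find_objects(labels)       # one bounding slice per component
for k, (rs, cs) in enumerate(boxes, 1):
    print(f"CC {k}: rows {rs.start}-{rs.stop}, cols {cs.start}-{cs.stop}")
print(n, "connected components")
```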

  11. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2010-09-14

    The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation; the annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".

  12. Aligning ontologies and integrating textual evidence for pathway analysis of microarray data

    Energy Technology Data Exchange (ETDEWEB)

    Gopalan, Banu; Posse, Christian; Sanfilippo, Antonio P.; Stenzel-Poore, Mary; Stevens, S.L.; Castano, Jose; Beagley, Nathaniel; Riensche, Roderick M.; Baddeley, Bob; Simon, R.P.; Pustejovsky, James

    2006-10-08

    Expression arrays are introducing a paradigmatic change in biology by shifting experimental approaches from single gene studies to genome-level analysis, monitoring the expression levels of several thousands of genes in parallel. The massive amounts of data obtained from microarrays need to be integrated and interpreted to infer biological meaning within the context of information-rich pathways. In this paper, we present a methodology that integrates textual information with annotations from cross-referenced ontologies to map genes to pathways in a semi-automated way. We illustrate this approach and compare it favorably to other tools by analyzing the gene expression changes underlying the biological phenomena related to stroke. Stroke is the third leading cause of death and a major disabler in the United States. Through years of study, researchers have amassed a significant knowledge base about stroke, and this knowledge, coupled with new technologies, is providing a wealth of new scientific opportunities. The potential for neuroprotective stroke therapy is enormous. However, the roles of neurogenesis, angiogenesis, and other proliferative responses in the recovery process following ischemia, and the molecular mechanisms that lead to these processes, still need to be uncovered. Improved annotation of genomic and proteomic data, including annotation of pathways in which genes and proteins are involved, is required to facilitate their interpretation and clinical application. While our approach is not aimed at replacing existing curated pathway databases, it reveals multiple hidden relationships that are not evident with the way these databases analyze functional groupings of genes from the Gene Ontology.

  13. Recognizing Textual Entailment: Challenges in the Portuguese Language

    Directory of Open Access Journals (Sweden)

    Gil Rocha

    2018-03-01

    Full Text Available Recognizing textual entailment comprises the task of determining semantic entailment relations between text fragments. A text fragment entails another text fragment if, from the meaning of the former, one can infer the meaning of the latter. If such a relation is bidirectional, then we are in the presence of a paraphrase. Automatically recognizing textual entailment relations captures major semantic inference needs in several natural language processing (NLP) applications. As in many NLP tasks, textual entailment corpora for English abound, while the same is not true for more resource-scarce languages such as Portuguese. Exploiting what seems to be the only Portuguese corpus for textual entailment and paraphrases (the ASSIN corpus), in this paper we address the task of automatically recognizing textual entailment (RTE) and paraphrases from text written in the Portuguese language, by employing supervised machine learning techniques. We employ lexical, syntactic and semantic features, and analyze the impact of using semantic-based approaches on the performance of the system. We then try to take advantage of the bi-dialect nature of ASSIN to compensate for its limited size. With the same aim, we explore modeling the task of recognizing textual entailment and paraphrases as a binary classification problem, by considering the bidirectional nature of paraphrases as entailment relationships. Addressing the task as a multi-class classification problem, we achieve results in line with the winner of the ASSIN Challenge. In addition, we conclude that semantic-based approaches are promising in this task, and that combining data from European and Brazilian Portuguese is less straightforward than it may initially seem. The binary classification modeling of the problem does not seem to bring advantages over the original multi-class model, despite the outstanding results obtained by the binary classifier for recognizing textual entailments.
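
    As a toy illustration of the two formulations described above (and not the paper's actual system), the sketch below trains a multi-class and a binary classifier on invented sentence pairs with crude lexical-overlap features; real ASSIN systems use much richer lexical, syntactic and semantic features.

```python
# Hedged sketch of multi-class vs. binary RTE modeling; data and features
# are invented placeholders, not the ASSIN corpus or the paper's features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def overlap_features(text, hypothesis):
    """Crude lexical features; real systems add syntactic/semantic ones."""
    t, h = set(text.lower().split()), set(hypothesis.lower().split())
    inter = len(t & h)
    return [inter / max(len(h), 1),   # hypothesis coverage
            inter / max(len(t), 1),   # text coverage
            abs(len(t) - len(h))]     # length difference

# Toy pairs labelled None / Entailment / Paraphrase, as in ASSIN.
pairs = [("a man is playing a guitar", "a man plays an instrument", "Entailment"),
         ("the cat sleeps on the mat", "the stock market fell", "None"),
         ("she bought a red car", "a red car was bought by her", "Paraphrase")]
X = np.array([overlap_features(t, h) for t, h, _ in pairs])
y_multi = [label for _, _, label in pairs]

clf_multi = LogisticRegression(max_iter=1000).fit(X, y_multi)

# Binary variant: a paraphrase is bidirectional entailment, so both
# Entailment and Paraphrase collapse into the positive class.
y_bin = [label != "None" for label in y_multi]
clf_bin = LogisticRegression(max_iter=1000).fit(X, y_bin)
```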

  14. Students' Perceptions of the Usefulness of an E-Book with Annotative and Sharing Capabilities as a Tool for Learning: A Case Study

    Science.gov (United States)

    Lim, Ee-Lon; Hew, Khe Foon

    2014-01-01

    E-books offer a range of benefits to both educators and students, including ease of accessibility and searching capabilities. However, the majority of current e-books are repository-cum-delivery platforms of textual information. Hitherto, there is a lack of empirical research that examines e-books with annotative and sharing capabilities. This…

  15. Annotation: The Savant Syndrome

    Science.gov (United States)

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  16. Annotating Emotions in Meetings

    NARCIS (Netherlands)

    Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.

    We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete

  17. Reasoning with Annotations of Texts

    OpenAIRE

    Ma , Yue; Lévy , François; Ghimire , Sudeep

    2011-01-01

    International audience; Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  18. TRANS-TEXTUALIZATION AND CARNIVALIZATION IN "WHISTLER," BY ONDJAKI

    Directory of Open Access Journals (Sweden)

    Karine Miranda Campos

    2013-04-01

    Full Text Available This article examines the phenomena of carnivalization and transtextuality in the novel The Whistler by the Angolan writer Ondjaki. The theoretical analysis comprises Bakhtin's theory of carnivalization and its importance for the social subversion of the monologic discourse established by official bodies, and Gérard Genette's theory of transtextuality, which points to five possible textual relationships. The understanding of the theories of carnivalization and transtextuality pervades the concepts of animism and taboo presented in the theory of Sigmund Freud.

  19. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  20. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2017-11-09

    In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list of tags under a limited number (e.g., 3) that covers as much of the image's information as possible. The tags in such a short list should be representative and diverse: they must not only correspond to the contents of the image but also differ from each other. To this end, we treat image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. In addition, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider representation and ignore diversity. We therefore propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.
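
    As a rough illustration of DPP-based subset selection, the sketch below greedily picks a small, diverse tag subset; the tags, relevance scores and similarity matrix are invented, and the paper's learned conditional DPP and sampling constraints are not reproduced.

```python
# Hedged sketch (not the authors' code): selecting a small, diverse tag
# subset by greedily maximizing the determinant of a DPP kernel.
import numpy as np

tags = ["dog", "puppy", "grass", "outdoor"]
relevance = np.array([0.9, 0.8, 0.6, 0.5])       # how well each tag fits the image
similarity = np.array([[1.0, 0.9, 0.1, 0.2],     # semantic similarity between tags
                       [0.9, 1.0, 0.1, 0.2],
                       [0.1, 0.1, 1.0, 0.4],
                       [0.2, 0.2, 0.4, 1.0]])

# DPP L-kernel: tag quality on the diagonal, similarity as repulsion.
L = np.outer(relevance, relevance) * similarity

def greedy_dpp(L, k):
    """Greedy MAP inference: add the tag giving the largest determinant gain."""
    selected = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for i in range(len(L)):
            if i in selected:
                continue
            idx = selected + [i]
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = i, det
        selected.append(best)
    return selected

print([tags[i] for i in greedy_dpp(L, k=2)])  # picks "dog" plus a dissimilar tag
```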

  2. SECRETARIAT AND TEXTUAL PRODUCTION: ARGUMENTATION IN THE TEXTUAL GENRE LETTER OF DECLARATION

    Directory of Open Access Journals (Sweden)

    Erivaldo Pereira do Nascimento

    2012-05-01

    Full Text Available This article aims at describing the semantic-argumentative structure of the textual/discursive genre Letter of Declaration, one of the documents with which the executive secretary frequently deals. This investigation is based on the Argumentation Theory in Language proposed by Ducrot (1988, 1987). We also used the studies on discursive modalization proposed by Koch (2002), Castilho and Castilho (1993) and Nascimento (2005), among others. Modalization is considered here as a semantic-argumentative strategy, as it enables the speaker to make a statement or to express a point of view about the content of his/her enunciation, according to the interlocution. This study of the aforementioned genre is qualitative, quantitative and descriptive. The corpus is composed of 20 (twenty) Letters of Declaration issued by different organizations or private and public institutions. We found that, in the Letters of Declaration analysed, argumentation is achieved through the use of modalizers and argumentative operators, used by the speaker to produce different effects of meaning in the texts.

  3. The Comparison of Typed and Handwritten Essays of Iranian EFL Students in terms of Length, Spelling, and Grammar

    Directory of Open Access Journals (Sweden)

    Behrouz Sarbakhshian

    2016-11-01

    Full Text Available This study attempted to compare typed and handwritten essays of Iranian EFL students in terms of length, spelling, and grammar. To administer the study, the researchers utilized the Alice Touch Typing Tutor software to select 15 upper-intermediate students with higher typing ability, each of whom wrote two essays: one typed and the other handwritten. The students were both males and females between the ages of 22 and 35. The analyses of the students' scores on the three criteria, through three paired-samples t-tests, indicate that typed essays are significantly better than handwritten ones in terms of text length and grammatical mistakes, but not significantly different in spelling mistakes. The positive effects of typing provide a logical reason for students, especially TOEFL applicants, to spend more time acquiring typing skill, and for teachers to encourage students with higher typing ability to choose the typed format for their essays.

  4. Textual appropriation in engineering master's theses: a preliminary study.

    Science.gov (United States)

    Eckel, Edward J

    2011-09-01

    In the thesis literature review, an engineering graduate student is expected to place original research in the context of previous work by other researchers. However, for some students, particularly those for whom English is a second language, the literature review may be a mixture of original writing and verbatim source text appropriated without quotations. Such problematic use of source material leaves students vulnerable to an accusation of plagiarism, which carries severe consequences. Is such textual appropriation common in engineering master's writing? Furthermore, what, if anything, can be concluded when two texts have been found to have textual material in common? Do existing definitions of plagiarism provide a sufficient framework for determining if an instance of copying is transgressive or not? In a preliminary attempt to answer these questions, text strings from a random sample of 100 engineering master's theses from the ProQuest Dissertations and Theses database were searched for appropriated verbatim source text using the Google search engine. The results suggest that textual borrowing may indeed be a common feature of the master's engineering literature review, raising questions about the ability of graduate students to synthesize the literature. The study also illustrates the difficulties of making a determination of plagiarism based on simple textual similarity. A context-specific approach is recommended when dealing with any instance of apparent copying.

  5. The struggle for textual conventions in a language support programme

    African Journals Online (AJOL)

    In this article, the writer explores the experience of a group of South African learners with regard to a language support course that aims to facilitate their struggle to master English textual conventions in discipline specific contexts. The academic context of this study was that of a nursing science degree programme where ...

  6. Shaping Up? Three Acts of Education Studies as Textual Critique

    Science.gov (United States)

    McDougall, Julian; Walker, Stephen; Kendall, Alex

    2006-01-01

    This paper presents a study of dominant educational discourses through textual critique and argues that such an approach enables education studies to preserve an important distinction from teacher training. The texts deconstructed here are specific to English education, but the discourses at work have international relevance as the rhetorics of…

  7. Textual Condensation in Printed Dictionaries. A Theoretical Draft·

    African Journals Online (AJOL)

    applying procedures of textual condensation in relation to a respective full text. The full text … and, if one does not consider the inserted wordforms, of the same number of letters. … soccer clogging: the match turned into a rough game …

  8. An Integrated Textual Case-Based System A. Almu

    African Journals Online (AJOL)

    Almu: An Integrated Textual Case-Based System. The semantic relationships in WordNet link the four parts of speech (nouns, verbs, adverbs and adjectives) into synonym sets (Miller 1995). Therefore, the words or terms of the problem have to be tagged with their appropriate POS before passing them to the ...

  9. On the metaphorical nature of intellectual capital: a textual analysis

    NARCIS (Netherlands)

    Dr. Daan Andriessen

    2006-01-01

    Purpose – To analyse common metaphors used in the intellectual capital (IC) and knowledge management literatures to conceptualise knowledge, in order to study the nature of the intellectual capital concept. Design/methodology/approach – A textual analysis methodology is used to analyse texts.

  10. Legitimate Textual Borrowing: Direct Quotation in L2 Student Writing

    Science.gov (United States)

    Petric, Bojana

    2012-01-01

    Using textual analysis and interviews with student writers, this study aims to provide an insight into second language students' use of direct quotations in their MA theses by comparing direct quotations in high-rated and low-rated Master's theses, and by exploring student writers' own motivations to quote directly from sources. The corpus…

  11. Finding Environmental Knowledge in SCUBA-Based Textual Materials

    Science.gov (United States)

    Gündogdu, Cemal; Aygün, Yalin; Ilkim, Mehmet

    2018-01-01

    As marine environments within the adventure domain are future key-settings for recreational SCUBA diving experience, SCUBA-based textual materials should provide insight into environmental knowledge that is well connected to the novice divers' behaviour and attitude. This research is concerned with a major recreational SCUBA diver manual for…

  12. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Science.gov (United States)

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh
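
    To make the combination strategy concrete, here is an illustrative majority-vote sketch of silver-standard construction; the span/concept annotations below are invented placeholders, not actual output of the four systems.

```python
# Illustrative sketch (not the paper's pipeline): building a "silver standard"
# by keeping the concept annotations that a strict majority of systems emit.
from collections import Counter

# Each system outputs (start_offset, end_offset, concept_id) tuples per document.
system_outputs = {
    "cTAKES":  {(0, 8, "C0038454"), (15, 27, "C0027051")},
    "NCBO":    {(0, 8, "C0038454"), (30, 38, "C0011849")},
    "BeCAS":   {(0, 8, "C0038454"), (15, 27, "C0027051")},
    "MetaMap": {(15, 27, "C0027051"), (30, 38, "C0011849")},
}

votes = Counter(ann for anns in system_outputs.values() for ann in anns)

# An annotation enters the silver standard if more than half the systems emit it.
threshold = len(system_outputs) / 2
silver = {ann for ann, n in votes.items() if n > threshold}
print(sorted(silver))  # the two annotations emitted by 3 of 4 systems survive
```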

  13. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Directory of Open Access Journals (Sweden)

    Anika Oellrich

    Full Text Available Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content

  14. ML-Ask: Open Source Affect Analysis Software for Textual Input in Japanese

    Directory of Open Access Journals (Sweden)

    Michal Ptaszynski

    2017-06-01

    Full Text Available We present ML-Ask – the first Open Source Affect Analysis system for textual input in Japanese. ML-Ask analyses the contents of an input (e.g., a sentence) and annotates it with information regarding the contained general emotive expressions, specific emotional words, valence-activation dimensions of the overall expressed affect, and particular emotion types expressed with their respective expressions. ML-Ask also incorporates the Contextual Valence Shifters model for handling negation in sentences, to deal with grammatically expressible shifts in the conveyed valence. The system, designed to work mainly under Linux and MacOS, can be used for research on, or for applying the techniques of, Affect Analysis within the framework of the Japanese language. It can also be used as an experimental baseline for specific research in Affect Analysis, and as a practical tool for the annotation of written contents. Funding statement: This research has been supported by: a Research Grant from the Nissan Science Foundation (years 2009–2010); the GCOE Program funded by Japan's Ministry of Education, Culture, Sports, Science and Technology (years 2009–2010); a JSPS KAKENHI Grant-in-Aid for JSPS Fellows (Project Number: 22-00358; years 2010–2012); a JSPS KAKENHI Grant-in-Aid for Scientific Research (Project Number: 24600001; years 2012–2015); a JSPS KAKENHI Grant-in-Aid for Research Activity Start-up (Project Number: 25880003; years 2013–2015); and a JSPS KAKENHI Grant-in-Aid for Encouragement of Young Scientists (B) (Project Number: 15K16044; years 2015–present, project estimated to end in March 2018).

  15. DISCURSO POLÍTICO DE RENÚNCIA: UMA ANÁLISE TEXTUAL / Political speech of resignation: a textual analysis

    Directory of Open Access Journals (Sweden)

    Maria Eliete de Queiroz

    2016-05-01

    Full Text Available This article analyses the compositional structure of the genre 'political speech of resignation', as instantiated by the speech of Senator Antônio Carlos Magalhães (ACM). The research is theoretically grounded in Text Linguistics and in the Textual Analysis of Discourses (TAD) of Adam (2011). The methodological and analytical procedure addresses the constitution of the textual sequences and of the text plan of the resignation speech. The analysis shows that the resignation speech presents a compositional structure made up of several thematic blocks, which form the propositions stated in the global aspect of the text. The stages that follow the text plan obey this order: opening, intermediate part and conclusive closing, which are built by blending narrative, descriptive, explanatory and argumentative sequences. The argumentative sequence is predominant in the construction of the textual materiality. In this sense, the resignation speech takes into account the specificities of the textual level and presents the complex genericity of a political speech of resignation. Keywords: Political speech. Compositional structure of the genre. Text plan.

  16. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words that can be read as both literal and metonymic. We have conducted an analysis in English, Danish and Spanish. Later on, we have tried to replicate the human judgments by means of unsupervised and semi-supervised sense prediction. The automatic sense-prediction systems have been unable to find empiric evidence for the underspecified sense, even…

  17. Impingement: an annotated bibliography

    International Nuclear Information System (INIS)

    Uziel, M.S.; Hannon, E.H.

    1979-04-01

    This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title

  18. BreakingNews: Article Annotation by Image and Text Processing.

    Science.gov (United States)

    Ramisa, Arnau; Yan, Fei; Moreno-Noguer, Francesc; Mikolajczyk, Krystian

    2018-05-01

    Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of Computer Vision and Natural Language Processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of news articles in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce an adaptive CNN architecture that shares most of the structure for multiple tasks including source detection, article illustration and geolocation of articles. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and user comments). We show this dataset to be appropriate to explore all aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures, and different representations of the textual and visual features. We report very promising results and bring to light several limitations of current state-of-the-art in this kind of domain, which we hope will help spur progress in the field.
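
    For background, the great-circle distance underlying the proposed geolocation loss can be computed with the haversine formula; the sketch below shows the plain distance only, not the article's actual loss formulation.

```python
# Background sketch: the great-circle (haversine) distance that the geolocation
# loss is based on; the article's loss itself is not reproduced here.
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Shortest distance between two points on a sphere, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

print(round(great_circle_km(51.5, -0.1, 40.7, -74.0)))  # London -> New York, ~5570 km
```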

  19. Structural analysis of online handwritten mathematical symbols based on support vector machines

    Science.gov (United States)

    Simistira, Foteini; Papavassiliou, Vassilis; Katsouros, Vassilis; Carayannis, George

    2013-01-01

    Mathematical expression recognition is still a very challenging task for the research community, mainly because of the two-dimensional (2D) structure of mathematical expressions (MEs). In this paper, we present a novel approach for the structural analysis between two online handwritten mathematical symbols of an ME, based on spatial features of the symbols. We introduce six features to represent the spatial affinity of the symbols and compare two multi-class classification methods that employ support vector machines (SVMs), one based on the "one-against-one" technique and one based on the "one-against-all" technique, in identifying the relation between a pair of symbols (e.g., subscript, numerator, etc.). A dataset containing 1906 spatial relations derived from the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2012 training dataset is constructed to evaluate the classifiers and compare them with the rule-based classifier of the ILSP-1 system that participated in the contest. The experimental results give an overall mean error rate of 2.61% for the "one-against-one" SVM approach, 6.57% for the "one-against-all" SVM technique, and 12.31% for the ILSP-1 classifier.
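
    The two SVM strategies can be sketched as follows on synthetic six-feature data (placeholders, not the CROHME-derived dataset); scikit-learn's SVC happens to implement the one-against-one scheme natively.

```python
# Minimal sketch of the two multi-class SVM strategies on made-up
# spatial-relation data; features and labels are invented placeholders.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))        # six spatial-affinity features per symbol pair
y = rng.integers(0, 4, size=200)     # relation classes, e.g. right/sub/sup/above

# scikit-learn's SVC implements the "one-against-one" scheme natively.
ovo = SVC(kernel="rbf").fit(X, y)

# "One-against-all" trains one binary SVM per relation class.
ova = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)

print(ovo.predict(X[:3]), ova.predict(X[:3]))
```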

  20. Evaluating structural pattern recognition for handwritten math via primitive label graphs

    Science.gov (United States)

    Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

    2013-01-01

    Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
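
    The core idea of a Hamming-style metric over label graphs can be sketched as follows; the stroke labelings are toy data, not CROHME ground truth.

```python
# Illustrative sketch of a Hamming-style metric over primitive label graphs.
# A label graph assigns a symbol label to every stroke (node) and a relation
# label to every stroke pair (edge); distances count label disagreements.

def label_graph_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Count disagreeing node labels and edge labels between two label graphs
    defined over the same set of primitives (strokes)."""
    node_errs = sum(nodes_a[s] != nodes_b[s] for s in nodes_a)
    pairs = set(edges_a) | set(edges_b)
    edge_errs = sum(edges_a.get(p, "none") != edges_b.get(p, "none") for p in pairs)
    return node_errs, edge_errs

# Ground truth: strokes 0 and 1 form "x"; stroke 2 is "2" written as superscript.
truth_nodes = {0: "x", 1: "x", 2: "2"}
truth_edges = {(0, 1): "merge", (1, 2): "superscript"}

# Recognizer output: symbols correct, but the superscript relation was missed.
out_nodes = {0: "x", 1: "x", 2: "2"}
out_edges = {(0, 1): "merge", (1, 2): "right"}

print(label_graph_distance(truth_nodes, truth_edges, out_nodes, out_edges))  # (0, 1)
```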

  1. Corticospinal excitability during the processing of handwritten and typed words and non-words.

    Science.gov (United States)

    Gordon, Chelsea L; Spivey, Michael J; Balasubramaniam, Ramesh

    2017-06-09

    A number of studies have suggested that perception of actions is accompanied by motor simulation of those actions. To further explore this proposal, we applied transcranial magnetic stimulation (TMS) to the left primary motor cortex during the observation of handwritten and typed language stimuli, including words and non-word consonant clusters. We recorded motor-evoked potentials (MEPs) from the right first dorsal interosseous (FDI) muscle to measure corticospinal excitability during written text perception. We observed a facilitation in MEPs for handwritten stimuli, regardless of whether the stimuli were words or non-words, suggesting potential motor simulation during observation. We did not observe a similar facilitation for the typed stimuli, suggesting that motor simulation was not occurring during observation of typed text. By demonstrating potential simulation of written language text during observation, these findings add to a growing literature suggesting that the motor system plays a strong role in the perception of written language. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotators. Estimating potential agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty…

  3. Phylogenetic molecular function annotation

    International Nuclear Information System (INIS)

    Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  4. BioCause: Annotating and analysing causality in the biomedical domain.

    Science.gov (United States)

    Mihăilă, Claudiu; Ohta, Tomoko; Pyysalo, Sampo; Ananiadou, Sophia

    2013-01-16

    Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems. However, bio-event annotation alone cannot cater for all the needs of biologists. Unlike work on relation and event extraction, most of which focusses on specific events and named entities, we aim to build a comprehensive resource, covering all statements of causal association present in discourse. Causality lies at the heart of biomedical knowledge, such as diagnosis, pathology or systems biology, and, thus, automatic causality recognition can greatly reduce the human workload by suggesting possible causal connections and aiding in the curation of pathway models. A biomedical text corpus annotated with such relations is, hence, crucial for developing and evaluating biomedical text mining. We have defined an annotation scheme for enriching biomedical domain corpora with causality relations. This schema has subsequently been used to annotate 851 causal relations to form BioCause, a collection of 19 open-access full-text biomedical journal articles belonging to the subdomain of infectious diseases. These documents have been pre-annotated with named entity and event information in the context of previous shared tasks. We report an inter-annotator agreement rate of over 60% for triggers and of over 80% for arguments using an exact match constraint. These increase significantly using a relaxed match setting. Moreover, we analyse and describe the causality relations in BioCause from various points of view. This information can then be leveraged for the training of automatic causality detection systems. Augmenting named entity and event annotations with information about causal discourse relations could benefit the development of more sophisticated IE systems. These will further influence the development of multiple tasks, such as enabling textual inference to detect entailments, discovering new facts and providing new

  5. Textual meaning and its place in a theory of language

    Directory of Open Access Journals (Sweden)

    Jeffries Lesley

    2015-06-01

    Full Text Available Following the development of a framework for critical stylistics (Jeffries 2010) and the explication of some of the theoretical assumptions behind this framework (Jeffries 2014a, 2014b, 2015a, 2015b), the present article attempts to put this framework into a larger theoretical context as a way to approach textual meaning. Using examples from the popular U.S. television show, The Big Bang Theory, I examine the evidence that there is a kind of textual meaning which can be distinguished from the core propositional meaning on the one hand and from contextual, interpersonal meaning on the other. The specific aim, to demonstrate a layer of meaning belonging to text specifically, is set within an argument which claims that progress in linguistics can better be served by adherence to a rigorous scientific discipline.

  6. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth's primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material.

  7. THE DIMENSIONS OF COMPOSITION ANNOTATION.

    Science.gov (United States)

    MCCOLLY, WILLIAM

    English teacher annotations were studied to determine the dimensions and properties of the entire system for writing corrections and criticisms on compositions. Four sets of compositions were written by students in grades 9 through 13. Typescripts of the compositions were annotated by classroom English teachers. Then, 32 English teachers judged…

  8. STATISTICAL RELATIONAL LEARNING AND SCRIPT INDUCTION FOR TEXTUAL INFERENCE

    Science.gov (United States)

    2017-12-01

    … compensate for parser errors. We replace deterministic conjunction by an average combiner, which encodes causal independence. Our framework was the … sentence similarity (STS) and sentence paraphrasing, but not Textual Entailment, where deeper inferences are required. As the formula for conjunction … When combined, our algorithm learns to rely on systems that not only agree on an output but also on the provenance of this output in conjunction with the …

  9. Reading comprehension and textual consciousness on primary school

    Directory of Open Access Journals (Sweden)

    Vera Wannmacher Pereira

    2014-03-01

    Full Text Available The difficulties in reading comprehension in primary school are evidenced by the several official exams applied. Given these statistics and the evidence obtained through academic research and observation of children's performance during school life, the situation is acknowledged as a problem that requires further study and the search for solutions. Psycholinguistics is making its contribution, especially regarding the role of linguistic consciousness in learning to read. Many studies have been conducted focusing specifically on phonological consciousness. Studies on syntactic consciousness are also found, although fewer than phonological ones. Regarding the role of textual consciousness, few initiatives consider the students of primary school. This leads the author to propose textual consciousness as the heartland of this communication, with support predominantly on Gombert (1992), aiming to examine the relationship between this level of consciousness and learning to read. Based on recent studies (PEREIRA; SCLIAR-CABRAL, 2012), the author presents in this paper: (a) an analysis of the context of learning and teaching of reading; (b) a theoretical exposition about reading learning and textual consciousness; (c) pedagogical referrals for education based on the interaction between these two topics; and (d) reflections on the possibility that the proposed path contributes to the solution of the worrying problem of reading learning by primary school students.

  10. Punctuation as readability and textuality factor in technical discourse

    Directory of Open Access Journals (Sweden)

    Carmen Sancho Guinda

    2002-04-01

    Full Text Available This paper studies the incidence of punctuation on the reading comprehension of technical discourse and its role as a factor of textuality. Starting out from the notions of textuality and punctuation functions formulated by different linguistic approaches, an analysis has been made to quantify the decoding skills and punctuating competence of 60 Aeronautical Engineering students, as well as to determine the nature and effects of their punctuation errors. The survey has been focused on the full stop, the comma and the hyphen due to their highly conflicting uses as regards the identification of immediate sentence constituents and semantic relationships. The results obtained suggest that most students have a poor knowledge of punctuation rules and little awareness of punctuation as a textual element affecting readability. Errors are in the main related to comma use and produced by transference, either of Spanish punctuating habits into English, or of individual prosodic patterns into writing, while meaning appears to be the prevailing punctuating criterion over sound and syntax. Punctuation proves an effective tool for the anticipation of implicit meanings and an untapped resource in the teaching of the diverse communicative and stylistic possibilities offered by technical texts.

  11. Hardware Acceleration on Cloud Services: The use of Restricted Boltzmann Machines on Handwritten Digits Recognition

    Directory of Open Access Journals (Sweden)

    Eleni Bougioukou

    2018-02-01

    Full Text Available Cloud computing allows users and enterprises to process their data in high-performance servers, thus reducing the need for advanced hardware at the client side. Although local processing is viable in many cases, collecting data from multiple clients and processing them in a server gives the best possible performance in terms of processing rate. In this work, the implementation of a high-performance cloud computing engine for recognizing handwritten digits is presented. The engine exploits the benefits of the cloud and uses a powerful hardware accelerator in order to classify the images received concurrently from multiple clients. The accelerator implements a number of neural networks, operating in parallel, resulting in a processing rate of more than 10 MImages/sec.
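
    As a software-only illustration of the model family named in the title (not the paper's FPGA accelerator), a Restricted Boltzmann Machine can serve as a feature extractor in front of a linear classifier on a small digits dataset:

```python
# Software-only sketch: a Restricted Boltzmann Machine as a feature extractor
# for digit classification; this illustrates the technique, not the engine.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # BernoulliRBM expects inputs in [0, 1]

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=128, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X[:1500], y[:1500])
print("held-out accuracy:", model.score(X[1500:], y[1500:]))
```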

  12. A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic

    Science.gov (United States)

    Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier

    2015-01-01

    In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieves superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.
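
    A minimal PyTorch sketch of the 1D idea, reading a normalized line image column by column; sizes, class count and the network itself are illustrative, not the paper's configuration.

```python
# Minimal sketch of a 1D (bidirectional) LSTM over pixel columns of a
# normalized text-line image; all dimensions here are invented.
import torch
import torch.nn as nn

class Seq1DLSTM(nn.Module):
    def __init__(self, height=48, hidden=100, n_classes=30):
        super().__init__()
        # Each time step consumes one pixel column of the normalized line image.
        self.lstm = nn.LSTM(input_size=height, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):            # x: (batch, width, height)
        out, _ = self.lstm(x)        # (batch, width, 2*hidden)
        return self.fc(out)          # per-column class scores (e.g., for CTC)

model = Seq1DLSTM()
columns = torch.randn(4, 120, 48)    # 4 line images, 120 columns of height 48
print(model(columns).shape)          # torch.Size([4, 120, 30])
```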

  13. Comparative implementation of Handwritten and Machine written Gurmukhi text utilizing appropriate parameters

    Science.gov (United States)

    Kaur, Jaswinder; Jagdev, Gagandeep, Dr.

    2018-01-01

    Optical character recognition is concerned with the recognition of optically processed characters. The recognition is done offline after the writing or printing has been completed, unlike online recognition where the computer has to recognize the characters instantly as they are drawn. The performance of character recognition depends upon the quality of the scanned documents. The preprocessing steps are used for removing low-frequency background noise and normalizing the intensity of individual scanned documents. Several filters are used for reducing certain image details and enabling an easier or faster evaluation. The primary aim of the research work is to recognize handwritten and machine-written characters and differentiate them. The language chosen for the research work is Punjabi (Gurmukhi script) and the tool utilized is MATLAB.
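
    A generic sketch of the kind of preprocessing described (background removal and intensity normalization), assuming OpenCV and an invented file name; the paper's actual MATLAB pipeline is not reproduced.

```python
# Generic scan-cleanup sketch: flatten uneven illumination, normalize
# intensity, then binarize. Parameters and file names are placeholders.
import cv2

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Estimate the low-frequency background with a wide Gaussian blur, then
# divide it out to flatten uneven illumination across the scanned page.
background = cv2.GaussianBlur(img, (0, 0), sigmaX=25)
flattened = cv2.divide(img, background, scale=255)

# Stretch intensities to the full range and binarize with Otsu's threshold.
normalized = cv2.normalize(flattened, None, 0, 255, cv2.NORM_MINMAX)
_, binary = cv2.threshold(normalized, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("clean.png", binary)
```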

  14. Chado controller: advanced annotation management with a community annotation system.

    Science.gov (United States)

    Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie

    2012-04-01

    We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system; documentation is available at http://www.gnpannot.org/content/chado-controller-doc. The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form. Contact: valentin.guignon@cirad.fr; stephanie.sidibe-bocs@cirad.fr. Supplementary data are available at Bioinformatics online.

  15. Displaying Annotations for Digitised Globes

    Science.gov (United States)

    Gede, Mátyás; Farbinger, Anna

    2018-05-01

    Thanks to the efforts of the various globe digitising projects, there are nowadays plenty of old globes that can be examined as 3D models on the computer screen. These globes usually contain a lot of interesting details that an average observer would not fully discover at first sight. The authors developed a website that can display annotations for such digitised globes. These annotations help observers of the globe discover all the important, interesting details. Annotations consist of a plain-text title, an HTML-formatted descriptive text and a corresponding polygon, and are stored in KML format. The website is powered by the Cesium virtual globe engine.

  16. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

    Science.gov (United States)

    Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

    2010-07-02

    The data produced by an Illumina flow cell with all eight lanes occupied comprises well over a terabyte worth of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, provides INDEL detection, SNP information, and allele calling. To extract from such analysis a measure of gene expression in the form of tag counts, and furthermore to annotate such reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag counting and annotation. The end result produces output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool that facilitates the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA build and maximize information extraction from a sequencing dataset. TASE is specially designed to translate sequence data
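
    The tag-counting idea reduces to counting aligned reads that fall within annotated genomic ranges, as in this toy sketch with invented coordinates and gene names (not TASE's Java/SQL implementation):

```python
# Toy sketch of tag counting: tally aligned reads that fall inside each
# annotated genomic range. All coordinates and gene names are made up.
from collections import defaultdict

# Functional annotations as (chromosome, start, end, gene) tuples.
annotations = [("chr1", 100, 500, "geneA"),
               ("chr1", 800, 1200, "geneB"),
               ("chr2", 50, 300, "geneC")]

# Aligned reads as (chromosome, position) pairs.
reads = [("chr1", 150), ("chr1", 450), ("chr1", 900), ("chr2", 400)]

tag_counts = defaultdict(int)
for chrom, pos in reads:
    for a_chrom, start, end, gene in annotations:
        if chrom == a_chrom and start <= pos <= end:
            tag_counts[gene] += 1

print(dict(tag_counts))  # {'geneA': 2, 'geneB': 1}; the chr2 read maps nowhere
```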

  17. Computer-Assisted Search Of Large Textual Data Bases

    Science.gov (United States)

    Driscoll, James R.

    1995-01-01

    "QA" denotes high-speed computer system for searching diverse collections of documents including (but not limited to) technical reference manuals, legal documents, medical documents, news releases, and patents. Incorporates previously available and emerging information-retrieval technology to help user intelligently and rapidly locate information found in large textual data bases. Technology includes provision for inquiries in natural language; statistical ranking of retrieved information; artificial-intelligence implementation of semantics, in which "surface level" knowledge found in text used to improve ranking of retrieved information; and relevance feedback, in which user's judgements of relevance of some retrieved documents used automatically to modify search for further information.

  18. Textual and chemical information processing: different domains but similar algorithms

    Directory of Open Access Journals (Sweden)

    Peter Willett

    2000-01-01

    Full Text Available This paper discusses the extent to which algorithms developed for the processing of textual databases are also applicable to the processing of chemical structure databases, and vice versa. Applications discussed include: an algorithm for distribution sorting that has been applied to the design of screening systems for rapid chemical substructure searching; the use of measures of inter-molecular structural similarity for the analysis of hypertext graphs; a genetic algorithm for calculating term weights for relevance feedback searching, applied to determining whether a molecule is likely to exhibit biological activity; and the use of data fusion to combine the results of different chemical similarity searches.
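
    Data fusion of ranked similarity-search outputs can be as simple as rank-sum combination, sketched below on invented rankings; the paper's exact fusion rule is not assumed.

```python
# Generic sketch of rank-based data fusion for similarity searching.
def fuse_by_rank_sum(*rankings):
    """Combine ranked lists by summing each item's rank (lower is better)."""
    items = set().union(*rankings)
    scores = {}
    for ranking in rankings:
        worst = len(ranking)  # items missing from a list get the worst rank
        for item in items:
            rank = ranking.index(item) if item in ranking else worst
            scores[item] = scores.get(item, 0) + rank
    return sorted(scores, key=scores.get)

# Two similarity searches over the same database, using different measures.
search_a = ["mol3", "mol1", "mol7", "mol2"]
search_b = ["mol1", "mol3", "mol2", "mol9"]
print(fuse_by_rank_sum(search_a, search_b))  # 'mol1' and 'mol3' lead the fusion
```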

  19. Methodology to build medical ontology from textual resources.

    Science.gov (United States)

    Baneyx, Audrey; Charlet, Jean; Jaulent, Marie-Christine

    2006-01-01

    In the medical field, it is now established that the maintenance of unambiguous thesauri goes through ontologies. Our research task is to help pneumologists code acts and diagnoses with software that represents medical knowledge through a domain ontology. In this paper, we describe our general methodology, aimed at knowledge engineers, for building various types of medical ontologies based on terminology extraction from texts. The hypothesis is to apply natural language processing tools to textual patient discharge summaries to develop the resources needed to build an ontology in pneumology. Results indicate that the joint use of distributional analysis and lexico-syntactic patterns performs satisfactorily for building such ontologies.

  20. Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods

    Directory of Open Access Journals (Sweden)

    Mahesh Jangid

    2018-02-01

    Full Text Available Handwritten character recognition is currently getting the attention of researchers because of possible applications in assistive technology for blind and visually impaired users, human–robot interaction, automatic data entry for business documents, etc. In this work, we propose a technique to recognize handwritten Devanagari characters using deep convolutional neural networks (DCNN), one of the recent techniques adopted from the deep learning community. We experimented with the ISIDCHAR database provided by the Information Sharing Index (ISI), Kolkata, and the V2DMDCHAR database, using six different architectures of DCNN to evaluate the performance, and also investigated the use of six recently developed adaptive gradient methods. A layer-wise training technique for DCNN has been employed that helped to achieve the highest recognition accuracy and a faster convergence rate. The results of layer-wise-trained DCNN are favorable in comparison with those achieved by a shallow technique of handcrafted features and standard DCNN.
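
    A conceptual PyTorch sketch of layer-wise training with an adaptive gradient method (Adam); the tiny architecture, class count and step counts are illustrative, not the paper's networks.

```python
# Conceptual sketch of layer-wise CNN training with Adam: train progressively
# deeper stacks, carrying the convolutional weights over between stages.
import torch
import torch.nn as nn

blocks = nn.ModuleList([
    nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
])
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(8, 1, 32, 32), torch.randint(0, 47, (8,))  # 47 toy classes

for depth in range(1, len(blocks) + 1):
    body = nn.Sequential(*blocks[:depth])
    with torch.no_grad():
        n_feat = body(x).flatten(1).shape[1]   # feature size for a fresh head
    model = nn.Sequential(body, nn.Flatten(), nn.Linear(n_feat, 47))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(10):                        # a few steps per stage for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    print(f"stage {depth}: loss {loss.item():.3f}")
```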

  1. A Record Book of Open Heart Surgical Cases between 1959 and 1982, Hand-Written by a Cardiac Surgeon.

    Science.gov (United States)

    Kim, Won-Gon

    2016-08-01

    A book of brief records of open heart surgeries performed between 1959 and 1982 at Seoul National University Hospital was recently found. The book was hand-written by the late professor and cardiac surgeon Yung Kyoon Lee (1921-1994). It contains valuable information about cardiac patients and surgery at the early stages of the establishment of open heart surgery in Korea and at Seoul National University Hospital. This report is intended to analyze the content of the book.

  2. A comparative study of different feature sets for recognition of handwritten Arabic numerals using a Multi Layer Perceptron

    OpenAIRE

    Das, Nibaran; Mollah, Ayatullah Faruk; Sarkar, Ram; Basu, Subhadip

    2010-01-01

    The work presents a comparative assessment of seven different feature sets for recognition of handwritten Arabic numerals using a Multi Layer Perceptron (MLP) based classifier. The seven feature sets employed here consist of shadow features, octant centroids, longest runs, angular distances, effective spans, dynamic centers of gravity, and some of their combinations. On experimentation with a database of 3000 samples, the maximum recognition rate of 95.80% is observed with both of two separat...

  3. Textual standardization and the DSM-5 "common language".

    Science.gov (United States)

    Kelly, Patty A

    2014-06-01

    In February 2010, the American Psychiatric Association (APA) launched their DSM-5 website with details about the development of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). The APA invited "the general public" to review the draft diagnostic criteria and provide written comments and suggestions. This revision marks the first time the APA has solicited public review of their diagnostic manual. This article analyzes reported speech on the DSM-5 draft diagnostic criteria for the classification Posttraumatic Stress Disorder. It demonstrates how textual standardization facilitates the cultural portability of the DSM-5 diagnostic criteria such that a community of speakers beyond the borders of the APA come to be seen as exemplary speakers, writers, and revisers of the professional style. Furthermore, analysis shows how co-authoring practices recontextualize the "voice" and persona of putative patient reported speech on Criterion D2. As a consequence of textual standardization, spoken discourse becomes recontextualized as the product of scientific inquiry and the organization of psychiatric knowledge.

  4. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so they are inevitably trapped in suboptimal performance on these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and the Hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems, based on which a variety of loss functions with respect to objective-guided measures are defined. We then formulate these loss functions as relaxed surrogate functions and optimize them with structural SVMs. Given the analysis of the various measures and the high time complexity of optimizing micro-averaging measures, in this paper we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
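
    For concreteness, the example-based measures the paper focuses on compute precision, recall and F1 per example over its predicted label set and then average over examples, as in this small sketch:

    ```python
    import numpy as np

    def example_based_prf(Y_true, Y_pred):
        """Binary indicator matrices, one row per example (image)."""
        eps = 1e-12
        hits = (Y_true & Y_pred).sum(axis=1)        # correctly predicted labels
        p = hits / (Y_pred.sum(axis=1) + eps)       # per-example precision
        r = hits / (Y_true.sum(axis=1) + eps)       # per-example recall
        f1 = 2 * p * r / (p + r + eps)              # per-example F1
        return p.mean(), r.mean(), f1.mean()        # averaged over examples

    Y_true = np.array([[1, 0, 1, 0], [0, 1, 1, 0]])
    Y_pred = np.array([[1, 0, 0, 1], [0, 1, 1, 1]])
    print(example_based_prf(Y_true, Y_pred))        # (~0.58, 0.75, ~0.65)
    ```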

  5. HMM-based lexicon-driven and lexicon-free word recognition for online handwritten Indic scripts.

    Science.gov (United States)

    Bharath, A; Madhvanath, Sriganesh

    2012-04-01

    Research on recognizing online handwritten words in Indic scripts is at an early stage compared to Latin and Oriental scripts. In this paper, we address this problem specifically for two major Indic scripts--Devanagari and Tamil. In contrast to previous approaches, the techniques we propose are largely data driven and script independent. We propose two different techniques for word recognition based on Hidden Markov Models (HMM): lexicon driven and lexicon free. The lexicon-driven technique models each word in the lexicon as a sequence of symbol HMMs according to a standard symbol writing order derived from the phonetic representation. The lexicon-free technique uses a novel Bag-of-Symbols representation of the handwritten word that is independent of symbol order and allows rapid pruning of the lexicon. On handwritten Devanagari word samples featuring both standard and nonstandard symbol writing orders, a combination of lexicon-driven and lexicon-free recognizers significantly outperforms either of them used in isolation. In contrast, most Tamil word samples feature the standard symbol order, and the lexicon-driven recognizer outperforms the lexicon-free one as well as their combination. The best recognition accuracies obtained for 20,000-word lexicons are 87.13 percent for Devanagari when the two recognizers are combined, and 91.8 percent for Tamil using the lexicon-driven technique.
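
    A minimal sketch of the lexicon-free Bag-of-Symbols idea: symbol order is discarded and lexicon entries whose symbol multisets differ too much from the recognized bag are pruned (the toy lexicon and edit budget below are invented):

    ```python
    from collections import Counter

    lexicon = ["नमस", "समन", "कलम", "कमल"]   # toy lexicon of Devanagari words
    recognized = Counter("मकल")              # unordered symbols from the recognizer

    def bag_distance(word, bag):
        wbag = Counter(word)
        # size of the multiset symmetric difference: symbols to insert + delete
        return sum((wbag - bag).values()) + sum((bag - wbag).values())

    # rank all words by bag distance, then keep those within a small budget
    ranked = sorted(lexicon, key=lambda w: bag_distance(w, recognized))
    pruned = [w for w in ranked if bag_distance(w, recognized) <= 1]
    print(pruned)   # ['कलम', 'कमल']: anagrams survive, unrelated words are pruned
    ```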

  6. Image annotation under X Windows

    Science.gov (United States)

    Pothier, Steven

    1991-08-01

    A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the use of the annotation mechanism in various applications is discussed.
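
    Purely as an illustration of the four-part abstraction described above (the original is an X Window System implementation, presumably in C; all names here are invented), a minimal sketch of keeping the parts as independent data structures might look like:

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class AnnotatedImageDisplay:
        graphics_overlay: list = field(default_factory=list)  # vector annotation
        dithered_overlay: list = field(default_factory=list)  # dithered 8-bit layer
        image_overlay: list = field(default_factory=list)     # raster annotation
        window: dict = field(default_factory=lambda: {"zoom": 1.0, "pan": (0, 0)})

        def zoom(self, factor):
            # only the physical window state changes; each overlay is re-rendered
            # from its own data structure, so annotation stays accurate
            self.window["zoom"] *= factor

    display = AnnotatedImageDisplay()
    display.graphics_overlay.append(("line", (0, 0), (100, 100), "user-color-1"))
    display.zoom(2.0)
    print(display.window)
    ```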

  7. Spiking neural networks for handwritten digit recognition: Supervised learning and network optimization.

    Science.gov (United States)

    Kulkarni, Shruti R; Rajendran, Bipin

    2018-07-01

    We demonstrate supervised learning in Spiking Neural Networks (SNNs) for the problem of handwritten digit recognition using the spike-triggered Normalized Approximate Descent (NormAD) algorithm. Our network, which employs neurons operating at sparse biological spike rates below 300 Hz, achieves a classification accuracy of 98.17% on the MNIST test database with four times fewer parameters compared to the state-of-the-art. We present several insights from extensive numerical experiments regarding optimization of learning parameters and network configuration to improve its accuracy. We also describe a number of strategies to optimize the SNN for implementation in memory- and energy-constrained hardware, including approximations in computing the neuronal dynamics and reduced precision in storing the synaptic weights. Experiments reveal that even with 3-bit synaptic weights, the classification accuracy of the designed SNN does not degrade beyond 1% as compared to the floating-point baseline. Further, the proposed SNN, which is trained based on precise spike timing information, outperforms an equivalent non-spiking artificial neural network (ANN) trained using back propagation, especially at low bit precision. Thus, our study shows the potential for realizing efficient neuromorphic systems that use spike-based information encoding and learning for real-world applications. Copyright © 2018 Elsevier Ltd. All rights reserved.
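
    As a sketch of the reduced-precision storage discussed above (uniform quantization assumed; the paper's exact scheme may differ), trained weights can be snapped to the 8 levels representable with 3 bits:

    ```python
    import numpy as np

    def quantize_weights(w, bits=3):
        levels = 2 ** bits                     # 3 bits -> 8 representable values
        w_min, w_max = w.min(), w.max()
        step = (w_max - w_min) / (levels - 1)  # uniform step over the weight range
        return np.round((w - w_min) / step) * step + w_min

    rng = np.random.default_rng(1)
    w = rng.normal(0.0, 0.2, size=(784, 100))  # e.g. input-to-hidden MNIST weights
    wq = quantize_weights(w)
    print("distinct levels:", np.unique(wq).size)   # at most 8
    print("mean abs error:", np.abs(w - wq).mean())
    ```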

  8. Textual Input Enhancement for Vowel Blindness: A Study with Arabic ESL Learners

    Science.gov (United States)

    Alsadoon, Reem; Heift, Trude

    2015-01-01

    This study explores the impact of textual input enhancement on the noticing and intake of English vowels by Arabic L2 learners of English. Arabic L1 speakers are known to experience "vowel blindness," commonly defined as a difficulty in the textual decoding and encoding of English vowels due to an insufficient decoding of the word form.…

  9. The Target of the Question: A Taxonomy of Textual Features for Cambridge University "O" Levels English

    Science.gov (United States)

    Benjamin, Shanti Isabelle

    2015-01-01

    This study investigates the typical textual features that are most frequently targeted in short-answer reading comprehension questions of the Cambridge University "O" Level English Paper 2. Test writers' awareness of how textual features impact the understanding of meanings in text will determine to a great extent their decisions…

  10. Técnicas aplicadas al reconocimiento de implicación textual.

    OpenAIRE

    Herrera, Jesús; Peñas, Anselmo; Verdejo, Felisa

    2005-01-01

    After establishing what is meant by textual entailment, the current situation and the desirable future of systems aimed at recognizing it are presented. The techniques currently implemented by the main Textual Entailment Recognition systems are identified.

  11. Textual and Pictorial Glosses: Effectiveness on Incidental Vocabulary Growth When Reading in a Foreign Language.

    Science.gov (United States)

    Kost, Claudia R.; Foss, Pamelo; Lenzini, John J.

    1999-01-01

    Investigates the effects of pictorial and textual glosses and a combination thereof on incidental vocabulary growth of foreign language learners. Subjects from second-semester German classes read a narrative text passage under one of three marginal gloss conditions: textual gloss (English translation); pictorial gloss; and text and pictures in the…

  12. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed server-side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML, the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required, and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Textual healing: tailor-made kabbalistic therapeutics in Jerusalem.

    Science.gov (United States)

    Guzmen-Carmeli, Shlomo; Sharabi, Asaf

    2017-10-30

    This paper, based on fieldwork conducted in a Jerusalem yeshiva, describes how the yeshiva, a traditional institute of religious studies, also serves as an institution of healing and personal therapy in which sacred religious texts assume a central place. The article focuses on personal sessions between the rabbi who heads the yeshiva, and his audience of believers who turn to him for help in coping with personal hardships and tribulations. The paper contextualizes and elaborates upon the concept of 'deep healing' to describe how the rabbi uses his regular 'tool kit' to diagnose the problems of the person facing him and to offer optimal, personalized therapy. The rabbi uses religious texts to create textual deep healing processes that are tailor-made for the individual supplicant and are intended to accompany supplicants for a long period of time.

  14. Textual Transformations in Contemporary Black Writing in Britain

    Directory of Open Access Journals (Sweden)

    Jawhar Ahmed Dhouib

    2014-04-01

    Full Text Available While the first wave of Caribbean immigrant writers brilliantly explored race-related issues, black Britons like Andrea Levy, Zadie Smith and Caryl Phillips, among others, have sought to depart from earlier fiction, motivated in their project by the changing white face of Britain. In this article, I would like to argue that cultural change in Britain has deeply influenced literary production and has, consequently, laid the ground for a series of textual transformations. To capture instances of creative excess in contemporary black writing in Britain, I will bring under examination Caryl Phillips’s (2009) novel In the Falling Snow. My intention is to show to what extent Phillips’s work surpasses the ‘noose of race’ and already-familiar representations of multicultural Britain to celebrate a ‘post-racial’ society.

  15. Algunas consideraciones sobre la economía textual

    Directory of Open Access Journals (Sweden)

    Lic. Carlos Lázaro Nodals García

    2016-03-01

    Full Text Available The article addresses the proliferation of new texts on social networks, other interactive platforms and devices where information and communication technologies are applied. It analyzes their characteristics, the contexts in which they are produced, and writing with textual economy, which lies at the center of a debate between those who reject it or point out the problems it may cause with regard to respect for the mother tongue, and those who see it as an objective phenomenon that must be assimilated as well as contextualized. The authors of this work belong to the latter group, and even regard mastery of it as part of the cultural and communicative competences of the proactive professional of the 21st century.

  16. Frequency shifting approach towards textual transcription of heartbeat sounds.

    Science.gov (United States)

    Arvin, Farshad; Doraisamy, Shyamala; Safar Khorasani, Ehsan

    2011-10-04

    Auscultation is an approach for diagnosing many cardiovascular problems. Automatic analysis of heartbeat sounds and extraction of their audio features can assist physicians in diagnosing diseases. Textual transcription allows recording a continuous heart sound stream in a text format which can be stored in very little memory in comparison with other audio formats. In addition, text-based data allows applying indexing and searching techniques to access critical events. Hence, the transcribed heartbeat sounds provide useful information for monitoring the behavior of a patient over a long duration of time. This paper proposes a frequency shifting method in order to improve the performance of the transcription. The main objective of this study is to transfer the heartbeat sounds to the music domain. The proposed technique is tested with 100 samples which were recorded from different heart disease categories. The observed results show that the proposed shifting method significantly improves the performance of the transcription.
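
    A minimal sketch of one standard way to shift a signal's spectrum upward, via the analytic signal (the shift amount and toy 40 Hz component are illustrative; the paper's exact method may differ):

    ```python
    import numpy as np
    from scipy.signal import hilbert

    fs = 4000                                  # sampling rate, Hz
    t = np.arange(0, 1.0, 1 / fs)
    heart = np.sin(2 * np.pi * 40 * t)         # toy 40 Hz heart-sound component

    def freq_shift(x, shift_hz, fs):
        analytic = hilbert(x)                  # analytic signal: no negative freqs
        carrier = np.exp(2j * np.pi * shift_hz * np.arange(len(x)) / fs)
        return (analytic * carrier).real       # spectrum moved up by shift_hz

    shifted = freq_shift(heart, 220.0, fs)     # 40 Hz -> 260 Hz, near middle C
    ```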

  17. Frequency shifting approach towards textual transcription of heartbeat sounds

    Directory of Open Access Journals (Sweden)

    Safar Khorasani Ehsan

    2011-10-01

    Full Text Available Abstract Auscultation is an approach for diagnosing many cardiovascular problems. Automatic analysis of heartbeat sounds and extraction of their audio features can assist physicians in diagnosing diseases. Textual transcription allows recording a continuous heart sound stream in a text format which can be stored in very little memory in comparison with other audio formats. In addition, text-based data allows applying indexing and searching techniques to access critical events. Hence, the transcribed heartbeat sounds provide useful information for monitoring the behavior of a patient over a long duration of time. This paper proposes a frequency shifting method in order to improve the performance of the transcription. The main objective of this study is to transfer the heartbeat sounds to the music domain. The proposed technique is tested with 100 samples which were recorded from different heart disease categories. The observed results show that the proposed shifting method significantly improves the performance of the transcription.

  18. [Textual pragmatics in adolescents with attention deficit hyperactivity disorder: argument].

    Science.gov (United States)

    Gallardo-Paúls, B; Gimeno-Martínez, M; Moreno-Campos, V

    2010-03-03

    Clinical linguistics involves the study of linguistic deficits, focusing on a series of aspects that range from strictly formal, grammatical points to the effective and contextualised use of language. Thus, it is also inevitably concerned with the cognitive, i.e. mental, correlate of such language use, whose basic textual dimensions are narration and argument. To describe the argumentative skills of adolescents with attention deficit hyperactivity disorder (ADHD) and to examine their relationship with academic achievement and sociability. We analysed 79 argumentative texts written by adolescents with ADHD, using a methodology from cognitive linguistics and from theories of argumentation with a dialogical foundation. Adolescents with ADHD provided a greater number of arguments than those in the control group, but with a higher predominance of emotional and negative-sanction strategies, whereas the control group made greater use of fallacious or circular arguments; the difference in the use of rational arguments between the two groups is not significant.

  19. The Ubiquity of Humanity and Textuality in Human Experience

    Directory of Open Access Journals (Sweden)

    Daihyun Chung

    2015-11-01

    Full Text Available The so-called “crisis of the humanities” can be understood in terms of an asymmetry between the natural and social sciences on the one hand and the humanities on the other. While the sciences approach topics related to human experience in quantificational or experimental terms, the humanities turn to ancient, canonical, and other texts in the search for truths about human experience. As each approach has its own unique limitations, it is desirable to overcome or remove the asymmetry between them. The present article seeks to do just that by advancing and defending the following two claims: (a) that humanity is ubiquitous wherever language is used; and (b) that anything that can be experienced by humans is in need of an interpretation. Two arguments are presented in support of these claims. The first argument concerns the nature of questions, which are one of the fundamental marks or manifestations of human language. All questions are ultimately attempts to find meanings or interpretations of what is presented. As such, in questioning phenomena, one seeks to transcend the negative space or oppression of imposed structures; in doing so, one reveals one’s humanity. Second, all phenomena are textual in nature: that which astrophysicists find in distant galaxies or which cognitive neuroscientists find in the structures of the human brain are no less in need of interpretation than the dialogues of Plato or the poems of Homer. Texts are ubiquitous. The implications of these two arguments are identified and discussed in this article. In particular, it is argued that the ubiquity of humanity and textuality points to a view of human nature that is neither individualistic nor collectivist but rather integrational in suggesting that the realization of oneself is inseparable from the realization of others.

  20. Public Relations: Selected, Annotated Bibliography.

    Science.gov (United States)

    Demo, Penny

    Designed for students and practitioners of public relations (PR), this annotated bibliography focuses on recent journal articles and ERIC documents. The 34 citations include the following: (1) surveys of public relations professionals on career-related education; (2) literature reviews of research on measurement and evaluation of PR and…

  1. Persuasion: A Selected, Annotated Bibliography.

    Science.gov (United States)

    McDermott, Steven T.

    Designed to reflect the diversity of approaches to persuasion, this annotated bibliography cites materials selected for their contribution to that diversity as well as for being relatively current and/or especially significant representatives of particular approaches. The bibliography starts with a list of 17 general textbooks on approaches to…

  2. Systems Theory and Communication. Annotated Bibliography.

    Science.gov (United States)

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  3. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search
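
    One common data-driven scheme in this spirit (a sketch with random stand-in features and tags, not the paper's system) retrieves visually similar images that already carry tags and lets the neighbors vote:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    database_feats = rng.random((1000, 128))            # features of tagged images
    vocab = ["sky", "beach", "city", "night", "people"]
    database_tags = [set(rng.choice(vocab, 2, replace=False)) for _ in range(1000)]

    def annotate(query_feat, k=10):
        dists = np.linalg.norm(database_feats - query_feat, axis=1)
        votes = {}
        for idx in np.argsort(dists)[:k]:               # k nearest tagged images
            for tag in database_tags[idx]:
                votes[tag] = votes.get(tag, 0) + 1
        # keep tags supported by at least half of the neighbours
        return {t for t, v in votes.items() if v >= k // 2}

    print(annotate(rng.random(128)))
    ```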

  4. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were
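
    As a toy illustration of pattern-driven annotation (the patterns below are simplified, hypothetical stand-ins for Bio-Dictionary entries, which additionally carry weighted, position-specific scores), wildcard amino acid patterns are matched against the query and their annotations transferred:

    ```python
    import re

    # hypothetical wildcard patterns: '.' is any residue, '[ST]' a residue class
    dictionary = {
        r"G.GK[ST]": "Walker-A-like nucleotide-binding motif (toy)",
        r"H..H":     "putative metal-binding site (toy)",
    }

    query = "MSDKLGESGAGKSTLVQHEAHASLK"

    for pattern, annotation in dictionary.items():
        for m in re.finditer(pattern, query):
            # transfer the pattern's annotation onto the matched span
            print(f"{annotation} at positions {m.start() + 1}-{m.end()}")
    ```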

  5. Interactive SIGHT: textual access to simple bar charts

    Science.gov (United States)

    Demir, Seniz; Oliver, David; Schwartz, Edward; Elzer, Stephanie; Carberry, Sandra; Mccoy, Kathleen F.; Chester, Daniel

    2010-12-01

    Information graphics, such as bar charts and line graphs, are an important component of many articles from popular media. The majority of such graphics have an intention (a high-level message) to communicate to the graph viewer. Since the intended message of a graphic is often not repeated in the accompanying text, graphics together with the textual segments contribute to the overall purpose of an article and cannot be ignored. Unfortunately, these visual displays are provided in a format which is not readily accessible to everyone. For example, individuals with sight impairments who use screen readers to listen to documents have limited access to the graphics. This article presents a new accessibility tool, the Interactive SIGHT (Summarizing Information GrapHics Textually) system, that is intended to enable visually impaired users to access the knowledge that one would gain from viewing information graphics found on the web. The current system, which is implemented as a browser extension that works on simple bar charts, can be invoked by a user via a keystroke combination while navigating the web. Once launched, Interactive SIGHT first provides a brief summary that conveys the underlying intention of a bar chart along with the chart's most significant and salient features, and then produces history-aware follow-up responses to provide further information about the chart upon request from the user. We present two user studies that were conducted with sighted and visually impaired users to determine how effective the initial summary and follow-up responses are in conveying the informational content of bar charts, and to evaluate how easy it is to use the system interface. The evaluation results are promising and indicate that the system responses are well-structured and enable visually impaired users to answer key questions about bar charts in an easy-to-use manner. Post-experimental interviews revealed that visually impaired participants were very satisfied with
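
    As a rough illustration of the kind of initial summary such a system generates from a bar chart's underlying data, the sketch below produces a one-sentence overview of the most salient features; the real system's intention recognition is far more sophisticated, and all names here are invented:

    ```python
    def summarize_bar_chart(title, labels, values):
        hi = max(range(len(values)), key=values.__getitem__)
        lo = min(range(len(values)), key=values.__getitem__)
        return (f"The chart '{title}' shows {len(values)} bars. "
                f"{labels[hi]} has the highest value ({values[hi]}) and "
                f"{labels[lo]} the lowest ({values[lo]}).")

    print(summarize_bar_chart("Quarterly sales",
                              ["Q1", "Q2", "Q3", "Q4"],
                              [12, 18, 9, 15]))
    ```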

  6. Textual analysis and machine learning: Crack unstructured data in finance and accounting

    Directory of Open Access Journals (Sweden)

    Li Guo

    2016-09-01

    Full Text Available In finance and accounting, textual analysis has recently become popular relative to the quantitative methods traditionally used, despite being a substantially less precise approach. In an overview of the literature, we describe various methods used in textual analysis, especially machine learning. By comparing their classification performance, we find that neural networks outperform many other machine learning techniques in classifying news categories. Moreover, we highlight that many challenges remain for the future development of textual analysis, such as identifying multiple objects within one single document.
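
    A minimal news-category classifier in the spirit described (tiny invented corpus; the studies surveyed use far larger data and deeper networks): TF-IDF features feed scikit-learn's small neural network:

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline

    docs = ["shares fell after weak earnings report",
            "central bank raises interest rates again",
            "quarterly earnings beat analyst estimates",
            "regulator fines bank over disclosure failures"]
    labels = ["earnings", "macro", "earnings", "regulation"]

    clf = make_pipeline(TfidfVectorizer(),
                        MLPClassifier(max_iter=500, random_state=0))
    clf.fit(docs, labels)                          # train on the toy corpus
    print(clf.predict(["bank earnings beat estimates"]))
    ```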

  7. Self-plagiarism and textual recycling: legitimate forms of research misconduct.

    Science.gov (United States)

    Bruton, Samuel V

    2014-01-01

    The concept of self-plagiarism frequently elicits skepticism and generates confusion in the research ethics literature, and the ethical status of what is often called "textual recycling" is particularly controversial. I argue that, in general, self-plagiarism is unethical because it is deceptive and dishonest. I then distinguish several forms of it and argue against various common rationalizations for textual recycling. I conclude with a discussion of two instances of textual recycling, distinguishing them in terms of their ethical seriousness but concluding that both are ethically problematic.

  8. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.
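
    As a simplified illustration of comparing two structural annotations, the sketch below computes pairwise frame agreement between two "flat" segmentations; the metric derived in the paper generalizes this style of comparison holistically across hierarchy levels (the data here is invented):

    ```python
    import numpy as np

    def frame_labels(segments, labels, n_frames):
        out = np.empty(n_frames, dtype=object)
        for (start, end), lab in zip(segments, labels):
            out[start:end] = lab                  # label every frame in the segment
        return out

    def pairwise_agreement(a, b):
        same_a = a[:, None] == a[None, :]         # frame pairs grouped together by A
        same_b = b[:, None] == b[None, :]
        return (same_a == same_b).mean()          # fraction of pairs both agree on

    ann1 = frame_labels([(0, 50), (50, 100)], ["A", "B"], 100)
    ann2 = frame_labels([(0, 40), (40, 100)], ["A", "B"], 100)
    print(pairwise_agreement(ann1, ann2))
    ```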

  9. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  10. The Japanese Amateur Textual Production Scene: Activities and Participation

    Directory of Open Access Journals (Sweden)

    Alvaro David Hernández Hernández

    2016-12-01

    Full Text Available Animation, manga (Japanese comic books) and video games are some of the most popular media in Japan. The huge community of fans and amateur creators that gathers twice a year at the massive event called Comic Market is one expression of this popularity. In this paper I focus on the Japanese culture of amateur manga, anime or other derivative texts, and present an analysis of the way in which members relate to media texts and to other members, in order to reveal two different orientations towards action within this culture: one that centers on individual activities, and the other on collective participation. To do this, I focus first on the Japanese amateur culture that is often called dōjin culture to find two basic perspectives. One regards this culture as forming communities or as based on social interaction, and the other denies commonality in the relationships between members and stresses individual drive. Then, I focus on a similar distinction that shapes the main discourses concerning Japanese subculture and otaku, both categories within which dōjin culture can be classified. Here I pay particular attention to two different orientations towards the value of media texts, which provide a reference for understanding the aforementioned opposition between commonality and individuality. In my conclusions, I suggest that this amateur culture can be regarded as an institution of textual appropriation, shaped by two orientations: activities and participation.

  11. de análisis textual cinematográfico

    Directory of Open Access Journals (Sweden)

    Fernando Vizcarra

    2011-01-01

    Full Text Available A textual analysis of the futurist film Blade Runner (1982), by Ridley Scott, is presented. Above all, it seeks to investigate how the radicalized effects of modernity are represented in cinematic discourse and, specifically, in the filmic construction of identities. What elements of the representation of the social are mobilized in this film to express, through futurist fiction, the critical consequences of modernity in contemporary societies? Certain components of modernity and postmodernity (treated here as multiple modernities), such as the expansion of rationality and the consequent crisis of meaning, the rearticulations of time and space, the creation of environments of trust and risk, and processes of hybridization, are explored in this exercise in light of the following identity codes revealed in Blade Runner: (1) mortality, (2) memory and power, (3) anomie and meaning, (4) solidarity and (5) irony.

  12. Qur’an-related Intertextuality: Textual Potentiation in Translation

    Directory of Open Access Journals (Sweden)

    Aladdin Al-Kharabsheh

    2017-09-01

    Full Text Available Qur’an-related intertextuality, envisaged as an enriching communicative act both monolingually and interlingually, represents a case of semantic complexity that is bound to present formidable translation challenges. Drawing on Derrida’s (1977) dichotomy of iterability/citationality, Kristeva’s (1980) vertical intertextuality, Fairclough’s (1992a; 1992b; 1995 & 2011) manifest intertextuality, and Bakhtin’s (1986) double voicing or re-accentuation, the study argues that Qur’an-related intertextuality is conducive to conceptual densities, the ‘harnessing’ of which requires ‘mobilizing’ translation strategies that go beyond lexicographical equivalence (Venuti 2009) to establish intertextual relations relevant to the form and theme of the foreign text. To resolve the arising translation problems, the study proposes two synthetic approaches: the gist-paratextual and the gist-exegetical. Translation skopos has been found to be central to the production and reception of intertextuality and to determining which of the two proposed synthetic approaches to operationalize. Finally, the analysis shows that the Qur’an proved to be a virtual breeding ground for textual dynamism and potentiation.

  13. Taking Stock: Marie Nimier’s Textual Cabinet of Curiosities

    Directory of Open Access Journals (Sweden)

    Adrienne Angelo

    2014-01-01

    Full Text Available In many life-writing projects, the seemingly innocuous description of heteroclite objects and how those objects are stored and recalled in fact plays an important role in demonstrating their importance to the process of memory work. At once the lingering traces of one’s past and also an aggregation of stories evoked by an examination of them, these curios focus attention on the relationship between the individual and the storage of memories. This article will focus on certain collectibles, collections and collectors that appear throughout the fictional, autobiographical and autofictional world that Marie Nimier has scripted to date. This textual cabinet of curiosities and the act of collecting more generally serve as a trope to connect memory with materiality, despite the numerous narrative voices that Nimier assumes—voices that move from a first-person “Marie Nimier” to an unnamed, although clearly identifiable first-person and even float between genders. Despite this nominal and narrational fluidity, objects function to guarantee recognition, both for the reader, and, especially, for the author herself. What is at stake in this intertextual assemblage of objects is not only the roles that they play in allowing the narrator to revisit past traumas and loss, but also in connecting the author’s presence to other, more fictionalized voices that above all signify the primacy of life-writing in her corpus.

  14. Extracting the Textual and Temporal Structure of Supercomputing Logs

    Energy Technology Data Exchange (ETDEWEB)

    Jain, S; Singh, I; Chandra, A; Zhang, Z; Bronevetsky, G

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format makes it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
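
    A minimal sketch of the syntactic-template idea (assumed masking rules, invented log lines): variable fields such as numbers and hexadecimal identifiers are masked so that messages with the same structure collapse onto one template, which then serves as a message class:

    ```python
    import re
    from collections import Counter

    def template(msg):
        msg = re.sub(r"0x[0-9a-f]+", "<HEX>", msg)  # mask hex ids first
        return re.sub(r"\d+", "<NUM>", msg)         # then mask decimal fields

    logs = [
        "node 12 temperature 81C exceeds threshold",
        "node 7 temperature 79C exceeds threshold",
        "ECC error at 0x7f3a on node 12",
        "ECC error at 0x1b20 on node 3",
    ]
    for tpl, count in Counter(template(m) for m in logs).items():
        print(count, tpl)    # two templates, each covering two raw messages
    ```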

  15. New Approach of Feature Extraction Method Based on the Raw Form and his Skeleton for Gujarati Handwritten Digits using Neural Networks Classifier

    Directory of Open Access Journals (Sweden)

    K. Moro

    2014-12-01

    Full Text Available This paper presents an optical character recognition (OCR) system for Gujarati handwritten digits. One may find a great deal of work for Latin, Arabic, Chinese, etc., but Gujarati is a language for which hardly any work is traceable, especially for handwritten characters. In this work we propose a method of feature extraction based on the raw form of the character and its skeleton, and we show the advantage of using this method over the other approaches mentioned in this article.
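
    A hedged sketch of deriving the two feature views (raw form plus skeleton), assuming scikit-image is available; the paper's normalization and classifier details may differ:

    ```python
    import numpy as np
    from skimage.morphology import skeletonize

    digit = np.zeros((16, 16), dtype=bool)
    digit[3:13, 6:10] = True                 # a crude vertical stroke as a stand-in

    skeleton = skeletonize(digit)            # one-pixel-wide medial axis
    features = np.concatenate([digit.ravel(), skeleton.ravel()]).astype(float)
    print(features.shape)                    # raw form + skeleton -> (512,)
    ```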

  16. Classifying unstructured textual data using the Product Score Model: an alternative text mining algorithm

    NARCIS (Netherlands)

    He, Qiwei; Veldkamp, Bernard P.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Unstructured textual data such as students’ essays and life narratives can provide helpful information in educational and psychological measurement, but often contain irregularities and ambiguities, which create difficulties in analysis. Text mining techniques that seek to extract useful

  17. The Text's the Thing: Using (Neglected) Issues of Textual Scholarship to Help Students Reimagine Shakespeare

    Science.gov (United States)

    Parsons, Scott

    2009-01-01

    Do individuals know what words Shakespeare actually wrote? Exploring these issues can yield dramatic interest. With references to Shakespeare's Quartos and Folios, the author examines key textual issues and discrepancies in classroom studies of "Hamlet." (Contains 8 notes.)

  18. Integrating textual and model-based process descriptions for comprehensive process search

    NARCIS (Netherlands)

    Leopold, Henrik; van der Aa, Han; Pittke, Fabian; Raffel, Manuel; Mendling, Jan; Reijers, Hajo A.

    2016-01-01

    Documenting business processes using process models is common practice in many organizations. However, not all process information is best captured in process models. Hence, many organizations complement these models with textual descriptions that specify additional details. The problem with this

  19. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure the functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our
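
    For context, a sketch of the kind of per-module "enrichment" computation that conventional annotation systems rely on, using a hypergeometric test (all counts invented); the paper's HMS goes beyond this by scoring coherence across the hierarchy:

    ```python
    from scipy.stats import hypergeom

    N = 6000    # genes in the genome
    K = 120     # genes annotated with a given GO term
    n = 30      # genes in the module
    k = 9       # module genes carrying the term

    # probability of seeing at least k term-carrying genes in the module by chance
    p_value = hypergeom.sf(k - 1, N, K, n)
    print(f"enrichment p-value: {p_value:.2e}")
    ```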

  20. La superación a los docentes para la producción textual

    Directory of Open Access Journals (Sweden)

    Rafaela Hinojosa Legrá

    2017-11-01

    Full Text Available Textual production is one of the most affected components from the earliest grades in the teaching of the mother tongue. Among the causes is the deficient preparation of teachers to provide adequate treatment; this work offers avenues for professional development with regard to textual production.

  1. An analysis for understanding the process of textual deconstruction as a motivator for learning

    Directory of Open Access Journals (Sweden)

    Ana Delia Barrera Jiménez

    2010-03-01

    Full Text Available The present article aims to analyze the potential of the process of textual comprehension and construction for developing motivation towards learning in teacher trainees for pre-university education. To this end, it argues, first, for understanding the dynamic relationship established between the processes of textual meaning-attribution and production and the motivational process, which provides the indispensable condition for promoting work with the text in all subjects of the curriculum.

  2. Semantic annotation in biomedicine: the current landscape.

    Science.gov (United States)

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general-purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
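
    A minimal sketch of the dictionary-lookup core of a semantic annotator: entity mentions found in text are linked to concept identifiers (the two CUIs below are illustrative UMLS-style codes; a real annotator adds tokenization, disambiguation and much larger vocabularies):

    ```python
    lexicon = {
        "myocardial infarction": "C0027051",   # illustrative UMLS-style CUIs
        "aspirin": "C0004057",
    }

    def annotate(text):
        found = []
        lowered = text.lower()
        for mention, cui in lexicon.items():
            start = lowered.find(mention)
            if start != -1:                    # naive exact-match lookup
                found.append((mention, cui, start, start + len(mention)))
        return found

    text = "Patient with myocardial infarction was started on aspirin."
    for mention, cui, s, e in annotate(text):
        print(f"{mention!r} -> {cui} [{s}:{e}]")
    ```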

  3. Experiential thinking in creationism--a textual analysis.

    Science.gov (United States)

    Nieminen, Petteri; Ryökäs, Esko; Mustonen, Anne-Mari

    2015-01-01

    Creationism is a religiously motivated worldview in denial of biological evolution that has been very resistant to change. We performed a textual analysis by examining creationist and pro-evolutionary texts for aspects of "experiential thinking", a cognitive process different from scientific thought. We observed characteristics of experiential thinking as follows: testimonials (present in 100% of sampled creationist texts), such as quotations, were a major form of proof. Confirmation bias (100% of sampled texts) was represented by ignoring or dismissing information that would contradict the creationist hypothesis. Scientifically irrelevant or flawed information was re-interpreted as relevant for the falsification of evolution (75-90% of sampled texts). Evolutionary theory was associated with moral issues by demonizing scientists and linking evolutionary theory to atrocities (63-93% of sampled texts). Pro-evolutionary rebuttals of creationist claims also contained testimonials (93% of sampled texts) and referred to moral implications (80% of sampled texts) but displayed lower prevalences of stereotypical thinking (47% of sampled texts), confirmation bias (27% of sampled texts) and pseudodiagnostics (7% of sampled texts). The aspects of experiential thinking could also be interpreted as argumentative fallacies. Testimonials lead, for instance, to ad hominem and appeals to authorities. Confirmation bias and simplification of data give rise to hasty generalizations and false dilemmas. Moral issues lead to guilt by association and appeals to consequences. Experiential thinking and fallacies can contribute to false beliefs and the persistence of the claims. We propose that science educators would benefit from the systematic analysis of experiential thinking patterns and fallacies in creationist texts and pro-evolutionary rebuttals in order to concentrate on scientific misconceptions instead of the scientifically irrelevant aspects of the creationist-evolutionist debate.

  4. Experiential Thinking in Creationism—A Textual Analysis

    Science.gov (United States)

    Nieminen, Petteri; Ryökäs, Esko; Mustonen, Anne-Mari

    2015-01-01

    Creationism is a religiously motivated worldview in denial of biological evolution that has been very resistant to change. We performed a textual analysis by examining creationist and pro-evolutionary texts for aspects of “experiential thinking”, a cognitive process different from scientific thought. We observed characteristics of experiential thinking as follows: testimonials (present in 100% of sampled creationist texts), such as quotations, were a major form of proof. Confirmation bias (100% of sampled texts) was represented by ignoring or dismissing information that would contradict the creationist hypothesis. Scientifically irrelevant or flawed information was re-interpreted as relevant for the falsification of evolution (75–90% of sampled texts). Evolutionary theory was associated with moral issues by demonizing scientists and linking evolutionary theory to atrocities (63–93% of sampled texts). Pro-evolutionary rebuttals of creationist claims also contained testimonials (93% of sampled texts) and referred to moral implications (80% of sampled texts) but displayed lower prevalences of stereotypical thinking (47% of sampled texts), confirmation bias (27% of sampled texts) and pseudodiagnostics (7% of sampled texts). The aspects of experiential thinking could also be interpreted as argumentative fallacies. Testimonials lead, for instance, to ad hominem and appeals to authorities. Confirmation bias and simplification of data give rise to hasty generalizations and false dilemmas. Moral issues lead to guilt by association and appeals to consequences. Experiential thinking and fallacies can contribute to false beliefs and the persistence of the claims. We propose that science educators would benefit from the systematic analysis of experiential thinking patterns and fallacies in creationist texts and pro-evolutionary rebuttals in order to concentrate on scientific misconceptions instead of the scientifically irrelevant aspects of the creationist

  5. Experiential thinking in creationism--a textual analysis.

    Directory of Open Access Journals (Sweden)

    Petteri Nieminen

    Full Text Available Creationism is a religiously motivated worldview in denial of biological evolution that has been very resistant to change. We performed a textual analysis by examining creationist and pro-evolutionary texts for aspects of "experiential thinking", a cognitive process different from scientific thought. We observed characteristics of experiential thinking as follows: testimonials (present in 100% of sampled creationist texts), such as quotations, were a major form of proof. Confirmation bias (100% of sampled texts) was represented by ignoring or dismissing information that would contradict the creationist hypothesis. Scientifically irrelevant or flawed information was re-interpreted as relevant for the falsification of evolution (75-90% of sampled texts). Evolutionary theory was associated with moral issues by demonizing scientists and linking evolutionary theory to atrocities (63-93% of sampled texts). Pro-evolutionary rebuttals of creationist claims also contained testimonials (93% of sampled texts) and referred to moral implications (80% of sampled texts) but displayed lower prevalences of stereotypical thinking (47% of sampled texts), confirmation bias (27% of sampled texts) and pseudodiagnostics (7% of sampled texts). The aspects of experiential thinking could also be interpreted as argumentative fallacies. Testimonials lead, for instance, to ad hominem and appeals to authorities. Confirmation bias and simplification of data give rise to hasty generalizations and false dilemmas. Moral issues lead to guilt by association and appeals to consequences. Experiential thinking and fallacies can contribute to false beliefs and the persistence of the claims. We propose that science educators would benefit from the systematic analysis of experiential thinking patterns and fallacies in creationist texts and pro-evolutionary rebuttals in order to concentrate on scientific misconceptions instead of the scientifically irrelevant aspects of the creationist

  6. Utility-preserving privacy protection of textual healthcare documents.

    Science.gov (United States)

    Sánchez, David; Batet, Montserrat; Viejo, Alexandre

    2014-12-01

    The adoption of ITs by medical organisations makes possible the compilation of large amounts of healthcare data, which quite often need to be released to third parties for research or business purposes. Much of this data is of a sensitive nature, because it may include patient-related documents such as electronic healthcare records. In order to protect the privacy of individuals, several pieces of legislation on healthcare data management, which state the kind of information that should be protected, have been defined. Traditionally, to comply with current legislation, a manual redaction process is applied to patient-related documents in order to remove or black out sensitive terms. This process is costly and time-consuming and has the undesired side effect of severely reducing the utility of the released content. Automatic methods available in the literature usually propose ad hoc solutions that are limited to protecting specific types of structured information (e.g. e-mail addresses, social security numbers, etc.); as a result, they are hardly applicable to the sensitive entities stated in current regulations that do not present those structural regularities (e.g. diseases, symptoms, treatments, etc.). To tackle these limitations, in this paper we propose an automatic sanitisation method for textual medical documents (e.g. electronic healthcare records) that is able to protect, regardless of their structure, sensitive entities (e.g. diseases) and also those semantically related terms (e.g. symptoms) that may disclose the former ones. Contrary to redaction schemes based on term removal, our approach improves the utility of the protected output by replacing sensitive terms with appropriate generalisations retrieved from several medical and general-purpose knowledge bases. Experiments conducted on highly sensitive documents and in coherency with current regulations on healthcare data privacy show promising results in terms of the practical privacy and utility of the
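
    A toy sketch of the generalisation idea (hand-made taxonomy, invented record): sensitive terms are replaced with broader concepts rather than blacked out, which preserves more utility than removal-based redaction:

    ```python
    generalisations = {                      # hand-made toy taxonomy
        "schizophrenia": "mental disorder",
        "fluoxetine": "antidepressant drug",
    }

    def sanitise(text):
        for term, broader in generalisations.items():
            text = text.replace(term, broader)   # generalise, don't black out
        return text

    record = "Patient diagnosed with schizophrenia, prescribed fluoxetine."
    print(sanitise(record))
    # Patient diagnosed with mental disorder, prescribed antidepressant drug.
    ```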

  7. Pipeline to upgrade the genome annotations

    Directory of Open Access Journals (Sweden)

    Lijin K. Gopi

    2017-12-01

    Full Text Available The current era of functional genomics is enriched with good-quality draft genomes and annotations for many thousands of species and varieties, with the support of advancements in next generation sequencing technologies (NGS). Around 25,250 genomes of organisms from various kingdoms have been submitted to the NCBI genome resource to date. Each of these genomes was annotated using the tools and knowledge-bases that were available at the time of annotation. It is obvious that these annotations can be improved if the same genome is annotated using improved tools and knowledge-bases. Here we present a new genome annotation pipeline, strengthened with various tools and knowledge-bases, that is capable of producing better-quality annotations from the consensus of predictions from different tools. This resource also performs various additional annotations, apart from the usual gene predictions and functional annotations, which involve SSRs, novel repeats, paralogs, proteins with transmembrane helices, signal peptides, etc. This new annotation resource is designed to evaluate and integrate all the predictions together to resolve overlaps and ambiguities in the boundaries. One of the important highlights of this resource is its capability of predicting the phylogenetic relations of repeats using evolutionary trace analysis and orthologous gene clusters. We also present a case study of the pipeline in which we upgrade the genome annotation of Nelumbo nucifera (sacred lotus). It is demonstrated that this resource is capable of producing an improved annotation for a better understanding of the biology of various organisms.
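
    As an illustration of one consensus step such a pipeline might perform (coordinates and tool names invented), the sketch below keeps only predicted gene intervals that are supported by at least two tools:

    ```python
    predictions = {                            # invented gene-model intervals
        "toolA": [(100, 900), (1500, 2300)],
        "toolB": [(120, 880), (5000, 5600)],
        "toolC": [(1490, 2310), (5050, 5590)],
    }

    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    flat = [(iv, tool) for tool, ivs in predictions.items() for iv in ivs]
    consensus = []
    for iv, tool in flat:
        support = {t for jv, t in flat if overlaps(iv, jv)}   # tools agreeing here
        if len(support) >= 2 and not any(overlaps(iv, c) for c in consensus):
            consensus.append(iv)
    print(sorted(consensus))   # [(100, 900), (1500, 2300), (5000, 5600)]
    ```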

  8. BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments.

    Science.gov (United States)

    López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F

    2013-07-01

    Automatic term annotation of biomedical documents and external information linking are becoming necessary prerequisites in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user-friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  9. Annotating temporal information in clinical narratives.

    Science.gov (United States)

    Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

    2013-12-01

    Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Estimating the annotation error rate of curated GO database sequence annotations

    Directory of Open Access Journals (Sweden)

    Brown Alfred L

    2007-05-01

    Full Text Available Abstract Background Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied it to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST-matched sequences. Results We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without the use of sequence-similarity-based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%. Conclusion While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, so designers of these systems should consider avoiding ISS annotations where possible, and their predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high-quality source of information.
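
    The error-injection idea is simple enough to sketch numerically: corrupt annotations at known rates, measure how the precision of BLAST-based annotation transfer degrades, and extrapolate the fitted line to estimate the pre-existing error rate. The numbers below are made-up placeholders chosen only to make the arithmetic visible; they are not the paper's data:

```python
# A minimal numerical sketch of the error-injection method described above.
import numpy as np

injected_error = np.array([0.00, 0.05, 0.10, 0.20, 0.40])        # known added error rates
observed_precision = np.array([0.70, 0.665, 0.63, 0.56, 0.42])   # precision of BLAST-transferred annotations

# Fit precision = a * injected_error + b.
a, b = np.polyfit(injected_error, observed_precision, 1)

# Under this linear model, the shortfall of the zero-injection intercept b
# from the precision p_max achievable with error-free annotations gives a
# rough estimate of the pre-existing (curated) error rate.
p_max = 0.90  # assumed precision with error-free annotations
baseline_error = (p_max - b) / (-a)
print(f"slope={a:.3f}, intercept={b:.3f}, estimated baseline error={baseline_error:.2%}")
# -> estimated baseline error around 29% with these placeholder numbers
```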

  11. ANNOTATION SUPPORTED OCCLUDED OBJECT TRACKING

    Directory of Open Access Journals (Sweden)

    Devinder Kumar

    2012-08-01

    Full Text Available Tracking occluded objects at different depths has become an extremely important component of study for any video sequence, with wide applications in object tracking, scene recognition, coding, video editing and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth, further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied to 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at depths of 60 cm, 80 cm and 100 cm respectively. Another experiment is performed on tracking humans at similar depths to authenticate the results. The paper also computes the frame-by-frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and further correct the error, leading to flawless tracking. This can be of great interest to computer scientists while designing surveillance systems.

  12. Effects of pictures and textual arguments in sun protection public service announcements.

    Science.gov (United States)

    Boer, Henk; Ter Huurne, Ellen; Taal, Erik

    2006-01-01

    The effect of public service announcements aimed at promoting primary prevention of skin cancer may be limited by superficial cognitive processing. The use of pictures and textual arguments in sun protection public service announcements was evaluated for its potentially beneficial effects on judgment, cognitive processing and persuasiveness. In a 2 x 2 factorial experimental design, individuals were shown public service announcements that advocated the advantages of sun protection measures in different versions, in which a picture was present or not present and a textual argument was present or not present. The 159 participants were randomly assigned to one of four conditions. In each condition, participants were shown 12 different public service announcements designed according to the condition. Participants judged each public service announcement on attractiveness, credibility, clarity of communication and the required amount of reflection. After the judgment task, they completed a questionnaire to assess knowledge, perceived advantages and disadvantages of sun protection and intended use of sun protection measures. Pictures enhanced attractiveness, but diminished comprehension. Textual arguments enhanced attractiveness, credibility and comprehension. Pictures as well as textual arguments increased knowledge of sun protection measures. Pictures and textual arguments in public service announcements positively influence the individual's perception of the advantages of sun protection methods and of their adoption.

  13. Analyzing textual databases using data mining to enable fast product development processes

    International Nuclear Information System (INIS)

    Menon, Rakesh; Tong, Loh Han; Sathiyakeerthi, S.

    2005-01-01

    Currently, companies active in the design of high-tech products are confronted with a number of often conflicting challenges. Not only do they have to develop increasingly complex products in ever-shorter amounts of time, but they also have to produce them cheaply for local as well as global markets. In order to manage these conflicting trends adequately, as this paper will demonstrate, companies need to have high-quality information at the right moment and at the right location. This serves as the motivation for this paper, in which various databases within the product life cycle are examined. Because unanticipated information plays an increasingly dominant role, especially in highly innovative business processes, we focus our attention on textual databases, since they offer the best possibility of handling unanticipated, free-formatted information. Their treatment in the literature has also thus far been scant. As will be demonstrated in this paper, these databases hold a huge potential of valuable information. Further, we present data mining as an analysis tool for these textual databases, so that information can be extracted from them quickly and accessed at the right time when needed. We also highlight some of the difficulties faced when analyzing textual databases. Thus, with the right information from the textual databases and the tools to deliver this information at the right time, companies would be able to shorten development times and hence gain a competitive edge.

  14. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation…

  15. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph.D. …

  16. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods…

  17. Black English Annotations for Elementary Reading Programs.

    Science.gov (United States)

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  18. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  19. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  20. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    Directory of Open Access Journals (Sweden)

    Qiandong Zeng

    2010-10-01

    Full Text Available Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  1. Essential Requirements for Digital Annotation Systems

    Directory of Open Access Journals (Sweden)

    ADRIANO, C. M.

    2012-06-01

    Full Text Available Digital annotation systems are usually based on partial scenarios and arbitrary requirements. Accidental and essential characteristics are usually mixed in non-explicit models. Documents and annotations are linked together accidentally, according to the current technology, allowing for the development of disposable prototypes but not for the support of non-functional requirements such as extensibility, robustness and interactivity. In this paper we perform a careful analysis of the concept of annotation, studying the scenarios supported by digital annotation tools. We also derive essential requirements based on a classification of annotation systems applied to existing tools. The analysis performed and the proposed classification can be applied and extended to other types of collaborative systems.

  2. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Werner

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as a benchmark) as well as tedious preparatory work to generate the sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. MIPS-BFAB is available at http://mips.gsf.de/proj/bfab

  3. Interoperable Multimedia Annotation and Retrieval for the Tourism Sector

    NARCIS (Netherlands)

    Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.

    2015-01-01

    The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies.

  4. Ion implantation: an annotated bibliography

    International Nuclear Information System (INIS)

    Ting, R.N.; Subramanyam, K.

    1975-10-01

    Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping, such as thermal diffusion and epitaxy, in view of its advantages such as a high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation, including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented, have also been listed separately.

  5. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Do you want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide new opportunities through managed, online group annotation that leverages the ubiquity of internet access while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open-platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts to identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to posted lecture slides, supporting real-time notetaking.

  6. Concept annotation in the CRAFT corpus.

    Science.gov (United States)

    Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E

    2012-07-09

    Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

  7. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to the little functional annotation associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays and a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a great amount of time searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their datasets. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray datasets into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via the AgBase website and

  8. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements. The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine the onset of head movements and to suggest which kind of head movement is taking place.
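
    For intuition, velocity and acceleration annotations of the kind this plugin produces can be derived from tracked positions by numerical differentiation. The sketch below uses a synthetic track and an assumed frame rate; it is not the ANVIL plugin's code:

```python
# Illustrative sketch: deriving 2-D velocity and acceleration from tracked
# positions sampled at a fixed frame rate (assumed, not from the plugin).
import numpy as np

fps = 25.0                                         # assumed video frame rate
t = np.arange(0, 2, 1 / fps)                       # 2 seconds of samples
positions = np.column_stack([np.sin(t), 0.5 * t])  # synthetic (x, y) head track

velocity = np.gradient(positions, 1 / fps, axis=0)     # units/s in x and y
acceleration = np.gradient(velocity, 1 / fps, axis=0)  # units/s^2

speed = np.linalg.norm(velocity, axis=1)
peak = int(np.argmax(speed))   # a crude cue for when the head moves fastest
print(f"peak head speed {speed[peak]:.2f} at t={peak / fps:.2f}s")
```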

  9. A Textual Feedback Tool for Empowering Participants in Usability and UX Evaluations

    DEFF Research Database (Denmark)

    Sivaji, Ashok; Clemmensen, Torkil; Nielsen, Søren Feodor

    2017-01-01

    …-educated, but low-status, users in UX evaluations in countries and contexts with high power distances. The proposed tool contributes to the HCI community's pool of localized UX evaluation tools. We evaluate the tool with 40 users from two socio-economic groups in real-life UX usability evaluation settings in Malaysia. The results indicate that the Textual Feedback tool may help participants to give their thoughts in UX evaluation in high power distance contexts. In particular, the Textual Feedback tool helps high-status females and low-status males express more UX problems than they can with traditional CTA alone. We found that classic concurrent think-aloud UX evaluation works fine in high power contexts, but only with the addition of Textual Feedback to mitigate the effects of socio-economic status in certain user groups. We suggest that future research on UX evaluation look more into how to empower…

  10. Combining textual and visual information for image retrieval in the medical domain.

    Science.gov (United States)

    Gkoufas, Yiannis; Morou, Anna; Kalamboukis, Theodore

    2011-01-01

    In this article we have assembled the experience obtained from our participation in the imageCLEF evaluation task over the past two years. We have explored the use of linear combinations for image retrieval by combining visual and textual sources of images. From our experiments we conclude that a mixed retrieval technique that applies textual and visual retrieval in an interchangeably repeated manner improves performance while overcoming the scalability limitations of visual retrieval. In particular, the mean average precision (MAP) increased from 0.01 to 0.15 and 0.087 for the 2009 and 2010 data, respectively, when content-based image retrieval (CBIR) is performed on the top 1000 results from textual retrieval based on natural language processing (NLP).
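
    The mixed strategy, cheap textual retrieval over the full collection followed by expensive visual (CBIR) scoring of only the textual top-k, can be sketched as follows. The toy scores, the linear weight alpha and the visual-scorer callable are assumptions for illustration, not the authors' retrieval models:

```python
# Schematic, runnable sketch of textual retrieval followed by CBIR re-ranking.
def mixed_retrieval(text_scores, visual_score_fn, top_k=1000, alpha=0.6):
    """text_scores: {image_id: textual relevance}; visual_score_fn: callable."""
    # 1) Scalable textual retrieval selects the candidate pool.
    candidates = sorted(text_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 2) Expensive visual scoring runs only on those candidates, then the two
    #    evidence sources are linearly combined.
    rescored = [(img, alpha * ts + (1 - alpha) * visual_score_fn(img))
                for img, ts in candidates]
    return sorted(rescored, key=lambda kv: kv[1], reverse=True)

# Toy usage: the lambda stands in for a real CBIR similarity function.
ranking = mixed_retrieval(
    {"img1": 0.9, "img2": 0.7, "img3": 0.2},
    visual_score_fn=lambda img: {"img1": 0.1, "img2": 0.8, "img3": 0.9}[img],
    top_k=2)
print(ranking)  # img2 overtakes img1 once visual evidence is added
```

    Restricting CBIR to the textual candidates is what sidesteps the scalability limits of pure visual retrieval noted in the abstract.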

  11. Semantic annotation of consumer health questions.

    Science.gov (United States)

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to the MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded the highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than in CHQA-email. Pairwise inter-annotator agreement proved most
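
    Pairwise inter-annotator agreement of the kind reported here is commonly measured with Cohen's kappa, which corrects raw agreement for chance. A self-contained sketch with invented labels (not CHQA data):

```python
# Cohen's kappa for two annotators labelling the same items.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled alike.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["treatment", "information", "treatment", "cause", "treatment"]
b = ["treatment", "information", "cause", "cause", "treatment"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # -> kappa = 0.69
```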

  12. Viagem, identidade e memória textual em Antonio Tabucchi

    OpenAIRE

    Melissa Cobra Torre

    2012-01-01

    This dissertation reflects on travel, identity, play and textual memory in the work of Antonio Tabucchi, focusing on the novel 'Noturno Indiano' and the short stories 'A frase a seguir é falsa. A frase anterior é verdadeira', 'Il gioco del rovescio', 'Piccoli equivoci senza importanza' and 'I treni che vanno a Madras'. Its objectives are to rethink the theme of travel in 'Noturno Indiano', to address the question of identity in that novel, and to discuss textual play in 'Noturno Indiano'…

  13. Comparison of CTA and Textual Feedback in Usability Testing for Malaysian Users

    DEFF Research Database (Denmark)

    Sivaji, Ashok; Clemmensen, Torkil; Nielsen, Søren Feodor

    Usability moderators have found that the concurrent think-aloud (CTA) method has cultural limitations that impact usability testing with Malaysian users. This gives rise to a proposed new method called textual feedback. The research question is whether there are any differences in usability defects found with the concurrent think-aloud (CTA) method (Condition 2) and the textual feedback method (Condition 1) within the same group of Malaysian users. A pair-wise t-test was used, whereby users performed usability tasks using both methods. Results reveal that we can reject

  14. THE POTENCIALITY OF TRANSMEDIA NARRATIVE AND THE PRACTICE OF READING AND TEXTUAL PRODUCTION

    Directory of Open Access Journals (Sweden)

    Daniella de Jesus Lima

    2017-08-01

    Full Text Available In this article we reflect on digital culture and its implications for the field of education, emphasizing the characteristics of transmedia narrative. The objective was to discuss the relationship between transmediation and education, based on a methodology for teaching textual genres with students of the social communication (journalism) course of a private university in the Brazilian northeast. We also used bibliographic research and participant observation as methodologies. As a result, we conclude that the elements of transmedia narrative in the students' textual production presented advantages and improvements for the educational process.

  15. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
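
    The Memento step can be sketched as a proof of concept: given an annotation's target URI and creation datetime, datetime negotiation against a TimeGate yields the archived version on which the annotation should be re-anchored. The TimeGate URL pattern below follows the public Time Travel service and is an assumption; error handling is omitted:

```python
# Proof-of-concept sketch of Memento datetime negotiation (RFC 7089 style):
# the TimeGate redirects to the memento closest to the requested datetime.
import requests

def resolve_annotation_target(target_uri, created,
                              timegate="http://timetravel.mementoweb.org/timegate/"):
    resp = requests.head(
        timegate + target_uri,
        headers={"Accept-Datetime": created},  # the annotation's creation time
        allow_redirects=False,
    )
    # Per the Memento protocol, the best memento's URI arrives via Location.
    return resp.headers.get("Location")

memento = resolve_annotation_target("http://example.org/page",
                                    "Tue, 11 May 2010 00:00:00 GMT")
print(memento)  # archived representation on which to display the annotation
```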

  16. COGNATE: comparative gene annotation characterizer.

    Science.gov (United States)

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https

  17. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

    We present two new NER datasets for Twitter: a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER-annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets; (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible…

  18. Variations in Textualization: A Cross-Generic and Cross-Disciplinary Study, Implications for Readability of the Academic Discourse

    Science.gov (United States)

    Bonabi, Mina Abbasi; Lotfipour-Saedi, Kazem; Hemmati, Fatemeh; Jafarigohar, Manoochehr

    2018-01-01

    According to discoursal views on language, variations in textualization strategies are always sociocontextually motivated and never happen at random. The textual forms employed in a text, along with many other discoursal and contextual factors, could certainly affect the readability of the text, making it more or less processable for the same…

  19. Annotations to quantum statistical mechanics

    CERN Document Server

    Kim, In-Gee

    2018-01-01

    This book is a rewritten and annotated version of Leo P. Kadanoff and Gordon Baym’s lectures that were presented in the book Quantum Statistical Mechanics: Green’s Function Methods in Equilibrium and Nonequilibrium Problems. The lectures were devoted to a discussion on the use of thermodynamic Green’s functions in describing the properties of many-particle systems. The functions provided a method for discussing finite-temperature problems with no more conceptual difficulty than ground-state problems, and the method was equally applicable to boson and fermion systems and equilibrium and nonequilibrium problems. The lectures also explained nonequilibrium statistical physics in a systematic way and contained essential concepts on statistical physics in terms of Green’s functions with sufficient and rigorous details. In-Gee Kim thoroughly studied the lectures during one of his research projects but found that the unspecialized method used to present them in the form of a book reduced their readability. He st...

  20. Meteor showers an annotated catalog

    CERN Document Server

    Kronk, Gary W

    2014-01-01

    Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...

  1. Do tipo textual ao gênero de texto. A redação no vestibular / From textual type to textual genre. The essay in the university entrance exam

    Directory of Open Access Journals (Sweden)

    Maria Helena Cruz Pistori

    2012-06-01

    Full Text Available Current official documents issued by the Ministry of Education in Brazil have advocated teaching textual production through genre, whether discursive or textual. Following this guideline, the 2011 entrance exam (vestibular) of one of the largest universities in the state of São Paulo also asked candidates to write three texts belonging to different genres. In order to verify which theoretical-methodological horizons underpinned the elaboration of that exam, we analyzed the texts of (1) the candidate's guidelines, (2) the examination essay and (3) the expectations set by the examiners. Our theoretical parameter is the concept of discursive genre as developed by the members of the Bakhtin Circle since 1920. We then observe how this new proposal aims to evaluate the characteristics that the University expects to find in each of its students, how it uses the theoretical arsenal in the analyzed texts, and how it ensures the evaluation of the candidates' argumentative ability. Finally, we suggest the possibility of working with the genre of the school essay ('dissertação escolar').

  2. Predicting Learning-Related Emotions from Students' Textual Classroom Feedback via Twitter

    Science.gov (United States)

    Altrabsheh, Nabeela; Cocea, Mihaela; Fallahkhair, Sanaz

    2015-01-01

    Teachers/lecturers typically adapt their teaching to respond to students' emotions, e.g. providing more examples when they think the students are confused. While getting a feel for students' emotions is easier in small settings, it is much more difficult in larger groups. In these larger settings, textual feedback from students could provide…

  3. The Textualization of Problem Handling: Lean Discourses Meet Professional Competence in Eldercare and the Manufacturing Industry

    Science.gov (United States)

    Karlsson, Anna-Malin; Nikolaidou, Zoe

    2016-01-01

    This article reports on research addressing the role of incident reporting at the workplace as a textual representation of lean management techniques. It draws on text and discourse analysis as well as on ethnographic data, including interviews, recorded interaction, and observations, from two projects on workplace literacy in Sweden: a study in…

  4. Joint Textual And Visual Cues For Retrieving Images Using Latent Semantic Indexing

    OpenAIRE

    Pecenovic, Zoran; Ayer, Serge; Vetterli, Martin

    2001-01-01

    In this article we present a novel approach to integrating textual and visual descriptors of images in a unified retrieval structure. The methodology, inspired by text retrieval and information filtering, is based on Latent Semantic Indexing (LSI).
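
    One way to realize such joint indexing, sketched below under the assumption of a simple concatenated feature matrix, is to stack term counts and visual descriptors per image and apply a truncated SVD; queries are folded into the same latent space. The tiny matrix is invented for illustration and is not the authors' data:

```python
# Minimal LSI sketch over concatenated textual + visual features per image.
import numpy as np

# Rows = images; first 3 columns are term counts, last 2 are visual features.
features = np.array([[2, 0, 1, 0.8, 0.1],
                     [1, 1, 0, 0.7, 0.2],
                     [0, 3, 0, 0.1, 0.9]], dtype=float)

U, s, Vt = np.linalg.svd(features, full_matrices=False)
k = 2                         # latent dimensions to keep
latent = U[:, :k] * s[:k]     # image coordinates in the latent space

def most_similar(query_vec):
    q = query_vec @ Vt[:k].T  # fold the query into the same latent space
    sims = latent @ q / (np.linalg.norm(latent, axis=1) * np.linalg.norm(q))
    return int(np.argmax(sims))

# A query mixing terms and visual features retrieves the nearest image.
print(most_similar(np.array([2, 0, 1, 0.8, 0.0])))  # -> 0
```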

  5. Absorbing stories : The effects of textual devices on absorption and evaluative responses

    NARCIS (Netherlands)

    Kuijpers, Moniek

    2014-01-01

    In previous research on absorption, the focus has been primarily on the reader, and only little systematic empirical investigation of the possible textual determinants of absorption has been conducted. Furthermore, no attention has been paid to the relationship between aesthetic experiences, such as

  6. Perceptions and Beliefs about Textual Appropriation and Source Use in Second Language Writing

    Science.gov (United States)

    Polio, Charlene; Shi, Ling

    2012-01-01

    Perceptions and judgments on plagiarism or acceptable use of source texts are contingent on one's interpretations and experiences in reading and writing academic texts in a specific disciplinary context. The lack of consensus on what is acceptable textual appropriation in student writing has led to the scholarship on perceptions of textual…

  7. Teach for America's Long Arc: A Critical Race Theory Textual Analysis of Wendy Kopp's Works

    Science.gov (United States)

    Barnes, Michael C.; Germain, Emily K.; Valenzuela, Angela

    2016-01-01

    We read and analyzed 165,000 words and uncovered a series of counter-stories buried within a textual corpus authored by Teach For America (TFA) founder Wendy Kopp (Kopp, 1989, 2001; Kopp & Farr, 2011) that offers insight into the forms of racism endemic to Teach For America. All three counter-stories align with a critical race theory (CRT)…

  8. Arguing in L2: Discourse Structure and Textual Metadiscourse in Philippine Newspaper Editorials

    Science.gov (United States)

    Tarrayo, Veronico N.; Duque, Marie Claire T.

    2011-01-01

    This study described the discourse structure and textual metadiscourse in newspaper editorials in the Philippines where English is used as a second language or L2. Specifically, it sought answers to the following questions: (1) What discourse features characterize the structure of the following parts of Philippine newspaper editorials--orientation…

  9. Exploration of Textual Interactions in CALL Learning Communities: Emerging Research and Opportunities

    Science.gov (United States)

    White, Jonathan R.

    2017-01-01

    Computer-assisted language learning (CALL) has greatly enhanced the realm of online social interaction and behavior. In language classrooms, it allows the opportunity for students to enhance their learning experiences. "Exploration of Textual Interactions in CALL Learning Communities: Emerging Research and Opportunities" is an ideal…

  10. Discourses of Plagiarism: Moralist, Proceduralist, Developmental and Inter-Textual Approaches

    Science.gov (United States)

    Kaposi, David; Dell, Pippa

    2012-01-01

    This paper reconstructs prevalent academic discourses of student plagiarism: moralism, proceduralism, development, and writing/inter-textuality. It approaches the discourses from three aspects: intention, interpretation and the nature of the academic community. It argues that the assumptions of the moralistic approach regarding suspect intention,…

  11. How Well Do Student Nurses Write Case Studies? A Cohesion-Centered Textual Complexity Analysis

    NARCIS (Netherlands)

    Dascalu, Mihai; Dessus, Philippe; Thuez, Laurent; Trausan-Matu, Stefan

    2017-01-01

    Starting from the presumption that writing style is proven to be a reliable predictor of comprehension, this paper investigates the extent to which textual complexity features of nurse students’ essays are related to the scores they were given. Thus, forty essays about case studies on infectious

  12. Relational Data Modelling of Textual Corpora: The Skaldic Project and its Extensions

    DEFF Research Database (Denmark)

    Wills, Tarrin Jon

    2015-01-01

    Skaldic poetry is a highly complex textual phenomenon both in terms of the intricacy of the poetry and its contextual environment. Extensible Markup Language (XML) applications such as that of the Text Encoding Initiative provide a means of semantic representation of some of these complexities. XML...

  13. A Linguistic Approach to Identify the Affective Dimension Expressed in Textual Messages

    Science.gov (United States)

    Rigo, Sandro José; da Rosa Alves, Isa Mara; Victória Barbosa, Jorge Luis

    2015-01-01

    The digital mediation resources used in Distance Education can hinder the teacher's perception about the student's state of mind. However, the textual expression in natural language is widely encouraged in most Distance Education courses, through the use of Virtual Learning Environments and other digital tools. This fact has motivated research…

  14. Use of a Modified Chaining Procedure with Textual Prompts to Establish Intraverbal Storytelling

    Science.gov (United States)

    Valentino, Amber L.; Conine, Daniel E.; Delfs, Caitlin H.; Furlow, Christopher M.

    2015-01-01

    Echoic, tact, and textual transfer procedures have been proven successful in establishing simple intraverbals (Braam and Poling "Applied Research in Mental Retardation," 4, 279-302, 1983; Luciano "Applied Research in Mental Retardation," 102, 346-357, 1986; Watkins et al. "The Analysis of Verbal Behavior," 7, 69-81,…

  15. A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

    Science.gov (United States)

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2011-01-01

    Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages are generally not optimized for the purpose of text mining. In other words, they usually consider queries in isolation and only return raw results for each query. Moreover they…

  16. Effects of Textual Enhancement and Input Enrichment on L2 Development

    Science.gov (United States)

    Rassaei, Ehsan

    2015-01-01

    Research on second language (L2) acquisition has recently sought to include formal instruction into second and foreign language classrooms in a more unobtrusive and implicit manner. Textual enhancement and input enrichment are two techniques which are aimed at drawing learners' attention to specific linguistic features in input and at the same…

  17. Analyzing Student Performance and Attitudes toward Textual versus Iconic Programming Languages

    Science.gov (United States)

    Lin, Janet Mei-Chuen; Yang, Mei-Ching

    2009-01-01

    In this study half of 52 sixth graders learned to program in MSWLogo and the other half in Drape. An analysis of students' test scores revealed that Drape (an iconic language) seemed to have a steeper learning curve than MSWLogo (a textual language). However, as students gradually became more familiar with either language, the difference in…

  18. Sharing information books with kindergartners: The role of parents’ extra-textual talk and socioeconomic status

    NARCIS (Netherlands)

    Mol, S.E.; Neuman, S.B.

    2014-01-01

    The purpose of the study was to explore how features of parent-child extra-textual talk during information book-sharing might vary across different socioeconomic backgrounds, and to determine if certain interactional patterns might mediate their effects on children's receptive and expressive

  19. The influence of annotation in graphical organizers

    NARCIS (Netherlands)

    Bezdan, Eniko; Kester, Liesbeth; Kirschner, Paul A.

    2013-01-01

    Bezdan, E., Kester, L., & Kirschner, P. A. (2012, 29-31 August). The influence of annotation in graphical organizers. Poster presented at the biannual meeting of the EARLI Special Interest Group Comprehension of Text and Graphics, Grenoble, France.

  20. An Informally Annotated Bibliography of Sociolinguistics.

    Science.gov (United States)

    Tannen, Deborah

    This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…

  1. The Community Junior College: An Annotated Bibliography.

    Science.gov (United States)

    Rarig, Emory W., Jr., Ed.

    This annotated bibliography on the junior college is arranged by topic: research tools, history, functions and purposes, organization and administration, students, programs, personnel, facilities, and research. It covers publications through the fall of 1965 and has an author index. (HH)

  2. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  3. Annotated Tsunami bibliography: 1962-1976

    International Nuclear Information System (INIS)

    Pararas-Carayannis, G.; Dong, B.; Farmer, R.

    1982-08-01

    This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts.

  4. GRADUATE AND PROFESSIONAL EDUCATION, AN ANNOTATED BIBLIOGRAPHY.

    Science.gov (United States)

    HEISS, ANN M.; AND OTHERS

    This annotated bibliography contains references to general graduate education and to education for the following professional fields: architecture, business, clinical psychology, dentistry, engineering, law, library science, medicine, nursing, social work, teaching, and theology. (HW)

  5. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered a way of summarizing part of the existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying the functions of these regions is considered functional annotation. In silico approaches can facilitate both tasks, which otherwise would be difficult and time-consuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature set able to characterize properties of the genomic region surrounding the PAS, enabling the development of high-accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no generic and automated method exists for such a task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameter calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high-accuracy prediction results. Finally
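
    The general ML setting described here (classifying fixed-length genomic windows around candidate signals) can be miniaturized as follows. The sequences, labels and the plain logistic-regression classifier are fabricated stand-ins, not DPS or DeepGSR themselves:

```python
# Toy sketch: one-hot encode DNA windows around candidate poly(A) signals
# and train a linear classifier. Data and model choice are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def one_hot(seq, alphabet="ACGT"):
    """Flatten a sequence into a binary vector, 4 positions per base."""
    return np.array([[float(base == a) for a in alphabet] for base in seq]).ravel()

windows = ["AATAAAGCTA", "AATAAATTTC", "CGTACGGTAC", "GGCCTTAGCA"]  # fake 10-nt windows
labels = [1, 1, 0, 0]   # 1 = true PAS context, 0 = background

X = np.stack([one_hot(w) for w in windows])
clf = LogisticRegression().fit(X, labels)
print(clf.predict([one_hot("AATAAACCGT")]))  # expect class 1 for a PAS-like window
```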

  6. Fluid Annotations in a Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required.

  7. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  8. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  9. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with many repeats, genome duplications and tandem duplications. Genes encode a wealth of information useful in studying an organism, so it is critical to have high-quality, stable gene annotation. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. To turn these very large amounts of sequence data into gene annotations or re-annotations in a timely fashion, an automated pipeline is needed. The JGI plant genomics gene annotation pipeline, called Integrated Gene Call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homologous peptides and transcript ORFs (see Methods for details). Here we present the genome annotations of the JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for Chlamydomonas, whose annotation was produced by a third party. The genome annotations of these and other species are used in our gene-family build pipeline and are accessible via the JGI Phytozome portal.

  10. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF dramatically increases the coverage of disease annotation of the human genome. PMID:19594883

  11. annot8r: GO, EC and KEGG annotation of EST datasets

    Directory of Open Access Journals (Sweden)

    Schmid Ralf

    2008-04-01

    Full Text Available Abstract Background The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST projects.
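
    The core of the annot8r workflow, running BLAST against annotation-specific UniProt subsets and then transferring GO/EC/KEGG terms from the hits, can be illustrated with a short sketch. The sketch below is not annot8r's code; the file names and the tab-separated accession-to-GO mapping are hypothetical stand-ins.

```python
# Minimal sketch of BLAST-based annotation transfer in the spirit of
# annot8r (not the tool's actual code). Assumes BLAST was run against a
# GO-annotated UniProt subset with tabular output (-outfmt 6), and a
# hypothetical "uniprot_go.tsv" file of "accession<TAB>GO:ID" lines.
import csv
from collections import defaultdict

def load_go_mapping(path):
    """Map UniProt accessions to their sets of GO terms."""
    mapping = defaultdict(set)
    with open(path) as handle:
        for accession, go_id in csv.reader(handle, delimiter="\t"):
            mapping[accession].add(go_id)
    return mapping

def annotate(blast_tabular, go_mapping, max_evalue=1e-10):
    """Give each query sequence the GO terms of its confident hits."""
    annotations = defaultdict(set)
    with open(blast_tabular) as handle:
        for row in csv.reader(handle, delimiter="\t"):
            query, subject, evalue = row[0], row[1], float(row[10])
            if evalue <= max_evalue:          # column 11 of -outfmt 6
                annotations[query] |= go_mapping.get(subject, set())
    return annotations

go_map = load_go_mapping("uniprot_go.tsv")        # hypothetical file names
for query, terms in sorted(annotate("ests_vs_uniprot.tsv", go_map).items()):
    print(query, ",".join(sorted(terms)))
```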

  12. Influence of the signer's psychophysiological state on the results of his identification using handwritten pattern by natural and artificial intelligence

    Directory of Open Access Journals (Sweden)

    Alexey E. Sulavko

    2017-11-01

    Full Text Available At present, while various mechanisms to ensure information security are actively being improved, particular attention is paid to preventing unauthorized access to information resources. The human factor and the processes of identification and user authentication remain the most problematic. Progress in protecting information resources from internal security threats is paving the way toward biometric systems for hidden identification of computer users and their psychophysiological state. A change in psychophysiological state is reflected in a person's handwriting. We study the influence of the signer's fatigue and agitation on the results of identification from reproduced signatures, both by humans and by pattern recognition methods. The capabilities of human and artificial intelligence are compared under equal conditions. When the signer's state changes, the probability of erroneous recognition by artificial intelligence increases by a factor of 3.3 to 3.7. A person identifies a handwritten image with fewer errors when the signer is agitated, and with a higher error rate when the signer is tired.

  13. Annotated chemical patent corpus: a gold standard for text mining.

    Directory of Open Access Journals (Sweden)

    Saber A Akhondi

    Full Text Available Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents manually by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups, each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.
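
    Where the abstract mentions harmonized annotations and inter-annotator agreement scores, a common way to quantify such agreement is pairwise Cohen's kappa over token-level labels. The following is a generic sketch with made-up labels, not the study's actual scoring procedure.

```python
# Generic sketch of pairwise inter-annotator agreement via Cohen's kappa
# on token-level entity labels (the labels below are made up; this is
# not the study's scoring procedure).
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

annotations = {
    "group_a": ["O", "CHEM", "CHEM", "O", "DISEASE", "O"],
    "group_b": ["O", "CHEM", "CHEM", "O", "DISEASE", "O"],
    "group_c": ["O", "CHEM", "O", "O", "DISEASE", "O"],
}

for (name1, seq1), (name2, seq2) in combinations(annotations.items(), 2):
    print(f"{name1} vs {name2}: kappa = {cohen_kappa_score(seq1, seq2):.2f}")
```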

  14. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    Full Text Available This work elaborates the semi-semantic part-of-speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label the parts of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part-of-speech labels. The corpus contains text on local and international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise finally contributed a part-of-speech annotation to the URDU.KON-TB treebank. Twenty-two main part-of-speech categories are divided into subcategories, which capture the morphological and semantic information encoded in them. This article mainly reports the annotation guidelines; it also briefly describes the development of the URDU.KON-TB treebank, including the raw corpus collection, the design and application of the annotation scheme and, finally, its statistical evaluation and results. The guidelines presented here will be useful for the linguistics community when annotating sentences not only in the national language Urdu but also in other indigenous languages such as Punjabi, Sindhi, Pashto, etc.

  15. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and output a Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator offers a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. A modified version of FigTree is used to visualize the resulting output file. Certain popular methods that lack good built-in visualization tools, for example MEGA, Mesquite, PHY-FI, TreeView, TreeGraph and Geneious, may give results with human errors, either because colors must be added to each node manually or because coloring is limited to a numeric attribute, such as branch length, or to taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy to use, while still allowing the user full control over the coloring and annotating process.
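
    The kind of name-based coloring described here can be approximated in a few lines: FigTree understands NEXUS files whose taxon labels carry [&!color=#RRGGBB] comments. The sketch below is an illustrative toy, not the MixtureTree Annotator's Java implementation; the naming scheme and palette are invented.

```python
# Toy name-based tree coloring: wrap a Newick tree in a NEXUS file whose
# taxon labels carry FigTree-style color comments ([&!color=#RRGGBB]).
# The naming scheme and palette are invented; this is not the MixtureTree
# Annotator's Java implementation.
import re

def color_for(name):
    palette = {"pop1": "#e41a1c", "pop2": "#377eb8"}    # by name prefix
    return palette.get(name.split("_")[0], "#888888")

def newick_to_colored_nexus(newick):
    # Taxon labels: identifiers followed by ':', ',' or ')'.
    taxa = re.findall(r"[A-Za-z]\w*(?=[:,)])", newick)
    lines = ["#NEXUS", "begin taxa;",
             f"\tdimensions ntax={len(taxa)};", "\ttaxlabels"]
    for name in taxa:
        lines.append(f"\t{name}[&!color={color_for(name)}]")
    lines += [";", "end;", "begin trees;",
              f"\ttree tree1 = {newick}", "end;"]
    return "\n".join(lines)

tree = "((pop1_a:0.1,pop1_b:0.2):0.05,(pop2_a:0.3,pop2_b:0.1):0.04);"
print(newick_to_colored_nexus(tree))
```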

  16. Active learning reduces annotation time for clinical concept extraction.

    Science.gov (United States)

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
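
    As a point of reference for the query strategies compared above, a minimal uncertainty-sampling (least-confidence) loop looks like the following sketch; it uses synthetic data and a plain logistic-regression learner, not the study's clinical concept extractor.

```python
# Minimal least-confidence active learning loop (a generic sketch on
# synthetic data, not the study's clinical concept extractor or its
# exact query strategies).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
labeled = list(range(10))                       # small seed set
pool = [i for i in range(len(y)) if i not in labeled]
model = LogisticRegression(max_iter=1000)

for round_ in range(5):
    model.fit(X[labeled], y[labeled])
    confidence = model.predict_proba(X[pool]).max(axis=1)
    query = [pool[i] for i in np.argsort(confidence)[:10]]  # least confident
    labeled += query                            # an oracle supplies y[query]
    pool = [i for i in pool if i not in query]
    print(f"round {round_}: {len(labeled)} labeled sequences")
```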

  17. ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life.

    Science.gov (United States)

    Pafilis, Evangelos; Frankild, Sune P; Schnetzer, Julia; Fanini, Lucia; Faulwetter, Sarah; Pavloudi, Christina; Vasileiadou, Katerina; Leary, Patrick; Hammock, Jennifer; Schulz, Katja; Parr, Cynthia Sims; Arvanitidis, Christos; Jensen, Lars Juhl

    2015-06-01

    The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users. The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at http://environments.hcmr.gr. © The Author 2015. Published by Oxford University Press.
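
    The essence of a dictionary-based tagger like ENVIRONMENTS is matching ontology term names in text. A toy version is sketched below; the ENVO identifiers shown are placeholders, not verified mappings, and the real tagger is far faster and more robust.

```python
# Toy dictionary-based tagger in the style of ENVIRONMENTS (the real
# tagger is a fast, robust implementation; the ENVO identifiers below
# are placeholders, not verified mappings).
import re

envo_terms = {
    "coral reef": "ENVO:placeholder1",
    "hydrothermal vent": "ENVO:placeholder2",
    "seagrass bed": "ENVO:placeholder3",
}

pattern = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in envo_terms) + r")\b",
    re.IGNORECASE,
)

def tag(text):
    """Return (matched term, ENVO id, start, end) for each hit."""
    return [
        (m.group(1), envo_terms[m.group(1).lower()], m.start(), m.end())
        for m in pattern.finditer(text)
    ]

print(tag("This species inhabits coral reef flats near a hydrothermal vent."))
```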

  18. ACID: annotation of cassette and integron data

    Directory of Open Access Journals (Sweden)

    Stokes Harold W

    2009-04-01

    Full Text Available Abstract Background Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms. Description By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data) can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided. Conclusion ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.

  19. BisQue: cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery

    Science.gov (United States)

    Fedorov, D.; Miller, R. J.; Kvilekval, K. G.; Doheny, B.; Sampson, S.; Manjunath, B. S.

    2016-02-01

    Logistical and financial limitations of underwater operations are inherent in marine science, including biodiversity observation. Imagery is a promising way to address these challenges, but the diversity of organisms thwarts simple automated analysis. Recent developments in computer vision methods, such as convolutional neural networks (CNN), are promising for automated classification and detection tasks but are typically very computationally expensive and require extensive training on large datasets. Therefore, managing and connecting distributed computation, large storage and human annotations of diverse marine datasets is crucial for effective application of these methods. BisQue is a cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery and associated data. Designed to hide the complexity of distributed storage, large computational clusters, diversity of data formats and inhomogeneous computational environments behind a user-friendly web-based interface, BisQue is built around an idea of flexible and hierarchical annotations defined by the user. Such textual and graphical annotations can describe captured attributes and the relationships between data elements. Annotations are powerful enough to describe cells in fluorescent 4D images, fish species in underwater videos and kelp beds in aerial imagery. Presently we are developing BisQue-based analysis modules for automated identification of benthic marine organisms. Recent experiments with dropout- and CNN-based classification of several thousand annotated underwater images demonstrated an overall accuracy above 70% for the 15 best performing species and above 85% for the top 5 species. Based on these promising results, we have extended BisQue with a CNN-based classification system allowing continuous training on user-provided data.

  20. Biomedical Imaging Modality Classification Using Combined Visual Features and Textual Terms

    Directory of Open Access Journals (Sweden)

    Xian-Hua Han

    2011-01-01

    We describe an approach for the automatic modality classification in the medical image retrieval task of the 2010 CLEF cross-language image retrieval campaign (ImageCLEF). This paper is focused on the process of feature extraction from medical images and fuses the different extracted visual features and textual feature for modality classification. To extract visual features from the images, we used histogram descriptor of edge, gray, or color intensity and block-based variation as global features and SIFT histogram as local feature. For textual feature of image representation, the binary histogram of some predefined vocabulary words from image captions is used. Then, we combine the different features using normalized kernel functions for SVM classification. Furthermore, for some easy misclassified modality pairs such as CT and MR or PET and NM modalities, a local classifier is used for distinguishing samples in the pair modality to improve performance. The proposed strategy is evaluated with the provided modality dataset by ImageCLEF 2010.

  1. Biomedical imaging modality classification using combined visual features and textual terms.

    Science.gov (United States)

    Han, Xian-Hua; Chen, Yen-Wei

    2011-01-01

    We describe an approach for the automatic modality classification in medical image retrieval task of the 2010 CLEF cross-language image retrieval campaign (ImageCLEF). This paper is focused on the process of feature extraction from medical images and fuses the different extracted visual features and textual feature for modality classification. To extract visual features from the images, we used histogram descriptor of edge, gray, or color intensity and block-based variation as global features and SIFT histogram as local feature. For textual feature of image representation, the binary histogram of some predefined vocabulary words from image captions is used. Then, we combine the different features using normalized kernel functions for SVM classification. Furthermore, for some easy misclassified modality pairs such as CT and MR or PET and NM modalities, a local classifier is used for distinguishing samples in the pair modality to improve performance. The proposed strategy is evaluated with the provided modality dataset by ImageCLEF 2010.
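
    The kernel-level fusion step described above can be sketched as follows: compute a normalized (cosine) kernel per feature type, average them, and train an SVM on the precomputed Gram matrix. The data and weights are synthetic; this illustrates the strategy, not the authors' implementation.

```python
# Sketch of combining visual and textual features via normalized kernel
# averaging for an SVM (synthetic data; an illustration of the strategy,
# not the paper's implementation).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 60
visual = rng.normal(size=(n, 128))   # stand-in for SIFT/edge histograms
textual = rng.normal(size=(n, 300))  # stand-in for caption-word histogram
labels = rng.integers(0, 2, size=n)  # two modalities, e.g. CT vs MR

def normalized_linear_kernel(X):
    K = X @ X.T
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)        # cosine-normalized Gram matrix

# Equal-weight fusion of per-feature kernels, then a precomputed-kernel SVM.
K = 0.5 * normalized_linear_kernel(visual) + 0.5 * normalized_linear_kernel(textual)
svm = SVC(kernel="precomputed").fit(K, labels)
print("training accuracy:", svm.score(K, labels))
```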

  2. Textual and shape-based feature extraction and neuro-fuzzy classifier for nuclear track recognition

    Science.gov (United States)

    Khayat, Omid; Afarideh, Hossein

    2013-04-01

    Track counting algorithms, one of the fundamental tools of nuclear science, have received increasing attention in recent years. Accurate measurement of nuclear tracks on solid-state nuclear track detectors is the aim of track counting systems. Commonly, track counting systems comprise a hardware system for the task of imaging and software for analysing the track images. In this paper, a track recognition algorithm based on 12 defined textual and shape-based features and a neuro-fuzzy classifier is proposed. Features are defined so as to discern the tracks from the background and small objects. Then, according to the defined features, tracks are detected using a trained neuro-fuzzy system. Features and the classifier are finally validated via 100 alpha-track images and 40 training samples. It is shown that principal textual and shape-based features concomitantly yield a high rate of track detection compared with single-feature based methods.
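
    Shape-based features of the kind the paper defines (area, eccentricity, solidity, and so on) can be extracted from a binarized track image with scikit-image; the sketch below is a generic illustration with a synthetic image, not the paper's 12-feature set or its neuro-fuzzy classifier.

```python
# Generic illustration of extracting shape-based features from a
# binarized track image with scikit-image (not the paper's 12 features
# or its neuro-fuzzy classifier; the image is synthetic).
import numpy as np
from skimage.measure import label, regionprops

img = np.zeros((64, 64), dtype=bool)
img[10:20, 10:18] = True      # blob standing in for an etched track
img[40:44, 40:42] = True      # small object that should be rejected

for region in regionprops(label(img)):
    features = {
        "area": region.area,
        "eccentricity": round(region.eccentricity, 3),
        "solidity": region.solidity,
        "perimeter": region.perimeter,
    }
    verdict = "track" if region.area > 20 else "background"
    print(features, "->", verdict)
```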

  3. Textual Data Mining Applications in the Service Chain Knowledge Management of e-Government

    Directory of Open Access Journals (Sweden)

    Jalal Rezaeenour

    2017-03-01

    Full Text Available Systems related to knowledge management can improve the quality and efficiency of the knowledge used in decision-making processes. Approximately 80 percent of corporate information is in textual formats, which is why text mining is useful and important in service chain knowledge management. For example, one of the most important applications of text mining is managing on-line sources of digital documents and analysing internal documents. This research is based on textual documents and interviews processed using grounded theory. In this research, clustering techniques were applied in the first step. In the second step, the Apriori algorithm was applied for discovering and extracting the most useful association rules. In other words, the integration of data mining techniques was emphasized to improve the accuracy and precision of classification. Using a decision tree alone for classification may reduce classification precision, but the proposed method showed a significant improvement in classification precision.
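
    The second step described above, Apriori rule extraction, is sketched below using the mlxtend library (an assumption; any Apriori implementation would do) on a hypothetical set of documents represented as term sets.

```python
# Sketch of Apriori association-rule extraction over term occurrences,
# using mlxtend (an assumed choice of library) on hypothetical
# "documents as term sets" from a service-chain corpus.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

docs = [
    {"license", "renewal", "payment"},
    {"license", "renewal"},
    {"permit", "payment"},
    {"license", "payment"},
]

encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(docs).transform(docs),
                      columns=encoder.columns_)
frequent = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```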

  4. Linnaeus' restless system: translation as textual engineering in eighteenth-century botany.

    Science.gov (United States)

    Dietz, Bettina

    2016-04-01

    In this essay, translations of Linnaeus' Systema naturae into various European languages will be placed into the context of successively expanded editions of Linnaeus' writings. The ambition and intention of most translators was not only to make the Systema naturae accessible for practical botanical use by a wider readership, but also to supplement and correct it, and thus to shape it. By recruiting more users, translations made a significant contribution to keeping the Systema up to date and thus maintaining its practical value for decades. The need to incorporate countless additions and corrections into an existing text, to document their provenance, to identify inconsistencies, and to refer to relevant observations, descriptions, and illustrations in the botanical literature all helped to develop and refine techniques of textual montage. This form of textual engineering, becoming increasingly complex with each translation cycle, shaped the external appearance of new editions of the Systema, and reflected the modular architecture of a botanical system designed for expansion.

  5. Use of a Modified Chaining Procedure with Textual Prompts to Establish Intraverbal Storytelling.

    Science.gov (United States)

    Valentino, Amber L; Conine, Daniel E; Delfs, Caitlin H; Furlow, Christopher M

    2015-06-01

    Echoic, tact, and textual transfer procedures have been proven successful in establishing simple intraverbals (Braam and Poling Applied Research in Mental Retardation, 4, 279-302, 1983; Luciano Applied Research in Mental Retardation, 102, 346-357, 1986; Watkins et al. The Analysis of Verbal Behavior, 7, 69-81, 1989). However, these strategies may be ineffective for some children due to the complexity of the targeted intraverbals. The current study investigated the use of a novel procedure which included a modified chaining procedure and textual prompts to establish intraverbal behavior in the form of telling short stories. Visual prompts and rule statements were used with some of the participants in order to produce the desired behavior change. Results indicated that the procedure was effective for teaching retelling of short stories in three children with autism.

  6. Generation of Natural-Language Textual Summaries from Longitudinal Clinical Records.

    Science.gov (United States)

    Goldstein, Ayelet; Shahar, Yuval

    2015-01-01

    Physicians are required to interpret, abstract and present in free-text large amounts of clinical data in their daily tasks. This is especially true for chronic-disease domains, but holds also in other clinical domains. We have recently developed a prototype system, CliniText, which, given a time-oriented clinical database, and appropriate formal abstraction and summarization knowledge, combines the computational mechanisms of knowledge-based temporal data abstraction, textual summarization, abduction, and natural-language generation techniques, to generate an intelligent textual summary of longitudinal clinical data. We demonstrate our methodology, and the feasibility of providing a free-text summary of longitudinal electronic patient records, by generating summaries in two very different domains - Diabetes Management and Cardiothoracic surgery. In particular, we explain the process of generating a discharge summary of a patient who had undergone a Coronary Artery Bypass Graft operation, and a brief summary of the treatment of a diabetes patient for five years.

  7. Annotating Logical Forms for EHR Questions.

    Science.gov (United States)

    Roberts, Kirk; Demner-Fushman, Dina

    2016-05-01

    This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.
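
    To make the layered annotation concrete, a question, an illustrative lambda-calculus-style logical form, and the structured query it might compile to are shown below; the predicate and field names are invented for illustration and are not the corpus's actual inventory.

```python
# Illustrative only: a lambda-calculus-style logical form for one EHR
# question, plus the structured query a semantic parser might emit.
# Predicate and field names are hypothetical.
question = "What was the patient's last potassium level?"

logical_form = (
    "λx. latest(x) ∧ has_concept(x, 'potassium level') ∧ about(x, patient)"
)

query = {                     # one possible compiled form of the above
    "select": "value",
    "concept": "potassium level",
    "order_by": "timestamp desc",
    "limit": 1,
}

print(question)
print("logical form:", logical_form)
print("query:", query)
```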

  8. Textual Harassment: A New Historicist Reappraisal of the Parol Evidence Rule With Gender in Mind

    OpenAIRE

    Keren, Hila

    2004-01-01

    This year marks the four hundredth anniversary of the Parol Evidence Rule, the rule that dictates that the interpretation of a written contract should be determined solely according to its text and not influenced by prior contradictory external information. This article uses the occasion to offer a fresh interdisciplinary view of the Rule. The analysis presents a unique contribution to the heated debate regarding the desired levels of formalism and textualism in present-day contract law, by u...

  9. Motion lecture annotation system to learn Naginata performances

    Science.gov (United States)

    Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko

    2013-12-01

    This paper describes a learning assistant system that uses motion capture data and annotation to teach "Naginata-jutsu" (the skill of practicing the Japanese halberd) performance. There are some video annotation tools, such as YouTube; however, these video-based tools offer only a single angle of view. Our approach, which uses motion-captured data, allows viewing from any angle. A lecturer can write annotations related to parts of the body. We compared the effectiveness of the YouTube annotation tool and the proposed system. The experimental results showed that our system triggered more annotations than the annotation tool of YouTube.

  10. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.

  11. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  12. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
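
    The ranges-and-overlaps model at the heart of these packages is language-agnostic, even though the packages themselves are R/Bioconductor. A naive Python illustration of the overlap operation follows; production implementations use interval trees or sorted sweeps rather than this quadratic scan.

```python
# Language-neutral sketch of the core GenomicRanges operation, overlap
# detection between two sets of annotated ranges (the real packages are
# Bioconductor/R; this only illustrates the computation).
def find_overlaps(queries, subjects):
    """Return (query, subject) index pairs whose [start, end] ranges overlap."""
    hits = []
    for qi, (qchrom, qstart, qend) in enumerate(queries):
        for si, (schrom, sstart, send) in enumerate(subjects):
            if qchrom == schrom and qstart <= send and sstart <= qend:
                hits.append((qi, si))
    return hits

reads = [("chr1", 100, 150), ("chr1", 400, 450), ("chr2", 10, 60)]
exons = [("chr1", 120, 300), ("chr2", 50, 200)]
print(find_overlaps(reads, exons))   # [(0, 0), (2, 1)]
```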

  13. Improved medical image modality classification using a combination of visual and textual features.

    Science.gov (United States)

    Dimitrovski, Ivica; Kocev, Dragi; Kitanovski, Ivan; Loskovska, Suzana; Džeroski, Sašo

    2015-01-01

    In this paper, we present the approach that we applied to the medical modality classification tasks at the ImageCLEF evaluation forum. More specifically, we used the modality classification databases from the ImageCLEF competitions in 2011, 2012 and 2013, described by four visual and one textual types of features, and combinations thereof. We used local binary patterns, color and edge directivity descriptors, fuzzy color and texture histogram and scale-invariant feature transform (and its variant opponentSIFT) as visual features and the standard bag-of-words textual representation coupled with TF-IDF weighting. The results from the extensive experimental evaluation identify the SIFT and opponentSIFT features as the best performing features for modality classification. Next, the low-level fusion of the visual features improves the predictive performance of the classifiers. This is because the different features are able to capture different aspects of an image, their combination offering a more complete representation of the visual content in an image. Moreover, adding textual features further increases the predictive performance. Finally, the results obtained with our approach are the best results reported on these databases so far. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Textual, Genre and Social Features of Spoken Grammar: A Corpus-Based Approach

    Directory of Open Access Journals (Sweden)

    Carmen Pérez-Llantada

    2009-02-01

    Full Text Available This paper describes a corpus-based approach to teaching and learning spoken grammar for English for Academic Purposes with reference to Bhatia’s (2002) multi-perspective model for discourse analysis: a textual perspective, a genre perspective and a social perspective. From a textual perspective, corpus-informed instruction helps students identify grammar items through statistical frequencies, collocational patterns, context-sensitive meanings and discoursal uses of words. From a genre perspective, corpus observation provides students with exposure to recurrent lexico-grammatical patterns across different academic text types (genres). From a social perspective, corpus models can be used to raise learners’ awareness of how speakers’ different discourse roles, discourse privileges and power statuses are enacted in their grammar choices. The paper describes corpus-based instructional procedures, gives samples of learners’ linguistic output, and provides comments on the students’ response to this method of instruction. Data resulting from the assessment process and student production suggest that corpus-informed instruction grounded in Bhatia’s multi-perspective model can constitute a pedagogical approach in order to (i) obtain positive student responses from input and authentic samples of grammar use, (ii) help students identify and understand the textual, genre and social aspects of grammar in real contexts of use, and therefore (iii) help develop students’ ability to use grammar accurately and appropriately.

  15. “Famoso [y valiente] hidalgo”: on conjectures and textual corruptions

    Directory of Open Access Journals (Sweden)

    Florencio Sevilla Arroyo

    2007-12-01

    Full Text Available The textual manipulations to which the epigraph of the first chapter of El ingenioso hidalgo don Quijote de la Mancha has recently been subjected offer an obvious example of the critical threats posed by Textual Bibliography when it is handled carelessly. Although it is true that spurious textual “changes” caused by “printing by forms” and the “count of the original” were common in Golden Age hand presses, the risk of distortion from poorly informed emendation is also evident. In this case, the aim is to restore a supposed involuntary omission by the compositors, famoso [y valiente] hidalgo, which is absent from the first chapter’s epigraph in the original text of the Quijote, on the evidence of the first edition’s “Tabla de los capítulos”. Nevertheless, there is ample typographical evidence for exactly the opposite conclusion: it is rather an apocryphal appendix, introduced by the compositor of the “Tabla” from an original famoso y valiente, and aimed simply at filling a remaining blank in the last gathering of the book. The emendation, thus, far from restoring the original text, distorts Cervantine writing, replacing it with an evident typographical trap. It should be borne in mind that the only Cervantine text of the first Quijote with critical authority came off Juan de la Cuesta’s printing presses around Christmas 1604-1605...

  16. In Defence of the Textual Integrity of the Old English Resignation

    Directory of Open Access Journals (Sweden)

    Sobol Helena W.

    2015-03-01

    Full Text Available Bliss & Frantzen’s (1976) paper against the previously assumed textual integrity of Resignation has been a watershed in research upon the poem. Nearly all subsequent studies and editions have followed their theory, the sole dissenting view being expressed by Klinck (1987, 1992). The present paper offers fresh evidence for the textual unity of the poem. First examined are codicological issues, whether the state of the manuscript suggests that a folio might be missing. Next analysed are the spellings of Resignation and its phonology; here the paper discusses peculiarities which both differentiate Resignation from its manuscript context and connect the two hypothetical parts of the text. Then the paper looks at the assumed cut-off point at l. 69 to see if it may provide any evidence for textual discontinuity. Finally, the whole Resignation, seen as a coherent poem, is placed in the history of Old English literature, with special attention being paid to the traditions of devotional texts and the Old English elegies.

  17. Benjamin’s Dialectical Image and the Textuality of the Built Landscape

    Directory of Open Access Journals (Sweden)

    Ross Lipton

    2016-05-01

    Full Text Available In The Arcades Project, Walter Benjamin describes the architectural expression of nineteenth century Paris as a dialectical manifestation of backwards-looking historicism and the dawn of modern industrial production (in the form of cast iron and mass produced plate glass. Yet in the same text, Benjamin refers to the dialectical image as occurring within the medium of written language. In this paper, I will first discuss the textuality of the dialectical image as it emerges from Benjamin’s discussion of allegorical and symbolic images in his Trauerspiel study and the ‘wish symbol’ in The Arcades Project. I will then discuss the ‘textual reductionism’ implicit in Benjamin’s theory of the dialectical image, in which the dense pluralities of urban space are reduced to a finite script to be pieced together through Benjamin’s constructivist method of historical observation. The textuality of the dialectical image will be elaborated on by discussing it in relation to the practice of translation. This discussion will be further contextualised by discussing a cadre of German/Austrian planners and architects who attempted to translate architectural idioms between cultural identities in Kemalist Era Turkey. The article concludes with a short recapitulation on the dialectical image as both an object of scrutiny and a method of observation, one which also takes into consideration the specific historicity of the observer.

  18. Expressing emotions in blogs : The role of textual paralinguistic cues in online venting and social sharing posts

    NARCIS (Netherlands)

    Rodríguez-Hidalgo, Carmina; Tan, Ed S.H.; Verlegh, Peeter W.J.

    2017-01-01

    Textual paralanguage cues (TPC) have been signaled as effective emotion transmitters online. Though several studies have investigated their properties and occurrence, there remains a gap concerning their communicative impact within specific psychological processes, such as the social sharing of emotions.

  19. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Full Text Available Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼10 minutes). Adding distortions, as is common practice for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
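
    The paper's main innovation, random rectangular "receptive field" masks on the ELM's input weights followed by a closed-form ridge-regression solve for the output weights, can be sketched in a few lines of NumPy. Sizes and data below are toy values, not the reported configuration.

```python
# Sketch of an ELM with random rectangular "receptive field" masks and a
# closed-form ridge-regression solve for the output weights. Data, sizes
# and counts are toy values, not the paper's reported configuration.
import numpy as np

rng = np.random.default_rng(0)
n, side, hidden, classes = 200, 28, 400, 10
X = rng.random((n, side * side))          # stand-in for flattened images
y = rng.integers(0, classes, size=n)

# Zero each hidden unit's input weights outside a random image patch,
# so most entries of the input weight matrix are zero.
W = rng.normal(size=(hidden, side * side))
for h in range(hidden):
    r0, c0 = rng.integers(0, side - 6, size=2)
    rs, cs = rng.integers(6, 14, size=2)
    patch = np.zeros((side, side))
    patch[r0:min(r0 + rs, side), c0:min(c0 + cs, side)] = 1.0
    W[h] *= patch.ravel()

H = np.tanh(X @ W.T)                      # random hidden-layer features
Y = np.eye(classes)[y]                    # one-hot targets
beta = np.linalg.solve(H.T @ H + 1e-3 * np.eye(hidden), H.T @ Y)
pred = (H @ beta).argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```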

  20. Solar Tutorial and Annotation Resource (STAR)

    Science.gov (United States)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR” suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public has proven to be a promising way to meet this need.
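
    The two spline annotations mentioned above map directly onto SciPy's CubicSpline with natural and periodic boundary conditions; a sketch with made-up coordinates follows (an illustration, not the STAR suite's code).

```python
# Sketch of the two spline annotations described above: a natural cubic
# spline through points along an open feature (a loop), and a closed
# (periodic) spline around a coronal-hole boundary. Coordinates are made up.
import numpy as np
from scipy.interpolate import CubicSpline

# Open feature (e.g., a coronal loop): natural boundary conditions.
t = np.arange(5)
loop_x = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
loop_y = np.array([0.0, 2.0, 2.8, 2.0, 0.5])
sx = CubicSpline(t, loop_x, bc_type="natural")
sy = CubicSpline(t, loop_y, bc_type="natural")
dense = np.linspace(0, 4, 50)
loop_curve = np.column_stack([sx(dense), sy(dense)])

# Closed feature (e.g., a coronal hole): periodic spline, so the first
# and last points must coincide.
theta = np.array([0.0, 1.0, 2.2, 3.5, 4.8, 2 * np.pi])
bx = np.array([1.0, 0.3, -0.9, -0.4, 0.8, 1.0])
by = np.array([0.0, 0.9, 0.4, -0.8, -0.6, 0.0])
cx = CubicSpline(theta, bx, bc_type="periodic")
cy = CubicSpline(theta, by, bc_type="periodic")
print(loop_curve.shape, cx(0.5), cy(0.5))
```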

  1. Variations in Textualization: A Cross-generic and Cross-disciplinary Study, Implications for Readability of the Academic Discourse

    Directory of Open Access Journals (Sweden)

    Mina Abbasi Bonabi

    2018-01-01

    Full Text Available According to discoursal views on language, variations in textualization strategies are always socio-contextually motivated and never happen at random. The textual forms employed in a text, along with many other discoursal and contextual factors, could certainly affect the readability of the text, making it more or less processable for the same reader. On the basis of these assumptions, the present study set out to examine how our data varied across genres and disciplines in terms of our target textual forms. These forms are as follows: the magnitude of T-unit (MOTU), the degree of embeddedness of the main verb in T-unit (DE), the physical distance between the verb and its satellite elements (PD), the magnitude of the noun phrase appearing before the verb (MOX), and the magnitude of the noun phrase appearing after the verb (MOY). Our data consisted of 20 research articles randomly selected from two different disciplines, Biology and Applied Linguistics, to be analyzed in terms of the above-named textual strategies. One-way ANOVA and post hoc Tukey tests were used for data analyses. The results revealed cross-generic as well as cross-disciplinary differences in the employment of the above textual forms. These findings were discussed in terms of the academic concepts and discourse on the one hand and the possible effect of the required textual forms on the readability of the text on the other hand.

  2. Legal Information Sources: An Annotated Bibliography.

    Science.gov (United States)

    Conner, Ronald C.

    This 25-page annotated bibliography describes the legal reference materials in the special collection of a medium-sized public library. Sources are listed in 12 categories: cases, dictionaries, directories, encyclopedias, forms, references for the lay person, general, indexes, laws and legislation, legal research aids, periodicals, and specialized…

  3. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  4. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.
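
    A drastically simplified version of WordNet-based class assignment, walking the hypernym path of a word's most frequent sense up to a small set of anchor classes, is sketched below. The anchor set is hypothetical, the first-sense heuristic is not the platform's disambiguation algorithm, and results can vary with the installed WordNet version.

```python
# Toy WordNet-based class assignment (a first-sense heuristic, not the
# platform's actual class recognition algorithm). Requires NLTK with the
# WordNet corpus installed; the anchor classes below are hypothetical,
# and results can vary with the WordNet version.
from nltk.corpus import wordnet as wn

concept_classes = {
    wn.synset("person.n.01"): "PERSON",
    wn.synset("artifact.n.01"): "ARTIFACT",
    wn.synset("location.n.01"): "LOCATION",
}

def ontological_class(word):
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return None
    # Walk the hypernym paths of the most frequent sense and report
    # the first anchor class encountered.
    for path in synsets[0].hypernym_paths():
        for ancestor in path:
            if ancestor in concept_classes:
                return concept_classes[ancestor]
    return None

for word in ["teacher", "car", "city"]:
    print(word, "->", ontological_class(word))
```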

  5. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Abstract Background A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which may be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain-text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between the quality of sequence annotation and the efficiency of communication and knowledge dissemination among researchers.
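
    The UID-to-name substitution that SNAD automates can be approximated with Biopython's Entrez and SeqIO modules, as in the sketch below (a generic illustration, not SNAD's template engine; it needs network access and a valid contact e-mail for NCBI).

```python
# Generic illustration of UID-to-name conversion with Biopython (not
# SNAD's template engine). Needs network access; the e-mail address is a
# required placeholder for NCBI's Entrez service.
from Bio import Entrez, SeqIO

Entrez.email = "you@example.org"

def meaningful_name(accession):
    """Turn a bare accession into 'Organism_accession'."""
    handle = Entrez.efetch(db="nucleotide", id=accession,
                           rettype="gb", retmode="text")
    record = SeqIO.read(handle, "genbank")
    handle.close()
    organism = record.annotations.get("organism", "unknown").replace(" ", "_")
    return f"{organism}_{accession}"

# UIDs as they might appear in an alignment or tree file.
for uid in ["NC_001802", "NC_045512"]:
    print(uid, "->", meaningful_name(uid))
```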

  6. Just-in-time : on strategy annotations

    NARCIS (Netherlands)

    J.C. van de Pol (Jaco)

    2001-01-01

    A simple kind of strategy annotations is investigated, giving rise to a class of strategies, including leftmost-innermost. It is shown that under certain restrictions, an interpreter can be written which computes the normal form of a term in a bottom-up traversal. The main contribution
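
    A bottom-up interpreter of the kind described can be written as a small recursive function over terms; the sketch below normalizes Peano addition under a leftmost-innermost strategy and is only a toy model of the paper's setting (requires Python 3.10+ for match/case).

```python
# Toy bottom-up (leftmost-innermost) interpreter: terms are nested
# tuples, rules map constructor patterns to results. The rules encode
# Peano addition, purely for illustration.
def normalize(term, rules):
    """Normalize bottom-up: rewrite arguments first, then the root."""
    if isinstance(term, tuple):
        head, *args = term
        term = (head, *[normalize(a, rules) for a in args])
    # Apply root rules until no rule matches (innermost strategy).
    while True:
        reduced = rules(term)
        if reduced is None:
            return term
        term = normalize(reduced, rules)

def peano_rules(term):
    match term:
        case ("add", ("zero",), y):          # add(0, y) -> y
            return y
        case ("add", ("succ", x), y):        # add(s(x), y) -> s(add(x, y))
            return ("succ", ("add", x, y))
    return None

two = ("succ", ("succ", ("zero",)))
print(normalize(("add", two, two), peano_rules))
# -> ('succ', ('succ', ('succ', ('succ', ('zero',)))))
```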

  7. Argumentation Theory. [A Selected Annotated Bibliography].

    Science.gov (United States)

    Benoit, William L.

    Materials dealing with aspects of argumentation theory are cited in this annotated bibliography. The 50 citations are organized by topic as follows: (1) argumentation; (2) the nature of argument; (3) traditional perspectives on argument; (4) argument diagrams; (5) Chaim Perelman's theory of rhetoric; (6) the evaluation of argument; (7) argument…

  8. Annotated Bibliography of EDGE2D Use

    Energy Technology Data Exchange (ETDEWEB)

    J.D. Strachan and G. Corrigan

    2005-06-24

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables.

  9. Nutrition & Adolescent Pregnancy: A Selected Annotated Bibliography.

    Science.gov (United States)

    National Agricultural Library (USDA), Washington, DC.

    This annotated bibliography on nutrition and adolescent pregnancy is intended to be a source of technical assistance for nurses, nutritionists, physicians, educators, social workers, and other personnel concerned with improving the health of teenage mothers and their babies. It is divided into two major sections. The first section lists selected…

  10. Great Basin Experimental Range: Annotated bibliography

    Science.gov (United States)

    E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen

    2013-01-01

    This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...

  11. Evaluating automatically annotated treebanks for linguistic research

    NARCIS (Netherlands)

    Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.

    2016-01-01

    This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group of constructions.

  12. DIMA – Annotation guidelines for German intonation

    DEFF Research Database (Denmark)

    Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis

    2015-01-01

    This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups...

  13. Annotated Bibliography of EDGE2D Use

    International Nuclear Information System (INIS)

    Strachan, J.D.; Corrigan, G.

    2005-01-01

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables

  14. Skin Cancer Education Materials: Selected Annotations.

    Science.gov (United States)

    National Cancer Inst. (NIH), Bethesda, MD.

    This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…

  15. Book Reviews, Annotation, and Web Technology.

    Science.gov (United States)

    Schulze, Patricia

    From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…

  16. Snap: an integrated SNP annotation platform

    DEFF Research Database (Denmark)

    Li, Shengting; Ma, Lijia; Li, Heng

    2007-01-01

    Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes based on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...

  17. Annotating State of Mind in Meeting Data

    NARCIS (Netherlands)

    Heylen, Dirk K.J.; Reidsma, Dennis; Ordelman, Roeland J.F.; Devillers, L.; Martin, J-C.; Cowie, R.; Batliner, A.

    We discuss the annotation procedure for mental state and emotion that is under development for the AMI (Augmented Multiparty Interaction) corpus. The categories that were found to be most appropriate relate not only to emotions but also to (meta-)cognitive states and interpersonal variables. The

  18. ePNK Applications and Annotations

    DEFF Research Database (Denmark)

    Kindler, Ekkart

    2017-01-01

    ...new applications for the ePNK and, in particular, visualizing the result of an application in the graphical editor of the ePNK by using annotations, and interacting with the end user through these annotations. In this paper, we give an overview of the concepts of ePNK applications by discussing the implementation...

  19. Multiview Hessian regularization for image annotation.

    Science.gov (United States)

    Liu, Weifeng; Tao, Dacheng

    2013-07-01

    The rapid development of computer hardware and Internet technology makes large-scale, data-dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) has therefore received intensive attention in recent years and has been successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smoothes the conditional distribution for classification along the manifold encoded in the graph Laplacian. However, LR biases the classification function toward a constant function, which can result in poor generalization. In addition, LR was developed to handle uniformly distributed (single-view) data, although instances such as images and videos are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address these two problems in LR-based image annotation. In particular, mHR optimally combines multiple Hessian regularizers, each obtained from a particular view of the instances, and steers the classification function so that it varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.
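
    In schematic terms, the combination mHR performs can be written as a single regularized objective. The following LaTeX sketch uses assumed notation (the view weights α_v, trade-off parameters γ_A and γ_I, and per-view Hessian matrices H^{(v)} are illustrative, not copied from the paper):

    ```latex
    % Sketch of an mHR-style objective over V views (notation assumed):
    \min_{f,\;\alpha \in \Delta}\;
      \sum_{i=1}^{l} \ell\bigl(f(x_i),\, y_i\bigr)
      + \gamma_A \lVert f \rVert_K^2
      + \gamma_I \sum_{v=1}^{V} \alpha_v\, \mathbf{f}^{\top} H^{(v)} \mathbf{f}
    ```

    Here ℓ is the loss (squared loss for kernel least squares, hinge loss for SVMs), H^{(v)} is the Hessian energy matrix estimated from view v, and the simplex constraint Δ keeps the view weights non-negative and summing to one.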

  20. Special Issue: Annotated Bibliography for Volumes XIX-XXXII.

    Science.gov (United States)

    Pullin, Richard A.

    1998-01-01

    This annotated bibliography lists 310 articles from the "Journal of Cooperative Education" from Volumes XIX-XXXII, 1983-1997. Annotations are presented in the order they appear in the journal; author and subject indexes are provided. (JOW)

  1. Computer systems for annotation of single molecule fragments

    Science.gov (United States)

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such a diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  2. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al., 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequencing group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high-quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  3. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
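
    In spirit, the core comparison reduces to aligning per-gene function assignments across methods and merging them into an extended annotation. A minimal Python sketch of that idea (the data structures and merge rule are illustrative assumptions, not BEACON's actual implementation):

    ```python
    # Sketch: compare gene-function annotations from two methods and
    # build a naive "extended" annotation (illustrative only).

    def compare_annotations(ann_a, ann_b):
        """ann_a, ann_b: dicts mapping gene_id -> function string or None."""
        agree, conflict, only_a, only_b = {}, {}, {}, {}
        for gene in set(ann_a) | set(ann_b):
            fa, fb = ann_a.get(gene), ann_b.get(gene)
            if fa and fb:
                (agree if fa == fb else conflict)[gene] = (fa, fb)
            elif fa:
                only_a[gene] = fa
            elif fb:
                only_b[gene] = fb
        return agree, conflict, only_a, only_b

    def extend_annotation(ann_a, ann_b):
        """Prefer method A's call; fall back to B so fewer genes lack a function."""
        extended = dict(ann_b)
        extended.update({g: f for g, f in ann_a.items() if f})
        return extended

    a = {"g1": "DNA polymerase", "g2": "hypothetical protein", "g3": None}
    b = {"g1": "DNA polymerase", "g3": "ABC transporter"}
    print(compare_annotations(a, b))
    print(extend_annotation(a, b))  # g3 gains a function from method B
    ```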

  4. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  5. Quick Pad Tagger : An Efficient Graphical User Interface for Building Annotated Corpora with Multiple Annotation Layers

    OpenAIRE

    Marc Schreiber; Kai Barkschat; Bodo Kraft; Albert Zundorf

    2015-01-01

    More and more domain-specific applications on the internet make use of Natural Language Processing (NLP) tools (e.g. Information Extraction systems). The output quality of these applications relies on the output quality of the NLP tools used. Often, the quality can be increased by annotating a domain-specific corpus. However, annotating a corpus is a time-consuming and exhausting task. To reduce the annotation time we present...

  6. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    Energy Technology Data Exchange (ETDEWEB)

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  7. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

    Science.gov (United States)

    Brettin, Thomas; Davis, James J; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Olsen, Gary J; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D; Shukla, Maulik; Thomason, James A; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.
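
    The "custom pipeline" idea amounts to an ordered list of feature-calling and function-assignment stages applied to a genome object. A schematic Python sketch of that composition (the stage names and genome structure are invented for illustration; RASTtk itself is a separate toolkit with its own commands):

    ```python
    # Schematic pipeline composition: each stage takes and returns a
    # genome record, so stages can be reordered, dropped, or swapped.

    def call_rnas(genome):
        genome.setdefault("features", []).append({"type": "rRNA"})
        return genome

    def call_cds(genome):
        genome.setdefault("features", []).append({"type": "CDS"})
        return genome

    def assign_functions(genome):
        for feat in genome.get("features", []):
            feat.setdefault("function", "hypothetical protein")
        return genome

    def run_pipeline(genome, stages):
        for stage in stages:
            genome = stage(genome)
        return genome

    custom_pipeline = [call_rnas, call_cds, assign_functions]
    annotated = run_pipeline({"id": "my_genome"}, custom_pipeline)
    print(annotated["features"])
    ```

    Batch submission then just maps the same stage list over many genome records.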

  8. SECRETARIAT AND TEXTUAL PRODUCTION: ARGUMENTATION IN THE TEXTUAL GENRE LETTER OF DECLARATION

    Directory of Open Access Journals (Sweden)

    Erivaldo Pereira do Nascimento

    2012-05-01

    Full Text Available

    This article aims at describing the semantic-argumentative structure of the textual/discursive genre Letter of Declaration, one of the documents with which the executive secretary frequently deals. This investigation is based on the Theory of Argumentation within Language proposed by Ducrot (1988, 1987). We also used the studies on discursive modalization proposed by Koch (2002), Castilho and Castilho (1993), and Nascimento (2005), among others. Modalization is considered here as a semantic-argumentative strategy, as it enables the speaker to make a statement or to express a point of view about the content of his/her enunciation, according to the interlocution. This study of the previously mentioned genre is qualitative, quantitative and descriptive. The corpus is composed of 20 (twenty) Letters of Declaration issued by different organizations or private and public institutions. We found that, in the Letters of Declaration analysed, argumentation is achieved through modalizers and argumentative operators, used by the speaker to produce different effects of meaning in the texts.


  9. Model and Interoperability using Meta Data Annotations

    Science.gov (United States)

    David, O.

    2011-12-01

    Software frameworks and architectures are in need of metadata to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing metadata, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings; however, the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach to metadata representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programming language features known as annotations or attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interface (API) calls, is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies and physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to their originating code. Since models and
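
    As a loose illustration of the approach, here is a Python analogy using decorators to attach machine-readable metadata directly to a model component (OMS itself is Java-based and uses language annotations; the decorator and field names below are invented to mirror the concept, not the OMS API):

    ```python
    # Rough analogy: metadata lives next to the component code, so a
    # framework can introspect it instead of reading external XML.

    def metadata(**meta):
        def wrap(obj):
            obj.__component_meta__ = meta
            return obj
        return wrap

    @metadata(inputs={"precip": "mm/day"}, outputs={"runoff": "mm/day"},
              description="Toy runoff component")
    def runoff_component(precip):
        return 0.3 * precip  # toy runoff coefficient

    # A framework could wire components together from this metadata:
    print(runoff_component.__component_meta__["inputs"])
    print(runoff_component(10.0))
    ```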

  10. miRBase: annotating high confidence microRNAs using deep sequencing data.

    Science.gov (United States)

    Kozomara, Ana; Griffiths-Jones, Sam

    2014-01-01

    We describe an update of the miRBase database (http://www.mirbase.org/), the primary microRNA sequence repository. The latest miRBase release (v20, June 2013) contains 24 521 microRNA loci from 206 species, processed to produce 30 424 mature microRNA products. The rate of deposition of novel microRNAs and the number of researchers involved in their discovery continue to increase, driven largely by small RNA deep sequencing experiments. In the face of these increases, and a range of microRNA annotation methods and criteria, maintaining the quality of the microRNA sequence data set is a significant challenge. Here, we describe recent developments of the miRBase database to address this issue. In particular, we describe the collation and use of deep sequencing data sets to assign levels of confidence to miRBase entries. We now provide a high confidence subset of miRBase entries, based on the pattern of mapped reads. The high confidence microRNA data set is available alongside the complete microRNA collection at http://www.mirbase.org/. We also describe embedding microRNA-specific Wikipedia pages on the miRBase website to encourage the microRNA community to contribute and share textual and functional information.
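
    The confidence-assignment logic can be caricatured in a few lines: an entry is flagged high confidence only when its mapped deep-sequencing reads support the annotated hairpin. The threshold below is an invented placeholder, not miRBase's published criteria:

    ```python
    # Invented threshold for illustration only.
    MIN_READS_PER_ARM = 10

    def high_confidence(entry):
        """entry: dict with read counts mapped to the 5p and 3p arms."""
        return (entry["reads_5p"] >= MIN_READS_PER_ARM
                and entry["reads_3p"] >= MIN_READS_PER_ARM)

    entries = [
        {"id": "mir-1", "reads_5p": 1500, "reads_3p": 42},
        {"id": "mir-x", "reads_5p": 3, "reads_3p": 0},
    ]
    print([e["id"] for e in entries if high_confidence(e)])  # ['mir-1']
    ```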

  11. Reconsidering the Rhizome: A Textual Analysis of Web Search Engines as Gatekeepers of the Internet

    Science.gov (United States)

    Hess, A.

    Critical theorists have often drawn from Deleuze and Guattari's notion of the rhizome when discussing the potential of the Internet. While the Internet may structurally appear as a rhizome, its day-to-day usage by millions via search engines precludes experiencing the random interconnectedness and potential democratizing function. Through a textual analysis of four search engines, I argue that Web searching has grown hierarchies, or "trees," that organize data in tracts of knowledge and place users in marketing niches rather than assist in the development of new knowledge.

  12. Narrative text comprehension and production in preschoolers

    OpenAIRE

    Luz Stella López Silva; Claudia Patricia Duque Aristizabal; Gina Lizeth Camargo Deluque; Amalia Ovalle Parra

    2014-01-01

    This study sought to identify and describe the narrative text comprehension and production skills of 158 transition-grade students from strata 1 and 2 attending public schools in the city of Barranquilla (Colombia). To assess this ability, the children were asked to retell a text that had previously been read to them. The language samples were analyzed using the SALT software, and the SPSS software was subsequently used to carry out the descriptive analyses. ...

  13. Cocuyo: a matrix of identity and its unfolding in textual subworlds

    OpenAIRE

    Cristián Montes

    2016-01-01

    This article attempts to provide a coherent reading of the novel Cocuyo. To this end, it works with the idea of the matrix proposed by Michael Riffaterre and the possible worlds of Tomás Albaladejo Mayordomo. Both theories make it possible to address the problem of identity from complementary angles, starting from the motif of the journey. In this way it can be seen that the proposed matrix unfolds into the various textual subworlds which, in turn, reveal the image of the world and...

  14. Lexical stock in university students: implications for textual interaction

    OpenAIRE

    Piatti, Vanesa; Fernicola, Alfredo; Melillo, Oscar Roberto; Peralta, Diego

    2011-01-01

    This paper reports the preliminary results of ongoing research, specifically the results found in tests that examine lexical stock, of interest because of its relations with textual interaction. For that reason, the preparation of this report took particular account of the reading comprehension model proposed by Kintsch and Rawson (2005) on the different levels of processing involved in reading. The authors...

  15. Hypertext in Secondary School in Chile: Is it pertinent to apply traditional textual approaches?

    OpenAIRE

    Ayala Pérez, Teresa

    2012-01-01

    Digital texts constitute an important part of communication in the information society and have recently been included in the Language and Communication curricula of secondary education in Chile. Given this situation, it is worth asking whether the traditional textual approaches, developed from the printed text in the 1970s and 1980s and currently used in secondary education, are applicable to hypertext. This paper describes some features of this type of text and reflects...

  16. Redemptive Family Narratives: Olga Lengyel and the Textuality of the Holocaust*

    Science.gov (United States)

    Turda, Marius

    2016-01-01

    Memoirs written by Holocaust survivors and (in some cases) their testimonies retain a salience unmatched by other historical sources. This article discusses one such memoir, Olga Lengyel’s Five Chimneys, alongside her 1998 testimony, aiming to engage with broader methodological issues relating to the history of the Holocaust, particularly those about memory, narrative and textuality. Through a detailed discussion of certain moments shaping Olga Lengyel’s personal experience, both pre- and post-arrival in Auschwitz, the article captures the tensions and contradictions characterizing the harrowing story of one woman’s loss of family in the Holocaust. PMID:27959969

  17. What Good’s a Text? Textuality, Orality, and Mathematical Astronomy in Early Imperial China

    OpenAIRE

    Morgan, Daniel Patrick

    2014-01-01

    Article submitted to Archives internationales d'histoire des sciences. This paper examines a 226 CE debate on li 曆 mathematical astronomy at the Cao-Wei (226–265) court as a case study in the role of orality and person-to-person exchange in the transmission of astronomical knowledge in early imperial China. The li- and mathematics-related manuscripts to have come down to us from the early imperial period often suffer from textual corruption, the form that this corrupti...

  18. Consumer energy research: an annotated bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, C.D.; McDougall, G.H.G.

    1980-01-01

    This document is an updated and expanded version of an earlier annotated bibliography by Dr. C. Dennis Anderson and Carman Cullen (A Review and Annotation of Energy Research on Consumers, March 1978). It is the final draft of the major report that will be published in English and French and made publicly available through the Consumer Research and Evaluation Branch of Consumer and Corporate Affairs, Canada. Two agencies granting permission to include some of their energy abstracts are the Rand Corporation and the DOE Technical Information Center. The bibliography consists mainly of empirical studies, including surveys and experiments. It also includes a number of descriptive and econometric studies that utilize secondary data. Many of the studies provide summaries of research in specific areas, and point out directions for future research efforts. 14 tables.

  19. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra- and intergenomic regions. The presence of multiple coding regions complicates the concept of the Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping reading frames. We may thus achieve an annotation both of coding regions and of selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...

  20. Annotating functional RNAs in genomes using Infernal.

    Science.gov (United States)

    Nawrocki, Eric P

    2014-01-01

    Many different types of functional non-coding RNAs participate in a wide range of important cellular functions but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence, and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.
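
    The search step itself is a single command-line call. A hedged sketch of driving it from Python, assuming Infernal is installed and the Rfam covariance models have been downloaded and prepared with cmpress (file names are placeholders):

    ```python
    # Scan a genome for Rfam RNA families with Infernal's cmscan.
    import subprocess

    subprocess.run(
        ["cmscan", "--tblout", "rna_hits.tbl", "Rfam.cm", "genome.fa"],
        check=True,
    )
    # Each row of rna_hits.tbl is a candidate RNA locus: the matching
    # family, target coordinates, strand, and a score/E-value to filter on.
    ```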

  1. Deburring: an annotated bibliography. Volume V

    International Nuclear Information System (INIS)

    Gillespie, L.K.

    1978-01-01

    An annotated summary of 204 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified as are the materials deburred

  2. Investigating Differential Learning Outcomes of Students in Physics Using Animation and Textual Information Teaching Strategies in Ondo State Secondary School

    Directory of Open Access Journals (Sweden)

    Blessing Eguabor

    2017-06-01

    Full Text Available Background: The study investigated the main effects of animation information and textual information on students' performance and on improving students' attitude towards physics. Material and methods: The study adopted the pre-test post-test control group design. The population was made up of SSS 2 students in Ondo State. Three Local Government Areas were randomly selected from the 18 Local Government Areas of Ondo State. A simple random sampling technique was used to select three schools in the selected Local Government Areas. The schools were randomly assigned to two experimental groups, namely the animation strategy and the textual information strategy, and one control group. Two instruments were used for the study. Results: One-way Analysis of Variance (ANOVA), Scheffé post-hoc pairwise comparison analysis, and two-way Analysis of Variance were used. The results showed that there was a significant main effect of the animation and textual strategies on students' performance in physics. The results also showed that there was a significant difference in the post-test attitudinal scores of students exposed to the strategies, with effectiveness in the order of animation, textual, and conventional strategies. Conclusions: The study concluded that computer-based instruction such as animation and textual strategies could enhance learning outcomes in physics in senior secondary school irrespective of students' sex.

  3. Automatic Function Annotations for Hoare Logic

    Directory of Open Access Journals (Sweden)

    Daniel Matichuk

    2012-11-01

    Full Text Available In systems verification we are often concerned with multiple, inter-dependent properties that a program must satisfy. To prove that a program satisfies a given property, the correctness of intermediate states of the program must be characterized. However, this intermediate reasoning is not always phrased such that it can be easily re-used in the proofs of subsequent properties. We introduce a function annotation logic that extends Hoare logic in two important ways: (1) when proving that a function satisfies a Hoare triple, intermediate reasoning is automatically stored as function annotations, and (2) these function annotations can be exploited in future Hoare logic proofs. This reduces duplication of reasoning between the proofs of different properties, whilst serving as a drop-in replacement for traditional Hoare logic to avoid the costly process of proof refactoring. We explain how this was implemented in Isabelle/HOL and applied to an experimental branch of the seL4 microkernel to significantly reduce the size and complexity of existing proofs.
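
    For readers unfamiliar with the notation, a Hoare triple pairs a precondition and a postcondition around a program fragment. The following LaTeX snippet is a generic illustration of the kind of fact the paper's function annotations store and reuse, not the Isabelle/HOL formalization itself:

    ```latex
    % A Hoare triple: if n >= 0 holds before the call, r = n! holds after.
    \{\, n \ge 0 \,\}\;\; \mathtt{r := fact(n)} \;\;\{\, r = n! \,\}
    ```

    In the paper's scheme, intermediate facts established while proving one such triple are retained as annotations on the function, so a later proof about the same function can appeal to them instead of re-deriving them.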

  4. Jannovar: a java library for exome annotation.

    Science.gov (United States)

    Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N

    2014-05-01

    Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar.
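
    The interval query at the heart of this step is conceptually simple: given a variant's position, return every transcript whose genomic interval contains it. A small Python sketch of that lookup (brute force rather than a real interval tree, with invented transcript data):

    ```python
    # Brute-force stand-in for an interval-tree query: find all
    # transcripts overlapping a variant position.

    transcripts = [
        {"id": "tx1", "chrom": "1", "start": 100, "end": 500},
        {"id": "tx2", "chrom": "1", "start": 450, "end": 900},
        {"id": "tx3", "chrom": "2", "start": 100, "end": 500},
    ]

    def affected_transcripts(chrom, pos):
        return [t["id"] for t in transcripts
                if t["chrom"] == chrom and t["start"] <= pos <= t["end"]]

    print(affected_transcripts("1", 470))  # ['tx1', 'tx2']
    ```

    An interval tree gives the same answer in O(log n + k) per query, which is what makes whole-exome annotation fast on a desktop machine.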

  5. Annotating breast cancer microarray samples using ontologies

    Science.gov (United States)

    Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert

    2008-01-01

    As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advances in high-throughput gene expression microarray technology have inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access to data from many experiments have caused inconsistencies which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of the NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to index breast cancer microarray samples from GEO using BCM-CO and the MGED ontology, and developed a prototype system with a web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108

  6. The concept of fictionality: theory and textual representations

    Directory of Open Access Journals (Sweden)

    Álamo Felices, Francisco

    2014-06-01

    Full Text Available This article is an updated tour (scientific, terminological and bibliographical) of the concept of fictionality and its fundamental textual representations. The structure of the article moves from a prior state of the question to a theoretical exposition of these fictional forms, to which a sampler of narrative examples applied to each of the analyzed narratives is added. All of this is complemented by recent contributions made within the field of fictionality by both literary critics and the novelists themselves.

  7. A Didactic Reflection on the Distribution of Textual Genres in Primary School

    Directory of Open Access Journals (Sweden)

    João Bosco Figueiredo-Gomes

    2015-12-01

    Full Text Available Considering the interest in studies of textual genre and its teaching over the last three decades, this article reports an analysis of a collection of Portuguese-language textbooks for Primary School II, which aimed to verify whether the textbooks enable an understanding of different textual genres, both for the student, in relation to context and social use, and for the teacher, as parameters for the analysis and choice of textbook collections. The theoretical-methodological framework is based mainly on the contributions of socio-discursive interactionism, Bronckart (1999, 2003) and Schneuwly and Dolz (2004); on the studies by Marcuschi (2002, 2008); and on the guidelines for mother-tongue teaching contained in the National Curriculum Guidelines – PCN (BRASIL, 1998). The results provide a reflection that offers parameters for the analysis, choice and organization of textbook collections, as well as suggestions for pedagogical implementation in the teaching of the Portuguese language.

  8. Evaluation of three automated genome annotations for Halorhabdus utahensis.

    Directory of Open Access Journals (Sweden)

    Peter Bakke

    2009-07-01

    Full Text Available Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus ribosome-binding site. Additionally, we conducted laboratory experiments to test H. utahensis growth and enzyme activity. Current annotation practices need to improve in order to more accurately reflect a genome's biological potential. We make specific recommendations that could improve the quality of microbial annotation projects.

  9. Principles of Textual Communication. On the Basis of Polish Press Reports after President Obama’s 2009 Inauguration

    Directory of Open Access Journals (Sweden)

    Piotr P. Chruszczewski

    2009-11-01

    Full Text Available On the basis of the assumption that any discourse is a highly context-dependent and dynamically changing phenomenon of a textual nature, and with reference to the fact that certain standards of textuality may be used in linguistic research, the paper shows that the standards can be grouped into, e.g., three general sections (text-oriented, sender-oriented and context-oriented), which can be used as a starting point for further research in the study of textlinguistics and journalistic discourse.

  10. Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae

    Directory of Open Access Journals (Sweden)

    Deng Jixin

    2009-02-01

    Full Text Available Abstract Background: Magnaporthe oryzae, the causal agent of rice blast, is responsible for the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6 http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html. However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of the M. oryzae genome assembly. Methods: A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For the similarity-based GO annotation, a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, the biological appropriateness of the functional assignments was manually checked. Results: In total, 6,286 proteins received GO term assignments via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for the Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experimentally determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins...
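
    The reciprocal-best-hits filter can be stated in a few lines: a cross-species pair is kept only when each protein is the other's single best hit. A minimal Python sketch (the hit tables are invented; in practice they would come from BLAST output):

    ```python
    # Keep a pair (a, b) only if b is a's best hit AND a is b's best hit.

    def reciprocal_best_hits(best_a_to_b, best_b_to_a):
        """Each argument maps a protein to its single best hit."""
        return {(a, b) for a, b in best_a_to_b.items()
                if best_b_to_a.get(b) == a}

    best_mo_to_go = {"MGG_01": "P001", "MGG_02": "P002"}
    best_go_to_mo = {"P001": "MGG_01", "P002": "MGG_99"}
    print(reciprocal_best_hits(best_mo_to_go, best_go_to_mo))
    # {('MGG_01', 'P001')} -- only the mutual best pair survives
    ```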

  11. Plann: A command-line application for annotating plastome sequences.

    Science.gov (United States)

    Huang, Daisie I; Cronk, Quentin C B

    2015-08-01

    Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
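
    The central operation, shifting a matched reference feature's coordinates onto the new plastome, can be sketched as follows (Plann itself is a Perl script; this Python fragment with a single global offset only illustrates the interval shift, as the real tool derives a mapping per matched region):

    ```python
    # Illustrative only: shift reference feature intervals onto a new
    # sequence, given the position offset of the matched region.

    ref_features = [
        {"gene": "rbcL", "start": 56000, "end": 57500},
        {"gene": "matK", "start": 1700,  "end": 3200},
    ]

    def shift_features(features, offset):
        """offset: position delta from the reference to the new plastome."""
        return [{"gene": f["gene"],
                 "start": f["start"] + offset,
                 "end": f["end"] + offset} for f in features]

    print(shift_features(ref_features, 120))
    ```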

  12. Annotation of mammalian primary microRNAs

    Directory of Open Access Journals (Sweden)

    Enright Anton J

    2008-11-01

    Full Text Available Abstract Background: MicroRNAs (miRNAs) are important regulators of gene expression and have been implicated in development, differentiation and pathogenesis. Hundreds of miRNAs have been discovered in mammalian genomes. Approximately 50% of mammalian miRNAs are expressed from introns of protein-coding genes; the primary transcript (pri-miRNA) is therefore assumed to be the host transcript. However, very little is known about the structure of pri-miRNAs expressed from intergenic regions. Here we annotate transcript boundaries of miRNAs in the human, mouse and rat genomes using various transcription features. The 5' end of the pri-miRNA is predicted from transcription start sites, CpG islands and 5' CAGE tags mapped in the upstream flanking region surrounding the precursor miRNA (pre-miRNA). The 3' end of the pri-miRNA is predicted based on the mapping of polyA signals, and supported by cDNA/EST and ditag data. The predicted pri-miRNAs are also analyzed for promoter- and insulator-associated regulatory regions. Results: We define sets of conserved and non-conserved human, mouse and rat pre-miRNAs using bidirectional BLAST and synteny analysis. Transcription features in their flanking regions are used to demarcate the 5' and 3' boundaries of the pri-miRNAs. The lengths and boundaries of primary transcripts are highly conserved between orthologous miRNAs. A significant fraction of pri-miRNAs have lengths between 1 and 10 kb, with very few introns. We annotate a total of 59 pri-miRNA structures, which include 82 pre-miRNAs. 36 pri-miRNAs are conserved in all 3 species. In total, 18 of the confidently annotated transcripts express more than one pre-miRNA. The upstream regions of 54% of the predicted pri-miRNAs are found to be associated with promoter and insulator regulatory sequences. Conclusion: Little is known about the primary transcripts of intergenic miRNAs. Using comparative data, we are able to identify the boundaries of a significant proportion of

  13. Annotated bibliography of Software Engineering Laboratory literature

    Science.gov (United States)

    Morusiewicz, Linda; Valett, Jon D.

    1991-01-01

    An annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory is given. More than 100 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. All materials have been grouped into eight general subject areas for easy reference: The Software Engineering Laboratory; The Software Engineering Laboratory: Software Development Documents; Software Tools; Software Models; Software Measurement; Technology Evaluations; Ada Technology; and Data Collection. Subject and author indexes further classify these documents by specific topic and individual author.

  14. Attitudes and emotions through written text: the case of textual deformation in internet chat rooms.

    Directory of Open Access Journals (Sweden)

    Francisco Yus Ramos

    2010-11-01

    Full Text Available Spanish Internet chat rooms are visited by many young people who use language in highly creative ways (e.g. repeating letters and punctuation marks). This article evaluates several hypotheses on the use of textual deformation with regard to its communicative effectiveness, testing whether these deformations favour a more accurate identification and assessment of the attitudes (propositional or affective) and emotions of their authors. The answers to a questionnaire reveal that, despite the additional information that textual deformation provides, readers rarely agree on the exact quality of these attitudes and emotions, nor do they establish degrees of intensity related to the amount of text typed. Nevertheless, despite these results, textual deformation does seem to play a role in the interpretation that is ultimately chosen for these messages sent to chat rooms.

  15. Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.

    Science.gov (United States)

    Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of SWISS-PROT's. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to develop and apply computer methods to enhance the functional information attached to TREMBL entries.

  16. A Novel Approach to Semantic and Coreference Annotation at LLNL

    Energy Technology Data Exchange (ETDEWEB)

    Firpo, M

    2005-02-04

    A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (5) A pilot study is recommended in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work are evaluated relative to that of their neuro-typical peers.

  17. Review of actinide-sediment reactions with an annotated bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Ames, L.L.; Rai, D.; Serne, R.J.

    1976-02-10

    The annotated bibliography is divided into sections on chemistry and geochemistry, migration and accumulation, cultural distributions, natural distributions, and bibliographies and annual reviews. (LK)

  18. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Directory of Open Access Journals (Sweden)

    Bert Ely

    Full Text Available Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the locations of peaks in third codon position GC content to the locations of protein-coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.
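
    The third-codon-position GC signal that drove much of this correction is straightforward to compute. A minimal Python sketch (the sequence is a toy example):

    ```python
    # GC fraction at the third position of each codon; in high-GC genomes,
    # genuine coding regions show a pronounced GC3 peak.

    def gc3(cds):
        third = cds[2::3].upper()
        return sum(base in "GC" for base in third) / len(third)

    print(gc3("ATGGCAGATCTGGAATAG"))  # 0.5 for this toy 6-codon ORF
    ```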

  19. Annotating non-coding regions of the genome.

    Science.gov (United States)

    Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B

    2010-08-01

    Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
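
    The signal-smoothing-segmentation chain the authors describe can be illustrated with a toy example. A short Python sketch (the window size and threshold are arbitrary choices):

    ```python
    # Toy version of the functional-genomics annotation chain:
    # per-base signal -> smoothed signal -> blocks above a threshold.

    def smooth(signal, w=3):
        half = w // 2
        return [sum(signal[max(0, i - half): i + half + 1]) /
                len(signal[max(0, i - half): i + half + 1])
                for i in range(len(signal))]

    def segment(signal, threshold=1.0):
        blocks, start = [], None
        for i, v in enumerate(signal + [0.0]):  # sentinel closes open block
            if v > threshold and start is None:
                start = i
            elif v <= threshold and start is not None:
                blocks.append((start, i - 1))
                start = None
        return blocks

    raw = [0.1, 0.2, 2.5, 3.0, 2.8, 0.3, 0.1, 1.9, 2.2, 0.2]
    print(segment(smooth(raw)))  # [(2, 5), (7, 9)]
    ```

    Clustering the resulting blocks across experiments would then yield the larger derived annotations the authors mention.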

  20. Telling the Holy Sepulchre. “Mises en scène” and textual representations

    Directory of Open Access Journals (Sweden)

    Hans-Joachim Schmidt

    2016-06-01

    Full Text Available During the Middle Ages, places and holy objects were concrete in their materiality and especially in their geographical location. However, they could also proliferate in other places, in a real “multiplication of presence”. Indeed, texts and maps informed the reader and the spectator, but also put them in touch with the sphere of holiness. The idea of the omnipresence of salvation facilitated a relocation of the poles of worship, which was nevertheless based on material contacts. This contrast between dematerialization and attachment to things maintained a play of opposites between absence and presence. The material or textual reproduction of objects and holy places ensured that they became accessible to all those who would not or could not travel to them. This contribution offers an analysis of maps and various accounts by medieval authors which made a symbolic approach to the holy places in Jerusalem accessible to many believers.

  1. Multimedia and Textual Reading Comprehension: Multimedia as Personal Learning Environment’s Enriching Format

    Directory of Open Access Journals (Sweden)

    Jose Daniel García Martínez

    2017-01-01

    Full Text Available In this article we discuss part of a piece of research conducted with two 4ESO groups, in which textual learning is contrasted with multimedia learning in the context of PLE (Personal Learning Environment) reading tools and strategies. The research analysed whether it would be possible to improve the reading process through multimedia over a school term in two different respects: an evolutionary one, with six classroom exercises, and an evaluative one, with a final exercise. Concretely, this article reports the number of mistakes the students made on the questions. The data indicate a better evolution in students who performed the multimedia dynamic, although there are no relevant differences in the final evaluation.

  2. Contributions of the textual analysis of speeches for the teaching in the virtual enviroments

    Directory of Open Access Journals (Sweden)

    Sueli Cristina Marquesi

    2013-12-01

    Full Text Available The present paper aims at discussing theoretical aspects of the Textual Analysis of Speeches which guide an autonomous learning methodology for university students. Grounded mainly in the studies developed by Adam (2008), a thematic unit of theoretical content in the area of the Exact Sciences is discussed. Within this unit, explicative and descriptive sequences are constructed in such a way as to promote interaction between the text, presented in a virtual environment, and the student. Consequently, the learning of new content is facilitated. To this end, activities prepared for engineering students of a Brazilian university and delivered entirely at a distance are brought into the discussion. The methodology establishes a dialogue between an issue that is central to learning in virtual environments, namely interaction through language, and the role the student has to assume in these environments: a reader/author who makes meaning and transfers knowledge.

  3. Textual itineraries of the "Quijote" in America (17th to 19th centuries)

    Directory of Open Access Journals (Sweden)

    Eva María Valero Juan

    2013-12-01

    Full Text Available Starting with a brief description of the journey taken by the first editions of the Quixote sent to America, this article reviews the textual biography of don Quixote in America between the 17th and 19th centuries. This selective review starts with accounts of popular festivities that record the first appearances of don Quixote and Sancho in American texts. The selection ranges from 1607 to the century of Spanish American emancipation, paying particular attention to its final years, with an analysis of some of the texts that Rubén Darío devoted to the Quixote and of the rise of don Quixote as a Spanish American symbol of identity.

  4. Write like a visual artist: Tracing the textually mediated art world

    Directory of Open Access Journals (Sweden)

    Janna Klostermann

    2016-12-01

    Full Text Available This study examines the social organisation of Canada’s art world from the standpoint of practising visual artists. Bringing together theories of literacy and institutional ethnography, the article investigates the literacy practices of visual artists, making visible how artists use written texts to participate in public galleries and in the social and institutional relations of the art world. Drawing on extended ethnographic research, including interviews, observational field notes and textual analyses, this study sheds light on the ways visual artists enact particular texts, organisational processes, and the social and conceptual worlds they are a part of. Through the lens of visual artists, this study locates two particular texts – the artist statement and the bio statement – in the extended social and institutional relations of the art world.

  5. Intertextualidade e produção textual: análise das narrativas do CELLIJ

    Directory of Open Access Journals (Sweden)

    Edna Mara da Silva de Souza

    2013-12-01

    Full Text Available As Bakhtin (2006) stated, textual construction is permeated with intertextuality, i.e., every text draws on other texts. From this perspective, this study examines intertextuality and the representation of narrative structure in texts produced by students who participated in creative text-production workshops held after storytelling sessions at the Centre for Studies in Reading and Children's and Youth Literature Betty Maria Coelho Silva (CELLIJ) at the UNESP campus of Presidente Prudente - SP. This paper reflects on the importance of offering new narratives, whether through storytelling or the reading of literary texts, in order to build the child's repertoire. From the reading of the corpus of nine selected texts, we observed a strong influence of the heroes of commercial films and television series on the children's production, as well as a tendency to write within the conditional structure of fairy tales.

  6. Training nuclei detection algorithms with simple annotations

    Directory of Open Access Journals (Sweden)

    Henning Kost

    2017-01-01

    Full Text Available Background: Generating good training datasets is essential for machine learning-based nuclei detection methods. However, creating exhaustive nuclei contour annotations, to derive optimal training data from, is often infeasible. Methods: We compared different approaches for training nuclei detection methods solely based on nucleus center markers. Such markers contain less accurate information, especially with regard to nuclear boundaries, but can be produced much easier and in greater quantities. The approaches use different automated sample extraction methods to derive image positions and class labels from nucleus center markers. In addition, the approaches use different automated sample selection methods to improve the detection quality of the classification algorithm and reduce the run time of the training process. We evaluated the approaches based on a previously published generic nuclei detection algorithm and a set of Ki-67-stained breast cancer images. Results: A Voronoi tessellation-based sample extraction method produced the best performing training sets. However, subsampling of the extracted training samples was crucial. Even simple class balancing improved the detection quality considerably. The incorporation of active learning led to a further increase in detection quality. Conclusions: With appropriate sample extraction and selection methods, nuclei detection algorithms trained on the basis of simple center marker annotations can produce comparable quality to algorithms trained on conventionally created training sets.
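    As a rough illustration of how pixel-level training labels might be derived from center markers alone, the sketch below assigns every pixel to its nearest marker (a Voronoi-style tessellation via a k-d tree), takes pixels close to a marker as positives and near-equidistant pixels as boundary negatives, then balances the classes by subsampling. Radii, thresholds and names are illustrative assumptions, not the authors' implementation.

        import numpy as np
        from scipy.spatial import cKDTree

        def extract_training_samples(centers, shape, pos_radius=5.0):
            """Derive positive/negative pixel samples from nucleus center markers."""
            tree = cKDTree(centers)
            ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
            pixels = np.column_stack([ys.ravel(), xs.ravel()])
            dist, _ = tree.query(pixels, k=2)       # two nearest centers per pixel
            positives = pixels[dist[:, 0] <= pos_radius]
            # near-equidistant pixels lie on the Voronoi boundary between nuclei
            negatives = pixels[np.abs(dist[:, 0] - dist[:, 1]) < 1.0]
            return positives, negatives

        centers = np.array([[20, 20], [20, 60], [60, 40]])
        pos, neg = extract_training_samples(centers, shape=(80, 80))
        # simple class balancing: subsample the larger class
        n = min(len(pos), len(neg))
        rng = np.random.default_rng(0)
        neg = neg[rng.choice(len(neg), size=n, replace=False)]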

  7. Phenex: ontological annotation of phenotypic diversity.

    Directory of Open Access Journals (Sweden)

    James P Balhoff

    2010-05-01

    Full Text Available Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

  8. Phenex: ontological annotation of phenotypic diversity.

    Science.gov (United States)

    Balhoff, James P; Dahdul, Wasila M; Kothari, Cartik R; Lapp, Hilmar; Lundberg, John G; Mabee, Paula; Midford, Peter E; Westerfield, Monte; Vision, Todd J

    2010-05-05

    Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.
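    The Entity-Quality syntax mentioned above pairs an anatomical entity term with a phenotypic quality term, anchored to a taxon. A single annotation record could be represented as simply as the Python dictionary below; the UBERON and PATO identifiers are placeholders in the style of real ontology terms, not annotations from an actual study.

        eq_annotation = {
            "taxon": "NCBITaxon:7955",        # Danio rerio (zebrafish)
            "entity": "UBERON:0000000",       # placeholder anatomical entity ID
            "quality": "PATO:0000000",        # placeholder quality ID, e.g. "absent"
            "source": "doi:10.xxxx/example",  # placeholder publication reference
        }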

  9. Harm reduction in U.S. tobacco control: Constructions in textual news media.

    Science.gov (United States)

    Eversman, Michael H

    2015-06-01

    U.S. tobacco control has long emphasized abstinence, yet quitting smoking is hard and cessation rates low. Tobacco harm reduction alternatives espouse substituting cigarettes with safer nicotine and tobacco products. Policy shifts embracing tobacco harm reduction have increased media attention, yet it remains controversial. Discourse theory posits language as fluid, and socially constructed meaning as neither absolute nor neutral, elevating certain views over others while depicting "discursive struggle" between them. While an abstinence-based framework dominates tobacco policy, discourse theory suggests constructions of nicotine and tobacco use can change, for example by positioning tobacco harm reduction more favorably. Textual discourse analysis was used to explore constructions of tobacco harm reduction in 478 (308 original) U.S. textual news media articles spanning 1996-2014. Using keyword database sampling, retrieved articles were analyzed first as discrete recording units and then to identify emergent thematic content. Constructions of tobacco harm reduction shifted over this time, revealing tension among industry and policy interests through competing definitions of tobacco harm reduction, depictions of its underlying science, and accounts of regulatory matters including tobacco industry support for harm reduction and desired marketing and taxation legislation. Heightened salience surrounding tobacco harm reduction and electronic cigarettes suggests their greater acceptance in U.S. tobacco control. Various media depictions construct harm reduction as a temporary means to cessation, and conflict with other constructions of it that place no subjective value on continued "safer" tobacco/nicotine use. Constructions of science largely obscure claims of the veracity of tobacco harm reduction, with conflict surrounding appropriate public health benchmarks for tobacco policy and health risks of nicotine use. Taxation policies and e-cigarette pricing relative to

  10. Machine learning approaches to analysing textual injury surveillance data: a systematic review.

    Science.gov (United States)

    Vallmuur, Kirsten

    2015-06-01

    To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data. Systematic review. The electronic databases searched included PubMed, Cinahl, Medline, Google Scholar, and Proquest. The bibliography of all relevant articles was examined and associated articles were identified using a snowballing technique. For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, and (c) used machine learning approaches to analyse textual data. The papers identified through the search were screened, resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strengths and limitations of different techniques, and quality assurance approaches used. Due to heterogeneity between studies, meta-analysis was not performed. Occupational injuries were the focus of half of the machine learning studies, and the most common methods described were Bayesian probability or Bayesian network based methods, used either to predict injury categories or to extract common injury scenarios. Models were evaluated through comparison with gold standard data, content expert evaluation, or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, and was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalizability, source data quality, complexity of models and integration of content and technical knowledge were discussed. The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality
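    As a toy illustration of the Bayesian text-classification approach the reviewed studies describe, the sketch below trains a naive Bayes model to predict an injury category from a short narrative using scikit-learn. The miniature dataset is invented for illustration; no real surveillance data is implied.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        narratives = [
            "fell from ladder while painting ceiling",
            "slipped on wet floor in kitchen",
            "cut finger on box cutter unpacking stock",
            "fell off scaffolding at construction site",
        ]
        categories = ["fall", "fall", "laceration", "fall"]

        # unigram+bigram counts feeding a multinomial naive Bayes classifier
        model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
        model.fit(narratives, categories)
        print(model.predict(["worker slipped and fell near loading dock"]))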

  11. Textual Self-branding: the Rhetorical Ethos in Mallarmé’s Divagations

    Directory of Open Access Journals (Sweden)

    Arild Michel Bakken

    2011-11-01

    Full Text Available This article examines, from a rhetorical perspective, the textual presence of the auctorial figure in Mallarmé’s collection of prose writings, Divagations. It challenges the traditional and structuralist idea of Mallarmé as a poet eager to exclude his own persona from his work, and even as the initiator of the “death of the author.” Recent Mallarméan studies have been shifting the field’s attention away from the myth of the ivory tower to focus on the poet’s social project as it appears in the Divagations. Such a project presupposes a rhetorical commitment, and thus an auctorial presence in the text. The question that is raised here is then what role the figure of the poet plays in Mallarmé’s rhetorical strategy. A close rhetorical analysis of the Divagations reveals that the poet constantly, although discreetly, writes his own persona into the text. Throughout the Divagations, Mallarmé deploys much effort to give his persona qualities likely to win the support of his audience. It is argued that this manifest ethos preoccupation has a double function. The rhetorically efficient image of the poet is obviously intended to add authority to his social project. However, the poet’s constant cultivation of his textual figure shows that the ethos has gained a certain autonomy. An important preoccupation for the poet is in fact to brand himself as an author: contrary to the traditional idea of the absent poet, the auctorial figure seems to be one of the primary subjects of the Divagations. The argument thus invites us, in order to avoid overlooking this central aspect of Mallarmé’s project, to take the ethos perspective into account in any approach to Mallarmé’s prose work.

  12. Assessing Online Textual Feedback to Support Student Intrinsic Motivation Using a Collaborative Text-Based Dialogue System: A Qualitative Study

    Science.gov (United States)

    Shroff, Ronnie H.; Deneen, Christopher

    2011-01-01

    This paper assesses textual feedback to support student intrinsic motivation using a collaborative text-based dialogue system. A research model is presented based on research into intrinsic motivation, and the specific construct of feedback provides a framework for the model. A qualitative research methodology is used to validate the model.…

  13. Translation Competence and Translation Performance: Lexical, Syntactic and Textual Patterns in Student Translations of a Specialized EU Genre

    Science.gov (United States)

    Karoly, Adrienn

    2012-01-01

    This paper reports the findings of a study aiming to reveal the recurring patterns of lexical, syntactic and textual errors in student translations of a specialized EU genre from English into Hungarian. By comparing the student translations to the official translation of the text, this article uncovers the most frequent errors that students made…

  14. Texture, Textuality and Political Discourse: A Study of Lexical Cohesion in Nigeria's President Goodluck Jonathan's Inaugural Address, May, 2011

    Science.gov (United States)

    Enyi, Amaechi Uneke; Chitulu, Mark Ononiwu

    2015-01-01

    This study, entitled, "Texture and textuality in Political Discourse: A Study of Cohesive Devices in President Goodluck Jonathan's Inaugural Address-May, 2011" was an analysis of the lexical cohesive devices employed by Nigeria's President Goodluck Jonathan in crafting his May, 2011's Presidential Inaugural Address (PIA). Guided by the…

  15. How strong is your coffee? : The influence of visual metaphors and textual claims on consumers' flavor perception and product evaluation

    NARCIS (Netherlands)

    Fenko, Anna; de Vries, Roxan; van Rompay, Thomas

    2018-01-01

    This study investigates the relative impact of textual claims and visual metaphors displayed on the product's package on consumers' flavor experience and product evaluation. For consumers, strength is one of the most important sensory attributes of coffee. The 2 × 3 between-subjects experiment (N =

  16. Journeys toward Textual Relevance: Male Readers of Color and the Significance of Malcolm X and Harry Potter

    Science.gov (United States)

    Sciurba, Katie

    2017-01-01

    This article combines interview data from a group of boys of color at an urban single-sex school and content analysis of "The Autobiography of Malcolm X" and "Harry Potter and the Sorcerer's Stone" to demonstrate the complexities of readers' responses to literature. Textual relevance, or the ability to construct personal…

  17. A Cross Cultural Analysis of Textual and Interpersonal Metadiscourse Markers: The Case of Economic Articles in English and Persian Newspapers

    Science.gov (United States)

    Boshrabadi, Abbas Mehrabi; Biria, Reza; Zavari, Zahra

    2014-01-01

    This study was an attempt to investigate the functional role of textual and interpersonal metadiscourse markers in English and Persian Economic news reports. To this end, 10 news articles, 5 in each language, were randomly selected from the Economic sections of the leading newspapers published in 2013-2014 in Iran and the United States. Based on…

  18. The Impact of Textual Input Enhancement and Explicit Rule Presentation on Iranian Elementary EFL Learners' Intake of Simple Past Tense

    Science.gov (United States)

    Nahavandi, Naemeh; Mukundan, Jayakaran

    2013-01-01

    The present study investigated the impact of textual input enhancement and explicit rule presentation on 93 Iranian EFL learners' intake of simple past tense. Three intact general English classes in Tabriz Azad University were randomly assigned to: 1) a control group; 2) a TIE group; and 3) a TIE plus explicit rule presentation group. All…

  19. Textual Enhancement and Simplified Input: Effects on L2 Comprehension and Acquisition of Non-Meaningful Grammatical Form

    Science.gov (United States)

    Wong, Wynne

    2003-01-01

    The study set out to investigate how textual enhancement (TE) as a form of input enhancement and increasing the comprehensibility of input via simplified input (SI) might impact adult L2 French learners' acquisition of the past participle agreement in relative clauses and their comprehension of three texts in which the target forms were embedded.…

  20. Effects of Lexical Features, Textual Properties, and Individual Differences on Word Processing Times during Second Language Reading Comprehension

    Science.gov (United States)

    Kim, Minkyung; Crossley, Scott A.; Skalicky, Stephen

    2018-01-01

    This study examines whether lexical features and textual properties along with individual differences on the part of readers influence word processing times during second language (L2) reading comprehension. Forty-eight Spanish-speaking adolescent and adult learners of English read nine English passages in a self-paced word-by-word reading…

  1. [Textual research on Amara (Mangifera indica Linn), Butea monosperma (Lam) Kuntze, and Ferula asafoetida L].

    Science.gov (United States)

    Li, Zhaohua; Wang, Yulin

    2015-01-01

    The Buddhist canons record many medicines imported from abroad. The dictionary works on these canons give detailed annotations and explanations of all these foreign medicines, from which their features can be investigated. It is also clear that the three medicines discussed here were imported into China no later than the Tang Dynasty. Amara, now called mango, was originally grown in the xi yu (Western Region); its form and connotation appeared no later than the Eastern Han Dynasty, and the explanation of this medicine that appears in A Great Modern Dictionary of Chinese is wrong, while the dictionary's explanation of Butea monosperma should be supplemented. There are two kinds of asafoetida, herbaceous and woody; only the former is used for medical purposes, and the annotation that appears in A Great Modern Dictionary of Chinese is problematic.

  2. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease.

    Science.gov (United States)

    Sifrim, Alejandro; Van Houdt, Jeroen Kj; Tranchevent, Leon-Charles; Nowakowska, Beata; Sakai, Ryo; Pavlopoulos, Georgios A; Devriendt, Koen; Vermeesch, Joris R; Moreau, Yves; Aerts, Jan

    2012-01-01

    The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.

  3. ESTUDIO SOBRE LOS ABSTRACTS DE ARTÍCULOS DE INVESTIGACIÓN INFORMÁTICOS: EVIDENCIALIDAD Y MODALIDAD TEXTUAL

    Directory of Open Access Journals (Sweden)

    Ivalla Ortega Barrera

    2010-10-01

    Full Text Available Evidentiality and textual modality are widely debated field categories within the areas of linguistics and textual stylistics. The attitude of the author towards the topic indicates his or her commitment to the proposition put forward. This attitude can have either a rational/evaluative character (objective) or an emotional/expressive character (subjective), which conveys the tonality of the discourse. The present paper, which forms part of the research project "Evidentiality in a multidisciplinary corpus of research papers in English" of the University of Las Palmas de Gran Canaria, focuses on the genre "abstracts of research papers applied to computer science". In this paper we analyse two characteristic functions: the communicative load of the markers corresponding to textual modality, both objective and subjective, in their relationship with evidential and judgement markers, and the frequency of their use in the genre under study.

  4. Conexões entre produção textual e consciência lingüística

    Directory of Open Access Journals (Sweden)

    Soroka, Jaqueline Golbspan

    1998-01-01

    Full Text Available This article examines the connection between linguistic awareness and textual production, emphasizing cohesive mechanisms and the stages underlying the process of textual organization. It is an exploratory, interdisciplinary investigation based on the theoretical assumptions of cognitive psychology, psycholinguistics and textual linguistics. To verify whether performance in textual production - in terms of the use of cohesive mechanisms - relates positively to the development of linguistic awareness, and whether writers with deficits in composition writing could benefit from a therapeutic approach that stimulates their ability to reflect on language, the research included a stage of 30 sessions of group psychopedagogical treatment. Three pre-adolescent subjects, aged 11 to 14, made up the group (one subject was repeating the 5th grade and the others were attending the 6th grade of primary school). We conclude that stimulating metalinguistic abilities can help subjects with school difficulties to develop proficiency in textual production with respect to monitoring the use of cohesive mechanisms.

  5. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes whose functions were unknown. BEACON is available under the GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  6. Prepare-Participate-Connect: Active Learning with Video Annotation

    Science.gov (United States)

    Colasante, Meg; Douglas, Kathy

    2016-01-01

    Annotation of video provides students with the opportunity to view and engage with audiovisual content in an interactive and participatory way rather than in passive-receptive mode. This article discusses research into the use of video annotation in four vocational programs at RMIT University in Melbourne, which allowed students to interact with…

  7. The GATO gene annotation tool for research laboratories

    Directory of Open Access Journals (Sweden)

    A. Fujita

    2005-11-01

    Full Text Available Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from anywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to use it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. The minimum free disk space required is 2 MB.
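    The query-and-cache design described above, where annotations fetched from web resources are kept in a local database of all previous results, can be sketched in a few lines of Python. The endpoint URL, table schema and function name below are placeholders; GATO itself is implemented in PHP and Perl.

        import json
        import sqlite3
        import urllib.request

        ANNOTATION_URL = "https://example.org/api/annotate/{gene_id}"  # placeholder

        def annotate(gene_id, db_path="annotation_cache.sqlite"):
            """Return an annotation, querying the web resource only on a cache miss."""
            con = sqlite3.connect(db_path)
            con.execute("CREATE TABLE IF NOT EXISTS ann "
                        "(gene_id TEXT PRIMARY KEY, data TEXT)")
            row = con.execute("SELECT data FROM ann WHERE gene_id = ?",
                              (gene_id,)).fetchone()
            if row:                               # previous result already stored
                return json.loads(row[0])
            with urllib.request.urlopen(ANNOTATION_URL.format(gene_id=gene_id)) as r:
                data = json.load(r)
            con.execute("INSERT INTO ann VALUES (?, ?)", (gene_id, json.dumps(data)))
            con.commit()
            return data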

  8. A Selected Annotated Bibliography on Work Time Options.

    Science.gov (United States)

    Ivantcho, Barbara

    This annotated bibliography is divided into three sections. Section I contains annotations of general publications on work time options. Section II presents resources on flexitime and the compressed work week. In Section III are found resources related to these reduced work time options: permanent part-time employment, job sharing, voluntary…

  9. Propagating annotations of molecular networks using in silico fragmentation.

    Science.gov (United States)

    da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C

    2018-04-18

    The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
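    A stripped-down sketch of the propagation idea, blending each candidate's own in silico score with its structural similarity to the top candidates of neighboring nodes in the molecular network, is shown below. The data layout, weighting scheme and similarity function are illustrative assumptions, not the NAP algorithm itself; in practice the similarity would be something like a Tanimoto score over molecular fingerprints.

        def rerank(node, network, candidates, similarity, alpha=0.5):
            """Re-rank candidate structures for one MS/MS node using its neighbors.

            network: node -> list of neighboring nodes
            candidates: node -> list of (structure, score), best first
            similarity: function(structure_a, structure_b) -> float in [0, 1]
            """
            scores = {}
            for cand, own_score in candidates[node]:
                support = sum(similarity(cand, candidates[nbr][0][0])
                              for nbr in network[node])
                support /= max(len(network[node]), 1)
                scores[cand] = alpha * own_score + (1 - alpha) * support
            return sorted(scores, key=scores.get, reverse=True)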

  10. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa, but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of non-protein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences, which are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool is sequence-error tolerant, maintaining its capability to annotate highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained in metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  11. Online Metacognitive Strategies, Hypermedia Annotations, and Motivation on Hypertext Comprehension

    Science.gov (United States)

    Shang, Hui-Fang

    2016-01-01

    This study examined the effect of online metacognitive strategies, hypermedia annotations, and motivation on reading comprehension in a Taiwanese hypertext environment. A path analysis model was proposed based on the assumption that if English as a foreign language learners frequently use online metacognitive strategies and hypermedia annotations,…

  12. Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.

    Science.gov (United States)

    Wise, Michael J.

    2000-01-01

    Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…

  13. Automated evaluation of annotators for museum collections using subjective logic

    NARCIS (Netherlands)

    Ceolin, D.; Nottamkandath, A.; Fokkink, W.J.; Dimitrakos, Th.; Moona, R.; Patel, Dh.; Harrison McKnight, D.

    2012-01-01

    Museums are rapidly digitizing their collections, and face a huge challenge to annotate every digitized artifact in store. Therefore they are opening up their archives for receiving annotations from experts world-wide. This paper presents an architecture for choosing the most eligible set of

  14. Annotating with Propp's Morphology of the Folktale: Reproducibility and Trainability

    NARCIS (Netherlands)

    Fisseni, B.; Kurji, A.; Löwe, B.

    2014-01-01

    We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it

  15. Developing Annotation Solutions for Online Data Driven Learning

    Science.gov (United States)

    Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.

    2009-01-01

    Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…

  16. Automatic Annotation Method on Learners' Opinions in Case Method Discussion

    Science.gov (United States)

    Samejima, Masaki; Hisakane, Daichi; Komoda, Norihisa

    2015-01-01

    Purpose: The purpose of this paper is to automatically annotate learners' opinions with an attribute - a problem, a solution, or no annotation - in order to support the learners' discussion without a facilitator. The case method aims at discussing problems and solutions in a target case. However, the learners miss discussing some of the problems and solutions.…

  17. First generation annotations for the fathead minnow (Pimephales promelas) genome

    Science.gov (United States)

    Ab initio gene prediction and evidence alignment were used to produce the first annotations for the fathead minnow SOAPdenovo genome assembly. Additionally, a genome browser hosted at genome.setac.org provides simplified access to the annotation data in context with fathead minnow...

  18. Improving Microbial Genome Annotations in an Integrated Database Context

    Science.gov (United States)

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2013-01-01

    Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/. PMID:23424620
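    The rule-based phenotype prediction described above can be reduced, in spirit, to checking whether every function a rule requires is present among a genome's annotations. The rules in this sketch are invented for illustration and are not IMG's actual rule base.

        RULES = {
            "nitrate reducer": {"nitrate reductase"},
            "sulfate reducer": {"ATP sulfurylase", "APS reductase",
                                "sulfite reductase"},
        }

        def predict_phenotypes(annotated_functions):
            """Assert a phenotype when all functions required by its rule are present."""
            present = set(annotated_functions)
            return [p for p, required in RULES.items() if required <= present]

        print(predict_phenotypes(["nitrate reductase", "catalase"]))
        # -> ['nitrate reducer'], which could then be compared to observed phenotypes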

  19. Ten steps to get started in Genome Assembly and Annotation

    Science.gov (United States)

    Dominguez Del Angel, Victoria; Hjerde, Erik; Sterck, Lieven; Capella-Gutierrez, Salvador; Notredame, Cedric; Vinnere Pettersson, Olga; Amselem, Joelle; Bouri, Laurent; Bocs, Stephanie; Klopp, Christophe; Gibrat, Jean-Francois; Vlasova, Anna; Leskosek, Brane L.; Soler, Lucile; Binzer-Panchal, Mahesh; Lantz, Henrik

    2018-01-01

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). PMID:29568489

  20. Sharing Map Annotations in Small Groups: X Marks the Spot

    Science.gov (United States)

    Congleton, Ben; Cerretani, Jacqueline; Newman, Mark W.; Ackerman, Mark S.

    Advances in location-sensing technology, coupled with an increasingly pervasive wireless Internet, have made it possible (and increasingly easy) to access and share information in the context of one’s geospatial location. We conducted a four-phase study, with 27 students, to explore the practices surrounding the creation, interpretation and sharing of map annotations in specific social contexts. We found that annotation authors consider multiple factors when deciding how to annotate maps, including the perceived utility to the audience and how their contributions will reflect on the image they project to others. Consumers of annotations value the novelty of information, but must be convinced of the author’s credibility. In this paper we describe our study, present the results, and discuss implications for the design of software for sharing map annotations.

  1. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.

  2. Semantator: annotating clinical narratives with semantic web ontologies.

    Science.gov (United States)

    Song, Dezhao; Chute, Christopher G; Tao, Cui

    2012-01-01

    To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manually annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfying. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free-text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.
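    To give a flavor of what ontology-grounded annotations of clinical text look like in Semantic Web standards, the sketch below uses the rdflib package to type two document fragments as ontology instances and link them with an object property. The namespace and term names are invented; this is not Semantator's actual output format.

        from rdflib import Graph, Literal, Namespace, RDF

        EX = Namespace("http://example.org/clinical#")  # placeholder ontology
        g = Graph()

        g.add((EX.frag_17, RDF.type, EX.Medication))        # span "aspirin"
        g.add((EX.frag_17, EX.hasText, Literal("aspirin")))
        g.add((EX.frag_23, RDF.type, EX.Problem))           # span "headache"
        g.add((EX.frag_17, EX.treats, EX.frag_23))          # link the instances

        print(g.serialize(format="turtle"))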

  3. Annotated bibliography of software engineering laboratory literature

    Science.gov (United States)

    Kistler, David; Bristow, John; Smith, Don

    1994-01-01

    This document is an annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory. Nearly 200 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. This document has been updated and reorganized substantially since the original version (SEL-82-006, November 1982). All materials have been grouped into eight general subject areas for easy reference: (1) The Software Engineering Laboratory; (2) The Software Engineering Laboratory: Software Development Documents; (3) Software Tools; (4) Software Models; (5) Software Measurement; (6) Technology Evaluations; (7) Ada Technology; and (8) Data Collection. This document contains an index of these publications classified by individual author.

  4. Preprocessing Greek Papyri for Linguistic Annotation

    Directory of Open Access Journals (Sweden)

    Vierros, Marja

    2017-08-01

    Full Text Available Greek documentary papyri form an important direct source for Ancient Greek. This source has been exploited surprisingly little in Greek linguistics due to a lack of good tools for searching linguistic structures. This article presents a new tool and digital platform, “Sematia”, which enables transforming the digital texts available in TEI EpiDoc XML format into a format which can be morphologically and syntactically annotated (treebanked), and where the user can add new metadata concerning the text type, writer and handwriting of each act of writing. An important aspect of this process is taking into account the original surviving writing vs. the standardization of language and the supplements made by the editors. This is performed by creating two different layers of the same text. The platform is in its early development phase. Ongoing and future developments, such as tagging linguistic variation phenomena as well as queries performed within Sematia, are discussed at the end of the article.

  5. Promoting positive parenting: an annotated bibliography.

    Science.gov (United States)

    Ahmann, Elizabeth

    2002-01-01

    Positive parenting is built on respect for children and helps develop self-esteem, inner discipline, self-confidence, responsibility, and resourcefulness. Positive parenting is also good for parents: parents feel good about parenting well. It builds a sense of dignity. Positive parenting can be learned. Understanding normal development is a first step, so that parents can distinguish common behaviors in a stage of development from "problems." Central to positive parenting is developing thoughtful approaches to child guidance that can be used in place of anger, manipulation, punishment, and rewards. Support for developing creative and loving approaches to meet special parenting challenges, such as temperament, disabilities, separation and loss, and adoption, is sometimes necessary as well. This annotated bibliography offers resources to professionals helping parents and to parents wishing to develop positive parenting skills.

  6. Entrainment: an annotated bibliography. Interim report

    International Nuclear Information System (INIS)

    Carrier, R.F.; Hannon, E.H.

    1979-04-01

    The 604 annotated references in this bibliography on the effects of pumped entrainment of aquatic organisms through the cooling systems of thermal power plants were compiled from published and unpublished literature and cover the years 1947 through 1977. References to published literature were obtained by searching large-scale commercial data bases, ORNL in-house-generated data bases, relevant journals, and periodical bibliographies. The unpublished literature is a compilation of Sections 316(a) and 316(b) demonstrations, environmental impact statements, and environmental reports prepared by the utilities in compliance with Federal Water Pollution Control Administration regulations. The bibliography includes references on monitoring studies at power plant sites, laboratory studies of physical and biological effects on entrained organisms, engineering strategies for the mitigation of entrainment effects, and selected theoretical studies concerned with the methodology for determining entrainment effects

  7. Interpretive analysis of the textual codes in the Parrot and Merchant story

    Directory of Open Access Journals (Sweden)

    Zohreh Najafi

    2016-06-01

    Full Text Available Abstract Narratology is a branch of semiology that considers any kind of narrative, literary or non-literary, verbal or visual, story or non-story, and then specifies its plot. One of the most important subjects in narratology is the interpretation of textual codes, on the basis of which many hidden meanings can be understood. Molavi expresses many mystic points in the Mathnavi in narrative form; some of his ideas can be grasped on a first reading, but they are not all he wants to say, and in every narrative a complex of different meanings lies hidden which can be revealed by considering its codes. In the terminology of semiology, a code is a particular state, within the historical process of all indexes and signs, singled out for synchronic analysis. In the study of literary texts, textual codes are the most important: textual codes are codes whose scope extends beyond a few particular texts and which link those texts to one another interpretively. Aesthetic codes are a group of textual codes used in different arts such as poetry, painting, theatre and music, and they are expressed in the same manner as art and literature. In reviewing the narratives of the Mathnavi, attention to narrative codes is essential, and these codes can be treated as paralinguistic signs; they carry the interpretive forms used by authors and commentators of texts. In the story of the parrot and the merchant, the textual codes are as follows: the parrot is a symbolic code representing all the features of the human soul; the merchant acts as a cultural code indicating the rich, who are always solicitous about their finances and unaware of the spiritual world; and India, as a signifying code, indicates the spiritual world. As for hermeneutic codes, Molana uses codes which act as turning points, through which the addressee can understand hidden meanings which haven

  8. Suggestions toward some discourse-analytic approaches to text difficulty: with special reference to ‘T-unit configuration’ in the textual unfolding

    Directory of Open Access Journals (Sweden)

    Kazem Lotfipour-Saedi

    2015-01-01

    Full Text Available This paper presents some suggestions towards discourse-analytic approaches for ESL/EFL education, with a focus on identifying the textual forms which can contribute to textual difficulty. Textual difficulty/comprehensibility, rather than being purely text-based or reader-dependent, is a matter of interaction between text and reader. The paper looks at some of the textual factors which can be argued to make a text more or less readable for the same reader. The main focus here is on academic texts. The high cognitive load and low readability of the expository texts in various academic disciplines will be argued to stem from certain textual strategies, as well as from variations in the configuration of T-units, the prime scaffolding of the textualization process. Different categories of these variations are discussed and exemplified from a few academic and expository registers. More extensive textual analyses will, of course, be necessary in order to make evidential suggestions about possible correlations between certain types and clusters of T-unit configurations, on the one hand, and cognitive load and readability indices, on the other, across various academic registers, genres and disciplines.

  9. The effectiveness of annotated (vs. non-annotated) digital pathology slides as a teaching tool during dermatology and pathology residencies.

    Science.gov (United States)

    Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A

    2014-06-01

    With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  10. A document processing pipeline for annotating chemical entities in scientific documents.

    Science.gov (United States)

    Campos, David; Matos, Sérgio; Oliveira, José L

    2015-01-01

    The recognition of drugs and chemical entities in text is a very important task within the field of biomedical information extraction, given the rapid growth in the amount of published texts (scientific papers, patents, patient records) and the relevance of these and other related concepts. If done effectively, this could allow exploiting such textual resources to automatically extract or infer relevant information, such as drug profiles, relations and similarities between drugs, or associations between drugs and potential drug targets. The objective of this work was to develop and validate a document processing and information extraction pipeline for the identification of chemical entity mentions in text. We used the BioCreative IV CHEMDNER task data to train and evaluate a machine-learning based entity recognition system. Using a combination of two conditional random field models, a selected set of features, and a post-processing stage, we achieved F-measure results of 87.48% in the chemical entity mention recognition task and 87.75% in the chemical document indexing task. We present a machine learning-based solution for automatic recognition of chemical and drug names in scientific documents. The proposed approach applies a rich feature set, including linguistic, orthographic, morphological, dictionary matching and local context features. Post-processing modules are also integrated, performing parentheses correction, abbreviation resolution and filtering erroneous mentions using an exclusion list derived from the training data. The developed methods were implemented as a document annotation tool and web service, freely available at http://bioinformatics.ua.pt/becas-chemicals/.
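    The shape of the feature-based CRF solution can be sketched with the sklearn-crfsuite package, used here as a stand-in toolkit rather than the authors' own implementation. The feature set below is a small illustrative subset of the orthographic, morphological and context features the paper describes, and the one-sentence "corpus" is invented.

        import sklearn_crfsuite

        def token_features(tokens, i):
            """Orthographic and local-context features for one token."""
            t = tokens[i]
            return {
                "lower": t.lower(),
                "is_upper": t.isupper(),
                "has_digit": any(c.isdigit() for c in t),
                "suffix3": t[-3:],
                "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
                "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
            }

        sent = ["Aspirin", "inhibits", "cyclooxygenase", "."]
        X = [[token_features(sent, i) for i in range(len(sent))]]
        y = [["B-CHEM", "O", "B-CHEM", "O"]]   # BIO labels for chemical mentions

        crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                                   max_iterations=50)
        crf.fit(X, y)
        print(crf.predict(X))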

  11. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation - the process of transposing objects or events represented in a video or still image to the semantic level - may involve human interactions and computer-assisted solutions. Marine image annotation software (MIAS) have enabled over 500 publications to date. We review their functioning, application trends and developments, comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input are basically a graphical user interface, with a video player or image browser that recognizes a specific time code or image code, allowing events to be logged in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software by their capability to integrate data associated with video collection, the simplest being the position coordinates of the video recording platform. MIAS operate in three main ways: annotating events in real time, annotating after acquisition, and interacting with a database. These range from simple annotation interfaces to full onboard data management systems with a variety of toolboxes. Advanced packages allow users to input and display data from multiple sensors or multiple annotators via intranet or internet. Posterior human-mediated annotation often includes tools for data display and image analysis, e.g. length, area, image segmentation and point counts, and in a few cases the possibility of browsing and editing previous dive logs or analyzing the annotations. The interaction with a database allows the automatic integration of annotations from different surveys, repeated annotation and collaborative annotation of shared datasets, and browsing and querying of data. Progress in the field of automated annotation is mostly in post processing, for stable platforms or still images.
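    At their core, the human-input tools described above log time-stamped, optionally geo-referenced events against a video or image identifier. A minimal sketch of such an event record follows; the field names are illustrative, not taken from any of the reviewed packages.

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class AnnotationEvent:
            video_id: str                  # dive video or still-image identifier
            timecode: float                # seconds from the start of the recording
            label: str                     # semantic label, ideally from a vocabulary
            annotator: str
            lat: Optional[float] = None    # platform position, if a nav feed exists
            lon: Optional[float] = None
            notes: str = ""

        log = [AnnotationEvent("dive042.mov", 1283.4, "cold-water coral",
                               annotator="jsmith", lat=48.12, lon=-9.87)]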

  12. Text in context: a textual-linguistic approach to Amos 4: 7-8

    Directory of Open Access Journals (Sweden)

    del Barco del Barco, Francisco Javier

    2002-12-01

    Full Text Available This article will study Amos 4:7-8 from a textlinguistic approach: the form of this section will be analyzed within the structure of the chapter in which it is inserted. Such an analysis is needed because the set of verb forms used seems to be different from the rest of verb forms used in the chapter. While the whole chapter tends to be structured as a brief chain of narrative passages with wayyiqtol, the structure of Amos 4:7-8 seems to be a predictive section -developed through weqatal- inserted or pasted in the middle of the chapter. Translations usually do not note the difference between the set of verb forms used. A textlinguistic analysis of Amos 4:7-8 will show that the kind of discourse used here is different from the one used in the rest of the chapter, and, therefore, this difference should be reflected in the translation. The specific function of some discourse types is also discussed.

  13. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    Directory of Open Access Journals (Sweden)

    Jianfang Cao

    2015-01-01

    Full Text Available With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.
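
    To make the idea of a fuzzy membership degree concrete, here is a minimal Python sketch of triangular membership functions applied to hypothetical classifier scores; the emotion labels, set boundaries and scores are all invented and are not the functions used in the study.

```python
def triangular_membership(x, a, b, c):
    """Degree to which x belongs to a fuzzy set with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical emotion scores in [0, 1], as a trained classifier might output them.
classifier_scores = {"calm": 0.72, "exciting": 0.31}

# Fuzzy sets over the score axis, so an image can be, say, 0.6 "strongly calm"
# rather than simply "calm" or "not calm".
fuzzy_sets = {"weak": (0.0, 0.2, 0.4), "moderate": (0.3, 0.5, 0.7), "strong": (0.6, 0.8, 1.0)}

for emotion, score in classifier_scores.items():
    memberships = {name: round(triangular_membership(score, *abc), 2)
                   for name, abc in fuzzy_sets.items()}
    print(emotion, memberships)
```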

  14. Ontology modularization to improve semantic medical image annotation.

    Science.gov (United States)

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time-consuming and labor-intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.
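
    The kind of fragment extraction described here can be pictured, at its simplest, as an upward closure over is-a links from a set of seed concepts. A minimal Python sketch, with an invented toy hierarchy standing in for a large anatomy ontology:

```python
# A toy is-a hierarchy standing in for a large anatomy ontology (names are invented).
IS_A = {
    "lymph node": "lymphoid organ",
    "spleen": "lymphoid organ",
    "lymphoid organ": "organ",
    "organ": "anatomical structure",
    "femur": "bone",
    "bone": "anatomical structure",
}

def extract_module(seeds):
    """Return the seeds plus all of their is-a ancestors: a small, upper-closed fragment."""
    module = set()
    for concept in seeds:
        while concept is not None:
            module.add(concept)
            concept = IS_A.get(concept)  # None once we reach a root
    return module

# Only lymphoma-relevant concepts seed the module; 'femur' and 'bone' stay outside it.
print(sorted(extract_module({"lymph node", "spleen"})))
```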

  15. The caBIG annotation and image Markup project.

    Science.gov (United States)

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Sepukar, Kastubh; Rubin, Daniel L

    2010-04-01

    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of meta-data about whom, where, and how the image was acquired, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotation to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with both of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism, for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.
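
    As a loose illustration of serializing an annotation together with its graphical markup in a form that is both human- and machine-readable, here is a minimal Python sketch using the standard library; the element and attribute names are invented and do not follow the actual AIM schema.

```python
import xml.etree.ElementTree as ET

def annotation_to_xml(ann):
    """Serialize one image annotation plus its graphical markup to XML."""
    root = ET.Element("ImageAnnotation", name=ann["name"])
    ET.SubElement(root, "Finding", meaning=ann["finding"])
    markup = ET.SubElement(root, "Markup", shape="line")
    for x, y in ann["points"]:
        ET.SubElement(markup, "Point", x=str(x), y=str(y))
    return ET.tostring(root, encoding="unicode")

ann = {"name": "lesion-1", "finding": "enlarged lymph node", "points": [(102, 88), (131, 88)]}
print(annotation_to_xml(ann))
```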

  16. Annotation of the Evaluative Language in a Dependency Treebank

    Directory of Open Access Journals (Sweden)

    Šindlerová Jana

    2017-12-01

    Full Text Available In the paper, we present our efforts to annotate evaluative language in the Prague Dependency Treebank 2.0. The project is a follow-up to a series of annotations of small plain-text corpora. It uses automatic identification of potentially evaluative nodes by mapping a Czech subjectivity lexicon onto syntactically annotated data. These nodes are then manually checked by an annotator and either dismissed as standing in a non-evaluative context or confirmed as evaluative. In the latter case, information about the polarity orientation and the source and target of evaluation is added by the annotator. The annotation revealed several advantages and disadvantages of the chosen framework. The advantages include a more structured and easy-to-handle environment for the annotator, visibility of the syntactic patterning of the evaluative state, effective handling of discontinuous structures, and a new perspective on the influence of good/bad news. The disadvantages include limited capability of treating cases where evaluation is spread across several syntactically connected nodes at once, limited capability of treating metaphorical expressions, and the disregard of the effects of negation and intensification in the current scheme.
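
    The automatic pre-annotation step (mapping a subjectivity lexicon onto treebank nodes to propose candidates for manual checking) can be sketched in a few lines of Python; the toy tree, lexicon and polarity labels below are invented.

```python
# Toy dependency tree as (node_id, lemma, head_id) triples, a stand-in for treebank nodes.
tree = [(1, "film", 2), (2, "be", 0), (3, "surprisingly", 4), (4, "good", 2)]

# Tiny subjectivity lexicon mapping lemmas to a prior polarity.
lexicon = {"good": "+", "bad": "-", "terrible": "-"}

# Automatic pre-annotation: nodes whose lemma is in the lexicon become *candidates*;
# a human annotator would then confirm or dismiss each one in context.
candidates = [(node_id, lemma, lexicon[lemma])
              for node_id, lemma, _head in tree if lemma in lexicon]
print(candidates)  # [(4, 'good', '+')]
```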

  17. MimoSA: a system for minimotif annotation

    Directory of Open Access Journals (Sweden)

    Kundeti Vamsi

    2010-06-01

    Full Text Available Abstract Background Minimotifs are short peptide sequences within one protein, which are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. There are reports of many minimotifs in the primary literature, which have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA enables the visualization, organization, selection and editing functions of minimotifs and their attributes in the MnM database. For the literature components, MimoSA provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach, which is based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high performance relational model of minimotif syntax. MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to
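
    TextMine's word-correlation scoring is not detailed in this record, but a minimal stand-in for this kind of paper scoring is cosine similarity between word-count vectors, sketched below in Python with an invented reference text and invented candidate abstracts.

```python
from collections import Counter
import math

def vector(text):
    """Bag-of-words count vector for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Papers already annotated as containing minimotifs act as the positive reference.
reference = vector("peptide motif binds sh3 domain interaction motif")
candidates = {
    "paper A": "a short peptide motif mediates binding to the sh3 domain",
    "paper B": "field observations of migratory birds over two seasons",
}
for name, abstract in candidates.items():
    print(name, round(cosine(reference, vector(abstract)), 2))
```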

  18. PCAS – a precomputed proteome annotation database resource

    Directory of Open Access Journals (Sweden)

    Luo Jingchu

    2003-11-01

    Full Text Available Abstract Background Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif- or domain-based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources. Results We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptides and transmembrane (TM) regions are predicted using SignalP and TMHMM respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains the human IPI, mouse IPI, and rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteomes. PCAS is available at http://pak.cbi.pku.edu.cn/proteome/gca.php Conclusion PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized queries so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS can be used as a discovery platform. We plan to update PCAS twice a year and will upgrade it when new proteome annotation algorithms become available.
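
    The integration step (grouping hits from different motif and domain tools under a shared InterPro identifier) reduces, at its simplest, to a keyed merge. A minimal Python sketch with invented accessions and coordinates:

```python
# Per-tool domain hits for one protein; the outputs below are invented for illustration.
hits = {
    "hmmpfam": [("PF00069", 12, 270)],   # Pfam accession, start, end
    "pfscan":  [("PS50011", 10, 265)],   # PROSITE profile accession, start, end
}

# Cross-references mapping member-database accessions to one InterPro entry.
to_interpro = {"PF00069": "IPR000719", "PS50011": "IPR000719"}

# Integrate: group hits from all tools under their shared InterPro identifier.
integrated = {}
for tool, tool_hits in hits.items():
    for acc, start, end in tool_hits:
        ipr = to_interpro.get(acc, "unmapped")
        integrated.setdefault(ipr, []).append((tool, acc, start, end))
print(integrated)
```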

  19. Preparing for opening night: temporal boundary objects in textually-mediated professional practice

    Directory of Open Access Journals (Sweden)

    Elisabeth Davies

    2004-01-01

    Full Text Available The authors report on two projects in which the role of documents as temporal boundary objects mediating information practices across multiple timelines was explored. It has been suggested that studying workplace documents will uncover the information practices of professionals beyond traditional information needs and uses studies. Two workplaces were studied: a professional theatre production and a midwifery clinic. Both settings are communities constructed partly through textual dynamics and both have a pre-production phase leading to an opening night. In the theatre setting, qualitative interviews with the cast and crew and document analysis of the prompt book were the means of data collection. The midwifery clinic setting was investigated by means of interviews and follow-ups with sixteen midwife-client pairs and document analysis of the antenatal record. Preliminary thematic analysis pertaining to time and information was conducted on interview transcripts and the relevant documents. It was possible to show several instances of both the prompt book and the antenatal record being treated as a timeline by the various professionals using them. The authors conclude with a discussion of the temporal aspects of professionals' information practices as revealed by these two projects and encourage further document-focused research.

  20. The language of racism. Textual testimonies of Jewish-Arab hostility in the Israeli Academia

    Directory of Open Access Journals (Sweden)

    Tamar Heger

    2015-02-01

    Full Text Available The persistent Jewish-Arab conflict is present in every aspect of life in Israeli society, and its echoes penetrate the everyday reality of higher-education institutions. Feelings of mutual hostility among Arab and Jewish students, faculty and administration are common experiences on Israeli campuses. This article analyzes two textual expressions of this mutual resentment that were circulated in 2011 at Tel Hai College, Israel. One of the texts was produced by a Muslim Arab student association and the other by a Zionist Jewish organization; both kinds of groups are present on every campus in Israel. Despite the significant difference in the political position occupied by each organization in the Israeli power structure, we argue that these texts share similar attitudes to the conflict and parallel operational strategies. The paper demonstrates how these texts attempt to encourage mutual hostility between Jews and Arabs by employing racist and violent discourse. The article tries to explain the silence of the college administration and faculty in the face of these racist acts, and subsequently outlines a vision of a responsible academy that would banish all acts of racism.

  1. In the Beginning was the Genome: Genomics and the Bi-textuality of Human Existence.

    Science.gov (United States)

    Zwart, H A E Hub

    2018-04-01

    This paper addresses the cultural impact of genomics and the Human Genome Project (HGP) on human self-understanding. Notably, it addresses the claim made by Francis Collins (director of the HGP) that the genome is the language of God and the claim made by Max Delbrück (founding father of molecular life sciences research) that Aristotle must be credited with having predicted DNA as the soul that organises bio-matter. From a continental philosophical perspective I will argue that human existence results from a dialectical interaction between two types of texts: the language of molecular biology and the language of civilisation; the language of the genome and the language of our socio-cultural, symbolic ambiance. Whereas the former ultimately builds on the alphabets of genes and nucleotides, the latter is informed by primordial texts such as the Bible and the Quran. In applied bioethics deliberations on genomics, science is easily framed as liberating and progressive, religious world-views as conservative and restrictive (Zwart 1993). This paper focusses on the broader cultural ambiance of the debate to discern how the bi-textuality of human existence is currently undergoing a transition, as not only the physiological, but also the normative dimension is being reframed in biomolecular and terabyte terms.

  2. Nuevas alfabetizaciones en un entorno multimodal: nuevas necesidades lectoras para un entorno textual múltiple

    Directory of Open Access Journals (Sweden)

    Juan Patricio Sánchez-Claros

    2016-01-01

    Full Text Available The new technologies of information, the development of the media and the massive presence of multimedia devices have brought a renewal of the traditional concept of literacy. Multimodality in the media, in the dissemination of knowledge and in communication formats has brought a change in the way media objects and each of their integral elements are received. This multimodal environment is part of the daily life of students, who access it through a scenario of screens and images across a virtual environment in which the textual and the visual are integrated as an informational continuum that demands new interpretation skills from users. Consequently, a new approach to media literacy should be the subject of attention from educators. Media literacy implies the need to address new visual skills, new reading skills and especially new skills of integration and analysis, among which the necessary critical apparatus cannot be set aside. This paper discusses the content of these skills and some critical tools for approaching multimodal discourse, and analyses the role of the four strata of discourse, design, production and distribution that are typical of Critical Analysis of Multimodal Discourse.

  3. Dis/Graceful Liberties: Textual Libertinism/ Libertine Texts in J.M. Coetzee’s Disgrace

    Directory of Open Access Journals (Sweden)

    Driss Hager Ben

    2017-12-01

    Full Text Available This essay addresses J.M. Coetzee’s Disgrace, a Booker Prize winner in 1999. The novel captures South African political and cultural turmoil attending the post-apartheid transitional period. Far from overlooking the political allegory, I propose instead to expand on a topic only cursorily developed elsewhere, namely liberty and license. The two terms foreground the textual dynamics of the novel as they compete and/or negotiate meaning and ascendency. I argue that Disgrace is energized by Coetzee’s belief in a total liberty of artistic production. Sex is philosophically problematized in the text and advocated as a serious issue that deserves artistic investigation without restriction or censorship. This essay looks into the subtle libertinism in Coetzee’s text, which displays pornographic overtones without exhibiting a flamboyant libertinage. Disgrace acquires its libertine gesture from its dialogue with several literary works steeped in libertinism. The troubled relationship between the aesthetic and the ethical yields an ambiguous text that invites a responsible act of reading.

  4. Chawton Novels Online, Women’s Writing 1751-1834 and Computer-Aided Textual Analysis

    Directory of Open Access Journals (Sweden)

    Anne Bandry-Scubbi

    2015-10-01

    Full Text Available Using Chawton House Library’s “Novels Online,” several corpora have been set up for a computer-aided textual analysis of the use of vocabulary by women writing “domestic novels” from 1752 to 1834. This corpus stylistics approach makes it possible to map texts according to their word usage and to identify quantitative keywords which provide vocabulary profiles through comparison and contrast with contemporary male and female canonical texts. Items identified include pronouns, markers of dialogue and of intensity; others can be grouped into specific lexical fields such as feelings. One text from the collection then forms the object of a case-study to explore a paradox: although Jane Taylor’s use of vocabulary in her 1817 Rachel appears the most representative of the corpus made up of 42 novels by women, this Chawton text has been called “a highly original tale.” Methodology and findings are both presented to address the challenge of identifying features which constitute typicality.
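
    Quantitative keywords of the kind described here are commonly ranked with Dunning's log-likelihood statistic, comparing a word's frequency in the study corpus against a reference corpus. A minimal Python sketch with invented counts (not the study's own figures):

```python
import math

def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Dunning's log-likelihood keyness of a word in corpus A versus corpus B."""
    expected_a = total_a * (freq_a + freq_b) / (total_a + total_b)
    expected_b = total_b * (freq_a + freq_b) / (total_a + total_b)
    ll = 0.0
    if freq_a:
        ll += freq_a * math.log(freq_a / expected_a)
    if freq_b:
        ll += freq_b * math.log(freq_b / expected_b)
    return 2 * ll

# A word occurring 5200 times in a 1,000,000-word study corpus versus 3100 times
# in a 1,000,000-word reference corpus (invented counts).
print(round(log_likelihood(5200, 1_000_000, 3100, 1_000_000), 1))
```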

  5. Location and Unlocation: Examining Gender and Telephony through Autoethnographic Textual and Visual Methods

    Directory of Open Access Journals (Sweden)

    Lia Bryant BSW, PhD in Sociology

    2013-02-01

    Full Text Available Studies on gender and telephony tend to be quantitative and depict the purposes for which women and men use mobile telephones and landlines. Qualitative studies on the topic predominantly rely on face-to-face interviews to examine how telephone use genders space. We suggest these traditional methods of data collection leave unexamined the emotional and social relationships that emerge and are enabled by telephone use, which at times reconfigure and gender social spaces. In this article we present a collaborative autoethnographic inquiry based on our own telephone lives. We introduce a reflexive visual and textual methodological design, specifically diary notes, memory work, and photography, developed from our lives as researcher and researched. We examine an important theme in our findings, the physical placement of the telephone and the phone holder's awareness of the physicality of the telephone, which illustrates the importance of our methodological choices. We show how the placement of the phone by the users both genders space and creates emotional spaces.

  6. Comprensión y producción textual narrativa en preescolares

    Directory of Open Access Journals (Sweden)

    Luz Stella López Silva

    2014-01-01

    Full Text Available This study sought to identify and describe the narrative text comprehension and production skills of 158 kindergarten (transition-grade) students from strata 1 and 2 public schools in the city of Barranquilla (Colombia). To assess this ability, the children were asked to retell a text that had previously been read to them. The language samples were analyzed using the SALT software, and the SPSS software was then used to run the descriptive analyses. The organizational structure (microstructure, macrostructure and superstructure) was considered at the literal and inferential comprehension levels. Most of the children were found to show narrative development appropriate for their age; however, inferential elaboration was scarce, which possibly affected their comprehension of the story.

  7. Consciência da "estrutura argumentativa" e produção textual

    Directory of Open Access Journals (Sweden)

    Regina Pinheiro

    Full Text Available The cognitivist assumption that awareness of an "argumentative schema" guides individuals' text production is far from settled. Alternatively, it has been suggested that the elements included in a text are determined by awareness of the parameters of the production situation (purpose, addressee, etc.). Starting from this controversy, we investigated to what extent reflecting on the "argumentative structure" has an impact on the elements that children and young adults incorporate into their texts. The analysis of the results showed that point of view and justification were the elements considered indispensable to argumentative writing. Anticipating counter-arguments was seen as relevant to achieving the persuasive goal of the text only when those counter-arguments were rebutted. Reflecting on elements of the "argumentative structure" did not always correspond to their inclusion in the texts produced. Such inclusion depended primarily on an assessment of which elements would contribute most to achieving the persuasive purpose of the text - that is, on rhetorical awareness.

  8. Intercultural Passages in Ottó Tolnai’s Textual Universe

    Directory of Open Access Journals (Sweden)

    Ispánovics Csapó Julianna

    2014-12-01

    Full Text Available The literary palette of Tolnai's textual universe within the Hungarian literature of Vojvodina is based, among other things, upon the intertwining of various cultural entities. The social and cultural spaces of "Big Yugoslavia," the phenomena, figures, and works of the European-oriented Yugoslav and ethnic culture (literature, painting, book publishing, theatre, sports, etc.), the mentalities of the migrant worker's life, and the legends of the Tito cult embed the narrative procedures of particular texts by Tolnai in a rich culture-historical context. Similarly to the model of Valéry's Mediterranean, the narrator's Janus-faced Yugoslavia simultaneously generates concrete and utopian spaces, folding upon one another. Above the micro-spaces (towns, houses, flats) evolving along the traces of reality, there float the Proustian scents and colours of the Adriatic sea (salt, azure, mimosa, lavender, laurel). The nostalgia towards the lost Eden rises high and waves above the "grand form" of Big Yugoslavia, whose related space is the Monarchy. The counterpoints of the grand forms are "the small, void forms": provinces, regions (Vojvodina, North Bačka) and the micro-spaces coded into them. The text analyses of the paper examine the intercultural motions and identity-forming culture-historical elements of the outlined space system.

  9. Discerning applicants’ interests in rural medicine: a textual analysis of admission essays

    Directory of Open Access Journals (Sweden)

    Carol L. Elam

    2015-03-01

    Full Text Available Background: Despite efforts to construct targeted medical school admission processes using applicant-level correlates of future practice location, accurately gauging applicants’ interests in rural medicine remains an imperfect science. This study explores the usefulness of textual analysis to identify rural-oriented themes and values underlying applicants’ open-ended responses to admission essays. Methods: The study population consisted of 75 applicants to the Rural Physician Leadership Program (RPLP) at the University of Kentucky College of Medicine. Using WordStat, a proprietary text analysis program, applicants’ American Medical College Application Service personal statement and an admission essay written at the time of interview were searched for predefined keywords and phrases reflecting rural medical values. From these text searches, derived scores were then examined relative to interviewers’ subjective ratings of applicants’ overall acceptability for admission to the RPLP program and likelihood of practicing in a rural area. Results: The two interviewer-assigned ratings of likelihood of rural practice and overall acceptability were significantly related. A statistically significant relationship was also found between the rural medical values scores and estimated likelihood of rural practice. However, there was no association between rural medical values scores and subjective ratings of applicant acceptability. Conclusions: That applicants’ rural values in admission essays were not related to interviewers’ overall acceptability ratings indicates that other factors played a role in the interviewers’ assessments of applicants’ acceptability for admission.

  10. Discerning applicants' interests in rural medicine: a textual analysis of admission essays.

    Science.gov (United States)

    Elam, Carol L; Weaver, Anthony D; Whittler, Elmer T; Stratton, Terry D; Asher, Linda M; Scott, Kimberly L; Wilson, Emery A

    2015-01-01

    Despite efforts to construct targeted medical school admission processes using applicant-level correlates of future practice location, accurately gauging applicants' interests in rural medicine remains an imperfect science. This study explores the usefulness of textual analysis to identify rural-oriented themes and values underlying applicants' open-ended responses to admission essays. The study population consisted of 75 applicants to the Rural Physician Leadership Program (RPLP) at the University of Kentucky College of Medicine. Using WordStat, a proprietary text analysis program, applicants' American Medical College Application Service personal statement and an admission essay written at the time of interview were searched for predefined keywords and phrases reflecting rural medical values. From these text searches, derived scores were then examined relative to interviewers' subjective ratings of applicants' overall acceptability for admission to the RPLP program and likelihood of practicing in a rural area. The two interviewer-assigned ratings of likelihood of rural practice and overall acceptability were significantly related. A statistically significant relationship was also found between the rural medical values scores and estimated likelihood of rural practice. However, there was no association between rural medical values scores and subjective ratings of applicant acceptability. That applicants' rural values in admission essays were not related to interviewers' overall acceptability ratings indicates that other factors played a role in the interviewers' assessments of applicants' acceptability for admission.
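
    The derived scores described here come from searching essays for predefined keywords and phrases. A minimal Python stand-in for that step is sketched below; the term list and essay are invented and are not the study's actual dictionary.

```python
import re

# Hypothetical keyword list standing in for the study's predefined rural-values terms.
RURAL_TERMS = ["rural", "small town", "farm", "community", "underserved", "hometown"]

def rural_values_score(essay: str) -> int:
    """Count occurrences of predefined rural-oriented keywords and phrases."""
    text = essay.lower()
    return sum(len(re.findall(r"\b" + re.escape(term) + r"\b", text))
               for term in RURAL_TERMS)

essay = ("Growing up in a small town, I watched our only family doctor serve an "
         "underserved farm community, and I want to return to my hometown to practice.")
print(rural_values_score(essay))  # 5
```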

  11. COERÊNCIA IMAGÉTICO-TEXTUAL NO GÊNERO QUADRINHOS?

    Directory of Open Access Journals (Sweden)

    Jack Brandão

    2015-12-01

    Full Text Available This article undertakes an analytical study of the verbal and non-verbal languages present in comics (strips), in order to verify how the union of the two languages present in the genre works as an effective mechanism for aiding comprehension of the transmitted message. It also emphasizes the importance of the author-reader partnership and of other elements as essential for the reading of the (con)text to take place, stressing the force of the non-verbal text in this process. This investigation is grounded in the theoretical assumptions of Text Linguistics and, in particular, in the concept of coherence, drawing on the reading and analysis of illustrative strips featuring the character Mafalda, a creation of the Argentine cartoonist Quino in the 1960s.

  12. The third language: A recurrent textual restriction that translators come across in audiovisual translation.

    Directory of Open Access Journals (Sweden)

    Montse Corrius Gimbert

    2005-01-01

    Full Text Available If the process of translating is not at all simple, the process of translating an audiovisual text is still more complex. Apart from technical problems such as lip synchronisation, there are other factors to be considered, such as the use of language and the textual structures deemed appropriate to the channel of communication. Bearing in mind that most of the films we continually see on our screens were and are produced in the United States, there is an increasing need to translate them into the different languages of the world. But sometimes the source audiovisual text contains more than one language, and thus a new problem arises: translators face additional difficulties in translating this “third language” (language or dialect) into the corresponding target culture. There are many films containing two languages in the original version, but in this paper we will focus mainly on three films: Butch Cassidy and the Sundance Kid (1969), Raid on Rommel (1999) and Blade Runner (1982). This paper aims at briefly illustrating different solutions which may be applied when we come across a “third language”.

  13. True Detective Stories: Media Textuality and the Anthology Format between Remediation and Transmedia Narratives

    Directory of Open Access Journals (Sweden)

    Cristina Demaria

    2014-12-01

    Full Text Available Through an analysis of the first season of the HBO TV series True Detective (2014-), the essay focuses on the renewed anthology format of contemporary seriality as a way to inscribe the form of the novel in the transmedia imagination and its narrative models. While the first part of the essay concentrates on Media and Literary Studies debates on the status of media texts, their materiality and the transformations of their contents in the participatory and convergent culture of prosumers, the second part is devoted to an in-depth reading of some of the main features of True Detective’s first season: from the ways it remediates many other genres and media, to how - as an audiovisual syncretic text - it plays with dialogue, cinematography, music and its temporal, spatial and visual enunciative strategies and positions in order to construct a (quasi)dystopic narrative of America as an after-image or, better, a post-collapse America and its Southern Gothic landscapes. In the last paragraph, the essay briefly engages with how this particular format of TV series helps develop narrative models that fan webpages and fanfic archives are still struggling not so much to comprehend as to actually transform into an expanded textuality able to tell a more ‘true’ story.

  14. Rethinking Over Textuality of Digital Image: A Methodological Proposal for Pleasant Reading on Digital Screens

    Directory of Open Access Journals (Sweden)

    Cristian Álvarez

    2009-12-01

    Full Text Available This paper sets out the need to rethink the instructional function of the image in the digital world in light of the new opportunities offered by a methodological proposal for reading as a game. It first examines García Canclini’s observations on the reading habits of university students and their problems in the context of new technologies: accumulation of information versus weakening of reflection, compounded by a lack of appreciation of visual images. Faced with this problematic situation, and with the aim of sketching out options, it analyzes two experiences with books: the “tasty” reading of texts (the “good reading”) and the potentialities present in the essential characteristics of play. On this basis, it proposes a five-step methodology for reading images on digital screens, whose aim is to seize the possibilities of “good reading” in order to expand the comprehension of the visual information perceived through the screen. The proposal puts the accent on the textuality of the representational surface of an image, and draws attention to the visual route taken across it, so as to make it possible to identify both significant forms and spaces. The proposal is illustrated with examples.

  15. Norman Fairclough’s Textually Oriented Discourse Analysis in Vladimir Nabokov’s Lolita

    Directory of Open Access Journals (Sweden)

    Pegah Sheibeh

    2016-03-01

    Full Text Available The present study is an attempt to use the “textually oriented discourse analysis” of Norman Fairclough (b. 1941) to offer a new reading of Vladimir Nabokov’s Lolita (1955). The researcher found it appropriate to read the chosen work through Fairclough’s stylistic and discursive features in order to represent the identities of the main characters. Nabokov’s works offer an important portrayal of discursive challenges in post-war American society of the kind that is central to Fairclough’s theory. One of the most apparent aspects of Nabokov’s works is that of social identity and subject formation. As discourse theory denotes, a subject 'misrepresents' the world in ideology because he wants to do so, because there is some reward or benefit to him in doing so. Similarly, Humbert Humbert in Lolita is looking for an imaginary world in which he can hide his suppressed identity; he fakes a new identity for himself through fiction. Considering the style of narration in Lolita, some readers consider Humbert an unreliable narrator: he sometimes makes up events, and in some parts of the story he seems insane and uncertain. In Lolita, Nabokov tries to show that each individual creates his or her own reality rather than reflecting it. Humbert narrates his personal story from his point of view, which might differ from that of the people around him. Reality, in Nabokov’s perspective, is subjective, a mixture of memory and imagination.

  16. Comprensión y Producción Textual Narrativa en Estudiantes de Educación Primaria

    Directory of Open Access Journals (Sweden)

    LEIDY TATIANA GUZMÁN TORRES

    2015-01-01

    Full Text Available Using a mixed-methods approach, we characterized narrative text comprehension and production in primary school students (first and second grade), as well as psychosocial aspects (the social value of reading and reading habits) and contextual aspects (cognitive and affective teacher-child interactions) among primary school teachers from three educational institutions. Students were found to perform at a medium-high level in text production and literal comprehension, and at a low level in inferential comprehension. In addition, most of the teachers were found to be occasional readers, to employ unidirectional cognitive interactions, to offer moderate emotional support to their students, and to value reading as an instrumental and recreational tool.

  17. A Cross Cultural Analysis of Textual and Interpersonal Metadiscourse Markers: The Case of Economic Articles in English and Persian Newspapers

    Directory of Open Access Journals (Sweden)

    Abbas Mehrabi Boshrabadi

    2014-04-01

    Full Text Available This study was an attempt to investigate the functional role of textual and interpersonal metadiscourse markers in English and Persian economic news reports. To this end, 10 news articles, 5 in each language, were randomly selected from the economic sections of leading newspapers published in 2013-2014 in Iran and the United States. Based on Kopple’s (1985) taxonomy, the type and frequency of metadiscourse markers used in the texts were analyzed to find out their functions in the text. The findings revealed that the textual markers used by the Persian authors were considerably more frequent than those employed by the American writers. Interestingly, unlike the Persian writers, the American authors used a larger number of interpersonal markers, which gave their treatment of the subject a different angle. It is evident that the differential use of metadiscourse markers by authors of different nationalities can be attributed to culture-specific norms governing the development and organization of discourse.
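
    The frequency analysis described here amounts to tallying marker occurrences per category and normalizing for text length. A minimal Python sketch with invented (and far too small) marker lists; Kopple's actual categories are much richer.

```python
# Illustrative marker lists only; not Kopple's full taxonomy.
TEXTUAL = {"however", "therefore", "first", "finally"}
INTERPERSONAL = {"perhaps", "clearly", "i", "we", "must"}

def marker_rates(text):
    """Per-1,000-word rates of textual and interpersonal markers in a text."""
    words = text.lower().split()
    n = len(words)
    t = sum(w in TEXTUAL for w in words)
    i = sum(w in INTERPERSONAL for w in words)
    # Rates per 1,000 words make texts of different lengths comparable.
    return {"textual_per_1000": 1000 * t / n, "interpersonal_per_1000": 1000 * i / n}

print(marker_rates("clearly the deficit grew however exports rose therefore we must act"))
```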

  18. Information Literacy on the Web: How College Students Use Visual and Textual Cues to Assess Credibility on Health Websites

    Directory of Open Access Journals (Sweden)

    Katrina L. Pariera

    2012-12-01

    Full Text Available One of the most important literacy skills in today’s information society is the ability to determine the credibility of online information. Users sort through a staggering number of websites while discerning which will provide satisfactory information. In this study, 70 college students assessed the credibility of health websites of low and high design quality, in either low or high credibility groups. The study’s purpose was to understand whether students relied more on textual or visual cues in determining credibility, and whether this affected their recall of those cues later. The results indicate that when viewing a high-credibility website, high design quality will bolster the credibility perception, but design quality will not compensate for a low-credibility website. The recall test also indicated that credibility does impact the participants’ recall of visual and textual cues. Implications are discussed in light of the Elaboration Likelihood Model.

  19. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...... and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross...

  20. Roadmap for annotating transposable elements in eukaryote genomes.

    Science.gov (United States)

    Permal, Emmanuelle; Flutre, Timothée; Quesneville, Hadi

    2012-01-01

    Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, the annotation process now represents a bottleneck to genome analysis, especially when dealing with transposable elements (TEs). Combined approaches, using both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. At each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools and good practices to be used.
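
    Combining de novo and knowledge-based detections ultimately requires reconciling overlapping coordinate intervals reported by different tools. A minimal Python sketch of such a consensus merge, with invented calls (RepeatMasker standing in for the knowledge-based side):

```python
def merge_intervals(calls):
    """Merge overlapping TE calls from several detectors into consensus annotations."""
    merged = []
    for start, end, source in sorted(calls):
        if merged and start <= merged[-1][1]:            # overlaps the previous consensus
            prev_start, prev_end, sources = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), sources | {source})
        else:
            merged.append((start, end, {source}))
    return merged

# De novo and library-based hits on one contig (coordinates are invented).
calls = [(100, 480, "de_novo"), (450, 900, "repeatmasker"), (2000, 2300, "de_novo")]
print(merge_intervals(calls))
```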

  1. How Strong Is Your Coffee? The Influence of Visual Metaphors and Textual Claims on Consumers’ Flavor Perception and Product Evaluation

    Science.gov (United States)

    Fenko, Anna; de Vries, Roxan; van Rompay, Thomas

    2018-01-01

    This study investigates the relative impact of textual claims and visual metaphors displayed on the product’s package on consumers’ flavor experience and product evaluation. For consumers, strength is one of the most important sensory attributes of coffee. The 2 × 3 between-subjects experiment (N = 123) compared the effects of visual metaphor of strength (an image of a lion located either on top or on the bottom of the package of coffee beans) and the direct textual claim (“extra strong”) on consumers’ responses to coffee, including product expectation, flavor evaluation, strength perception and purchase intention. The results demonstrate that both the textual claim and the visual metaphor can be efficient in communicating the product attribute of strength. The presence of the image positively influenced consumers’ product expectations before tasting. The textual claim increased the perception of strength of coffee and the purchase intention of the product. The location of the image also played an important role in flavor perception and purchase intention. The image located on the bottom of the package increased the perceived strength of coffee and purchase intention of the product compared to the image on top of the package. This result could be interpreted from the perspective of the grounded cognition theory, which suggests that a picture in the lower part of the package would automatically activate the “strong is heavy” metaphor. As heavy objects are usually associated with a position on the ground, this would explain why perceiving a visually heavy package would lead to the experience of a strong coffee. Further research is needed to better understand the relationships between a metaphorical image and its spatial position in food packaging design. PMID:29459840

  2. How Strong Is Your Coffee? The Influence of Visual Metaphors and Textual Claims on Consumers’ Flavor Perception and Product Evaluation

    Directory of Open Access Journals (Sweden)

    Anna Fenko

    2018-02-01

    Full Text Available This study investigates the relative impact of textual claims and visual metaphors displayed on the product’s package on consumers’ flavor experience and product evaluation. For consumers, strength is one of the most important sensory attributes of coffee. The 2 × 3 between-subjects experiment (N = 123) compared the effects of a visual metaphor of strength (an image of a lion located either on top or on the bottom of the package of coffee beans) and the direct textual claim (“extra strong”) on consumers’ responses to coffee, including product expectation, flavor evaluation, strength perception and purchase intention. The results demonstrate that both the textual claim and the visual metaphor can be efficient in communicating the product attribute of strength. The presence of the image positively influenced consumers’ product expectations before tasting. The textual claim increased the perception of strength of coffee and the purchase intention of the product. The location of the image also played an important role in flavor perception and purchase intention. The image located on the bottom of the package increased the perceived strength of coffee and purchase intention of the product compared to the image on top of the package. This result could be interpreted from the perspective of the grounded cognition theory, which suggests that a picture in the lower part of the package would automatically activate the “strong is heavy” metaphor. As heavy objects are usually associated with a position on the ground, this would explain why perceiving a visually heavy package would lead to the experience of a strong coffee. Further research is needed to better understand the relationships between a metaphorical image and its spatial position in food packaging design.

  3. The Army and the Academy as Textual Communities: Exploring Mismatches in the Concepts of Attribution, Appropriation, and Shared Goals

    Science.gov (United States)

    2010-01-01

    ... writing manual, Student Text 22-1, provides information on plagiarism violations, which are "subject to review and may be referred to an academic board" ... charges of plagiarism were leveled by an academic who recognized in the manual various ideas and text from previously published sources. In many ways ... worried about academic plagiarism may believe that revealing variations in textual practices "sends a message" to students that rules can be bent ...

  4. How Strong Is Your Coffee? The Influence of Visual Metaphors and Textual Claims on Consumers' Flavor Perception and Product Evaluation.

    Science.gov (United States)

    Fenko, Anna; de Vries, Roxan; van Rompay, Thomas

    2018-01-01

    This study investigates the relative impact of textual claims and visual metaphors displayed on the product's package on consumers' flavor experience and product evaluation. For consumers, strength is one of the most important sensory attributes of coffee. The 2 × 3 between-subjects experiment (N = 123) compared the effects of visual metaphor of strength (an image of a lion located either on top or on the bottom of the package of coffee beans) and the direct textual claim ("extra strong") on consumers' responses to coffee, including product expectation, flavor evaluation, strength perception and purchase intention. The results demonstrate that both the textual claim and the visual metaphor can be efficient in communicating the product attribute of strength. The presence of the image positively influenced consumers' product expectations before tasting. The textual claim increased the perception of strength of coffee and the purchase intention of the product. The location of the image also played an important role in flavor perception and purchase intention. The image located on the bottom of the package increased the perceived strength of coffee and purchase intention of the product compared to the image on top of the package. This result could be interpreted from the perspective of the grounded cognition theory, which suggests that a picture in the lower part of the package would automatically activate the "strong is heavy" metaphor. As heavy objects are usually associated with a position on the ground, this would explain why perceiving a visually heavy package would lead to the experience of a strong coffee. Further research is needed to better understand the relationships between a metaphorical image and its spatial position in food packaging design.

  5. Revisão textual: para além da revisão linguística

    Directory of Open Access Journals (Sweden)

    Sueli Maria Coelho

    2010-07-01

    Full Text Available This article aims to reinforce the idea that text revision must go beyond the simple correction of grammatical and orthographic issues in texts. Beyond those issues, attending to parameters such as genre and textuality in the material under revision, as well as to its conformity with publication norms, its treatment of the topic and its graphic aspects, is fundamental to good text revision. After theoretical discussions of global aspects of the text, such as the notions of textual/discursive genres and of textuality, we seek to show, through the analysis of three texts (an academic abstract, a news item taken from a website, and a joke), how the aspects discussed here influence the reviser's decision-making. Keywords: Text revision; Linguistic and thematic revision; Graphic and normative revision; Textual/discursive genres.

  6. Fluid inclusions in salt: an annotated bibliography

    International Nuclear Information System (INIS)

    Isherwood, D.J.

    1979-01-01

    This annotated bibliography was compiled while searching the literature for information on fluid inclusions in salt for the Nuclear Regulatory Commission's study on the deep-geologic disposal of nuclear waste. The migration of fluid inclusions in a thermal gradient is a potential hazard to the safe disposal of nuclear waste in a salt repository. At the present time, a prediction as to whether this hazard precludes the use of salt for waste disposal cannot be made. Limited data from the Salt-Vault in situ heater experiments in the early 1960's (Bradshaw and McClain, 1971) leave little doubt that fluid inclusions can migrate towards a heat source. In addition to the bibliography, there is a brief summary of the physical and chemical characteristics that, together with the temperature of the waste, will determine the chemical composition of the brine in contact with the waste canister, the rate of fluid migration, and the brine-canister-waste interactions

  7. Annotation and Curation of Uncharacterized proteins- Challenges

    Directory of Open Access Journals (Sweden)

    Johny Ijaq

    2015-03-01

    Full Text Available Hypothetical proteins are proteins predicted to be expressed from an open reading frame (ORF); they constitute a substantial fraction of proteomes in both prokaryotes and eukaryotes. Genome projects have led to the identification of many therapeutic targets, putative protein functions and their interactions. In this review we survey the various methods available. Annotation linked to structural and functional prediction of hypothetical proteins assists in the discovery of new structures and functions, serving as markers and pharmacological targets for drug design, discovery and screening. Mass spectrometry is an analytical technique for validating protein characterisation; matrix-assisted laser desorption ionization-mass spectrometry (MALDI-MS) is a particularly efficient analytical method. Microarrays and protein expression profiles help in understanding biological systems through a systems-wide study of proteins and their interactions with other proteins and non-proteinaceous molecules, which control complex processes in cells, tissues and even whole organisms. Next-generation sequencing technology accelerates multiple areas of genomics research.

  8. Sophia: An Expedient UMLS Concept Extraction Annotator.

    Science.gov (United States)

    Divita, Guy; Zeng, Qing T; Gundlapalli, Adi V; Duvall, Scott; Nebeker, Jonathan; Samore, Matthew H

    2014-01-01

    An opportunity exists for meaningful concept extraction and indexing from large corpora of clinical notes in the Veterans Affairs (VA) electronic medical record. Currently available tools such as MetaMap, cTAKES and HITex do not scale up to address this big-data need. Sophia, a rapid UMLS concept extraction annotator, was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance. We report on the development, testing and benchmarking of Sophia against MetaMap and cTAKES. Sophia demonstrated improved recall compared with cTAKES and MetaMap (0.71 vs 0.66 and 0.38). The overall f-score was similar to that of cTAKES and an improvement over MetaMap (0.53 vs 0.57 and 0.43). With regard to speed of processing records, we found Sophia to be several-fold faster than cTAKES and the scaled-out MetaMap service. Sophia offers a viable alternative for high-throughput information extraction tasks.
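
    The recall and f-score figures quoted here are standard set-overlap metrics over annotated concepts. A minimal Python sketch of how such a benchmark is computed, with invented gold and predicted annotation sets (CUI plus text span):

```python
def prf(gold: set, predicted: set):
    """Precision, recall and F1 of predicted concept annotations against a gold set."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Each annotation is (UMLS CUI, span start, span end); values are invented.
gold = {("C0011849", 10, 27), ("C0020538", 40, 52)}
pred = {("C0011849", 10, 27), ("C0027051", 60, 72)}
print(prf(gold, pred))  # (0.5, 0.5, 0.5)
```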

  9. Frame on frames: an annotated bibliography

    International Nuclear Information System (INIS)

    Wright, T.; Tsao, H.J.

    1983-01-01

    The success or failure of any sample survey of a finite population is largely dependent upon the condition and adequacy of the list, or frame, from which the probability sample is selected. Much of the published work on survey sampling has focused on the measurement of sampling errors and, more recently and to a lesser extent, on nonsampling errors. Recent studies of data quality for various types of data collection systems have revealed that the extent of nonsampling error far exceeds that of sampling error in many cases. While much of this nonsampling error, which is difficult to measure, can be attributed to poor frames, relatively little effort or theoretical work has focused on this contribution to total error. The objective of this paper is to present an annotated bibliography on frames, in the hope that it will bring together, for experimenters, a number of suggestions for action when sampling from imperfect frames, and that more attention will be given to this area of survey methods research

  10. Annotating Human P-Glycoprotein Bioassay Data.

    Science.gov (United States)

    Zdrazil, Barbara; Pinto, Marta; Vasanthanathan, Poongavanam; Williams, Antony J; Balderud, Linda Zander; Engkvist, Ola; Chichester, Christine; Hersey, Anne; Overington, John P; Ecker, Gerhard F

    2012-08-01

    Huge amounts of small-compound bioactivity data have been entering the public domain as a consequence of open innovation initiatives. It is now time to carefully analyse existing bioassay data and give it a systematic structure. Our study aims to annotate prominent in vitro assays used for the determination of bioactivities of human P-glycoprotein inhibitors and substrates as they are represented in the ChEMBL and TP-search open source databases. Furthermore, the extent to which data determined in different assays can be combined with each other is explored. As a result of this study, it is suggested that for inhibitors of human P-glycoprotein it is possible to combine data coming from the same assay type, provided that the cell lines used are identical and the fluorescent or radiolabeled substrates have overlapping binding sites. In addition, the study demonstrates a need for larger, chemically diverse datasets that have been measured in a panel of different assays. This would certainly aid the search for other inter-correlations between bioactivity data yielded by different assay setups.

  11. The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

    Science.gov (United States)

    Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

    2017-07-03

    BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and the information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure over UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster once they satisfy the similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures, and each such cluster is associated with a hidden Markov model that allows building template-target alignments suitable for structural modeling. A further 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB accession, Fasta sequence, GO term, PFAM domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
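
    Graph-based clustering under pairwise similarity thresholds can be sketched with a union-find over qualifying alignment pairs. The Python below applies BAR's stated thresholds (identity ≥40%, coverage ≥90%) to invented alignment results; it is single-linkage in spirit, not the server's actual pipeline.

```python
parent = {}

def find(x):
    """Union-find root lookup with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

# (seq_a, seq_b, identity %, coverage %) -- alignment results are invented.
pairs = [("P1", "P2", 62.0, 95.0), ("P2", "P3", 45.5, 92.0), ("P4", "P5", 38.0, 99.0)]
for a, b, identity, coverage in pairs:
    if identity >= 40.0 and coverage >= 90.0:   # BAR's stated thresholds
        union(a, b)

clusters = {}
for seq in ["P1", "P2", "P3", "P4", "P5"]:
    clusters.setdefault(find(seq), []).append(seq)
print(list(clusters.values()))   # [['P1', 'P2', 'P3'], ['P4'], ['P5']]
```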

  12. Comprensión y Producción Textual Narrativa en Estudiantes de Educación Primaria/ Narrative Textual Comprehension and Production in Primary School Students/ Compreensão e Produção Textual Narrativa em Estudantes de Educação Primária

    Directory of Open Access Journals (Sweden)

    Leidy Tatiana Guzmán Torres

    2015-05-01

    Full Text Available Using a mixed-methods approach, we characterized narrative text comprehension and production in primary school students (first and second grade), as well as psychosocial aspects (the social value of reading and reading habits) and contextual aspects (cognitive and affective teacher-child interactions) among primary school teachers from three educational institutions. Students were found to perform at a medium-high level in text production and literal comprehension, and at a low level in inferential comprehension. In addition, most of the teachers were found to be occasional readers, to employ unidirectional cognitive interactions, to offer moderate emotional support to their students, and to value reading as an instrumental and recreational tool.

  13. MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences.

    Science.gov (United States)

    Zhidkov, Ilia; Nagar, Tal; Mishmar, Dan; Rubin, Eitan

    2011-11-01

    The use of Next-Generation Sequencing of mitochondrial DNA is becoming widespread in biological and clinical research. This, in turn, creates a need for a convenient tool that detects and analyzes heteroplasmy. Here we present MitoBamAnnotator, a user-friendly web-based tool that allows maximum flexibility and control in heteroplasmy research. MitoBamAnnotator provides the user with a comprehensively annotated overview of mitochondrial genetic variation, allowing for in-depth analysis with no prior knowledge of programming. Copyright © 2011 Elsevier B.V. and Mitochondria Research Society. All rights reserved.
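
    The record does not publish MitoBamAnnotator's algorithm, but heteroplasmy detection from aligned reads can be sketched with pysam as a per-position minor-allele-frequency check; the depth and frequency thresholds below are assumptions, not the tool's actual settings.

```python
from collections import Counter
import pysam

def heteroplasmic_sites(bam_path, ref_name="chrM", min_depth=50, min_maf=0.02):
    """Flag positions whose minor allele frequency exceeds min_maf.

    A minimal sketch of heteroplasmy detection from a BAM file; not
    MitoBamAnnotator's published logic.
    """
    sites = []
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for column in bam.pileup(ref_name):
            bases = Counter()
            for read in column.pileups:
                if read.is_del or read.is_refskip:
                    continue
                bases[read.alignment.query_sequence[read.query_position]] += 1
            depth = sum(bases.values())
            if depth < min_depth or len(bases) < 2:
                continue
            minor = sorted(bases.values())[-2]  # second most frequent base
            if minor / depth >= min_maf:
                sites.append((column.reference_pos, dict(bases)))
    return sites
```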

  14. Detecting modularity "smells" in dependencies injected with Java annotations

    NARCIS (Netherlands)

    Roubtsov, S.; Serebrenik, A.; Brand, van den M.G.J.

    2010-01-01

    Dependency injection is a recent programming mechanism that reduces dependencies among components by delegating them to an external entity, called a dependency injection framework. An increasingly popular approach to implementing dependency injection relies upon Java annotations, a special form

  15. Annotated bibliography of South African indigenous evergreen forest ecology

    CSIR Research Space (South Africa)

    Geldenhuys, CJ

    1985-01-01

    Full Text Available Annotated references to 519 publications are presented, together with keyword listings and keyword, regional, place name and taxonomic indices. This bibliography forms part of the first phase of the activities of the Forest Biome Task Group....

  16. Creating New Medical Ontologies for Image Annotation A Case Study

    CERN Document Server

    Stanescu, Liana; Brezovan, Marius; Mihai, Cristian Gabriel

    2012-01-01

    Creating New Medical Ontologies for Image Annotation focuses on the problem of automatically annotating medical images, which the authors solve in an original manner. All the steps of this process are described in detail, with algorithms, experiments and results. The original algorithms proposed by the authors are compared with other efficient, similar algorithms. In addition, the authors treat the problem of creating ontologies automatically, starting from Medical Subject Headings (MeSH). They present some efficient and relevant annotation models, as well as the basics of the annotation model used by the proposed system: Cross Media Relevance Models. Based on a text query, the system retrieves the images that contain objects described by the keywords.
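
    The Cross Media Relevance Model the record names can be sketched in a few lines: the score of a keyword given a query image's blobs is a sum, over training images, of a smoothed joint probability. The training data, blob vocabulary and smoothing constant below are invented for illustration; this is a toy version of the model, not the book's implementation.

```python
from collections import Counter

# Toy training set: each image has visual "blobs" and keyword annotations.
train = {
    "img1": {"blobs": ["round", "dark"],   "words": ["tumor"]},
    "img2": {"blobs": ["round", "bright"], "words": ["cyst"]},
}

def smoothed(count, total, bg_count, bg_total, alpha=0.1):
    # Jelinek-Mercer smoothing of the per-image estimate with a
    # background estimate over the whole training set.
    return (1 - alpha) * count / total + alpha * bg_count / bg_total

def p_word_given_blobs(word, blobs):
    all_words = Counter(w for j in train.values() for w in j["words"])
    all_blobs = Counter(b for j in train.values() for b in j["blobs"])
    score = 0.0
    for j in train.values():
        items = j["blobs"] + j["words"]
        p = 1.0 / len(train)  # uniform prior P(J)
        p *= smoothed(j["words"].count(word), len(items),
                      all_words[word], sum(all_words.values()))
        for b in blobs:
            p *= smoothed(j["blobs"].count(b), len(items),
                          all_blobs[b], sum(all_blobs.values()))
        score += p
    return score

# Rank candidate keywords for an unannotated image with these blobs.
for w in ["tumor", "cyst"]:
    print(w, round(p_word_given_blobs(w, ["round", "dark"]), 5))
```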

  17. Geothermal wetlands: an annotated bibliography of pertinent literature

    Energy Technology Data Exchange (ETDEWEB)

    Stanley, N.E.; Thurow, T.L.; Russell, B.F.; Sullivan, J.F.

    1980-05-01

    This annotated bibliography covers the following topics: algae, wetland ecosystems; institutional aspects; macrophytes - general, production rates, and mineral absorption; trace metal absorption; wetland soils; water quality; and other aspects of marsh ecosystems. (MHR)

  18. Managing and Querying Image Annotation and Markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches to representing annotations and image markup are serious barriers for researchers who want to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support for complex queries. In this paper, we present our work on managing AIM data through a native XML approach and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167
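
    Server-side, the AIM work queries native XML through XQuery extensions; a client-side sketch of the same idea, using XPath over a toy annotation document, might look like the following. The element names are purely illustrative, not the real AIM schema.

```python
from lxml import etree

# A toy annotation document; the real AIM model is far richer and the
# element and attribute names below are invented.
doc = etree.fromstring("""
<imageAnnotation>
  <entity label="nodule" confidence="0.91">
    <markup shape="polyline" imageUID="1.2.840.113"/>
  </entity>
  <entity label="cyst" confidence="0.55"/>
</imageAnnotation>
""")

# XPath plays the role XQuery plays server-side: find high-confidence
# entities and report the shapes of their markups.
for e in doc.xpath('//entity[@confidence >= 0.8]'):
    shapes = e.xpath('./markup/@shape')
    print(e.get("label"), shapes)
```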

  19. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches to representing annotations and image markup are serious barriers for researchers who want to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support for complex queries. In this paper, we present our work on managing AIM data through a native XML approach and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  20. Annotating Evidence Based Clinical Guidelines: A Lightweight Ontology

    NARCIS (Netherlands)

    Hoekstra, R.; de Waard, A.; Vdovjak, R.; Paschke, A.; Burger, A.; Romano, P.; Marshall, M.S.; Splendiani, A.

    2012-01-01

    This paper describes a lightweight ontology for representing annotations of declarative, evidence-based clinical guidelines. We present the motivation and requirements for this representation, based on an analysis of several guidelines. The ontology provides the means to connect clinical questions

  1. 06491 Summary -- Digital Historical Corpora- Architecture, Annotation, and Retrieval

    OpenAIRE

    Burnard, Lou; Dobreva, Milena; Fuhr, Norbert; Lüdeling, Anke

    2007-01-01

    The seminar "Digital Historical Corpora" brought together scholars from (historical) linguistics, (historical) philology, computational linguistics and computer science who work with collections of historical texts. The issues that were discussed include digitization, corpus design, corpus architecture, annotation, search, and retrieval.

  2. Combined evidence annotation of transposable elements in genome sequences.

    Directory of Open Access Journals (Sweden)

    Hadi Quesneville

    2005-07-01

    Full Text Available Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced by combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other
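
    The combined-evidence idea is easy to sketch: treat each method's predictions as intervals, keep only bases supported by at least a chosen number of independent methods, and merge the survivors back into "TE models". In the sketch below the coordinates and the two-method threshold are invented; only the method names follow the pipeline's inputs.

```python
from collections import defaultdict

# Toy per-method TE predictions as (start, end) intervals on one contig.
predictions = {
    "RepeatMasker": [(100, 480), (900, 1200)],
    "BLASTER":      [(120, 470)],
    "RECON":        [(110, 500), (2000, 2300)],
}

def support_per_base(predictions):
    """Count, for every base, how many methods predict a TE there."""
    cover = defaultdict(int)
    for intervals in predictions.values():
        for start, end in intervals:
            for pos in range(start, end):
                cover[pos] += 1
    return cover

# Keep bases supported by at least two independent methods, then merge
# consecutive survivors back into intervals (the combined "TE models").
cover = support_per_base(predictions)
kept = sorted(p for p, n in cover.items() if n >= 2)
models = []
if kept:
    run = [kept[0]]
    for pos in kept[1:]:
        if pos == run[-1] + 1:
            run.append(pos)
        else:
            models.append((run[0], run[-1] + 1))
            run = [pos]
    models.append((run[0], run[-1] + 1))
print(models)  # [(110, 480)]
```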

  3. A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

    OpenAIRE

    Hamed Hassanzadeh; MohammadReza Keyvanpour

    2011-01-01

    The Semantic Web is an extension of the current web in which information is given well-defined meaning. The aim of the Semantic Web is to improve the quality and intelligence of the current web by turning its contents into a machine-understandable form. Semantic-level information is therefore one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles to Semantic Annotation, such as ...

  4. Annotation Method (AM): SE7_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE7_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...
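
    This record, and the near-identical SE*_AM1 records that follow, describe a two-stage search: a primary lookup against KEGG, KNApSAcK and LipidMAPS, with unmatched peaks passed on to exactMassDB and Pep1000. Below is a minimal sketch of that fallback logic with mocked databases and an assumed mass tolerance; PowerGet's actual matching rules are not given in these records.

```python
# Databases mocked as {exact_mass: compound_id} dicts, purely illustrative.
PRIMARY = {"KEGG": {180.063: "C00031"}, "KNApSAcK": {194.080: "C00001437"}, "LipidMAPS": {}}
SECONDARY = {"exactMassDB": {256.240: "EM0042"}, "Pep1000": {}}

def annotate(peak_mass, tolerance=0.005):
    # Try the primary databases first; only unmatched peaks reach the
    # secondary stage, mirroring the two-step search described above.
    for stage in (PRIMARY, SECONDARY):
        for db_name, db in stage.items():
            for mass, compound in db.items():
                if abs(mass - peak_mass) <= tolerance:
                    return db_name, compound
    return None

print(annotate(180.064))  # ('KEGG', 'C00031')
print(annotate(256.241))  # ('exactMassDB', 'EM0042')
print(annotate(999.999))  # None
```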

  5. Annotation Method (AM): SE36_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE36_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  6. Annotation Method (AM): SE14_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE14_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  7. Genome Annotation and Transcriptomics of Oil-Producing Algae

    Science.gov (United States)

    2015-03-16

    AFRL-OSR-VA-TR-2015-0103. Genome Annotation and Transcriptomics of Oil-Producing Algae. Sabeeha Merchant, University of California Los Angeles. Final report, 2010 to 12-31-2014, contract number FA9550-10-1-0095. Abstract: Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some

  8. Annotation Method (AM): SE33_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE33_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  9. Annotation Method (AM): SE12_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE12_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  10. Annotation Method (AM): SE20_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE20_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  11. Annotation Method (AM): SE2_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE2_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  12. Annotation Method (AM): SE28_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE28_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  13. Annotation Method (AM): SE11_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE11_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  14. Annotation Method (AM): SE17_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE17_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  15. Annotation Method (AM): SE10_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE10_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  16. Annotation Method (AM): SE4_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE4_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  17. Annotation Method (AM): SE9_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE9_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  18. Annotation Method (AM): SE3_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE3_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  19. Annotation Method (AM): SE25_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE25_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  20. Annotation Method (AM): SE30_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE30_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  1. Annotation Method (AM): SE16_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE16_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  2. Annotation Method (AM): SE29_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE29_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  3. Annotation Method (AM): SE35_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE35_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  4. Annotation Method (AM): SE6_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE6_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  5. Annotation Method (AM): SE1_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE1_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  6. Annotation Method (AM): SE8_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE8_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  7. Annotation Method (AM): SE13_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE13_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  8. Annotation Method (AM): SE26_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE26_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  9. Annotation Method (AM): SE27_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE27_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  10. Annotation Method (AM): SE34_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE34_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  11. Annotation Method (AM): SE5_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE5_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  12. Annotation Method (AM): SE15_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE15_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  13. Annotation Method (AM): SE31_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE31_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  14. Annotation Method (AM): SE32_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE32_AM1 PowerGet annotation A1. In the annotation process, KEGG, KNApSAcK and LipidMAPS are used for the primary database search. Peaks with no hit to these databases are then selected for a secondary search using the exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...

  15. Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

    Directory of Open Access Journals (Sweden)

    Danuta Roszko

    2015-06-01

    Full Text Available In the article, the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT), built to support Polish-Lithuanian theoretical contrastive studies, a Polish-Lithuanian electronic dictionary, and the work of sworn translators. The semantic annotation being introduced into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies, and also proves helpful in translation work.

  16. Analysis of LYSA-calculus with explicit confidentiality annotations

    DEFF Research Database (Denmark)

    Gao, Han; Nielson, Hanne Riis

    2006-01-01

    Recently there has been increased research interest in applying process calculi to the verification of cryptographic protocols, due to their ability to formally model protocols. This work presents LYSA with explicit confidentiality annotations for indicating the expected behavior of target... malicious activities performed by attackers as specified by the confidentiality annotations. The proposed analysis approach is fully automatic, without the need for human intervention, and has been applied successfully to a number of protocols....

  17. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges, such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycete) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of the annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
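
    A crude stand-in for the frameshift screening this record alludes to: flag a predicted CDS when its length is not a whole number of codons or its translation contains an internal stop. Real pipelines compare gene models against homologs; this Biopython check is only a heuristic sketch.

```python
from Bio.Seq import Seq

def frameshift_suspect(cds):
    """Heuristically flag a predicted CDS as a possible frameshift victim."""
    if len(cds) % 3 != 0:
        return True  # not a whole number of codons
    protein = str(Seq(cds).translate())
    return "*" in protein[:-1]  # an internal stop suggests a broken frame

print(frameshift_suspect("ATGGCTGCTTAA"))     # False: clean ORF
print(frameshift_suspect("ATGGCTGCTTAAA"))    # True: length not divisible by 3
print(frameshift_suspect("ATGTAAGCTGCTTAA"))  # True: internal stop codon
```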

  18. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a rapidly growing research area, creating a need to analyze the large quantities of data generated by next-generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  19. AutoFACT: An Automatic Functional Annotation and Classification Tool

    Directory of Open Access Journals (Sweden)

    Lang B Franz

    2005-06-01

    Full Text Available Abstract Background Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, Gene Ontology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only between 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm.
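
    AutoFACT's "most informative description" step can be approximated as: discard hits whose descriptions are uninformative, then take the best-scoring survivor across the user-selected databases. The keyword list and hit tuples below are assumptions for illustration, not AutoFACT's actual rules.

```python
# Descriptions containing these phrases carry no functional information.
UNINFORMATIVE = ("hypothetical protein", "unknown", "unnamed protein product")

def best_annotation(hits):
    """hits: (database, description, e_value) tuples from several BLAST runs."""
    informative = [
        h for h in hits
        if not any(k in h[1].lower() for k in UNINFORMATIVE)
    ]
    if not informative:
        return None  # leave the sequence unclassified
    return min(informative, key=lambda h: h[2])  # lowest e-value wins

hits = [
    ("nr",       "hypothetical protein XP_123", 1e-80),
    ("UniRef90", "cytochrome c oxidase subunit I", 1e-45),
    ("KEGG",     "unknown", 1e-60),
]
print(best_annotation(hits))  # ('UniRef90', 'cytochrome c oxidase subunit I', 1e-45)
```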

  20. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Directory of Open Access Journals (Sweden)

    Gustavo Arango-Argoty

    Full Text Available Metagenomics is a rapidly growing research area, creating a need to analyze the large quantities of data generated by next-generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.