WorldWideScience

Sample records for searching biomedical databases

  1. Sagace: A web-based search engine for biomedical databases in Japan

    Directory of Open Access Journals (Sweden)

    Morita Mizuki

    2012-10-01

    Full Text Available Abstract Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/.

  2. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  3. BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.

    Science.gov (United States)

    Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália

    2016-07-01

    Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as a proof-of-concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and related topics. The keyword-based queries of the users are transformed into concepts and search results are presented and ranked accordingly. The semantic graph view portrays all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations

  4. Relational Databases and Biomedical Big Data.

    Science.gov (United States)

    de Silva, N H Nisansa D

    2017-01-01

    In various biomedical applications that collect, handle, and manipulate data, the amounts of data tend to build up and venture into the range identified as big data. In such cases, a design decision has to be made as to what type of database will be used to handle the data. More often than not, the default and classical solution in the biomedical domain, according to past research, is relational databases. While this was the norm for a long while, there is an evident trend to move away from relational databases in favor of other types and paradigms of databases. However, it is still of paramount importance to understand the interrelation that exists between biomedical big data and relational databases. This chapter reviews the pros and cons of using relational databases to store biomedical big data that previous research has discussed and used.
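
    As a hedged illustration of the kind of workload the chapter discusses, the sketch below stores a few biomedical measurements in a relational table and runs an aggregate SQL query with Python's built-in sqlite3 module; the schema, column names and values are invented for this example and are not taken from the chapter.

      # Minimal sketch: relational storage and aggregation of biomedical measurements.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      cur = conn.cursor()
      cur.execute("""
          CREATE TABLE measurements (
              patient_id INTEGER,
              analyte    TEXT,
              value      REAL,
              taken_at   TEXT
          )
      """)
      cur.executemany(
          "INSERT INTO measurements VALUES (?, ?, ?, ?)",
          [
              (1, "glucose", 5.4, "2017-01-05"),
              (1, "glucose", 6.1, "2017-02-05"),
              (2, "glucose", 4.9, "2017-01-07"),
          ],
      )

      # Aggregate query of the kind relational engines handle well as row counts grow.
      cur.execute(
          "SELECT patient_id, AVG(value) FROM measurements "
          "WHERE analyte = ? GROUP BY patient_id",
          ("glucose",),
      )
      print(cur.fetchall())
      conn.close()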

  5. An overview of biomedical literature search on the World Wide Web in the third millennium.

    Science.gov (United States)

    Kumar, Prince; Goel, Roshni; Jain, Chandni; Kumar, Ashish; Parashar, Abhishek; Gond, Ajay Ratan

    2012-06-01

    Complete access to the existing pool of biomedical literature and the ability to "hit" upon the exact information of the relevant specialty are becoming essential elements of academic and clinical expertise. With the rapid expansion of the literature database, it is almost impossible to keep up to date with every innovation. Using the Internet, however, most people can freely access this literature at any time, from almost anywhere. This paper highlights the use of the Internet in obtaining valuable biomedical research information, which is mostly available from journals, databases, textbooks and e-journals in the form of web pages, text materials, images, and so on. The authors present an overview of web-based resources for biomedical researchers, providing information about Internet search engines (e.g., Google), web-based bibliographic databases (e.g., PubMed, IndMed) and how to use them, and other online biomedical resources that can assist clinicians in reaching well-informed clinical decisions.

  6. The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews.

    NARCIS (Netherlands)

    W.M. Bramer (Wichor); D. Giustini (Dean); B.M.R. Kramer (Bianca); P.F. Anderson (Patricia)

    2013-01-01

    The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here

  7. KaBOB: ontology-based semantic integration of biomedical databases.

    Science.gov (United States)

    Livingston, Kevin M; Bada, Michael; Baumgartner, William A; Hunter, Lawrence E

    2015-04-23

    The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources. We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrate it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license. KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for
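
    A minimal sketch of the ontology-grounded triple querying that a resource like KaBOB enables, written with the third-party rdflib package; the IRIs, predicates and facts below are invented placeholders rather than KaBOB's actual schema or data.

      # Build a tiny RDF graph and query it in terms of concepts, not source records.
      from rdflib import Graph, Namespace, Literal

      EX = Namespace("http://example.org/kb/")        # placeholder namespace
      g = Graph()
      g.add((EX.gene_TP53, EX.denotes, EX.concept_TP53))
      g.add((EX.concept_TP53, EX.participates_in, EX.apoptotic_process))
      g.add((EX.concept_TP53, EX.label, Literal("TP53")))

      # Ask for concepts participating in a biological process.
      results = g.query("""
          PREFIX ex: <http://example.org/kb/>
          SELECT ?label WHERE {
              ?concept ex:participates_in ex:apoptotic_process ;
                       ex:label ?label .
          }
      """)
      for row in results:
          print(row.label)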

  8. Keyword Search in Databases

    CERN Document Server

    Yu, Jeffrey Xu; Chang, Lijun

    2009-01-01

    It has become highly desirable to provide users with flexible ways to query/search information over databases as simple as keyword search like Google search. This book surveys the recent developments on keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. Such structural information to be returned can be either trees or subgraphs representing how the objects, that contain the required keywords, are interconnected in a relational database or in an XML database. The structural keyword search is completely different from

  9. Where to search top-K biomedical ontologies?

    Science.gov (United States)

    Oliveira, Daniela; Butt, Anila Sahar; Haller, Armin; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh

    2018-03-20

    Searching for precise terms and terminological definitions in the biomedical data space is problematic, as researchers find overlapping, closely related and even equivalent concepts in a single or multiple ontologies. Search engines that retrieve ontological resources often suggest an extensive list of search results for a given input term, which leads to the tedious task of selecting the best-fit ontological resource (class or property) for the input term and reduces user confidence in the retrieval engines. A systematic evaluation of these search engines is necessary to understand their strengths and weaknesses in different search requirements. We have implemented seven comparable Information Retrieval ranking algorithms to search through ontologies and compared them against four search engines for ontologies. Free-text queries have been performed, the outcomes have been judged by experts and the ranking algorithms and search engines have been evaluated against the expert-based ground truth (GT). In addition, we propose a probabilistic GT that is developed automatically to provide deeper insights and confidence to the expert-based GT as well as evaluating a broader range of search queries. The main outcome of this work is the identification of key search factors for biomedical ontologies together with search requirements and a set of recommendations that will help biomedical experts and ontology engineers to select the best-suited retrieval mechanism in their search scenarios. We expect that this evaluation will allow researchers and practitioners to apply the current search techniques more reliably and that it will help them to select the right solution for their daily work. The source code (of seven ranking algorithms), ground truths and experimental results are available at https://github.com/danielapoliveira/bioont-search-benchmark.
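
    For illustration, the snippet below implements one classical information retrieval ranking scheme (TF-IDF with cosine similarity) of the general kind compared in the paper, using scikit-learn; the ontology class labels and query are invented, and this is not the authors' benchmark code.

      # Rank ontology class labels against a free-text query by TF-IDF cosine similarity.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      ontology_classes = [
          "malignant neoplasm of breast",
          "breast carcinoma",
          "neoplasm of lung",
          "carcinoma in situ",
      ]
      query = ["breast neoplasm"]

      vectorizer = TfidfVectorizer()
      doc_matrix = vectorizer.fit_transform(ontology_classes)
      query_vec = vectorizer.transform(query)

      scores = cosine_similarity(query_vec, doc_matrix).ravel()
      for score, label in sorted(zip(scores, ontology_classes), reverse=True):
          print(f"{score:.3f}  {label}")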

  10. The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews.

    Science.gov (United States)

    Bramer, Wichor M; Giustini, Dean; Kramer, Bianca MR; Anderson, PF

    2013-12-23

    The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here by testing whether GS can locate all studies included in 21 previously published SRs. Second, it examines the recall of GS, taking into account the maximum number of items that can be viewed, and tests whether more complete searches created by an information specialist will improve recall compared to the searches used in the 21 published SRs. The authors identified 21 biomedical SRs that had used GS and PubMed as information sources and reported their use of identical, reproducible search strategies in both databases. These search strategies were rerun in GS and PubMed, and analyzed as to their coverage and recall. Efforts were made to improve searches that underperformed in each database. GS' overall coverage was higher than PubMed (98% versus 91%) and overall recall is higher in GS: 80% of the references included in the 21 SRs were returned by the original searches in GS versus 68% in PubMed. Only 72% of the included references could be used as they were listed among the first 1,000 hits (the maximum number shown). Practical precision (the number of included references retrieved in the first 1,000, divided by 1,000) was on average 1.9%, which is only slightly lower than in other published SRs. Improving searches with the lowest recall resulted in an increase in recall from 48% to 66% in GS and, in PubMed, from 60% to 85%. Although its coverage and precision are acceptable, GS, because of its incomplete recall, should not be used as a single source in SR searching. A specialized, curated medical database such as PubMed provides experienced searchers with tools and functionality that help improve recall, and numerous options in order to optimize precision. Searches for SRs should be
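
    The recall and practical precision figures quoted above follow from simple ratios; the sketch below reproduces the arithmetic with illustrative counts (the per-review totals are invented, and only the resulting percentages mirror the abstract).

      # Illustrative arithmetic only; totals below are hypothetical.
      included_refs = 1000          # references included across the SRs (hypothetical)
      found_by_gs = 800             # returned by the original GS searches -> 80% recall
      found_by_pubmed = 680         # returned by the original PubMed searches -> 68% recall
      usable_in_first_1000 = 19     # included references among the first 1,000 GS hits of one SR

      recall_gs = found_by_gs / included_refs
      recall_pubmed = found_by_pubmed / included_refs
      practical_precision = usable_in_first_1000 / 1000   # as defined in the abstract

      print(f"GS recall: {recall_gs:.0%}; PubMed recall: {recall_pubmed:.0%}")
      print(f"Practical precision: {practical_precision:.1%}")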

  11. Biomedical journals and databases in Russia and Russian language in the former Soviet Union and beyond

    Directory of Open Access Journals (Sweden)

    Danishevskiy Kirill D

    2008-09-01

    Full Text Available Abstract In the 20th century, Russian biomedical science experienced a decline from the blossom of the early years to a drastic state. Through the first decades of the USSR, it was transformed to suit the ideological requirements of a totalitarian state and biased directives of communist leaders. Later, depressing economic conditions and isolation from the international research community further impeded its development. Contemporary Russia has inherited a system of medical education quite different from the west as well as counterproductive regulations for the allocation of research funding. The methodology of medical and epidemiological research in Russia is largely outdated. Epidemiology continues to focus on infectious disease and results of the best studies tend to be published in international periodicals. MEDLINE continues to be the best database to search for Russian biomedical publications, despite only a small proportion being indexed. The database of the Moscow Central Medical Library is the largest national database of medical periodicals, but does not provide abstracts and full subject heading codes, and it does not cover even the entire collection of the Library. New databases and catalogs (e.g. Panteleimon) that have appeared recently are incomplete and do not enable effective searching.

  12. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases.

    Science.gov (United States)

    Liljekvist, Mads Svane; Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob

    2015-01-01

    Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases' criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ's coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ were gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more database, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received impact factor for 2012 compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases

  13. BEST: Next-Generation Biomedical Entity Search Tool for Knowledge Discovery from Biomedical Literature.

    Directory of Open Access Journals (Sweden)

    Sunwon Lee

    Full Text Available As the volume of publications rapidly increases, searching for relevant information from the literature becomes more challenging. To complement standard search engines such as PubMed, it is desirable to have an advanced search tool that directly returns relevant biomedical entities such as targets, drugs, and mutations rather than a long list of articles. Some existing tools submit a query to PubMed and process retrieved abstracts to extract information at query time, resulting in a slow response time and limited coverage of only a fraction of the PubMed corpus. Other tools preprocess the PubMed corpus to speed up the response time; however, they are not constantly updated, and thus produce outdated results. Further, most existing tools cannot process sophisticated queries such as searches for mutations that co-occur with query terms in the literature. To address these problems, we introduce BEST, a biomedical entity search tool. BEST returns, as a result, a list of 10 different types of biomedical entities including genes, diseases, drugs, targets, transcription factors, miRNAs, and mutations that are relevant to a user's query. To the best of our knowledge, BEST is the only system that processes free text queries and returns up-to-date results in real time including mutation information in the results. BEST is freely accessible at http://best.korea.ac.kr.

  14. BOSS: context-enhanced search for biomedical objects

    Directory of Open Access Journals (Sweden)

    Choi Jaehoon

    2012-04-01

    Full Text Available Abstract Background There exist many academic search solutions, and most of them can be placed at either end of a spectrum: general-purpose search and domain-specific "deep" search systems. The general-purpose search systems, such as PubMed, offer a flexible query interface, but churn out a list of matching documents that users have to go through in order to find the answers to their queries. On the other hand, the "deep" search systems, such as PPI Finder and iHOP, return precompiled results in a structured way. Their results, however, are often found only within some predefined contexts. In order to alleviate these problems, we introduce a new search engine, BOSS, the Biomedical Object Search System. Methods Unlike conventional search systems, BOSS indexes segments rather than documents. A segment refers to a Maximal Coherent Semantic Unit (MCSU) such as a phrase, clause or sentence that is semantically coherent in the given context (e.g., biomedical objects or their relations). For a user query, BOSS finds all matching segments, identifies the objects appearing in those segments, and aggregates the segments for each object. Finally, it returns the ranked list of the objects along with their matching segments. Results The working prototype of BOSS is available at http://boss.korea.ac.kr. The current version of BOSS has indexed the abstracts of more than 20 million articles published during the last 16 years, from 1996 to 2011, across all science disciplines. Conclusion BOSS fills the gap between the two ends of the spectrum by allowing users to pose context-free queries and by returning a structured set of results. Furthermore, BOSS exhibits good scalability, just as conventional document search engines do, because it is designed to use a standard document-indexing model with minimal modifications. Considering these features, BOSS notches up the technological level of traditional solutions for search on biomedical information.
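
    A hedged toy sketch of the segment-indexing idea described above (this is not BOSS code): sentence-level segments are indexed instead of whole documents, and matching segments are grouped by the biomedical object they mention; the example segments are invented.

      # Segment-level inverted index with object-wise aggregation of matches.
      from collections import defaultdict

      segments = [
          ("TP53", "TP53 induces apoptosis in response to DNA damage."),
          ("BRCA1", "BRCA1 mutations increase breast cancer risk."),
          ("TP53", "Loss of TP53 is frequent in many tumours."),
      ]

      # Inverted index: token -> ids of segments containing it.
      index = defaultdict(set)
      for seg_id, (_, text) in enumerate(segments):
          for token in text.lower().rstrip(".").split():
              index[token].add(seg_id)

      def search(query):
          """Return objects with their matching segments for a keyword query."""
          terms = query.lower().split()
          hits = set.intersection(*(index.get(t, set()) for t in terms)) if terms else set()
          grouped = defaultdict(list)
          for seg_id in hits:
              obj, text = segments[seg_id]
              grouped[obj].append(text)
          return dict(grouped)

      print(search("TP53 apoptosis"))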

  15. Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

    Science.gov (United States)

    Alghoson, Abdullah

    2017-01-01

    Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…

  16. The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews

    OpenAIRE

    Bramer, Wichor M; Giustini, Dean; Kramer, Bianca MR; Anderson, PF

    2013-01-01

    Background The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here by testing whether GS can locate all studies included in 21 previously published SRs. Second, it examines the recall of GS, taking into account the maximum number of items that can be viewed, and te...

  17. Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature.

    Science.gov (United States)

    Müller, H-M; Van Auken, K M; Li, Y; Sternberg, P W

    2018-03-09

    The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements

  18. BioN∅T: A searchable database of biomedical negated sentences

    Directory of Open Access Journals (Sweden)

    Agarwal Shashank

    2011-10-01

    Full Text Available Abstract Background Negated biomedical events are often ignored by text-mining applications; however, such events carry scientific significance. We report on the development of BioN∅T, a database of negated sentences that can be used to extract such negated events. Description Currently BioN∅T incorporates ≈32 million negated sentences, extracted from over 336 million biomedical sentences from three resources: ≈2 million full-text biomedical articles in Elsevier and the PubMed Central, as well as ≈20 million abstracts in PubMed. We evaluated BioN∅T on three important genetic disorders: autism, Alzheimer's disease and Parkinson's disease, and found that BioN∅T is able to capture negated events that may be ignored by experts. Conclusions The BioN∅T database can be a useful resource for biomedical researchers. BioN∅T is freely available at http://bionot.askhermes.org/. In future work, we will develop semantic web related technologies to enrich BioN∅T.

  19. search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information.

    Science.gov (United States)

    Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Siążnik, Artur

    2013-03-01

    Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user's query, advanced data searching based on the specified user's query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. search GenBank extends standard capabilities of the
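
    For context, the sketch below calls one of the public NCBI eUtils endpoints (ESearch) directly over HTTP, the kind of Web-service building block that search GenBank orchestrates; the query term is illustrative and this is not the authors' code.

      # Query NCBI ESearch over HTTP and print the matching record identifiers.
      import json
      from urllib.parse import urlencode
      from urllib.request import urlopen

      params = urlencode({
          "db": "nucleotide",
          "term": "BRCA1[Gene] AND Homo sapiens[Organism]",
          "retmode": "json",
          "retmax": 5,
      })
      url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params

      with urlopen(url) as response:
          result = json.load(response)

      print(result["esearchresult"]["idlist"])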

  20. BioCarian: search engine for exploratory searches in heterogeneous biological databases.

    Science.gov (United States)

    Zaki, Nazar; Tennakoon, Chandana

    2017-10-02

    There are a large number of biological databases publicly available to scientists on the web. There are also many private databases generated in the course of research projects. These databases come in a wide variety of formats. Web standards have evolved in recent times, and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, the integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into the Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets: it allows complex queries to be constructed and has additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value, and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases. Using this feature, users are able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search
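
    As a hedged illustration of the tabular-to-RDF conversion and facet counting described above (not BioCarian itself), the sketch below builds a tiny RDF graph with the third-party rdflib package and aggregates a facet with SPARQL; the predicates and rows are invented.

      # Convert tabular rows to RDF triples and count facet values with SPARQL.
      from rdflib import Graph, Namespace, Literal

      EX = Namespace("http://example.org/bio/")   # placeholder namespace
      rows = [
          ("gene1", "TP53", "Homo sapiens"),
          ("gene2", "BRCA1", "Homo sapiens"),
          ("gene3", "trp53", "Mus musculus"),
      ]

      g = Graph()
      for row_id, symbol, organism in rows:
          subject = EX[row_id]
          g.add((subject, EX.symbol, Literal(symbol)))
          g.add((subject, EX.organism, Literal(organism)))

      # Facet-style aggregation: how many entries per organism?
      facet_counts = g.query("""
          PREFIX ex: <http://example.org/bio/>
          SELECT ?organism (COUNT(?s) AS ?n)
          WHERE { ?s ex:organism ?organism . }
          GROUP BY ?organism
      """)
      for organism, n in facet_counts:
          print(organism, n)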

  1. [Biomedical information on the internet using search engines. A one-year trial].

    Science.gov (United States)

    Corrao, Salvatore; Leone, Francesco; Arnone, Sabrina

    2004-01-01

    The internet is a communication medium and content distributor that provides information in the general sense, but it can be of great utility for the search and retrieval of biomedical information. Search engines are a great help in finding information rapidly on the net. However, we do not know whether general search engines and meta-search engines are reliable for finding useful and validated biomedical information. The aim of our study was to verify the reproducibility of a search by key-words (pediatric or evidence) using 9 international search engines and 1 meta-search engine at baseline and after a one-year period. We analysed the first 20 citations in the output of each search. We evaluated the formal quality of Web-sites and their domain extensions. Moreover, we compared the output of each search at the start of this study and after a one-year period, taking the number of Web-sites cited again as a criterion of reliability. We found some interesting results that are reported throughout the text. Our findings point to the extreme dynamicity of information on the Web and, for this reason, we advise great caution when using search and meta-search engines as tools for searching for and retrieving reliable biomedical information. On the other hand, some search and meta-search engines can be very useful as a first step for better defining a search and, moreover, for finding institutional Web-sites. This paper promotes a more conscious approach to the universe of biomedical information on the internet.

  2. Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships.

    Science.gov (United States)

    Zong, Nansu; Lee, Sungin; Ahn, Jinhyun; Kim, Hong-Gee

    2017-08-01

    Keyword-based entity search restricts the search space based on the search preference. When the given keywords and preferences are not related to the same biomedical topic, existing biomedical Linked Data search engines fail to deliver satisfactory results. This research aims to tackle this issue by supporting an inter-topic search, that is, improving search when the inputs (keywords and preferences) fall under different topics. This study developed an effective algorithm in which the relations between biomedical entities are used in tandem with a keyword-based entity search, Siren. The algorithm, PERank, which is an adaptation of Personalized PageRank (PPR), uses a pair of inputs, (1) search preferences and (2) entities from a keyword-based entity search with a keyword query, to formalize the search results on the fly based on an index of precomputed Individual Personalized PageRank Vectors (IPPVs). Our experiments were performed over ten linked life datasets for two query sets, one with keyword-preference topic correspondence (intra-topic search) and the other without (inter-topic search). The experiments showed that the proposed method achieved better search results, for example a 14% increase in precision for the inter-topic search over the baseline keyword-based search engine. The proposed method improved keyword-based biomedical entity search by supporting inter-topic search, without affecting intra-topic search, based on the relations between different entities. Copyright © 2017 Elsevier Ltd. All rights reserved.
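
    The sketch below illustrates the underlying idea of Personalized PageRank over an entity graph, not the PERank implementation itself; it uses the third-party networkx package, and the entities, relations and preference are invented.

      # Rank keyword-search candidates with a preference-biased (personalized) PageRank.
      import networkx as nx

      # Toy entity graph linking drugs, genes and diseases across topics.
      G = nx.Graph()
      G.add_edges_from([
          ("aspirin", "PTGS2"),
          ("PTGS2", "inflammation"),
          ("inflammation", "rheumatoid arthritis"),
          ("TP53", "cancer"),
          ("aspirin", "cancer"),
      ])

      # Preference: bias the random walk toward the disease topic of interest.
      preference = {"rheumatoid arthritis": 1.0}
      scores = nx.pagerank(G, alpha=0.85, personalization=preference)

      candidates = ["aspirin", "TP53", "PTGS2"]   # entities returned by a keyword search
      for entity in sorted(candidates, key=scores.get, reverse=True):
          print(entity, round(scores[entity], 4))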

  3. PubMed and beyond: a survey of web tools for searching biomedical literature

    Science.gov (United States)

    Lu, Zhiyong

    2011-01-01

    The past decade has witnessed the modern advances of high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools for helping users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and quality of various Web tools that provide comparable literature search service to PubMed. In this study, we review 28 such tools, highlight their respective innovations, compare them to the PubMed system and one another, and discuss directions for future development. Furthermore, we have built a website dedicated to tracking existing systems and future advances in the field of biomedical literature search. Taken together, our work serves information seekers in choosing tools for their needs and service providers and developers in keeping current in the field. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search PMID:21245076

  4. PubMed and beyond: a survey of web tools for searching biomedical literature.

    Science.gov (United States)

    Lu, Zhiyong

    2011-01-01

    The past decade has witnessed the modern advances of high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools for helping users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and quality of various Web tools that provide comparable literature search service to PubMed. In this study, we review 28 such tools, highlight their respective innovations, compare them to the PubMed system and one another, and discuss directions for future development. Furthermore, we have built a website dedicated to tracking existing systems and future advances in the field of biomedical literature search. Taken together, our work serves information seekers in choosing tools for their needs and service providers and developers in keeping current in the field. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search.

  5. Fast Structural Search in Phylogenetic Databases

    Directory of Open Access Journals (Sweden)

    William H. Piel

    2005-01-01

    Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The "closeness" is a measure of the topological relationships in P that are found to be the same or similar in a tree in D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
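
    As a rough illustration of a structural "closeness" measure (not the authors' algorithm), the sketch below compares two small rooted trees, given as nested tuples, by the fraction of clades (leaf sets) they share.

      # Simple clade-set similarity between two rooted trees encoded as nested tuples.
      def clades(tree):
          """Return (set of clades as frozensets of leaf labels, leaf set of this subtree)."""
          if isinstance(tree, str):                       # a leaf
              return set(), frozenset([tree])
          found, leaves = set(), frozenset()
          for child in tree:
              child_clades, child_leaves = clades(child)
              found |= child_clades
              leaves |= child_leaves
          found.add(leaves)
          return found, leaves

      def similarity(t1, t2):
          c1, _ = clades(t1)
          c2, _ = clades(t2)
          return len(c1 & c2) / len(c1 | c2) if (c1 | c2) else 1.0

      query_tree = (("A", "B"), ("C", "D"))
      database_tree = (("A", "B"), ("C", ("D", "E")))
      print(round(similarity(query_tree, database_tree), 2))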

  6. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases

    DEFF Research Database (Denmark)

    Liljekvist, Mads Svane; Andresen, Kristoffer; Pommergaard, Hans-Christian

    2015-01-01

    biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ were gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more database, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ...... journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received impact factor for 2012 compared...

  7. Database Search Engines: Paradigms, Challenges and Solutions.

    Science.gov (United States)

    Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

  8. An integrated biomedical knowledge extraction and analysis platform: using federated search and document clustering technology.

    Science.gov (United States)

    Taylor, Donald P

    2007-01-01

    High content screening (HCS) requires time-consuming and often complex iterative information retrieval and assessment approaches to optimally conduct drug discovery programs and biomedical research. Pre- and post-HCS experimentation both require the retrieval of information from public as well as proprietary literature in addition to structured information assets such as compound libraries and projects databases. Unfortunately, this information is typically scattered across a plethora of proprietary bioinformatics tools and databases and public domain sources. Consequently, single search requests must be presented to each information repository, forcing the results to be manually integrated for a meaningful result set. Furthermore, these bioinformatics tools and data repositories are becoming increasingly complex to use; typically they fail to allow for more natural query interfaces. Vivisimo has developed an enterprise software platform to bridge disparate silos of information. The platform automatically categorizes search results into descriptive folders without the use of taxonomies to drive the categorization. A new approach to information retrieval for HCS experimentation is proposed.

  9. Interactive searching of facial image databases

    Science.gov (United States)

    Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

    1995-09-01

    A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm which can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method which does not require the entry of human descriptors is needed. A genetic search algorithm is being tested for this purpose.
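
    A hedged toy sketch of a genetic search over an encoded face database in the spirit described above (not the actual system): a candidate feature vector is evolved from similarity feedback and then matched against the database; the feature encoding and the hidden "target" standing in for witness ratings are invented.

      # Toy genetic search driven by similarity feedback over an encoded image database.
      import random

      random.seed(1)
      FEATURES, DB_SIZE = 8, 200
      database = [[random.random() for _ in range(FEATURES)] for _ in range(DB_SIZE)]
      target = database[42]   # stands in for the face the witness has in mind

      def feedback(vec):
          """Stand-in for witness similarity ratings: closer to the target scores higher."""
          return -sum((a - b) ** 2 for a, b in zip(vec, target))

      def crossover(a, b):
          return [random.choice(pair) for pair in zip(a, b)]

      def mutate(vec, rate=0.2):
          return [v + random.gauss(0, 0.05) if random.random() < rate else v for v in vec]

      population = [[random.random() for _ in range(FEATURES)] for _ in range(12)]
      for _ in range(100):
          population.sort(key=feedback, reverse=True)
          parents = population[:6]
          population = parents + [
              mutate(crossover(random.choice(parents), random.choice(parents)))
              for _ in range(6)
          ]

      best = max(population, key=feedback)
      # Retrieve the database image closest to the evolved descriptor vector.
      nearest = min(
          range(DB_SIZE),
          key=lambda i: sum((a - b) ** 2 for a, b in zip(database[i], best)),
      )
      print("Closest database image:", nearest)   # the hidden target sits at index 42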

  10. Oracle Database 10g: a platform for BLAST search and Regular Expression pattern matching in life sciences.

    Science.gov (United States)

    Stephens, Susie M; Chen, Jake Y; Davidson, Marcel G; Thomas, Shiby; Trute, Barry M

    2005-01-01

    As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and consequently can be used as flexible platforms for the implementation of knowledgebases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disks to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and Regular Expression Searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.html.

  11. Electronic biomedical literature search for budding researcher.

    Science.gov (United States)

    Thakre, Subhash B; Thakre, Sushama S; Thakre, Amol D

    2013-09-01

    Searching for specific and well-defined literature related to the subject of interest is the foremost step in research. When we are familiar with a topic or subject, we can frame an appropriate research question, which is the basis for the study objectives and hypothesis. The Internet provides quick access to an overabundance of medical literature, in the form of primary, secondary and tertiary literature. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. A web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, and other types of files. Search engines for internet-based searches of the medical literature include Google, Google Scholar, Scirus, Yahoo, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web libraries (National Library of Medicine, Cochrane, Web of Science, Medical Matrix, Emory Libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine/database while searching for a particular type of data. Knowledge about the types of literature and levels of evidence, and details about the features of a search engine, such as availability, user interface, ease of access, reputable content, and period of time covered, allows their optimal use and maximal utility in the field of medicine. Literature searching is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. It is suggested that a systematic literature search that uses available electronic resources effectively is more likely to produce quality research.

  12. NBIC: Search Ballast Report Database

    Science.gov (United States)

    The National Ballast Information Clearinghouse (NBIC), a joint program of the Smithsonian Environmental Research Center and the US Coast Guard, has developed an online database that can be queried through its website. Great Lakes data have been incorporated into the NBIC database as of August 2004.

  13. SciRide Finder: a citation-based paradigm in biomedical literature search.

    Science.gov (United States)

    Volanakis, Adam; Krawczyk, Konrad

    2018-04-18

    There are more than 26 million peer-reviewed biomedical research items according to Medline/PubMed. This breadth of information is indicative of the progress in biomedical sciences on one hand, but an overload for scientists performing literature searches on the other. A major portion of scientific literature search is to find statements, numbers and protocols that can be cited to build an evidence-based narrative for a new manuscript. Because science builds on prior knowledge, such information has likely been written out and cited in an older manuscript. Thus, Cited Statements, pieces of text from scientific literature supported by citing other peer-reviewed publications, carry a significant amount of condensed information on prior art. Based on this principle, we propose a literature search service, SciRide Finder (finder.sciride.org), which constrains the search corpus to such Cited Statements only. We demonstrate that Cited Statements can carry different information from that found in titles/abstracts and full text, giving access to alternative literature search results than traditional search engines provide. We further show how presenting search results as a list of Cited Statements allows researchers to easily find information to build an evidence-based narrative for their own manuscripts.

  14. An improved rank based disease prediction using web navigation patterns on bio-medical databases

    Directory of Open Access Journals (Sweden)

    P. Dhanalakshmi

    2017-12-01

    Full Text Available Applying machine learning techniques to on-line biomedical databases is a challenging task, as the data are collected from a large number of sources and are multi-dimensional. Also, retrieval of relevant documents from a large repository such as a gene document collection takes more processing time and yields an increased false positive rate. Generally, the extraction of biomedical documents is based on the stream of prior observations of gene parameters taken at different time periods. Traditional web usage models such as Markov, Bayesian and clustering models are sensitive when analyzing user navigation patterns and session identification in online biomedical databases. Moreover, most document ranking models for biomedical databases are sensitive to sparsity and outliers. In this paper, a novel user recommendation system was implemented to predict the top-ranked biomedical documents using the disease type, gene entities and user navigation patterns. In this recommendation system, dynamic session identification, dynamic user identification and document ranking techniques were used to extract the most relevant disease documents from the online PubMed repository. To verify the performance of the proposed model, the true positive rate and runtime of the model were compared with those of traditional static models such as Bayesian and fuzzy rank models. Experimental results show that the performance of the proposed ranking model is better than that of the traditional models.

  15. Data integration and knowledge discovery in biomedical databases. Reliable information from unreliable sources

    Directory of Open Access Journals (Sweden)

    A Mitnitski

    2003-01-01

    Full Text Available To better understand information about human health from databases we analyzed three datasets collected for different purposes in Canada: a biomedical database of older adults, a large population survey across all adult ages, and vital statistics. Redundancy in the variables was established, and this led us to derive a generalized (macroscopic) state variable, being a fitness/frailty index that reflects both individual and group health status. Evaluation of the relationship between fitness/frailty and the mortality rate revealed that the latter could be expressed in terms of variables generally available from any cross-sectional database. In practical terms, this means that the risk of mortality might readily be assessed from standard biomedical appraisals collected for other purposes.
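
    For concreteness, the sketch below computes a simple deficit-count fitness/frailty index (the proportion of measured health deficits that are present), which is the general form of index referred to above; the variables and values are invented and are not taken from the study's datasets.

      # Deficit-count frailty index: share of measured deficits that are present.
      def frailty_index(deficits):
          """deficits: dict mapping variable name -> 1 if the deficit is present, else 0."""
          return sum(deficits.values()) / len(deficits)

      person = {
          "impaired_vision": 1,
          "hypertension": 1,
          "diabetes": 0,
          "mobility_problem": 0,
          "memory_complaint": 1,
      }
      print(round(frailty_index(person), 2))   # 3 deficits out of 5 -> 0.6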

  16. A unified architecture for biomedical search engines based on semantic web technologies.

    Science.gov (United States)

    Jalali, Vahid; Matash Borujerdi, Mohammad Reza

    2011-04-01

    There has been huge growth in the volume of published biomedical research in recent years. Many medical search engines have been designed and developed to address the ever-growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of a common architecture for the ontologies used and for the overall retrieval process hampers the evaluation of different search engines, and interoperability between them, under unified conditions. In this paper, a unified architecture for medical search engines is introduced. The proposed model contains standard schemas, declared in semantic web languages, for the ontologies and documents used by search engines. Unified models for the annotation and retrieval processes are other parts of the introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture. The search engine is evaluated using two test collections, and the results are reported in terms of precision vs. recall and mean average precision for the different approaches used by this search engine.

  17. Biomedical databases: protecting privacy and promoting research.

    Science.gov (United States)

    Wylie, Jean E; Mineau, Geraldine P

    2003-03-01

    When combined with medical information, large electronic databases of information that identify individuals provide superlative resources for genetic, epidemiology and other biomedical research. Such research resources increasingly need to balance the protection of privacy and confidentiality with the promotion of research. Models that do not allow the use of such individual-identifying information constrain research; models that involve commercial interests raise concerns about what type of access is acceptable. Researchers, individuals representing the public interest and those developing regulatory guidelines must be involved in an ongoing dialogue to identify practical models.

  18. Literature searches on Ayurveda: An update.

    Science.gov (United States)

    Aggithaya, Madhur G; Narahari, Saravu R

    2015-01-01

    In recent years, the journals that publish on Ayurveda have increasingly been indexed by popular medical databases. However, many Eastern journals are not indexed in biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased, dedicated databases for Ayurvedic literature. In 2010, the authors identified 46 databases that can be used for systematic searches of Ayurvedic papers and theses. This update reviews our previous recommendation and identifies current and relevant databases, with the aim of updating the Ayurveda literature search strategy to retrieve the maximum number of publications. The authors used psoriasis as an example to search the previously listed databases and identify new ones. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. The current citation update status, search results, and search options of the previous databases were assessed. Eight search strategies were developed. One hundred and five journals, both biomedical and Ayurvedic, which publish on Ayurveda, were identified. Variability in the databases was explored to identify bias in journal citation. Five among the 46 databases are now relevant - AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and the Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows complex search strategies. "The Researches in Ayurveda" and the "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusively Ayurveda journals are indexed in PubMed. The AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and the four other databases because all five carry citations from different groups of journals. The hand

  19. Search pattern of databases by the undergraduate students of ...

    African Journals Online (AJOL)

    The main objective of this study is to assess the awareness and search pattern of databases in order to determine the extent to which users are aware of and search for databases, by examining the relationship between their awareness and search patterns of databases and their information literacy skills. The methodology ...

  20. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. Copyright © 2017 John Wiley & Sons, Inc.
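
    In the spirit of the unit's search_demo idea, the hedged sketch below loads a few similarity-search hits into a relational table and summarizes significant homologs per query with SQL, using Python's built-in sqlite3; the schema, accessions and E-values are illustrative, not the unit's actual database.

      # Store similarity-search hits in SQL and summarize homologs per query and organism.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      cur = conn.cursor()
      cur.execute("""
          CREATE TABLE hits (
              query_acc   TEXT,
              subject_acc TEXT,
              organism    TEXT,
              evalue      REAL
          )
      """)
      cur.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", [
          ("P0A7G6", "Q9X0X0", "Thermotoga maritima", 1e-35),
          ("P0A7G6", "P0A7G7", "Salmonella enterica", 1e-120),
          ("P0AES4", "Q8ZP25", "Salmonella enterica", 1e-80),
      ])

      # Significant homologs per query, grouped by organism.
      cur.execute("""
          SELECT query_acc, organism, COUNT(*) AS n_hits, MIN(evalue) AS best_evalue
          FROM hits
          WHERE evalue < 1e-10
          GROUP BY query_acc, organism
          ORDER BY query_acc, best_evalue
      """)
      for row in cur.fetchall():
          print(row)
      conn.close()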

  1. Molecule database framework: a framework for creating database applications with chemical structure search capability.

    Science.gov (United States)

    Kiener, Joos

    2013-12-11

    Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However, for specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions carry the risk of vendor lock-in and may require an expensive license for a proprietary relational database management system. To speed up and simplify the development of applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls. Therefore software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and I created it by integrating existing free and open-source tools and frameworks. The core functionality includes: support for multi-component compounds (mixtures); import and export of SD-files; and optional security (authorization). For chemical structure searching, Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method-level security. Molecule Database Framework supports multi-component chemical compounds (mixtures). Furthermore, the design of entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used. I then benchmarked this example application to create some basic performance expectations for chemical structure searches and for import and export of SD-files. By using a simple web application it was shown that Molecule Database Framework
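
    The framework itself is Java code built on the Bingo cartridge for PostgreSQL. Purely to illustrate what "abstracting chemical structure search into method calls" looks like in practice, here is a small, unrelated sketch using the open-source RDKit toolkit in Python; RDKit, the SMILES strings, and the SMARTS pattern are assumptions of the example, not parts of Molecule Database Framework.

      # Illustrative substructure search with RDKit (not Molecule Database Framework).
      from rdkit import Chem

      # Tiny in-memory "database" of molecules given as SMILES strings
      database = {
          "aspirin":   "CC(=O)OC1=CC=CC=C1C(=O)O",
          "caffeine":  "CN1C=NC2=C1C(=O)N(C)C(=O)N2C",
          "ibuprofen": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
      }

      # Query: a carboxylic acid substructure, written as SMARTS
      query = Chem.MolFromSmarts("C(=O)[OH]")

      for name, smiles in database.items():
          mol = Chem.MolFromSmiles(smiles)
          if mol is not None and mol.HasSubstructMatch(query):
              print(name, "contains a carboxylic acid group")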

  2. Developing a search engine for pharmacotherapeutic information that is not published in biomedical journals.

    Science.gov (United States)

    Do Pazo-Oubiña, F; Calvo Pita, C; Puigventós Latorre, F; Periañez-Párraga, L; Ventayol Bosch, P

    2011-01-01

    To identify publishers of pharmacotherapeutic information not found in biomedical journals that focuses on evaluating and providing advice on medicines and to develop a search engine to access this information. Compiling web sites that publish information on the rational use of medicines and have no commercial interests. Free-access web sites in Spanish, Galician, Catalan or English. Designing a search engine using the Google "custom search" application. Overall 159 internet addresses were compiled and were classified into 9 labels. We were able to recover the information from the selected sources using a search engine, which is called "AlquimiA" and available from http://www.elcomprimido.com/FARHSD/AlquimiA.htm. The main sources of pharmacotherapeutic information not published in biomedical journals were identified. The search engine is a useful tool for searching and accessing "grey literature" on the internet. Copyright © 2010 SEFH. Published by Elsevier Espana. All rights reserved.

  3. Specialist Bibliographic Databases.

    Science.gov (United States)

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  4. Specialist Bibliographic Databases

    Science.gov (United States)

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  5. How to Search, Write, Prepare and Publish the Scientific Papers in the Biomedical Journals

    Science.gov (United States)

    Masic, Izet

    2011-01-01

    This article describes the methodology of preparing, writing and publishing scientific papers in biomedical journals. A concise overview is given of the concept and structure of the System of biomedical scientific and technical information and of the way biomedical literature is retrieved from worldwide biomedical databases. The scientific and professional medical journals currently published in Bosnia and Herzegovina are described. Also given is a comparative review of the number and structure of papers published in indexed journals from Bosnia and Herzegovina that are listed in the MEDLINE database. Three B&H journals indexed in the MEDLINE database are analyzed for 2010: Medical Archives (Medicinski Arhiv), Bosnian Journal of Basic Medical Sciences and Medical Gazette (Medicinski Glasnik). The largest number of original papers was published in the Medical Archives. There is a statistically significant difference in the number of papers published by local authors relative to international journals, in favor of the Medical Archives. However, the Bosnian Journal of Basic Medical Sciences does not categorize its articles, so we could not make comparisons. By percentage, the Medical Archives and the Bosnian Journal of Basic Medical Sciences published the largest number of articles by authors from Sarajevo and Tuzla, the two oldest and largest university medical centers in Bosnia and Herzegovina. The author believes that it is necessary to make qualitative changes in the reception and reviewing of papers for publication in biomedical journals published in Bosnia and Herzegovina, which should be the responsibility of a separate scientific authority/committee composed of experts in the field of medicine at the state level. PMID:23572850

  6. How to search, write, prepare and publish the scientific papers in the biomedical journals.

    Science.gov (United States)

    Masic, Izet

    2011-06-01

    This article describes the methodology of preparing, writing and publishing scientific papers in biomedical journals. A concise overview is given of the concept and structure of the System of biomedical scientific and technical information and of the way biomedical literature is retrieved from worldwide biomedical databases. The scientific and professional medical journals currently published in Bosnia and Herzegovina are described. Also given is a comparative review of the number and structure of papers published in indexed journals from Bosnia and Herzegovina that are listed in the MEDLINE database. Three B&H journals indexed in the MEDLINE database are analyzed for 2010: Medical Archives (Medicinski Arhiv), Bosnian Journal of Basic Medical Sciences and Medical Gazette (Medicinski Glasnik). The largest number of original papers was published in the Medical Archives. There is a statistically significant difference in the number of papers published by local authors relative to international journals, in favor of the Medical Archives. However, the Bosnian Journal of Basic Medical Sciences does not categorize its articles, so we could not make comparisons. By percentage, the Medical Archives and the Bosnian Journal of Basic Medical Sciences published the largest number of articles by authors from Sarajevo and Tuzla, the two oldest and largest university medical centers in Bosnia and Herzegovina. The author believes that it is necessary to make qualitative changes in the reception and reviewing of papers for publication in biomedical journals published in Bosnia and Herzegovina, which should be the responsibility of a separate scientific authority/committee composed of experts in the field of medicine at the state level.

  7. G-Bean: an ontology-graph based web tool for biomedical literature retrieval.

    Science.gov (United States)

    Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S

    2014-01-01

    Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean
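
    G-Bean's expansion step ranks UMLS concepts with a Personalized PageRank over an ontology graph. A rough, generic sketch of that idea is shown below using the networkx library and a toy concept graph; the graph, the damping factor, and the concept names are invented for illustration and are not G-Bean's ontology or code.

      # Toy sketch of query expansion via Personalized PageRank on a concept graph.
      import networkx as nx

      # Tiny, invented concept graph: edges connect related biomedical concepts
      G = nx.Graph()
      G.add_edges_from([
          ("psoriasis", "skin disease"),
          ("skin disease", "dermatitis"),
          ("psoriasis", "methotrexate"),
          ("methotrexate", "immunosuppressant"),
          ("dermatitis", "eczema"),
      ])

      # Personalization vector: random-walk restarts happen only at the query concept
      scores = nx.pagerank(G, alpha=0.85, personalization={"psoriasis": 1.0})

      # Highest-scoring concepts (beyond the query itself) become expansion terms
      for concept, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:4]:
          print(f"{concept}: {score:.3f}")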

  8. MetaboSearch: tool for mass-based metabolite identification using multiple databases.

    Directory of Open Access Journals (Sweden)

    Bin Zhou

    Full Text Available Searching metabolites against databases according to their masses is often the first step in metabolite identification for a mass spectrometry-based untargeted metabolomics study. Major metabolite databases include Human Metabolome DataBase (HMDB, Madison Metabolomics Consortium Database (MMCD, Metlin, and LIPID MAPS. Since each one of these databases covers only a fraction of the metabolome, integration of the search results from these databases is expected to yield a more comprehensive coverage. However, the manual combination of multiple search results is generally difficult when identification of hundreds of metabolites is desired. We have implemented a web-based software tool that enables simultaneous mass-based search against the four major databases, and the integration of the results. In addition, more complete chemical identifier information for the metabolites is retrieved by cross-referencing multiple databases. The search results are merged based on IUPAC International Chemical Identifier (InChI keys. Besides a simple list of m/z values, the software can accept the ion annotation information as input for enhanced metabolite identification. The performance of the software is demonstrated on mass spectrometry data acquired in both positive and negative ionization modes. Compared with search results from individual databases, MetaboSearch provides better coverage of the metabolome and more complete chemical identifier information.The software tool is available at http://omics.georgetown.edu/MetaboSearch.html.

  9. Quantum search of a real unstructured database

    Science.gov (United States)

    Broda, Bogusław

    2016-02-01

    A simple circuit implementation of the oracle for Grover's quantum search of a real unstructured classical database is proposed. The oracle contains a kind of quantumly accessible classical memory, which stores the database.
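
    The record describes a quantum circuit; as a purely classical illustration of why Grover's algorithm needs only on the order of √N oracle calls, the following numpy sketch simulates the amplitude amplification for a small unstructured database. This is a state-vector toy, not the paper's oracle construction.

      # Classical state-vector sketch of Grover amplitude amplification.
      import math
      import numpy as np

      N = 64                                 # database size
      marked = 37                            # index of the single marked record
      amps = np.full(N, 1 / math.sqrt(N))    # uniform superposition over all records

      iterations = int(round(math.pi / 4 * math.sqrt(N)))   # ~ (pi/4) * sqrt(N)
      for _ in range(iterations):
          amps[marked] *= -1                 # oracle: flip the phase of the marked item
          amps = 2 * amps.mean() - amps      # diffusion: inversion about the mean

      print("iterations:", iterations)
      print("probability of measuring the marked item:", round(amps[marked] ** 2, 4))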

  10. The development of large-scale de-identified biomedical databases in the age of genomics-principles and challenges.

    Science.gov (United States)

    Dankar, Fida K; Ptitsyn, Andrey; Dankar, Samar K

    2018-04-10

    Contemporary biomedical databases include a wide range of information types from various observational and instrumental sources. Among the most important features that unite biomedical databases across the field are high volume of information and high potential to cause damage through data corruption, loss of performance, and loss of patient privacy. Thus, issues of data governance and privacy protection are essential for the construction of data depositories for biomedical research and healthcare. In this paper, we discuss various challenges of data governance in the context of population genome projects. The various challenges along with best practices and current research efforts are discussed through the steps of data collection, storage, sharing, analysis, and knowledge dissemination.

  11. The Development of a Combined Search for a Heterogeneous Chemistry Database

    Directory of Open Access Journals (Sweden)

    Lulu Jiang

    2015-05-01

    Full Text Available A combined search, which joins a slow molecule structure search with a fast compound property search, yields more accurate results and has been applied in several chemistry databases. However, the difference in search speeds and the merging of the two separate result sets are two major challenges. In this paper, two kinds of search strategies, synchronous search and asynchronous search, are proposed to solve these problems for the heterogeneous structure database and property database found in ChemDB, a chemistry database owned by the Institute of Process Engineering, CAS. Their advantages and disadvantages under different conditions are discussed in detail. Furthermore, we applied these two searches to ChemDB and used them to screen for potential molecules that can work as CO2 absorbents. The results reveal that this combined search discovers reasonable target molecules within an acceptable time frame.
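
    As a generic sketch of the synchronous strategy (dispatch both searches, wait for both, then intersect the result sets on a shared compound identifier), the following uses Python's concurrent.futures with stand-in search functions; the timings, identifiers, and query parameters are placeholders and the code is not ChemDB's implementation.

      # Sketch of a combined search: a slow structure search and a fast property search
      # run concurrently, then the result sets are intersected on compound id.
      import time
      from concurrent.futures import ThreadPoolExecutor

      def structure_search(query_smarts):
          time.sleep(1.0)                    # stand-in for a slow substructure match
          return {"C001", "C002", "C005"}

      def property_search(max_boiling_point):
          time.sleep(0.1)                    # stand-in for a fast indexed property lookup
          return {"C002", "C003", "C005", "C009"}

      with ThreadPoolExecutor(max_workers=2) as pool:
          structure_future = pool.submit(structure_search, "[NX3]")
          property_future = pool.submit(property_search, 150.0)
          combined = structure_future.result() & property_future.result()

      print("compounds matching both criteria:", sorted(combined))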

  12. Searching Harvard Business Review Online. . . Lessons in Searching a Full Text Database.

    Science.gov (United States)

    Tenopir, Carol

    1985-01-01

    This article examines the Harvard Business Review Online (HBRO) database (bibliographic description fields, abstracts, extracted information, full text, subject descriptors) and reports on 31 sample HBRO searches conducted in Bibliographic Retrieval Services to test differences between searching full text and searching bibliographic record. Sample…

  13. muBLASTP: database-indexed protein sequence search on multicore CPUs.

    Science.gov (United States)

    Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

    2016-11-04

    The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to the different challenges and characteristics of query indexing and database indexing, the existing techniques for query-indexed search cannot be applied directly to database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for the alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for the alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations of the BLASTP algorithm, we re-factored BLASTP for modern multicore processors and achieve much higher throughput with an acceptable memory footprint for the database index.

  14. Searching mixed DNA profiles directly against profile databases.

    Science.gov (United States)

    Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John

    2014-03-01

    DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper, empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions, with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Dynamic tables: an architecture for managing evolving, heterogeneous biomedical data in relational database management systems.

    Science.gov (United States)

    Corwin, John; Silberschatz, Avi; Miller, Perry L; Marenco, Luis

    2007-01-01

    Data sparsity and schema evolution issues affecting clinical informatics and bioinformatics communities have led to the adoption of vertical or object-attribute-value-based database schemas to overcome limitations posed when using conventional relational database technology. This paper explores these issues and discusses why biomedical data are difficult to model using conventional relational techniques. The authors propose a solution to these obstacles based on a relational database engine using a sparse, column-store architecture. The authors provide benchmarks comparing the performance of queries and schema-modification operations using three different strategies: (1) the standard conventional relational design; (2) past approaches used by biomedical informatics researchers; and (3) their sparse, column-store architecture. The performance results show that their architecture is a promising technique for storing and processing many types of data that are not handled well by the other two semantic data models.
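
    The vertical (entity-attribute-value) design that the paper contrasts with its sparse column-store can be sketched generically: every fact is a row, so new attributes never require a schema change, at the cost of pivoting at query time. The schema and values below are illustrative only and are not the authors' benchmark setup.

      # Generic entity-attribute-value (EAV) sketch for sparse, evolving biomedical data.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("""
          CREATE TABLE eav (
              entity_id TEXT,   -- e.g. a patient or sample identifier
              attribute TEXT,   -- attribute name; new attributes need no schema change
              value     TEXT
          )
      """)
      conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
          ("P001", "diagnosis", "psoriasis"),
          ("P001", "age", "54"),
          ("P002", "diagnosis", "eczema"),
          ("P002", "genotype_rs123", "AG"),   # sparse attribute present for one entity only
      ])

      # Pivot selected attributes back into a row-per-entity view
      for row in conn.execute("""
              SELECT entity_id,
                     MAX(CASE WHEN attribute = 'diagnosis' THEN value END) AS diagnosis,
                     MAX(CASE WHEN attribute = 'age'       THEN value END) AS age
              FROM eav GROUP BY entity_id ORDER BY entity_id"""):
          print(row)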

  16. MICA: desktop software for comprehensive searching of DNA databases

    Directory of Open Access Journals (Sweden)

    Glick Benjamin S

    2006-10-01

    Full Text Available Abstract Background Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion MICA is suitable as a search engine for desktop DNA analysis software.
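
    The core idea — index every fixed-length k-mer of the database, then expand a partially degenerate query into its concrete words before lookup — can be sketched in a few lines of Python. This toy handles only a subset of the 15-letter alphabet and queries of exactly k letters, and it is not MICA's compact-array implementation.

      # Toy k-mer index with support for partially degenerate (IUPAC) queries.
      from collections import defaultdict
      from itertools import product

      IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
               "R": "AG", "Y": "CT", "N": "ACGT"}   # subset of the full 15-letter code

      K = 4

      def build_index(sequence):
          index = defaultdict(list)
          for i in range(len(sequence) - K + 1):
              index[sequence[i:i + K]].append(i)
          return index

      def expand(query):
          """Expand a degenerate K-letter query into all concrete words."""
          return ("".join(word) for word in product(*(IUPAC[base] for base in query)))

      db_sequence = "ACGTTGCAACGTACGT"
      index = build_index(db_sequence)

      for word in expand("ACGY"):              # Y matches C or T
          for pos in index.get(word, []):
              print(f"{word} found at position {pos}")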

  17. An approach in building a chemical compound search engine in oracle database.

    Science.gov (United States)

    Wang, H; Volarath, P; Harrison, R

    2005-01-01

    Searching for or identifying chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and the database implementation. The database must process chemical structures, which demands approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in an Oracle database is described. The framework is devoted to eliminating data type constraints for potential search algorithms, which is a crucial step toward building a domain-specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.

  18. A scoping review of competencies for scientific editors of biomedical journals.

    Science.gov (United States)

    Galipeau, James; Barbour, Virginia; Baskin, Patricia; Bell-Syer, Sally; Cobey, Kelly; Cumpston, Miranda; Deeks, Jon; Garner, Paul; MacLehose, Harriet; Shamseer, Larissa; Straus, Sharon; Tugwell, Peter; Wager, Elizabeth; Winker, Margaret; Moher, David

    2016-02-02

    Biomedical journals are the main route for disseminating the results of health-related research. Despite this, their editors operate largely without formal training or certification. To our knowledge, no body of literature systematically identifying core competencies for scientific editors of biomedical journals exists. Therefore, we aimed to conduct a scoping review to determine what is known on the competency requirements for scientific editors of biomedical journals. We searched the MEDLINE®, Cochrane Library, Embase®, CINAHL, PsycINFO, and ERIC databases (from inception to November 2014) and conducted a grey literature search for research and non-research articles with competency-related statements (i.e. competencies, knowledge, skills, behaviors, and tasks) pertaining to the role of scientific editors of peer-reviewed health-related journals. We also conducted an environmental scan, searched the results of a previous environmental scan, and searched the websites of existing networks, major biomedical journal publishers, and organizations that offer resources for editors. A total of 225 full-text publications were included, 25 of which were research articles. We extracted a total of 1,566 statements possibly related to core competencies for scientific editors of biomedical journals from these publications. We then collated overlapping or duplicate statements which produced a list of 203 unique statements. Finally, we grouped these statements into seven emergent themes: (1) dealing with authors, (2) dealing with peer reviewers, (3) journal publishing, (4) journal promotion, (5) editing, (6) ethics and integrity, and (7) qualities and characteristics of editors. To our knowledge, this scoping review is the first attempt to systematically identify possible competencies of editors. Limitations are that (1) we may not have captured all aspects of a biomedical editor's work in our searches, (2) removing redundant and overlapping items may have led to the

  19. Searching the ASRS Database Using QUORUM Keyword Search, Phrase Search, Phrase Generation, and Phrase Discovery

    Science.gov (United States)

    McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)

    2001-01-01

    To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.

  20. When is a search not a search? A comparison of searching the AMED complementary health database via EBSCOhost, OVID and DIALOG.

    Science.gov (United States)

    Younger, Paula; Boddy, Kate

    2009-06-01

    The researchers involved in this study work at Exeter Health library and at the Complementary Medicine Unit, Peninsula School of Medicine and Dentistry (PCMD). Within this collaborative environment it is possible to access the electronic resources of three institutions. This includes access to AMED and other databases using different interfaces. The aim of this study was to investigate whether searching different interfaces to the AMED allied health and complementary medicine database produced the same results when using identical search terms. The following Internet-based AMED interfaces were searched: DIALOG DataStar; EBSCOhost and OVID SP_UI01.00.02. Search results from all three databases were saved in an endnote database to facilitate analysis. A checklist was also compiled comparing interface features. In our initial search, DIALOG returned 29 hits, OVID 14 and Ebsco 8. If we assume that DIALOG returned 100% of potential hits, OVID initially returned only 48% of hits and EBSCOhost only 28%. In our search, a researcher using the Ebsco interface to carry out a simple search on AMED would miss over 70% of possible search hits. Subsequent EBSCOhost searches on different subjects failed to find between 21 and 86% of the hits retrieved using the same keywords via DIALOG DataStar. In two cases, the simple EBSCOhost search failed to find any of the results found via DIALOG DataStar. Depending on the interface, the number of hits retrieved from the same database with the same simple search can vary dramatically. Some simple searches fail to retrieve a substantial percentage of citations. This may result in an uninformed literature review, research funding application or treatment intervention. In addition to ensuring that keywords, spelling and medical subject headings (MeSH) accurately reflect the nature of the search, database users should include wildcards and truncation and adapt their search strategy substantially to retrieve the maximum number of appropriate

  1. Phonetic search methods for large speech databases

    CERN Document Server

    Moyal, Ami; Tetariy, Ella; Gishri, Michal

    2013-01-01

    “Phonetic Search Methods for Large Databases” focuses on Keyword Spotting (KWS) within large speech databases. The brief will begin by outlining the challenges associated with Keyword Spotting within large speech databases using dynamic keyword vocabularies. It will then continue by highlighting the various market segments in need of KWS solutions, as well as, the specific requirements of each market segment. The work also includes a detailed description of the complexity of the task and the different methods that are used, including the advantages and disadvantages of each method and an in-depth comparison. The main focus will be on the Phonetic Search method and its efficient implementation. This will include a literature review of the various methods used for the efficient implementation of Phonetic Search Keyword Spotting, with an emphasis on the authors’ own research which entails a comparative analysis of the Phonetic Search method which includes algorithmic details. This brief is useful for resea...

  2. Optimal database combinations for literature searches in systematic reviews : a prospective exploratory study

    NARCIS (Netherlands)

    Bramer, W. M.; Rethlefsen, Melissa L.; Kleijnen, Jos; Franco, Oscar H.

    2017-01-01

    Background: Within systematic reviews, when searching for relevant references, it is advisable to use multiple databases. However, searching databases is laborious and time-consuming, as syntax of search strategies are database specific. We aimed to determine the optimal combination of databases

  3. Method and electronic database search engine for exposing the content of an electronic database

    NARCIS (Netherlands)

    Stappers, P.J.

    2000-01-01

    The invention relates to an electronic database search engine comprising an electronic memory device suitable for storing and releasing elements from the database, a display unit, a user interface for selecting and displaying at least one element from the database on the display unit, and control

  4. Protein structure database search and evolutionary classification.

    Science.gov (United States)

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  5. Forensic utilization of familial searches in DNA databases.

    Science.gov (United States)

    Gershaw, Cassandra J; Schweighardt, Andrew J; Rourke, Linda C; Wallace, Margaret M

    2011-01-01

    DNA evidence is widely recognized as an invaluable tool in the process of investigation and identification, as well as one of the most sought after types of evidence for presentation to a jury. In the United States, the development of state and federal DNA databases has greatly impacted the forensic community by creating an efficient, searchable system that can be used to eliminate or include suspects in an investigation based on matching DNA profiles - the profile already in the database to the profile of the unknown sample in evidence. Recent changes in legislation have begun to allow for the possibility to expand the parameters of DNA database searches, taking into account the possibility of familial searches. This article discusses prospective positive outcomes of utilizing familial DNA searches and acknowledges potential negative outcomes, thereby presenting both sides of this very complicated, rapidly evolving situation. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  6. MIDAS: a database-searching algorithm for metabolite identification in metabolomics.

    Science.gov (United States)

    Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle

    2014-10-07

    A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometer against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.
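
    The matching idea — how well do a metabolite's predicted fragment masses explain the measured MS/MS peaks — can be illustrated with a naive peak-counting sketch. The masses are invented and the score is a simple explained-peak fraction, unlike MIDAS's fragmentation-pathway-weighted scoring.

      # Naive metabolite-spectrum match: count measured peaks explained by predicted
      # fragment masses within a tolerance, then rank candidates by that fraction.

      def match_score(measured_peaks, predicted_fragments, tol_da=0.01):
          explained = sum(
              1 for mz, _intensity in measured_peaks
              if any(abs(mz - frag) <= tol_da for frag in predicted_fragments)
          )
          return explained / len(measured_peaks)

      measured = [(85.028, 1200.0), (113.024, 800.0), (157.014, 300.0)]   # (m/z, intensity)
      candidates = {
          "metabolite X": [85.029, 113.023, 141.019],
          "metabolite Y": [72.044, 99.055],
      }

      ranked = sorted(((match_score(measured, frags), name)
                       for name, frags in candidates.items()), reverse=True)
      for score, name in ranked:
          print(f"{name}: {score:.2f}")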

  7. PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm

    Directory of Open Access Journals (Sweden)

    Chuong Cheng-Ming

    2006-10-01

    Full Text Available Abstract Background Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. Results The PubFocus web server automates analysis of MEDLINE/PubMed search queries by enriching them with two widely used human factor-based bibliometric indicators of publication quality: journal impact factor and volume of forward references. In addition to providing basic volumetric statistics, PubFocus also prioritizes citations and evaluates authors' impact on the field of search. PubFocus also analyses the presence and occurrence of biomedical key terms within citations by utilizing controlled vocabularies. Conclusion We have developed a citation prioritisation algorithm based on journal impact factor, forward referencing volume, referencing dynamics, and author's contribution level. It can be applied either to the primary set of PubMed search results or to subsets of these results identified through key terms from controlled biomedical vocabularies and ontologies. The NCI (National Cancer Institute) thesaurus and MGD (Mouse Genome Database) mammalian gene orthology have been implemented for key term analytics. PubFocus provides a scalable platform for the integration of multiple available ontology databases. PubFocus analytics can be adapted for input sources of biomedical citations other than PubMed.

  8. WGDB: Wood Gene Database with search interface.

    Science.gov (United States)

    Goyal, Neha; Ginwal, H S

    2014-01-01

    Wood quality can be defined in terms of a particular end use involving several traits. Over the last fifteen years researchers have assessed wood quality traits in forest trees. Wood quality has been categorized into cell wall biochemical traits and fibre properties, including the microfibril angle, density and stiffness, in loblolly pine [1]. A user-friendly, open-access database named Wood Gene Database (WGDB) has been developed to describe wood genes together with protein information and published research articles. It contains 720 wood genes from species such as Pinus and Cedrus deodara (deodar) and the fast-growing trees Populus (poplar) and Eucalyptus. WGDB is designed to encompass the majority of publicly accessible genes coding for cellulose, hemicellulose and lignin in tree species that are responsive to wood formation and quality. It is an interactive platform for collecting, managing and searching specific wood genes; it also enables data mining related to genomic information, specifically in Arabidopsis thaliana, Populus trichocarpa, Eucalyptus grandis, Pinus taeda, Pinus radiata, Cedrus deodara and Cedrus atlantica. For user convenience, the database is cross-linked with the public databases NCBI, EMBL and Dendrome and with the Google search engine to make it more informative, and it provides the bioinformatics tools BLAST and COBALT. The database is freely available at www.wgdb.in.

  9. Two Search Techniques within a Human Pedigree Database

    OpenAIRE

    Gersting, J. M.; Conneally, P. M.; Rogers, K.

    1982-01-01

    This paper presents the basic features of two search techniques from MEGADATS-2 (MEdical Genetics Acquisition and DAta Transfer System), a system for collecting, storing, retrieving and plotting human family pedigrees. The individual search provides a quick method for locating an individual in the pedigree database. This search uses a modified soundex coding and an inverted file structure based on a composite key. The navigational search uses a set of pedigree traversal operations (individual...
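
    Soundex-style phonetic coding is the standard trick behind this kind of tolerant name lookup. The sketch below implements the classic American Soundex algorithm in Python as a generic illustration; MEGADATS-2 uses a modified coding whose details are not given here.

      # Classic American Soundex coding (generic version, not MEGADATS-2's modified scheme).

      CODES = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
               **dict.fromkeys("DT", "3"), "L": "4", **dict.fromkeys("MN", "5"), "R": "6"}

      def soundex(name):
          name = "".join(c for c in name.upper() if c.isalpha())
          if not name:
              return "0000"
          first, code, prev = name[0], "", CODES.get(name[0], "")
          for ch in name[1:]:
              digit = CODES.get(ch, "")
              if digit and digit != prev:
                  code += digit
              if ch not in "HW":        # H and W do not reset the previous code
                  prev = digit
          return (first + code + "000")[:4]

      # Similar-sounding surnames map to the same code, so misspelled individuals
      # can still be located in a pedigree database.
      for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
          print(name, soundex(name))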

  10. Archiving, ordering and searching: search engines, algorithms, databases and deep mediatization

    DEFF Research Database (Denmark)

    Andersen, Jack

    2018-01-01

    This article argues that search engines, algorithms, and databases can be considered as a way of understanding deep mediatization (Couldry & Hepp, 2016). They are embedded in a variety of social and cultural practices and as such they change our communicative actions to be shaped by their logic. Having reviewed recent trends in mediatization research, the argument is discussed and unfolded in-between the material and social constructivist-phenomenological interpretations of mediatization. In conclusion, it is discussed how deep this form of mediatization can be taken to be.

  11. Searching the online biomedical literature from developing countries

    African Journals Online (AJOL)

    Administrator

    This commentary highlights popular research literature databases and the use of the internet to obtain valuable research information. These literature retrieval methods include the use of the popular. PubMed as well as internet search engines. Specific websites catering to developing countries' information and journals' ...

  12. Searching the online biomedical literature from developing countries ...

    African Journals Online (AJOL)

    This commentary highlights popular research literature databases and the use of the internet to obtain valuable research information. These literature retrieval methods include the use of the popular PubMed as well as internet search engines. Specific websites catering to developing countries' information and journals' ...

  13. Search Databases and Statistics

    DEFF Research Database (Denmark)

    Refsgaard, Jan C; Munk, Stephanie; Jensen, Lars J

    2016-01-01

    having strengths and weaknesses that must be considered for the individual needs. These are reviewed in this chapter. Equally critical for generating highly confident output datasets is the application of sound statistical criteria to limit the inclusion of incorrect peptide identifications from database searches. Additionally, careful filtering and use of appropriate statistical tests on the output datasets affect the quality of all downstream analyses and interpretation of the data. Our considerations and general practices on these aspects of phosphoproteomics data processing are presented here.
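
    One widely used way to apply such statistical criteria is target-decoy filtering of the search output: peptide-spectrum matches are sorted by score and accepted only while the estimated false discovery rate stays below a chosen threshold. The sketch below is a generic illustration with invented scores, not tied to any particular search engine.

      # Generic target-decoy FDR filtering of peptide-spectrum matches (PSMs).

      def filter_at_fdr(psms, fdr_threshold=0.01):
          """psms: list of (score, is_decoy). Return target scores accepted at the cut-off."""
          accepted, targets, decoys = [], 0, 0
          for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
              decoys += is_decoy
              targets += not is_decoy
              if decoys / max(targets, 1) > fdr_threshold:   # estimated FDR at this score
                  break
              if not is_decoy:
                  accepted.append(score)
          return accepted

      example = [(95.2, False), (90.1, False), (88.7, True), (80.3, False), (60.0, True)]
      print(filter_at_fdr(example))   # -> [95.2, 90.1]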

  14. On-line biomedical databases-the best source for quick search of the scientific information in the biomedicine.

    Science.gov (United States)

    Masic, Izet; Milinovic, Katarina

    2012-06-01

    Most medical journals now have an electronic version available over public networks. Printed and electronic versions may exist in parallel, and the two forms need not be published simultaneously. The electronic version of a journal can be published a few weeks before the printed form and need not have identical content. The electronic form of a journal may include features that the printed form cannot, such as animation or 3D displays, and may offer full text, mostly in PDF or XML format, or only the table of contents or summaries. Access to the full text is usually not free and can be obtained only if the institution (library or host) enters into an access agreement. Many medical journals, however, provide free access to some articles, or to the complete content after a certain time (6 months or a year). Such journals can be found through network archives such as High Wire Press and Free Medical Journals.com. Particular mention should be made of PubMed and PubMed Central, the first public digital archives that freely collect journals of the available medical literature, which operate within the system of the National Library of Medicine in Bethesda (USA). There are also so-called online medical journals published only in electronic form, which can be searched through online databases. In this paper, the authors briefly describe about 30 databases and give short instructions on how to access them and search for papers published in indexed medical journals.

  15. A practical approach for inexpensive searches of radiology report databases.

    Science.gov (United States)

    Desjardins, Benoit; Hamilton, R Curtis

    2007-06-01

    We present a method to perform full-text searches of radiology reports for the large number of departments that do not have this ability as part of their radiology or hospital information system. A tool written in Microsoft Access (front-end) has been designed to search a server (back-end) containing an indexed weekly backup copy of the full relational database extracted from a radiology information system (RIS). This front-end/back-end approach has been implemented in a large academic radiology department, and is used for teaching, research and administrative purposes. The weekly second backup of the 80 GB, 4 million record RIS database takes 2 hours. Further indexing of the exported radiology reports takes 6 hours. Individual searches typically take less than 1 minute on the indexed database and 30-60 minutes on the nonindexed database. Guidelines to properly address privacy and institutional review board issues are closely followed by all users. This method has the potential to improve teaching, research, and administrative programs within radiology departments that cannot afford more expensive technology.
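
    A similarly inexpensive option today is to load the exported report text into an embedded full-text index. The sketch below uses SQLite's FTS5 extension from Python rather than the authors' Access front-end; it assumes a SQLite build with FTS5 enabled, and the accession numbers and report texts are invented placeholders.

      # Inexpensive full-text search over exported radiology reports with SQLite FTS5.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("CREATE VIRTUAL TABLE reports USING fts5(accession, body)")
      conn.executemany("INSERT INTO reports VALUES (?, ?)", [
          ("ACC-0001", "Small right apical pneumothorax, no pleural effusion."),
          ("ACC-0002", "No acute cardiopulmonary abnormality."),
          ("ACC-0003", "Large left pleural effusion with compressive atelectasis."),
      ])

      # Boolean full-text query using FTS5 MATCH syntax
      for accession, body in conn.execute(
              "SELECT accession, body FROM reports WHERE reports MATCH 'pleural AND effusion'"):
          print(accession, "->", body)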

  16. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for non-expert human users to create the ground-truth labels. In our experiments, half of all the available audio files were randomly extracted and used as the training set. The remaining ones were used as the test set. The classifier was trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset had previously been manually labeled by domain experts into the three classes defined above.

  17. The LAILAPS Search Engine: Relevance Ranking in Life Science Databases

    Directory of Open Access Journals (Sweden)

    Lange Matthias

    2010-06-01

    Full Text Available Search engines and retrieval systems are popular tools on the life science desktop. The manual inspection of hundreds of database entries that reflect a life science concept or fact is time-intensive daily work. Here, it is not the number of query results that matters, but their relevance. In this paper, we present the LAILAPS search engine for life science databases. The concept is to combine a novel feature model for relevance ranking, a machine learning approach to model user relevance profiles, ranking improvement by user feedback tracking, and an intuitive and slim web user interface that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and are expanded with synonyms. Supporting a flexible text index and a simple data import format, LAILAPS can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases.

  18. Simplified validation of borderline hits of database searches

    OpenAIRE

    Thomas, Henrik; Shevchenko, Andrej

    2008-01-01

    Along with unequivocal hits produced by matching multiple MS/MS spectra to database sequences, LC-MS/MS analysis often yields a large number of hits of borderline statistical confidence. To simplify their validation, we propose to use rapid de novo interpretation of all acquired MS/MS spectra and, with the help of a simple software tool, display the candidate sequences together with each database search hit. We demonstrate that comparing hit database sequences and independent de novo interpre...

  19. Database searches for qualitative research

    OpenAIRE

    Evans, David

    2002-01-01

    Interest in the role of qualitative research in evidence-based health care is growing. However, the methods currently used to identify quantitative research do not translate easily to qualitative research. This paper highlights some of the difficulties during searches of electronic databases for qualitative research. These difficulties relate to the descriptive nature of the titles used in some qualitative studies, the variable information provided in abstracts, and the differences in the ind...

  20. A Web-based Tool for SDSS and 2MASS Database Searches

    Science.gov (United States)

    Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.

    We have developed a web site using HTML, Php, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases as well. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborators; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
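
    As a rough illustration of the color-cut step (with an invented table layout and placeholder thresholds, not the actual SDSS/2MASS schema or published selection criteria), candidates can be pulled from the weekly-refreshed tables with a simple SQL query:

      # Illustrative color-cut selection of red, brown-dwarf-like candidates from a
      # joined SDSS/2MASS-style table. Schema and thresholds are placeholders.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("""
          CREATE TABLE candidates (
              obj_id TEXT, ra REAL, decl REAL,
              i_mag REAL, z_mag REAL,   -- SDSS-like optical magnitudes
              j_mag REAL                -- 2MASS-like near-infrared magnitude
          )
      """)
      conn.executemany("INSERT INTO candidates VALUES (?, ?, ?, ?, ?, ?)", [
          ("OBJ-1", 150.1, 2.2, 21.5, 19.2, 16.0),   # very red in i-z and z-J
          ("OBJ-2", 150.3, 2.4, 18.0, 17.8, 17.1),   # too blue; should be rejected
      ])

      query = """
          SELECT obj_id, ra, decl, (i_mag - z_mag) AS i_z, (z_mag - j_mag) AS z_j
          FROM candidates
          WHERE (i_mag - z_mag) > 1.5 AND (z_mag - j_mag) > 2.0
      """
      for row in conn.execute(query):
          print(row)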

  1. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.

  2. Federated or cached searches: providing expected performance from multiple invasive species databases

    Science.gov (United States)

    Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.

    2011-01-01

    Invasive species are a global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches have been proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods and show that federated searches will not provide the performance and flexibility required by users, and that a central cache of the data is required to improve performance.

  3. Searching Databases without Query-Building Aids: Implications for Dyslexic Users

    Science.gov (United States)

    Berget, Gerd; Sandnes, Frode Eika

    2015-01-01

    Introduction: Few studies document the information searching behaviour of users with cognitive impairments. This paper therefore addresses the effect of dyslexia on information searching in a database with no tolerance for spelling errors and no query-building aids. The purpose was to identify effective search interface design guidelines that…

  4. Students are Confident Using Federated Search Tools as much as Single Databases. A Review of: Armstrong, A. (2009). Student perceptions of federated searching vs. single database searching. Reference Services Review, 37(3), 291-303. doi:10.1108/00907320910982785

    Directory of Open Access Journals (Sweden)

    Deena Yanofsky

    2011-09-01

    Full Text Available Objective – To measure students’ perceptions of the ease-of-use and efficacy of a federated search tool versus a single multidisciplinary database. Design – An evaluation worksheet, employing a combination of quantitative and qualitative questions. Setting – A required, first-year English composition course taught at the University of Illinois at Chicago (UIC). Subjects – Thirty-one undergraduate students completed and submitted the worksheet. Methods – Students attended two library instruction sessions. The first session introduced participants to basic Boolean searching (using AND only), selecting appropriate keywords and searching for books in the library catalogue. In the second library session, students were handed an evaluation worksheet and, with no introduction to the process of searching article databases, were asked to find relevant articles on a research topic of their own choosing using both a federated search tool and a single multidisciplinary database. The evaluation worksheet was divided into four sections: step-by-step instructions for accessing the single multidisciplinary database and the federated search tool; space to record search strings in both resources; space to record the titles of up to five relevant articles; and a series of quantitative and qualitative questions regarding ease-of-use, relevancy of results, overall preference (if any) between the two resources, likeliness of future use and other preferred research tools. Half of the participants received a worksheet with instructions to search the federated search tool before the single database; the order was reversed for the other half of the students. The evaluation worksheet was designed to be completed in one hour. Participant responses to qualitative questions were analyzed, codified and grouped into thematic categories. If a student mentioned more than one factor in responding to a question, their response was recorded in multiple categories. Main Results

  5. Parallel database search and prime factorization with magnonic holographic memory devices

    Energy Technology Data Exchange (ETDEWEB)

    Khitun, Alexander [Electrical and Computer Engineering Department, University of California - Riverside, Riverside, California 92521 (United States)

    2015-12-28

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, allowing the production of phase patterns of arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  6. Parallel database search and prime factorization with magnonic holographic memory devices

    Science.gov (United States)

    Khitun, Alexander

    2015-12-01

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, allowing the production of phase patterns of arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  7. Parallel database search and prime factorization with magnonic holographic memory devices

    International Nuclear Information System (INIS)

    Khitun, Alexander

    2015-01-01

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, allowing the production of phase patterns of arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  8. Searching for evidence or approval? A commentary on database search in systematic reviews and alternative information retrieval methodologies.

    Science.gov (United States)

    Delaney, Aogán; Tamás, Peter A

    2018-03-01

    Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.

  9. PubMed searches: overview and strategies for clinicians.

    Science.gov (United States)

    Lindsey, Wesley T; Olin, Bernie R

    2013-04-01

    PubMed is a biomedical and life sciences database maintained by a division of the National Library of Medicine known as the National Center for Biotechnology Information (NCBI). It is a large resource with more than 5600 journals indexed and greater than 22 million total citations. Searches conducted in PubMed provide references that are more specific for the intended topic compared with other popular search engines. Effective PubMed searches allow the clinician to remain current on the latest clinical trials, systematic reviews, and practice guidelines. PubMed continues to evolve by allowing users to create a customized experience through the My NCBI portal, new arrangements and options in search filters, and supporting scholarly projects through exportation of citations to reference managing software. Prepackaged search options available in the Clinical Queries feature also allow users to efficiently search for clinical literature. PubMed also provides information regarding the source journals themselves through the Journals in NCBI Databases link. This article provides an overview of the PubMed database's structure and features as well as strategies for conducting an effective search.

  10. HIP2: An online database of human plasma proteins from healthy individuals

    Directory of Open Access Journals (Sweden)

    Shen Changyu

    2008-04-01

    Full Text Available Abstract Background With the introduction of increasingly powerful mass spectrometry (MS) techniques for clinical research, several recent large-scale MS proteomics studies have sought to characterize the entire human plasma proteome with a general objective for identifying thousands of proteins leaked from tissues in the circulating blood. Understanding the basic constituents, diversity, and variability of the human plasma proteome is essential to the development of sensitive molecular diagnosis and treatment monitoring solutions for future biomedical applications. Biomedical researchers today, however, do not have an integrated online resource in which they can search for plasma proteins collected from different mass spectrometry platforms, experimental protocols, and search software for healthy individuals. The lack of such a resource for comparisons has made it difficult to interpret proteomics profile changes in patients' plasma and to design protein biomarker discovery experiments. Description To aid future protein biomarker studies of disease and health from human plasma, we developed an online database, HIP2 (Healthy Human Individual's Integrated Plasma Proteome). The current version contains 12,787 protein entries linked to 86,831 peptide entries identified using different MS platforms. Conclusion This web-based database will be useful to biomedical researchers involved in biomarker discovery research. This database has been developed to be the comprehensive collection of healthy human plasma proteins, and has protein data captured in a relational database schema built to contain mappings of supporting peptide evidence from several high-quality and high-throughput mass-spectrometry (MS) experimental data sets. Users can search for plasma protein/peptide annotations, peptide/protein alignments, and experimental/sample conditions with options for filter-based retrieval to achieve greater analytical power for discovery and validation.

  11. Biomedical engineering principles

    CERN Document Server

    Ritter, Arthur B; Valdevit, Antonio; Ascione, Alfred N

    2011-01-01

    Introduction: Modeling of Physiological Processes; Cell Physiology and Transport; Principles and Biomedical Applications of Hemodynamics; A Systems Approach to Physiology; The Cardiovascular System; Biomedical Signal Processing; Signal Acquisition and Processing; Techniques for Physiological Signal Processing; Examples of Physiological Signal Processing; Principles of Biomechanics; Practical Applications of Biomechanics; Biomaterials; Principles of Biomedical Capstone Design; Unmet Clinical Needs; Entrepreneurship: Reasons why Most Good Designs Never Get to Market; An Engineering Solution in Search of a Biomedical Problem

  12. Chapter 51: How to Build a Simple Cone Search Service Using a Local Database

    Science.gov (United States)

    Kent, B. R.; Greene, G. R.

    The cone search service protocol will be examined from the server side in this chapter. A simple cone search service will be set up and configured locally using MySQL. Data will be read into a table, and the Java JDBC will be used to connect to the database. Readers will understand the VO cone search specification and how to use it to query a database on their local systems and return an XML/VOTable file based on an input of RA/DEC coordinates and a search radius. The cone search in this example will be deployed as a Java servlet. The resulting cone search can be tested with a verification service. This basic setup can be used with other languages and relational databases.
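
    As a rough illustration of the idea (independent of the chapter's Java servlet and MySQL setup), the minimal Python sketch below filters a local catalogue table by angular separation from an input RA/Dec position; the table name `sources` and its columns are hypothetical, and a production service would push the positional filter into SQL rather than scanning every row.

```python
import math
import sqlite3

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation between two (RA, Dec) positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

def cone_search(conn, ra, dec, radius_deg):
    """Return rows of the hypothetical 'sources' table within radius_deg of (ra, dec)."""
    rows = conn.execute("SELECT id, ra, dec FROM sources").fetchall()
    return [r for r in rows
            if angular_separation_deg(ra, dec, r[1], r[2]) <= radius_deg]

# Example: in-memory catalogue with two sources.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (id INTEGER, ra REAL, dec REAL)")
conn.executemany("INSERT INTO sources VALUES (?, ?, ?)",
                 [(1, 10.68, 41.27), (2, 180.0, -30.0)])
print(cone_search(conn, 10.7, 41.3, 0.5))   # -> only source 1 falls inside the cone
```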

  13. The new ENSDF search system NESSY: IBM/PC nuclear spectroscopy database

    International Nuclear Information System (INIS)

    Boboshin, I.N.; Varlamov, V.V.

    1996-01-01

    The universal relational nuclear structure and decay database NESSY (New ENSDF Search SYstem) developed for the IBM/PC and compatible PCs, and based on the international file ENSDF (Evaluated Nuclear Structure Data File), is described. The NESSY provides the possibility of high efficiency processing (the search and retrieval of any kind of physical data) of the information from ENSDF. The principles of the database development are described and examples of applications are presented. (orig.)

  14. A Taxonomic Search Engine: federating taxonomic databases using web services.

    Science.gov (United States)

    Page, Roderic D M

    2005-03-09

    The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  15. STEPS: a grid search methodology for optimized peptide identification filtering of MS/MS database search results.

    Science.gov (United States)

    Piehowski, Paul D; Petyuk, Vladislav A; Sandoval, John D; Burnum, Kristin E; Kiebel, Gary R; Monroe, Matthew E; Anderson, Gordon A; Camp, David G; Smith, Richard D

    2013-03-01

    For bottom-up proteomics, there is a wide variety of database-searching algorithms in use for matching peptide sequences to tandem MS spectra. Likewise, there are numerous strategies being employed to produce a confident list of peptide identifications from the different search algorithm outputs. Here we introduce a grid-search approach for determining optimal database filtering criteria in shotgun proteomics data analyses that is easily adaptable to any search. Systematic Trial and Error Parameter Selection--referred to as STEPS--utilizes user-defined parameter ranges to test a wide array of parameter combinations to arrive at an optimal "parameter set" for data filtering, thus maximizing confident identifications. The benefits of this approach in terms of numbers of true-positive identifications are demonstrated using datasets derived from immunoaffinity-depleted blood serum and a bacterial cell lysate, two common proteomics sample types. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
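
    A minimal sketch of the grid-search idea (not the authors' STEPS implementation): enumerate all combinations of user-defined filter thresholds and keep the combination that maximizes confident identifications, here scored by a caller-supplied function. The parameter names and the scoring callback are hypothetical.

```python
import itertools

def grid_search_filters(score_fn, param_grid):
    """Try every combination of filter thresholds and return the best parameter set.

    param_grid maps a parameter name to the list of values to test;
    score_fn(params) should return the number of confident identifications
    passing the filter (e.g., at a fixed false-discovery rate).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical usage with two filtering criteria and a toy scoring function:
grid = {"min_xcorr": [1.5, 2.0, 2.5], "max_ppm_error": [5, 10, 20]}
toy_score = lambda p: -abs(p["min_xcorr"] - 2.0) - p["max_ppm_error"] / 100
print(grid_search_filters(toy_score, grid))
```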

  16. Searching for religion and mental health studies required health, social science, and grey literature databases.

    Science.gov (United States)

    Wright, Judy M; Cottrell, David J; Mir, Ghazala

    2014-07-01

    To determine the optimal databases to search for studies of faith-sensitive interventions for treating depression. We examined 23 health, social science, religious, and grey literature databases searched for an evidence synthesis. Databases were prioritized by yield of (1) search results, (2) potentially relevant references identified during screening, (3) included references contained in the synthesis, and (4) included references that were available in the database. We assessed the impact of databases beyond MEDLINE, EMBASE, and PsycINFO by their ability to supply studies identifying new themes and issues. We identified pragmatic workload factors that influence database selection. PsycINFO was the best performing database within all priority lists. ArabPsyNet, CINAHL, Dissertations and Theses, EMBASE, Global Health, Health Management Information Consortium, MEDLINE, PsycINFO, and Sociological Abstracts were essential for our searches to retrieve the included references. Citation tracking activities and the personal library of one of the research teams made significant contributions of unique, relevant references. Religion studies databases (Am Theo Lib Assoc, FRANCIS) did not provide unique, relevant references. Literature searches for reviews and evidence syntheses of religion and health studies should include social science, grey literature, non-Western databases, personal libraries, and citation tracking activities. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. A perspective for biomedical data integration: Design of databases for flow cytometry

    Directory of Open Access Journals (Sweden)

    Lakoumentas John

    2008-02-01

    Full Text Available Abstract Background The integration of biomedical information is essential for tackling medical problems. We describe a data model in the domain of flow cytometry (FC), allowing for massive management, analysis and integration with other laboratory and clinical information. The paper is concerned with the proper translation of the Flow Cytometry Standard (FCS) into a relational database schema, in a way that facilitates end users either doing research on FC or studying specific cases of patients who have undergone FC analysis. Results The proposed database schema provides integration of data originating from diverse acquisition settings, organized in a way that allows syntactically simple queries that provide results significantly faster than the conventional implementations of the FCS standard. The proposed schema can potentially achieve up to 8 orders of magnitude reduction in query complexity and up to 2 orders of magnitude reduction in response time for data originating from flow cytometers that record 256 colours. This is mainly achieved by managing to maintain an almost constant number of data-mining procedures regardless of the size and complexity of the stored information. Conclusion It is evident that using single-file data storage standards for the design of databases without any structural transformations significantly limits the flexibility of databases. Analysis of the requirements of a specific domain for integration and massive data processing can provide the necessary schema modifications that will unlock the additional functionality of a relational database.
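
    The abstract argues for restructuring single-file FCS data into relational tables. The sketch below shows one hypothetical, heavily simplified schema of that kind, using SQLite purely for illustration; it is not the schema proposed in the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per acquisition (sample, instrument, date).
CREATE TABLE acquisition (
    acq_id      INTEGER PRIMARY KEY,
    sample_id   TEXT,
    instrument  TEXT,
    acquired_on TEXT
);
-- One row per measured parameter (detector/colour) of an acquisition.
CREATE TABLE parameter (
    param_id INTEGER PRIMARY KEY,
    acq_id   INTEGER REFERENCES acquisition(acq_id),
    name     TEXT          -- e.g. 'FSC-A', 'FL1-H'
);
-- One row per event and parameter, instead of one opaque binary FCS file.
CREATE TABLE event_value (
    acq_id   INTEGER REFERENCES acquisition(acq_id),
    event_no INTEGER,
    param_id INTEGER REFERENCES parameter(param_id),
    value    REAL
);
""")
# A question such as "all FL1-H values of sample X" then becomes a simple join
# over three tables rather than a scan over complete FCS files.
```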

  18. PIR search result - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data analysis method: performed blastx searches against the PIR protein database. The results are filtered with Expect values lower than 1e-10. Number of data entries: 1,549,409 ...

  19. A Taxonomic Search Engine: Federating taxonomic databases using web services

    Directory of Open Access Journals (Sweden)

    Page Roderic DM

    2005-03-01

    Full Text Available Abstract Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  20. GeneView: a comprehensive semantic search engine for PubMed.

    Science.gov (United States)

    Thomas, Philippe; Starlinger, Johannes; Vowinkel, Alexander; Arzt, Sebastian; Leser, Ulf

    2012-07-01

    Research results are primarily published in scientific literature and curation efforts cannot keep up with the rapid growth of published literature. The plethora of knowledge remains hidden in large text repositories like MEDLINE. Consequently, life scientists have to spend a great amount of time searching for specific information. The enormous ambiguity among most names of biomedical objects such as genes, chemicals and diseases often produces too large and unspecific search results. We present GeneView, a semantic search engine for biomedical knowledge. GeneView is built upon a comprehensively annotated version of PubMed abstracts and openly available PubMed Central full texts. This semi-structured representation of biomedical texts enables a number of features extending classical search engines. For instance, users may search for entities using unique database identifiers or they may rank documents by the number of specific mentions they contain. Annotation is performed by a multitude of state-of-the-art text-mining tools for recognizing mentions from 10 entity classes and for identifying protein-protein interactions. GeneView currently contains annotations for >194 million entities from 10 classes for ∼21 million citations with 271,000 full text bodies. GeneView can be searched at http://bc3.informatik.hu-berlin.de/.

  1. pSort search result - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available File name: kome_psort_search_result.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kome/LATEST/kome_psort_searc...

  2. Biomedical waste management in Ayurveda hospitals - current practices & future prospectives.

    Science.gov (United States)

    Rajan, Renju; Robin, Delvin T; M, Vandanarani

    2018-03-16

    Biomedical waste management is an integral part of traditional and contemporary systems of health care. The paper focuses on the identification and classification of biomedical wastes in Ayurvedic hospitals, current practices of its management in Ayurveda hospitals and its future prospects. Databases like PubMed (1975-2017 Feb), Scopus (1960-2017), AYUSH Portal, DOAJ, DHARA and Google scholar were searched. We used the medical subject headings 'biomedical waste' and 'health care waste' for identification and classification. The terms 'biomedical waste management' and 'health care waste management', alone and combined with 'Ayurveda' or 'Ayurvedic', were used for current practices and recent advances in the treatment of these wastes. We made a humble attempt to categorize the biomedical wastes from Ayurvedic hospitals, as the available data about their grouping are very scarce. Proper biomedical waste management is the mainstay of hospital cleanliness, hospital hygiene and maintenance activities. Current disposal techniques adopted for Ayurvedic biomedical wastes are sewage/drains, incineration and landfill, but these methods have both merits and demerits. Our review has identified a number of interesting areas for future research, such as the logical application of bioremediation techniques in biomedical waste management and the usage of effective micro-organisms and solar energy in waste disposal. Copyright © 2017 Transdisciplinary University, Bangalore and World Ayurveda Foundation. Published by Elsevier B.V. All rights reserved.

  3. Semantic similarity measures in the biomedical domain by leveraging a web search engine.

    Science.gov (United States)

    Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

    2013-07-01

    Various studies of web-related semantic similarity measures have been conducted. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have the limitation that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, corpus-based methodologies can overcome the limitation, and the web is a continuously and enormously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. The similarity scores of the different patterns are evaluated, by applying support vector machines for classification, to leverage the robustness of the semantic similarity measures. Experimental results, validated against two datasets (dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen), are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores compared with the measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores showed the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database.
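
    To make the page-count idea concrete, here is a minimal sketch of one co-occurrence-based similarity score (a Jaccard-style measure computed from hit counts). The `hit_count` callable stands in for whatever web search API supplies the counts; it, and the particular measure chosen, are illustrative assumptions rather than the paper's exact feature set.

```python
def web_jaccard(hit_count, p, q):
    """Jaccard-style similarity from web hit counts.

    hit_count(query) -> estimated number of pages matching the query.
    Returns a value in [0, 1]; 0 if the terms never co-occur.
    """
    n_p = hit_count(p)
    n_q = hit_count(q)
    n_pq = hit_count(f"{p} AND {q}")
    denom = n_p + n_q - n_pq
    return n_pq / denom if denom > 0 else 0.0

# Illustrative counts only (no live API call is made here):
fake_counts = {"aspirin": 9_000_000, "headache": 12_000_000,
               "aspirin AND headache": 2_500_000}
print(web_jaccard(fake_counts.get, "aspirin", "headache"))  # ~0.135
```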

  4. PFTijah: text search in an XML database system

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Rode, H.; van Os, R.; Flokstra, Jan

    2006-01-01

    This paper introduces the PFTijah system, a text search system that is integrated with an XML/XQuery database management system. We present examples of its use, we explain some of the system internals, and discuss plans for future work. PFTijah is part of the open source release of MonetDB/XQuery.

  5. A Bayesian network approach to the database search problem in criminal proceedings

    Science.gov (United States)

    2012-01-01

    Background The ‘database search problem’, that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions
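
    As a purely illustrative companion to this abstract, the sketch below enumerates a toy version of the problem (uniform prior over N potential sources, a database of n profiles, match probability γ for non-sources, one database member matching and the other n − 1 excluded) and recovers the commonly cited closed form 1 / (1 + (N − n)·γ). These modelling assumptions are ours for illustration, not a restatement of the paper's Bayesian networks.

```python
def posterior_matcher_is_source(N, n, gamma):
    """Posterior probability that the single database matcher is the source.

    Illustrative assumptions: uniform 1/N prior over potential sources,
    non-sources match independently with probability gamma, and the data are
    'one database member matches, the other n-1 do not'.
    """
    lik_matcher = (1 - gamma) ** (n - 1)            # matcher is the source
    lik_other_db = 0.0                              # an excluded member cannot be the source
    lik_outside = gamma * (1 - gamma) ** (n - 1)    # an unsearched individual is the source
    num = lik_matcher
    den = lik_matcher + (n - 1) * lik_other_db + (N - n) * lik_outside
    return num / den

N, n, gamma = 1_000_000, 10_000, 1e-6
print(posterior_matcher_is_source(N, n, gamma))   # enumeration
print(1 / (1 + (N - n) * gamma))                  # closed form, same value (~0.5025)
```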

  6. Combined semantic and similarity search in medical image databases

    Science.gov (United States)

    Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin

    2011-03-01

    The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist deeply relies on reference literature or second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases combining ontology-based semantic and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.

  7. Efficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry

    OpenAIRE

    Pevzner, Pavel A.; Mulyukov, Zufar; Dancik, Vlado; Tang, Chris L

    2001-01-01

    Although protein identification by matching tandem mass spectra (MS/MS) against protein databases is a widespread tool in mass spectrometry, the question about reliability of such searches remains open. Absence of rigorous significance scores in MS/MS database search makes it difficult to discard random database hits and may lead to erroneous protein identification, particularly in the case of mutated or post-translationally modified peptides. This problem is especially important for high-thr...

  8. NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT -AN INFORMATION EXTRACTION TASK

    Directory of Open Access Journals (Sweden)

    N. Kanya

    2016-07-01

    Full Text Available Biomedical Text Mining targets the extraction of significant information from biomedical archives. Bio TM encompasses Information Retrieval (IR) and Information Extraction (IE). Information retrieval will retrieve the relevant biomedical literature documents from various repositories like PubMed, MedLine etc., based on a search query. The IR process ends with the generation of a corpus of the relevant documents retrieved from the publication databases based on the query. The IE task includes the preprocessing of the documents, Named Entity Recognition (NER) from the documents and relationship extraction. This process includes Natural Language Processing, Data Mining techniques and machine learning algorithms. The preprocessing task includes tokenization, stop word removal, shallow parsing, and parts-of-speech tagging. The NER phase involves recognition of well-defined objects such as genes, proteins or cell lines. This process leads to the next phase, that is, extraction of relationships (IE). The work was based on the machine learning algorithm Conditional Random Fields (CRF).
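
    For readers unfamiliar with the CRF step, the following minimal sketch shows the kind of per-token features typically fed to a CRF-based biomedical NER model. The feature set here is generic and illustrative, not the one used in the paper, and no CRF library is invoked.

```python
def token_features(tokens, i):
    """Simple per-token features for a CRF-style sequence labeller."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "suffix3": tok[-3:],
        "is_capitalized": tok[0].isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

sentence = "BRCA1 mutations increase breast cancer risk".split()
for i, tok in enumerate(sentence):
    print(tok, token_features(sentence, i))
# A CRF would be trained on such feature dicts with B/I/O labels
# (e.g. B-GENE for 'BRCA1') to recognise genes, proteins or cell lines.
```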

  9. Winnowing sequences from a database search.

    Science.gov (United States)

    Berman, P; Zhang, Z; Wolf, Y I; Koonin, E V; Miller, W

    2000-01-01

    In database searches for sequence similarity, matches to a distinct sequence region (e.g., protein domain) are frequently obscured by numerous matches to another region of the same sequence. In order to cope with this problem, algorithms are developed to discard redundant matches. One model for this problem begins with a list of intervals, each with an associated score; each interval gives the range of positions in the query sequence that align to a database sequence, and the score is that of the alignment. If interval I is contained in interval J, and I's score is less than J's, then I is said to be dominated by J. The problem is then to identify each interval that is dominated by at least K other intervals, where K is a given level of "tolerable redundancy." An algorithm is developed to solve the problem in O(N log N) time and O(N*) space, where N is the number of intervals and N* is a precisely defined value that never exceeds N and is frequently much smaller. This criterion for discarding database hits has been implemented in the Blast program, as illustrated herein with examples. Several variations and extensions of this approach are also described.
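
    To make the dominance criterion concrete, here is a small reference implementation (a straightforward O(N²) scan rather than the paper's O(N log N) algorithm) that flags every interval dominated by at least K others; the interval and score values are illustrative.

```python
def dominated_intervals(intervals, K):
    """Return indices of intervals dominated by at least K others.

    intervals: list of (start, end, score). Interval I is dominated by J when
    J contains I (J.start <= I.start and I.end <= J.end) and I.score < J.score.
    Reference O(N^2) scan; the published algorithm runs in O(N log N).
    """
    dominated = []
    for i, (s_i, e_i, sc_i) in enumerate(intervals):
        count = sum(
            1
            for j, (s_j, e_j, sc_j) in enumerate(intervals)
            if j != i and s_j <= s_i and e_i <= e_j and sc_i < sc_j
        )
        if count >= K:
            dominated.append(i)
    return dominated

hits = [(10, 90, 250), (15, 80, 120), (20, 70, 90), (200, 260, 300)]
print(dominated_intervals(hits, K=1))   # -> [1, 2]: both lie inside a higher-scoring hit
```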

  10. The database search problem: a question of rational decision making.

    Science.gov (United States)

    Gittelson, S; Biedermann, A; Bozza, S; Taroni, F

    2012-10-10

    This paper applies probability and decision theory in the graphical interface of an influence diagram to study the formal requirements of rationality which justify the individualization of a person found through a database search. The decision-theoretic part of the analysis studies the parameters that a rational decision maker would use to individualize the selected person. The modeling part (in the form of an influence diagram) clarifies the relationships between this decision and the ingredients that make up the database search problem, i.e., the results of the database search and the different pairs of propositions describing whether an individual is at the source of the crime stain. These analyses evaluate the desirability associated with the decision of 'individualizing' (and 'not individualizing'). They point out that this decision is a function of (i) the probability that the individual in question is, in fact, at the source of the crime stain (i.e., the state of nature), and (ii) the decision maker's preferences among the possible consequences of the decision (i.e., the decision maker's loss function). We discuss the relevance and argumentative implications of these insights with respect to recent comments in specialized literature, which suggest points of view that are opposed to the results of our study. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  11. The LAILAPS search engine: a feature model for relevance ranking in life science databases.

    Science.gov (United States)

    Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

    2010-03-25

    Efficient and effective information retrieval in life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases into a vast network of interconnected information systems is to the same extent a big challenge and a great chance for life science research. The knowledge found on the Web, in particular in life-science databases, is a valuable major resource. In order to bring it to the scientist's desktop, it is essential to have well performing search engines. Here, neither the response time nor the number of results is the most important factor; the most crucial factor for millions of query results is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by the observation of user behavior during their inspection of search engine results, we condensed a set of 9 relevance-discriminating features. These features are intuitively used by scientists, who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that have been trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
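
    The sketch below illustrates the general shape of such a relevance-prediction function: a tiny feed-forward network mapping a feature vector for a database entry to a relevance score. The number of features matches the nine mentioned above, but the network size and the weights are made up for illustration; LAILAPS trains its network on curated reference entries.

```python
import numpy as np

def relevance_score(features, W1, b1, w2, b2):
    """One-hidden-layer regression: feature vector -> scalar relevance in (0, 1)."""
    h = np.tanh(W1 @ features + b1)                  # hidden layer
    return 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))      # squash output to (0, 1)

rng = np.random.default_rng(0)
n_features, n_hidden = 9, 4                  # e.g. 9 relevance-discriminating features
W1 = rng.normal(size=(n_hidden, n_features)) # in practice: weights learned from training data
b1 = rng.normal(size=n_hidden)
w2 = rng.normal(size=n_hidden)
b2 = 0.0

entry_features = rng.random(n_features)      # placeholder feature vector for one DB entry
print(relevance_score(entry_features, W1, b1, w2, b2))
```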

  12. PolySearch2: a significantly improved text-mining system for discovering associations between human diseases, genes, drugs, metabolites, toxins and more.

    Science.gov (United States)

    Liu, Yifeng; Liang, Yongjie; Wishart, David

    2015-07-01

    PolySearch2 (http://polysearch.ca) is an online text-mining system for identifying relationships between biomedical entities such as human diseases, genes, SNPs, proteins, drugs, metabolites, toxins, metabolic pathways, organs, tissues, subcellular organelles, positive health effects, negative health effects, drug actions, Gene Ontology terms, MeSH terms, ICD-10 medical codes, biological taxonomies and chemical taxonomies. PolySearch2 supports a generalized 'Given X, find all associated Ys' query, where X and Y can be selected from the aforementioned biomedical entities. An example query might be: 'Find all diseases associated with Bisphenol A'. To find its answers, PolySearch2 searches for associations against comprehensive collections of free-text collections, including local versions of MEDLINE abstracts, PubMed Central full-text articles, Wikipedia full-text articles and US Patent application abstracts. PolySearch2 also searches 14 widely used, text-rich biological databases such as UniProt, DrugBank and Human Metabolome Database to improve its accuracy and coverage. PolySearch2 maintains an extensive thesaurus of biological terms and exploits the latest search engine technology to rapidly retrieve relevant articles and databases records. PolySearch2 also generates, ranks and annotates associative candidates and present results with relevancy statistics and highlighted key sentences to facilitate user interpretation. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. [Design and establishment of modern literature database about acupuncture Deqi].

    Science.gov (United States)

    Guo, Zheng-rong; Qian, Gui-feng; Pan, Qiu-yin; Wang, Yang; Xin, Si-yuan; Li, Jing; Hao, Jie; Hu, Ni-juan; Zhu, Jiang; Ma, Liang-xiao

    2015-02-01

    A search on acupuncture Deqi was conducted using four Chinese-language biomedical databases (CNKI, Wan-Fang, VIP and CBM) and PubMed database and using keywords "Deqi" or "needle sensation" "needling feeling" "needle feel" "obtaining qi", etc. Then, a "Modern Literature Database for Acupuncture Deqi" was established by employing Microsoft SQL Server 2005 Express Edition, introducing the contents, data types, information structure and logic constraint of the system table fields. From this Database, detailed inquiries about general information of clinical trials, acupuncturists' experience, ancient medical works, comprehensive literature, etc. can be obtained. The present databank lays a foundation for subsequent evaluation of literature quality about Deqi and data mining of undetected Deqi knowledge.

  14. Successful aging: considering non-biomedical constructs

    Directory of Open Access Journals (Sweden)

    Carver LF

    2016-11-01

    Full Text Available Lisa F Carver,1 Diane Buchanan2 1Department of Sociology, Queen’s University Kingston, ON, Canada; 2School of Nursing, Queen’s University Kingston, ON, Canada Objectives: Successful aging continues to be applied in a variety of contexts and is defined using a number of different constructs. Although previous reviews highlight the multidimensionality of successful aging, a few have focused exclusively on non-biomedical factors, as was done here. Methods: This scoping review searched Ovid Medline database for peer-reviewed English-language articles published between 2006 and 2015, offering a model of successful aging and involving research with older adults. Results: Seventy-two articles were reviewed. Thirty-five articles met the inclusion criteria. Common non-biomedical constructs associated with successful aging included engagement, optimism and/or positive attitude, resilience, spirituality and/or religiosity, self-efficacy and/or self-esteem, and gerotranscendence. Discussion: Successful aging is a complex process best described using a multidimensional model. Given that the majority of elders will experience illness and/or disease during the life course, public health initiatives that promote successful aging need to employ non-biomedical constructs, facilitating the inclusion of elders living with disease and/or disability. Keywords: successful aging, resilience, gerotranscendence, engagement, optimism

  15. Global search tool for the Advanced Photon Source Integrated Relational Model of Installed Systems (IRMIS) database

    International Nuclear Information System (INIS)

    Quock, D.E.R.; Cianciarulo, M.B.

    2007-01-01

    The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, the necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.

  16. mirPub: a database for searching microRNA publications.

    Science.gov (United States)

    Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore

    2015-05-01

    Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. mirPub is freely available at http://www.microrna.gr/mirpub/. vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  17. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion?

    Science.gov (United States)

    Lawrence, D W

    2008-12-01

    To assess what is lost if only one literature database is searched for articles relevant to injury prevention and safety promotion (IPSP) topics. Serial textword (keyword, free-text) searches using multiple synonym terms for five key IPSP topics (bicycle-related brain injuries, ethanol-impaired driving, house fires, road rage, and suicidal behaviors among adolescents) were conducted in four of the bibliographic databases that are most used by IPSP professionals: EMBASE, MEDLINE, PsycINFO, and Web of Science. Through a systematic procedure, an inventory of articles on each topic in each database was conducted to identify the total unduplicated count of all articles on each topic, the number of articles unique to each database, and the articles available if only one database is searched. No single database included all of the relevant articles on any topic, and the database with the broadest coverage differed by topic. A search of only one literature database will return 16.7-81.5% (median 43.4%) of the available articles on any of five key IPSP topics. Each database contributed unique articles to the total bibliography for each topic. A literature search performed in only one database will, on average, lead to a loss of more than half of the available literature on a topic.

  18. [Systematic literature search in PubMed : A short introduction].

    Science.gov (United States)

    Blümle, A; Lagrèze, W A; Motschall, E

    2018-03-01

    In order to identify current (and relevant) evidence for a specific clinical question within the unmanageable amount of information available, solid skills in performing a systematic literature search are essential. An efficient approach is to search a biomedical database containing relevant literature citations of study reports. The best known database is MEDLINE, which is searchable for free via the PubMed interface. In this article, we explain step by step how to perform a systematic literature search via PubMed by means of an example research question in the field of ophthalmology. First, we demonstrate how to translate the clinical problem into a well-framed and searchable research question, how to identify relevant search terms and how to conduct a text word search and a search with Medical Subject Headings (MeSH) terms. We then show how to limit the number of search results if the search yields too many irrelevant hits and how to increase the number in the case of too few citations. Finally, we summarize all essential principles that guide a literature search via PubMed.
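
    For completeness, such a search can also be run programmatically through the NCBI E-utilities `esearch` endpoint; the sketch below does so with the Python standard library. The query string combining a MeSH heading with a title/abstract text word is only an example, not the article's worked ophthalmology question.

```python
import json
import urllib.parse
import urllib.request

def pubmed_search(term, retmax=20):
    """Return a list of PMIDs for a PubMed query via the E-utilities esearch endpoint."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    )
    with urllib.request.urlopen(f"{base}?{params}") as resp:
        data = json.load(resp)
    return data["esearchresult"]["idlist"]

# Example query mixing a MeSH heading with a title/abstract text word:
query = '"Macular Degeneration"[MeSH Terms] AND ranibizumab[tiab]'
print(pubmed_search(query, retmax=5))
```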

  19. PIE the search: searching PubMed literature for protein interaction information.

    Science.gov (United States)

    Kim, Sun; Kwon, Dongseop; Shin, Soo-Yong; Wilbur, W John

    2012-02-15

    Finding protein-protein interaction (PPI) information from literature is challenging but an important issue. However, keyword search in PubMed(®) is often time consuming because it requires a series of actions that refine keywords and browse search results until it reaches a goal. Due to the rapid growth of biomedical literature, it has become more difficult for biologists and curators to locate PPI information quickly. Therefore, a tool for prioritizing PPI informative articles can be a useful assistant for finding this PPI-relevant information. PIE (Protein Interaction information Extraction) the search is a web service implementing a competition-winning approach utilizing word and syntactic analyses by machine learning techniques. For easy user access, PIE the search provides a PubMed-like search environment, but the output is the list of articles prioritized by PPI confidence scores. By obtaining PPI-related articles at high rank, researchers can more easily find the up-to-date PPI information, which cannot be found in manually curated PPI databases. http://www.ncbi.nlm.nih.gov/IRET/PIE/.

  20. Enriching Great Britain's National Landslide Database by searching newspaper archives

    Science.gov (United States)

    Taylor, Faith E.; Malamud, Bruce D.; Freeborough, Katy; Demeritt, David

    2015-11-01

    Our understanding of where landslide hazard and impact will be greatest is largely based on our knowledge of past events. Here, we present a method to supplement existing records of landslides in Great Britain by searching an electronic archive of regional newspapers. In Great Britain, the British Geological Survey (BGS) is responsible for updating and maintaining records of landslide events and their impacts in the National Landslide Database (NLD). The NLD contains records of more than 16,500 landslide events in Great Britain. Data sources for the NLD include field surveys, academic articles, grey literature, news, public reports and, since 2012, social media. We aim to supplement the richness of the NLD by (i) identifying additional landslide events, (ii) acting as an additional source of confirmation of events existing in the NLD and (iii) adding more detail to existing database entries. This is done by systematically searching the Nexis UK digital archive of 568 regional newspapers published in the UK. In this paper, we construct a robust Boolean search criterion by experimenting with landslide terminology for four training periods. We then apply this search to all articles published in 2006 and 2012. This resulted in the addition of 111 records of landslide events to the NLD over the 2 years investigated (2006 and 2012). We also find that we were able to obtain information about landslide impact for 60-90% of landslide events identified from newspaper articles. Spatial and temporal patterns of additional landslides identified from newspaper articles are broadly in line with those existing in the NLD, confirming that the NLD is a representative sample of landsliding in Great Britain. This method could now be applied to more time periods and/or other hazards to add richness to databases and thus improve our ability to forecast future events based on records of past events.
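
    As an illustration only (the terms below are not the authors' validated search criterion), a Boolean screen of this kind can be expressed as a small keyword filter applied to article text:

```python
# Illustrative landslide terminology; the paper derives its own criterion
# by testing candidate terms against four training periods.
INCLUDE_ANY = ["landslide", "landslip", "mudslide", "rockfall", "debris flow"]
EXCLUDE_ANY = ["election landslide", "landslide victory"]   # common false positives

def looks_like_landslide_article(text):
    """True if the article mentions a landslide term and no excluded phrase."""
    t = text.lower()
    return any(term in t for term in INCLUDE_ANY) and not any(
        phrase in t for phrase in EXCLUDE_ANY
    )

print(looks_like_landslide_article("A landslip closed the A83 road overnight."))  # True
print(looks_like_landslide_article("The party won by a landslide victory."))       # False
```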

  1. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications.

    Science.gov (United States)

    Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart

    2013-01-01

    Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
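
    A compact sketch of the core LSI computation: a truncated SVD of a term-document matrix, followed by ranking documents by similarity to a query folded into the reduced space. The tiny matrix, terms and rank are illustrative.

```python
import numpy as np

# Toy term-document matrix X (rows = terms, columns = documents).
X = np.array([
    [2, 0, 1, 0],   # "gene"
    [1, 0, 2, 0],   # "protein"
    [0, 3, 0, 1],   # "neuron"
    [0, 1, 0, 2],   # "synapse"
], dtype=float)

k = 2                                        # number of latent dimensions
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uk, Sk, Vtk = U[:, :k], np.diag(s[:k]), Vt[:k, :]

doc_vecs = (Sk @ Vtk).T                      # documents in the k-dimensional latent space

def lsi_query(q_term_counts):
    """Project a query (term-count vector) into the latent space and rank documents."""
    q_k = np.linalg.inv(Sk) @ Uk.T @ q_term_counts
    sims = doc_vecs @ q_k / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_k) + 1e-12
    )
    return np.argsort(-sims)                 # document indices, best match first

query = np.array([1, 1, 0, 0], dtype=float)  # "gene protein"
print(lsi_query(query))                      # documents 0 and 2 should rank highest
```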

  2. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

    Science.gov (United States)

    Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2017-09-13

    Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

  3. Exploration of Global Trend on Biomedical Application of Polyhydroxyalkanoate (PHA): A Patent Survey.

    Science.gov (United States)

    Ponnaiah, Paulraj; Vnoothenei, Nagiah; Chandramohan, Muruganandham; Thevarkattil, Mohamed Javad Pazhayakath

    2018-01-30

    Polyhydroxyalkanoates are bio-based, biodegradable, naturally occurring polymers produced by a wide range of organisms, from bacteria to higher mammals. The properties and biocompatibility of PHA make a wide spectrum of applications possible. In this context, we analyze the potential applications of PHA in biomedical science by exploring the global trend through a patent survey. The survey suggests that PHA is an attractive candidate, with applications widely distributed across the medical industry, drug delivery systems, dental materials, tissue engineering, packaging materials and other useful products. In our present study, we explored patents associated with various biomedical applications of polyhydroxyalkanoates. Patent databases of the European Patent Office, the United States Patent and Trademark Office and the World Intellectual Property Organization were mined. We developed an intensive exploration approach to eliminate overlapping patents and sort out significant patents. We demarcated the keywords and search criteria and established search patterns for the database requests. We retrieved documents from the recent 6 years, 2010 to 2016, and sorted the collected data stepwise to gather the most appropriate documents in patent families for further scrutiny. By this approach, we retrieved 23,368 patent documents from all three databases, and the patent titles were further analyzed for the relevance of polyhydroxyalkanoates in biomedical applications. This resulted in the documentation of approximately 226 significant patents associated with biomedical applications of polyhydroxyalkanoates, and the information was classified into six major groups. Polyhydroxyalkanoates have been patented in such a way that their applications are widely distributed across the medical industry, drug delivery systems, dental materials, tissue engineering, packaging materials and other useful products. There are many avenues through which PHA & PHB could be

  4. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.

    Science.gov (United States)

    Rognes, Torbjørn

    2011-06-01

    The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
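
    For reference, the Smith-Waterman recurrence that SWIPE parallelises looks as follows in a plain (scalar, unoptimised) Python version with a linear gap penalty; the scoring values are illustrative, and nothing here reflects SWIPE's inter-sequence SIMD layout or its use of substitution matrices such as BLOSUM.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Optimal local alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are never allowed to drop below zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("HEAGAWGHEE", "PAWHEAE"))
```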

  5. search.bioPreprint: a discovery tool for cutting edge, preprint biomedical research articles [version 2; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Carrie L. Iwema

    2016-07-01

    Full Text Available The time it takes for a completed manuscript to be published traditionally can be extremely lengthy. Article publication delay, which occurs in part due to constraints associated with peer review, can prevent the timely dissemination of critical and actionable data associated with new information on rare diseases or developing health concerns such as Zika virus. Preprint servers are open access online repositories housing preprint research articles that enable authors (1) to make their research immediately and freely available and (2) to receive commentary and peer review prior to journal submission. There is a growing movement of preprint advocates aiming to change the current journal publication and peer review system, proposing that preprints catalyze biomedical discovery, support career advancement, and improve scientific communication. While the number of articles submitted to and hosted by preprint servers is gradually increasing, there has been no simple way to identify biomedical research published in a preprint format, as they are not typically indexed and are only discoverable by directly searching the specific preprint server websites. To address this issue, we created a search engine that quickly compiles preprints from disparate host repositories and provides a one-stop search solution. Additionally, we developed a web application that bolsters the discovery of preprints by enabling each and every word or phrase appearing on any web site to be integrated with articles from preprint servers. This tool, search.bioPreprint, is publicly available at http://www.hsls.pitt.edu/resources/preprint.

  6. Indexing Bibliographic Database Content Using MariaDB and Sphinx Search Server

    Directory of Open Access Journals (Sweden)

    Arie Nugraha

    2014-07-01

    Full Text Available Fast retrieval of digital content has become mandatory for library and archive information systems. Many software applications have emerged to handle the indexing of digital content, from low-level ones such as Apache Lucene, to more RESTful and web-services-ready ones such as Apache Solr and ElasticSearch. Solr’s popularity among library software developers makes it the “de-facto” standard software for indexing digital content. For content (full-text content or bibliographic descriptions) already stored inside a relational DBMS such as MariaDB (a fork of MySQL) or PostgreSQL, Sphinx Search Server (Sphinx) is a suitable alternative. This article will cover an introduction on how to use Sphinx with MariaDB databases to index database content, as well as some examples of Sphinx API usage.
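
    As a rough illustration of the workflow described in the article, the sketch below runs a full-text query against a Sphinx index over SphinxQL, Sphinx's MySQL-protocol interface; the index name (biblio), port 9306 and the idea of joining the returned ids back to the MariaDB table are assumptions made for this example, not details taken from the article:

        # Minimal sketch: query a Sphinx index built from a MariaDB bibliographic
        # table. Assumes searchd accepts SphinxQL connections on localhost:9306
        # and that the index is named 'biblio'.
        import pymysql

        conn = pymysql.connect(host="127.0.0.1", port=9306, user="")  # SphinxQL endpoint
        try:
            with conn.cursor() as cur:
                # MATCH() runs the full-text query against the indexed fields.
                cur.execute("SELECT id FROM biblio WHERE MATCH(%s) LIMIT 20",
                            ("digital library indexing",))
                hit_ids = [row[0] for row in cur.fetchall()]
            print(hit_ids)  # record ids to look up again in the MariaDB table
        finally:
            conn.close()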

  7. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    Science.gov (United States)

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-04-06

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.
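
    The merging step can be sketched as follows (this is not the authors' pipeline; the file names, the header-based taxonomy filter and exact de-duplication on sequence identity are simplifying assumptions):

        # Build a merged FASTA database: metagenome-assembled proteins plus
        # reference proteins restricted to taxa detected in the metagenome.
        def read_fasta(path):
            header, seq = None, []
            with open(path) as fh:
                for line in fh:
                    line = line.rstrip()
                    if line.startswith(">"):
                        if header:
                            yield header, "".join(seq)
                        header, seq = line[1:], []
                    else:
                        seq.append(line)
                if header:
                    yield header, "".join(seq)

        def build_merged_db(metagenome_faa, reference_faa, detected_taxa, out_faa):
            seen = set()
            with open(out_faa, "w") as out:
                for source in (metagenome_faa, reference_faa):
                    for header, seq in read_fasta(source):
                        # Keep reference entries only if the header names a detected taxon.
                        if source == reference_faa and not any(t in header for t in detected_taxa):
                            continue
                        if seq in seen:          # drop exact duplicate sequences
                            continue
                        seen.add(seq)
                        out.write(f">{header}\n{seq}\n")

        # build_merged_db("assembly_proteins.faa", "uniprot_bacteria.faa",
        #                 {"Escherichia coli", "Bacteroides"}, "merged_db.faa")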

  8. A Review of Published Articles in the Field of Biomedical Nanotechnology in Medline Database during 2000-2010

    OpenAIRE

    Peyman Sheikhzade

    2015-01-01

    Background and objectives : Nanotechnology is a new technology which has been increasingly used over the past decade. Due to its great significance, governments are tending to invest greatly in research and development on nanotechnology in various sectors and aspects. The purpose of this study was to determine the status of biomedical nanotechnology publications over the past ten years (2000-2010) in Medline/PubMed. Material and Methods : This was a descriptive study. The Medline database wa...

  9. Ontological interpretation of biomedical database content.

    Science.gov (United States)

    Santana da Silva, Filipe; Jansen, Ludger; Freitas, Fred; Schulz, Stefan

    2017-06-26

    Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework. By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and, (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs). IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC is interpreted regarding maximally fine-grained defined subclasses under the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements. Ambiguity of biological database content is

  10. IMPROVED SEARCH OF PRINCIPAL COMPONENT ANALYSIS DATABASES FOR SPECTRO-POLARIMETRIC INVERSION

    International Nuclear Information System (INIS)

    Casini, R.; Lites, B. W.; Ramos, A. Asensio; Ariste, A. López

    2013-01-01

    We describe a simple technique for the acceleration of spectro-polarimetric inversions based on principal component analysis (PCA) of Stokes profiles. This technique involves the indexing of the database models based on the sign of the projections (PCA coefficients) of the first few relevant orders of principal components of the four Stokes parameters. In this way, each model in the database can be attributed a distinctive binary number of 2^(4n) bits, where n is the number of PCA orders used for the indexing. Each of these binary numbers (indices) identifies a group of 'compatible' models for the inversion of a given set of observed Stokes profiles sharing the same index. The complete set of the binary numbers so constructed evidently determines a partition of the database. The search of the database for the PCA inversion of spectro-polarimetric data can profit greatly from this indexing. In practical cases it becomes possible to approach the ideal acceleration factor of 2^(4n) as compared to the systematic search of a non-indexed database for a traditional PCA inversion. This indexing method relies on the existence of a physical meaning in the sign of the PCA coefficients of a model. For this reason, the presence of model ambiguities and of spectro-polarimetric noise in the observations limits in practice the number n of relevant PCA orders that can be used for the indexing.
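
    A schematic of the indexing, with placeholder eigenbases and model data (illustrative only): project each Stokes profile onto its first n principal components, keep the signs of the 4n coefficients, pack them into one integer, and bucket the database models by that integer so that an inversion only scans the bucket matching the observation.

        import numpy as np

        def sign_index(stokes_profiles, eigenbases, n_orders):
            """Pack the signs of the first n_orders PCA coefficients of the four
            Stokes parameters (I, Q, U, V) into a single integer index."""
            bits = []
            for profile, basis in zip(stokes_profiles, eigenbases):  # 4 Stokes parameters
                coeffs = basis[:n_orders] @ profile                  # PCA projections
                bits.extend(int(c >= 0) for c in coeffs)
            index = 0
            for b in bits:                                           # 4 * n_orders bits
                index = (index << 1) | b
            return index

        def build_partition(model_profiles, eigenbases, n_orders):
            """Group database models by their sign index."""
            partition = {}
            for model_id, profiles in model_profiles.items():
                key = sign_index(profiles, eigenbases, n_orders)
                partition.setdefault(key, []).append(model_id)
            return partition

        # At inversion time, only models sharing the observation's index are scanned:
        # candidates = partition.get(sign_index(observed_profiles, eigenbases, n_orders), [])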

  11. Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation.

    Science.gov (United States)

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities, such as 'CHEMICAL-1 compared to CHEMICAL-2'. With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments on chemical-chemical (CC) and chemical-disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85, respectively, for the CC and CD tasks when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order.
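
    The pattern-clustering step can be sketched with a toy set of entity-anonymized patterns, using scikit-learn's TruncatedSVD as the LSA projection and cosine similarity for ranking (the features, number of topics and patterns here are illustrative assumptions, not the paper's setup):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import TruncatedSVD
        from sklearn.metrics.pairwise import cosine_similarity

        # Entity-anonymized context patterns as they might be derived from PubMed queries.
        patterns = [
            "CHEMICAL compared to CHEMICAL",
            "CHEMICAL versus CHEMICAL",
            "CHEMICAL induced DISEASE",
            "DISEASE caused by CHEMICAL",
            "CHEMICAL in the treatment of DISEASE",
        ]

        counts = CountVectorizer(ngram_range=(1, 2)).fit_transform(patterns)
        lsa = TruncatedSVD(n_components=3, random_state=0)   # latent "topics"
        topic_space = lsa.fit_transform(counts)

        # Patterns most similar to "CHEMICAL compared to CHEMICAL" in topic space.
        sims = cosine_similarity(topic_space[0:1], topic_space)[0]
        for score, pattern in sorted(zip(sims, patterns), reverse=True):
            print(f"{score:.2f}  {pattern}")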

  12. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation

    Directory of Open Access Journals (Sweden)

    Rognes Torbjørn

    2011-06-01

    Full Text Available Abstract Background The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. Results A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Conclusions Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.

  13. A Review of Abstracting and Indexing Services for Biomedical Journals

    Directory of Open Access Journals (Sweden)

    Sarita Bhardwaj

    2017-10-01

    Full Text Available The days when researchers had to go to the library to look for the articles of their choice are gone. With the advent of the electronic era, searching for an article online has become easier. This has been possible due to the availability of various Abstracting and Indexing (A & I) services in the world. Of the more than 400 online A & I services available, only a few, like Google and Thomson Reuters, cover all disciplines. Most A & I services cover just one discipline, allowing them to cover their area in more depth. There are many databases and indexing services for biomedical journals, the most important ones being PubMed/Medline, Scopus, and Web of Science (ISI). This article gives a review of various databases and indexes available for dental journals in the world.

  14. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    Science.gov (United States)

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease

  15. Comparing the Precision of Information Retrieval of MeSH-Controlled Vocabulary Search Method and a Visual Method in the Medline Medical Database.

    Science.gov (United States)

    Hariri, Nadjla; Ravandi, Somayyeh Nadi

    2014-01-01

    Medline is one of the most important databases in the biomedical field. One of the most important hosts for Medline is Elton B. Stephens Co. (EBSCO), which has presented different search methods that can be used based on the needs of the users. Visual search and MeSH-controlled search methods are among the most common methods. The goal of this research was to compare the precision of the retrieved sources in the EBSCO Medline database using the MeSH-controlled and visual search methods. This research was a semi-empirical study. By holding training workshops, 70 students of higher education in different educational departments of Kashan University of Medical Sciences were taught the MeSH-controlled and visual search methods in 2012. Then, the precision of 300 searches made by these students was calculated based on the Best Precision, Useful Precision, and Objective Precision formulas and analyzed in SPSS software using the independent-samples t-test; the three precisions obtained with the three precision formulas were studied for the two search methods. The mean precision of the visual method was greater than that of the MeSH-controlled search for all three types of precision, i.e. Best Precision, Useful Precision, and Objective Precision, and their mean precisions were significantly different. Fifty-three percent of the participants in the research also mentioned that the use of the combination of the two methods produced better results. For users, it is more appropriate to use a natural-language-based method, such as the visual method, in the EBSCO Medline host than to use the controlled method, which requires users to use special keywords. The potential reason for their preference was that the visual method allowed them more freedom of action.

  16. Term Relevance Feedback and Mediated Database Searching: Implications for Information Retrieval Practice and Systems Design.

    Science.gov (United States)

    Spink, Amanda

    1995-01-01

    This study uses the human approach to examine the sources and effectiveness of search terms selected during 40 mediated interactive database searches and focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. (Author/JKP)

  17. DOT Online Database

    Science.gov (United States)

    Searchable full-text document databases (including Advisory Circulars and records on data collection and distribution policies); the document database website is provided by MicroSearch.

  18. Supporting ontology-based keyword search over medical databases.

    Science.gov (United States)

    Kementsietsidis, Anastasios; Lim, Lipyeow; Wang, Min

    2008-11-06

    The proliferation of medical terms poses a number of challenges in the sharing of medical information among different stakeholders. Ontologies are commonly used to establish relationships between different terms, yet their role in querying has not been investigated in detail. In this paper, we study the problem of supporting ontology-based keyword search queries on a database of electronic medical records. We present several approaches to support this type of query, study the advantages and limitations of each approach, and summarize the lessons learned as best practices.
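
    A minimal sketch of one such approach, using a hypothetical mini-ontology and an in-memory SQLite table rather than the authors' system: each user keyword is expanded with its ontology synonyms before the records are searched.

        import sqlite3

        # Toy ontology: term -> synonyms / narrower terms.
        ONTOLOGY = {
            "myocardial infarction": ["heart attack", "MI", "acute myocardial infarction"],
            "hypertension": ["high blood pressure", "HTN"],
        }

        def expand(keyword):
            return [keyword] + ONTOLOGY.get(keyword.lower(), [])

        def ontology_keyword_search(conn, keyword):
            terms = expand(keyword)
            clause = " OR ".join("note LIKE ?" for _ in terms)
            params = [f"%{t}%" for t in terms]
            return conn.execute(
                f"SELECT record_id, note FROM emr_notes WHERE {clause}", params
            ).fetchall()

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE emr_notes (record_id INTEGER, note TEXT)")
        conn.executemany("INSERT INTO emr_notes VALUES (?, ?)", [
            (1, "Patient admitted after a heart attack."),
            (2, "History of HTN, well controlled."),
            (3, "Routine check-up, no complaints."),
        ])
        print(ontology_keyword_search(conn, "myocardial infarction"))  # matches record 1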

  19. International symposium on Biomedical Data Infrastructure (BDI 2013)

    CERN Document Server

    Dhillon, Sarinder; Advances in biomedical infrastructure 2013

    2013-01-01

    Current biomedical databases are independently administered in geographically distinct locations, lending themselves almost ideally to the adoption of intelligent data management approaches. This book focuses on research issues, problems and opportunities in Biomedical Data Infrastructure, identifying new issues and directions for future research in Biomedical Data and Information Retrieval, Semantics in Biomedicine, and Biomedical Data Modeling and Analysis. The book will be a useful guide for researchers, practitioners, and graduate-level students interested in state-of-the-art developments in biomedical data management.

  20. Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.

    Science.gov (United States)

    Ravikumar, Komandur Elayavilli; Wagholikar, Kavishwar B; Li, Dingcheng; Kocher, Jean-Pierre; Liu, Hongfang

    2015-06-06

    Advances in next-generation sequencing technology have accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains 'locked' in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems. We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3% for reconstructing protein mutation disease associations in curated database records. The discourse level analysis component of MutD contributed to a gain of more than 10% in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5%. Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating

  1. Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.

    Science.gov (United States)

    Arita, Masanori; Suwa, Kazuhiro

    2008-09-17

    In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and design a half-formatted database. As proof of principle, a database of flavonoids with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. The structured part is subject to text-string searches to realize relational operations. The system was written in PHP as an extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.
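
    The 'half-formatted' idea, free wiki text carrying machine-searchable structured fragments, can be illustrated with a small sketch (hypothetical page text and field names; the actual extension is written in PHP): a regular expression pulls template-style fields out of the wiki markup so that relational-style selections can be run over them.

        import re

        pages = {
            "Quercetin": "Free text about quercetin...\n| species = Allium cepa\n| class = flavonol\n",
            "Naringenin": "Notes...\n| species = Citrus paradisi\n| class = flavanone\n",
        }

        FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.+)$", re.MULTILINE)

        def records(pages):
            """Turn the structured fragments of each page into a flat record."""
            for title, text in pages.items():
                rec = {"title": title}
                rec.update((k.strip(), v.strip()) for k, v in FIELD.findall(text))
                yield rec

        # Relational-style selection: SELECT title WHERE class = 'flavonol'
        print([r["title"] for r in records(pages) if r.get("class") == "flavonol"])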

  2. Accelerating Smith-Waterman Algorithm for Biological Database Search on CUDA-Compatible GPUs

    Science.gov (United States)

    Munekawa, Yuma; Ino, Fumihiko; Hagihara, Kenichi

    This paper presents a fast method capable of accelerating the Smith-Waterman algorithm for biological database search on a cluster of graphics processing units (GPUs). Our method is implemented using compute unified device architecture (CUDA), which is available on the nVIDIA GPU. As compared with previous methods, our method has four major contributions. (1) The method efficiently uses on-chip shared memory to reduce the data amount being transferred between off-chip video memory and processing elements in the GPU. (2) It also reduces the number of data fetches by applying a data reuse technique to query and database sequences. (3) A pipelined method is also implemented to overlap GPU execution with database access. (4) Finally, a master/worker paradigm is employed to accelerate hundreds of database searches on a cluster system. In experiments, the peak performance on a GeForce GTX 280 card reaches 8.32 giga cell updates per second (GCUPS). We also find that our method reduces the amount of data fetches to 1/140, achieving approximately three times higher performance than a previous CUDA-based method. Our 32-node cluster version is approximately 28 times faster than a single GPU version. Furthermore, the effective performance reaches 75.6 giga instructions per second (GIPS) using 32 GeForce 8800 GTX cards.

  3. Hmrbase: a database of hormones and their receptors

    Science.gov (United States)

    Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

    2009-01-01

    Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, Drug

  4. CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

    Directory of Open Access Journals (Sweden)

    Maskell Douglas L

    2009-05-01

    Full Text Available Abstract Background The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.

  5. Current Comparative Table (CCT) automates customized searches of dynamic biological databases.

    Science.gov (United States)

    Landsteiner, Benjamin R; Olson, Michael R; Rutherford, Robert

    2005-07-01

    The Current Comparative Table (CCT) software program enables working biologists to automate customized bioinformatics searches, typically of remote sequence or HMM (hidden Markov model) databases. CCT currently supports BLAST, hmmpfam and other programs useful for gene and ortholog identification. The software is web based, has a BioPerl core and can be used remotely via a browser or locally on Mac OS X or Linux machines. CCT is particularly useful to scientists who study large sets of molecules in today's evolving information landscape because it color-codes all result files by age and highlights even tiny changes in sequence or annotation. By empowering non-bioinformaticians to automate custom searches and examine current results in context at a glance, CCT allows a remote database submission in the evening to influence the next morning's bench experiment. A demonstration of CCT is available at http://orb.public.stolaf.edu/CCTdemo and the open source software is freely available from http://sourceforge.net/projects/orb-cct.
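
    The kind of scripted remote search that such a tool automates can be sketched with Biopython's NCBI BLAST interface (a generic example, not CCT's own BioPerl code; it needs network access, and the sequence shown in the commented call is just a placeholder):

        from Bio.Blast import NCBIWWW, NCBIXML

        def remote_blastp(sequence, top=5):
            """Run a remote BLASTP search against the NCBI 'nr' database and
            return (hit title, e-value) pairs for the best alignments."""
            handle = NCBIWWW.qblast("blastp", "nr", sequence)   # blocking web request
            record = NCBIXML.read(handle)
            hits = []
            for alignment in record.alignments[:top]:
                best_hsp = alignment.hsps[0]
                hits.append((alignment.title, best_hsp.expect))
            return hits

        # for title, evalue in remote_blastp("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"):
        #     print(f"{evalue:.2e}  {title}")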

  6. Disbiome database: linking the microbiome to disease.

    Science.gov (United States)

    Janssens, Yorick; Nielandt, Joachim; Bronselaer, Antoon; Debunne, Nathan; Verbeke, Frederick; Wynendaele, Evelien; Van Immerseel, Filip; Vandewynckel, Yves-Paul; De Tré, Guy; De Spiegeleer, Bart

    2018-06-04

    Recent research has provided fascinating indications and evidence that the host health is linked to its microbial inhabitants. Due to the development of high-throughput sequencing technologies, more and more data covering microbial composition changes in different disease types are emerging. However, this information is dispersed over a wide variety of medical and biomedical disciplines. Disbiome is a database which collects and presents published microbiota-disease information in a standardized way. The diseases are classified using the MedDRA classification system and the micro-organisms are linked to their NCBI and SILVA taxonomy. Finally, each study included in the Disbiome database is assessed for its reporting quality using a standardized questionnaire. Disbiome is the first database giving a clear, concise and up-to-date overview of microbial composition differences in diseases, together with the relevant information of the studies published. The strength of this database lies within the combination of the presence of references to other databases, which enables both specific and diverse search strategies within the Disbiome database, and the human annotation which ensures a simple and structured presentation of the available data.

  7. Semantic similarity measure in biomedical domain leverage web search engine.

    Science.gov (United States)

    Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

    2010-01-01

    Semantic similarity measure plays an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous research in semantic web related applications has deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenging task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provided by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.
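
    The page-count-based scores can be illustrated with the standard co-occurrence measures commonly used in this line of work (a Jaccard, a Dice and a PMI-style score); the exact score definitions and the SVM combination step in the paper may differ, so treat the following as a sketch with made-up counts:

        import math

        def page_count_scores(count_p, count_q, count_pq, total_pages=1e10, cutoff=5):
            """Similarity scores for terms P and Q from web page counts:
            count_p = hits for P, count_q = hits for Q, count_pq = hits for 'P AND Q'."""
            if count_pq < cutoff:                       # too rare to be reliable
                return {"jaccard": 0.0, "dice": 0.0, "pmi": 0.0}
            jaccard = count_pq / (count_p + count_q - count_pq)
            dice = 2 * count_pq / (count_p + count_q)
            pmi = math.log2((count_pq / total_pages) /
                            ((count_p / total_pages) * (count_q / total_pages)))
            return {"jaccard": jaccard, "dice": dice, "pmi": pmi}

        # Hypothetical page counts for two biomedical terms:
        print(page_count_scores(count_p=2_400_000, count_q=1_100_000, count_pq=160_000))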

  8. The Establishment of the Chinese Full-text Electronic Periodical Database and Service System

    Directory of Open Access Journals (Sweden)

    Huei-Chu Chang

    2003-12-01

    Full Text Available A database that covers important journals to a critical mass, offers a powerful search interface, and is easy to access remotely is the most reasonable electronic resource for users. Starting from the project of digitizing biomedical journals in the Taiwan area and moving to the CEPS, this article discusses related issues such as the selection of journals, the digitization of back issues, the transfer of copyright from authors to database producers, and the feedback of payments to authors from revenue. It also discusses the flow of journal publishing, marketing, and function, and the proposed cost-effectiveness of CEPS. [Article content in Chinese]

  9. Google Scholar Out-Performs Many Subscription Databases when Keyword Searching. A Review of: Walters, W. H. (2009). Google Scholar search performance: Comparative recall and precision. portal: Libraries and the Academy, 9(1), 5-24.

    Directory of Open Access Journals (Sweden)

    Giovanna Badia

    2010-09-01

    Full Text Available Objective – To compare the search performance (i.e., recall and precision) of Google Scholar with that of 11 other bibliographic databases when using a keyword search to find references on later-life migration. Design – Comparative database evaluation. Setting – Not stated in the article. It appears from the author’s affiliation that this research took place in an academic institution of higher learning. Subjects – Twelve databases were compared: Google Scholar, Academic Search Elite, AgeLine, ArticleFirst, EconLit, Geobase, Medline, PAIS International, Popline, Social Sciences Abstracts, Social Sciences Citation Index, and SocIndex. Methods – The relevant literature on later-life migration was pre-identified as a set of 155 journal articles published from 1990 to 2000. The author selected these articles from database searches, citation tracking, journal scans, and consultations with social sciences colleagues. Each database was evaluated with regard to its performance in finding references to these 155 papers. Elderly and migration were the keywords used to conduct the searches in each of the 12 databases, since these were the words that were the most frequently used in the titles of the 155 relevant articles. The search was performed in the most basic search interface of each database that allowed limiting results by the needed publication dates (1990-2000). Search results were sorted by relevance when possible (for 9 out of the 12 databases), and by date when the relevance sorting option was not available. Recall and precision statistics were then calculated from the search results. Recall is the number of relevant results obtained in the database for a search topic, divided by all the potential results which can be obtained on that topic (in this case, 155 references). Precision is the number of relevant results obtained in the database for a search topic, divided by the total number of results that were obtained in the database on
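
    Given the definitions used in the review, recall and precision for a single database reduce to two ratios, as in this small worked sketch (the counts are invented, not taken from the study):

        def recall_and_precision(relevant_found, total_relevant, total_retrieved):
            recall = relevant_found / total_relevant        # share of the 155 relevant papers found
            precision = relevant_found / total_retrieved    # share of retrieved items that are relevant
            return recall, precision

        # Hypothetical database result: 1,000 hits for the keywords "elderly" and
        # "migration", of which 90 are among the 155 pre-identified relevant articles.
        r, p = recall_and_precision(relevant_found=90, total_relevant=155, total_retrieved=1000)
        print(f"recall = {r:.2f}, precision = {p:.2f}")   # recall = 0.58, precision = 0.09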

  10. International biomedical law in search for its normative status.

    Science.gov (United States)

    Krajewska, Atina

    2012-01-01

    The broad and multifaceted problem of global health law and global health governance has been attracting increasing attention in the last few decades. The global community has failed to establish an international legal regime that deals comprehensively with the 'technological revolution'. The latter has posed complex questions to regions of the world with widely differing cultural perspectives. At the same time, an increasing number of governmental and non-state actors have become significantly involved in the sector. They use legal, political, and other forms of decision-making that result in regulatory instruments of contrasting normative status. Law created in this heterogeneous environment has been said to be fragmented, inconsistent, and exacerbating uncertainties. Therefore, claims have been made that a centralised and institutionalised system would help address the problems of transparency, legitimacy and efficiency. Nevertheless, little scholarly consideration is paid to the normative status of international biomedical law. This paper explores whether formalisation and "constitutionalisation" of biomedical law are indeed inevitable for its establishment as a separate regulatory regime. It does so by analysing the proliferation of biomedical law in light of two theories: the theory of fragmentation and the theory of global legal pluralism. Investigating the problem in this way helps determine the theoretical framework and methodology of future studies of biomedical law at the international level. This in turn should help its future development in a more consistent and harmonised manner.

  11. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    Directory of Open Access Journals (Sweden)

    Ayush Singhal

    2016-11-01

    Full Text Available The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets

  12. Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.

    Science.gov (United States)

    Campagne, Fabien

    2008-02-29

    The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79-0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86-0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently. Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute
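
    The focused-search flavour of the protocol can be sketched as follows; a toy word-overlap 'engine' and mean reciprocal rank stand in for the real search engines and the bpref measure used in the paper, and the engine scores fed to the Spearman comparison are made up:

        from scipy.stats import spearmanr

        # Toy Medline-like records: id -> (title, abstract). In the focused-search
        # protocol the title is the query and the record's own abstract is the
        # single relevant document.
        records = {
            "r1": ("Aspirin and stroke prevention", "Low-dose aspirin reduced stroke risk in a cohort"),
            "r2": ("BRCA1 mutations in breast cancer", "Germline BRCA1 variants were sequenced in patients"),
            "r3": ("Zinc finger protein structure", "The crystal structure of a zinc finger was solved"),
        }

        def toy_engine(query):
            """Rank abstracts by crude word overlap with the query (stand-in engine)."""
            q = set(query.lower().split())
            overlap = {rid: len(q & set(abstract.lower().split()))
                       for rid, (_title, abstract) in records.items()}
            return sorted(overlap, key=overlap.get, reverse=True)

        def mean_reciprocal_rank(engine):
            rr = []
            for rid, (title, _abstract) in records.items():
                ranking = engine(title)
                rr.append(1.0 / (ranking.index(rid) + 1))
            return sum(rr) / len(rr)

        print("MRR of toy engine:", round(mean_reciprocal_rank(toy_engine), 2))

        # Comparing how two evaluation protocols rank a set of engines (made-up scores):
        nt_scores, trec_scores = [0.41, 0.38, 0.52, 0.29], [0.44, 0.35, 0.55, 0.31]
        print("Spearman correlation:", round(spearmanr(nt_scores, trec_scores).correlation, 2))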

  13. Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules—Search Options and Applications in Food Science

    Directory of Open Access Journals (Sweden)

    Piotr Minkiewicz

    2016-12-01

    Full Text Available Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotated with the help of chemical codes or drawn using molecule-editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of the daily intake of bioactive compounds, prediction of the metabolism of food components and their biological activity, as well as for prediction of interactions between food components and drugs.

  14. PubMed-based quantitative analysis of biomedical publications in the SAARC countries: 1985-2009.

    Science.gov (United States)

    Azim Majumder, Md Anwarul; Shaban, Sami F; Rahman, Sayeeda; Rahman, Nuzhat; Ahmed, Moslehuddin; Bin Abdulrahman, Khalid A; Islam, Ziauddin

    2012-09-01

    To conduct a geographical analysis of biomedical publications from the South Asian Association for Regional Cooperation (SAARC) countries over the past 25 years (1985-2009) using the PubMed database. A qualitative study. Web-based search during September 2010. A data extraction program, developed by one of the authors (SFS), was used to extract the raw publication counts from the downloaded PubMed data. A search of PubMed was performed for all journals indexed by selecting the advanced search option and entering the country name in the 'affiliation' field. The publications were normalized by total population, adult illiteracy rate, gross domestic product (GDP), secondary school enrollment ratio and Internet usage rate. The number of PubMed-listed papers published by the SAARC countries over the last 25 years totalled 141,783, which is 1.1% of the total papers indexed by PubMed in the same period. India alone produced 90.5% of total publications generated by SAARC countries. The average number of papers published per year from 1985 to 2009 was 5671 and the number of publications increased approximately 242-fold. Normalizing by the population (per million) and GDP (per billion), India (133, 27.6%) and Nepal (323, 37.3%) had the highest publications respectively. There was a marked imbalance among the SAARC countries in terms of biomedical research and publication. Because of huge population and the high disease burden, biomedical research and publication output should receive special attention to formulate health policies, re-orient medical education curricula, and alleviate diseases and poverty.
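
    The counting step can be reproduced in outline with Biopython's Entrez interface (a sketch only: NCBI requires a contact e-mail, the affiliation-field query shown here approximates the authors' own extraction program, and the population figures in the commented example are placeholders):

        from Bio import Entrez

        Entrez.email = "you@example.org"   # NCBI requires a contact address; placeholder

        def pubmed_count(country, start_year=1985, end_year=2009):
            """Number of PubMed records whose affiliation field mentions `country`."""
            term = f"{country}[Affiliation] AND {start_year}:{end_year}[dp]"
            handle = Entrez.esearch(db="pubmed", term=term, retmax=0)
            record = Entrez.read(handle)
            handle.close()
            return int(record["Count"])

        # Example (network access required); the per-million normalisation uses
        # made-up population figures, not the study's data:
        # for country, pop_millions in [("India", 1240), ("Nepal", 27)]:
        #     n = pubmed_count(country)
        #     print(country, n, "articles,", round(n / pop_millions, 1), "per million people")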

  15. Database citation in full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
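
    The core of such a pipeline is spotting accession-number patterns in full text before validating them against the databases; the regular expressions below are simplified approximations of the official formats (the UniProt pattern follows UniProt's documented accession format, while the PDB and ENA patterns are deliberately loose), so treat this as a sketch:

        import re

        ACCESSION_PATTERNS = {
            # UniProt accessions, per the format documented by UniProt.
            "UniProt": r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b",
            # PDB entries: a digit followed by three alphanumerics (e.g. 1TUP).
            "PDBe": r"\bPDB[: ]?([0-9][A-Za-z0-9]{3})\b",
            # ENA/GenBank nucleotide accessions such as X81322 or AB123456.
            "ENA": r"\b[A-Z]{1,2}[0-9]{5,6}(?:\.[0-9]+)?\b",
        }

        def find_database_citations(text):
            hits = []
            for db, pattern in ACCESSION_PATTERNS.items():
                for match in re.finditer(pattern, text):
                    hits.append((db, match.group(0)))
            return hits

        sentence = ("The structure of p53 (PDB 1TUP) and the UniProt entry Q9Y6K9 "
                    "were compared with sequence X81322.")
        print(find_database_citations(sentence))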

  16. Design and implementation of Metta, a metasearch engine for biomedical literature retrieval intended for systematic reviewers.

    Science.gov (United States)

    Smalheiser, Neil R; Lin, Can; Jia, Lifeng; Jiang, Yu; Cohen, Aaron M; Yu, Clement; Davis, John M; Adams, Clive E; McDonagh, Marian S; Meng, Weiyi

    2014-01-01

    Individuals and groups who write systematic reviews and meta-analyses in evidence-based medicine regularly carry out literature searches across multiple search engines linked to different bibliographic databases, and thus have an urgent need for a suitable metasearch engine to save time spent on repeated searches and to remove duplicate publications from initial consideration. Unlike general users who generally carry out searches to find a few highly relevant (or highly recent) articles, systematic reviewers seek to obtain a comprehensive set of articles on a given topic, satisfying specific criteria. This creates special requirements and challenges for metasearch engine design and implementation. We created a federated search tool that is connected to five databases: PubMed, EMBASE, CINAHL, PsycINFO, and the Cochrane Central Register of Controlled Trials. Retrieved bibliographic records were shown online; optionally, results could be de-duplicated and exported in both BibTex and XML format. The query interface was extensively modified in response to feedback from users within our team. Besides a general search track and one focused on human-related articles, we also added search tracks optimized to identify case reports and systematic reviews. Although users could modify preset search options, they were rarely if ever altered in practice. Up to several thousand retrieved records could be exported within a few minutes. De-duplication of records returned from multiple databases was carried out in a prioritized fashion that favored retaining citations returned from PubMed. Systematic reviewers are used to formulating complex queries using strategies and search tags that are specific for individual databases. Metta offers a different approach that may save substantial time but which requires modification of current search strategies and better indexing of randomized controlled trial articles. We envision Metta as one piece of a multi-tool pipeline that will assist
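
    The prioritised de-duplication step can be sketched as follows (a hypothetical record layout; Metta's own matching rules are more involved): when several databases return the same article, keep a single copy and prefer the PubMed version.

        SOURCE_PRIORITY = {"PubMed": 0, "EMBASE": 1, "CINAHL": 2, "PsycINFO": 3, "Cochrane": 4}

        def dedupe(records):
            """records: dicts with 'source', 'doi', 'title', 'year'. Keep one record
            per article, preferring the source with the highest priority (PubMed)."""
            best = {}
            for rec in records:
                # Prefer the DOI as the identity key; fall back to normalised title + year.
                key = rec.get("doi") or (rec["title"].casefold().strip(), rec.get("year"))
                current = best.get(key)
                if current is None or SOURCE_PRIORITY[rec["source"]] < SOURCE_PRIORITY[current["source"]]:
                    best[key] = rec
            return list(best.values())

        hits = [
            {"source": "EMBASE", "doi": "10.1000/xyz", "title": "Trial of X", "year": 2012},
            {"source": "PubMed", "doi": "10.1000/xyz", "title": "Trial of X.", "year": 2012},
            {"source": "CINAHL", "doi": None, "title": "Care study", "year": 2010},
        ]
        print(dedupe(hits))   # the 10.1000/xyz article survives as its PubMed record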

  17. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Directory of Open Access Journals (Sweden)

    Thompson Paul

    2011-10-01

    Full Text Available Abstract Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is

  18. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Science.gov (United States)

    2011-01-01

    Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is modelled using the Lexical

  19. [The system of biomedical scientific information of Serbia].

    Science.gov (United States)

    Dacić, M

    1995-09-01

    Building of the System of biomedical scientific information of Yugoslavia (SBMSI YU) began by the end of 1980, and the system officially became operative in 1986. After the political disintegration of the former Yugoslavia, SBMSI of Serbia was formed. SBMSI S is developed according to the development policy of the System of scientific technologic information of Serbia (SSTI S), and with the technical support of SSTI S. Reconstruction of the system was done using the former SBMSI YU as a model. Unlike the former SBMSI YU, SBMSI S holds, besides the database Biomedicina Serbica, three important databases: a database of doctoral dissertations promoted at the University Medical School in Belgrade in the period 1955-1993, a database of Master's theses promoted at the University School of Medicine in Belgrade from 1965 to 1993, and a database of foreign biomedical periodicals in the libraries of Serbia.

  20. The use and misuse of biomedical data: is bigger really better?

    Science.gov (United States)

    Hoffman, Sharona; Podgurski, Andy

    2013-01-01

    Very large biomedical research databases, containing electronic health records (EHR) and genomic data from millions of patients, have been heralded recently for their potential to accelerate scientific discovery and produce dramatic improvements in medical treatments. Research enabled by these databases may also lead to profound changes in law, regulation, social policy, and even litigation strategies. Yet, is "big data" necessarily better data? This paper makes an original contribution to the legal literature by focusing on what can go wrong in the process of biomedical database research and what precautions are necessary to avoid critical mistakes. We address three main reasons for approaching such research with care and being cautious in relying on its outcomes for purposes of public policy or litigation. First, the data contained in biomedical databases is surprisingly likely to be incorrect or incomplete. Second, systematic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions. Third, data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers. In short, this paper sheds much-needed light on the problems of credulous and uninformed acceptance of research results derived from biomedical databases. An understanding of the pitfalls of big data analysis is of critical importance to anyone who will rely on or dispute its outcomes, including lawyers, policymakers, and the public at large. The Article also recommends technical, methodological, and educational interventions to combat the dangers of database errors and abuses.

  1. The pedagogical benefits of a lexical database (SciE-Lex) to assist the production of publishable biomedical texts by EAL writers

    Directory of Open Access Journals (Sweden)

    Natalia Judith Laso

    2017-04-01

    Full Text Available Research has demonstrated that it is challenging for English as an Additional Language (EAL) writers to acquire phraseological competence in academic English and develop a good working knowledge of discipline-specific formulaic language. This paper aims to explore whether SciE-Lex, a powerful lexical database of biomedical research articles, can be exploited by EAL writers to enhance their command of formulaic language in published biomedical English writing. Our paper reports on the challenges associated with formulaic language (namely collocations) for EAL writers, reflects on the benefits of using a lexical database, and evaluates a pedagogical approach to helping EAL writers produce publishable texts. It specifically highlights results from two writing workshops conducted for EAL writers (medical researchers in the present study). The workshops involved medical researchers working on drafts of their writing using SciE-Lex. Our paper reports on the specific benefits of using SciE-Lex as demonstrated by revisions in the writing produced by the EAL medical researchers. This paper aims to contribute to the current discussion on English for Research Publication Purposes (ERPP) for the EAL community, who now form the main contributors to research knowledge dissemination.

  2. Decision making in family medicine: randomized trial of the effects of the InfoClinique and Trip database search engines.

    Science.gov (United States)

    Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France

    2013-10-01

    To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Randomized trial. Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Fifteen second-year family medicine residents. Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine's effect on the decision-making process in clinical practice. Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7

  3. Alkemio: association of chemicals with biomedical topics by text and data mining.

    Science.gov (United States)

    Gijón-Correas, José A; Andrade-Navarro, Miguel A; Fontaine, Jean F

    2014-07-01

    The PubMed® database of biomedical citations allows the retrieval of scientific articles studying the function of chemicals in biology and medicine. Mining millions of available citations to search reported associations between chemicals and topics of interest would require substantial human time. We have implemented the Alkemio text mining web tool and SOAP web service to help in this task. The tool uses biomedical articles discussing chemicals (including drugs), predicts their relatedness to the query topic with a naïve Bayesian classifier and ranks all chemicals by P-values computed from random simulations. Benchmarks on seven human pathways showed good retrieval performance (areas under the receiver operating characteristic curves ranged from 73.6 to 94.5%). Comparison with existing tools to retrieve chemicals associated to eight diseases showed the higher precision and recall of Alkemio when considering the top 10 candidate chemicals. Alkemio is a high performing web tool ranking chemicals for any biomedical topics and it is free to non-commercial users. http://cbdm.mdc-berlin.de/∼medlineranker/cms/alkemio. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
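
    As an illustration of the ranking step described above (a sketch under assumptions, not Alkemio's actual implementation; the scores and sample sizes are hypothetical), an empirical P-value for a chemical can be estimated by comparing its observed topic-relatedness score against scores of randomly drawn article sets of the same size:

      import random

      def empirical_p_value(observed_score, article_scores, sample_size,
                            n_simulations=10000, seed=0):
          """Estimate how often a random article set scores at least as high as
          the observed chemical-topic relatedness score."""
          rng = random.Random(seed)
          hits = 0
          for _ in range(n_simulations):
              simulated = sum(rng.sample(article_scores, sample_size)) / sample_size
              if simulated >= observed_score:
                  hits += 1
          # Add-one correction keeps the estimate away from exactly zero.
          return (hits + 1) / (n_simulations + 1)

      # Hypothetical per-article classifier scores for the query topic.
      scores = [random.random() for _ in range(5000)]
      print(empirical_p_value(observed_score=0.8, article_scores=scores, sample_size=20))

    Chemicals would then be ranked by ascending P-value, the smallest values indicating the strongest association with the query topic.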

  4. Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases

    DEFF Research Database (Denmark)

    Assent, Ira; Wichterich, Marc; Meisen, Tobias

    2008-01-01

    Multimedia similarity search in large databases requires efficient query processing. The Earth mover's distance, introduced in computer vision, is successfully used as a similarity model in a number of small-scale applications. Its computational complexity hindered its adoption in large multimedia...... databases. We enable directly indexing the Earth mover's distance in structures such as the R-tree and the VA-file by providing the accurate 'MinDist' function to any bounding rectangle in the index. We exploit the computational structure of the new MinDist to derive a new lower bound for the EMD Min...
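
    The pruning idea can be illustrated in one dimension, where the Earth Mover's Distance between equal-mass samples is the 1-D Wasserstein distance and the absolute difference of the sample means is a cheap lower bound that never overestimates the true distance. The sketch below (not the authors' MinDist construction; data and threshold are hypothetical) uses such a bound in a filter-and-refine loop:

      import numpy as np
      from scipy.stats import wasserstein_distance

      def emd_1d(u_values, v_values):
          """Exact 1-D Earth Mover's Distance between two equal-mass samples."""
          return wasserstein_distance(u_values, v_values)

      def mean_lower_bound(u_values, v_values):
          """|mean(u) - mean(v)| never exceeds the 1-D EMD, so it can prune safely."""
          return abs(np.mean(u_values) - np.mean(v_values))

      rng = np.random.default_rng(1)
      query = rng.random(64)
      database = [rng.random(64) for _ in range(1000)]
      threshold = 0.05

      # Filter-and-refine: cheap bound first, exact EMD only for the survivors.
      candidates = [x for x in database if mean_lower_bound(query, x) <= threshold]
      results = [x for x in candidates if emd_1d(query, x) <= threshold]
      print(len(database), len(candidates), len(results))

    Index-based MinDist functions generalise the same principle: the bound discards candidates without ever computing the expensive exact distance.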

  5. Quantum Partial Searching Algorithm of a Database with Several Target Items

    International Nuclear Information System (INIS)

    Pu-Cha, Zhong; Wan-Su, Bao; Yun, Wei

    2009-01-01

    Choi and Korepin [Quantum Information Processing 6(2007)243] presented a quantum partial search algorithm of a database with several target items which can find a target block quickly when each target block contains the same number of target items. Actually, the number of target items in each target block is arbitrary. Aiming at this case, we give a condition to guarantee performance of the partial search algorithm to be performed and the number of queries to oracle of the algorithm to be minimized. In addition, by further numerical computing we come to the conclusion that the more uniform the distribution of target items, the smaller the number of queries

  6. Preliminary comparison of the Essie and PubMed search engines for answering clinical questions using MD on Tap, a PDA-based program for accessing biomedical literature.

    Science.gov (United States)

    Sutton, Victoria R; Hauser, Susan E

    2005-01-01

    MD on Tap, a PDA application that searches and retrieves biomedical literature, is specifically designed for use by mobile healthcare professionals. With the goal of improving the usability of the application, a preliminary comparison was made of two search engines (PubMed and Essie) to determine which provided most efficient path to the desired clinically-relevant information.

  7. Dialysis search filters for PubMed, Ovid MEDLINE, and Embase databases.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Lee, Christopher W C; Wilczynski, Nancy L; McKibbon, Ann; Shariff, Salimah Z; Blake, Peter G; Lindsay, Robert M; Garg, Amit X

    2012-10-01

    Physicians frequently search bibliographic databases, such as MEDLINE via PubMed, for best evidence for patient care. The objective of this study was to develop and test search filters to help physicians efficiently retrieve literature related to dialysis (hemodialysis or peritoneal dialysis) from all other articles indexed in PubMed, Ovid MEDLINE, and Embase. A diagnostic test assessment framework was used to develop and test robust dialysis filters. The reference standard was a manual review of the full texts of 22,992 articles from 39 journals to determine whether each article contained dialysis information. Next, 1,623,728 unique search filters were developed, and their ability to retrieve relevant articles was evaluated. The high-performance dialysis filters consisted of up to 65 search terms in combination. These terms included the words "dialy" (truncated), "uremic," "catheters," and "renal transplant wait list." These filters reached peak sensitivities of 98.6% and specificities of 98.5%. The filters' performance remained robust in an independent validation subset of articles. These empirically derived and validated high-performance search filters should enable physicians to effectively retrieve dialysis information from PubMed, Ovid MEDLINE, and Embase.

  8. A searching and reporting system for relational databases using a graph-based metadata representation.

    Science.gov (United States)

    Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling

    2005-01-01

    Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
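
    A minimal sketch of the general idea (not the Anadys system; table names, columns, and join conditions are hypothetical): hold the schema as a graph whose nodes are tables and whose edges carry join conditions, find a join path between the user's chosen tables by breadth-first search, and emit the SQL automatically.

      from collections import deque

      # Hypothetical schema graph: table -> {neighbouring table: join condition}
      SCHEMA = {
          "compound": {"assay_result": "compound.id = assay_result.compound_id"},
          "assay_result": {"compound": "compound.id = assay_result.compound_id",
                           "assay": "assay.id = assay_result.assay_id"},
          "assay": {"assay_result": "assay.id = assay_result.assay_id"},
      }

      def join_path(start, goal):
          """Breadth-first search for the shortest chain of joins between two tables."""
          queue, seen = deque([[start]]), {start}
          while queue:
              path = queue.popleft()
              if path[-1] == goal:
                  return path
              for neighbour in SCHEMA[path[-1]]:
                  if neighbour not in seen:
                      seen.add(neighbour)
                      queue.append(path + [neighbour])
          return None

      def build_sql(columns, start, goal):
          path = join_path(start, goal)
          sql = "SELECT " + ", ".join(columns) + " FROM " + path[0]
          for a, b in zip(path, path[1:]):
              sql += " JOIN " + b + " ON " + SCHEMA[a][b]
          return sql

      print(build_sql(["compound.name", "assay.name"], "compound", "assay"))

    The user only picks tables and columns; the joins are reconstructed from the metadata graph.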

  9. [Advanced online search techniques and dedicated search engines for physicians].

    Science.gov (United States)

    Nahum, Yoav

    2008-02-01

    In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.

  10. Methods and pitfalls in searching drug safety databases utilising the Medical Dictionary for Regulatory Activities (MedDRA).

    Science.gov (United States)

    Brown, Elliot G

    2003-01-01

    The Medical Dictionary for Regulatory Activities (MedDRA) is a unified standard terminology for recording and reporting adverse drug event data. Its introduction is widely seen as a significant improvement on the previous situation, where a multitude of terminologies of widely varying scope and quality were in use. However, there are some complexities that may cause difficulties, and these will form the focus for this paper. Two methods of searching MedDRA-coded databases are described: searching based on term selection from all of MedDRA and searching based on terms in the safety database. There are several potential traps for the unwary in safety searches. There may be multiple locations of relevant terms within a system organ class (SOC) and lack of recognition of appropriate group terms; the user may think that group terms are more inclusive than is the case. MedDRA may distribute terms relevant to one medical condition across several primary SOCs. If the database supports the MedDRA model, it is possible to perform multiaxial searching: while this may help find terms that might have been missed, it is still necessary to consider the entire contents of the SOCs to find all relevant terms and there are many instances of incomplete secondary linkages. It is important to adjust for multiaxiality if data are presented using primary and secondary locations. Other sources for errors in searching are non-intuitive placement and the selection of terms as preferred terms (PTs) that may not be widely recognised. Some MedDRA rules could also result in errors in data retrieval if the individual is unaware of these: in particular, the lack of multiaxial linkages for the Investigations SOC, Social circumstances SOC and Surgical and medical procedures SOC and the requirement that a PT may only be present under one High Level Term (HLT) and one High Level Group Term (HLGT) within any single SOC. Special Search Categories (collections of PTs assembled from various SOCs by

  11. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    National Research Council Canada - National Science Library

    Ortega-Binderberger, Michael

    2002-01-01

    ... as a critical area of research. This thesis explores how to enhance database systems with content based search over arbitrary abstract data types in a similarity based framework with query refinement...

  12. The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.

    Science.gov (United States)

    Storvik, Geir; Egeland, Thore

    2007-09-01

    Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been motivated by frequentist arguments and has been advocated by the American National Research Council and by Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
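
    In outline (standard quantities only, not the authors' derivation): write p for the random-match probability of the profile and n for the size of the searched database. The likelihood ratio preferred by most experts and the frequentist probability of at least one match among n non-donors, which underlies the np rule, can be written in LaTeX as

      \mathrm{LR} = \frac{\Pr(\text{match} \mid \text{suspect is the donor})}{\Pr(\text{match} \mid \text{suspect is not the donor})} = \frac{1}{p},
      \qquad
      \Pr(\text{at least one match among } n \text{ non-donors}) = 1 - (1 - p)^{n} \approx np \quad (np \ll 1),

    which illustrates why a frequentist P-value for the database search comes out close to the np rule, while the likelihood ratio itself does not depend on n.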

  13. Allie: a database and a search service of abbreviations and long forms

    Science.gov (United States)

    Yamamoto, Yasunori; Yamaguchi, Atsuko; Bono, Hidemasa; Takagi, Toshihisa

    2011-01-01

    Many abbreviations are used in the literature especially in the life sciences, and polysemous abbreviations appear frequently, making it difficult to read and understand scientific papers that are outside of a reader’s expertise. Thus, we have developed Allie, a database and a search service of abbreviations and their long forms (a.k.a. full forms or definitions). Allie searches for abbreviations and their corresponding long forms in a database that we have generated based on all titles and abstracts in MEDLINE. When a user query matches an abbreviation, Allie returns all potential long forms of the query along with their bibliographic data (i.e. title and publication year). In addition, for each candidate, co-occurring abbreviations and a research field in which it frequently appears in the MEDLINE data are displayed. This function helps users learn about the context in which an abbreviation appears. To deal with synonymous long forms, we use a dictionary called GENA that contains domain-specific terms such as gene, protein or disease names along with their synonymic information. Conceptually identical domain-specific terms are regarded as one term, and then conceptually identical abbreviation-long form pairs are grouped taking into account their appearance in MEDLINE. To keep up with new abbreviations that are continuously introduced, Allie has an automatic update system. In addition, the database of abbreviations and their long forms with their corresponding PubMed IDs is constructed and updated weekly. Database URL: The Allie service is available at http://allie.dbcls.jp/. PMID:21498548
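
    A toy sketch of the kind of abbreviation-long form pairing such a resource stores (not Allie's actual extraction pipeline; the sentences and matching heuristic are illustrative): short forms introduced in parentheses are matched to the immediately preceding words, and a polysemous abbreviation then maps to several long forms.

      import re
      from collections import defaultdict

      def extract_pairs(text):
          """Naive 'long form (ABBR)' extraction: take as many preceding words
          as the abbreviation has letters."""
          pairs = []
          for match in re.finditer(r"\(([A-Z]{2,6})\)", text):
              abbr = match.group(1)
              words = text[:match.start()].rstrip().split()
              pairs.append((abbr, " ".join(words[-len(abbr):])))
          return pairs

      index = defaultdict(set)
      sentences = ["We measured the estimated glomerular filtration rate (GFR).",
                   "Growth factor receptor (GFR) signalling was blocked."]
      for sentence in sentences:
          for abbr, long_form in extract_pairs(sentence):
              index[abbr].add(long_form)

      print(index["GFR"])   # two competing long forms for the same abbreviation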

  14. Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

    Science.gov (United States)

    Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.

  15. A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos.

    Science.gov (United States)

    Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian

    2016-04-01

    Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today's keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users' information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. Results: The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable

  16. Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors

    Science.gov (United States)

    Chikhi, Rayan; Sael, Lee; Kihara, Daisuke

    2010-01-01

    Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of a particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark study employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259
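
    The retrieval step itself is straightforward once every pocket has been reduced to a fixed-length descriptor vector. A sketch with hypothetical, precomputed descriptors (the dimensionality is only indicative) ranks database pockets by Euclidean distance to the query descriptor:

      import numpy as np

      def rank_pockets(query_descriptor, database_descriptors, top_k=5):
          """Indices of the top_k most similar pockets (smallest Euclidean distance)."""
          db = np.asarray(database_descriptors)
          distances = np.linalg.norm(db - query_descriptor, axis=1)
          return np.argsort(distances)[:top_k]

      rng = np.random.default_rng(42)
      database = rng.random((10000, 121))   # hypothetical fixed-length pocket descriptors
      query = rng.random(121)
      print(rank_pockets(query, database))

    Because the descriptors are compact, such a comparison can be run against an entire database in real time.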

  17. MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.

    Science.gov (United States)

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M

    2011-07-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses.
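
    A rough sketch of one way posterior probabilities for a combined PSM list can be turned into a false discovery rate estimate (not MSblender's actual model; the probabilities are hypothetical): accept PSMs above a probability cut-off and estimate the FDR as the mean posterior error probability of the accepted set.

      def estimate_fdr(psm_probabilities, cutoff):
          """FDR of the accepted set, estimated as the average (1 - posterior probability)."""
          accepted = [p for p in psm_probabilities if p >= cutoff]
          if not accepted:
              return 0.0, 0
          fdr = sum(1.0 - p for p in accepted) / len(accepted)
          return fdr, len(accepted)

      # Hypothetical combined probabilities for peptide-spectrum matches.
      probabilities = [0.99, 0.97, 0.95, 0.90, 0.60, 0.40, 0.10]
      for cutoff in (0.9, 0.5):
          fdr, n = estimate_fdr(probabilities, cutoff)
          print("cutoff", cutoff, ":", n, "PSMs accepted, estimated FDR", round(fdr, 3))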

  18. Design and implementation of a biomedical image database (BDIM).

    Science.gov (United States)

    Aubry, F; Badaoui, S; Kaplan, H; Di Paola, R

    1988-01-01

    We developed a biomedical image database (BDIM) which proposes a standardized representation of value arrays such as images and curves, and of their associated parameters, independently of their acquisition mode, to make their transmission and processing easier. It includes three kinds of interactions, oriented to the users. The network concept was kept as a constraint to incorporate the BDIM in a distributed structure, and we maintained compatibility with the ACR/NEMA communication protocol. The management of arrays and their associated parameters includes two distinct bases of objects, linked together via a gateway. The first one manages arrays according to their storage mode: long-term storage on optionally on-line mass storage devices and, for consultations, partial copies of long-term stored arrays on hard disk. The second one manages the associated parameters and the gateway by means of the relational DBMS ORACLE. Parameters are grouped into relations. Some of them are in agreement with groups defined by the ACR/NEMA. The other relations describe objects resulting from processing of the initial objects. These new objects are not described by the ACR/NEMA, but they can be inserted as shadow groups of the ACR/NEMA description. The relations describing the storage and their pathname constitute the gateway. ORACLE distributed tools and the two-level storage technique will allow the integration of the BDIM into a distributed structure. The query and array (alone or in sequences) retrieval module has access to the relations via a level in which a dictionary managed by ORACLE is included. This dictionary translates ACR/NEMA objects into objects that can be handled by the DBMS. (ABSTRACT TRUNCATED AT 250 WORDS)

  19. Validation of SmartRank: A likelihood ratio software for searching national DNA databases with complex DNA profiles.

    Science.gov (United States)

    Benschop, Corina C G; van de Merwe, Linda; de Jong, Jeroen; Vanvooren, Vanessa; Kempenaers, Morgane; Kees van der Beek, C P; Barni, Filippo; Reyes, Eusebio López; Moulin, Léa; Pene, Laurent; Haned, Hinda; Sijen, Titia

    2017-07-01

    Searching a national DNA database with complex and incomplete profiles usually yields very large numbers of possible matches that can present many candidate suspects to be further investigated by the forensic scientist and/or police. Current practice in most forensic laboratories consists of ordering these 'hits' based on the number of matching alleles with the searched profile. Thus, candidate profiles that share the same number of matching alleles are not differentiated, and due to the lack of other ranking criteria for the candidate list it may be difficult to discern a true match from the false positives or to notice that all candidates are in fact false positives. SmartRank was developed to put forward only relevant candidates and rank them accordingly. The SmartRank software computes a likelihood ratio (LR) for the searched profile and each profile in the DNA database and ranks database entries above a defined LR threshold according to the calculated LR. In this study, we examined, for mixed DNA profiles of variable complexity, whether the true donors are retrieved, what the number of false positives above an LR threshold is, and what the ranking position of the true donors is. Using 343 mixed DNA profiles, over 750 SmartRank searches were performed. In addition, the performance of SmartRank and CODIS was compared regarding DNA database searches, and SmartRank was found to be complementary to CODIS. We also describe the applicable domain of SmartRank and provide guidelines. The SmartRank software is open-source and freely available. Using the best practice guidelines, SmartRank enables obtaining investigative leads in criminal cases lacking a suspect. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Analysis of Users' Searches of CD-ROM Databases in the National and University Library in Zagreb.

    Science.gov (United States)

    Jokic, Maja

    1997-01-01

    Investigates the search behavior of CD-ROM database users in Zagreb (Croatia) libraries: one group needed a minimum of technical assistance, and the other was completely independent. Highlights include the use of questionnaires and transaction log analysis and the need for end-user education. The questionnaire and definitions of search process…

  1. Building and evaluating an informatics tool to facilitate analysis of a biomedical literature search service in an academic medical center library.

    Science.gov (United States)

    Hinton, Elizabeth G; Oelschlegel, Sandra; Vaughn, Cynthia J; Lindsay, J Michael; Hurst, Sachiko M; Earl, Martha

    2013-01-01

    This study utilizes an informatics tool to analyze a robust literature search service in an academic medical center library. Structured interviews with librarians were conducted focusing on the benefits of such a tool, expectations for performance, and visual layout preferences. The resulting application utilizes Microsoft SQL Server and .Net Framework 3.5 technologies, allowing for the use of a web interface. Customer tables and MeSH terms are included. The National Library of Medicine MeSH database and entry terms for each heading are incorporated, resulting in functionality similar to searching the MeSH database through PubMed. Data reports will facilitate analysis of the search service.

  2. An Online Database Producer's Memoirs and Memories of an Online Pioneer and The Database Industry: Looking into the Future.

    Science.gov (United States)

    Kollegger, James G.; And Others

    1988-01-01

    In the first of three articles, the producer of Energyline, Energynet, and Tele/Scope recalls the development of the databases and database business strategies. The second describes the development of biomedical online databases, and the third discusses future developments, including full text databases, database producers as online host, and…

  3. WAIS Searching of the Current Contents Database

    Science.gov (United States)

    Banholzer, P.; Grabenstein, M. E.

    The Homer E. Newell Memorial Library of NASA's Goddard Space Flight Center is developing capabilities to permit Goddard personnel to access electronic resources of the Library via the Internet. The Library's support services contractor, Maxima Corporation, and their subcontractor, SANAD Support Technologies, have recently developed a World Wide Web Home Page (http://www-library.gsfc.nasa.gov) to provide the primary means of access. The first searchable database to be made available through the Home Page to Goddard employees is Current Contents, from the Institute for Scientific Information (ISI). The initial implementation includes coverage of articles from the last few months of 1992 to the present. These records are augmented with abstracts and references, and are often more robust than equivalent records in bibliographic databases that currently serve the astronomical community. Maxima/SANAD selected Wais Incorporated's WAIS product with which to build the interface to Current Contents. This system allows access from Macintosh, IBM PC, and Unix hosts, which is an important feature for Goddard's multiplatform environment. The forms interface is structured to allow both fielded (author, article title, journal name, id number, keyword, subject term, and citation) and unfielded WAIS searches. The system allows a user to: retrieve individual journal article records; retrieve the table of contents of specific issues of journals; connect to articles with similar subject terms or keywords; connect to other issues of the same journal in the same year; and browse journal issues from an alphabetical list of indexed journal names.

  4. Description of color/race in Brazilian biomedical research.

    Science.gov (United States)

    Ribeiro, Teresa Veronica Catonho; Ferreira, Luzitano Brandão

    2012-01-01

    Over recent years, the terms race and ethnicity have been used to ascertain inequities in public health. However, this use depends on the quality of the data available. This study aimed to investigate the description of color/race in Brazilian scientific journals within the field of biomedicine. Descriptive study with systematic search for scientific articles in the SciELO Brazil database. A wide-ranging systematic search for original articles involving humans, published in 32 Brazilian biomedical scientific journals in the SciELO Brazil database between January and December 2008, was performed. Articles in which the race/ethnicity of the participants was identified were analyzed. In total, 1,180 articles were analyzed. The terms for describing race or ethnicity were often ambiguous and vague. Descriptions of race or ethnicity occurred in 159 articles (13.4%), but only in 42 (26.4%) was there a description of how individuals were identified. In these, race and ethnicity were used almost interchangeably and definition was according to skin color (71.4%), ancestry (19.0%) and self-definition (9.6%). Twenty-two races or ethnicities were cited, and the most common were white (37.3%), black (19.7%), mixed (12.9%), nonwhite (8.1%) and yellow (8.1%). The absence of descriptions of parameters for defining race, as well as the use of vague and ambiguous terms, may hamper and even prevent comparisons between human groups and the use of these data to ascertain inequities in healthcare.

  5. Colil: a database and search service for citation contexts in the life sciences domain.

    Science.gov (United States)

    Fujiwara, Toyofumi; Yamamoto, Yasunori

    2015-01-01

    To promote research activities in a particular research area, it is important to efficiently identify current research trends, advances, and issues in that area. Although review papers in the research area can suffice for this purpose in general, researchers cannot always obtain review papers covering the specific research aspects of interest to them at the time they need them. Therefore, the utilization of the citation contexts of papers in a research area has been considered as another approach. However, there are few search services to retrieve citation contexts in the life sciences domain; furthermore, efficiently obtaining citation contexts is becoming difficult due to the large volume and rapid growth of life sciences papers. Here, we introduce the Colil (Comments on Literature in Literature) database to store citation contexts in the life sciences domain. By using the Resource Description Framework (RDF) and a newly compiled vocabulary, we built the Colil database and made it available through a SPARQL endpoint. In addition, we developed a web-based search service called Colil that searches for a cited paper in the Colil database and then returns a list of citation contexts for it along with papers relevant to it based on co-citations. The citation contexts in the Colil database were extracted from full-text papers of the PubMed Central Open Access Subset (PMC-OAS), which includes 545,147 papers indexed in PubMed. These papers are distributed across 3,171 journals and cite 5,136,741 unique papers that correspond to approximately 25% of total PubMed entries. By utilizing Colil, researchers can easily refer to a set of citation contexts and relevant papers based on co-citations for a target paper. Colil thus helps researchers comprehend life sciences papers in a research area and makes their biological research more efficient.
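
    A small sketch of the co-citation idea used to surface relevant papers (independent of Colil's RDF schema; the citation data are hypothetical): papers cited together with the target inside the same citing papers are counted, and the most frequently co-cited ones are returned.

      from collections import Counter

      # Hypothetical citation data: citing paper -> set of cited papers.
      citations = {
          "PMC1": {"P10", "P11", "P12"},
          "PMC2": {"P10", "P11"},
          "PMC3": {"P10", "P13"},
      }

      def co_cited_with(target, citations, top_k=5):
          counts = Counter()
          for cited in citations.values():
              if target in cited:
                  counts.update(cited - {target})
          return counts.most_common(top_k)

      print(co_cited_with("P10", citations))   # [('P11', 2), ('P12', 1), ('P13', 1)]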

  6. Annals of Biomedical Sciences: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  7. License - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available The Trypanosomes Database is distributed under the Creative Commons Attribution-Share Alike 2.1 Japan license. Users of data from this database are asked to attribute the Trypanosomes Database as the source.

  8. Figure mining for biomedical research.

    Science.gov (United States)

    Rodriguez-Esteban, Raul; Iossifov, Ivan

    2009-08-15

    Figures from biomedical articles contain valuable information difficult to reach without specialized tools. Currently, there is no search engine that can retrieve specific figure types. This study describes a retrieval method that takes advantage of principles in image understanding, text mining and optical character recognition (OCR) to retrieve figure types defined conceptually. A search engine was developed to retrieve tables and figure types to aid computational and experimental research. http://iossifovlab.cshl.edu/figurome/.

  9. [Evidence-based medicine. 2. Research of clinically relevant biomedical information. Gruppo Italiano per la Medicina Basata sulle Evidenze--GIMBE].

    Science.gov (United States)

    Cartabellotta, A

    1998-05-01

    Evidence-based Medicine is a product of the electronic information age, and there are several databases useful for practicing it--MEDLINE, EMBASE, specialized compendia of evidence (Cochrane Library, Best Evidence), and practice guidelines--most of them freely available through the Internet, which offers a growing number of health resources. Because searching for the best evidence is a basic step in practicing Evidence-based Medicine, this second review (the first was published in the issue of March 1998) aims to provide physicians with tools and skills for retrieving relevant biomedical information. We therefore discuss strategies for managing information overload, analyze the characteristics, usefulness, and limits of medical databases, and explain how to use MEDLINE in day-to-day clinical practice.

  10. Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches.

    Science.gov (United States)

    Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd

    2017-07-07

    Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows easy access to splice variant-specific peptide sequences that match MS data. Furthermore, we combined this database, excluding the peptides specific for splice variant 1, with human Swiss-Prot. This combined database can be used as a general database for searching LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several peptides specific for splice variants other than variant 1 were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several such splice variant-specific peptides, and some of them showed clear quantitative differences between the two states.
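
    In outline (a simplified sketch, not the authors' pipeline; the sequences are invented): each splice-variant sequence is digested in silico with trypsin, which cleaves after K or R unless the next residue is P, and only peptides that occur in exactly one variant are kept for the database.

      import re
      from collections import defaultdict

      def tryptic_peptides(sequence, min_length=6):
          """In-silico trypsin digest: cleave after K or R unless followed by P."""
          peptides = re.split(r"(?<=[KR])(?!P)", sequence)
          return [p for p in peptides if len(p) >= min_length]

      # Hypothetical splice variants of one gene.
      variants = {
          "ISOFORM-1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
          "ISOFORM-2": "MKTAYIAKQRQISFVKAAAGLIEVQ",
      }

      peptide_to_variants = defaultdict(set)
      for name, seq in variants.items():
          for peptide in tryptic_peptides(seq):
              peptide_to_variants[peptide].add(name)

      unique = {p: v for p, v in peptide_to_variants.items() if len(v) == 1}
      print(unique)   # peptides that pinpoint a single splice form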

  11. Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

    Science.gov (United States)

    Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

    2018-01-01

    Abstract The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie
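
    A compact sketch of the keyword-boosting idea reported to work best (not the challenge submission itself; documents, terms, and boost values are hypothetical): important query terms receive higher weights in a tf-idf-style retrieval score, so datasets matching those terms rise in the ranking.

      import math
      from collections import Counter

      documents = [
          "rna seq expression dataset human liver tissue",
          "mass spectrometry proteomics dataset mouse brain",
          "liver gene expression microarray dataset",
      ]
      tokenised = [d.split() for d in documents]
      df = Counter(t for doc in tokenised for t in set(doc))   # document frequencies
      N = len(documents)

      def score(query_weights, doc_tokens):
          """tf-idf score with per-term boosts supplied by the caller."""
          tf = Counter(doc_tokens)
          return sum(boost * tf[t] * math.log((N + 1) / (df[t] + 1))
                     for t, boost in query_weights.items())

      # Boost the terms judged important in the verbose query description.
      query = {"liver": 2.0, "expression": 1.5, "dataset": 0.5}
      ranking = sorted(range(N), key=lambda i: score(query, tokenised[i]), reverse=True)
      print(ranking)   # documents ordered by boosted relevance score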

  12. Identifying complications of interventional procedures from UK routine healthcare databases: a systematic search for methods using clinical codes.

    Science.gov (United States)

    Keltie, Kim; Cole, Helen; Arber, Mick; Patrick, Hannah; Powell, John; Campbell, Bruce; Sims, Andrew

    2014-11-28

    Several authors have developed and applied methods to routine data sets to identify the nature and rate of complications following interventional procedures. But, to date, there has been no systematic search for such methods. The objective of this article was to find, classify and appraise published methods, based on analysis of clinical codes, which used routine healthcare databases in a United Kingdom setting to identify complications resulting from interventional procedures. A literature search strategy was developed to identify published studies that referred, in the title or abstract, to the name or acronym of a known routine healthcare database and to complications from procedures or devices. The following data sources were searched in February and March 2013: Cochrane Methods Register, Conference Proceedings Citation Index - Science, Econlit, EMBASE, Health Management Information Consortium, Health Technology Assessment database, MathSciNet, MEDLINE, MEDLINE in-process, OAIster, OpenGrey, Science Citation Index Expanded and ScienceDirect. Of the eligible papers, those which reported methods using clinical coding were classified and summarised in tabular form using the following headings: routine healthcare database; medical speciality; method for identifying complications; length of follow-up; method of recording comorbidity. The benefits and limitations of each approach were assessed. From 3688 papers identified from the literature search, 44 reported the use of clinical codes to identify complications, from which four distinct methods were identified: 1) searching the index admission for specified clinical codes, 2) searching a sequence of admissions for specified clinical codes, 3) searching for specified clinical codes for complications from procedures and devices within the International Classification of Diseases 10th revision (ICD-10) coding scheme which is the methodology recommended by NHS Classification Service, and 4) conducting manual clinical
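
    As a sketch of the second approach (searching a sequence of admissions for specified codes; the patient records, code lists, and follow-up window below are all hypothetical), a complication can be flagged when a complication code appears within a fixed window after the index admission:

      from datetime import date, timedelta

      # Hypothetical admissions: (patient_id, admission_date, set of clinical codes).
      admissions = [
          ("p1", date(2013, 1, 10), {"K40.9"}),   # index admission for the procedure
          ("p1", date(2013, 1, 25), {"T81.4"}),   # readmission carrying a complication code
          ("p2", date(2013, 2, 1), {"K40.9"}),
      ]

      INDEX_CODES = {"K40.9"}            # illustrative code marking the index admission
      COMPLICATION_CODES = {"T81.4"}     # illustrative post-procedural complication code
      FOLLOW_UP = timedelta(days=30)

      def patients_with_complications(admissions, index_codes, complication_codes):
          index_dates = {pid: d for pid, d, codes in admissions if codes & index_codes}
          flagged = set()
          for pid, d, codes in admissions:
              start = index_dates.get(pid)
              if start and start < d <= start + FOLLOW_UP and codes & complication_codes:
                  flagged.add(pid)
          return flagged

      print(patients_with_complications(admissions, INDEX_CODES, COMPLICATION_CODES))   # {'p1'}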

  13. Development and evaluation of a biomedical search engine using a predicate-based vector space model.

    Science.gov (United States)

    Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

    2013-10-01

    Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex data structures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf and boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher for the predicate-based approach than for keyword-based searching, laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.
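
    A minimal sketch of a predicate-based vector space (not the authors' adjusted tf-idf and boost function; the triples are invented): each document is represented by its subject-predicate-object triples rather than by keywords, and query-document similarity is cosine similarity over triple counts.

      import math
      from collections import Counter

      def cosine(a, b):
          dot = sum(a[k] * b[k] for k in a if k in b)
          norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
          return dot / norm if norm else 0.0

      # Hypothetical extracted triples; in practice these come from a semantic parser.
      doc_triples = {
          "doc1": [("p53", "regulates", "apoptosis"), ("radiation", "induces", "apoptosis")],
          "doc2": [("p53", "binds", "mdm2")],
      }
      query_triples = [("p53", "regulates", "apoptosis")]

      query_vector = Counter(query_triples)
      ranked = sorted(doc_triples,
                      key=lambda d: cosine(query_vector, Counter(doc_triples[d])),
                      reverse=True)
      print(ranked)   # doc1 ranks above doc2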

  14. Zirconia in biomedical applications.

    Science.gov (United States)

    Chen, Yen-Wei; Moussi, Joelle; Drury, Jeanie L; Wataha, John C

    2016-10-01

    The use of zirconia in medicine and dentistry has rapidly expanded over the past decade, driven by its advantageous physical, biological, esthetic, and corrosion properties. Zirconia orthopedic hip replacements have shown superior wear-resistance over other systems; however, risk of catastrophic fracture remains a concern. In dentistry, zirconia has been widely adopted for endosseous implants, implant abutments, and all-ceramic crowns. Because of an increasing demand for esthetically pleasing dental restorations, zirconia-based ceramic restorations have become one of the dominant restorative choices. Areas covered: This review provides an updated overview of the applications of zirconia in medicine and dentistry with a focus on dental applications. The MEDLINE electronic database (via PubMed) was searched, and relevant original and review articles from 2010 to 2016 were included. Expert commentary: Recent data suggest that zirconia performs favorably in both orthopedic and dental applications, but quality long-term clinical data remain scarce. Concerns about the effects of wear, crystalline degradation, crack propagation, and catastrophic fracture are still debated. The future of zirconia in biomedical applications will depend on the generation of these data to resolve concerns.

  15. [Evaluation of journal use in the "EBSCO Biomedical Reference Collection" database at the Library and Medical Informatics Unit (UPIK), Faculty of Medicine, UGM, Yogyakarta]

    Directory of Open Access Journals (Sweden)

    Eka Wardhani S.

    2015-12-01

    Full Text Available Evaluation of collection use is essential for determining how much a collection is accessed and used by its users. The EBSCO Biomedical Reference Collection (Ebsco BRC) is a journal database based on the access paradigm. This evaluation of journal use in the Ebsco BRC database is a study of library collection use conducted at UPIK (the Library and Medical Informatics Unit) of the Faculty of Medicine, Universitas Gadjah Mada, Yogyakarta. The study aimed to determine the level of journal use by the academic community of the Faculty of Medicine, UGM. The evaluation used a descriptive method with quantitative and qualitative data. The instruments used were a questionnaire and usage statistics reports. The results show that the level of journal use by title was high (97.96%), but access was not yet maximal: on average, 25% of the journals were accessed each day. The usage statistics reports identified 12 journal titles that were accessed more than 1,000 times, making them the journals most frequently accessed by users. Based on these findings, the researchers recommend that the subscription to the EBSCO database collection be continued, but that UPIK should work to improve collection promotion, accessibility, facilities, and user guidance for searching the database so that it can be used to its full potential. Keywords: collection evaluation, Ebsco

  16. Biomedical and Health Informatics Education – the IMIA Years

    Science.gov (United States)

    2016-01-01

    Summary. Objective: This paper presents the development of medical informatics education during the years from the establishment of the International Medical Informatics Association (IMIA) until today. Method: A literature search was performed using search engines and appropriate keywords, as well as a manual selection of papers. The search covered English-language papers and was limited to paper titles and abstracts only. Results: The aggregated papers were analyzed on the basis of subject area, origin, time span, and curriculum development, and conclusions were drawn. Conclusions: From the results, it is evident that IMIA has played a major role in comparing and integrating the Biomedical and Health Informatics educational efforts across the different levels of education and the regional distribution of educators and institutions. A large selection of references is presented, facilitating future work in the field of education in biomedical and health informatics. PMID:27488405

  17. Biomedical information retrieval across languages.

    Science.gov (United States)

    Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger

    2007-06-01

    This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
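
    A toy sketch of the subword idea (the real lexicon is far larger and partly hand-built; the fragments and interlingual identifiers below are invented): a term is greedily segmented into the longest known subwords, and languages are bridged because equivalent subwords in different languages share the same interlingual identifier.

      # Hypothetical subword lexicon mapping surface fragments to interlingual IDs.
      SUBWORDS = {
          "gastr": "#stomach", "enter": "#intestine", "itis": "#inflammation",
          "magen": "#stomach", "darm": "#intestine", "entzuendung": "#inflammation",
          "o": None,   # linking vowel, carries no meaning
      }

      def segment(term):
          """Greedy longest-match segmentation into known subwords."""
          ids, i = [], 0
          term = term.lower()
          while i < len(term):
              for j in range(len(term), i, -1):
                  if term[i:j] in SUBWORDS:
                      if SUBWORDS[term[i:j]]:
                          ids.append(SUBWORDS[term[i:j]])
                      i = j
                      break
              else:
                  i += 1   # skip characters not covered by the lexicon
          return ids

      print(segment("Gastroenteritis"))        # ['#stomach', '#intestine', '#inflammation']
      print(segment("Magendarmentzuendung"))   # same IDs, so a German query matches English documents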

  18. A novel biomedical image indexing and retrieval system via deep preference learning.

    Science.gov (United States)

    Pang, Shuchao; Orgun, Mehmet A; Yu, Zhezhou

    2018-05-01

    Traditional biomedical image retrieval methods, as well as content-based image retrieval (CBIR) methods originally designed for non-biomedical images, either consider only pixel-level and other low-level features to describe an image or use deep features to describe images, but both still leave a lot of room for improving accuracy and efficiency. In this work, we propose a new approach, which exploits deep learning technology to extract high-level and compact features from biomedical images. The deep feature extraction process leverages multiple hidden layers to capture substantial feature structures of high-resolution images and represent them at different levels of abstraction, leading to an improved performance for indexing and retrieval of biomedical images. We exploit the current popular and multi-layered deep neural networks, namely, stacked denoising autoencoders (SDAE) and convolutional neural networks (CNN), to represent the discriminative features of biomedical images by transferring the feature representations and parameters of pre-trained deep neural networks from another domain. Moreover, in order to index all the images for finding similar reference images, we also introduce preference learning technology to train a preference model for the query image, which can output the similarity ranking list of images from a biomedical image database. To the best of our knowledge, this paper introduces preference learning technology for the first time into biomedical image retrieval. We evaluate the performance of two powerful algorithms based on our proposed system and compare them with those of popular biomedical image indexing approaches and existing regular image retrieval methods in detailed experiments over several well-known public biomedical image databases. Based on different criteria for the evaluation of retrieval performance, experimental results demonstrate that our proposed algorithms outperform the state
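
    A sketch of the indexing-and-ranking step only (feature extraction with an SDAE or CNN is assumed to have happened upstream; the feature vectors here are random placeholders): precomputed deep features are L2-normalised and database images are ranked by cosine similarity to the query image's features.

      import numpy as np

      def normalise(features):
          return features / np.linalg.norm(features, axis=-1, keepdims=True)

      def retrieve(query_features, database_features, top_k=10):
          """Rank database images by cosine similarity of their deep features to the query."""
          q = normalise(query_features)
          db = normalise(database_features)
          similarities = db @ q
          order = np.argsort(-similarities)[:top_k]
          return order, similarities[order]

      rng = np.random.default_rng(0)
      database_features = rng.random((5000, 512))   # placeholder for CNN/SDAE features
      query_features = rng.random(512)
      indices, scores = retrieve(query_features, database_features)
      print(indices)

    A learned preference model, as proposed above, would replace the plain cosine ranking with scores trained from pairwise preferences.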

  19. Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Database name: Trypanosomes Database. Maintained at the National Institute of Genetics, Research Organization of Information and Systems, Yata 1111, Mishima, Shizuoka 411-8540, Japan. Organisms covered: Trypanosoma (Taxonomy ID: 5690) and Homo sapiens (Taxonomy ID: 9606). Entries link to external resources including PDB (Protein Data Bank), the KEGG PATHWAY Database, and DrugPort; an entry list and query search are available.

  20. Fast quantum search algorithm for databases of arbitrary size and its implementation in a cavity QED system

    International Nuclear Information System (INIS)

    Li, H.Y.; Wu, C.W.; Liu, W.T.; Chen, P.X.; Li, C.Z.

    2011-01-01

    We propose a method for implementing the Grover search algorithm directly in a database containing any number of items based on multi-level systems. Compared with the searching procedure in the database with qubits encoding, our modified algorithm needs fewer iteration steps to find the marked item and uses the carriers of the information more economically. Furthermore, we illustrate how to realize our idea in cavity QED using Zeeman's level structure of atoms. And the numerical simulation under the influence of the cavity and atom decays shows that the scheme could be achieved efficiently within current state-of-the-art technology. -- Highlights: ► A modified Grover algorithm is proposed for searching in an arbitrary dimensional Hilbert space. ► Our modified algorithm requires fewer iteration steps to find the marked item. ► The proposed method uses the carriers of the information more economically. ► A scheme for a six-item Grover search in cavity QED is proposed. ► Numerical simulation under decays shows that the scheme can be achieved with enough fidelity.

  1. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System

    Directory of Open Access Journals (Sweden)

    Yu Liu

    2015-01-01

    Full Text Available The Smith-Waterman (SW algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.

  2. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System.

    Science.gov (United States)

    Liu, Yu; Hong, Yang; Lin, Chun-Yuan; Hung, Che-Lun

    2015-01-01

    The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS) to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
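
    For reference, a minimal Smith-Waterman local alignment scorer in plain Python (score only, without the GPU parallelisation or the frequency distance filtration described above; scoring parameters are illustrative):

      def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
          """Best local alignment score between sequences a and b (no traceback)."""
          cols = len(b) + 1
          previous = [0] * cols
          best = 0
          for i in range(1, len(a) + 1):
              current = [0] * cols
              for j in range(1, cols):
                  diag = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
                  current[j] = max(0, diag, previous[j] + gap, current[j - 1] + gap)
                  best = max(best, current[j])
              previous = current
          return best

      print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))

    An FDFS-style filter would simply skip this computation for database sequences that the frequency distance filtration has already ruled out.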

  3. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available SKIP Stemcell Database: Database Description. Database name: SKIP Stemcell Database. Contact address: http://www.skip.med.keio.ac.jp/en/contact/. Database classification: Human Genes and Diseases; Stemcell. Organism: Homo sapiens (Taxonomy ID: 9606). Database maintenance site: Center for Medical Genetics, School of Medicine, ... Web services: not available. Need for user registration: not available.

  5. Native Health Research Database

    Science.gov (United States)

    ... Indian Health Board) Welcome to the Native Health Database. Please enter your search terms (basic and advanced search are available). A tutorial video explains how to search the Native Health Database. The NHD has made ...

  6. Database Description - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available tRNADB-CE: Database Description. Database name: tRNADB-CE. License: CC BY-SA. Background and funding: MEXT Integrated Database Project. Reference(s): article "tRNAD..." 2009 Jan;37(Database issue):D163-8. External links: article "tRNADB-CE 2011: tRNA gene database curat..."

  7. Technical editing of research reports in biomedical journals.

    Science.gov (United States)

    Wager, Elizabeth; Middleton, Philippa

    2008-10-08

    Most journals try to improve their articles by technical editing processes such as proof-reading, editing to conform to 'house styles', grammatical conventions and checking accuracy of cited references. Despite the considerable resources devoted to technical editing, we do not know whether it improves the accessibility of biomedical research findings or the utility of articles. This is an update of a Cochrane methodology review first published in 2003. To assess the effects of technical editing on research reports in peer-reviewed biomedical journals, and to assess the level of accuracy of references to these reports. We searched The Cochrane Library Issue 2, 2007; MEDLINE (last searched July 2006); EMBASE (last searched June 2007) and checked relevant articles for further references. We also searched the Internet and contacted researchers and experts in the field. Prospective or retrospective comparative studies of technical editing processes applied to original research articles in biomedical journals, as well as studies of reference accuracy. Two review authors independently assessed each study against the selection criteria and assessed the methodological quality of each study. One review author extracted the data, and the second review author repeated this. We located 32 studies addressing technical editing and 66 surveys of reference accuracy. Only three of the studies were randomised controlled trials. A 'package' of largely unspecified editorial processes applied between acceptance and publication was associated with improved readability in two studies and improved reporting quality in another two studies, while another study showed mixed results after stricter editorial policies were introduced. More intensive editorial processes were associated with fewer errors in abstracts and references. Providing instructions to authors was associated with improved reporting of ethics requirements in one study and fewer errors in references in two studies, but no

  8. Update History of This Database - DMPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available DMPD: Update History of This Database. 2010/03/29: DMPD English archive site ( ...jp/macrophage/ ) is released.

  9. Blockchain distributed ledger technologies for biomedical and health care applications.

    Science.gov (United States)

    Kuo, Tsung-Ting; Kim, Hyeon-Eui; Ohno-Machado, Lucila

    2017-11-01

    To introduce blockchain technologies, including their benefits, pitfalls, and the latest applications, to the biomedical and health care domains. Biomedical and health care informatics researchers who would like to learn about blockchain technologies and their applications in the biomedical/health care domains. The covered topics include: (1) introduction to the famous Bitcoin crypto-currency and the underlying blockchain technology; (2) features of blockchain; (3) review of alternative blockchain technologies; (4) emerging nonfinancial distributed ledger technologies and applications; (5) benefits of blockchain for biomedical/health care applications when compared to traditional distributed databases; (6) overview of the latest biomedical/health care applications of blockchain technologies; and (7) discussion of the potential challenges and proposed solutions of adopting blockchain technologies in biomedical/health care domains. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  10. Quantum Query Complexity for Searching Multiple Marked States from an Unsorted Database

    International Nuclear Information System (INIS)

    Shang Bin

    2007-01-01

    An important and common class of search problems is to find all marked states in an unsorted database with a large number of states. Grover's original quantum search algorithm finds a single marked state with some uncertainty; it has been generalized to the case of multiple marked states, and has also been modified to find a single marked state with certainty. However, the query complexity for finding all of the marked states has not been addressed. We use a generalized Long's algorithm with high precision to solve this problem. We calculate the approximate query complexity, which increases with the number of marked states and with the precision that we demand. Finally, we introduce an algorithm for the problem on a 'duality computer' and show its advantage over other algorithms.
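
    For orientation, the familiar Grover-type bound (not the refined complexity derived in the record above) states that finding one of M marked items in an unsorted database of N items takes on the order of

        \[ k \approx \frac{\pi}{4}\sqrt{\frac{N}{M}} \]

    oracle queries, versus O(N/M) queries classically. Finding all M marked states requires repeating such amplitude-amplification rounds, which is consistent with the record's observation that the query complexity grows with the number of marked states.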

  11. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    Energy Technology Data Exchange (ETDEWEB)

    Ginzinger, Simon W. [Center of Applied Molecular Engineering, University of Salzburg, Department of Molecular Biology, Division of Bioinformatics (Austria)], E-mail: simon@came.sbg.ac.at; Coles, Murray [Max-Planck-Institute for Developmental Biology, Department of Protein Evolution (Germany)], E-mail: Murray.Coles@tuebingen.mpg.de

    2009-03-15

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.

  12. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    International Nuclear Information System (INIS)

    Ginzinger, Simon W.; Coles, Murray

    2009-01-01

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods

  13. Update History of This Database - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available RED: Update History of This Database. 2015/12/21: Rice Expression Database English archive site is opened. 2000/10/1: Rice Expression Database ( http://red.dna.affrc.go.jp/RED/ ) is opened.

  14. Making a search engine for Indocean - A database of abstracts: An experience

    Digital Repository Service at National Institute of Oceanography (India)

    Tapaswi, M.P.; Haravu, L.J.

    Information Management: Trends and Issues (Festschrift in honour of Prof. S. Seetharama). 52. Making a Search Engine for Indocean - A Database of Abstracts: An Experience. Murari P. Tapaswi* and L. J. Haravu** (*Documentation Officer, National Information...)

  15. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Directory of Open Access Journals (Sweden)

    Giovanni Delussu

    Full Text Available This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  16. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

    Science.gov (United States)

    Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. PMID:27936191

  17. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Science.gov (United States)

    Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.
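
    The three records above describe PyEHR's "driver layer with a common driver interface". The Python sketch below illustrates that pattern in the abstract; the class and method names are assumptions for illustration and do not reproduce PyEHR's actual API or its MongoDB/Elasticsearch drivers.

        # Illustrative sketch of a storage-driver abstraction: callers depend only on
        # the common interface, so backends can be swapped (e.g. MongoDB, Elasticsearch).
        from abc import ABC, abstractmethod

        class RecordDriver(ABC):
            """Common interface every concrete storage driver must implement."""

            @abstractmethod
            def save(self, record_id: str, document: dict) -> None: ...

            @abstractmethod
            def find(self, query: dict) -> list: ...

        class InMemoryDriver(RecordDriver):
            """Stand-in backend used here so the example runs without a database server."""

            def __init__(self):
                self._store = {}

            def save(self, record_id, document):
                self._store[record_id] = document

            def find(self, query):
                return [doc for doc in self._store.values()
                        if all(doc.get(k) == v for k, v in query.items())]

        if __name__ == "__main__":
            driver: RecordDriver = InMemoryDriver()
            driver.save("ehr-1", {"archetype": "blood_pressure", "systolic": 120})
            print(driver.find({"archetype": "blood_pressure"}))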

  18. Biomedical data integration in computational drug design and bioinformatics.

    Science.gov (United States)

    Seoane, Jose A; Aguiar-Pulido, Vanessa; Munteanu, Cristian R; Rivero, Daniel; Rabunal, Juan R; Dorado, Julian; Pazos, Alejandro

    2013-03-01

    In recent years, in the post-genomic era, more and more data are being generated by biological high-throughput technologies, such as proteomics and transcriptomics. This omics data can be very useful, but the real challenge is to analyze all this data, as a whole, after integrating it. Biomedical data integration enables making queries to different, heterogeneous and distributed biomedical data sources. Data integration solutions can be very useful not only in the context of drug design, but also in biomedical information retrieval, clinical diagnosis, systems biology, etc. In this review, we analyze the most common approaches to biomedical data integration, such as federated databases, data warehousing, multi-agent systems and semantic technology, as well as the solutions developed using these approaches in the past few years.

  19. Update History of This Database - KEGG MEDICUS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available KEGG MEDICUS: Update History of This Database. 2014/05/09: KEGG MEDICUS English archive site is opened. 2010/10/01: KEGG MEDICUS ( http://www.kegg.jp/kegg/medicus/ ) is opened.

  20. Update History of This Database - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available RPD: Update History of This Database. 2016/02/02: Rice Proteome Database English archive site is opened. 2003/01/07: Rice Proteome Database ( http://gene64.dna.affrc.go.jp/RPD/ ) is opened.

  1. Information retrieval from the INIS database. Is the new online search system poorer than the old one?

    International Nuclear Information System (INIS)

    Adamek, Petr

    2011-01-01

    A brief overview of the search options for the INIS database is presented, categorized into offline and online systems, and their assets and drawbacks are described. In the Online section, the old system on the BASIS platform and the new system on the Google Search Appliance platform are compared. The capabilities of the new system seem to be more limited than those of the old system. (author)

  2. Fine-grained Database Field Search Using Attribute-Based Encryption for E-Healthcare Clouds.

    Science.gov (United States)

    Guo, Cheng; Zhuang, Ruhan; Jie, Yingmo; Ren, Yizhi; Wu, Ting; Choo, Kim-Kwang Raymond

    2016-11-01

    An effectively designed e-healthcare system can significantly enhance the quality of access and experience of healthcare users, including facilitating medical and healthcare providers in ensuring a smooth delivery of services. Ensuring the security of patients' electronic health records (EHRs) in the e-healthcare system is an active research area. EHRs may be outsourced to a third party, such as a community healthcare cloud service provider, for storage due to cost-saving measures. Generally, encrypting the EHRs when they are stored in the system (i.e. data-at-rest) or prior to outsourcing the data is used to ensure data confidentiality. Searchable encryption (SE) is a promising technique that can ensure the protection of private information without compromising on performance. In this paper, we propose a novel framework for controlling access to EHRs stored in semi-trusted cloud servers (e.g. a private cloud or a community cloud). To achieve fine-grained access control for EHRs, we leverage the ciphertext-policy attribute-based encryption (CP-ABE) technique to encrypt tables published by hospitals, including patients' EHRs, and the table is stored in the database with the primary key being the patient's unique identity. Our framework can enable different users with different privileges to search on different database fields. Differing from previous attempts to secure outsourcing of data, we emphasize the control of the searches of the fields within the database. We demonstrate the utility of the scheme by evaluating the scheme using datasets from the University of California, Irvine.
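
    The record above controls which database fields each class of user may search. The toy sketch below only illustrates the idea of gating field-level searches by user attributes; it performs no cryptography at all and is not CP-ABE, whose guarantees come from pairing-based encryption rather than a runtime policy check. The field names, roles and policies are hypothetical.

        # Toy attribute-based gate over searchable EHR fields (NOT CP-ABE encryption).
        POLICIES = {
            "diagnosis":    {"role": {"physician"}},
            "billing":      {"role": {"physician", "billing_clerk"}},
            "demographics": {"role": {"physician", "nurse", "billing_clerk"}},
        }

        def can_search_field(user_attributes: dict, field: str) -> bool:
            """Allow the search only if every attribute required by the field's policy matches."""
            policy = POLICIES.get(field, {})
            return all(user_attributes.get(attr) in allowed
                       for attr, allowed in policy.items())

        if __name__ == "__main__":
            nurse = {"role": "nurse"}
            print(can_search_field(nurse, "demographics"))  # True
            print(can_search_field(nurse, "diagnosis"))     # False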

  3. African Journal of Biomedical Research: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  4. Egyptian Journal of Biomedical Sciences: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  5. Update History of This Database - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available SAHG: Update History of This Database. 2016/05/09: SAHG English archive site is opened. 2009/10: SAHG ( http://bird.cbrc.jp/sahg ) is opened.

  6. Update History of This Database - RMOS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available RMOS: Update History of This Database. 2015/10/27: RMOS English archive site is opened. .../12: RMOS ( http://cdna01.dna.affrc.go.jp/RMOS/ ) is opened.

  7. Toward a public analysis database for LHC new physics searches using MadAnalysis 5

    Science.gov (United States)

    Dumont, B.; Fuks, B.; Kraml, S.; Bein, S.; Chalons, G.; Conte, E.; Kulkarni, S.; Sengupta, D.; Wymant, C.

    2015-02-01

    We present the implementation, in the MadAnalysis 5 framework, of several ATLAS and CMS searches for supersymmetry in data recorded during the first run of the LHC. We provide extensive details on the validation of our implementations and propose to create a public analysis database within this framework.

  8. GRIP Database original data - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available GRIPDB: GRIP Database original data. Data name: GRIP Database original data. DOI: 10.18908/lsdba.nbdc01665-006. Description of data contents: GRIP Database original data; it consists of data tables and sequences. Data file name: gripdb_original_data.zip. File URL: ftp://ftp.biosciencedbc.jp/archive/gripdb/LATEST/gri...

  9. Update History of This Database - PLACE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available PLACE: Update History of This Database. 2016/08/22: The contact address is changed. 2014/10/20: The URLs of the database maintenance site and the portal site are changed. 2014/07/17: PLACE English archive site is opened.

  10. Update History of This Database - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available SSBD: Update History of This Database. 2016/07/25: SSBD English archive site is opened. 2013/09/03: SSBD ( http://ssbd.qbic.riken.jp/ ) is opened.

  11. Protein backbone angle restraints from searching a database for chemical shift and sequence homology

    Energy Technology Data Exchange (ETDEWEB)

    Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

    1999-03-15

    Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13C{alpha}, 13C{beta}, 13C', 1H{alpha} and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar {phi} and {psi} backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15 deg. Approximately 3% of the predictions made by TALOS are found to be in error.
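
    The record above matches query triplets against a database by combining secondary-chemical-shift differences with sequence similarity. The sketch below shows that scoring idea in miniature; the equal weights and the crude residue-identity penalty are assumptions for illustration, not the empirically optimized TALOS weighting.

        # Sketch of triplet matching by weighted shift differences plus a sequence term.
        WEIGHTS = {"CA": 1.0, "CB": 1.0, "C": 1.0, "HA": 1.0, "N": 1.0}  # illustrative weights

        def triplet_distance(query, candidate, seq_penalty=1.0):
            """Lower is better. Each triplet is a list of 3 residues; each residue is a dict
            with an 'aa' one-letter code and secondary shifts keyed as in WEIGHTS."""
            d = 0.0
            for q_res, c_res in zip(query, candidate):
                for atom, w in WEIGHTS.items():
                    d += w * abs(q_res[atom] - c_res[atom])
                if q_res["aa"] != c_res["aa"]:   # crude stand-in for sequence similarity
                    d += seq_penalty
            return d

        def best_matches(query, database, n=10):
            """Return the n database triplets closest to the query; their central phi/psi
            angles would then be averaged if they agree."""
            return sorted(database, key=lambda cand: triplet_distance(query, cand))[:n]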

  12. Figure text extraction in biomedical literature.

    Directory of Open Access Journals (Sweden)

    Daehyun Kim

    2011-01-01

    Full Text Available Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for
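
    As a quick check on the localization figures quoted above, the F1-score combines precision P and recall R as

        \[ F_1 = \frac{2PR}{P+R} = \frac{2 \times 0.84 \times 0.98}{0.84 + 0.98} \approx 0.90, \]

    which matches the 90% F1-score reported for text localization.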

  13. Database Description - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ASTRA: Database Description. Database name: ASTRA. Database classification: Nucleotide Sequence Databases - Gene structure. Organism: Taxonomy ID: 3702; Taxonomy Name: Oryza sativa, Taxonomy ID: 4530. Database description: the database represents classified p... External links: original website information. Database maintenance site: National Institute of Ad... Need for user registration: not available.

  14. Database Description - TMFunction | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ...sidue (or mutant) in a protein. The experimental data are collected from the literature both by searching th... the sequence database (UniProt), the structural database (PDB), and the literature database...

  15. Update History of This Database - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available fRNAdb: Update History of This Database. 2016/03/29: fRNAdb English archive site is opened. 2006/12: fRNAdb ( http://www.ncrna.org/ ) is opened.

  16. Application of an automated natural language processing (NLP) workflow to enable federated search of external biomedical content in drug discovery and development.

    Science.gov (United States)

    McEntire, Robin; Szalkowski, Debbie; Butler, James; Kuo, Michelle S; Chang, Meiping; Chang, Man; Freeman, Darren; McQuay, Sarah; Patel, Jagruti; McGlashen, Michael; Cornell, Wendy D; Xu, Jinghai James

    2016-05-01

    External content sources such as MEDLINE(®), National Institutes of Health (NIH) grants and conference websites provide access to the latest breaking biomedical information, which can inform pharmaceutical and biotechnology company pipeline decisions. The value of the sites for industry, however, is limited by the use of the public internet, the limited synonyms, the rarity of batch searching capability and the disconnected nature of the sites. Fortunately, many sites now offer their content for download and we have developed an automated internal workflow that uses text mining and tailored ontologies for programmatic search and knowledge extraction. We believe such an efficient and secure approach provides a competitive advantage to companies needing access to the latest information for a range of use cases and complements manually curated commercial sources. Copyright © 2016. Published by Elsevier Ltd.

  17. Inaccurate Citations in Biomedical Journalism: Effect on the Impact Factor of the American Journal of Roentgenology.

    Science.gov (United States)

    Karabulut, Nevzat

    2017-03-01

    The aim of this study is to investigate the frequency of incorrect citations and its effects on the impact factor of a specific biomedical journal: the American Journal of Roentgenology. The Cited Reference Search function of Thomson Reuters' Web of Science database (formerly the Institute for Scientific Information's Web of Knowledge database) was used to identify erroneous citations. This was done by entering the journal name into the Cited Work field and entering "2011-2012" into the Cited Year(s) field. The errors in any part of the inaccurately cited references (e.g., author names, title, year, volume, issue, and page numbers) were recorded, and the types of errors (i.e., absent, deficient, or mistyped) were analyzed. Erroneous citations were corrected using the Suggest a Correction function of the Web of Science database. The effect of inaccurate citations on the impact factor of the AJR was calculated. Overall, 183 of 1055 citable articles published in 2011-2012 were inaccurately cited 423 times (mean [± SD], 2.31 ± 4.67 times; range, 1-44 times). Of these 183 articles, 110 (60.1%) were web-only articles and 44 (24.0%) were print articles. The most commonly identified errors were page number errors (44.8%) and misspelling of an author's name (20.2%). Incorrect citations adversely affected the impact factor of the AJR by 0.065 in 2012 and by 0.123 in 2013. Inaccurate citations are not infrequent in biomedical journals, yet they can be detected and corrected using the Web of Science database. Although the accuracy of references is primarily the responsibility of authors, the journal editorial office should also define a periodic inaccurate citation check task and correct erroneous citations to reclaim unnecessarily lost credit.
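
    For context, the two-year journal impact factor behind these figures is, in its standard form,

        \[ \mathrm{IF}_{2013} = \frac{\text{citations received in 2013 by items published in 2011--2012}}{\text{citable items published in 2011--2012}}, \]

    so a citation whose reference is too garbled to be matched back to the journal drops out of the numerator; this is the mechanism behind the reported impact-factor losses of 0.065 in 2012 and 0.123 in 2013.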

  18. Statistical Measures Alone Cannot Determine Which Database (BNI, CINAHL, MEDLINE, or EMBASE) Is the Most Useful for Searching Undergraduate Nursing Topics. A Review of: Stokes, P., Foster, A., & Urquhart, C. (2009). Beyond relevance and recall: Testing new user-centred measures of database performance. Health Information and Libraries Journal, 26(3), 220-231.

    Directory of Open Access Journals (Sweden)

    Giovanna Badia

    2011-03-01

    Full Text Available Objective – The research project sought to determine which of four databases was the most useful for searching undergraduate nursing topics. Design – Comparative database evaluation. Setting – Nursing and midwifery students at Homerton School of Health Studies (now part of Anglia Ruskin University), Cambridge, United Kingdom, in 2005-2006. Subjects – The subjects were four databases: British Nursing Index (BNI), CINAHL, MEDLINE, and EMBASE. Methods – This was a comparative study using title searches to compare BNI (British Nursing Index), CINAHL, MEDLINE and EMBASE. According to the authors, this is the first study to compare BNI with other databases. BNI is a database produced by British libraries that indexes the nursing and midwifery literature. It covers over 240 British journals, and includes references to articles from health sciences journals that are relevant to nurses and midwives (British Nursing Index, n.d.). The researchers performed keyword searches in the title field of the four databases for the dissertation topics of nine nursing and midwifery students enrolled in undergraduate dissertation modules. The list of titles of journal articles on their topics was given to the students and they were asked to judge the relevancy of the citations. The title searches were evaluated in each of the databases using the following criteria: • precision (the number of relevant results obtained in the database for a search topic, divided by the total number of results obtained in the database search); • recall (the number of relevant results obtained in the database for a search topic, divided by the total number of relevant results obtained on that topic from all four database searches); • novelty (the number of relevant results that were unique in the database search, which was calculated as a percentage of the total number of relevant results found in the database); • originality (the number of unique relevant results obtained in the
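
    Restated compactly, the first three evaluation criteria defined in the record above are, for a database D and a given search topic,

        \[ \text{precision}(D) = \frac{|\text{relevant results retrieved in } D|}{|\text{all results retrieved in } D|}, \qquad \text{recall}(D) = \frac{|\text{relevant results retrieved in } D|}{|\text{relevant results across all four databases}|}, \]
        \[ \text{novelty}(D) = \frac{|\text{relevant results unique to } D|}{|\text{relevant results retrieved in } D|} \times 100\%. \]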

  19. Exploring and linking biomedical resources through multidimensional semantic spaces.

    Science.gov (United States)

    Berlanga, Rafael; Jiménez-Ruiz, Ernesto; Nebot, Victoria

    2012-01-25

    The semantic integration of biomedical resources is still a challenging issue which is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions regard the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former one arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter one reduces the annotation set associated to each collection item into a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. Current knowledge resources and semantic-aware technology make possible the integration of biomedical resources. Such an integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for

  20. Update History of This Database - TogoTV | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available TogoTV: Update History of This Database. 2017/05/12: TogoTV English archive site is opened. 2007/07/20: TogoTV ( http://togotv.dbcls.jp/ ) is opened.

  1. Update History of This Database - ConfC | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ConfC: Update History of This Database. 2016/09/20: ConfC English archive site is opened. 2005/05/01: ConfC ( http://mbs.cbrc.jp/ConfC/ ) is opened.

  2. Matrix-product-state simulation of an extended Brueschweiler bulk-ensemble database search

    International Nuclear Information System (INIS)

    SaiToh, Akira; Kitagawa, Masahiro

    2006-01-01

    Brueschweiler's database search in a spin Liouville space can be efficiently simulated on a conventional computer without error as long as the simulation cost of the internal circuit of an oracle function is polynomial, unlike the fact that in true NMR experiments, it suffers from an exponential decrease in the variation of a signal intensity. With the simulation method using the matrix-product-state proposed by Vidal [G. Vidal, Phys. Rev. Lett. 91, 147902 (2003)], we perform such a simulation. We also show the extensions of the algorithm without utilizing the J-coupling or DD-coupling splitting of frequency peaks in observation: searching can be completed with a single query in polynomial postoracle circuit complexities in an extension; multiple solutions of an oracle can be found in another extension whose query complexity is linear in the key length and in the number of solutions (this extension is to find all of marked keys). These extended algorithms are also simulated with the same simulation method

  3. Finding and accessing diagrams in biomedical publications.

    Science.gov (United States)

    Kuhn, Tobias; Luong, ThaiBinh; Krauthammer, Michael

    2012-01-01

    Complex relationships in biomedical publications are often communicated by diagrams such as bar and line charts, which are a very effective way of summarizing and communicating multi-faceted data sets. Given the ever-increasing amount of published data, we argue that the precise retrieval of such diagrams is of great value for answering specific and otherwise hard-to-meet information needs. To this end, we demonstrate the use of advanced image processing and classification for identifying bar and line charts by the shape and relative location of the different image elements that make up the charts. With recall and precisions of close to 90% for the detection of relevant figures, we discuss the use of this technology in an existing biomedical image search engine, and outline how it enables new forms of literature queries over biomedical relationships that are represented in these charts.

  4. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    Science.gov (United States)

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  5. Download - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Trypanosomes Database: Download. First of all, please read the license of this database. Data ... (1.4 KB). Simple search and download. Download via FTP: the FTP server is sometimes jammed; if it is, access [here].

  6. Update History of This Database - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available RMG: Update History of This Database. 2016/08/22: The contact address is changed. ...: The URL of the portal site is changed. 2013/08/07: RMG archive site is opened. 2002/09/25: RMG ( http://rmg.rice.dna.affrc.go.jp/ ) is opened.

  7. Update History of This Database - DGBY | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available DGBY: Update History of This Database. 2014/10/20: The URL of the portal site is changed. ...: Expression of attribution in License is updated. 2012/03/08: DGBY English archive site is opened. 2006/10/02: DGBY ( http://...aro.affrc.go.jp/yakudachi/yeast/index.html ) is opened.

  8. Improve Biomedical Information Retrieval using Modified Learning to Rank Methods.

    Science.gov (United States)

    Xu, Bo; Lin, Hongfei; Lin, Yuan; Ma, Yunlong; Yang, Liang; Wang, Jian; Yang, Zhihao

    2016-06-14

    In recent years, the number of biomedical articles has increased exponentially, making it impractical for biologists to capture all the needed information manually. Information retrieval technologies, as the core of search engines, can deal with the problem automatically, providing users with the needed information. However, it is a great challenge to apply these technologies directly to biomedical retrieval, because of the abundance of domain-specific terminologies. To enhance biomedical retrieval, we propose a novel framework based on learning to rank. Learning to rank is a family of state-of-the-art information retrieval techniques that has proved effective in many information retrieval tasks. In the proposed framework, we attempt to tackle the problem of the abundance of terminologies by constructing ranking models, which focus not only on retrieving the most relevant documents, but also on diversifying the search results to increase the completeness of the resulting list for a given query. In the model training, we propose two novel document labeling strategies, and combine several traditional retrieval models as learning features. Besides, we also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on TREC Genomics datasets demonstrate the effectiveness of our framework for biomedical information retrieval.
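
    The record above builds ranking models from traditional retrieval scores used as learning features. The sketch below shows one common pairwise formulation (train a linear classifier on score differences between documents of unequal relevance); the feature values and labels are invented for illustration, and this is not the authors' framework or labeling strategy.

        # Minimal pairwise learning-to-rank sketch: features are retrieval-model scores.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        X = np.array([[2.1, 0.8], [1.4, 0.9], [0.3, 0.2], [0.1, 0.1]])  # e.g. BM25, LM scores
        relevance = np.array([2, 1, 0, 0])                              # graded labels

        pairs, labels = [], []
        for i in range(len(X)):
            for j in range(len(X)):
                if relevance[i] != relevance[j]:
                    pairs.append(X[i] - X[j])                  # difference vector for the pair
                    labels.append(1 if relevance[i] > relevance[j] else 0)

        ranker = LogisticRegression().fit(np.array(pairs), np.array(labels))

        # rank unseen documents by the learned linear score w.x (higher is better)
        unseen = np.array([[1.0, 0.5], [0.2, 0.9]])
        scores = unseen @ ranker.coef_.ravel()
        print(scores.argsort()[::-1])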

  9. Update History of This Database - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available KAIKOcDNA: Update History of This Database. 2014/10/20: The URL of the database maintenance site is changed. 2014/10/08: KAIKOcDNA English archive site is opened. 2004/04/12: KAIKOcDNA database ( http://sgp.dna.affrc.go.jp/EST/ ) is opened.

  10. Beyond PubMed: Searching the "Grey Literature" for Clinical Trial Results.

    Science.gov (United States)

    Citrome, Leslie

    2014-07-01

    Clinical trial results have been traditionally communicated through the publication of scholarly reports and reviews in biomedical journals. However, this dissemination of information can be delayed or incomplete, making it difficult to appraise new treatments, or in the case of missing data, evaluate older interventions. Going beyond the routine search of PubMed, it is possible to discover additional information in the "grey literature." Examples of the grey literature include clinical trial registries, patent databases, company and industrywide repositories, regulatory agency digital archives, abstracts of paper and poster presentations on meeting/congress websites, industry investor reports and press releases, and institutional and personal websites.

  11. Update History of This Database - AcEST | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available AcEST: Update History of This Database. 2013/01/10: Errors found in AcEST Contig data have been corrected; for details, please refer to the Data correction page. 2010/03/29: AcEST English archive site is opened.

  12. Sierra Leone Journal of Biomedical Research: Submissions

    African Journals Online (AJOL)

    Sierra Leone Journal of Biomedical Research (SLJBR) publishes papers in all ... An original article should give sufficient detail of experimental procedures for ... For references cited in a paper which has been accepted for publication but not ...

  13. Optimal search filters for renal information in EMBASE.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Shariff, Salimah Z; Weir, Matthew; Wilczynski, Nancy L; McKibbon, Ann; Rehman, Faisal; Garg, Amit X

    2010-07-01

    EMBASE is a popular database used to retrieve biomedical information. Our objective was to develop and test search filters to help clinicians and researchers efficiently retrieve articles with renal information in EMBASE. We used a diagnostic test assessment framework because filters operate similarly to screening tests. We divided a sample of 5,302 articles from 39 journals into development and validation sets of articles. Information retrieval properties were assessed by treating each search filter as a "diagnostic test" or screening procedure for the detection of relevant articles. We tested the performance of 1,936,799 search filters made of unique renal terms and their combinations. REFERENCE STANDARD & OUTCOME: The reference standard was manual review of each article. We calculated the sensitivity and specificity of each filter to identify articles with renal information. The best renal filters consisted of multiple search terms, such as "renal replacement therapy," "renal," "kidney disease," and "proteinuria," and the truncated terms "kidney," "dialy," "neph," "glomerul," and "hemodial." These filters achieved peak sensitivities of 98.7% (95% CI, 97.9-99.6) and specificities of 98.5% (95% CI, 98.0-99.0). The retrieval performance of these filters remained excellent in the validation set of independent articles. The retrieval performance of any search will vary depending on the quality of all search concepts used, not just renal terms. We empirically developed and validated high-performance renal search filters for EMBASE. These filters can be programmed into the search engine or used on their own to improve the efficiency of searching.
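
    In the diagnostic-test framing used above, with the search filter treated as a screening test for relevant articles, the reported figures correspond to

        \[ \text{sensitivity} = \frac{\text{relevant articles retrieved by the filter}}{\text{all relevant articles}}, \qquad \text{specificity} = \frac{\text{non-relevant articles correctly excluded by the filter}}{\text{all non-relevant articles}}. \]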

  14. User needs analysis and usability assessment of DataMed - a biomedical data discovery index.

    Science.gov (United States)

    Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R

    2017-11-30

    To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. The Relationship between Searches Performed in Online Databases and the Number of Full-Text Articles Accessed: Measuring the Interaction between Database and E-Journal Collections

    Science.gov (United States)

    Lamothe, Alain R.

    2011-01-01

    The purpose of this paper is to report the results of a quantitative analysis exploring the interaction and relationship between the online database and electronic journal collections at the J. N. Desmarais Library of Laurentian University. A very strong relationship exists between the number of searches and the size of the online database…

  16. The VirusBanker database uses a Java program to allow flexible searching through Bunyaviridae sequences.

    Science.gov (United States)

    Fourment, Mathieu; Gibbs, Mark J

    2008-02-05

    Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.
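
    The record above lets users winnow Bunyaviridae sequences by sorting and filtering on columns such as segment, gene, species, country and date of isolation, backed by a MySQL database. The sketch below imitates that kind of column-based filtering; the table schema, accession numbers and rows are hypothetical, and sqlite3 is used only so the example is self-contained.

        # Hypothetical-schema sketch of column-based filtering over sequence records.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("""CREATE TABLE sequences (
            accession TEXT, segment TEXT, gene TEXT, species TEXT,
            country TEXT, isolation_year INTEGER, length INTEGER)""")
        conn.executemany(
            "INSERT INTO sequences VALUES (?, ?, ?, ?, ?, ?, ?)",
            [("X00001", "S", "N",   "Hantaan virus", "South Korea", 1998, 1700),
             ("X00002", "M", "GPC", "Hantaan virus", "China",       2001, 3616),
             ("X00003", "S", "N",   "Puumala virus", "Finland",     1995, 1830)])

        # filter by segment and sort by isolation year, as a user might via the columns
        rows = conn.execute(
            "SELECT accession, species, country, isolation_year FROM sequences "
            "WHERE segment = ? ORDER BY isolation_year", ("S",)).fetchall()
        for row in rows:
            print(row)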

  17. Integration of first-principles methods and crystallographic database searches for new ferroelectrics: Strategies and explorations

    International Nuclear Information System (INIS)

    Bennett, Joseph W.; Rabe, Karin M.

    2012-01-01

    In this concept paper, the development of strategies for the integration of first-principles methods with crystallographic database mining for the discovery and design of novel ferroelectric materials is discussed, drawing on the results and experience derived from exploratory investigations on three different systems: (1) the double perovskite Sr(Sb 1/2 Mn 1/2 )O 3 as a candidate semiconducting ferroelectric; (2) polar derivatives of schafarzikite MSb 2 O 4 ; and (3) ferroelectric semiconductors with formula M 2 P 2 (S,Se) 6 . A variety of avenues for further research and investigation are suggested, including automated structure type classification, low-symmetry improper ferroelectrics, and high-throughput first-principles searches for additional representatives of structural families with desirable functional properties. - Graphical abstract: Integration of first-principles methods with crystallographic database mining, for the discovery and design of novel ferroelectric materials, could potentially lead to new classes of multifunctional materials. Highlights: ► Integration of first-principles methods and database mining. ► Minor structural families with desirable functional properties. ► Survey of polar entries in the Inorganic Crystal Structural Database.

  18. Social Work Literature Searching: Current Issues with Databases and Online Search Engines

    Science.gov (United States)

    McGinn, Tony; Taylor, Brian; McColgan, Mary; McQuilkan, Janice

    2016-01-01

    Objectives: To compare the performance of a range of search facilities; and to illustrate the execution of a comprehensive literature search for qualitative evidence in social work. Context: Developments in literature search methods and comparisons of search facilities help facilitate access to the best available evidence for social workers.…

  19. Accessing and using chemical databases

    DEFF Research Database (Denmark)

    Nikolov, Nikolai Georgiev; Pavlov, Todor; Niemelä, Jay Russell

    2013-01-01

    Computer-based representation of chemicals makes it possible to organize data in chemical databases: collections of chemical structures and associated properties. Databases are widely used wherever efficient processing of chemical information is needed, including search, storage, retrieval, and dissemination. Structure and functionality of chemical databases are considered. The typical kinds of information found in a chemical database are considered: identification, structural, and associated data. Functionality of chemical databases is presented, with examples of search and access types. More details are included about the OASIS database and platform and the Danish (Q)SAR Database online. Various types of chemical database resources are discussed, together with a list of examples.

  20. Update History of This Database - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update history of the TP Atlas database: 2013/12/16 The email address in the contact information is corrected. 2013/11/19 TP Atlas English archive site is opened. 2008/4/1 TP Atlas ( http://www.tanpaku.org/tpatlas/ ) is opened.

  1. SATORI: a system for ontology-guided visual exploration of biomedical data repositories.

    Science.gov (United States)

    Lekschas, Fritz; Gehlenborg, Nils

    2018-04-01

    The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. We developed SATORI-an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. nils@hms.harvard.edu. Supplementary data are available at Bioinformatics online.

  2. Update History of This Database - GenLibi | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update history of the GenLibi database: 2014/03/25 GenLibi English archive site is opened. 2007/03/01 GenLibi ( http://gene.biosciencedbc.jp/ ) is opened.

  3. Update History of This Database - dbQSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update history of the dbQSNP database: 2017/02/16 dbQSNP English archive site is opened. 2002/10/23 dbQSNP ( http://qsnp.gen.kyushu-u.ac.jp/ ) is opened.

  4. Database with web interface and search engine as a diagnostics tool for electromagnetic calorimeter

    CERN Document Server

    Paluoja, Priit

    2017-01-01

    During the 2016 data collection, the Compact Muon Solenoid Data Acquisition (CMS DAQ) system showed very good reliability. Nevertheless, the high complexity of the hardware and software involved is, by its nature, prone to occasional problems. As a CMS subdetector, the electromagnetic calorimeter (ECAL) is affected in the same way. Some of the issues are not predictable and can appear more than once during the year, such as components becoming noisy, power shortcuts or failing communication between machines. The detection-diagnosis-intervention chain must be as fast as possible to minimise detector downtime. The aim of this project was to create diagnostic software for the ECAL crew, consisting of a database and a web interface that allows users to search, add and edit the contents of the database.

  5. The Magnetics Information Consortium (MagIC) Online Database: Uploading, Searching and Visualizing Paleomagnetic and Rock Magnetic Data

    Science.gov (United States)

    Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Pisarevsky, S. A.; Jackson, M.; Solheid, P.; Banerjee, S.; Johnson, C.

    2006-12-01

    The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all measurements and the derived properties for studies of paleomagnetic directions (inclination, declination) and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and has two search nodes, one for paleomagnetism and one for rock magnetism. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. The query result set is displayed in a digestible tabular format allowing the user to descend through hierarchical levels such as from locations to sites, samples, specimens, and measurements. At each stage, the result set can be saved and, if supported by the data, can be visualized by plotting global location maps, equal area plots, or typical Zijderveld, hysteresis, and various magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (Version 2.1) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload and takes only a few minutes to process several thousand data records. The standardized MagIC template files are stored in the digital archives of EarthRef.org where they

  6. Online Patent Searching: The Realities.

    Science.gov (United States)

    Kaback, Stuart M.

    1983-01-01

    Considers patent subject searching capabilities of major online databases, noting patent claims, "deep-indexed" files, test searches, retrieval of related references, multi-database searching, improvements needed in indexing of chemical structures, full text searching, improvements needed in handling numerical data, and augmenting a…

  7. Citation searches are more sensitive than keyword searches to identify studies using specific measurement instruments.

    Science.gov (United States)

    Linder, Suzanne K; Kamath, Geetanjali R; Pratt, Gregory F; Saraykar, Smita S; Volk, Robert J

    2015-04-01

    To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a health care decision-making instrument commonly used in clinical settings. We searched the literature using two methods: (1) keyword searching using variations of "Control Preferences Scale" and (2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, and Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Keyword searches in bibliographic databases yielded high average precision (90%) but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45-54%), but precision ranged from 35% to 75% with Scopus being the most precise. Cited reference searches were more sensitive than keyword searches, making it a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time, and resources should dictate the combination of which methods and databases are used. Copyright © 2015 Elsevier Inc. All rights reserved.
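
    The precision and sensitivity figures reported above reduce to simple set arithmetic between a retrieved set and a gold-standard set. A small Python sketch of that calculation follows; the article identifiers and result sets are invented placeholders, not the study's data.

        # Sketch of the precision/sensitivity calculation used to compare search
        # strategies. All identifiers below are hypothetical placeholders.

        def precision_and_sensitivity(retrieved, relevant):
            """Precision = relevant retrieved / retrieved; sensitivity (recall) = relevant retrieved / relevant."""
            retrieved, relevant = set(retrieved), set(relevant)
            hits = retrieved & relevant
            precision = len(hits) / len(retrieved) if retrieved else 0.0
            sensitivity = len(hits) / len(relevant) if relevant else 0.0
            return precision, sensitivity

        gold_standard = {"pmid:1", "pmid:2", "pmid:3", "pmid:4", "pmid:5"}   # articles known to use the CPS (illustrative)
        keyword_hits = {"pmid:1", "pmid:9"}                                  # keyword search result (illustrative)
        citation_hits = {"pmid:1", "pmid:2", "pmid:3", "pmid:8", "pmid:9"}   # cited-reference search result (illustrative)

        for name, hits in [("keyword", keyword_hits), ("citation", citation_hits)]:
            p, s = precision_and_sensitivity(hits, gold_standard)
            print(f"{name}: precision={p:.0%} sensitivity={s:.0%}")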

  8. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles

    Science.gov (United States)

    Liu, Rey-Long

    2015-01-01

    Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794
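
    PBC itself integrates bibliographic coupling with the text of the passages surrounding each citation; that passage component is not reproduced here. The sketch below shows only the plain bibliographic-coupling part, normalized with a Jaccard ratio over the two articles' reference sets, using made-up reference identifiers.

        # Sketch of plain bibliographic coupling between two articles: similarity
        # based on shared out-link references, normalized here as a Jaccard ratio.
        # PBC additionally weights coupling by the context passages around each
        # citation, which is not reproduced in this sketch.

        def bibliographic_coupling(refs_a, refs_b):
            """Jaccard similarity of the two articles' reference lists."""
            a, b = set(refs_a), set(refs_b)
            if not a or not b:
                return 0.0
            return len(a & b) / len(a | b)

        # Hypothetical reference lists identified by placeholder IDs.
        article_r = ["ref:101", "ref:102", "ref:103", "ref:104"]
        candidate = ["ref:102", "ref:103", "ref:200"]
        print(f"coupling similarity: {bibliographic_coupling(article_r, candidate):.2f}")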

  9. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles.

    Directory of Open Access Journals (Sweden)

    Rey-Long Liu

    Full Text Available Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.

  10. NCBI2RDF: Enabling Full RDF-Based Access to NCBI Databases

    Directory of Open Access Journals (Sweden)

    Alberto Anguita

    2013-01-01

    Full Text Available RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.

  11. NCBI2RDF: Enabling Full RDF-Based Access to NCBI Databases

    Science.gov (United States)

    Anguita, Alberto; García-Remesal, Miguel; de la Iglesia, Diana; Maojo, Victor

    2013-01-01

    RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments. PMID:23984425

  12. NCBI2RDF: enabling full RDF-based access to NCBI databases.

    Science.gov (United States)

    Anguita, Alberto; García-Remesal, Miguel; de la Iglesia, Diana; Maojo, Victor

    2013-01-01

    RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.
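
    According to the abstract, NCBI2RDF decomposes each SPARQL query and forwards the pieces to NCBI's E-utilities programmatic interface. The sketch below shows the kind of underlying ESearch request being wrapped, using only the Python standard library; the parameter names follow NCBI's public E-utilities documentation, and the query term is only an example.

        # Sketch of the kind of E-utilities request that NCBI2RDF wraps behind its
        # SPARQL endpoint: a plain ESearch call against the PubMed database.
        # Uses only the standard library; requires network access to run.
        import json
        import urllib.parse
        import urllib.request

        EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

        def esearch(db, term, retmax=5):
            params = urllib.parse.urlencode(
                {"db": db, "term": term, "retmode": "json", "retmax": retmax}
            )
            with urllib.request.urlopen(f"{EUTILS}?{params}") as resp:
                payload = json.load(resp)
            return payload["esearchresult"]["idlist"]

        if __name__ == "__main__":
            # The search term is only an illustrative example.
            print(esearch("pubmed", "biomedical database integration"))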

  13. Usability Testing of a Large, Multidisciplinary Library Database: Basic Search and Visual Search

    Directory of Open Access Journals (Sweden)

    Jody Condit Fagan

    2006-09-01

    Full Text Available Visual search interfaces have been shown by researchers to assist users with information search and retrieval. Recently, several major library vendors have added visual search interfaces or functions to their products. For public service librarians, perhaps the most critical area of interest is the extent to which visual search interfaces and text-based search interfaces support research. This study presents the results of eight full-scale usability tests of both the EBSCOhost Basic Search and Visual Search in the context of a large liberal arts university.

  14. Update History of This Database - Q-TARO | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update history of the Q-TARO database: 2014/10/20 The URL of the portal site is changed. 2013/12/17 The URL of the portal site is changed. 2013/12/13 Q-TARO English archive site is opened. 2009/11/15 Q-TARO ( http://qtaro.abr.affrc.go.jp/ ) is opened.

  15. The VirusBanker database uses a Java program to allow flexible searching through Bunyaviridae sequences

    Directory of Open Access Journals (Sweden)

    Gibbs Mark J

    2008-02-01

    Full Text Available Abstract Background Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. Results The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. Conclusion VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.

  16. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures.

    Science.gov (United States)

    Li, Guo-Zhong; Vissers, Johannes P C; Silva, Jeffrey C; Golick, Dan; Gorenstein, Marc V; Geromanos, Scott J

    2009-03-01

    A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC-MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four-protein mixture, the same four-protein mixture spiked into a complex biological background, and a variety of other "system" type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. They also illustrate a more comprehensive analysis of the samples studied; providing approximately 20% more protein identifications, compared to
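
    The abstract notes that decoy database techniques are used to determine false positive identification rates automatically. A minimal sketch of the standard target-decoy estimate follows (decoy hits divided by target hits above a score threshold); it is not the vendor's implementation, and the scores are invented.

        # Sketch of the standard target-decoy estimate of the false positive
        # identification rate such search engines report: FDR at a score threshold
        # is approximated by (# decoy hits) / (# target hits) above that threshold.
        # Scores below are made-up illustrative values.

        def target_decoy_fdr(target_scores, decoy_scores, threshold):
            targets = sum(1 for s in target_scores if s >= threshold)
            decoys = sum(1 for s in decoy_scores if s >= threshold)
            return decoys / targets if targets else 0.0

        target_scores = [42.1, 35.7, 33.0, 28.4, 25.9, 21.2, 18.8]
        decoy_scores = [22.5, 19.3, 14.0, 11.7]
        for t in (20, 25, 30):
            print(f"threshold {t}: estimated FDR = {target_decoy_fdr(target_scores, decoy_scores, t):.2%}")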

  17. Pediatric Residents and Interns in an Italian Hospital Perform Improved Bibliographic Searches. A Review of: Gardois, P., Calabrese, R., Colombi, N., Lingua, C., Longo, F., Villanacci, M., Miniero, R., & Piga, A. (2011). Effectiveness of bibliographic searches performed by paediatric residents and interns assisted by librarian. A randomised controlled trial. Health Information and Libraries Journal, 28(4), 273-284. doi: 10.1111/j.1471-1842.2011.00957.x

    Directory of Open Access Journals (Sweden)

    Mathew Stone

    2013-03-01

    Full Text Available Objective – To establish whether the assistance of an experienced biomedical librarian delivers an improvement in the searching of bibliographic databases as performed by medical residents and interns. Design – Randomized controlled trial. Setting – The pediatrics department of a large Italian teaching hospital. Subjects – 18 pediatric residents and interns. Methods – 23 residents and interns from the pediatrics department of a large Italian teaching hospital were invited to participate in this study, of which 18 agreed. Subjects were then randomized into two groups and asked to spend between 30 and 90 minutes searching bibliographic databases for evidence to answer a real-life clinical question which was randomly allocated to them. Each member of the intervention group was provided with an experienced biomedical librarian to provide assistance throughout the search session. The control group received no assistance. The outcome of the search was then measured using an assessment tool adapted for the purpose of this study from the Fresno test of competence in evidence based medicine. This adapted assessment tool rated the "global success" of the search and included criteria such as appropriate question formulation, number of PICO terms translated into search terms, use of Boolean logic, use of subject headings, use of filters, use of limits, and the percentage of citations retrieved that matched a gold standard set of citations found in a prior search by two librarians (who were not involved in assisting the subjects) together with an expert clinician. Main Results – The intervention group scored a median of 73.6 points out of a possible 100, compared with the control group which scored 50.4. The difference of 23.2 points in favour of the librarian-assisted group was a statistically significant result (p value = 0.013) with a 95% confidence interval of between 4.8 and 33.2. Conclusion – This study presents credible evidence that

  18. JICST Factual Database: JICST Chemical Substance Safety Regulation Database

    Science.gov (United States)

    Abe, Atsushi; Sohma, Tohru

    The JICST Chemical Substance Safety Regulation Database is based on the Database of Safety Laws for Chemical Compounds constructed by the Japan Chemical Industry Ecology-Toxicology & Information Center (JETOC), sponsored by the Science and Technology Agency, in 1987. JICST has modified the JETOC database system, added data and started the online service through JOIS-F (JICST Online Information Service-Factual database) in January 1990. The JICST database comprises eighty-three laws and fourteen hundred compounds. The authors outline the database, data items, files and search commands. An example of an online session is presented.

  19. TU-F-BRD-01: Biomedical Informatics for Medical Physicists

    International Nuclear Information System (INIS)

    Phillips, M; Kalet, I; McNutt, T; Smith, W

    2014-01-01

    Biomedical informatics encompasses a very large domain of knowledge and applications. This broad and loosely defined field can make it difficult to navigate. Physicists often are called upon to provide informatics services and/or to take part in projects involving principles of the field. The purpose of the presentations in this symposium is to help medical physicists gain some knowledge about the breadth of the field and how, in the current clinical and research environment, they can participate and contribute. Three talks have been designed to give an overview from the perspective of physicists and to provide a more in-depth discussion in two areas. One of the primary purposes, and the main subject of the first talk, is to help physicists achieve a perspective about the range of the topics and concepts that fall under the heading of 'informatics'. The approach is to de-mystify topics and jargon and to help physicists find resources in the field should they need them. The other talks explore two areas of biomedical informatics in more depth. The goal is to highlight two domains of intense current interest--databases and models--in enough depth into current approaches so that an adequate background for independent inquiry is achieved. These two areas will serve as good examples of how physicists, using informatics principles, can contribute to oncology practice and research. Learning Objectives: To understand how the principles of biomedical informatics are used by medical physicists. To put the relevant informatics concepts in perspective with regard to biomedicine in general. To use clinical database design as an example of biomedical informatics. To provide a solid background into the problems and issues of the design and use of data and databases in radiation oncology. To use modeling in the service of decision support systems as an example of modeling methods and data use. To provide a background into how uncertainty in our data and knowledge can be

  20. RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures

    Directory of Open Access Journals (Sweden)

    Wasik Szymon

    2010-05-01

    Full Text Available Abstract Background Recent discoveries concerning novel functions of RNA, such as RNA interference, have contributed towards the growing importance of the field. In this respect, a deeper knowledge of complex three-dimensional RNA structures is essential to understand their new biological functions. A number of bioinformatic tools have been proposed to explore two major structural databases (PDB, NDB) in order to analyze various aspects of RNA tertiary structures. One of these tools is RNA FRABASE 1.0, the first web-accessible database with an engine for automatic search of 3D fragments within PDB-derived RNA structures. This search is based upon the user-defined RNA secondary structure pattern. In this paper, we present and discuss RNA FRABASE 2.0. This second version of the system represents a major extension of this tool in terms of providing new data and a wide spectrum of novel functionalities. An intuitively operated web server platform enables very fast user-tailored search of three-dimensional RNA fragments, their multi-parameter conformational analysis and visualization. Description RNA FRABASE 2.0 has stored information on 1565 PDB-deposited RNA structures, including all NMR models. The RNA FRABASE 2.0 search engine algorithms operate on the database of the RNA sequences and the new library of RNA secondary structures, coded in the dot-bracket format extended to hold multi-stranded structures and to cover residues whose coordinates are missing in the PDB files. The library of RNA secondary structures (and their graphics) is made available. A high level of efficiency of the 3D search has been achieved by introducing novel tools to formulate advanced searching patterns and to screen highly populated tertiary structure elements. RNA FRABASE 2.0 also stores data and conformational parameters in order to provide "on the spot" structural filters to explore the three-dimensional RNA structures. An instant visualization of the 3D RNA
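
    RNA FRABASE 2.0 searches are driven by secondary-structure patterns written in an extended dot-bracket notation. As a toy illustration of that notation, the sketch below parses a plain dot-bracket string into base-pair index tuples; it ignores the extensions the database actually supports (multi-stranded structures, residues missing from PDB coordinates).

        # Sketch: turn a plain dot-bracket secondary-structure string into a list
        # of base-pair index tuples. RNA FRABASE 2.0 uses an extended dot-bracket
        # format (multi-stranded structures, missing residues) that this toy
        # parser does not handle.

        def dot_bracket_pairs(structure):
            stack, pairs = [], []
            for i, ch in enumerate(structure):
                if ch == "(":
                    stack.append(i)
                elif ch == ")":
                    if not stack:
                        raise ValueError(f"unbalanced ')' at position {i}")
                    pairs.append((stack.pop(), i))
                elif ch != ".":
                    raise ValueError(f"unexpected character {ch!r} at position {i}")
            if stack:
                raise ValueError("unbalanced '(' remaining")
            return sorted(pairs)

        # A small hairpin: positions pair 0-12, 1-11, 2-10, 3-9.
        print(dot_bracket_pairs("((((.....))))"))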

  1. Uso de bases de datos bibliográficas por investigadores biomédicos latinoamericanos hispanoparlantes: estudio transversal / The use of bibliographic databases by Spanish-speaking Latin American biomedical researchers: a cross-sectional study

    Directory of Open Access Journals (Sweden)

    Edgar Guillermo Ospina

    2005-04-01

    The use of the databases was similar in all the countries studied, with no significant differences in the type of access (formal, informal, or free) or the level of skill. Of the total, 87% acknowledged not having included important references in published articles because the full text was unavailable, and 56% stated that they had cited articles they had not read. In addition, 7.6% of respondents acknowledged having consulted restricted-access databases using borrowed passwords or copied disks. More than two thirds of the authors said that they obtained the full text of articles by photocopying or directly from the authors. CONCLUSIONS: It is necessary to train Latin American researchers in the use of the most frequently used databases, especially MEDLINE, and to improve their access to biomedical bibliographic sources, as essential measures for fostering the development of scientific production in the Region. OBJECTIVE: To describe how Spanish-speaking biomedical professionals in Latin America access and utilize bibliographic databases. METHODS: Based on a MEDLINE search, 2515 articles published between August 2002 and August 2003 were identified that dealt with and/or had authors from 16 countries: Argentina, Bolivia, Chile, Colombia, Costa Rica, Cuba, Ecuador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Uruguay, and Venezuela. The search was limited to references to basic science, clinical science, or social medicine. A survey was sent by e-mail to researchers who lived in 15 of the 16 countries (the exception being Nicaragua). The survey asked about the researcher's area of work (basic science, clinical science, or public health), the level of skill in using databases, the frequency and type of access to the databases most utilized, the impact from not having access to the full text of articles when preparing a manuscript, and how the respondent usually obtained the full-text version of

  2. Constructing Effective Search Strategies for Electronic Searching.

    Science.gov (United States)

    Flanagan, Lynn; Parente, Sharon Campbell

    Electronic databases have grown tremendously in both number and popularity since their development during the 1960s. Access to electronic databases in academic libraries was originally offered primarily through mediated search services by trained librarians; however, the advent of CD-ROM and end-user interfaces for online databases has shifted the…

  3. Heterogeneous Biomedical Database Integration Using a Hybrid Strategy: A p53 Cancer Research Database

    Directory of Open Access Journals (Sweden)

    Vadim Y. Bichutskiy

    2006-01-01

    Full Text Available Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

  4. Enrolment and Retention of African Women in Biomedical Research ...

    African Journals Online (AJOL)

    Relevant biomedical research literature on human research participants, identified through computerized searches of Scirus, PubMed and Medline, was critically evaluated and highlighted. Information was also obtained from research ethics training as well as texts and journals in the medical libraries of the research ethics departments of ...

  5. Self-correction in biomedical publications and the scientific impact

    Science.gov (United States)

    Gasparyan, Armen Yuri; Ayvazyan, Lilit; Akazhanov, Nurbek A.; Kitas, George D.

    2014-01-01

    Aim To analyze mistakes and misconduct in multidisciplinary and specialized biomedical journals. Methods We conducted searches through PubMed to retrieve errata, duplicate, and retracted publications (as of January 30, 2014). To analyze publication activity and citation profiles of countries, multidisciplinary, and specialized biomedical journals, we referred to the latest data from the SCImago Journal & Country Rank database. Total number of indexed articles and values of the h-index of the fifty most productive countries and multidisciplinary journals were recorded and linked to the number of duplicate and retracted publications in PubMed. Results Our analysis found 2597 correction items. A striking increase in the number of corrections appeared in 2013, which is mainly due to 871 (85.3%) corrections from PLOS One. The number of duplicate publications was 1086. Articles frequently published in duplicate were reviews (15.6%), original studies (12.6%), and case reports (7.6%), whereas top three retracted articles were original studies (10.1%), randomized trials (8.8%), and reviews (7%). A strong association existed between the total number of publications across countries and duplicate (rs = 0.86, P < 0.001) and retracted items (rs = 0.812, P < 0.001). A similar trend was found between country-based h-index values and duplicate and retracted publications. Conclusion The study suggests that the intensified self-correction in biomedicine is due to the attention of readers and authors, who spot errors in their hub of evidence-based information. Digitization and open access confound the staggering increase in correction notices and retractions. PMID:24577829

  6. Self-correction in biomedical publications and the scientific impact.

    Science.gov (United States)

    Gasparyan, Armen Yuri; Ayvazyan, Lilit; Akazhanov, Nurbek A; Kitas, George D

    2014-02-01

    To analyze mistakes and misconduct in multidisciplinary and specialized biomedical journals. We conducted searches through PubMed to retrieve errata, duplicate, and retracted publications (as of January 30, 2014). To analyze publication activity and citation profiles of countries, multidisciplinary, and specialized biomedical journals, we referred to the latest data from the SCImago Journal and Country Rank database. Total number of indexed articles and values of the h-index of the fifty most productive countries and multidisciplinary journals were recorded and linked to the number of duplicate and retracted publications in PubMed. Our analysis found 2597 correction items. A striking increase in the number of corrections appeared in 2013, which is mainly due to 871 (85.3%) corrections from PLOS One. The number of duplicate publications was 1086. Articles frequently published in duplicate were reviews (15.6%), original studies (12.6%), and case reports (7.6%), whereas top three retracted articles were original studies (10.1%), randomized trials (8.8%), and reviews (7%). A strong association existed between the total number of publications across countries and duplicate (rs=0.86, P<0.0001) and retracted items (rs=0.812, P<0.0001). A similar trend was found between country-based h-index values and duplicate and retracted publications. The study suggests that the intensified self-correction in biomedicine is due to the attention of readers and authors, who spot errors in their hub of evidence-based information. Digitization and open access confound the staggering increase in correction notices and retractions.
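
    The correlations reported above (rs values between publication counts and duplicate or retracted items) are Spearman rank correlations. The sketch below shows how such an rs and its P value can be computed with SciPy; the numbers are placeholders, not the study's data.

        # Sketch of the Spearman rank correlation (r_s) the study reports between
        # per-country publication counts and numbers of retracted items.
        # The values below are made-up placeholders, not the study's data.
        from scipy.stats import spearmanr

        total_publications = [120_000, 85_000, 60_000, 20_000, 9_000, 4_500]
        retracted_items = [95, 70, 41, 12, 6, 2]

        rho, p_value = spearmanr(total_publications, retracted_items)
        print(f"r_s = {rho:.2f}, P = {p_value:.4f}")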

  7. Database Dump - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available fRNAdb Database Dump. DOI: 10.18908/lsdba.nbdc00452-002. Data: tab-separated text. File name: Database_Dump. File URL: ftp://ftp.biosciencedbc.jp/archive/frnadb/LATEST/Database_Dump. File size: 673 MB. Number of data entries: 4 files.

  8. Seismic Search Engine: A distributed database for mining large scale seismic data

    Science.gov (United States)

    Liu, Y.; Vaidya, S.; Kuzma, H. A.

    2009-12-01

    The International Monitoring System (IMS) of the CTBTO collects terabytes worth of seismic measurements from many receiver stations situated around the earth with the goal of detecting underground nuclear testing events and distinguishing them from other benign, but more common events such as earthquakes and mine blasts. The International Data Center (IDC) processes and analyzes these measurements, as they are collected by the IMS, to summarize event detections in daily bulletins. Thereafter, the data measurements are archived into a large format database. Our proposed Seismic Search Engine (SSE) will facilitate a framework for data exploration of the seismic database as well as the development of seismic data mining algorithms. Analogous to GenBank, the annotated genetic sequence database maintained by NIH, through SSE, we intend to provide public access to seismic data and a set of processing and analysis tools, along with community-generated annotations and statistical models to help interpret the data. SSE will implement queries as user-defined functions composed from standard tools and models. Each query is compiled and executed over the database internally before reporting results back to the user. Since queries are expressed with standard tools and models, users can easily reproduce published results within this framework for peer-review and making metric comparisons. As an illustration, an example query is “what are the best receiver stations in East Asia for detecting events in the Middle East?” Evaluating this query involves listing all receiver stations in East Asia, characterizing known seismic events in that region, and constructing a profile for each receiver station to determine how effective its measurements are at predicting each event. The results of this query can be used to help prioritize how data is collected, identify defective instruments, and guide future sensor placements.

  9. Integrating systems biology models and biomedical ontologies.

    Science.gov (United States)

    Hoehndorf, Robert; Dumontier, Michel; Gennari, John H; Wimalaratne, Sarala; de Bono, Bernard; Cook, Daniel L; Gkoutos, Georgios V

    2011-08-11

    Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such an integration of data and are often used to annotate biosimulation models in systems biology. We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models. We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms.
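
    The SBML Harvester described above converts ontology-annotated SBML models into OWL. As a hypothetical first step in that direction, the sketch below collects the rdf:resource cross-reference URIs (the MIRIAM-style ontology links) from an SBML file's RDF annotation blocks using the standard library; "model.xml" is a placeholder path and this is not the SBML Harvester code.

        # Hypothetical sketch of a first step toward linking SBML models to
        # biomedical ontologies: collect the rdf:resource cross-reference URIs
        # found in a model's RDF annotations. "model.xml" is a placeholder path.
        import xml.etree.ElementTree as ET

        RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

        def annotation_uris(sbml_path):
            tree = ET.parse(sbml_path)
            uris = set()
            for element in tree.iter():
                resource = element.attrib.get(f"{RDF_NS}resource")
                if resource:
                    uris.add(resource)
            return sorted(uris)

        if __name__ == "__main__":
            for uri in annotation_uris("model.xml"):
                print(uri)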

  10. From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF.

    Science.gov (United States)

    Kılıç, Sefa; Sagitova, Dinara M; Wolfish, Shoshannah; Bely, Benoit; Courtot, Mélanie; Ciufo, Stacy; Tatusova, Tatiana; O'Donovan, Claire; Chibucos, Marcus C; Martin, Maria J; Erill, Ivan

    2016-01-01

    Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges on their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has progressively become also a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high-visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data.Database URL: http://www.collectf.org/. © The Author(s) 2016

  11. Compound image segmentation of published biomedical figures.

    Science.gov (United States)

    Li, Pengyuan; Jiang, Xiangying; Kambhamettu, Chandra; Shatkay, Hagit

    2018-04-01

    Images convey essential information in biomedical publications. As such, there is a growing interest within the bio-curation and the bio-databases communities, to store images within publications as evidence for biomedical processes and for experimental results. However, many of the images in biomedical publications are compound images consisting of multiple panels, where each individual panel potentially conveys a different type of information. Segmenting such images into constituent panels is an essential first step toward utilizing images. In this article, we develop a new compound image segmentation system, FigSplit, which is based on Connected Component Analysis. To overcome shortcomings typically manifested by existing methods, we develop a quality assessment step for evaluating and modifying segmentations. Two methods are proposed to re-segment the images if the initial segmentation is inaccurate. Experimental results show the effectiveness of our method compared with other methods. The system is publicly available for use at: https://www.eecis.udel.edu/~compbio/FigSplit. The code is available upon request. shatkay@udel.edu. Supplementary data are available online at Bioinformatics.
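
    FigSplit is based on Connected Component Analysis. The sketch below shows only that underlying step: labeling connected foreground regions of a binary mask and reporting their bounding boxes with SciPy; the quality-assessment and re-segmentation stages described in the abstract are not reproduced, and the mask is a toy example.

        # Sketch of the connected-component step underlying compound-figure
        # segmentation: label foreground regions of a binary mask and report
        # their bounding boxes. FigSplit's quality assessment and re-segmentation
        # steps are not reproduced here; the mask below is a toy example.
        import numpy as np
        from scipy import ndimage

        mask = np.zeros((10, 14), dtype=int)
        mask[1:4, 1:6] = 1    # toy "panel" 1
        mask[1:4, 8:13] = 1   # toy "panel" 2
        mask[6:9, 2:12] = 1   # toy "panel" 3

        labels, n_panels = ndimage.label(mask)
        for slc in ndimage.find_objects(labels):
            rows, cols = slc
            print(f"panel rows {rows.start}-{rows.stop - 1}, cols {cols.start}-{cols.stop - 1}")
        print(f"{n_panels} candidate panels found")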

  12. A scoping review protocol on the roles and tasks of peer reviewers in the manuscript review process in biomedical journals.

    Science.gov (United States)

    Glonti, Ketevan; Cauchi, Daniel; Cobo, Erik; Boutron, Isabelle; Moher, David; Hren, Darko

    2017-10-22

    The primary functions of peer reviewers are poorly defined. Thus far no body of literature has systematically identified the roles and tasks of peer reviewers of biomedical journals. A clear establishment of these can lead to improvements in the peer review process. The purpose of this scoping review is to determine what is known on the roles and tasks of peer reviewers. We will use the methodological framework first proposed by Arksey and O'Malley and subsequently adapted by Levac et al and the Joanna Briggs Institute. The scoping review will include all study designs, as well as editorials, commentaries and grey literature. The following eight electronic databases will be searched (from inception to May 2017): Cochrane Library, Cumulative Index to Nursing and Allied Health Literature, Educational Resources Information Center, EMBASE, MEDLINE, PsycINFO, Scopus and Web of Science. Two reviewers will use inclusion and exclusion criteria based on the 'Population-Concept-Context' framework to independently screen titles and abstracts of articles considered for inclusion. Full-text screening of relevant eligible articles will also be carried out by two reviewers. The search strategy for grey literature will include searching in websites of existing networks, biomedical journal publishers and organisations that offer resources for peer reviewers. In addition we will review journal guidelines to peer reviewers on how to perform the manuscript review. Journals will be selected using the 2016 journal impact factor. We will identify and assess the top five, middle five and lowest-ranking five journals across all medical specialties. This scoping review will undertake a secondary analysis of data already collected and does not require ethical approval. The results will be disseminated through journals and conferences targeting stakeholders involved in peer review in biomedical research.

  13. Searching the Literatura Latino Americana e do Caribe em Ciências da Saúde (LILACS) database improves systematic reviews.

    Science.gov (United States)

    Clark, Otavio Augusto Camara; Castro, Aldemar Araujo

    2002-02-01

    An unbiased systematic review (SR) should analyse as many articles as possible in order to provide the best evidence available. However, many SR use only databases with high English-language content as sources for articles. Literatura Latino Americana e do Caribe em Ciências da Saúde (LILACS) indexes 670 journals from the Latin American and Caribbean health literature but is seldom used in these SR. Our objective is to evaluate if LILACS should be used as a routine source of articles for SR. First we identified SR published in 1997 in five medical journals with a high impact factor. Then we searched LILACS for articles that could match the inclusion criteria of these SR. We also checked if the authors had already identified these articles located in LILACS. In all, 64 SR were identified. Two had already searched LILACS and were excluded. In 39 of 62 (63%) SR a LILACS search identified articles that matched the inclusion criteria. In 5 (8%) our search was inconclusive and in 18 (29%) no articles were found in LILACS. Therefore, in 71% (44/72) of cases, a LILACS search could have been useful to the authors. This proportion remains the same if we consider only the 37 SR that performed a meta-analysis. In only one case had the article identified in LILACS already been located elsewhere by the authors' strategy. LILACS is an under-explored and unique source of articles whose use can improve the quality of systematic reviews. This database should be used as a routine source to identify studies for systematic reviews.

  14. Update History of This Database - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update history of the FANTOM5 database: 2017/03/14 FANTOM5 English archive... 「 CAGE TSS aggregation 」 「 CAGE peaks 」. 2015/12/07 FANTOM5 archive site is opened. (Archive V1) 2014/03/27 FANTOM5 ( http://fantom... ) is opened.

  15. SCALEUS: Semantic Web Services Integration for Biomedical Applications.

    Science.gov (United States)

    Sernadela, Pedro; González-Castro, Lorena; Oliveira, José Luís

    2017-04-01

    In recent years, we have witnessed an explosion of biological data resulting largely from the demands of life science research. The vast majority of these data are freely available via diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This type of approach has achieved great results in the last few years, but proved to be unfeasible when information needs to be combined or shared among different and scattered sources. During recent years, many of these data distribution challenges have been solved with the adoption of semantic web. Despite the evident benefits of this technology, its adoption introduced new challenges related with the migration process, from existent systems to the semantic level. To facilitate this transition, we have developed Scaleus, a semantic web migration tool that can be deployed on top of traditional systems in order to bring knowledge, inference rules, and query federation to the existent data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers in the creation process of new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/ .
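
    SCALEUS layers triple storage, inference and SPARQL querying on top of existing data. The rdflib sketch below illustrates the kind of triple loading and SPARQL query such a semantic layer exposes; the namespace, triples and query are hypothetical and do not reflect the SCALEUS API.

        # Sketch of the semantic layer a tool like SCALEUS exposes: load a few
        # triples and answer a SPARQL query over them. The namespace and triples
        # below are hypothetical; this is not the SCALEUS API.
        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import RDF, RDFS

        EX = Namespace("http://example.org/biomed/")

        g = Graph()
        g.add((EX.TP53, RDF.type, EX.Gene))
        g.add((EX.TP53, RDFS.label, Literal("TP53")))
        g.add((EX.TP53, EX.associatedWith, EX.LiFraumeniSyndrome))
        g.add((EX.LiFraumeniSyndrome, RDFS.label, Literal("Li-Fraumeni syndrome")))

        query = """
            PREFIX ex: <http://example.org/biomed/>
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?gene ?disease WHERE {
                ?g a ex:Gene ; rdfs:label ?gene ; ex:associatedWith ?d .
                ?d rdfs:label ?disease .
            }
        """
        for gene, disease in g.query(query):
            print(gene, "->", disease)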

  16. Biomedical journals in Republic of Macedonia: the current state.

    Science.gov (United States)

    Polenakovic, Momir; Danevska, Lenche

    2014-01-01

    Several biomedical journals in the Republic of Macedonia have succeeded in maintaining regular publication over the years, but only a few have a long-standing tradition. In this paper we present the basic characteristics of 18 biomedical journals that have been published without a break in the Republic of Macedonia. Of these, more details are given for 14 journals, a particular emphasis being on the journal Prilozi/Contributions of the Macedonian Academy of Sciences and Arts, Section of Medical Sciences as one of the journals with a long-term publishing tradition and one of the journals included in the Medline/PubMed database. A brief or broad description is given for the following journals: Macedonian Medical Review, Acta Morphologica, Physioacta, MJMS-Macedonian Journal of Medical Sciences, International Medical Journal Medicus, Archives of Public Health, Epilepsy, Macedonian Orthopaedics and Traumatology Journal, BANTAO Journal, Macedonian Dental Review, Macedonian Pharmaceutical Bulletin, Macedonian Veterinary Review, Journal of Special Education and Rehabilitation, Balkan Journal of Medical Genetics, Contributions of the Macedonian Scientific Society of Bitola, Vox Medici, Social Medicine: Professional Journal for Public Health, and Prilozi/Contributions of the Macedonian Academy of Sciences and Arts. Journals from Macedonia should aim to be published regularly, should comply with the Uniform requirements for manuscripts submitted to biomedical journals, and with the recommendations of reliable organizations working in the field of publishing and research. These are the key prerequisites which Macedonian journals have to accomplish in order to be included in renowned international bibliographic databases. Thus the results of biomedical science from the Republic of Macedonia will be presented to the international scientific arena.

  17. Usability of some databases for information services in Czechoslovak nuclear programme

    International Nuclear Information System (INIS)

    Kakos, A.

    1988-01-01

    The contents of the databases Chemical Abstracts Search, World Patent Index, Excerpta Medica, Inspec and Compendex were compared with INIS, with regard to the possibility of complementing INIS searches with searches in these other databases. On the basis of test searches made in all of these databases on selected topics falling within the INIS scope, concrete cases were identified in which INIS searches should be complemented with data from some of the other databases. The content-analysis method is described with regard to the concrete search topics, and the areas in which the databases overlap with INIS are given. Numerical results are given. (J.B.). 2 tabs

  18. CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

    Directory of Open Access Journals (Sweden)

    Schmidt Bertil

    2010-04-01

    Full Text Available Abstract Background Due to its high sensitivity, the Smith-Waterman algorithm is widely used for biological database searches. Unfortunately, the quadratic time complexity of this algorithm makes it highly time-consuming. The exponential growth of biological databases further deteriorates the situation. To accelerate this algorithm, many efforts have been made to develop techniques for high performance architectures, especially the recently emerging many-core architectures and their associated programming models. Findings This paper describes the latest release of the CUDASW++ software, CUDASW++ 2.0, which makes new contributions to Smith-Waterman protein database searches using compute unified device architecture (CUDA). A parallel Smith-Waterman algorithm is proposed to further optimize the performance of CUDASW++ 1.0 based on the single instruction, multiple thread (SIMT) abstraction. For the first time, we have investigated a partitioned vectorized Smith-Waterman algorithm using CUDA based on the virtualized single instruction, multiple data (SIMD) abstraction. The optimized SIMT and the partitioned vectorized algorithms were benchmarked and, remarkably, have similar performance characteristics. CUDASW++ 2.0 achieves a performance improvement over CUDASW++ 1.0 of as much as 1.74 (1.72) times using the optimized SIMT algorithm and up to 1.77 (1.66) times using the partitioned vectorized algorithm, with a performance of up to 17 (30) billion cell updates per second (GCUPS) on a single-GPU GeForce GTX 280 (dual-GPU GeForce GTX 295) graphics card. Conclusions CUDASW++ 2.0 is publicly available open-source software, written in the CUDA and C++ programming languages. It obtains a significant performance improvement over CUDASW++ 1.0 using either the optimized SIMT algorithm or the partitioned vectorized algorithm for Smith-Waterman protein database searches by fully exploiting the compute capability of commonly used, CUDA-enabled, low-cost GPUs.
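
    As a point of reference, the sketch below is a plain CPU implementation of the Smith-Waterman recurrence that CUDASW++ accelerates on GPUs; the linear gap penalty and the match/mismatch scores are illustrative assumptions, not the tool's defaults.

        # Minimal CPU reference for the Smith-Waterman local alignment score.
        # Linear gap penalty; scoring parameters are illustrative only.
        def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
            rows, cols = len(a) + 1, len(b) + 1
            H = [[0] * cols for _ in range(rows)]
            best = 0
            for i in range(1, rows):
                for j in range(1, cols):
                    diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
                    H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
                    best = max(best, H[i][j])
            return best  # optimal local alignment score

        print(smith_waterman("HEAGAWGHEE", "PAWHEAE"))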

  19. Alignment - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available. Data file: ftp://ftp.biosciencedbc.jp/archive/sahg/LATEST/sahg_alignment.zip (12.0 MB).

  20. Are we studying what matters? Health priorities and NIH-funded biomedical engineering research.

    Science.gov (United States)

    Rubin, Jessica B; Paltiel, A David; Saltzman, W Mark

    2010-07-01

    With the founding of the National Institute of Biomedical Imaging and Bioengineering (NIBIB) in 1999, the National Institutes of Health (NIH) made explicit its dedication to expanding research in biomedical engineering. Ten years later, we sought to examine how closely federal funding for biomedical engineering aligns with U.S. health priorities. Using a publicly accessible database of research projects funded by the NIH in 2008, we identified 641 grants focused on biomedical engineering, 48% of which targeted specific diseases. Overall, we found that these disease-specific NIH-funded biomedical engineering research projects align with national health priorities, as quantified by three commonly utilized measures of disease burden: cause of death, disability-adjusted survival losses, and expenditures. However, we also found some illnesses (e.g., cancer and heart disease) for which the number of research projects funded deviated from our expectations, given their disease burden. Our findings suggest several possibilities for future studies that would serve to further inform the allocation of limited research dollars within the field of biomedical engineering.

  1. Biomedical Risk Factors of Achilles Tendinopathy in Physically Active People: a Systematic Review.

    Science.gov (United States)

    Kozlovskaia, Maria; Vlahovich, Nicole; Ashton, Kevin J; Hughes, David C

    2017-12-01

    Achilles tendinopathy is the most prevalent tendon disorder in people engaged in running and jumping sports. The aetiology of Achilles tendinopathy is complex and requires comprehensive research of contributing risk factors. There is relatively little research focussing on potential biomedical risk factors for Achilles tendinopathy. The purpose of this systematic review is to identify studies and summarise current knowledge of biomedical risk factors of Achilles tendinopathy in physically active people. Research databases were searched for relevant articles, followed by assessment in accordance with the PRISMA statement and the standards of the Cochrane collaboration. Levels of evidence and quality assessment designation were assigned in accordance with the OCEBM levels of evidence and the Newcastle-Ottawa Quality Assessment Scale, respectively. The systematic review of the literature identified 22 suitable articles. All included studies had a moderate level of evidence (2b), with Newcastle-Ottawa scores varying between 6 and 9. The majority (17) investigated genetic polymorphisms involved in tendon structure and homeostasis and in apoptosis and inflammation pathways. Overweight as a risk factor for Achilles tendinopathy was described in the five included studies that investigated non-genetic factors. COL5A1 genetic variants were the most extensively studied, particularly in association with genetic variants in genes involved in the regulation of cell-matrix interaction in tendon and in matrix homeostasis. It is important to investigate connections and pathways whose interactions might be disrupted and thereby alter collagen structure and lead to the development of pathology. Polymorphisms in genes involved in apoptosis and inflammation did not show a strong association with Achilles tendinopathy but should be considered for further investigation. This systematic review suggests that biomedical risk factors are an important consideration in the future study of propensity to the development of Achilles tendinopathy.

  2. An optimal big data workflow for biomedical image analysis

    Directory of Open Access Journals (Sweden)

    Aurelle Tchagna Kouanou

    Full Text Available Background and objective: In the medical field, data volume is growing rapidly, and traditional methods cannot manage it efficiently. In biomedical computation, the continuing challenges are the management, analysis, and storage of biomedical data. Nowadays, big data technology plays a significant role in the management, organization, and analysis of data, using machine learning and artificial intelligence techniques; it also allows quick access to data using NoSQL databases. Big data technologies thus include new frameworks for processing medical data such as biomedical images. It therefore becomes very important to develop methods and/or architectures based on big data technologies for the complete processing of biomedical image data. Method: This paper describes big data analytics for biomedical images, shows examples reported in the literature, briefly discusses new methods used in processing, and offers conclusions. We argue for adapting and extending related work in the field of big data software, using the Hadoop and Spark frameworks, which provide an optimal and efficient architecture for biomedical image analysis. The paper thus gives a broad overview of big data analytics for automating biomedical image diagnosis, and a workflow with optimal methods and an algorithm for each step is proposed. Results: Two architectures for image classification are suggested: the first is designed with the Hadoop framework and the second with the Spark framework. The proposed Spark architecture allows us to develop appropriate and efficient methods to leverage a large number of images for classification, which can be customized with respect to each other. Conclusions: The proposed architectures are more complete, easier to use, and adaptable at all steps, from conception onward. The Spark architecture is the most complete, because it facilitates the implementation of algorithms with its embedded libraries. Keywords: Biomedical images, Big
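
    As a hedged illustration of the ingestion step in such a Spark-based workflow (not the authors' actual pipeline), the sketch below loads a directory of biomedical images into a DataFrame using PySpark's built-in "image" data source (Spark 2.4 or later); the path and application name are hypothetical.

        # Sketch of the ingestion step of a Spark-based image workflow: load a
        # folder of biomedical images into a DataFrame for downstream feature
        # extraction and classification. Path and app name are hypothetical.
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("biomedical-image-workflow").getOrCreate()

        # The built-in "image" data source yields one row per image with its
        # origin, dimensions and raw pixel data.
        images = spark.read.format("image").load("/data/biomedical_images/")
        images.select("image.origin", "image.height", "image.width").show(truncate=False)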

  3. Locus - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available. Data file: ftp://ftp.biosciencedbc.jp/archive/astra/LATEST/astra_locus.zip (887 KB); records include the splicing type (e.g., cassette).

  4. License - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  5. Download - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  6. Animal Research International: Advanced Search

    African Journals Online (AJOL)

    PROMOTING ACCESS TO AFRICAN RESEARCH ... Animal Research International: Advanced Search ... containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ... Journal of Biomedical Research, African Journal of Biotechnology, African Journal of Chemical Education ...

  7. License - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  8. Download - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  9. Download - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  10. License - RMOS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  11. The Open Spectral Database: an open platform for sharing and searching spectral data.

    Science.gov (United States)

    Chalk, Stuart J

    2016-01-01

    A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. As a result, searching for and retrieving such spectral data can be time consuming, and the data can be difficult to reuse if they are compressed within the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, and searching based on multiple chemical identifiers, and that is open in terms of license and access. To address these issues a new online resource called the Open Spectral Database (OSDB) http://osdb.info/ has been developed and is now available. Built using open source tools, using open code (hosted on GitHub), providing open data, and open to community input about design and functionality, the OSDB is available for anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, internal architecture, export formats, Representational State Transfer (REST) Application Programming Interface and options for submission of data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/, and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities needed to easily develop REST-based websites for the ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent ones in other scripting languages) will be leveraged to make more chemical data available for both humans and computers.

  12. Millennial Students’ Online Search Strategies are Associated With Their Mental Models of Search. A Review of: Holman, L. (2011). Millennial students’ mental models of search: Implications for academic librarians and database developers. Journal of Academic Librarianship, 37(1), 19-27. doi:10.1016/j.acalib.2010.10.003

    Directory of Open Access Journals (Sweden)

    Leslie Bussert

    2011-09-01

    Full Text Available Objective – To examine first-year college students’ information seeking behaviours and determine whether their mental models of the search process influence their ability to effectively search for and find scholarly materials. Design – Mixed methods including contextual inquiry, concept mapping, observation, and interviews. Setting – University of Baltimore, a public institution in Maryland, United States of America, offering undergraduate, graduate, and professional degrees. Subjects – A total of 21 first-year undergraduate students, ages 16 to 19 years, undertaking research assignments for which they chose to use online resources. Methods – First-year students were recruited in the fall of 2008 and met with the researcher in a university usability lab for about one hour over a three-week period. The researcher observed and videotaped the students as they conducted research in their chosen search engines or article databases. The searches were captured using software, and students were encouraged to think aloud about their research process, search strategies, and anticipated search results. Observation sessions concluded with a 10-question interview incorporating a review of the keywords the student used, the student’s reflection on the success of his or her searches, and possible alternate keywords. The interview also offered prompts to help the researcher learn about students’ conceptualizations of search tools’ utilization of keywords to generate results. The researcher then asked the students to provide a visual diagram of the relationship between their search terms and the items retrieved in the search tool. Data were analyzed by identifying the 21 different search tools used by the students and categorizing all 210 searches and student diagrams for further analysis. A scheme similar to Guinee, Eagleton, and Hall’s (2003) characterized the student searches into four categories: simple single-term searches, topic plus focus

  13. License - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  14. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.
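
    A minimal sketch of the document-clustering step the review emphasizes, assuming scikit-learn: abstracts are turned into TF-IDF vectors and grouped with k-means. The toy abstracts and the choice of two clusters are invented for illustration.

        # Minimal document-clustering sketch: TF-IDF vectors + k-means.
        # Toy abstracts and k=2 are illustrative assumptions.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.cluster import KMeans

        abstracts = [
            "BRCA1 mutation and breast cancer risk",
            "Smith-Waterman alignment on GPUs",
            "Tumour suppressor genes in breast cancer",
            "Parallel sequence database search with CUDA",
        ]
        X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print(labels)  # e.g. two cancer abstracts in one cluster, two search abstracts in the other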

  15. Architecture for biomedical multimedia information delivery on the World Wide Web

    Science.gov (United States)

    Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.

    1997-10-01

    Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discusses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among the use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the NHANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extensibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.

  16. License - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  17. Download - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  18. License - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  19. License - RPSD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  20. Subject search study. Final report

    International Nuclear Information System (INIS)

    Todeschini, C.

    1995-01-01

    The study gathered information on how users search the database of the International Nuclear Information System (INIS), using indicators such as subject categories, controlled terms, subject headings, free-text words, and combinations of the above. Users participated from the Australian, French, Russian and Spanish INIS Centres, which have different national languages. Participants, both intermediaries and end users, replied to a questionnaire and executed search queries. The INIS Secretariat at the IAEA also participated. A protocol of all search strategies used in actual searches of the database was kept. Both the thought process and the actual initial search formulation are predominantly non-English among Russian and Spanish users, while the initial formulation tends to be more often in English among French users. A total of 1002 searches were executed by the five INIS centres, including the IAEA. The search protocols indicate the following search behaviour: 1) free-text words represent about 40% of search points in an average query; 2) descriptors used as search keys have the widest range as a percentage of search points, from a low of 25% to a high of 48%; 3) search keys consisting of free text that coincides with a descriptor account for about 15% of search points; 4) subject categories are not used in many searches; 5) free-text words are present as search points in about 80% of all searches; 6) controlled terms (descriptors) are used very extensively and appear in about 90% of all searches; 7) subject headings were used in only a few percent of searches. From the results of the study one can conclude that there is a greater reluctance on the part of non-native English speakers to initiate their searches with free-text word searches. Also, subject categories are little used in searching the database, and both free-text terms and controlled terms are the predominant types of search keys used, whereby the controlled terms are used more

  1. PhysiomeSpace: digital library service for biomedical data.

    Science.gov (United States)

    Testi, Debora; Quadrani, Paolo; Viceconti, Marco

    2010-06-28

    Every research laboratory has a wealth of biomedical data locked up which, if shared with other experts, could dramatically improve biomedical and healthcare research. With the PhysiomeSpace service, it is now possible, with a few clicks, to share biomedical data with selected users in an easy, controlled and safe way. The digital library service is managed using a client-server approach. The client application is used to import, fuse and enrich the data information according to the PhysiomeSpace resource ontology and to upload/download the data to and from the library. The server services are hosted on the Biomed Town community portal, where, through a web interface, the user can complete the metadata curation and share and/or publish the data resources. A search service capitalizes on the domain ontology and on the enrichment of metadata for each resource, providing a powerful discovery environment. Once users have found the data resources they are interested in, they can add them to their basket, following a metaphor popular in e-commerce web sites. When all the necessary resources have been selected, the user can download the basket contents into the client application. The digital library service is now in beta and open to the biomedical research community.

  2. The South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) case register: development and descriptive data

    Directory of Open Access Journals (Sweden)

    Denis Mike

    2009-08-01

    Full Text Available Abstract Background Case registers have been used extensively in mental health research. Recent developments in electronic medical records, and in computer software to search and analyse these in anonymised format, have the potential to revolutionise this research tool. Methods We describe the development of the South London and Maudsley NHS Foundation Trust (SLAM) Biomedical Research Centre (BRC) Case Register Interactive Search tool (CRIS), which allows research-accessible datasets to be derived from SLAM, the largest provider of secondary mental healthcare in Europe. All clinical data, including free text, are available for analysis in the form of anonymised datasets. Development involved both the building of the system and the setting in place of the necessary security (with both functional and procedural elements). Results Descriptive data are presented for the Register database as of October 2008. The database at that point included 122,440 cases, 35,396 of whom were receiving active case management under the Care Programme Approach. In terms of gender and ethnicity, the database was reasonably representative of the source population. The most common assigned primary diagnoses were within the ICD mood disorders category (n = 12,756), followed by schizophrenia and related disorders (8,158), substance misuse (7,749), neuroses (7,105) and organic disorders (6,414). Conclusion The SLAM BRC Case Register represents a 'new generation' of this research design, built on a long-running system of fully electronic clinical records and allowing in-depth secondary analysis of numerical, string and free-text data, whilst preserving anonymity through technical and procedural safeguards.

  3. [Metrology research on biomedical engineering publications from China in recent years].

    Science.gov (United States)

    Yu, Lu; Su, Juan; Wang, Ying; Sha, Xianzheng

    2014-12-01

    The present paper evaluates the scientific research level and development trends of biomedical engineering in China using a metrology analysis of Chinese biomedical engineering scientific literature. PubMed was used to search for the biomedical engineering publications of the last five years that are indexed by the Science Citation Index, and the number of these publications, the number of times they were cited, and the impact factors of the journals were analyzed. The results show that, compared with the rest of the world, although the number of publications from China has increased over the last five years, there is still much room for improvement. Among mainland China, Hong Kong and Taiwan, mainland China maintains a clear advantage in this subject, but Hong Kong has the highest average citation count. Shanghai and Beijing have stronger research capability than other areas of mainland China.
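
    A sketch of the kind of PubMed count query described above, assuming Biopython's Entrez wrapper around the NCBI E-utilities; the query string and e-mail address are placeholders, not the authors' exact search strategy.

        # Count PubMed records matching a query via the NCBI E-utilities
        # (Biopython's Entrez module). Query and email are placeholders.
        from Bio import Entrez

        Entrez.email = "your.name@example.org"  # required by NCBI; placeholder
        query = '"biomedical engineering"[Title/Abstract] AND China[Affiliation]'
        handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
        record = Entrez.read(handle)
        handle.close()
        print("Matching publications:", record["Count"])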

  4. Download - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  5. License - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  6. Download - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  7. Facilitating Full-text Access to Biomedical Literature Using Open Access Resources.

    Science.gov (United States)

    Kang, Hongyu; Hou, Zhen; Li, Jiao

    2015-01-01

    Open access (OA) resources and local libraries often have their own literature databases, especially in the field of biomedicine. We have developed a method for linking a local library to a biomedical OA resource, facilitating researchers' full-text article access. The method uses a vector space model to measure the similarity between an article in the local library and one in the OA resources. The method achieved an F-score of 99.61%. This method of article linkage and mapping between a local library and OA resources is available for use. Through this work, we have improved full-text access to the biomedical OA resources.
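
    A minimal sketch of the vector space matching idea described above, assuming scikit-learn: two article records are represented as TF-IDF vectors and scored by cosine similarity. The texts are toy examples and the 0.5 linking threshold is an arbitrary illustration, not the paper's tuned cut-off.

        # Vector-space matching sketch: TF-IDF vectors + cosine similarity
        # between a local-library record and an OA record.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        local_record = "Facilitating full-text access to biomedical literature using open access resources"
        oa_record = "Open access resources facilitate full text access to the biomedical literature"

        X = TfidfVectorizer().fit_transform([local_record, oa_record])
        score = cosine_similarity(X[0], X[1])[0, 0]
        print(f"cosine similarity = {score:.2f}", "-> link" if score > 0.5 else "-> no link")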

  8. Abstract databases in nuclear medicine; New database for articles not indexed in PubMed

    International Nuclear Information System (INIS)

    Ugrinska, A.; Mustafa, B.

    2004-01-01

    Full text: Abstract databases available on the Internet free of charge were searched for nuclear medicine content. The only comprehensive database found was PubMed. An analysis of the nuclear medicine journals included in PubMed was performed. PubMed contains 25 medical journals whose titles contain the phrase 'nuclear medicine' in different languages. Searching the Internet with the search engine Google, we found four more peer-reviewed journals with the phrase 'nuclear medicine' in their titles. In addition, we are fully aware that many articles related to nuclear medicine are published in national medical journals devoted to general medicine. For example, in the year 2000 colleagues from the Institute of Pathophysiology and Nuclear Medicine, Skopje, Macedonia published 10 articles, none of which could be found in PubMed. This suggests that a large amount of research work is not accessible to the people professionally involved in nuclear medicine. Therefore, we have created a database framework for abstracts that cannot be found in PubMed. The database is organized in a user-friendly manner. There are two main sections: 'post an abstract' and 'search for abstracts'. Authors of articles are expected to submit their work in the section 'post an abstract'. During the submission process, authors fill in separate boxes with the title in English, the title in the original language, the country of origin, the journal name, volume, issue and pages. Authors choose up to five keywords from a drop-down menu, and are encouraged to provide a translation if the abstract is not published in English. The section 'search for abstracts' is searchable by author, keywords, and words and phrases in the English title. The abstract database currently resides on an MS Access back-end, with a front-end in ASP (Active Server Pages). In the future, we plan to migrate the database to an MS SQL Server, which should provide a faster and more reliable framework for hosting a

  9. NIST/Sandia/ICDD Electron Diffraction Database: A Database for Phase Identification by Electron Diffraction.

    Science.gov (United States)

    Carr, M J; Chambers, W F; Melgaard, D; Himes, V L; Stalick, J K; Mighell, A D

    1989-01-01

    A new database containing crystallographic and chemical information designed especially for application to electron diffraction search/match and related problems has been developed. The new database was derived from two well-established x-ray diffraction databases, the JCPDS Powder Diffraction File and NBS CRYSTAL DATA, and incorporates 2 years of experience with an earlier version. It contains 71,142 entries, with space group and unit cell data for 59,612 of those. Unit cell and space group information were used, where available, to calculate patterns consisting of all allowed reflections with d-spacings greater than 0.8 Å for ~59,000 of the entries. Calculated patterns are used in the database in preference to experimental x-ray data when both are available, since experimental x-ray data sometimes omits high d-spacing data which falls at low diffraction angles. Intensity data are not given when calculated spacings are used. A search scheme using chemistry and r-spacing (reciprocal d-spacing) has been developed. Other potentially searchable data in this new database include space group, Pearson symbol, unit cell edge lengths, reduced cell edge length, and reduced cell volume. Compound and/or mineral names, formulas, and journal references are included in the output, as well as pointers to corresponding entries in NBS CRYSTAL DATA and the Powder Diffraction File where more complete information may be obtained. Atom positions are not given. Rudimentary search software has been written to implement a chemistry and r-spacing bit map search. With typical data, a full search through ~71,000 compounds takes 10-20 seconds on a PDP 11/23-RL02 system.
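
    A toy sketch of the chemistry-plus-spacing search scheme described above: keep entries whose element sets contain the requested elements and whose calculated d-spacings match the observed values within a tolerance. The entries and tolerance are invented for illustration and are not taken from the database.

        # Toy chemistry + d-spacing search: filter entries by required elements,
        # then require every observed spacing to match a reference spacing
        # within a tolerance. Entries and tolerance are invented.
        entries = {
            "NaCl (halite)": {"elements": {"Na", "Cl"}, "d_spacings": [3.26, 2.82, 1.99]},
            "Si": {"elements": {"Si"}, "d_spacings": [3.14, 1.92, 1.64]},
        }

        def match(observed_d, required_elements, tol=0.02):
            hits = []
            for name, entry in entries.items():
                if not required_elements <= entry["elements"]:
                    continue
                if all(any(abs(d - ref) <= tol for ref in entry["d_spacings"]) for d in observed_d):
                    hits.append(name)
            return hits

        print(match(observed_d=[3.26, 1.99], required_elements={"Na"}))  # ['NaCl (halite)']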

  10. Application of an efficient Bayesian discretization method to biomedical data

    Directory of Open Access Journals (Sweden)

    Gopalakrishnan Vanathi

    2011-07-01

    Full Text Available Abstract Background Several data mining methods require data that are discrete, and other methods often perform better with discrete data. We introduce an efficient Bayesian discretization (EBD) method for optimal discretization of variables that runs efficiently on high-dimensional biomedical datasets. The EBD method consists of two components, namely, a Bayesian score to evaluate discretizations and a dynamic programming search procedure to efficiently search the space of possible discretizations. We compared the performance of EBD to Fayyad and Irani's (FI) discretization method, which is commonly used for discretization. Results On 24 biomedical datasets obtained from high-throughput transcriptomic and proteomic studies, the classification performances of the C4.5 classifier and the naïve Bayes classifier were statistically significantly better when the predictor variables were discretized using EBD over FI. EBD was statistically significantly more stable to the variability of the datasets than FI. However, EBD was less robust, though not statistically significantly so, than FI and produced slightly more complex discretizations than FI. Conclusions On a range of biomedical datasets, a Bayesian discretization method (EBD) yielded better classification performance and stability but was less robust than the widely used FI discretization method. The EBD discretization method is easy to implement, permits the incorporation of prior knowledge and belief, and is sufficiently fast for application to high-dimensional data.
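
    The sketch below is not the EBD algorithm itself; it illustrates the general idea that both EBD and the FI method build on, namely scoring candidate cut points of a continuous variable against class labels and keeping the best one (here with a simple entropy score and toy data).

        # Not EBD: a minimal supervised discretization sketch that scans
        # candidate thresholds and keeps the one with the lowest weighted
        # class-label entropy. Data are toy values.
        from math import log2
        from collections import Counter

        def entropy(labels):
            n = len(labels)
            return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0

        def best_threshold(values, labels):
            pairs = sorted(zip(values, labels))
            best = (float("inf"), None)
            for i in range(1, len(pairs)):
                left = [l for _, l in pairs[:i]]
                right = [l for _, l in pairs[i:]]
                score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
                cut = (pairs[i - 1][0] + pairs[i][0]) / 2
                best = min(best, (score, cut))
            return best[1]

        expression = [0.2, 0.4, 0.5, 1.8, 2.1, 2.3]
        disease = ["no", "no", "no", "yes", "yes", "yes"]
        print(best_threshold(expression, disease))  # 1.15: a single clean cut point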

  11. Update History of This Database - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available. Update history: 2011/08/25 license updated; 2010/03/29 tRNADB-CE English archive site opened; 2008/7/1 tRNADB-CE (http://...).

  12. The BioIntelligence Framework: a new computational platform for biomedical knowledge computing.

    Science.gov (United States)

    Farley, Toni; Kiefer, Jeff; Lee, Preston; Von Hoff, Daniel; Trent, Jeffrey M; Colbourn, Charles; Mousses, Spyro

    2013-01-01

    Breakthroughs in molecular profiling technologies are enabling a new data-intensive approach to biomedical research, with the potential to revolutionize how we study, manage, and treat complex diseases. The next great challenge for clinical applications of these innovations will be to create scalable computational solutions for intelligently linking complex biomedical patient data to clinically actionable knowledge. Traditional database management systems (DBMS) are not well suited to representing complex syntactic and semantic relationships in unstructured biomedical information, introducing barriers to realizing such solutions. We propose a scalable computational framework for addressing this need, which leverages a hypergraph-based data model and query language that may be better suited for representing complex multi-lateral, multi-scalar, and multi-dimensional relationships. We also discuss how this framework can be used to create rapid learning knowledge base systems to intelligently capture and relate complex patient data to biomedical knowledge in order to automate the recovery of clinically actionable information.
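
    A minimal sketch of the hypergraph idea mentioned above, not the BioIntelligence Framework's actual data model: a hyperedge can connect any number of nodes at once (patient, variant, drug, outcome), which is awkward to express as ordinary binary relations. All names are illustrative.

        # Minimal hypergraph data structure: each hyperedge links an arbitrary
        # set of nodes, so a single clinical observation can relate a patient,
        # a variant, a drug and an outcome in one relationship.
        from collections import defaultdict

        class Hypergraph:
            def __init__(self):
                self.edges = {}                    # edge id -> set of nodes
                self.incidence = defaultdict(set)  # node -> set of edge ids

            def add_edge(self, edge_id, nodes):
                self.edges[edge_id] = set(nodes)
                for node in nodes:
                    self.incidence[node].add(edge_id)

            def neighbors(self, node):
                """All nodes co-occurring with `node` in at least one hyperedge."""
                linked = set()
                for edge_id in self.incidence[node]:
                    linked |= self.edges[edge_id]
                return linked - {node}

        hg = Hypergraph()
        hg.add_edge("observation-1", {"patient:042", "variant:EGFR-L858R", "drug:gefitinib", "outcome:response"})
        hg.add_edge("observation-2", {"patient:042", "diagnosis:NSCLC"})
        print(hg.neighbors("patient:042"))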

  13. Main data - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available. Data file: ftp://ftp.biosciencedbc.jp/archive/rmg/LATEST/rmg_main.zip (1 KB).

  14. Complementary Value of Databases for Discovery of Scholarly Literature: A User Survey of Online Searching for Publications in Art History

    Science.gov (United States)

    Nemeth, Erik

    2010-01-01

    Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…

  15. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…
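
    A tiny illustration of the synonym problem described above: a literal full-text match misses a record that uses a synonym, while expanding the query through a (here invented) thesaurus finds it.

        # Literal full-text matching vs. thesaurus-expanded matching.
        records = [
            "Myocardial infarction outcomes in elderly patients",
            "Statin therapy after heart attack",
        ]
        thesaurus = {"heart attack": {"heart attack", "myocardial infarction"}}

        def search(query, expand=False):
            terms = thesaurus.get(query, {query}) if expand else {query}
            return [r for r in records if any(t in r.lower() for t in terms)]

        print(search("heart attack"))               # only the record with the literal phrase
        print(search("heart attack", expand=True))  # both records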

  16. A comparison of three design tree based search algorithms for the detection of engineering parts constructed with CATIA V5 in large databases

    Directory of Open Access Journals (Sweden)

    Robin Roj

    2014-07-01

    Full Text Available This paper presents three different search engines for the detection of CAD parts in large databases. The analysis of the contained information is performed by exporting the data stored in the structure trees of the CAD models. A preparation program generates one XML file for every model which, in addition to including the data of the structure tree, also contains certain physical properties of each part. The first search engine specializes in the discovery of standard parts, like screws or washers. The second program uses user input as search parameters, and therefore has the ability to perform personalized queries. The third one compares a given reference part with all parts in the database, and locates files that are identical or similar to the reference part. All approaches run automatically, and have the analysis of the structure tree in common. Files constructed with CATIA V5, and search engines written in Python, were used for the implementation. The paper also includes a short comparison of the advantages and disadvantages of each program, as well as a performance test.
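
    A hedged sketch of the structure-tree approach described above: each CAD model is exported as an XML file that is parsed and filtered on part names and physical properties. The XML layout and attribute names are hypothetical, not the paper's actual export schema.

        # Parse a (hypothetical) structure-tree export and filter parts by
        # attributes such as "standard" flag or mass.
        import xml.etree.ElementTree as ET

        xml_data = """
        <model name="assembly_42">
          <part name="hex_screw_M8" mass="0.012" standard="true"/>
          <part name="custom_bracket" mass="0.310" standard="false"/>
        </model>
        """

        root = ET.fromstring(xml_data)
        standard_parts = [p.get("name") for p in root.iter("part") if p.get("standard") == "true"]
        heavy_parts = [p.get("name") for p in root.iter("part") if float(p.get("mass")) > 0.1]
        print(standard_parts)  # ['hex_screw_M8']
        print(heavy_parts)     # ['custom_bracket']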

  17. The Impact of Online Bibliographic Databases on Teaching and Research in Political Science.

    Science.gov (United States)

    Reichel, Mary

    The availability of online bibliographic databases greatly facilitates literature searching in political science. The advantages to searching databases online include combination of concepts, comprehensiveness, multiple database searching, free-text searching, currency, current awareness services, document delivery service, and convenience.…

  18. E-MSD: the European Bioinformatics Institute Macromolecular Structure Database.

    Science.gov (United States)

    Boutselakis, H; Dimitropoulos, D; Fillon, J; Golovin, A; Henrick, K; Hussain, A; Ionides, J; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Oldfield, T; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, J; Tagari, M; Tate, J; Tromm, S; Velankar, S; Vranken, W

    2003-01-01

    The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.

  19. Searching the PASCAL database - A user's perspective

    Science.gov (United States)

    Jack, Robert F.

    1989-01-01

    The operation of PASCAL, a bibliographic data base covering broad subject areas in science and technology, is discussed. The data base includes information from about 1973 to the present, including topics in engineering, chemistry, physics, earth science, environmental science, biology, psychology, and medicine. Data from 1986 to the present may be searched using DIALOG. The procedures and classification codes for searching PASCAL are presented. Examples of citations retrieved from the data base are given and suggestions are made concerning when to use PASCAL.

  20. The MAR databases: development and implementation of databases specific for marine metagenomics.

    Science.gov (United States)

    Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

    2018-01-04

    We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, which represents a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of their level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. NIMS structural materials databases and cross search engine - MatNavi

    Energy Technology Data Exchange (ETDEWEB)

    Yamazaki, M.; Xu, Y.; Murata, M.; Tanaka, H.; Kamihira, K.; Kimura, K. [National Institute for Materials Science, Tokyo (Japan)

    2007-06-15

    The Materials Database Station (MDBS) of the National Institute for Materials Science (NIMS) operates the world's largest Internet materials database for academic and industrial purposes, which is composed of twelve databases: five concerning structural materials, five concerning basic physical properties, one for superconducting materials and one for polymers. All of these databases are open to Internet access at the website http://mits.nims.go.jp/en. Online tools for predicting the properties of polymers and composite materials are also available. The NIMS structural materials databases comprise the structural materials data sheet online version (creep, fatigue, corrosion and space-use materials strength), the microstructure database for crept materials, the pressure vessel materials database and the CCT diagrams for welding. (orig.)

  2. [Biomedical informatics].

    Science.gov (United States)

    Capurro, Daniel; Soto, Mauricio; Vivent, Macarena; Lopetegui, Marcelo; Herskovic, Jorge R

    2011-12-01

    Biomedical informatics is a new discipline that arose from the need to incorporate information technologies into the generation, storage, distribution and analysis of information in the domain of the biomedical sciences. The discipline comprises basic biomedical informatics and public health informatics. The development of the discipline in Chile has been modest, and most projects have originated from the interest of individual people or institutions, without a systematic and coordinated national development. Considering the unique features of the health care system of our country, research in the area of biomedical informatics is becoming an imperative.

  3. DDPC: Dragon database of genes associated with prostate cancer

    KAUST Repository

    Maqungo, Monique; Kaur, Mandeep; Kwofie, Samuel K.; Radovanovic, Aleksandar; Schaefer, Ulf; Schmeier, Sebastian; Oppon, Ekow; Christoffels, Alan; Bajic, Vladimir B.

    2010-01-01

    associated with Prostate Cancer (DDPC) as an integrated knowledgebase of genes experimentally verified as implicated in PC. DDPC is distinctive from other databases in that (i) it provides pre-compiled biomedical text-mining information on PC, which otherwise

  4. License - Q-TARO | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  5. Download - GenLibi | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  6. Automatic sorting of toxicological information into the IUCLID (International Uniform Chemical Information Database) endpoint-categories making use of the semantic search engine Go3R.

    Science.gov (United States)

    Sauer, Ursula G; Wächter, Thomas; Hareng, Lars; Wareing, Britta; Langsch, Angelika; Zschunke, Matthias; Alvers, Michael R; Landsiedel, Robert

    2014-06-01

    The knowledge-based search engine Go3R, www.Go3R.org, has been developed to assist scientists from industry and regulatory authorities in collecting comprehensive toxicological information with a special focus on identifying available alternatives to animal testing. The semantic search paradigm of Go3R makes use of expert knowledge on 3Rs methods and regulatory toxicology, laid down in the ontology, a network of concepts, terms, and synonyms, to recognize the contents of documents. Search results are automatically sorted into a dynamic table of contents presented alongside the list of documents retrieved. This table of contents allows the user to quickly filter the set of documents by topics of interest. Documents containing hazard information are automatically assigned to a user interface following the endpoint-specific IUCLID5 categorization scheme required, e.g. for REACH registration dossiers. For this purpose, complex endpoint-specific search queries were compiled and integrated into the search engine (based upon a gold standard of 310 references that had been assigned manually to the different endpoint categories). Go3R sorts 87% of the references concordantly into the respective IUCLID5 categories. Currently, Go3R searches in the 22 million documents available in the PubMed and TOXNET databases. However, it can be customized to search in other databases including in-house databanks. Copyright © 2013 Elsevier Ltd. All rights reserved.

  7. Navigating the Path to a Biomedical Science Career

    Science.gov (United States)

    Zimmerman, Andrea McNeely

    The number of biomedical PhD scientists being trained and graduated far exceeds the number of academic faculty positions and academic research jobs. If this trend is compelling biomedical PhD scientists to increasingly seek career paths outside of academia, then more should be known about their intentions, desires, training experiences, and career path navigation. Therefore, the purpose of this study was to understand the process through which biomedical PhD scientists are trained and supported for navigating future career paths. In addition, the study sought to determine whether career development support efforts and opportunities should be redesigned to account for the proportion of PhD scientists following non-academic career pathways. Guided by the social cognitive career theory (SCCT) framework this study sought to answer the following central research question: How does a southeastern tier 1 research university train and support its biomedical PhD scientists for navigating their career paths? Key findings are: Many factors influence PhD scientists' career sector preference and job search process, but the most influential were relationships with faculty, particularly the mentor advisor; Planned activities are a significant aspect of the training process and provide skills for career success; and Planned activities provided skills necessary for a career, but influential factors directed the career path navigated. Implications for practice and future research are discussed.

  8. LSDB Archive - KEGG MEDICUS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available. Database name: KEGG MEDICUS. Organism: Human (Taxonomy ID: 9606). Database description: KEGG MEDICUS is an integrated database in which the package inserts of all marketed drugs in Japan and the USA are integrated with the KEGG DRUG and KEGG DISEASE databases.

  9. A review on quantum search algorithms

    Science.gov (United States)

    Giri, Pulak Ranjan; Korepin, Vladimir E.

    2017-12-01

    The use of superposition of states in quantum computation, known as quantum parallelism, has a significant advantage in terms of speed over classical computation. This is evident from early quantum algorithms such as Deutsch's algorithm, the Deutsch-Jozsa algorithm and its variation the Bernstein-Vazirani algorithm, Simon's algorithm, Shor's algorithms, etc. Quantum parallelism also significantly speeds up database search, which is important in computer science because search appears as a subroutine in many important algorithms. Grover's quantum database search achieves the task of finding the target element in an unsorted database in a time quadratically faster than a classical computer. We review Grover's quantum search algorithms for a single target element and for multiple target elements in a database. The partial search algorithm of Grover and Radhakrishnan and its optimization by Korepin, called the GRK algorithm, are also discussed.
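
    The quadratic speed-up can be made concrete with a small classical simulation of Grover's iteration (a sketch, assuming NumPy): for one marked item in an unsorted database of N = 16 entries, roughly (pi/4)*sqrt(N), i.e. about 3, rounds of "oracle plus inversion about the mean" make the marked index far more probable than any other.

        # Classical simulation of Grover's search for one marked item in N = 16.
        import numpy as np

        N, marked = 16, 11
        state = np.full(N, 1 / np.sqrt(N))           # uniform superposition

        for _ in range(int(round(np.pi / 4 * np.sqrt(N)))):
            state[marked] *= -1                      # oracle flips the marked amplitude
            state = 2 * state.mean() - state         # inversion about the mean (diffusion)

        probabilities = state ** 2
        print(int(probabilities.argmax()), round(float(probabilities[marked]), 3))  # 11, ~0.96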

  10. License - RGP gmap | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  11. Download - Plabrain DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

  12. Retracted Publications in the Biomedical Literature from Open Access Journals.

    Science.gov (United States)

    Wang, Tao; Xing, Qin-Rui; Wang, Hui; Chen, Wei

    2018-03-07

    The number of articles published in open access journals (OAJs) has increased dramatically in recent years. Simultaneously, the quality of publications in these journals has been called into question. Few studies have explored the retraction rate from OAJs. The purpose of the current study was to determine the reasons for retractions of articles from OAJs in biomedical research. The Medline database was searched through PubMed to identify retracted publications in OAJs. The journals were identified by the Directory of Open Access Journals. Data were extracted from each retracted article, including the time from publication to retraction, causes, journal impact factor, and country of origin. Trends in the characteristics related to retraction were determined. Data from 621 retracted studies were included in the analysis. The number and rate of retractions have increased since 2010. The most common reasons for retraction are errors (148), plagiarism (142), duplicate publication (101), fraud/suspected fraud (98) and invalid peer review (93). The number of retracted articles from OAJs has been steadily increasing. Misconduct was the primary reason for retraction. The majority of retracted articles were from journals with low impact factors and authored by researchers from China, India, Iran, and the USA.

  13. Deja vu: a database of highly similar citations in the scientific literature.

    Science.gov (United States)

    Errami, Mounir; Sun, Zhaohui; Long, Tara C; George, Angela C; Garner, Harold R

    2009-01-01

    In the scientific research community, plagiarism and covert multiple publications of the same data are considered unacceptable because they undermine the public confidence in the scientific integrity. Yet, little has been done to help authors and editors to identify highly similar citations, which sometimes may represent cases of unethical duplication. For this reason, we have made available Déjà vu, a publicly available database of highly similar Medline citations identified by the text similarity search engine eTBLAST. Following manual verification, highly similar citation pairs are classified into various categories ranging from duplicates with different authors to sanctioned duplicates. Déjà vu records also contain user-provided commentary and supporting information to substantiate each document's categorization. Déjà vu and eTBLAST are available to authors, editors, reviewers, ethicists and sociologists to study, intercept, annotate and deter questionable publication practices. These tools are part of a sustained effort to enhance the quality of Medline as 'the' biomedical corpus. The Déjà vu database is freely accessible at http://spore.swmed.edu/dejavu. The tool eTBLAST is also freely available at http://etblast.org.

  14. License - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  15. License - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  16. License - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  17. Evaluation of Federated Searching Options for the School Library

    Science.gov (United States)

    Abercrombie, Sarah E.

    2008-01-01

    Three hosted federated search tools, Follett One Search, Gale PowerSearch Plus, and WebFeat Express, were configured and implemented in a school library. Databases from five vendors and the OPAC were systematically searched. Federated search results were compared with each other and to the results of the same searches in the database's native…

  18. Download - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  19. Download - Metabolonote | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  20. Reference - PLACE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: place_reference.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/place/LATEST/...).

  1. Multilingual access to full text databases

    International Nuclear Information System (INIS)

    Fluhr, C.; Radwan, K.

    1990-05-01

    Many full text databases are available in only one language; moreover, some contain documents in several languages. Even when users understand the language of the documents in the database, it is often easier for them to express their information need in their own language. For databases containing documents in several languages, it is also simpler to formulate the query in a single language and retrieve documents across all of them. This paper presents the development and first experiments of multilingual search, applied to the French-English pair, for text data in the nuclear field, based on the SPIRIT system. After reviewing the general problems of searching full text databases with queries formulated in natural language, we present the methods used to reformulate queries and show how they can be expanded for multilingual search. First results on nuclear-field data (AFCEN standards and INIS abstracts) are presented. 4 refs
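
    As a minimal illustration of the query reformulation and expansion step described above, the following Python sketch expands a query with entries from a toy bilingual dictionary and ranks documents by term overlap. The dictionary, documents and scoring are invented placeholders, not part of the SPIRIT system.

      # Minimal sketch of bilingual query expansion for cross-language retrieval.
      # The dictionary and documents are illustrative placeholders only.
      FR_EN = {
          "reacteur": ["reactor"],
          "combustible": ["fuel"],
          "surete": ["safety"],
      }

      def expand_query(terms):
          """Return the original query terms plus their translations."""
          expanded = set(terms)
          for term in terms:
              expanded.update(FR_EN.get(term, []))
          return expanded

      def search(documents, terms):
          """Rank documents by how many expanded query terms they contain."""
          expanded = expand_query(terms)
          scored = []
          for doc_id, text in documents.items():
              overlap = len(expanded & set(text.lower().split()))
              if overlap:
                  scored.append((overlap, doc_id))
          return sorted(scored, reverse=True)

      docs = {
          "fr-1": "surete du reacteur et gestion du combustible",
          "en-1": "reactor safety assessment and fuel handling",
      }
      print(search(docs, ["reacteur", "surete"]))  # both documents are retrieved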

  2. License - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  3. License - GETDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  4. Ocean Drilling Program: Janus Web Database

    Science.gov (United States)

    ODP and IODP data are stored in the Janus web database. The site provides an online database search, the Janus data model, a data migration overview, data types and examples, and forms for sending questions or comments about the online database and for requesting data not available online.

  5. Literature database aid

    International Nuclear Information System (INIS)

    Wanderer, J.A.

    1991-01-01

    The booklet is intended to help with the acquisition of original literature, either after a conventional literature search or, in particular, after a database search. It bridges the gap between abbreviated (short) and original (long) titles. This, together with information on the holdings of technical/scientific libraries, facilitates document delivery. 1500 short titles are listed alphabetically. (orig.) [de

  6. Preference vs. Authority: A Comparison of Student Searching in a Subject-Specific Indexing and Abstracting Database and a Customized Discovery Layer

    Science.gov (United States)

    Dahlen, Sarah P. C.; Hanson, Kathlene

    2017-01-01

    Discovery layers provide a simplified interface for searching library resources. Libraries with limited finances make decisions about retaining indexing and abstracting databases when similar information is available in discovery layers. These decisions should be informed by student success at finding quality information as well as satisfaction…

  7. License - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available If you use data from this database, please be sure to attribute it as FANTOM5.

  8. Optimization of partial search

    International Nuclear Information System (INIS)

    Korepin, Vladimir E

    2005-01-01

    A quantum Grover search algorithm can find a target item in a database faster than any classical algorithm. One can trade accuracy for speed and find a part of the database (a block) containing the target item even faster; this is partial search. A partial search algorithm was recently suggested by Grover and Radhakrishnan. Here we optimize it. Efficiency of the search algorithm is measured by the number of queries to the oracle. The author suggests a new version of the Grover-Radhakrishnan algorithm which uses a minimal number of such queries. The algorithm can run on the same hardware that is used for the usual Grover algorithm. (letter to the editor)
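
    For orientation, the query counts can be written schematically as follows; the exact constant in the partial-search saving is derived in the paper, and treating it as an order-one constant c is an assumption of this sketch:

      T_{\mathrm{full}} \approx \frac{\pi}{4}\sqrt{N},
      \qquad
      T_{\mathrm{partial}}(K) \approx \frac{\pi}{4}\sqrt{N} - c\,\sqrt{\frac{N}{K}},

    where N is the number of database items and K the number of blocks, so the second term is the saving obtained by locating only the block containing the target rather than the target itself.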

  9. Exon - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Contents: exons in variants. Data file: astra_exon.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/a...).

  10. PreBIND and Textomy – mining the biomedical literature for protein-protein interactions using a support vector machine

    Directory of Open Access Journals (Sweden)

    Baskin Berivan

    2003-03-01

    Full Text Available Abstract Background The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles, where they are inaccessible to computational methods. The Biomolecular Interaction Network Database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task of backfilling the database could be reduced by using support vector machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND. Results Cross-validation estimated that the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information were 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high-throughput interactions present in another yeast protein-interaction database. Finally, this system was applied to a real-world curation problem and its use was found to reduce the task duration by 70%, thus saving 176 days. Conclusions Machine learning methods are useful as tools to direct interaction and pathway database backfilling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.
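
    As a rough, generic illustration of the classification step (not the original PreBIND/Textomy implementation), a support vector machine can be trained on labelled abstracts with a standard toolkit such as scikit-learn; the toy abstracts and labels below are invented for the example.

      # Sketch of an SVM abstract classifier in the spirit of the record above.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      abstracts = [
          "Protein A interacts with protein B in a yeast two-hybrid screen.",
          "We report the crystal structure of an unrelated enzyme.",
          "Co-immunoprecipitation shows binding between kinase X and substrate Y.",
          "A new culture medium improves cell growth rates.",
      ]
      labels = [1, 0, 1, 0]  # 1 = abstract describes a protein-protein interaction

      # TF-IDF features feeding a linear support vector machine.
      model = make_pipeline(TfidfVectorizer(), LinearSVC())
      model.fit(abstracts, labels)
      print(model.predict(["The receptor binds its ligand and forms a complex."]))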

  11. Scopus database: a review.

    Science.gov (United States)

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is all-inclusive; rather, the two complement each other. If a library can afford only one, the choice must be based on institutional needs.

  12. Using the TIGR gene index databases for biological discovery.

    Science.gov (United States)

    Lee, Yuandan; Quackenbush, John

    2003-11-01

    The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.

  13. Astronomical databases of Nikolaev Observatory

    Science.gov (United States)

    Protsyuk, Y.; Mazhaev, A.

    2008-07-01

    Several astronomical databases have been created at Nikolaev Observatory in recent years. The databases are built with MySQL and PHP scripts. They are available on the NAO web site at http://www.mao.nikolaev.ua.

  14. Effective Image Database Search via Dimensionality Reduction

    DEFF Research Database (Denmark)

    Dahl, Anders Bjorholm; Aanæs, Henrik

    2008-01-01

    Image search using the bag-of-words image representation is investigated further in this paper. This approach has shown promising results for large scale image collections, making it relevant for Internet applications. The steps involved in the bag-of-words approach are feature extraction, vocabulary building, and searching with a query image. It is important to keep the computational cost low through all steps. In this paper we focus on the efficiency of the technique. To do that we substantially reduce the dimensionality of the features by the use of PCA and addition of color. Building of the visual vocabulary is typically done using k-means. We investigate a clustering algorithm based on the leader-follower principle (LF-clustering), in which the number of clusters is not fixed. The adaptive nature of LF-clustering is shown to improve the quality of the visual vocabulary using this…
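
    A minimal sketch of this pipeline is shown below, with PCA for dimensionality reduction and k-means standing in for the LF-clustering investigated in the paper; the random descriptors are placeholders for real local image features.

      # Bag-of-words image indexing sketch: PCA-reduced descriptors, small
      # visual vocabulary, and a word histogram for one "image".
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      descriptors = rng.normal(size=(500, 128))      # stand-ins for 128-D local features

      pca = PCA(n_components=32).fit(descriptors)    # reduce dimensionality
      reduced = pca.transform(descriptors)

      vocab = KMeans(n_clusters=20, n_init=10, random_state=0).fit(reduced)

      image_feats = pca.transform(rng.normal(size=(40, 128)))   # features of a query image
      words = vocab.predict(image_feats)
      histogram = np.bincount(words, minlength=20)   # bag-of-words representation
      print(histogram)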

  15. Multi-lingual search engine to access PubMed monolingual subsets: a feasibility study.

    Science.gov (United States)

    Darmoni, Stéfan J; Soualmia, Lina F; Griffon, Nicolas; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse

    2013-01-01

    PubMed contains many articles in languages other than English, but it is difficult to find them using the English version of the Medical Subject Headings (MeSH) thesaurus. The aim of this work is to propose a tool allowing access to a PubMed subset in one language, and to evaluate its performance. Translations of MeSH were enriched and gathered in the information system. PubMed subsets in the main European languages were also added to our database, using a dedicated parser. The CISMeF generic semantic search engine was evaluated on the response time for simple queries. MeSH descriptors are currently available in 11 languages in the information system. All 654,000 PubMed citations in French were integrated into the CISMeF database. None of the response times exceeded the threshold defined for usability (2 seconds). It is now possible to freely access biomedical literature in French using a tool in French; health professionals and lay people with limited English proficiency may find it useful. It will be extended to several European languages: German, Spanish, Norwegian and Portuguese.

  16. International patent applications for non-injectable naloxone for opioid overdose reversal: Exploratory search and retrieve analysis of the PatentScope database.

    Science.gov (United States)

    McDonald, Rebecca; Danielsson Glende, Øyvind; Dale, Ola; Strang, John

    2018-02-01

    Non-injectable naloxone formulations are being developed for opioid overdose reversal, but only limited data have been published in the peer-reviewed domain. Through examination of a hitherto-unsearched database, we expand public knowledge of non-injectable formulations, tracing their development and novelty, with the aim to describe and compare their pharmacokinetic properties. (i) The PatentScope database of the World Intellectual Property Organization was searched for relevant English-language patent applications; (ii) Pharmacokinetic data were extracted, collated and analysed; (iii) PubMed was searched using Boolean search query '(nasal OR intranasal OR nose OR buccal OR sublingual) AND naloxone AND pharmacokinetics'. Five hundred and twenty-two PatentScope and 56 PubMed records were identified: three published international patent applications and five peer-reviewed papers were eligible. Pharmacokinetic data were available for intranasal, sublingual, and reference routes. Highly concentrated formulations (10-40 mg mL⁻¹) had been developed and tested. Sublingual bioavailability was very low (1%; relative to intravenous). Non-concentrated intranasal spray (1 mg mL⁻¹; 1 mL per nostril) had low bioavailability (11%). Concentrated intranasal formulations (≥10 mg mL⁻¹) had bioavailability of 21-42% (relative to intravenous) and 26-57% (relative to intramuscular), with peak concentrations (dose-adjusted Cmax = 0.8-1.7 ng mL⁻¹) reached in 19-30 min (tmax). Exploratory analysis identified intranasal bioavailability as associated positively with dose and negatively with volume. We find consistent direction of development of intranasal sprays to high-concentration, low-volume formulations with bioavailability in the 20-60% range. These have potential to deliver a therapeutic dose in 0.1 mL volume. [McDonald R, Danielsson Glende Ø, Dale O, Strang J. International patent applications for non-injectable naloxone for opioid overdose reversal

  17. Database in Artificial Intelligence.

    Science.gov (United States)

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  18. An effective suggestion method for keyword search of databases

    KAUST Repository

    Huang, Hai; Chen, Zonghai; Liu, Chengfei; Huang, He; Zhang, Xiangliang

    2016-01-01

    This paper solves the problem of providing high-quality suggestions for user keyword queries over databases. With the assumption that the returned suggestions are independent, existing query suggestion methods over databases score candidate

  19. Undergraduates Prefer Federated Searching to Searching Databases Individually. A Review of: Belliston, C. Jeffrey, Jared L. Howland, & Brian C. Roberts. "Undergraduate Use of Federated Searching: A Survey of Preferences and Perceptions of Value-Added Functionality." College & Research Libraries 68.6 (Nov. 2007): 472-86.

    Directory of Open Access Journals (Sweden)

    Genevieve Gore

    2008-09-01

    Full Text Available Objective – To determine whether use of federated searching by undergraduates saves time, meets their information needs, is preferred over searching databases individually, and provides results of higher quality. Design – Crossover study. Setting – Three American universities, all members of the Consortium of Church Libraries & Archives (CCLA): BYU (Brigham Young University), a large research university; BYUH (Brigham Young University – Hawaii), a small baccalaureate college; and BYUI (Brigham Young University – Idaho), a large baccalaureate college. Subjects – Ninety-five participants recruited via e-mail invitations sent to a random sample of currently enrolled undergraduates at BYU, BYUH, and BYUI. Methods – Participants were given written directions to complete a literature search for journal articles on two biology-related topics using two search methods: 1. federated searching with WebFeat® (implemented in the same way for this study at the three universities), and 2. a hyperlinked list of databases to search individually. Both methods used the same set of seven databases. Each topic was assigned in random order to one of the two search methods, also assigned in random order, for a total of two searches per participant. The time to complete the searches was recorded. Students compiled their lists of citations, which were later normalized and graded. To analyze the quality of the citations, one quantitative rubric was created by librarians and one qualitative rubric was approved by a faculty member at BYU. The librarian-created rubric included the journal impact factor (from ISI's Journal Citation Reports®), the proportion of citations from peer-reviewed journals (determined from Ulrichsweb.com™) to total citations, and the timeliness of the articles. The faculty-approved rubric included three criteria: relevance to the topic, quality of the individual citations (good quality: primary research results, peer-reviewed sources, and

  20. Flat Files - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: jsnp_flat_files (File URL: ftp://ftp.biosciencedbc.jp/archiv...).

  1. Biomedical engineering and nanotechnology

    International Nuclear Information System (INIS)

    Pawar, S.H.; Khyalappa, R.J.; Yakhmi, J.V.

    2009-01-01

    This book is predominantly a compilation of papers presented at the conference, which focused on developments in biomedical materials, biomedical devices and instrumentation, biomedical effects of electromagnetic radiation, electrotherapy, radiotherapy, biosensors, biotechnology, bioengineering, tissue engineering, clinical engineering and surgical planning, medical imaging, hospital system management, biomedical education, biomedical industry and society, bioinformatics, structured nanomaterials for biomedical applications, nano-composites, nano-medicine, synthesis of nanomaterials, and nanoscience and technology development. The papers presented herein provide scientific substance relevant to researchers in the fields of biomedicine, biomedical engineering, materials science and nanotechnology. Papers relevant to INIS are indexed separately

  2. Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

    Science.gov (United States)

    Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

    2009-04-01

    We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).
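
    The first evaluation step can be illustrated, in spirit, by matching observed fragment masses against theoretical masses computed from candidate sequences; the residue masses below are approximate average values and the peak list and candidates are invented, so this is not Ariadne's actual scoring.

      # Match observed RNase-fragment masses to theoretical masses within a tolerance.
      RESIDUE_MASS = {"A": 329.2, "C": 305.2, "G": 345.2, "U": 306.2}  # approx. Da
      WATER = 18.0

      def fragment_mass(seq):
          return sum(RESIDUE_MASS[base] for base in seq) + WATER

      def match(observed_peaks, candidates, tol=0.5):
          """Return (sequence, theoretical mass, peak) triples within tol Da."""
          hits = []
          for seq in candidates:
              theo = fragment_mass(seq)
              for peak in observed_peaks:
                  if abs(peak - theo) <= tol:
                      hits.append((seq, round(theo, 1), peak))
          return hits

      print(match([1304.0, 998.0], ["AUCG", "GGC", "UUCG"]))  # only AUCG matches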

  3. Biomedical photonics handbook biomedical diagnostics

    CERN Document Server

    Vo-Dinh, Tuan

    2014-01-01

    Shaped by Quantum Theory, Technology, and the Genomics RevolutionThe integration of photonics, electronics, biomaterials, and nanotechnology holds great promise for the future of medicine. This topic has recently experienced an explosive growth due to the noninvasive or minimally invasive nature and the cost-effectiveness of photonic modalities in medical diagnostics and therapy. The second edition of the Biomedical Photonics Handbook presents fundamental developments as well as important applications of biomedical photonics of interest to scientists, engineers, manufacturers, teachers, studen

  4. The Ontology for Biomedical Investigations.

    Science.gov (United States)

    Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias; Brush, Matthew H; Bug, Bill; Chibucos, Marcus C; Clancy, Kevin; Courtot, Mélanie; Derom, Dirk; Dumontier, Michel; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Gibson, Frank; Gonzalez-Beltran, Alejandra; Haendel, Melissa A; He, Yongqun; Heiskanen, Mervi; Hernandez-Boussard, Tina; Jensen, Mark; Lin, Yu; Lister, Allyson L; Lord, Phillip; Malone, James; Manduchi, Elisabetta; McGee, Monnie; Morrison, Norman; Overton, James A; Parkinson, Helen; Peters, Bjoern; Rocca-Serra, Philippe; Ruttenberg, Alan; Sansone, Susanna-Assunta; Scheuermann, Richard H; Schober, Daniel; Smith, Barry; Soldatova, Larisa N; Stoeckert, Christian J; Taylor, Chris F; Torniai, Carlo; Turner, Jessica A; Vita, Randi; Whetzel, Patricia L; Zheng, Jie

    2016-01-01

    The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed

  5. Personalized Search

    CERN Document Server

    AUTHOR|(SzGeCERN)749939

    2015-01-01

    As the volume of electronically available information grows, relevant items become harder to find. This work presents an approach to personalizing search results in scientific publication databases, focusing on re-ranking search results from existing search engines such as Solr or Elasticsearch. It also includes the development of Obelix, a new recommendation system used to re-rank search results. The project was proposed and performed at CERN, using the scientific publications available on the CERN Document Server (CDS). Re-ranking was evaluated with offline and online experiments on users and documents in CDS. The experiments conclude that personalized search results outperform both latest-first and word-similarity ranking in terms of click position in the search results for global search in CDS.

  6. Atomic Spectra Database (ASD)

    Science.gov (United States)

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  7. Uploading, Searching and Visualizing of Paleomagnetic and Rock Magnetic Data in the Online MagIC Database

    Science.gov (United States)

    Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Donadini, F.

    2007-12-01

    The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all available measurements and derived properties from paleomagnetic studies of directions and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and will soon implement two search nodes, one for paleomagnetism and one for rock magnetism. Currently the PMAG node is operational. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. Users can also browse the database by data type or by data compilation to view all contributions associated with well known earlier collections like PINT, GMPDB or PSVRL. The query result set is displayed in a digestible tabular format allowing the user to descend from locations to sites, samples, specimens and measurements. At each stage, the result set can be saved and, where appropriate, can be visualized by plotting global location maps, equal area, XY, age, and depth plots, or typical Zijderveld, hysteresis, magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (version 2.3) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload

  8. Protein - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: ..._protein.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/at_atlas/LATEST/at_atla...).

  9. Download - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Downloads include HeliscopeCAGE sequencing, Delve mapping and CAGE TSS aggregation data, e.g. fantom5_new_experimental_details.zip (273 KB), basic (1.3 TB), and reprocessed data (fantom5_rp_exp_details...).

  10. License - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available The data are provided under a ...-Share Alike 2.1 Japan license. If you use data from this database, please be sure to attribute it as tRNADB-CE.

  11. Online Databases for Health Professionals

    OpenAIRE

    Marshall, Joanne Gard

    1987-01-01

    Recent trends in the marketing of electronic information technology have increased interest among health professionals in obtaining direct access to online biomedical databases such as Medline. During 1985, the Canadian Medical Association (CMA) and Telecom Canada conducted an eight-month trial of the use made of online information retrieval systems by 23 practising physicians and one pharmacist. The results of this project demonstrated both the value and the limitations of these systems in p...

  12. Home | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Entries from the Database Center for Life Science and the National Bioscience Database Center (Kousaku Okubo), including dictionary-type data (e.g. human organs) and a dictionary covering 9 species (human, mouse, rat, zeb...).

  13. Stemcell Information: SKIP000331 [SKIP Stemcell Database[Archive

    Lifescience Database Archive (English)

    Full Text Available t Available National Institute of Biomedical Innovation. 独立行政法人医薬基盤研究所JCRB細胞バンク http://cellbank.nibio.go.jp/~cellbank/cgi-bin/search_res_det.cgi?DB_NUM=1&ID=3527 ...

  14. Image BOSS: a biomedical object storage system

    Science.gov (United States)

    Stacy, Mahlon C.; Augustine, Kurt E.; Robb, Richard A.

    1997-05-01

    Researchers using biomedical images have data management needs that are orthogonal to those of clinical PACS. The Image BOSS system is designed to permit researchers to organize and select images based on research topic, image metadata, and a thumbnail of the image. Image information is captured from existing images in a Unix-based filesystem, stored in an object-oriented database, and presented to the user in a familiar laboratory-notebook metaphor. In addition, ImageBOSS is designed to provide an extensible infrastructure for future content-based queries directly on the images.

  15. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

    Full Text Available Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS v.2 database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~ 38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores
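
    Purely as a schematic of how such terms might be combined into a single score (the actual CPASS similarity function and its weights are not given in this record), a sequence-based score could be penalized by the structural differences; every weight and the functional form below are made-up illustrations.

      # Hypothetical combination of a sequence score with structural penalties.
      def site_similarity(seq_score, sasa_diff, ligand_rmsd, cb_dist_diff,
                          weights=(1.0, 0.2, 0.5, 0.3)):
          """Higher is more similar; penalty terms shrink the sequence score."""
          w0, w1, w2, w3 = weights
          penalty = w1 * sasa_diff + w2 * ligand_rmsd + w3 * cb_dist_diff
          return max(0.0, w0 * seq_score - penalty)

      print(site_similarity(seq_score=42.0, sasa_diff=5.0,
                            ligand_rmsd=1.2, cb_dist_diff=2.0))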

  16. Download - eSOL | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  17. Online Petroleum Industry Bibliographic Databases: A Review.

    Science.gov (United States)

    Anderson, Margaret B.

    This paper discusses the present status of the bibliographic database industry, reviews the development of online databases of interest to the petroleum industry, and considers future developments in online searching and their effect on libraries and information centers. Three groups of databases are described: (1) databases developed by the…

  18. License - KEGG MEDICUS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Licensed under Attribution-NonCommercial-NoDerivs 2.1 Japan. If you use data from this database, please be sure to attribute it; users may search the whole of this database, acquire data, and freely redistribute part or whole of the data from this database.

  19. Status report on nuclear power - information from STN databases

    International Nuclear Information System (INIS)

    Prinz, H.

    1995-01-01

    The worldwide future of nuclear power as seen about 25 years ago is presented, based on a literature search in the INIS database. The role of nuclear power today, after TMI and Chernobyl, in energy supplies and in combating the greenhouse effect is evaluated by literature searches in STN databases (e.g. INIS, ETDE, COMPENDEX, CA, ULIDAT, INSPEC). An evaluation is given of the different information contents of bibliographic databases such as INIS and pure information databases such as NLDB. (orig./HP)

  20. Biomedical engineering fundamentals

    CERN Document Server

    Bronzino, Joseph D

    2014-01-01

    Known as the bible of biomedical engineering, The Biomedical Engineering Handbook, Fourth Edition, sets the standard against which all other references of this nature are measured. As such, it has served as a major resource for both skilled professionals and novices to biomedical engineering.Biomedical Engineering Fundamentals, the first volume of the handbook, presents material from respected scientists with diverse backgrounds in physiological systems, biomechanics, biomaterials, bioelectric phenomena, and neuroengineering. More than three dozen specific topics are examined, including cardia

  1. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research.

    Science.gov (United States)

    Tenenbaum, Jessica D; Whetzel, Patricia L; Anderson, Kent; Borromeo, Charles D; Dinov, Ivo D; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D; Becich, Michael J; Ginsburg, Geoffrey S; Musen, Mark A; Smith, Kevin A; Tarantal, Alice F; Rubin, Daniel L; Lyster, Peter

    2011-02-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. Copyright © 2010 Elsevier Inc. All rights reserved.

  2. ORF information - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: kome_orf_infomation.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/kome/LATEST/kome_orf_infomation.zip, 526 KB).

  3. EST data - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: red_est.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/red/LATEST/red_est.zip, 629 KB).

  4. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    Science.gov (United States)

    Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem

    2016-01-01

    Profile Hidden Markov Models (profile-HMMs) are an efficient statistical approach to representing protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is increasing interest in improving the efficiency of searching profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for batch query searching, are strong motivations for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than by modifying the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected-component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41% and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
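
    The cluster-then-search idea can be sketched as follows; the profiles are stand-in numeric vectors and cosine similarity replaces real profile-HMM alignment (e.g. with HMMER), so this only illustrates the search-space reduction, not the published system.

      # Search cluster representatives first, then only the profiles in the best clusters.
      import numpy as np

      rng = np.random.default_rng(1)
      profiles = rng.normal(size=(400, 64))        # stand-ins for 400 profile HMMs
      labels = rng.integers(0, 20, size=400)       # pre-computed cluster membership
      reps = np.stack([profiles[labels == c].mean(axis=0) for c in range(20)])

      def score(a, b):
          return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

      def search(query, top_clusters=3, top_hits=5):
          rep_scores = [score(query, rep) for rep in reps]
          best = set(np.argsort(rep_scores)[-top_clusters:].tolist())
          hits = [(score(query, profiles[i]), i)
                  for i in range(len(profiles)) if labels[i] in best]
          return sorted(hits, reverse=True)[:top_hits]

      print(search(rng.normal(size=64)))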

  5. Examples how to use atomic and molecular databases

    International Nuclear Information System (INIS)

    Murakami, Izumi

    2012-01-01

    As examples of how to use atomic and molecular databases, the Atomic Spectra Database (ASD) and chemical kinetics database of the National Institute of Standards and Technology (NIST), the collision cross sections of the National Institute for Fusion Science (NIFS), Open-ADAS (Atomic Data and Analysis Structure) and the chemical reaction rate coefficients of GRI-Mech were presented. The sorting method differs in each database, and several options are provided. Atomic wavelengths/transition probabilities and electron-collision ionization, excitation and recombination cross sections/rate coefficients can be searched simply by specifying the atom or ion using the general internet search engine (GENIE) of the IAEA. (T. Tanaka)

  6. Mapping data - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Mapping data against the International Rice Genome Sequencing Project (IRGSP) assembly, with entries at the level of the transcriptional unit. Data file: kome_mapping_data.zip (File URL: ftp://ftp.biosciencedbc.jp/archiv...).

  7. Spot table - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: rpd_spot.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/rpd/LATEST/rpd_spot.zip); entries reference cDNA (multiple entries).

  8. About Libraries - AcEST | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Text-file format. Data file: acest_library.zip (File URL: ftp://ftp.biosciencedbc.jp/archive/acest/LATEST/acest_library.zip, 2 KB). Simple search URL: http://togodb.biosciencedbc.jp/togodb/view/archiv...

  9. License - Plabrain DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Licensed under ...Alike 2.1 Japan. If you use data from this database, please be sure to attribute it as Plabrain DB and cite the URL of this database (http://dbarchive.lifesciencedb.jp/english/en/plabrain-db/desc.html) in the article or paper.

  10. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Full Text Available Abstract Background The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds, which are essential for unambiguous annotation of the genome. Now, both datasets have been merged and assembled through a joint collaboration between the two groups. Description Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups, together with newly obtained fosmid- and BAC-end sequences, produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  11. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate
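
    A toy version of an extraction pattern with loose matching is sketched below: a regular expression locates "<gene> is involved in <process>" phrases, and a small synonym table stands in for the WordNet-based semantic matching used by GEANN. The sentence, gene name and GO mapping are invented for the example.

      # Extraction-pattern sketch with a synonym-expanded regular expression.
      import re

      SYNONYMS = {"involved in": ["implicated in", "required for"]}
      GO_LOOKUP = {"apoptosis": "GO:0006915"}   # hypothetical mapping table

      def build_pattern(phrase):
          variants = [phrase] + SYNONYMS.get(phrase, [])
          alternation = "|".join(re.escape(v) for v in variants)
          return re.compile(rf"(\w+) is (?:{alternation}) (\w+)", re.IGNORECASE)

      pattern = build_pattern("involved in")
      sentence = "Our results show that GeneX is implicated in apoptosis."
      for gene, process in pattern.findall(sentence):
          print(gene, "->", GO_LOOKUP.get(process.lower(), "no GO term found"))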

  12. Repeat: a framework to assess empirical reproducibility in biomedical research

    Directory of Open Access Journals (Sweden)

    Leslie D. McIntosh

    2017-09-01

    Full Text Available Abstract Background The reproducibility of research is essential to rigorous science, yet significant concerns about the reliability and verifiability of biomedical research have recently been highlighted. Ongoing efforts across several domains of science and policy are working to clarify the fundamental characteristics of reproducibility and to enhance the transparency and accessibility of research. Methods The aim of the present work is to develop an assessment tool operationalizing key concepts of research transparency in the biomedical domain, specifically for secondary biomedical data research using electronic health record data. The tool (RepeAT) was developed through a multi-phase process that involved coding and extracting recommendations and practices for improving reproducibility from publications and reports across the biomedical and statistical sciences, field testing the instrument, and refining variables. Results RepeAT includes 119 unique variables grouped into five categories (research design and aim, database and data collection methods, data mining and data cleaning, data analysis, data sharing and documentation). Preliminary results from manually processing 40 scientific manuscripts identify components of the proposed framework with strong inter-rater reliability, as well as directions for further research and refinement of RepeAT. Conclusions The use of RepeAT may allow the biomedical community to better understand current practices of research transparency and accessibility among principal investigators. Common adoption of RepeAT may improve reporting of research practices and the availability of research outputs. Additionally, use of RepeAT will facilitate comparisons of research transparency and accessibility across domains and institutions.

  13. Biomedical informatics: we are what we publish.

    Science.gov (United States)

    Elkin, P L; Brown, S H; Wright, G

    2013-01-01

    This article is part of a For-Discussion-Section of Methods of Information in Medicine on "Biomedical Informatics: We are what we publish". It is introduced by an editorial and followed by a commentary paper with invited comments. In subsequent issues the discussion may continue through letters to the editor. Informatics experts have attempted to define the field via consensus projects, which have led to consensus statements by both AMIA and IMIA. We add to the output of this process the results of a study of the PubMed publications with abstracts from the field of Biomedical Informatics. We took the terms from the AMIA consensus document and the terms from the IMIA definitions of the field of Biomedical Informatics and combined them through human review to create the Health Informatics Ontology. We built a terminology server using the Intelligent Natural Language Processor (iNLP). Then we downloaded the entire set of articles in Medline identified by searching the literature for "Medical Informatics" OR "Bioinformatics". The articles were parsed with the joint AMIA/IMIA terminology and then again using SNOMED CT; for bioinformatics they were also parsed using the HGNC ontology. We identified 153,580 articles using "Medical Informatics" and 20,573 articles using "Bioinformatics". This resulted in 168,298 unique articles and an overlap of 5,855 articles. Of these, 62,244 articles (37%) had titles and abstracts that contained at least one concept from the Health Informatics Ontology. SNOMED CT indexing showed that the field interacts with nearly all clinical fields of medicine. Further defining the field by what we publish can add value to the consensus-driven processes that have been the mainstay of the efforts to date. Next steps should be to extract terms from the literature that are uncovered and create class hierarchies and relationships for this content. We should also examine frequently occurring MeSH terms as markers to define Biomedical Informatics
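
    (For reference, the reported counts are internally consistent: 153,580 + 20,573 - 5,855 overlapping articles = 168,298 unique articles, and 62,244 / 168,298 is approximately 37%.)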

  14. Spatial search by quantum walk

    International Nuclear Information System (INIS)

    Childs, Andrew M.; Goldstone, Jeffrey

    2004-01-01

    Grover's quantum search algorithm provides a way to speed up combinatorial search, but is not directly applicable to searching a physical database. Nevertheless, Aaronson and Ambainis showed that a database of N items laid out in d spatial dimensions can be searched in time of order √(N) for d>2, and in time of order √(N) poly(log N) for d=2. We consider an alternative search algorithm based on a continuous-time quantum walk on a graph. The case of the complete graph gives the continuous-time search algorithm of Farhi and Gutmann, and other previously known results can be used to show that √(N) speedup can also be achieved on the hypercube. We show that full √(N) speedup can be achieved on a d-dimensional periodic lattice for d>4. In d=4, the quantum walk search algorithm takes time of order √(N) poly(log N), and in d<4, the algorithm does not provide substantial speedup
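
    Schematically, search by continuous-time quantum walk evolves the uniform superposition under a Hamiltonian that combines the walk with a marking term for the target vertex w; up to sign and normalization conventions for the hopping rate gamma this takes the form

      H = -\gamma L - |w\rangle\langle w| ,

    where L is the Laplacian (or adjacency matrix, which differs only by a constant shift on a regular lattice) of the underlying graph, and the walk is run for a time of order \sqrt{N}, with the extra poly(log N) factor appearing in the critical dimension d = 4. The precise form is quoted here as an orientation sketch rather than from the record above.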

  15. Training multidisciplinary biomedical informatics students: three years of experience.

    Science.gov (United States)

    van Mulligen, Erik M; Cases, Montserrat; Hettne, Kristina; Molero, Eva; Weeber, Marc; Robertson, Kevin A; Oliva, Baldomero; de la Calle, Guillermo; Maojo, Victor

    2008-01-01

    The European INFOBIOMED Network of Excellence recognized that a successful education program in biomedical informatics should include not only traditional teaching activities in the basic sciences but also the development of skills for working in multidisciplinary teams. A carefully developed 3-year training program for biomedical informatics students addressed these educational aspects through the following four activities: (1) an internet course database containing an overview of all Medical Informatics and BioInformatics courses, (2) a BioMedical Informatics Summer School, (3) a mobility program based on a 'brokerage service' which published demands and offers, including funding for research exchange projects, and (4) training challenges aimed at the development of multi-disciplinary skills. This paper focuses on experiences gained in the development of novel educational activities addressing work in multidisciplinary teams. The training challenges described here were evaluated by asking participants to fill out forms with Likert scale based questions. For the mobility program a needs assessment was carried out. The mobility program supported 20 exchanges which fostered new BMI research, resulted in a number of peer-reviewed publications and demonstrated the feasibility of this multidisciplinary BMI approach within the European Union. Students unanimously indicated that the training challenge experience had contributed to their understanding and appreciation of multidisciplinary teamwork. The training activities undertaken in INFOBIOMED have contributed to a multi-disciplinary BMI approach. It is our hope that this work might provide an impetus for training efforts in Europe, and yield a new generation of biomedical informaticians.

  16. Integrating Variances into an Analytical Database

    Science.gov (United States)

    Sanchez, Carlos

    2010-01-01

    For this project, I enrolled in numerous SATERN courses that taught the basics of database programming. These include: Basic Access 2007 Forms, Introduction to Database Systems, Overview of Database Design, and others. My main job was to create an analytical database that can handle many stored forms and make it easy to interpret and organize. Additionally, I helped improve an existing database and populate it with information. These databases were designed to be used with data from Safety Variances and DCR forms. The research consisted of analyzing the database and comparing the data to find out which entries were repeated the most. If an entry was repeated several times in the database, that would mean that the rule or requirement targeted by that variance had been bypassed many times already, suggesting that the requirement may not really be needed and should instead be changed to allow the variance's conditions permanently. This project was not restricted to the design and development of the database system; it also involved exporting the data from the database to a different format (e.g. Excel or Word) so it could be analyzed in a simpler fashion. Thanks to the change in format, the data was organized in a spreadsheet that made it possible to sort the data by categories or types and helped speed up searches. Once my work with the database was done, the records of variances could be arranged so that they were displayed in numerical order, or one could search for a specific document targeted by the variances and restrict the search to only include variances that modified a specific requirement. A great part that contributed to my learning was SATERN, NASA's resource for education. Thanks to the SATERN online courses I took over the summer, I was able to learn many new things about computers and databases and also go more in depth into topics I already knew about.

  17. Standardization of Keyword Search Mode

    Science.gov (United States)

    Su, Di

    2010-01-01

    In spite of its popularity, keyword search mode has not been standardized. Though information professionals are quick to adapt to various presentations of keyword search mode, novice end-users may find keyword search confusing. This article compares keyword search mode in some major reference databases and calls for standardization. (Contains 3…

  18. Custom Search Engines: Tools & Tips

    Science.gov (United States)

    Notess, Greg R.

    2008-01-01

    Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…

  19. CDAPubMed: a browser extension to retrieve EHR-based biomedical literature

    Directory of Open Access Journals (Sweden)

    Perez-Rey David

    2012-04-01

    Full Text Available Abstract Background Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. Results We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7 Clinical Document Architecture standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. Conclusions CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard
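
    CDAPubMed's own source code is not reproduced here, but the core idea — pull coded concepts out of an HL7-CDA document, map them to MeSH headings, and assemble a PubMed query — can be sketched as follows. The XML fragment and the term-to-MeSH dictionary are toy assumptions for illustration, not the extension's actual configuration:

        import xml.etree.ElementTree as ET

        # Minimal, hypothetical CDA-like fragment; a real HL7-CDA document is far richer.
        cda_xml = """
        <ClinicalDocument>
          <entry><code displayName="Myocardial infarction"/></entry>
          <entry><code displayName="Type 2 diabetes mellitus"/></entry>
        </ClinicalDocument>
        """

        # Toy mapping from clinical terms to MeSH headings (illustrative only).
        TERM_TO_MESH = {
            "Myocardial infarction": "Myocardial Infarction",
            "Type 2 diabetes mellitus": "Diabetes Mellitus, Type 2",
        }

        def build_pubmed_query(xml_text):
            root = ET.fromstring(xml_text)
            names = [code.get("displayName") for code in root.iter("code")]
            mesh = [TERM_TO_MESH[n] for n in names if n in TERM_TO_MESH]
            return " AND ".join(f'"{m}"[MeSH Terms]' for m in mesh)

        print(build_pubmed_query(cda_xml))
        # "Myocardial Infarction"[MeSH Terms] AND "Diabetes Mellitus, Type 2"[MeSH Terms]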

  20. BioTCM-SE: a semantic search engine for the information retrieval of modern biology and traditional Chinese medicine.

    Science.gov (United States)

    Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui

    2014-01-01

    Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.

  1. BioTCM-SE: A Semantic Search Engine for the Information Retrieval of Modern Biology and Traditional Chinese Medicine

    Directory of Open Access Journals (Sweden)

    Xi Chen

    2014-01-01

    Full Text Available Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.

  2. PSCID List - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: pscdb_pscid_list.zip (24.4 KB), URL: ftp://ftp.biosciencedbc.jp/archive/pscdb/LATEST/pscdb_pscid_list.zip

  3. Subject Retrieval from Full-Text Databases in the Humanities

    Science.gov (United States)

    East, John W.

    2007-01-01

    This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…

  4. Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network

    Science.gov (United States)

    Pratiwi, A. B.; Damayanti, A.; Miswanto

    2017-07-01

    Epilepsy is a condition that affects the brain and causes repeated seizures. These seizures are episodes that can vary from nearly undetectable events to long periods of vigorous shaking or brain contractions. Epilepsy can often be confirmed with electroencephalography (EEG). Neural networks have been used in biomedical signal analysis and have successfully classified biomedical signals such as the EEG signal. In this paper, a hybrid of cuckoo search and a neural network is used to recognize EEG signals for epilepsy classification. The weights of the multilayer perceptron are optimized by the cuckoo search algorithm based on its error. The aim of this method is to make the network reach the local or global optimum faster so that the classification process becomes more accurate. Based on a comparison with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in terms of error convergence and accuracy. The proposed method gives an MSE of 0.001 and an accuracy of 90.0%.
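
    As an illustration of the hybrid scheme, the sketch below uses cuckoo search with Lévy flights to optimize the weights of a small single-hidden-layer perceptron on synthetic two-class data standing in for EEG features. Network size, step sizes, and the data are all assumptions for demonstration, not the parameters used in the paper:

        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic two-class data standing in for extracted EEG features.
        X = rng.normal(size=(200, 8))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

        H = 6                                   # hidden units of the 8-H-1 perceptron
        dim = 8 * H + H + H + 1                 # total number of weights and biases

        def unpack(w):
            W1 = w[:8 * H].reshape(8, H)
            b1 = w[8 * H:8 * H + H]
            W2 = w[8 * H + H:8 * H + 2 * H]
            b2 = w[-1]
            return W1, b1, W2, b2

        def predict(w):
            W1, b1, W2, b2 = unpack(w)
            h = np.tanh(X @ W1 + b1)
            return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

        def mse(w):
            return np.mean((predict(w) - y) ** 2)

        def levy(size, beta=1.5):
            # Mantegna's algorithm for Levy-distributed step lengths.
            from math import gamma as G, sin, pi
            sigma = (G(1 + beta) * sin(pi * beta / 2) /
                     (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
            return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

        n_nests, pa, alpha = 15, 0.25, 0.01
        nests = rng.normal(scale=0.5, size=(n_nests, dim))
        fitness = np.array([mse(n) for n in nests])

        for _ in range(300):
            best = nests[np.argmin(fitness)]
            for i in range(n_nests):               # Levy-flight step scaled by distance to the best nest
                cand = nests[i] + alpha * levy(dim) * (nests[i] - best)
                if (f := mse(cand)) < fitness[i]:
                    nests[i], fitness[i] = cand, f
            n_drop = int(pa * n_nests)             # abandon a fraction pa of the worst nests
            worst = np.argsort(fitness)[-n_drop:]
            nests[worst] = rng.normal(scale=0.5, size=(n_drop, dim))
            fitness[worst] = [mse(n) for n in nests[worst]]

        best = nests[np.argmin(fitness)]
        accuracy = np.mean((predict(best) > 0.5) == y)
        print(f"best MSE {fitness.min():.4f}, training accuracy {accuracy:.2%}")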

  5. OvidSP Medline-to-PubMed search filter translation: a methodology for extending search filter range to include PubMed's unique content.

    Science.gov (United States)

    Damarell, Raechel A; Tieman, Jennifer J; Sladek, Ruth M

    2013-07-02

    PubMed translations of OvidSP Medline search filters offer searchers improved ease of access. They may also facilitate access to PubMed's unique content, including citations for the most recently published biomedical evidence. Retrieving this content requires a search strategy comprising natural language terms ('textwords'), rather than Medical Subject Headings (MeSH). We describe a reproducible methodology that uses a validated PubMed search filter translation to create a textword-only strategy to extend retrieval to PubMed's unique heart failure literature. We translated an OvidSP Medline heart failure search filter for PubMed and established version equivalence in terms of indexed literature retrieval. The PubMed version was then run within PubMed to identify citations retrieved by the filter's MeSH terms (Heart failure, Left ventricular dysfunction, and Cardiomyopathy). It was then rerun with the same MeSH terms restricted to searching on title and abstract fields (i.e. as 'textwords'). Citations retrieved by the MeSH search but not the textword search were isolated. Frequency analysis of their titles/abstracts identified natural language alternatives for those MeSH terms that performed less effectively as textwords. These terms were tested in combination to determine the best performing search string for reclaiming this 'lost set'. This string, restricted to searching on PubMed's unique content, was then combined with the validated PubMed translation to extend the filter's performance in this database. The PubMed heart failure filter retrieved 6829 citations. Of these, 834 (12%) failed to be retrieved when MeSH terms were converted to textwords. Frequency analysis of the 834 citations identified five high frequency natural language alternatives that could improve retrieval of this set (cardiac failure, cardiac resynchronization, left ventricular systolic dysfunction, left ventricular diastolic dysfunction, and LV dysfunction). Together these terms reclaimed
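
    The frequency-analysis step — mining the titles and abstracts of the "lost set" for natural-language alternatives to the MeSH terms — can be sketched with a simple n-gram count. The example documents below are invented stand-ins, not records from the study:

        import re
        from collections import Counter

        # Invented titles standing in for the 834 citations retrieved by MeSH terms
        # but missed when the same terms were searched as textwords.
        lost_set = [
            "Outcomes after cardiac resynchronization therapy in patients with LV dysfunction",
            "Prognosis of left ventricular systolic dysfunction after cardiac failure admission",
            "Biomarkers in left ventricular diastolic dysfunction and cardiac failure",
        ]

        stop = {"in", "after", "with", "of", "and", "the", "a"}

        def ngrams(text, n):
            words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stop]
            return (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))

        counts = Counter()
        for doc in lost_set:
            for n in (1, 2, 3, 4):
                counts.update(ngrams(doc, n))

        # High-frequency phrases are candidate natural-language search terms.
        for phrase, freq in counts.most_common(8):
            print(f"{freq:3d}  {phrase}")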

  6. JICST Factual Database(2)

    Science.gov (United States)

    Araki, Keisuke

    A computer program that builds atom-bond connection tables from nomenclatures has been developed. Chemical substances are inputted with their nomenclature and a variety of trivial names or experimental code numbers. The chemical structures in the database are stored stereospecifically and can be searched and displayed according to stereochemistry. Source data are from the laws and regulations of Japan, the RTECS of the US, and so on. The database plays a central role within the integrated fact database service of JICST and makes interrelational retrieval possible.

  7. Introduction to biomedical engineering

    CERN Document Server

    Enderle, John D; Blanchard, Susan M

    2005-01-01

    Under the direction of John Enderle, Susan Blanchard and Joe Bronzino, leaders in the field have contributed chapters on the most relevant subjects for biomedical engineering students. These chapters coincide with courses offered in all biomedical engineering programs so that it can be used at different levels for a variety of courses of this evolving field. Introduction to Biomedical Engineering, Second Edition provides a historical perspective of the major developments in the biomedical field. Also contained within are the fundamental principles underlying biomedical engineering design, analysis, and modeling procedures. The numerous examples, drill problems and exercises are used to reinforce concepts and develop problem-solving skills making this book an invaluable tool for all biomedical students and engineers. New to this edition: Computational Biology, Medical Imaging, Genomics and Bioinformatics. * 60% update from first edition to reflect the developing field of biomedical engineering * New chapters o...

  8. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    Directory of Open Access Journals (Sweden)

    Ahmad Tamimi

    Full Text Available Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest in improving the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have been focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for using a batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41% and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
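
    A minimal sketch of the cluster-based remodeling idea follows. It substitutes fixed-length fingerprint vectors for the profile-HMMs and Euclidean distance for a real profile scoring function, clusters them with k-means, assigns one representative per cluster, and scores the query against representatives first before searching only the members of the best few clusters (keeping several clusters approximates the overlap step):

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(1)

        # Stand-in fingerprint vectors for 4284 database profiles; a real implementation
        # would compare profile-HMMs with a proper profile-profile scoring function.
        profiles = rng.normal(size=(4284, 32))

        k = 60
        km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(profiles)

        # Representative of each cluster = member closest to the cluster centroid.
        reps = []
        for c in range(k):
            members = np.flatnonzero(km.labels_ == c)
            d = np.linalg.norm(profiles[members] - km.cluster_centers_[c], axis=1)
            reps.append(members[np.argmin(d)])
        reps = np.array(reps)

        def search(query, top_clusters=3, hits=5):
            d_rep = np.linalg.norm(profiles[reps] - query, axis=1)   # stage 1: representatives only
            selected = np.argsort(d_rep)[:top_clusters]              # keep several clusters (overlap)
            members = np.flatnonzero(np.isin(km.labels_, selected))
            d = np.linalg.norm(profiles[members] - query, axis=1)    # stage 2: full scoring, reduced set
            return members[np.argsort(d)[:hits]]

        query = profiles[123] + rng.normal(scale=0.05, size=32)
        print("top hits:", search(query))                            # should include profile 123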

  9. Database Description - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Affiliation: National Institute of Advanced Industrial Science and Technology (AIST). Database maintenance site: National Institute of Advanced Industrial Science and Technology.

  10. Biomedical applications of nanotechnology.

    Science.gov (United States)

    Ramos, Ana P; Cruz, Marcos A E; Tovani, Camila B; Ciancaglini, Pietro

    2017-04-01

    The ability to investigate substances at the molecular level has boosted the search for materials with outstanding properties for use in medicine. The application of these novel materials has generated the new research field of nanobiotechnology, which plays a central role in disease diagnosis, drug design and delivery, and implants. In this review, we provide an overview of the use of metallic and metal oxide nanoparticles, carbon-nanotubes, liposomes, and nanopatterned flat surfaces for specific biomedical applications. The chemical and physical properties of the surface of these materials allow their use in diagnosis, biosensing and bioimaging devices, drug delivery systems, and bone substitute implants. The toxicology of these particles is also discussed in the light of a new field referred to as nanotoxicology that studies the surface effects emerging from nanostructured materials.

  11. Content-Based Information Retrieval from Forensic Databases

    NARCIS (Netherlands)

    Geradts, Z.J.M.H.

    2002-01-01

    In forensic science, the number of image databases is growing rapidly. For this reason, it is necessary to have a proper procedure for searching in these images databases based on content. The use of image databases results in more solved crimes; furthermore, statistical information can be obtained

  12. [The long pilgrimage of Spanish biomedical journals toward excellence. Who helps? Quality, impact and research merit].

    Science.gov (United States)

    Alfonso, Fernando

    2010-03-01

    Biomedical journals must adhere to strict standards of editorial quality. In a globalized academic scenario, biomedical journals must compete firstly to publish the most relevant original research and secondly to obtain the broadest possible visibility and the widest dissemination of their scientific contents. The cornerstone of the scientific process is still the peer-review system but additional quality criteria should be met. Recently access to medical information has been revolutionized by electronic editions. Bibliometric databases such as MEDLINE, the ISI Web of Science and Scopus offer comprehensive online information on medical literature. Classically, the prestige of biomedical journals has been measured by their impact factor but, recently, other indicators such as SCImago SJR or the Eigenfactor are emerging as alternative indices of a journal's quality. Assessing the scholarly impact of research and the merits of individual scientists remains a major challenge. Allocation of authorship credit also remains controversial. Furthermore, in our Kafkaesque world, we prefer to count rather than read the articles we judge. Quantitative publication metrics (research output) and citations analyses (scientific influence) are key determinants of the scientific success of individual investigators. However, academia is embracing new objective indicators (such as the "h" index) to evaluate scholarly merit. The present review discusses some editorial issues affecting biomedical journals, currently available bibliometric databases, bibliometric indices of journal quality and, finally, indicators of research performance and scientific success. Copyright 2010 SEEN. Published by Elsevier Espana. All rights reserved.

  13. Omicseq: a web-based search engine for exploring omics datasets

    Science.gov (United States)

    Sun, Xiaobo; Pittard, William S.; Xu, Tianlei; Chen, Li; Zwick, Michael E.; Jiang, Xiaoqian; Wang, Fusheng

    2017-01-01

    Abstract The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve ‘findability’ of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. PMID:28402462
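
    The abstract does not spell out the trackRank algorithm, so the sketch below is only a generic illustration of the underlying idea — ranking datasets by their numerical content rather than their metadata — by scoring each dataset according to how prominent the query gene's signal is within it. Dataset names and values are synthetic:

        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic processed omics datasets: one signal value per gene.
        genes = [f"gene{i}" for i in range(1000)]
        datasets = {f"DS{k}": rng.lognormal(size=1000) for k in range(5)}
        datasets["DS3"][genes.index("gene42")] *= 50        # the query gene stands out here

        def rank_datasets(query_gene):
            g = genes.index(query_gene)
            # score = percentile rank of the query gene's signal within each dataset
            scores = {name: float((values < values[g]).mean()) for name, values in datasets.items()}
            return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

        for name, score in rank_datasets("gene42"):
            print(f"{name}: {score:.3f}")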

  14. CAGE peaks - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: CAGE_peaks, URL: ftp://ftp.biosciencedbc.jp/archive/fantom...

  15. Modelling antibody side chain conformations using heuristic database search.

    Science.gov (United States)

    Ritchie, D W; Kemp, G J

    1997-01-01

    We have developed a knowledge-based system which models the side chain conformations of residues in the variable domains of antibody Fv fragments. The system is written in Prolog and uses an object-oriented database of aligned antibody structures in conjunction with a side chain rotamer library. The antibody database provides 3-dimensional clusters of side chain conformations which can be copied en masse into the model structure. The object-oriented database architecture facilitates a navigational style of database access, necessary to assemble side chains clusters. Around 60% of the model is built using side chain clusters and this eliminates much of the combinatorial complexity associated with many other side chain placement algorithms. Construction and placement of side chain clusters is guided by a heuristic cost function based on a simple model of side chain packing interactions. Even with a simple model, we find that a large proportion of side chain conformations are modelled accurately. We expect our approach could be used with other homologous protein families, in addition to antibodies, both to improve the quality of model structures and to give a "smart start" to the side chain placement problem.

  16. Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span

    Directory of Open Access Journals (Sweden)

    Jordan MI

    2006-05-01

    Full Text Available Abstract Background The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caenorhabditis Genetic Center (CGC) Bibliography using techniques from statistical information retrieval. Items in the CGC biomedical text corpus were modeled using the Latent Dirichlet Allocation (LDA) model. LDA is a hierarchical Bayesian model which represents a document as a random mixture over latent topics; each topic is characterized by a distribution over words. Results An LDA model estimated from CGC items had better predictive performance than two standard models (unigram and mixture of unigrams) trained using the same data. To illustrate the practical utility of LDA models of biomedical corpora, a trained CGC LDA model was used for a retrospective study of nematode genes known to be associated with life span modification. Corpus-, document-, and word-level LDA parameters were combined with terms from the Gene Ontology to enhance the explanatory value of the CGC LDA model, and to suggest additional candidates for age-related genes. A novel, pairwise document similarity measure based on the posterior distribution on the topic simplex was formulated and used to search the CGC database for "homologs" of a "query" document discussing the life span-modifying clk-2 gene. Inspection of these document homologs enabled and facilitated the production of hypotheses about the function and role of clk-2. Conclusion Like other graphical models for genetic, genomic and other types of biological data, LDA provides a method for extracting unanticipated insights and generating predictions amenable to subsequent experimental validation.
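
    A compact version of this workflow — fit LDA to a small corpus, then rank documents against a "query" document by the divergence between their topic distributions — can be written with scikit-learn and SciPy. The five toy documents are stand-ins for CGC bibliography items, and Jensen-Shannon distance is used here as a common substitute for the paper's posterior-based similarity measure:

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import LatentDirichletAllocation
        from scipy.spatial.distance import jensenshannon

        docs = [                                       # toy stand-ins for CGC free-text items
            "clk-2 mutants show extended life span and slow development",
            "daf-16 mediates insulin signalling effects on longevity and life span",
            "unc-54 myosin heavy chain is required for body wall muscle contraction",
            "mitochondrial mutations alter life span in nematodes",
            "axon guidance defects in unc-6 netrin mutants",
        ]

        X = CountVectorizer(stop_words="english").fit_transform(docs)
        lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)
        theta = lda.transform(X)                       # per-document distributions on the topic simplex

        query = 0                                      # the clk-2 "query" document
        similarity = [1.0 - jensenshannon(theta[query], theta[j]) for j in range(len(docs))]
        for j in np.argsort(similarity)[::-1]:
            print(f"{similarity[j]:.3f}  {docs[j]}")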

  17. Epsilon-Q: An Automated Analyzer Interface for Mass Spectral Library Search and Label-Free Protein Quantification.

    Science.gov (United States)

    Cho, Jin-Young; Lee, Hyoung-Joo; Jeong, Seul-Ki; Paik, Young-Ki

    2017-12-01

    Mass spectrometry (MS) is a widely used proteome analysis tool for biomedical science. In an MS-based bottom-up proteomic approach to protein identification, sequence database (DB) searching has been routinely used because of its simplicity and convenience. However, searching a sequence DB with multiple variable modification options can increase processing time and false-positive errors in large and complicated MS data sets. Spectral library searching is an alternative solution, avoiding the limitations of sequence DB searching and allowing the detection of more peptides with high sensitivity. Unfortunately, this technique has less proteome coverage, resulting in limitations in the detection of novel and whole peptide sequences in biological samples. To solve these problems, we previously developed the "Combo-Spec Search" method, which manually combines multiple reference and simulated spectral library searches to analyze whole proteomes in a biological sample. In this study, we have developed a new analytical interface tool called "Epsilon-Q" to enhance the functions of both the Combo-Spec Search method and label-free protein quantification. Epsilon-Q automatically performs multiple spectral library searching, class-specific false-discovery rate control, and result integration. It has a user-friendly graphical interface and demonstrates good performance in identifying and quantifying proteins by supporting standard MS data formats and spectrum-to-spectrum matching powered by SpectraST. Furthermore, when the Epsilon-Q interface is combined with the Combo-Spec search method, called the Epsilon-Q system, it shows a synergistic function by outperforming other sequence DB search engines for identifying and quantifying low-abundance proteins in biological samples. The Epsilon-Q system can be a versatile tool for comparative proteome analysis based on multiple spectral libraries and label-free quantification.
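
    Of the steps listed above, class-specific false-discovery rate control is the easiest to illustrate in isolation. The sketch below applies a simple target-decoy estimate separately to the matches coming from each spectral library class; the PSM tuples and the 10% threshold are invented for the example and do not reproduce Epsilon-Q's actual implementation:

        from collections import defaultdict

        # Peptide-spectrum matches as (score, is_decoy, library_class).
        psms = [
            (95, False, "reference"), (90, True, "reference"), (88, False, "reference"),
            (85, False, "reference"), (80, False, "simulated"), (78, True, "simulated"),
            (75, False, "simulated"), (70, False, "simulated"), (60, True, "simulated"),
        ]

        def class_specific_thresholds(psms, fdr_target=0.10):
            """Per class, return the lowest score at which decoys/targets <= fdr_target."""
            by_class = defaultdict(list)
            for score, is_decoy, cls in psms:
                by_class[cls].append((score, is_decoy))
            thresholds = {}
            for cls, matches in by_class.items():
                matches.sort(reverse=True)               # best scores first
                targets = decoys = 0
                for score, is_decoy in matches:
                    decoys += is_decoy
                    targets += not is_decoy
                    if targets and decoys / targets <= fdr_target:
                        thresholds[cls] = score          # deepest score still under the target
            return thresholds

        print(class_specific_thresholds(psms))           # e.g. {'reference': 95, 'simulated': 80}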

  18. Main - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: kome_main.zip, URL: ftp://ftp.biosciencedbc.jp/archive/kome...

  19. Combining history of medicine and library instruction: an innovative approach to teaching database searching to medical students.

    Science.gov (United States)

    Timm, Donna F; Jones, Dee; Woodson, Deidra; Cyrus, John W

    2012-01-01

    Library faculty members at the Health Sciences Library at the LSU Health Shreveport campus offer a database searching class for third-year medical students during their surgery rotation. For a number of years, students completed "ten-minute clinical challenges," but the instructors decided to replace the clinical challenges with innovative exercises using The Edwin Smith Surgical Papyrus to emphasize concepts learned. The Surgical Papyrus is an online resource that is part of the National Library of Medicine's "Turning the Pages" digital initiative. In addition, vintage surgical instruments and historic books are displayed in the classroom to enhance the learning experience.

  20. Possible use of fuzzy logic in database

    Directory of Open Access Journals (Sweden)

    Vaclav Bezdek

    2011-04-01

    Full Text Available The article deals with fuzzy logic and its possible use in database systems. First, the fuzzy style of thinking is illustrated with a simple example. The advantages of the fuzzy approach to database searching are then considered using a database of used cars in the Czech Republic.
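
    A minimal sketch of the used-car idea, assuming trapezoidal membership functions and min as the fuzzy AND (the thresholds and cars are invented, not data from the article):

        def trapezoid(x, a, b, c, d):
            """Membership: 0 outside (a, d), 1 on [b, c], linear in between."""
            if x <= a or x >= d:
                return 0.0
            if b <= x <= c:
                return 1.0
            return (x - a) / (b - a) if x < b else (d - x) / (d - c)

        # "Cheap" over price (CZK) and "low mileage" over the odometer reading (km).
        cheap = lambda price: trapezoid(price, -1, 0, 150_000, 250_000)
        low_km = lambda km: trapezoid(km, -1, 0, 80_000, 160_000)

        cars = [
            {"model": "Skoda Fabia", "price": 120_000, "km": 95_000},
            {"model": "Skoda Octavia", "price": 210_000, "km": 60_000},
            {"model": "VW Golf", "price": 260_000, "km": 40_000},
        ]

        def degree(car):                       # degree to which a car is "cheap AND low mileage"
            return min(cheap(car["price"]), low_km(car["km"]))

        for car in sorted(cars, key=degree, reverse=True):
            print(f"{degree(car):.2f}  {car['model']}")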

  1. BRC - MicrobeDB.jp | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: brc.tar.gz, URL: ftp://ftp.biosciencedbc.jp/archive/microbedb/LATEST/brc.ta... The data relate to strains in JCM.

  2. SRA - MicrobeDB.jp | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: sra.tar.gz, URL: ftp://ftp.biosciencedbc.jp/archive/microbedb/L...

  3. A comparative study of six European databases of medically oriented Web resources.

    Science.gov (United States)

    Abad García, Francisca; González Teruel, Aurora; Bayo Calduch, Patricia; de Ramón Frias, Rosa; Castillo Blasco, Lourdes

    2005-10-01

    The paper describes six European medically oriented databases of Web resources, pertaining to five quality-controlled subject gateways, and compares their performance. The characteristics, coverage, procedure for selecting Web resources, record structure, searching possibilities, and existence of user assistance were described for each database. Performance indicators for each database were obtained by means of searches carried out using the key words, "myocardial infarction." Most of the databases originated in the 1990s in an academic or library context and include all types of Web resources of an international nature. Five databases use Medical Subject Headings. The number of fields per record varies between three and nineteen. The language of the search interfaces is mostly English, and some of them allow searches in other languages. In some databases, the search can be extended to Pubmed. Organizing Medical Networked Information, Catalogue et Index des Sites Médicaux Francophones, and Diseases, Disorders and Related Topics produced the best results. The usefulness of these databases as quick reference resources is clear. In addition, their lack of content overlap means that, for the user, they complement each other. Their continued survival faces three challenges: the instability of the Internet, maintenance costs, and lack of use in spite of their potential usefulness.

  4. Search Engines for Tomorrow's Scholars

    Science.gov (United States)

    Fagan, Jody Condit

    2011-01-01

    Today's scholars face an outstanding array of choices when choosing search tools: Google Scholar, discipline-specific abstracts and index databases, library discovery tools, and more recently, Microsoft's re-launch of their academic search tool, now dubbed Microsoft Academic Search. What are these tools' strengths for the emerging needs of…

  5. Enabling Searches on Wavelengths in a Hyperspectral Indices Database

    Science.gov (United States)

    Piñuela, F.; Cerra, D.; Müller, R.

    2017-10-01

    Spectral indices derived from hyperspectral reflectance measurements are powerful tools to estimate physical parameters in a non-destructive and precise way for several fields of application, among others vegetation health analysis, coastal and deep water constituents, geology, and atmosphere composition. In the last years, several micro-hyperspectral sensors have appeared, with both full-frame and push-broom acquisition technologies, while in the near future several hyperspectral spaceborne missions are planned to be launched. This is fostering the use of hyperspectral data in basic and applied research, causing a large number of spectral indices to be defined and used in various applications. Ad hoc search engines are therefore needed to retrieve the most appropriate indices for a given application. In traditional systems, query input parameters are limited to alphanumeric strings, while characteristics such as spectral range/bandwidth are not used in any existing search engine. Such information would be relevant, as it enables an inverse type of search: given the spectral capabilities of a given sensor or a specific spectral band, find all indices which can be derived from it. This paper describes a tool which enables a search as described above, by using the central wavelength or spectral range used by a given index as a search parameter. This offers the ability to manage numeric wavelength ranges in order to select the indices which work best in a given set of wavelengths or wavelength ranges.
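
    The inverse search the paper describes — given a sensor's bands, find every index whose required wavelengths are covered — reduces to a simple interval check. The index wavelengths and sensor bands below are illustrative values, not entries from the database:

        # Central wavelengths (nm) required by a few common indices (illustrative values).
        INDICES = {
            "NDVI": [660, 860],        # red, near-infrared
            "NDWI": [560, 860],        # green, near-infrared
            "PRI":  [531, 570],
            "NDRE": [720, 790],
        }

        # Spectral bands of a hypothetical sensor as (lower, upper) ranges in nm.
        SENSOR_BANDS = [(450, 520), (520, 600), (630, 690), (760, 900)]

        def covered(wavelength, bands):
            return any(lo <= wavelength <= hi for lo, hi in bands)

        def derivable_indices(bands):
            """Inverse search: which indices can be computed from the given bands?"""
            return [name for name, wavelengths in INDICES.items()
                    if all(covered(w, bands) for w in wavelengths)]

        print(derivable_indices(SENSOR_BANDS))
        # ['NDVI', 'NDWI', 'PRI'] -- NDRE is excluded because 720 nm falls in no band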

  6. DB-PABP: a database of polyanion-binding proteins.

    Science.gov (United States)

    Fang, Jianwen; Dong, Yinghua; Salamat-Miller, Nazila; Middaugh, C Russell

    2008-01-01

    The interactions between polyanions (PAs) and polyanion-binding proteins (PABPs) have been found to play significant roles in many essential biological processes including intracellular organization, transport and protein folding. Furthermore, many neurodegenerative disease-related proteins are PABPs. Thus, a better understanding of PA/PABP interactions may not only enhance our understanding of biological systems but also provide new clues to these deadly diseases. The literature in this field is widely scattered, suggesting the need for a comprehensive and searchable database of PABPs. The DB-PABP is a comprehensive, manually curated and searchable database of experimentally characterized PABPs. It is freely available and can be accessed online at http://pabp.bcf.ku.edu/DB_PABP/. The DB-PABP was implemented as a MySQL relational database. An interactive web interface was created using Java Server Pages (JSP). The search page of the database is organized into a main search form and a section for utilities. The main search form enables custom searches via four menus: protein names, polyanion names, the source species of the proteins and the methods used to discover the interactions. Available utilities include a commonality matrix, a function for listing PABPs by the number of interacting polyanions and a string search for author surnames. The DB-PABP is maintained at the University of Kansas. We encourage users to provide feedback and submit new data and references.
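
    The four search menus map naturally onto a small relational schema. The sketch below uses SQLite instead of MySQL and invents table names, columns, and example rows purely for illustration; the actual DB-PABP schema is not described in the abstract:

        import sqlite3

        con = sqlite3.connect(":memory:")
        con.executescript("""
            CREATE TABLE protein   (id INTEGER PRIMARY KEY, name TEXT, species TEXT);
            CREATE TABLE polyanion (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE interaction (
                protein_id   INTEGER REFERENCES protein(id),
                polyanion_id INTEGER REFERENCES polyanion(id),
                method       TEXT
            );
        """)
        con.executemany("INSERT INTO protein VALUES (?, ?, ?)",
                        [(1, "alpha-synuclein", "Homo sapiens"), (2, "tau", "Homo sapiens")])
        con.executemany("INSERT INTO polyanion VALUES (?, ?)", [(1, "heparin"), (2, "RNA")])
        con.executemany("INSERT INTO interaction VALUES (?, ?, ?)",
                        [(1, 1, "ITC"), (1, 2, "EMSA"), (2, 1, "turbidity assay")])

        # A query mirroring the main search form: filter by polyanion and source species.
        rows = con.execute("""
            SELECT p.name, a.name, i.method
            FROM interaction i
            JOIN protein p   ON p.id = i.protein_id
            JOIN polyanion a ON a.id = i.polyanion_id
            WHERE a.name = ? AND p.species = ?
        """, ("heparin", "Homo sapiens")).fetchall()
        print(rows)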

  7. NASA's GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain V.; Tran, Peter B.

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics ('omics') data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  8. NASA's GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain; Tran, Peter

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics ('omics') data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  9. Three-dimensional biomedical imaging

    International Nuclear Information System (INIS)

    Robb, R.A.

    1985-01-01

    Scientists in biomedical imaging provide researchers, physicians, and academicians with an understanding of the fundamental theories and practical applications of three-dimensional biomedical imaging methodologies. Succinct descriptions of each imaging modality are supported by numerous diagrams and illustrations which clarify important concepts and demonstrate system performance in a variety of applications. A comparison of the different functional attributes, relative advantages and limitations, complementary capabilities, and future directions of three-dimensional biomedical imaging modalities is given. Volume I: Introductions to Three-Dimensional Biomedical Imaging. Photoelectronic-Digital Imaging for Diagnostic Radiology. X-Ray Computed Tomography - Basic Principles. X-Ray Computed Tomography - Implementation and Applications. X-Ray Computed Tomography: Advanced Systems and Applications in Biomedical Research and Diagnosis. Volume II: Single Photon Emission Computed Tomography. Positron Emission Tomography (PET). Computerized Ultrasound Tomography. Fundamentals of NMR Imaging. Display of Multi-Dimensional Biomedical Image Information. Summary and Prognostications

  10. Evidence-based librarianship: searching for the needed EBL evidence.

    Science.gov (United States)

    Eldredge, J D

    2000-01-01

    This paper discusses the challenges of finding the evidence needed to implement Evidence-Based Librarianship (EBL). Focusing first on database coverage for three health sciences librarianship journals, the article examines the information contents of different databases, the strategies needed to search for relevant evidence in the library literature via these databases, and the problems associated with searching the grey literature of librarianship. Database coverage, plausible search strategies, and the grey literature of library science all pose challenges to finding the needed research evidence for practicing EBL. Health sciences librarians need to ensure that systems are designed that can track and provide access to needed research evidence to support Evidence-Based Librarianship (EBL).

  11. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases.

    Science.gov (United States)

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-06-01

    The first step in each systematic review is the selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with education, cost-benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of the databases were calculated. The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4%, respectively. The uniqueness percentage of retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles.
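
    With the retrieval sets in hand, the reported indicators reduce to simple set arithmetic. The sketch below uses invented record identifiers and assumes common set-based definitions (precision standing in for the "accuracy coefficient", recall for sensitivity); the paper's exact formulas may differ:

        # Invented retrieval sets (record IDs) for one topic across the four databases.
        retrieved = {
            "PubMed":         {1, 2, 3, 4, 5, 6, 8},
            "Scopus":         {2, 3, 4, 7, 9},
            "Web of Science": {2, 3, 5, 9},
            "CINAHL":         {3, 10},
        }
        relevant = {2, 3, 4, 5, 6, 9}                     # gold-standard relevant records

        all_retrieved = set().union(*retrieved.values())

        for db, hits in retrieved.items():
            precision = len(hits & relevant) / len(hits)              # assumed "accuracy coefficient"
            sensitivity = len(hits & relevant) / len(relevant)
            others = set().union(*(h for d, h in retrieved.items() if d != db))
            uniqueness = len(hits - others) / len(hits)
            print(f"{db:15s} precision {precision:.2f}  sensitivity {sensitivity:.2f}"
                  f"  uniqueness {uniqueness:.2f}")

        overlap = len(retrieved["PubMed"] & retrieved["Web of Science"]) / len(all_retrieved)
        print(f"PubMed / Web of Science overlap: {overlap:.2f}")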

  12. Movie collection - TogoTV | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: movie, URL: ftp://ftp.biosciencedbc.jp/archive/togotv/movie/, file size: 200 GB, data entries: 1169.

  13. Database-independent, database-dependent, and extended interpretation of peptide mass spectra in VEMS V2.0

    DEFF Research Database (Denmark)

    Matthiesen, Rune; Bunkenborg, Jakob; Stensballe, Allan

    2004-01-01

    , and generation of protein and peptide databases. VEMS V2.0 has been developed into a fast tool for combining database-independent and -dependent protein assignments in an extended analysis of MS/MS-peptide data. MS or MS/MS data can be directly recalibrated after the first search by fitting the data to the best...... search result using polynomial equations. The score function is an improvement of known scoring algorithms and can be adapted for any MS instrument type. In addition, VEMS offers a novel statistical model for evaluating the significance of the protein assignment. The novel features are illustrated...
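
    The recalibration step can be imitated with a polynomial fit of the mass error observed for the best-scoring matches from the first search. The m/z values below are invented, and VEMS's exact fitting procedure is not specified beyond "polynomial equations", so this is only a generic sketch:

        import numpy as np

        # Observed vs. theoretical m/z for confident matches from the first search (invented values).
        mz_obs    = np.array([500.276, 742.410, 1045.570, 1296.705, 1570.690])
        mz_theory = np.array([500.271, 742.402, 1045.564, 1296.685, 1570.677])

        error = mz_obs - mz_theory
        correction = np.poly1d(np.polyfit(mz_obs, error, deg=2))   # quadratic error model

        def recalibrate(mz):
            """Subtract the modelled systematic error from observed m/z values."""
            return mz - correction(mz)

        print(np.round(recalibrate(mz_obs) - mz_theory, 4))        # residuals after recalibration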

  14. Main - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file: at_atlas_en.zip, URL: ftp://ftp.biosciencedbc.jp/archive/at_atlas/LATE...

  15. Protein (Cyanobacteria) - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data name: Protein (Cyanobacteria), part of the PGDBj Ortholog DB.

  16. The CAPEC Database

    DEFF Research Database (Denmark)

    Nielsen, Thomas Lund; Abildskov, Jens; Harper, Peter Mathias

    2001-01-01

    The Computer-Aided Process Engineering Center (CAPEC) database of measured data was established with the aim to promote greater data exchange in the chemical engineering community. The target properties are pure component properties, mixture properties, and special drug solubility data. The database divides pure component properties into primary, secondary, and functional properties. Mixture properties are categorized in terms of the number of components in the mixture and the number of phases present. The compounds in the database have been classified on the basis of the functional groups in the compound. This classification makes the CAPEC database a very useful tool, for example, in the development of new property models, since properties of chemically similar compounds are easily obtained. A program with efficient search and retrieval functions of properties has been developed.

  17. Multilingual Federated Searching Across Heterogeneous Collections.

    Science.gov (United States)

    Powell, James; Fox, Edward A.

    1998-01-01

    Describes a scalable system for searching heterogeneous multilingual collections on the World Wide Web. Details Searchable Database Markup Language (SearchDB-ML) for describing the characteristics of a search engine and its interface, and a protocol for requesting word translations between languages. (Author)

  18. An Improved Forensic Science Information Search.

    Science.gov (United States)

    Teitelbaum, J

    2015-01-01

    Although thousands of search engines and databases are available online, finding answers to specific forensic science questions can be a challenge even to experienced Internet users. Because there is no central repository for forensic science information, and because of the sheer number of disciplines under the forensic science umbrella, forensic scientists are often unable to locate material that is relevant to their needs. The author contends that using six publicly accessible search engines and databases can produce high-quality search results. The six resources are Google, PubMed, Google Scholar, Google Books, WorldCat, and the National Criminal Justice Reference Service. Carefully selected keywords and keyword combinations, designating a keyword phrase so that the search engine will search on the phrase and not individual keywords, and prompting search engines to retrieve PDF files are among the techniques discussed. Copyright © 2015 Central Police University.

  19. Fundamental of biomedical engineering

    CERN Document Server

    Sawhney, GS

    2007-01-01

    About the Book: A well set out textbook explains the fundamentals of biomedical engineering in the areas of biomechanics, biofluid flow, biomaterials, bioinstrumentation and use of computing in biomedical engineering. All these subjects form a basic part of an engineer's education. The text is admirably suited to meet the needs of the students of mechanical engineering, opting for the elective of Biomedical Engineering. Coverage of bioinstrumentation, biomaterials and computing for biomedical engineers can meet the needs of the students of Electronic & Communication, Electronic & Instrumenta

  20. Preparing College Students To Search Full-Text Databases: Is Instruction Necessary?

    Science.gov (United States)

    Riley, Cheryl; Wales, Barbara

    Full-text databases allow Central Missouri State University's clients to access some of the serials that libraries have had to cancel due to escalating subscription costs; EbscoHost, the subject of this study, is one such database. The database is available free to all Missouri residents. A survey was designed consisting of 21 questions intended…

  1. Building a biomedical cyberinfrastructure for collaborative research.

    Science.gov (United States)

    Schad, Peter A; Mobley, Lee Rivers; Hamilton, Carol M

    2011-05-01

    For the potential power of genome-wide association studies (GWAS) and translational medicine to be realized, the biomedical research community must adopt standard measures, vocabularies, and systems to establish an extensible biomedical cyberinfrastructure. Incorporating standard measures will greatly facilitate combining and comparing studies via meta-analysis. Incorporating consensus-based and well-established measures into various studies should reduce the variability across studies due to attributes of measurement, making findings across studies more comparable. This article describes two well-established consensus-based approaches to identifying standard measures and systems: PhenX (consensus measures for phenotypes and eXposures), and the Open Geospatial Consortium (OGC). NIH support for these efforts has produced the PhenX Toolkit, an assembled catalog of standard measures for use in GWAS and other large-scale genomic research efforts, and the RTI Spatial Impact Factor Database (SIFD), a comprehensive repository of geo-referenced variables and extensive meta-data that conforms to OGC standards. The need for coordinated development of cyberinfrastructure to support measures and systems that enhance collaboration and data interoperability is clear; this paper includes a discussion of standard protocols for ensuring data compatibility and interoperability. Adopting a cyberinfrastructure that includes standard measures and vocabularies, and open-source systems architecture, such as the two well-established systems discussed here, will enhance the potential of future biomedical and translational research. Establishing and maintaining the cyberinfrastructure will require a fundamental change in the way researchers think about study design, collaboration, and data storage and analysis. Copyright © 2011 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  2. Analysis list - ChIP-Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data file URL: ftp://ftp.biosciencedbc.jp/archive/chip-atlas/LATEST/chip_atlas_analysis_list.zip, file size: 44.8 KB.

  3. The North Carolina State University Libraries Search Experience: Usability Testing Tabbed Search Interfaces for Academic Libraries

    Science.gov (United States)

    Teague-Rector, Susan; Ballard, Angela; Pauley, Susan K.

    2011-01-01

    Creating a learnable, effective, and user-friendly library Web site hinges on providing easy access to search. Designing a search interface for academic libraries can be particularly challenging given the complexity and range of searchable library collections, such as bibliographic databases, electronic journals, and article search silos. Library…

  4. A student's guide to searching the literature using online databases

    Science.gov (United States)

    Miller, Casey W.; Belyea, Dustin; Chabot, Michelle; Messina, Troy

    2012-02-01

    A method is described to empower students to efficiently perform general and specific literature searches using online resources [Miller et al., Am. J. Phys. 77, 1112 (2009)]. The method was tested on multiple groups, including undergraduate and graduate students with varying backgrounds in scientific literature searches. Students involved in this study showed marked improvement in their awareness of how and where to find scientific information. Repeated exposure to literature searching methods appears worthwhile, starting early in the undergraduate career, and even in graduate school orientation.

  5. Beyond MEDLINE for literature searches.

    Science.gov (United States)

    Conn, Vicki S; Isaramalai, Sang-arun; Rath, Sabyasachi; Jantarakupt, Peeranuch; Wadhawan, Rohini; Dash, Yashodhara

    2003-01-01

    To describe strategies for a comprehensive literature search. MEDLINE searches result in limited numbers of studies that are often biased toward statistically significant findings. Diversified search strategies are needed. Empirical evidence about the recall and precision of diverse search strategies is presented. Challenges and strengths of each search strategy are identified. Search strategies vary in recall and precision. Often sensitivity and specificity are inversely related. Valuable search strategies include examination of multiple diverse computerized databases, ancestry searches, citation index searches, examination of research registries, journal hand searching, contact with the "invisible college," examination of abstracts, Internet searches, and contact with sources of synthesized information. Extending searches beyond MEDLINE enables researchers to conduct more systematic comprehensive searches.

  6. Cluster (Viridiplantae) - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data name: Cluster (Viridiplantae), part of the PGDBj Ortholog DB. Each cluster ID is uniquely assigned by the PGDBj Ortholog Database; the cluster size gives the number of proteins in the cluster.

  7. Cluster (Cyanobacteria) - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data name: Cluster (Cyanobacteria), part of the PGDBj Ortholog DB. Each cluster ID is uniquely assigned by the PGDBj Ortholog Database; the cluster size gives the number of proteins in the cluster.

  8. The biomedical disciplines and the structure of biomedical and clinical knowledge.

    Science.gov (United States)

    Nederbragt, H

    2000-11-01

    The relation between biomedical knowledge and clinical knowledge is discussed by comparing their respective structures. The knowledge of a disease as a biological phenomenon is constructed by the interaction of facts and theories from the main biomedical disciplines: epidemiology, diagnostics, clinical trials, therapy development and pathogenesis. Although these facts and theories are based on probabilities and extrapolations, the interaction provides a reliable and coherent structure, comparable to a Kuhnian paradigm. In the structure of clinical knowledge, i.e. knowledge of the patient with the disease, not only biomedical knowledge contributes to the structure but also economic and social relations, ethics and personal experience. However, the interaction between each of the participating "knowledges" in clinical knowledge is not based on mutual dependency and accumulation of different arguments from each, as in biomedical knowledge, but on competition and partial exclusion. Therefore, the structure of biomedical knowledge is different from that of clinical knowledge. This difference is used as the basis for a discussion in which the place of technology, evidence-based medicine and the gap between scientific and clinical knowledge are evaluated.

  9. Citation Searching: Search Smarter & Find More

    Science.gov (United States)

    Hammond, Chelsea C.; Brown, Stephanie Willen

    2008-01-01

    The staff at University of Connecticut are participating in Elsevier's Student Ambassador Program (SAmP) in which graduate students train their peers on "citation searching" research using Scopus and Web of Science, two tremendous citation databases. They are in the fourth semester of these training programs, and they are wildly successful: They…

  10. Biomedical signals, imaging, and informatics

    CERN Document Server

    Bronzino, Joseph D

    2014-01-01

    Known as the bible of biomedical engineering, The Biomedical Engineering Handbook, Fourth Edition, sets the standard against which all other references of this nature are measured. As such, it has served as a major resource for both skilled professionals and novices to biomedical engineering.Biomedical Signals, Imaging, and Informatics, the third volume of the handbook, presents material from respected scientists with diverse backgrounds in biosignal processing, medical imaging, infrared imaging, and medical informatics.More than three dozen specific topics are examined, including biomedical s

  11. Powering biomedical devices

    CERN Document Server

    Romero, Edwar

    2013-01-01

    From exoskeletons to neural implants, biomedical devices are no less than life-changing. Compact and constant power sources are necessary to keep these devices running efficiently. Edwar Romero's Powering Biomedical Devices reviews the background, current technologies, and possible future developments of these power sources, examining not only the types of biomedical power sources available (macro, mini, MEMS, and nano), but also what they power (such as prostheses, insulin pumps, and muscular and neural stimulators), and how they work (covering batteries, biofluids, kinetic and ther

  12. PREIMS - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data detail for PREIMS in the AT Atlas, produced within the Targeted Proteins Research Program (TPRP). Data file: at_atlas_preims.zip, available from ftp://ftp.biosciencedbc.jp/archiv… . Database description, download, license, and update history pages are available from the LSDB Archive entry.

  13. Mobile object retrieval in server-based image databases

    Science.gov (United States)

    Manger, D.; Pagel, F.; Widak, H.

    2013-05-01

    The increasing number of mobile phones equipped with powerful cameras is producing huge collections of user-generated images. To use this information on site, image retrieval systems that search for similar objects in a user's own image database are becoming increasingly popular. As the computational performance and memory capacity of mobile devices grow, this search can often be performed on the device itself, which is feasible, for example, if the images are represented with global image features or if the search relies on EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing collection, or if powerful content-based image retrieval methods with local features are required, a server-based retrieval backend is needed. In this work, we present a content-based image retrieval system with a client-server architecture working with local features. On the server side, scalability to large image databases is addressed with the popular bag-of-words model and state-of-the-art extensions. The client end focuses on a lightweight user interface that presents the most similar images in the database and highlights the visual information they share with the query image. Additionally, new images can be added to the database, making the system a powerful and interactive tool for mobile content-based image retrieval.
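
    The bag-of-words retrieval idea mentioned above can be sketched roughly as follows: local descriptors are extracted from every database image, clustered into a visual vocabulary, and each image is then represented as a histogram over that vocabulary for matching. The snippet below uses ORB descriptors and k-means purely as stand-ins for whatever features and extensions the actual system employs; the file paths and vocabulary size are illustrative assumptions.

      # Rough bag-of-visual-words sketch (not the authors' implementation).
      # ORB + k-means stand in for the local features / vocabulary of the paper.
      import cv2
      import numpy as np
      from sklearn.cluster import KMeans

      def orb_descriptors(path, orb):
          img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
          _, desc = orb.detectAndCompute(img, None)
          return desc if desc is not None else np.empty((0, 32), dtype=np.uint8)

      database_paths = ["db_img1.jpg", "db_img2.jpg"]   # hypothetical image files
      orb = cv2.ORB_create(nfeatures=500)
      all_desc = np.vstack([orb_descriptors(p, orb) for p in database_paths]).astype(np.float32)

      k = 64  # visual vocabulary size (illustrative)
      vocab = KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_desc)

      def bow_histogram(path):
          desc = orb_descriptors(path, orb).astype(np.float32)
          words = vocab.predict(desc)
          hist, _ = np.histogram(words, bins=np.arange(k + 1))
          return hist / max(hist.sum(), 1)  # normalised word histogram

      db_hists = {p: bow_histogram(p) for p in database_paths}
      query_hist = bow_histogram("query.jpg")           # hypothetical query image
      ranked = sorted(db_hists, key=lambda p: -np.dot(db_hists[p], query_hist))
      print(ranked)  # database images ordered by similarity to the query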

  14. ElasticSearch server

    CERN Document Server

    Rogozinski, Marek

    2014-01-01

    This book is a detailed, practical, hands-on guide packed with real-life scenarios and examples that show you how to implement an ElasticSearch search engine on your own websites. If you are a web developer or a user who wants to learn more about ElasticSearch, then this is the book for you. You do not need to know anything about ElasticSearch, Java, or Apache Lucene in order to use this book, though basic knowledge of databases and queries is required.
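
    As a small illustration of the kind of query such a book walks through, the sketch below sends a basic full-text "match" query to an Elasticsearch node over its REST API. The host, index name ("articles"), and field name ("abstract") are assumptions for the example, not values from the text.

      # Minimal sketch: full-text "match" query against a local Elasticsearch node.
      # Host, index ("articles") and field ("abstract") are illustrative assumptions.
      import requests

      ES_URL = "http://localhost:9200/articles/_search"  # assumed local node and index

      query = {
          "query": {
              "match": {"abstract": "biomedical database search"}
          },
          "size": 5,
      }

      resp = requests.post(ES_URL, json=query, timeout=10)
      resp.raise_for_status()
      for hit in resp.json()["hits"]["hits"]:
          # _score is Elasticsearch's relevance score for the hit
          print(hit["_score"], hit["_source"].get("title"))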

  15. An ontology-based search engine for protein-protein interactions.

    Science.gov (United States)

    Park, Byungkyu; Han, Kyungsook

    2010-01-18

    Keyword matching and ID matching are the most common search methods in large databases of protein-protein interactions. They are purely syntactic methods that retrieve the records in the database containing a keyword or ID specified in a query, and they often return too few results, or none at all, despite many potential matches being present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine built on it to search protein-protein interactions efficiently and in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers allows a search engine to find all protein-protein interactions efficiently by prime factorization of the numbers. Keyword or ID matching often misses interactions involving a protein that has no explicit annotation matching the search condition; our search engine retrieves such interactions as well, provided they satisfy the search condition with a more specific term in the ontology.
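
    The Gödel-number idea can be illustrated with a toy encoding: assign each GO term a distinct prime, represent a protein's annotation set as the product of the primes of its terms, and test whether a condition holds by divisibility. The GO identifiers and interaction pairs below are placeholders, and a real system would also propagate primes along the ontology hierarchy, which this sketch omits.

      # Toy illustration of encoding annotation sets as products of primes
      # ("modified Goedel numbers") and querying them by divisibility.
      from sympy import prime

      # Hypothetical GO terms mapped to distinct primes (2, 3, 5, ...).
      go_terms = ["GO:0005524", "GO:0006468", "GO:0005634"]
      term_prime = {t: prime(i + 1) for i, t in enumerate(go_terms)}

      def encode(annotations):
          """Product of the primes of all GO terms annotating a protein."""
          code = 1
          for term in annotations:
              code *= term_prime[term]
          return code

      # Hypothetical proteins, their annotation codes, and their interaction pairs.
      proteins = {
          "P1": encode(["GO:0005524", "GO:0006468"]),
          "P2": encode(["GO:0005634"]),
          "P3": encode(["GO:0006468", "GO:0005634"]),
      }
      interactions = [("P1", "P2"), ("P1", "P3")]

      def satisfies(protein, required_terms):
          """The query condition holds iff the product of the required primes divides the code."""
          return proteins[protein] % encode(required_terms) == 0

      query_protein, condition = "P1", ["GO:0006468"]
      partners = [b for a, b in interactions if a == query_protein and satisfies(b, condition)] + \
                 [a for a, b in interactions if b == query_protein and satisfies(a, condition)]
      print(partners)  # interaction partners of P1 annotated with GO:0006468 -> ['P3']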

  16. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology

    International Nuclear Information System (INIS)

    Shen Yang; Bax, Ad

    2007-01-01

    Chemical shifts of nuclei in or attached to a protein backbone are exquisitely sensitive to their local environment. A computer program, SPARTA, is described that uses this correlation with local structure to predict protein backbone chemical shifts, given an input three-dimensional structure, by searching a newly generated database for triplets of adjacent residues that provide the best match in φ/ψ/χ1 torsion angles and sequence similarity to the query triplet of interest. The database contains 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C' chemical shifts for 200 proteins for which a high-resolution X-ray structure (≤2.4 Å) is available. The relative importance of the weighting factors for the φ/ψ/χ1 angles and sequence similarity was optimized empirically. The weighted, average secondary shifts of the central residues in the 20 best-matching triplets, after inclusion of nearest-neighbor, ring-current, and hydrogen-bonding effects, are used to predict chemical shifts for the protein of known structure. Validation shows good agreement between the SPARTA-predicted and experimental shifts, with standard deviations of 2.52, 0.51, 0.27, 0.98, 1.07 and 1.08 ppm for 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C', respectively, including outliers.
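
    A heavily simplified version of the matching step might look like the sketch below: candidate triplets are scored by a combination of torsion-angle differences and sequence mismatches, and the predicted secondary shift is a weighted average over the best matches. The weights, the angle metric, and the example database entries are illustrative assumptions, not SPARTA's actual parameters.

      # Simplified sketch of torsion-angle / sequence matching for shift prediction.
      # Weights and example records are illustrative; SPARTA's real scoring differs.

      def angle_diff(a, b):
          """Smallest difference between two dihedral angles, in degrees."""
          d = abs(a - b) % 360.0
          return min(d, 360.0 - d)

      def triplet_score(query, candidate, w_angle=1.0, w_seq=10.0):
          """Lower is better: angle distance plus a penalty per sequence mismatch."""
          angle_cost = sum(angle_diff(qa, ca)
                           for qa, ca in zip(query["angles"], candidate["angles"]))
          seq_mismatches = sum(q != c for q, c in zip(query["seq"], candidate["seq"]))
          return w_angle * angle_cost + w_seq * seq_mismatches

      # Hypothetical database of central-residue secondary 13Ca shifts (ppm).
      database = [
          {"seq": "AKL", "angles": (-65.0, -42.0, -60.0), "sec_shift": 2.1},
          {"seq": "GKL", "angles": (-120.0, 130.0, -65.0), "sec_shift": -1.3},
          {"seq": "AKV", "angles": (-63.0, -40.0, -58.0), "sec_shift": 2.4},
      ]

      query = {"seq": "AKL", "angles": (-64.0, -41.0, -59.0)}
      best = sorted(database, key=lambda c: triplet_score(query, c))[:2]   # keep best matches
      weights = [1.0 / (1.0 + triplet_score(query, c)) for c in best]      # ad-hoc weighting
      prediction = sum(w * c["sec_shift"] for w, c in zip(weights, best)) / sum(weights)
      print(round(prediction, 2))   # weighted prediction from the best-matching triplets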

  17. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2004-10-01

    Full Text Available Abstract Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper, a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST that provides a complementary solution for BLAST searches when the database is too large to fit into

  18. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

    Science.gov (United States)

    Wang, Chunlin; Lefkowitz, Elliot J

    2004-10-28

    Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together
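
    The query-segmentation (QS-search) idea can be mimicked in a few lines: split the multi-sequence query file into roughly equal chunks, run one BLAST process per chunk in parallel, and concatenate the outputs. The sketch below assumes NCBI BLAST+ (blastp) is installed and that a query file "queries.fasta" and a formatted database "mydb" exist; it is a simplified stand-in for SS-Wrapper's load-balanced scheduler, not its actual code.

      # Simplified query-segmentation (QS-search style) parallel BLAST sketch.
      # Assumes blastp (NCBI BLAST+), queries.fasta, and a formatted database "mydb".
      import subprocess
      from concurrent.futures import ProcessPoolExecutor

      def read_fasta(path):
          records, header, seq = [], None, []
          with open(path) as fh:
              for line in fh:
                  if line.startswith(">"):
                      if header is not None:
                          records.append((header, "".join(seq)))
                      header, seq = line.rstrip(), []
                  else:
                      seq.append(line.strip())
          if header is not None:
              records.append((header, "".join(seq)))
          return records

      def run_chunk(args):
          idx, records = args
          chunk_file, out_file = f"chunk_{idx}.fasta", f"chunk_{idx}.tab"
          with open(chunk_file, "w") as fh:
              for header, seq in records:
                  fh.write(f"{header}\n{seq}\n")
          # Tabular output (-outfmt 6) keeps concatenation of per-chunk results trivial.
          subprocess.run(["blastp", "-query", chunk_file, "-db", "mydb",
                          "-evalue", "1e-5", "-outfmt", "6", "-out", out_file], check=True)
          return out_file

      if __name__ == "__main__":
          n_workers = 4                                   # e.g. number of available CPUs
          records = read_fasta("queries.fasta")
          chunks = [(i, records[i::n_workers]) for i in range(n_workers)]
          with ProcessPoolExecutor(max_workers=n_workers) as pool:
              outputs = list(pool.map(run_chunk, chunks))
          with open("all_hits.tab", "w") as merged:       # concatenate per-chunk results
              for path in outputs:
                  merged.write(open(path).read())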

  19. CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database.

    Science.gov (United States)

    Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C

    2010-12-01

    The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.

  20. BLAST and FASTA similarity searching for multiple sequence alignment.

    Science.gov (United States)

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry: homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
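
    To act on the advice above (infer homology from expectation values rather than percent identity), one can filter BLAST tabular output on the E-value column. The sketch below parses the default 12-column "-outfmt 6" layout; the file name and threshold are illustrative assumptions.

      # Filter BLAST+ tabular output (-outfmt 6) by E-value rather than percent identity.
      # Default 12 columns: qseqid sseqid pident length mismatch gapopen
      #                     qstart qend sstart send evalue bitscore
      EVALUE_CUTOFF = 1e-6          # illustrative threshold

      def significant_hits(path, cutoff=EVALUE_CUTOFF):
          with open(path) as fh:
              for line in fh:
                  fields = line.rstrip("\n").split("\t")
                  query, subject = fields[0], fields[1]
                  pident, evalue = float(fields[2]), float(fields[10])
                  if evalue <= cutoff:           # homology call based on the E-value
                      yield query, subject, pident, evalue

      for q, s, pident, e in significant_hits("all_hits.tab"):   # hypothetical file
          print(f"{q}\t{s}\tE={e:.2e}\t(identity {pident:.1f}% is reported, not used)")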

  1. Past and future trends in cancer and biomedical research: a comparison between Egypt and the World using PubMed-indexed publications

    Directory of Open Access Journals (Sweden)

    Zeeneldin Ahmed Abdelmabood

    2012-07-01

    Full Text Available Abstract Background PubMed is a free web literature search service that contains almost 21 million abstracts and publications and handles almost 5 million user queries daily. The purposes of the study were to compare trends in PubMed-indexed cancer and biomedical publications from Egypt to those of the world and to predict future publication volumes. Methods PubMed was searched for biomedical publications between 1991 and 2010 (publication dates). Affiliation was then limited to Egypt. Further limitation was applied to cancer, human and animal publications. A Poisson regression model was used to predict the number of publications between 2011 and 2020. Results Cancer publications contributed 23% to biomedical publications both for Egypt and the world. Egyptian biomedical and cancer publications contributed about 0.13% to their world counterparts, and this contribution more than doubled over the study period. Egyptian and world publications increased from year to year, with a rapid rise starting in 2003. Egyptian as well as world human cancer publications showed the highest increases. Egyptian publications had some peculiarities: they showed a drop in 1994 and 2002, and, apart from the decline in the animal:human ratio with time, all Egyptian publications in the period 1991-2000 were significantly more numerous than those in 2001-2010 (P …). Conclusions The Egyptian contribution to world biomedical and cancer publications needs significant improvement through research strategic planning, setting national research priorities, adequate funding and researchers' training.

  2. Past and future trends in cancer and biomedical research: a comparison between Egypt and the world using PubMed-indexed publications.

    Science.gov (United States)

    Zeeneldin, Ahmed Abdelmabood; Taha, Fatma Mohamed; Moneer, Manar

    2012-07-10

    PubMed is a free web literature search service that contains almost 21 million abstracts and publications and handles almost 5 million user queries daily. The purposes of the study were to compare trends in PubMed-indexed cancer and biomedical publications from Egypt to those of the world and to predict future publication volumes. PubMed was searched for biomedical publications between 1991 and 2010 (publication dates). Affiliation was then limited to Egypt. Further limitation was applied to cancer, human and animal publications. A Poisson regression model was used to predict the number of publications between 2011 and 2020. Cancer publications contributed 23% to biomedical publications both for Egypt and the world. Egyptian biomedical and cancer publications contributed about 0.13% to their world counterparts, and this contribution more than doubled over the study period. Egyptian and world publications increased from year to year, with a rapid rise starting in 2003. Egyptian as well as world human cancer publications showed the highest increases. Egyptian publications had some peculiarities: they showed a drop in 1994 and 2002, and, apart from the decline in the animal:human ratio with time, all Egyptian publications in the period 1991-2000 were significantly more numerous than those in 2001-2010 (P …) … PubMed publications, respectively. The Egyptian contribution to world biomedical and cancer publications needs significant improvements through research strategic planning, setting national research priorities, adequate funding and researchers' training.
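
    The Poisson regression forecast described in the abstract can be sketched with statsmodels: fit a Poisson GLM of annual publication counts on year, then extrapolate to future years. The yearly counts below are made-up placeholders, not the study's data.

      # Sketch of a Poisson regression trend forecast (statsmodels GLM).
      # The yearly counts are fabricated placeholders, not the study's data.
      import numpy as np
      import statsmodels.api as sm

      years = np.arange(1991, 2011)
      counts = np.array([120, 130, 125, 118, 140, 150, 160, 155, 170, 180,
                         178, 165, 200, 230, 260, 300, 340, 390, 450, 520])  # hypothetical

      X = sm.add_constant(years - years.min())      # intercept + shifted year
      model = sm.GLM(counts, X, family=sm.families.Poisson()).fit()

      future = np.arange(2011, 2021)
      X_future = sm.add_constant(future - years.min())
      predicted = model.predict(X_future)
      for yr, n in zip(future, predicted):
          print(yr, int(round(n)))                   # expected number of publications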

  3. High serum folate is associated with reduced biochemical recurrence after radical prostatectomy: Results from the SEARCH Database

    Directory of Open Access Journals (Sweden)

    Daniel M. Moreira

    2013-06-01

    Full Text Available Introduction To analyze the association between serum folate levels and the risk of biochemical recurrence after radical prostatectomy among men from the Shared Equal Access Regional Cancer Hospital (SEARCH) database. Materials and Methods Retrospective analysis of 135 subjects from the SEARCH database treated between 1991-2009 with available preoperative serum folate levels. Patients' characteristics at the time of surgery were analyzed with rank-sum tests and linear regression. Uni- and multivariable analyses of folate levels (log-transformed) and time to biochemical recurrence were performed with Cox proportional hazards models. Results The median preoperative folate level was 11.6 ng/mL (reference = 1.5-20.0 ng/mL). Folate levels were significantly lower among African-American men than Caucasians (P = 0.003). In univariable analysis, higher folate levels were associated with a more recent year of surgery (P < 0.001) and lower preoperative PSA (P = 0.003), and there was a trend towards lower risk of biochemical recurrence among men with high folate levels (HR = 0.61, 95%CI = 0.37-1.03, P = 0.064). After adjustment for patients' characteristics and pre- and post-operative clinical and pathological findings, higher serum folate levels were independently associated with a lower risk of biochemical recurrence (HR = 0.42, 95%CI = 0.20-0.89, P = 0.023). Conclusion In a cohort of men undergoing radical prostatectomy at several VAs across the country, higher serum folate levels were associated with lower PSA and a lower risk of biochemical failure. While the source of the serum folate in this study is unknown (i.e. diet vs. supplement), these findings, if confirmed, suggest a potential role for folic acid supplementation or increased consumption of folate-rich foods in reducing the risk of recurrence.
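
    A minimal sketch of this kind of Cox proportional-hazards analysis, using the lifelines package: log-transform folate, then model time to biochemical recurrence with recurrence as the event indicator. The column names and toy data are assumptions; the SEARCH data themselves are not public and are not reproduced here.

      # Sketch of a Cox proportional-hazards model on log-transformed folate (lifelines).
      # The dataframe below is a toy stand-in; the SEARCH data are not reproduced here.
      import numpy as np
      import pandas as pd
      from lifelines import CoxPHFitter

      df = pd.DataFrame({
          "months_to_event_or_censor": [12.0, 30.5, 45.2, 8.7, 60.0, 24.3, 50.1, 15.8],
          "biochemical_recurrence":    [1,    0,    0,    1,   0,    1,    0,    1],
          "folate_ng_ml":              [6.5,  14.2, 9.0,  5.1, 16.3, 13.5, 12.7, 7.4],
          "preop_psa":                 [9.1,  5.4,  10.2, 12.3, 6.0, 5.2,  5.9,  10.5],
      })
      df["log_folate"] = np.log(df["folate_ng_ml"])   # log-transform, as in the abstract

      cph = CoxPHFitter()
      cph.fit(df[["months_to_event_or_censor", "biochemical_recurrence",
                  "log_folate", "preop_psa"]],
              duration_col="months_to_event_or_censor",
              event_col="biochemical_recurrence")
      cph.print_summary()    # hazard ratios (exp(coef)) for log_folate and preop_psa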

  4. Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine.

    Science.gov (United States)

    Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai

    2017-03-01

    The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.
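
    At its core, the recommendation step maps a user's query terms to biomedical concepts and offers semantically interchangeable variants. The sketch below fakes that pipeline with a small hand-made synonym table standing in for the MetaMap/UMLS lookups and the curated log-derived synonym set, which cannot be reproduced here.

      # Toy sketch of semantically-based query recommendation.
      # The synonym table stands in for MetaMap/UMLS lookups and curated search-log synonyms.
      SYNONYMS = {
          "heart attack": ["myocardial infarction", "MI", "acute MI"],
          "high blood pressure": ["hypertension", "HTN"],
          "kidney stone": ["nephrolithiasis", "renal calculus"],
      }

      def recommend(query):
          """Return alternative query strings with recognised phrases swapped for synonyms."""
          suggestions = []
          lowered = query.lower()
          for phrase, variants in SYNONYMS.items():
              if phrase in lowered:
                  for variant in variants:
                      suggestions.append(lowered.replace(phrase, variant))
          return suggestions

      print(recommend("heart attack in diabetic patient"))
      # e.g. ['myocardial infarction in diabetic patient', 'MI in diabetic patient', ...]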

  5. Omicseq: a web-based search engine for exploring omics datasets.

    Science.gov (United States)

    Sun, Xiaobo; Pittard, William S; Xu, Tianlei; Chen, Li; Zwick, Michael E; Jiang, Xiaoqian; Wang, Fusheng; Qin, Zhaohui S

    2017-07-03

    The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long-standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve 'findability' of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable, elastic NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Protocol - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data detail for the Protocol data set in RPD. Data files: rpd_protocol_jp.zip (Japanese version, 535 KB), available from ftp://ftp.biosciencedbc.jp/archive/rpd/LATEST/rpd_protocol_jp.zip, and rpd_protocol_en.zip (English version), available from ftp://ftp.biosciencedbc.jp/archiv… . Database description, download, license, and update history pages are available from the LSDB Archive entry.

  7. Tibetan Magmatism Database

    Science.gov (United States)

    Chapman, James B.; Kapp, Paul

    2017-11-01

    A database containing previously published geochronologic, geochemical, and isotopic data on Mesozoic to Quaternary igneous rocks in the Himalayan-Tibetan orogenic system is presented. The database is intended to serve as a repository for new and existing igneous rock data and is publicly accessible through a web-based platform that includes an interactive map and a data table interface with search, filtering, and download options. To illustrate the utility of the database, the age, location, and εHf(t) composition of magmatism from the central Gangdese batholith in the southern Lhasa terrane are compared. The data identify three high-flux events, which peak at 93, 50, and 15 Ma. They are characterized by inboard arc migration and a temporal and spatial shift to more evolved isotopic compositions.

  8. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses.

    Science.gov (United States)

    Falagas, Matthew E; Pitsouni, Eleni I; Malietzis, George A; Pappas, Georgios

    2008-02-01

    The evolution of the electronic age has led to the development of numerous medical databases on the World Wide Web, offering search facilities on a particular subject and the ability to perform citation analysis. We compared the content coverage and practical utility of PubMed, Scopus, Web of Science, and Google Scholar. The official Web pages of the databases were used to extract information on the range of journals covered, search facilities and restrictions, and update frequency. We used the example of a keyword search to evaluate the usefulness of these databases in biomedical information retrieval and a specific published article to evaluate their utility in performing citation analysis. All databases were practical in use and offered numerous search facilities. PubMed and Google Scholar are accessed for free. The keyword search with PubMed offers optimal update frequency and includes online early articles; other databases can rate articles by number of citations, as an index of importance. For citation analysis, Scopus offers about 20% more coverage than Web of Science, whereas Google Scholar offers results of inconsistent accuracy. PubMed remains an optimal tool in biomedical electronic research. Scopus covers a wider journal range, of help both in keyword searching and citation analysis, but it is currently limited to recent articles (published after 1995) compared with Web of Science. Google Scholar, as for the Web in general, can help in the retrieval of even the most obscure information but its use is marred by inadequate, less often updated, citation information.

  9. Biomedical ontologies: toward scientific debate.

    Science.gov (United States)

    Maojo, V; Crespo, J; García-Remesal, M; de la Iglesia, D; Perez-Rey, D; Kulikowski, C

    2011-01-01

    Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.

  10. Download - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Download page for the PGDBj Ortholog DB. Users are asked to read the database license before downloading the data. Database description, license, update history, and site policy pages are available from the LSDB Archive entry.

  11. Predicting the performance of fingerprint similarity searching.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
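
    The divergence calculation at the heart of this approach can be sketched directly: estimate per-bit feature frequencies for the reference set and the screening database, and compute the Kullback-Leibler divergence between the two Bernoulli distributions bit by bit. The random fingerprints below are placeholders for real reference and database sets.

      # Sketch: Kullback-Leibler divergence between per-bit feature distributions
      # of reference fingerprints and a screening database (random placeholders).
      import numpy as np

      rng = np.random.default_rng(0)
      reference = rng.random((50, 1024)) < 0.30      # 50 reference fingerprints, 1024 bits
      database  = rng.random((5000, 1024)) < 0.10    # 5000 screening fingerprints

      def bit_frequencies(fps, eps=1e-6):
          """Per-bit probability of being set, clipped away from 0 and 1."""
          return np.clip(fps.mean(axis=0), eps, 1.0 - eps)

      def bernoulli_kl(p, q):
          """Sum over bits of KL(Bernoulli(p) || Bernoulli(q))."""
          return np.sum(p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q)))

      kl = bernoulli_kl(bit_frequencies(reference), bit_frequencies(database))
      print(f"KL divergence: {kl:.2f} nats")  # larger divergence suggests easier retrieval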

  12. DRUMS: a human disease related unique gene mutation search engine.

    Science.gov (United States)

    Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

    2011-10-01

    With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data through a search engine approach. In this work, we have developed the disease-related unique gene mutation search engine (DRUMS), a search engine for human disease-related unique gene mutations and a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from the LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS can be treated as a domain-specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

  13. Developing an Inhouse Database from Online Sources.

    Science.gov (United States)

    Smith-Cohen, Deborah

    1993-01-01

    Describes the development of an in-house bibliographic database by the U.S. Army Corps of Engineers Cold Regions Research and Engineering Laboratory on arctic wetlands research. Topics discussed include planning; identifying relevant search terms and commercial online databases; downloading citations; criteria for software selection; management…

  14. A Survey in Indexing and Searching XML Documents.

    Science.gov (United States)

    Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James

    2002-01-01

    Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…
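
    For the structured-indexing side of the survey, even the Python standard library supports element-level queries over XML via a limited XPath subset. The sketch below builds a tiny document in memory and retrieves elements by path and by attribute; the document and tag names are invented for illustration.

      # Minimal structured search over XML using ElementTree's limited XPath support.
      # The document and tag names are invented for illustration.
      import xml.etree.ElementTree as ET

      xml_doc = """
      <library>
        <article id="a1"><title>Searching biomedical databases</title><year>2012</year></article>
        <article id="a2"><title>Protein similarity search</title><year>2004</year></article>
      </library>
      """
      root = ET.fromstring(xml_doc)

      # Path query: all article titles.
      print([t.text for t in root.findall("./article/title")])

      # Attribute query: the article whose id is "a2".
      hit = root.find(".//article[@id='a2']")
      print(hit.find("title").text, hit.find("year").text)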

  15. A database of immunoglobulins with integrated tools: DIGIT.

    KAUST Repository

    Chailyan, Anna; Tramontano, Anna; Marcatili, Paolo

    2011-01-01

    The DIGIT (Database of ImmunoGlobulins with Integrated Tools) database (http://biocomputing.it/digit) is an integrated resource storing sequences of annotated immunoglobulin variable domains and enriched with tools for searching and analyzing them. The annotations in the database include information on the type of antigen, the respective germline sequences and on pairing information between light and heavy chains. Other annotations, such as the identification of the complementarity determining regions, assignment of their structural class and identification of mutations with respect to the germline, are computed on the fly and can also be obtained for user-submitted sequences. The system allows customized BLAST searches and automatic building of 3D models of the domains to be performed.

  16. A database of immunoglobulins with integrated tools: DIGIT.

    KAUST Repository

    Chailyan, Anna

    2011-11-10

    The DIGIT (Database of ImmunoGlobulins with Integrated Tools) database (http://biocomputing.it/digit) is an integrated resource storing sequences of annotated immunoglobulin variable domains and enriched with tools for searching and analyzing them. The annotations in the database include information on the type of antigen, the respective germline sequences and on pairing information between light and heavy chains. Other annotations, such as the identification of the complementarity determining regions, assignment of their structural class and identification of mutations with respect to the germline, are computed on the fly and can also be obtained for user-submitted sequences. The system allows customized BLAST searches and automatic building of 3D models of the domains to be performed.

  17. A Part-Of-Speech term weighting scheme for biomedical information retrieval.

    Science.gov (United States)

    Wang, Yanshan; Wu, Stephen; Li, Dingcheng; Mehrabi, Saeed; Liu, Hongfang

    2016-10-01

    In the era of digitalization, information retrieval (IR), which retrieves and ranks documents from large collections according to users' search queries, has been widely applied in the biomedical domain. Building patient cohorts using electronic health records (EHRs) and searching the literature for topics of interest are typical IR use cases. Meanwhile, natural language processing (NLP) techniques, such as tokenization or Part-Of-Speech (POS) tagging, have been developed for processing clinical documents and biomedical literature. We hypothesize that NLP can be incorporated into IR to strengthen conventional IR models. In this study, we propose two NLP-empowered IR models, POS-BoW and POS-MRF, which incorporate automatic POS-based term weighting schemes into bag-of-words (BoW) and Markov Random Field (MRF) IR models, respectively. In the proposed models, the POS-based term weights are iteratively calculated using a cyclic coordinate method in which a golden section line search algorithm is applied along each coordinate to optimize the objective function defined by mean average precision (MAP). In the empirical experiments, we used the data sets from the Medical Records track in the Text REtrieval Conference (TREC) 2011 and 2012 and the Genomics track in TREC 2004. The evaluation on the TREC 2011 and 2012 Medical Records tracks shows that, for the POS-BoW models, the mean improvement rates for the IR evaluation metrics MAP, bpref, and P@10 are 10.88%, 4.54%, and 3.82%, compared to the BoW models; and for the POS-MRF models, these rates are 13.59%, 8.20%, and 8.78%, compared to the MRF models. Additionally, we experimentally verify that the proposed weighting approach is superior to simple heuristic and frequency-based weighting approaches, and validate our POS category selection. Using the optimal weights calculated in this experiment, we tested the proposed models on the TREC 2004 Genomics track and obtained average improvement rates of 8.63% and 10.04% for POS-BoW and POS
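
    The optimisation loop described above pairs cyclic coordinate descent with a golden section line search along each coordinate. The snippet below shows a generic golden section search for a one-dimensional maximum; in the paper the objective would be MAP as a function of one POS-category weight with the others held fixed, which is replaced here by a toy concave function.

      # Generic golden-section line search for a 1-D maximum.
      # In the paper's setting f(w) would be MAP as a function of one POS weight;
      # here a toy concave function stands in for it.
      import math

      def golden_section_max(f, lo, hi, tol=1e-5):
          inv_phi = (math.sqrt(5.0) - 1.0) / 2.0        # 1/phi, about 0.618
          a, b = lo, hi
          c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
          while abs(b - a) > tol:
              if f(c) > f(d):       # maximum lies in [a, d]
                  b, d = d, c
                  c = b - inv_phi * (b - a)
              else:                 # maximum lies in [c, b]
                  a, c = c, d
                  d = a + inv_phi * (b - a)
          return (a + b) / 2.0

      toy_map = lambda w: -(w - 0.7) ** 2 + 0.35        # stand-in for MAP(w)
      print(round(golden_section_max(toy_map, 0.0, 2.0), 4))   # close to 0.7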

  18. Intelligent methods for data retrieval in fusion databases

    International Nuclear Information System (INIS)

    Vega, J.

    2008-01-01

    The plasma behaviour is identified through the recognition of patterns inside signals. The search for patterns is usually a manual and tedious procedure in which signals need to be examined individually. A breakthrough in data retrieval for fusion databases is the development of intelligent methods to search for patterns. A pattern (in the broadest sense) could be a single segment of a waveform, a set of pixels within an image or even a heterogeneous set of features made up of waveforms, images and any kind of experimental data. Intelligent methods will allow searching for data according to technical, scientific and structural criteria instead of an identifiable time interval or pulse number. Such search algorithms should be intelligent enough to avoid passing over the entire database. Benefits of such access methods are discussed and several available techniques are reviewed. In addition, the applicability of the methods from general purpose searching systems to ad hoc developments is covered
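
    One building block of such intelligent retrieval is matching a short reference pattern against long stored waveforms. The sketch below slides a normalised template over a signal and reports the best-matching offsets by correlation; it is a generic illustration with synthetic data, not the retrieval machinery actually deployed at any fusion database.

      # Generic sketch: locate a short reference pattern inside a long waveform
      # by sliding-window normalised correlation (illustrative only).
      import numpy as np

      def sliding_correlation(signal, pattern):
          """Normalised correlation of `pattern` against every window of `signal`."""
          n = len(pattern)
          p = (pattern - pattern.mean()) / (pattern.std() + 1e-12)
          scores = np.empty(len(signal) - n + 1)
          for i in range(len(scores)):
              w = signal[i:i + n]
              w = (w - w.mean()) / (w.std() + 1e-12)
              scores[i] = np.dot(w, p) / n
          return scores

      t = np.linspace(0, 10, 2000)
      waveform = np.sin(2 * np.pi * 1.5 * t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
      pattern = waveform[400:480].copy()                # pretend this is the pattern of interest

      scores = sliding_correlation(waveform, pattern)
      best = np.argsort(scores)[::-1][:3]
      print(best, scores[best].round(3))                # offsets of the closest matches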

  19. License - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available License page for PSCDB. Database description, download, update history, and site policy pages are available from the LSDB Archive entry.

  20. Download - RPSD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Download page for RPSD. Database description, license, update history, and site policy pages are available from the LSDB Archive entry.