GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    In: Biodiversity Data Journal, Pensoft Publishers, Vol. 9 ( 2021-09-24)
    Abstract: OpenBiodiv is a biodiversity knowledge graph containing a synthetic linked open dataset, OpenBiodiv-LOD, which combines knowledge extracted from academic literature with the taxonomic backbone used by the Global Biodiversity Information Facility. The linked open data is modelled according to the OpenBiodiv-O ontology integrating semantic resource types from recognised biodiversity and publishing ontologies with OpenBiodiv-O resource types, introduced to capture the semantics of resources not modelled before. We introduce the new release of the OpenBiodiv-LOD attained through information extraction and modelling of additional biodiversity entities. It was achieved by further developments to OpenBiodiv-O, the data storage infrastructure and the workflow and accompanying R software packages used for transformation of academic literature into Resource Description Framework (RDF). We discuss how to utilise the LOD in biodiversity informatics and give examples by providing solutions to several competency questions. We investigate performance issues that arise due to the large amount of inferred statements in the graph and conclude that OWL-full inference is impractical for the project and that unnecessary inference should be avoided.
    Type of Medium: Online Resource
    ISSN: 1314-2828 , 1314-2836
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2021
    ZDB ID: 2736709-5
  • 2
    In: Journal of the Bulgarian Geographical Society, Pensoft Publishers, Vol. 42 ( 2020-05-04), p. 146-149
    Abstract: Acad. Anastas Beshkov (1896-1964) is one of the leading Bulgarian geographic researchers of the 20th century. His fundamental work is connected with the development of regional geographical research in Bulgaria and the first economic-geographic regionalization of Bulgaria (1934). The scientific work and expert assessments of Acad. Beshkov are related to important national economic projects, such as the role of transport in the development of the economy and settlements, the solution of the transport problems in Dobrudzha, the siting of the fertilizer factory near the town of Stara Zagora, and the idea for the Varna-Devnya canal, developed into a project in 1965 and realized in 1975.
    Type of Medium: Online Resource
    ISSN: 2738-8115 , 2738-8107
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2020
    ZDB ID: 3121218-9
  • 3
    In: Biodiversity Information Science and Standards, Pensoft Publishers, Vol. 4 ( 2020-09-28)
    Abstract: Data papers have started to gain popularity as a publishing format that allows easy and quick publishing of research data (Chavan and Penev 2011, Penev et al. 2017). They describe single or multiple datasets and the methodologies required for their generation. Similar to traditional research articles, data papers and the underlying datasets are peer-reviewed. In this poster, we demonstrate how data papers can be used to incentivise researchers producing omics datasets to increase the quality of the metadata descriptors and the data itself through the journal authoring, peer review and publication process, thus improving data visibility, discoverability, sharing and reuse. We illustrate a highly automated workflow for the creation of omics data paper manuscripts, which started with the development of a template for this specific article type in the Biodiversity Data Journal (BDJ), published by Pensoft (Dimitrova et al. 2020). The workflow streamlines automatic conversion and import of metadata from the European Nucleotide Archive (ENA) into an omics data paper manuscript created in the ARPHA Writing Tool (AWT), following a three-step procedure: mapping of the ENA metadata to the manuscript sections, extraction of the relevant metadata through the ENA project or study ID, and transforming the metadata into HTML or XML files. The XML file follows the Journal Article Tag Suite (JATS) standard and can be used by anyone as a draft to further develop a data paper manuscript and submit it to any journal.
Records in ENA sometimes have linked data in the ArrayExpress and BioSamples databases, which describe sequencing experiments and samples following the community-accepted metadata standards MINSEQE and MIxS. The workflow also retrieves such records and inserts them both into the omics data paper narrative and as supplementary data files. The workflow has been integrated with Pensoft's ARPHA platform, but the conversion code is openly accessible on GitHub under the Apache 2.0 license and can be run as an R Shiny app. By openly providing access to the code and its implementation in a web application, we enable the full reproducibility of the streamlined import of ENA metadata into an omics data paper manuscript. The plan is to further develop the workflow to include the import of various other types of omics data and omics data repositories in addition to the currently supported ENA genomic data. The workflow reaffirms the important role of high-quality metadata for creating extended dataset descriptions, recognised by Chavan and Penev 2011. Conversion of metadata into a manuscript helped us discover many datasets with insufficient or inaccurate metadata. Hence, we hope that our workflow promotes not only omics data paper publishing but also better metadata authoring and curation.
    Type of Medium: Online Resource
    ISSN: 2535-0897
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2020
    ZDB ID: 3028709-1
  • 4
    In: Journal of the Bulgarian Geographical Society, Pensoft Publishers, Vol. 42 ( 2020-05-10), p. 18-23
    Abstract: Prof. Ivan Batakliev (1891-1973) is one of the main figures in the founding and basic development of the geographical sciences in Bulgaria. His general scientific works are related to political geography, geopolitics and anthropogeography in Bulgaria. He is the founder of these scientific subjects and directions in Bulgaria. The work of Prof. Batakliev is also connected with important contributions to physical geography, the geography of population and settlements, economic geography, the methodology of geographical education, and regional geography, such as the first landscape regionalization of Bulgaria (1934), the first climate regionalization of Bulgaria (1941) and fundamental works on the town of Pazardzhik and the Pazardzhik region (1922, 1923, 1969).
    Type of Medium: Online Resource
    ISSN: 2738-8115 , 2738-8107
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2020
    ZDB ID: 3121218-9
  • 5
    In: Publications, MDPI AG, Vol. 7, No. 2 ( 2019-05-29), p. 38-
    Abstract: Hundreds of years of biodiversity research have resulted in the accumulation of a substantial pool of communal knowledge; however, most of it is stored in silos isolated from each other, such as published articles or monographs. The need for a system to store and manage collective biodiversity knowledge in a community-agreed and interoperable open format has evolved into the concept of the Open Biodiversity Knowledge Management System (OBKMS). This paper presents OpenBiodiv: An OBKMS that utilizes semantic publishing workflows, text and data mining, common standards, ontology modelling and graph database technologies to establish a robust infrastructure for managing biodiversity knowledge. It is presented as a Linked Open Dataset generated from scientific literature. OpenBiodiv encompasses data extracted from more than 5000 scholarly articles published by Pensoft and many more taxonomic treatments extracted by Plazi from journals of other publishers. The data from both sources are converted to Resource Description Framework (RDF) and integrated in a graph database using the OpenBiodiv-O ontology and an RDF version of the Global Biodiversity Information Facility (GBIF) taxonomic backbone. Through the application of semantic technologies, the project showcases the value of open publishing of Findable, Accessible, Interoperable, Reusable (FAIR) data towards the establishment of open science practices in the biodiversity domain.
    Type of Medium: Online Resource
    ISSN: 2304-6775
    Language: English
    Publisher: MDPI AG
    Publication Date: 2019
    ZDB ID: 2720876-X
  • 6
    In: Biodiversity Information Science and Standards, Pensoft Publishers, Vol. 6 ( 2022-08-23)
    Abstract: OpenBiodiv is a complex ecosystem of tools and services for RDF conversion of XML narratives of biodiversity articles including Darwin Core data into Linked Open Data (LOD), running on top of a graph database. OpenBiodiv provides four main types of services: (1) searching named entities (e.g., taxon names, taxon concepts, treatments, specimens, occurrences, gene sequences, bibliographic information, institutions, persons) in context, within and between articles; (2) answering questions based on the presence of certain named entities within specific article sections (e.g., titles, abstracts, introduction or other sections, taxon treatments); (3) identifying article sections for further text processing (NLP) and providing contextual information, stored in MongoDB; and (4) federating the SPARQL endpoint with other triple stores to enrich the discovered knowledge. Conversion of such data into RDF follows a general semantic model expressed in the OpenBiodiv-O ontology, an extension of the Treatment Ontology for knowledge representation of current and legacy biodiversity publications (Senderov et al. 2018) and uses two main sources, the full-text article XML published on the ARPHA Publishing Platform and the taxon treatments extracted by Plazi’s TreatmentBank from more than 100 biodiversity journals, stored in the Biodiversity Literature Repository at Zenodo.
To ensure efficiency, quality control and fast tracking of all stages, the entire process of extraction, conversion to RDF and indexing of the content has been re-built on the Apache Kafka event streaming platform (Fig. 1). In this new format, OpenBiodiv not only provides a GraphDB SPARQL query endpoint but also indexes the named entities through Elasticsearch and provides data to end users through a RESTful API and a number of user applications. OpenBiodiv is designed for a wide range of users who are interested in a deep-level bibliographic exploration, an ontology-linked search of various data elements (e.g., specimens, sequences, taxon concepts, persons), or co-existence of named entities (e.g., taxon names with possible biotic relationships between them, or taxon names and potential habitats of occupation) in pre-defined sections of the articles. The SPARQL endpoint allows complex queries of various kinds (Dimitrova et al. 2021).
    Type of Medium: Online Resource
    ISSN: 2535-0897
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2022
    ZDB ID: 3028709-1
  • 7
    In: Biodiversity Information Science and Standards, Pensoft Publishers, Vol. 3 ( 2019-06-18)
    Abstract: OpenBiodiv is a knowledge management system containing biodiversity knowledge extracted from scholarly literature: both recently published articles in Pensoft's journals and legacy literature (taxon treatments extracted by Plazi) (Senderov et al. 2017). OpenBiodiv advances our understanding of the use of scientific names, collection codes and institutions within published literature by using semantic technologies, such as the conversion of XML-encoded text to RDF triples, linked via the OpenBiodiv-O ontology (Senderov et al. 2018). In this poster, we show how OpenBiodiv, currently containing more than 729 million statements, can be used to address a specific use case: finding institutions storing type material specimens of the genus Prosopistoma from various literature sources (Fig. 1). This use case is important for various groups of users: institutions, taxonomists, and curators. Answering this complex question is made possible through the application of semantic technologies within OpenBiodiv. Data extraction from taxonomic articles and treatments is enabled by the utilisation of common schemas and standards in the extraction process, whereas the conversion of XML-encoded scholarly literature into Resource Description Framework (RDF) is facilitated by OpenBiodiv-O. The code base for information extraction and data transformation is wrapped in the R packages rdf4r and ropenbio. The ontology makes it possible to model the structure of research articles and treatments, as well as their corresponding metadata. Thus, OpenBiodiv-O is used to represent not only the sections of treatments but also the various entities within them, for instance geographic coordinates and institution codes within the “Type materials” section of a treatment. Institution codes marked up within articles using the Darwin Core standard (Wieczorek et al. 2012) are mapped to GRBio's institution records.
Institutions which are not present in GRBio can often be extracted from the “Abbreviations” section of a given article, thus utilising the power of semantic publishing workflows to discover information hidden within scholarly literature (Penev et al. 2011, Agosti and Egloff 2009). Institutional codes (abbreviations) are then mapped to the narrative section containing the type materials information. The extraction of coordinates in the taxonomic treatment section makes it possible to establish the location of the collection event through reverse geocoding and enables the selection of treatments linked to a specific geographic region. Modelling of the “Nomenclature” section within OpenBiodiv-O helps to link taxonomic names, mapped to GBIF’s taxonomic backbone, to their type materials, thus facilitating the discovery of materials corresponding to species from a certain higher-rank taxon.
    Type of Medium: Online Resource
    ISSN: 2535-0897
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2019
    ZDB ID: 3028709-1
  • 8
    In: Biodiversity Information Science and Standards, Pensoft Publishers, Vol. 4 ( 2020-09-28)
    Abstract: Introduction: Scholarly literature is the primary source for biodiversity knowledge based on observations, field work, analysis and taxonomic classification. Publishing such literature in semantically enhanced formats (e.g., through Extensible Markup Language (XML) tagging) helps to make this knowledge easily accessible and available to humans and actionable by computers. A recent collaboration between Pensoft Publishers and Global Biotic Interactions (GloBI) (Poelen et al. 2014) demonstrates how semantically published literature can be used to extract species interactions from tables published in the article narratives (Dimitrova et al. 2020) (Fig. 1). Methods: Biotic interactions were extracted from scholarly literature tables published in several biodiversity journals from Pensoft. Semantically enhanced publications were processed to extract the tables from the article XMLs. There were 6993 tables from 21 different journals. Using the Pensoft Annotator, a text-to-ontology mapping tool, we were able to detect tables that could contain biotic interactions. The Pensoft Annotator was used together with a modified subset of the OBO Foundry Relation Ontology (RO), concentrating on the term labeled ‘biotically interacts with’ and all its children. The contents and captions of all tables were run through the Pensoft Annotator, which returned the matching ontology terms and their position in the text. The resulting subset of tables was then processed by GloBI, which parsed the tables to extract the taxonomic names participating in each interaction. The GloBI workflow also generated table citations by SPARQL queries to the OpenBiodiv triple store where all table and article metadata are stored (Penev et al. 2019). OpenBiodiv was also used as a taxon name knowledge base to expand the taxon hierarchy in the tables and to guide the merging of overlapping taxon hierarchies in a single row (e.g., host plant family + host plant species -> host plant species).
Taxon name resolution of species interactions was done under the assumption that two non-overlapping taxa are found in a single column. The exact interaction types between the species were not determined; instead, the general term labelled “interacts with” was used. Results: Annotation of biotic interactions via the Pensoft Annotator helped to identify 233 tables possibly containing biotic interactions out of the 6993 tables that were processed. Semantic annotation of taxonomic names within tables allowed GloBI to index the species including their complete taxonomic hierarchies. Currently, GloBI has indexed 2378 interactions, extracted from a subset of 46 of the 233 tables. Interactions extracted via this workflow are available on a special webpage on GloBI's website. Records of the communication behind this collaborative work between GloBI and Pensoft are publicly available. Discussion & Conclusion: One of the limitations of the workflow was the inability to detect the directionality of the interactions. In other words, the tables do not contain information about the subject and object of a given interaction. For instance, in a host-parasite interaction, we cannot automatically detect which species is the host and which is the parasite. We plan to address this issue by performing semantic analysis (e.g., part-of-speech tagging) of the table captions to determine the exact subjects and objects in the interactions. In addition, complicated table structures impeded both the processing of tables by the Pensoft Annotator and their parsing by GloBI’s algorithms. We recognise the importance of adopting common formats for sharing interaction data, a practice that would greatly improve the post-publication indexing of tables by GloBI. An example of a standardised table structure is the standard table template for primary biodiversity data, introduced by Pensoft (Penev et al. 2020).
The template helps authors create semantically enhanced tables, which in turn enables direct harvesting and conversion to interlinked FAIR (Findable, Accessible, Interoperable, and Reusable) data. Indexing of biotic interactions by GloBI and Pensoft demonstrates the advantages of storing semantically enhanced data in tables. The adoption of the standard appendix table for primary biodiversity data would improve our ability to extract biotic interactions and to transform scholarly narrative into fully interoperable Linked Open Data.
    Type of Medium: Online Resource
    ISSN: 2535-0897
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2020
    ZDB ID: 3028709-1
  • 9
    In: Biodiversity Information Science and Standards, Pensoft Publishers, Vol. 7 ( 2023-08-09)
    Abstract: OpenBiodiv is a biodiversity database (a knowledge graph based on Resource Description Framework, RDF) that contains information extracted from the scientific literature. It provides access to an ecosystem of tools and services, including a Linked Open Dataset, an ontology (OpenBiodiv-O) and a website (Dimitrova et al. 2021). Using the available data, OpenBiodiv discovers links between various biodiversity data types (e.g., taxon names, treatments, specimens, sequences, people and institutions), to answer a user’s questions about specific taxa, scientific articles, materials examined and others. The full-text XML content is converted into Linked Open Data from journals on the ARPHA Publishing Platform and treatments extracted by Plazi’s TreatmentBank (stored in the Biodiversity Literature Repository at Zenodo). The database is updated and indexed daily using a workflow based on the Apache Kafka event-streaming platform. The workflow was developed during the European Union-funded Biodiversity Community Integrated Knowledge Library (BiCIKL) project (Penev et al. 2022b). By 1 August 2023, the graph consisted of 24,939 articles; 167,471 treatments; 130,359 authors; 736,809 taxon names; 129,257 sequences; 1,390 institutions and collections; 117,854 figures; 18,585 tables; and 90,008 materials examined sections. Each semantic statement (e.g., authors, articles, treatments, taxonomic names, localities) has its own globally unique, persistent and resolvable identifier (GUPRI). There are four ways a user can explore the data on OpenBiodiv. General search: The search engine is accessible from the OpenBiodiv homepage. The user needs to type in a key term (e.g., a taxonomic name, authority or an article title), and the system retrieves information about it. Errors caused by misspellings are avoided due to the Elasticsearch index. It can also determine the semantic type of the searched entity.
Application Programming Interface (API): OpenBiodiv can be used through a RESTful API for programmatic access. The API is documented on Swagger. The API construction and functionalities follow the recommendations elaborated by the Technical Research Infrastructures forum of the BiCIKL project (Addink et al. 2023). User applications based on a query algorithm: This function can be applied for any data class. The method uses the relationships between an element type (e.g., taxon name) and the type of the section where it can be found. An application example is Literature exploration, designed to answer the question: Give me information about X mentioned within article section type Y. The results show the number of mentions of the entity (e.g., taxon name) in the section(s) of interest (e.g., Title, Abstract, Treatment). A click navigates the user to the place in the article that mentions the item (Fig. 1). SPARQL queries in a thematic context: OpenBiodiv provides a SPARQL endpoint through the Ontotext GraphDB solution. Several sample SPARQL queries are also available on the OpenBiodiv website.
    Type of Medium: Online Resource
    ISSN: 2535-0897
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2023
    ZDB ID: 3028709-1
  • 10
    In: Journal of the Bulgarian Geographical Society, Pensoft Publishers, Vol. 42 ( 2020-05-10), p. 12-17
    Abstract: This article aims to give the founder of Bulgarian geographic science, Acad. Anastas Ishirkov, his place of honour in this collection of papers, and also to highlight some lesser-known details of his research, socio-political and charitable activities. At the same time, the authors correct some inaccuracies in his biography that have accumulated over time and add new information about his life and activity. This was made possible by our research in the Archive of the Bulgarian Academy of Sciences and in the materials about him in the Gipson Archive, as well as by a careful reading of the handwritten autobiography of Acad. Ishirkov.
    Type of Medium: Online Resource
    ISSN: 2738-8115 , 2738-8107
    Language: Unknown
    Publisher: Pensoft Publishers
    Publication Date: 2020
    ZDB ID: 3121218-9