Significant Update to PubChemRDF!

PubChemRDF 1.5β is now available.  The new version is faster, supports linked data in new formats, features improved search and query functions, and contains new links.

What is PubChemRDF?

PubChemRDF expresses data in a Resource Description Framework (RDF) format using ontological frameworks and semantic web technologies.  It facilitates data sharing and analysis, and integrates with other National Center for Biotechnology Information (NCBI) resources along with external resources across scientific domains.  To learn more about this project, please see our earlier blog post and PubChemRDF release notes.

What is new in PubChemRDF 1.5β?

The 1.5β release contains a number of new features and technological improvements including:

  • Faster Speed
    PubChemRDF data is now served from a triple store, which is noticeably faster, especially for records with lots of data.  Previously, RDF was generated on the fly from data stored in disparate data systems.
  • Addition of MeSH
    Major improvements were made to the reference subdomain.  Most notable is the addition of Medical Subject Headings (MeSH) annotation of PubMed records.  This includes MeSH topical descriptors (with optional qualifiers) that indicate the subject of an article and MeSH (supplementary) concepts that indicate topics such as chemicals and diseases discussed in an article.
  • Direct links to authoritative RDF resources
    PubChemRDF now enhances cross-integration by providing direct links to available authoritative RDF resources within applicable subdomains, including: reference, synonym, and inchikey to MeSH RDF; protein to UniProt RDF; protein and substance to PDB RDF; biosystem to Reactome RDF; substance to ChEMBL RDF; and compound to Wikidata RDF.  For example, the links to PDB RDF help to distinguish proteins and associated chemical substances found in a Protein Data Bank (PDB) crystal structure.
  • Addition of ‘concept’ subdomain
    A new ‘concept’ subdomain provides the means to annotate PubChemRDF subdomains.  For example, annotation between nodes within the concept subdomain allows a hierarchy of concepts to be created, such as those in the WHO ATC classification.  These concepts can then be applied to other subdomains; for example, links from chemical substance synonyms to a WHO ATC classification indicate a substance's therapeutic and pharmacological properties.
  • New links added between the compound and biosystem subdomains
    Previously, the biosystem subdomain linked only to the protein subdomain.  The added links between the compound and biosystem subdomains help to indicate the chemical structure involved in a given pathway.
  • Support for protein complexes
    Protein complex targets are now distinguished within the bioassay subdomain and are linked to the component protein units.
  • Linked Data using JSON
    JSON-LD (or JavaScript Object Notation for Linked Data) is a method of transporting Linked Data using JSON. This addition helps those who want to work with JSON-formatted data, for example in JavaScript.
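
To give a feel for what working with JSON-LD looks like, here is a minimal Python sketch that reads a small JSON-LD document and lists the links it contains as subject, predicate, object triples. The document in the sketch is made up for illustration and is not actual PubChemRDF output; in particular, the choice of predicate (skos:closeMatch) and the helper names expand and triples are assumptions. It handles only a flat, single-node document with prefix-style contexts; a full JSON-LD processor does considerably more.

```python
import json

# Hypothetical JSON-LD fragment. The predicate and the overall shape are
# illustrative assumptions, not a copy of real PubChemRDF output.
DOC = """
{
  "@context": {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "compound": "http://rdf.ncbi.nlm.nih.gov/pubchem/compound/"
  },
  "@id": "compound:CID2244",
  "skos:closeMatch": [
    {"@id": "http://www.wikidata.org/entity/Q18216"}
  ]
}
"""


def expand(term, context):
    """Expand a compact IRI such as 'compound:CID2244' using @context prefixes."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return term  # already a full IRI, or an unknown prefix


def triples(doc):
    """Yield (subject, predicate, object) for a flat, single-node document."""
    context = doc.get("@context", {})
    subject = expand(doc["@id"], context)
    for key, values in doc.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @id
        predicate = expand(key, context)
        if not isinstance(values, list):
            values = [values]
        for value in values:
            if isinstance(value, dict):
                yield subject, predicate, expand(value["@id"], context)
            else:
                yield subject, predicate, value  # plain literal


if __name__ == "__main__":
    for s, p, o in triples(json.loads(DOC)):
        print(s, p, o)
```

Because JSON-LD is ordinary JSON, the standard json module is enough to load it; the only Linked Data step in this sketch is expanding the compact names against the @context.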

Where can I learn more about PubChemRDF?

To read more on this topic, please consider exploring these links: