PubChem presents at the American Chemical Society National Meeting in San Diego (March 13-17, 2016)

The 251st American Chemical Society National Meeting, with the theme “Computers in Chemistry”, will be held in San Diego, CA, on March 13-17, 2016.  The PubChem team will be at the ACS meeting to present new developments and recent changes in PubChem.  Below is a list of presentations that will be given by the PubChem staff.


Day 1 (Sunday, March 13)

Day 2 (Monday, March 14)

Day 3 (Tuesday, March 15)

Day 4 (Wednesday, March 16)

Day 5 (Thursday, March 17)

Recent PubChem Publications: Read about What’s New!

The PubChem team published an article in the 2016 Nucleic Acids Research Database issue (Kim et al., Nucl. Acids Res., 2016, 44(D1), D1202-D1213, PMID: 26400175).  This article provides an overview of the PubChem Compound and Substance databases, including organization, contents, interfaces, programmatic access and other relevant tools and services.  Considerable changes have been made since these two databases were described in a previous paper published in 2008 (Bolton et al., Ann. Rep. Comput. Chem., 2008, 4, 217-241), and the newly published paper provides updated information on these resources.

Additional papers published about PubChem by the team in 2015 include:

To get a complete list of all articles published by the PubChem team, please visit the PubChem Publication page.

BioAssay Record Page Released

The PubChem BioAssay Record page is now available.  It complements a recent revamp of the PubChem Compound Summary page and the Substance Record page.

What is the BioAssay Record Page?

As explained in a previous post, PubChem organizes data into three primary databases: Substance, Compound, and BioAssay.  The BioAssay database contains over one million biological assay experiments containing more than 229 million bioactivity outcomes.  For each assay, PubChem now provides a BioAssay Record page (formerly called the Assay Summary page), which displays information provided by the data contributor about the assay as well as annotations and links to tools that support data interpretation and analysis.

What changed?

The key improvements include:

  • Technology refresh
    As with the recent update to the Compound Summary and the Substance Record pages, the new data-driven BioAssay interface is optimized for both touch- and mouse-based devices.  Using a responsive design, it automatically adapts to the available screen size, making it friendly for desktops, tablets, and mobile phones.
  • Data contents reorganized
    In the new BioAssay Record page, depositor-provided information is presented first, followed by annotations based on third-party curation and PubChem processing.  In the now deprecated Assay Summary page, depositor-provided information was intermingled with annotations from third-party curation and PubChem processing, often causing confusion about data provenance (i.e., the information source).
  • Improved data table
    While a full bioactivity data set is retrieved by default, the data table is partitioned according to activity outcomes (e.g., active, inactive, submicromolar activity, subnanomolar activity, and so on), allowing users to quickly filter results.  In addition, users can download the entire data table or a filtered subset.  To support comparative evaluation, a link to a cross-assay bioactivity analysis page is provided for each compound displayed in the data table.
  • Extended download functionality
    The top bar ‘Download’ button provides access to all downloadable data on the page.  This includes the depositor-provided description, data table results, chemical structures tested in an assay, and annotations shown in an individual section.
  • Integrated Related BioAssay Summary
    At the bottom of the BioAssay Record page, BioAssays from the same assay project and other related BioAssays are displayed in a tabular format, facilitating assay data interpretation and comparison.
  • URL change
    The BioAssay Record page uses a different URL from the now deprecated Assay Summary page.  For example, the URL for the BioAssay Record for AID 1284 is:

    Links to the now deprecated Assay Summary page will automatically redirect to the new location for the BioAssay Record page.
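
The partitioning of the data table by activity outcome can be pictured with a small sketch.  The rows, field names, and outcome labels below are made up for illustration and are not PubChem's actual data model or API.

```python
from collections import defaultdict

# Hypothetical rows from a bioassay data table (illustrative values only).
rows = [
    {"sid": 101, "cid": 11, "outcome": "active"},
    {"sid": 102, "cid": 12, "outcome": "inactive"},
    {"sid": 103, "cid": 13, "outcome": "active"},
    {"sid": 104, "cid": 14, "outcome": "inconclusive"},
]


def partition_by_outcome(table):
    """Group rows by their activity outcome, preserving row order."""
    groups = defaultdict(list)
    for row in table:
        groups[row["outcome"]].append(row)
    return dict(groups)


def filtered_subset(table, outcome):
    """Return only the rows with the requested outcome (empty if none)."""
    return partition_by_outcome(table).get(outcome, [])


partitions = partition_by_outcome(rows)
active_rows = filtered_subset(rows, "active")

# Row counts per outcome, then the CIDs in the "active" subset.
print({outcome: len(group) for outcome, group in partitions.items()})
print([row["cid"] for row in active_rows])
```

Because the full data set is loaded once and only grouped afterwards, downloading either the whole table or a filtered subset amounts to choosing which group to write out.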

Future plans?

The now deprecated Assay Summary page will remain accessible from a link at the top of the BioAssay Record page until May 2016.

A PubChem Target Summary page, which will help summarize available biological activity and annotation information in PubChem, is in progress.

PubChem adds a “legacy” designation for outdated data

Sometimes information provided to PubChem by data contributors becomes outdated.  To address this, PubChem is introducing a “legacy” designation for collections that are not regularly updated.  This “legacy” designation applies to projects and contributors that appear to no longer be active, as well as to their individual records.  This designation will help PubChem users quickly identify records that may have out-of-date information and/or hyperlinks.

Why a “legacy” designation?

As an archive, PubChem accepts scientific data from contributors and maintains that data even if the contributing project is discontinued. While this helps ensure community access to the information lasts beyond the lifetime of a given scientific endeavor, the archival nature of PubChem does not allow anyone other than the data contributor to modify provided information.  Therefore, some records in PubChem can persist with outdated (or incorrect) data.  To help identify such cases, we are introducing a “legacy” indication for contributors and their records.  Please note that this does not mean that data identified as “legacy” is without value.  Quite to the contrary, some legacy collections successfully collected valuable scientific data for the research community, and are simply no longer updating the information.

How is a “legacy” designation determined?

A “legacy” designation is arrived at via a partly manual, partly automated procedure.  It involves examining contributor account information, individual records, and user reports.  For example, if the depositor website does not work for a period of time, attempts are made to contact the submitting organization.  If PubChem staff are unable to make contact with the data contributor or if an organization is no longer updating records, a legacy designation may be initiated.  Please note that a “legacy” designation can be removed at any time, when contact is reestablished and updates resume.

Impacts of legacy designation?

If a data contributor is designated as “legacy”, all records deposited by the contributor are also designated as “legacy”.  While still searchable, these records will clearly indicate that they are “legacy”.  Please note that “legacy” records will not be shown in the “Chemical Vendors” section of Compound Summary pages.  In addition, in the “Substances by Category” section of the Compound Summary page, “legacy” substance records will be found only under “Legacy Depositors”.
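
The way the designation flows from a contributor to its records can be sketched as follows.  This is a minimal illustration under assumed names: the classes, fields, and sample depositors are hypothetical and say nothing about how PubChem stores or computes the flag internally.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    name: str
    legacy: bool = False


@dataclass
class SubstanceRecord:
    sid: int
    contributor: Contributor
    category: str

    @property
    def is_legacy(self) -> bool:
        # A record inherits the designation from its contributor.
        return self.contributor.legacy


def vendor_section(records):
    """Records shown under "Chemical Vendors": legacy records are left out."""
    return [r for r in records
            if r.category == "Chemical Vendors" and not r.is_legacy]


def substances_by_category(records):
    """Legacy records appear only under "Legacy Depositors"."""
    grouped = {}
    for r in records:
        key = "Legacy Depositors" if r.is_legacy else r.category
        grouped.setdefault(key, []).append(r.sid)
    return grouped


# Hypothetical depositors and records.
active = Contributor("Active Vendor")
retired = Contributor("Retired Vendor", legacy=True)
records = [
    SubstanceRecord(1, active, "Chemical Vendors"),
    SubstanceRecord(2, retired, "Chemical Vendors"),
]

print([r.sid for r in vendor_section(records)])
print(substances_by_category(records))
```

Keeping the flag on the contributor rather than copying it onto each record means that lifting the designation (setting `retired.legacy = False` in this sketch) immediately restores all of that contributor's records to their normal sections.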

Future plans?

The way PubChem implements both manual and automated processes to ascertain a “legacy” indication will likely evolve over time.  In addition, we are looking at the possibility of enabling users to separate out legacy records when searching and analyzing the database.

Laboratory Chemical Safety Summary (LCSS) views now available in PubChem

The PubChem Laboratory Chemical Safety Summary (LCSS) provides pertinent chemical health and safety data for a given PubChem Compound record.  The PubChem LCSS is a community effort involving professionals in health and safety, chemistry librarianship, informatics, and other specialties.

What is LCSS?

The LCSS is based on the format described by the National Research Council in Prudent Practices in the Laboratory: Handling and Management of Chemical Hazards.  Information contained in the PubChem LCSS is a subset of the PubChem Compound Summary page content.  It includes a summary of hazard and safety information for a chemical, such as flammability, toxicity, exposure limits, exposure symptoms, first aid, handling, and cleanup.

How can I access LCSS?

An LCSS is available for PubChem Compound records with a GHS (Globally Harmonized System of Classification and Labeling of Chemicals) hazard classification.  If a PubChem Compound record has an LCSS, the link to view it is provided at the top of the page under the heading “Safety Summary”.  In addition, one can get the complete list of chemicals with an LCSS by visiting the PubChem LCSS webpage or by using the PubChem Classification Browser.

To learn more about LCSS in PubChem, please explore the following webpages:

Significant Update to PubChemRDF!

PubChemRDF 1.5β is now available.  The new version is faster, supports linked data in new formats, features improved search and query functions, and contains new links.

What is PubChemRDF?

PubChemRDF expresses data in a Resource Description Framework (RDF) format using ontological frameworks and semantic web technologies.  It facilitates data sharing and analysis, and integrates with other National Center for Biotechnology Information (NCBI) resources along with external resources across scientific domains.  To learn more about this project, please see our earlier blog post and PubChemRDF release notes.

What is new in PubChemRDF 1.5β?

The 1.5β release contains a number of new features and technological improvements including:

  • Improved speed
    PubChemRDF data is now served from a triple-store and provides a noticeable speed improvement, especially for records with lots of data.  Previously, RDF was generated on the fly from data stored in disparate data systems.
  • Addition of MeSH
    Major improvements were made to the reference subdomain.  Most notable is the addition of Medical Subject Headings (MeSH) annotation of PubMed records.  This includes MeSH topical descriptors (with optional qualifier) that indicate the subject of an article and MeSH (supplementary) concepts that indicate things like chemicals and diseases discussed in an article.
  • Direct links to authoritative RDF resources
    PubChemRDF now enhances cross-integration by providing direct links to available authoritative RDF resources within applicable subdomains, including: reference, synonym, and inchikey to MeSH RDF; protein to UniProt RDF; protein and substance to PDB RDF; biosystem to Reactome RDF; substance to ChEMBL RDF; and compound to WikiData RDF.  For example, the links to PDB RDF help to distinguish proteins and associated chemical substances found in a Protein Data Bank (PDB) crystal structure.
  • Addition of ‘concept’ subdomain
    A new ‘concept’ subdomain provides the means to annotate PubChemRDF subdomains.  For example, annotation between nodes within the concept subdomain allows a hierarchy of concepts to be created, such as those in the WHO ATC classification.  These can then be applied, such as in the case of adding links from chemical substance synonyms to a WHO ATC classification to indicate its therapeutic and pharmacological properties.
  • New links added between the compound and biosystem subdomains
    Previously, the biosystem subdomain linked only to the protein subdomain.  The added links between the compound and biosystem subdomains help to indicate the chemical structure involved in a given pathway.
  • Support for protein complexes
    Protein complex targets are now distinguished within the bioassay subdomain and are linked to the component protein units.
  • Linked Data using JSON
    JSON-LD (or JavaScript Object Notation for Linked Data) is a method of transporting Linked Data using JSON. This addition helps those wanting to use JSON formatted data, for example, with JavaScript.
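
To give a feel for what Linked Data in JSON looks like, here is a minimal sketch using only Python's standard `json` module.  All identifiers, the `ex:` vocabulary, and the property names are placeholders chosen for illustration; they are not actual PubChemRDF URIs or predicates, and the `expand_term` helper is a simplified stand-in for a real JSON-LD processor.

```python
import json

# A tiny JSON-LD document.  The "@context" maps short JSON keys to IRIs,
# so plain JSON consumers can ignore it while Linked Data tools can expand it.
doc = {
    "@context": {
        "ex": "http://example.org/vocab#",
        "name": "ex:name",
        # "@type": "@id" marks the value as a link to another resource.
        "pathway": {"@id": "ex:participatesIn", "@type": "@id"},
    },
    "@id": "http://example.org/compound/CID0001",
    "name": "example compound",
    "pathway": "http://example.org/biosystem/BSID0001",
}


def expand_term(term, context):
    """Resolve a short key to its full IRI (handles only simple prefixes)."""
    definition = context[term]
    iri = definition["@id"] if isinstance(definition, dict) else definition
    prefix, _, local = iri.partition(":")
    return context[prefix] + local if prefix in context else iri


# JSON-LD is ordinary JSON, so it round-trips through any JSON library.
serialized = json.dumps(doc, indent=2)
parsed = json.loads(serialized)

print(parsed["name"])
print(expand_term("pathway", parsed["@context"]))
```

The same document therefore serves two audiences: JavaScript or Python code can read `name` and `pathway` as plain keys, while a Linked Data consumer can use the context to recover the underlying subject, predicate, and object.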

Where can I learn more about PubChemRDF?

To read more on this topic, please consider exploring these links:

Substance Record Page Released

The PubChem Substance Record page is now available.  It complements an update of the PubChem Compound Summary page released six months ago.

What is the Substance Record Page?

PubChem organizes its data into three databases: Substance, Compound, and BioAssay.  PubChem Substance (accession SID) contains nearly 200 million chemical substance descriptions provided by hundreds of data contributors.  Each record has a webpage that displays the information provided by an individual contributor about a particular chemical substance.  This page is called the Substance Record page, and it replaces the PubChem Substance Summary page.

What changed?

The key improvements include:

  • Technology refresh
    As with the recent update to the Compound Summary page, this new page loads much faster by minimizing the amount of data and the time to respond to requests.  The new interface is optimized for both touch- and mouse-based devices.  Using a responsive design, it automatically adapts to the available screen size, making it friendly for desktops, tablets, and phones.
  • What you see is what we got
    The new page is renamed the Substance Record page as it clearly shows the information provided to PubChem by the contributor.  The older page was called the PubChem Substance Summary page and included additional derived annotation (making it confusing to understand what the contributor provided) and a direct interface to the Compound Summary page (adding user confusion as to the difference between a compound and substance record).
  • URL change
    The old URLs from the Substance Summary page will automatically redirect to the new location for the Substance Record page.  For example, the URL for the Substance Record for SID 12345 is now:

Future plans?

The legacy Substance Summary page will be accessible until October 1, 2015; however, a redirect to the new pages will remain in place.

Our next focus will be on redesigning the BioAssay Summary page.