New in the 1.5 API release
Updated versions of the data sets and new features
1) Updated data sources
The data sets, with their download dates and version numbers:
UniProt (manually curated entries only) from 28 Jan 2015, release 2015_1
ENZYME from 02 Feb 2015, release 2015_1
DrugBank from 19 Feb 2015, version 4.1
ChEMBL from 18 Feb 2015, ChEMBL 20
ChEBI from 04 Mar 2015, Release 125
FDA Adverse Events (FAERS) data, 09 Jul 2012
Gene Ontology, 04 Mar 2015
Gene Ontology Annotations, 17 Feb 2015
WikiPathways, 20 Mar 2015, v20150312
DisGeNET, 31 Mar 2015, v2.1.0
Various SPARQL optimizations have been done to improve API calls and results.
Open PHACTS Chemical Registration Service (OCRS)
The IMS has been enriched with additional patterns for various datasets.
Quality assurance checks compared the API results against the native data sources to confirm that the content is the same.
2) Several new features
· New filter for the Tissue API calls. A new filter has been added to Tissues for Protein and Protein for Tissues.
· New filter for the Disease API calls. A new filter, assoc_type, has been added to Associations for Disease: Count, Associations for Disease: List, Associations for Target: Count, and Associations for Target: List. This new parameter filters the associations by their SIO identifier:
o sio:SIO_001119 rdfs:label "gene-disease association linked with causal mutation"
o sio:SIO_001120 rdfs:label "therapeutic gene-disease association"
o sio:SIO_001121 rdfs:label "gene-disease biomarker association"
o sio:SIO_001122 rdfs:label "gene-disease association linked with genetic variation"
o sio:SIO_001123 rdfs:label "gene-disease association linked with altered gene expression"
o sio:SIO_001124 rdfs:label "gene-disease association linked with post-translational modification"
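As a minimal sketch of how the assoc_type filter might be applied from client code, the Python below validates an SIO identifier against the list above and appends it to a query URL. The endpoint URL, the uri parameter name, and the choice to pass the SIO identifier as a full URI under http://semanticscience.org/resource/ are assumptions made for illustration; the API documentation gives the exact call paths and accepted value format.

```python
from urllib.parse import urlencode

# SIO identifiers and labels exactly as listed in these release notes.
ASSOC_TYPES = {
    "SIO_001119": "gene-disease association linked with causal mutation",
    "SIO_001120": "therapeutic gene-disease association",
    "SIO_001121": "gene-disease biomarker association",
    "SIO_001122": "gene-disease association linked with genetic variation",
    "SIO_001123": "gene-disease association linked with altered gene expression",
    "SIO_001124": "gene-disease association linked with post-translational modification",
}

# Assumption: the filter value is sent as a full SIO URI.
SIO_NAMESPACE = "http://semanticscience.org/resource/"


def build_assoc_url(endpoint, uri, assoc_type=None):
    """Build a query URL, optionally filtered by association type.

    `endpoint` and the `uri` parameter name are hypothetical placeholders.
    """
    if assoc_type is not None and assoc_type not in ASSOC_TYPES:
        raise ValueError(f"unknown association type: {assoc_type}")
    params = {"uri": uri}
    if assoc_type is not None:
        params["assoc_type"] = SIO_NAMESPACE + assoc_type
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder endpoint and disease URI, for illustration only.
    print(build_assoc_url(
        "https://api.example.org/1.5/disease/assoc/byDisease",
        "http://example.org/disease/123",
        assoc_type="SIO_001121",
    ))
```

Omitting assoc_type leaves the query unfiltered, which matches the filter being an optional addition to the existing calls.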
3) Response format changes
The color-coded 3-Scale documentation reflects the optional and required parameters. The documentation also has the example query URIs pre-loaded in the query box, as well as dropdown menus for parameter filters that have fewer than 100 options.
OCRS matches for compounds are now optional: Compounds can be returned without an OCRS identifier. This allows the use of new ChEMBL_20 URIs to retrieve results from Compound Information and Compound Pharmacology.
This modification is temporary. The requirement for the presence of an OCRS identifier will be reinstated in the next release of the platform.
Batch calls: The API exposes only skos:exactMatch relationships between instances, regardless of the underlying mapping in the IMS. For example, the itemized list now has the same structure as Target Information results.
Target Class members: The requirement for the presence in ChEMBL has been removed for ENZYME proteins, and UniProt is now the main data source for the proteins displayed in the result. Therefore, all proteins from UniProt will be returned in the result, with ChEMBL data as a sub-block when present.
DrugBank results: DrugBank has added language tags (_en) to their data set and this is now reflected in the results returned by the API.
GO data set: The “primary topic” URL for GO terms is now given in the form http://purl.obolibrary.org/obo/GO_0000000. It was previously given in the form http://purl.org/obo/owl/GO#GO_0000000. If the http://purl.org/obo/owl/GO# form is used as input, the http://purl.obolibrary.org/obo/ form is returned in a primary topic block, and the http://purl.org/obo/owl/GO# input is returned in an exact match block.
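Client code that stored GO URIs in the older form may want to normalize them before comparing against API results. The following is a small sketch of that prefix rewrite; the function name is hypothetical, and only the two URL prefixes come from these release notes.

```python
# Prefixes taken from the release notes above.
OLD_GO_PREFIX = "http://purl.org/obo/owl/GO#"
NEW_GO_PREFIX = "http://purl.obolibrary.org/obo/"


def to_current_go_uri(uri):
    """Rewrite an old-form GO URI to the form returned as primary topic.

    URIs that do not start with the old prefix are returned unchanged.
    """
    if uri.startswith(OLD_GO_PREFIX):
        return NEW_GO_PREFIX + uri[len(OLD_GO_PREFIX):]
    return uri


if __name__ == "__main__":
    print(to_current_go_uri("http://purl.org/obo/owl/GO#GO_0008150"))
```

Because the API still accepts the older form as input and echoes it in an exact match block, this rewrite is only needed on the client side, for example when matching stored identifiers against the primary topic of a response.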
4) Fixes for previous issues
Target Information: All targets include amino acid sequences.
Associations for disease: The requirement to return the disease class has been removed. Get targets for disease and Association for disease now return the same data as the DisGeNET source.
Pathway for Target: Previously missing pathways for the specified target are now returned.
Statistics on 1.5 Release
Linked data cache:
Identifier mapping service:
186 Mapping Sets
50 Source Data Sources
50 Target Data Sources