WikiPathways Metabolomics

On this page we collect SPARQL queries that show the state of the Metabolome in WikiPathways. Triggered by Andra’s RDF / SPARQL work, curation started with metabolites that lack database identifiers. This soon led to the observation that metabolites are often not even annotated as metabolites: they are encoded as a <Label> rather than a <DataNode>.

The Data

You can look up the latest revision of the data with:

select (str(?o) as ?version) where {
  ?pw a void:Dataset ;
    dcterms:title ?o .
}
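
The queries on this page appear to rely on prefixes that the WikiPathways SPARQL endpoint predefines. When running them elsewhere, the prefixes have to be declared explicitly. A minimal sketch for the version query, assuming the usual namespace IRIs for void: and dcterms: (worth checking against the endpoint's own prefix list):

```sparql
# Namespace IRIs below are assumptions based on the standard vocabularies
prefix void:    <http://rdfs.org/ns/void#>
prefix dcterms: <http://purl.org/dc/terms/>

select (str(?o) as ?version) where {
  ?pw a void:Dataset ;
    dcterms:title ?o .
}
```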

Metabolome

The following queries provide an overview of the Metabolome captured by WikiPathways.

The key type for metabolites is wp:Metabolite. We can see all properties used on metabolites with:

select distinct ?p where {
  ?mb a wp:Metabolite ;
    ?p ?o .
} order by ?p

All Metabolites

We can get the count of metabolite datanodes in WikiPathways with:

select (count(distinct ?mb) as ?count) where {
  ?mb a wp:Metabolite .
}

As a list:

select distinct ?mb ?label where {
  ?mb a wp:Metabolite ;
     rdfs:label ?label .
}
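
The full list is long, so it can help to narrow it down by label. A sketch that keeps only metabolites whose label contains a search term; the term "glucose" is just an illustrative choice, and the prefix IRIs are assumed to match those predefined by the endpoint:

```sparql
prefix wp:   <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# "glucose" is an example search term, matched case-insensitively
select distinct ?mb ?label where {
  ?mb a wp:Metabolite ;
     rdfs:label ?label .
  filter(contains(lcase(str(?label)), "glucose"))
} order by ?label
```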

Or the metabolites in zebrafish pathways only:

select distinct ?metabolite (str(?titleLit) as ?title) where {
  ?metabolite a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw dc:title ?titleLit ;
    wp:organismName "Danio rerio" .
}
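
The same pattern can be turned into an overview for all species at once. A sketch that counts the distinct metabolite datanodes per organism name, using only the properties from the zebrafish query (prefix IRIs are assumed to match those predefined by the endpoint):

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>

# One row per organism name, with the number of distinct metabolite datanodes
select (str(?orgName) as ?organism) (count(distinct ?metabolite) as ?count) where {
  ?metabolite a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw wp:organismName ?orgName .
} group by ?orgName order by desc(?count)
```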

Metabolic Data Sources

Sorted by use

ChEBI, HMDB, and LIPID MAPS are the main data sources for identifiers:

select (str(?datasource) as ?source) (count(distinct ?identifier) as ?count)
where {
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
} group by ?datasource order by desc(?count)

All metabolites from one source

All KEGG identifiers

This SPARQL query lists all KEGG Compound identifiers used to annotate metabolite datanodes:

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "KEGG Compound" ;
    dc:identifier ?identifier .
} order by ?identifier

All HMDB identifiers

Return all HMDB identifiers with:

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB" ;
    dc:identifier ?identifier .
} order by ?identifier
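
The two queries above differ only in the data source name, so they can be combined. A sketch using a VALUES clause, assuming the source names match as plain literals exactly as they do in the queries above (prefix IRIs are assumed to match those predefined by the endpoint):

```sparql
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix dc: <http://purl.org/dc/elements/1.1/>

select distinct ?source ?identifier
where {
  # Add or remove data source names here as needed
  values ?source { "KEGG Compound" "HMDB" }
  ?mb a wp:Metabolite ;
    dc:source ?source ;
    dc:identifier ?identifier .
} order by ?source ?identifier
```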

Metabolic Pathways

Of general interest is the number of pathways per species:

select (str(?orgName) as ?organism) (count(?pw) as ?pathways) where {
  ?pw wp:organismName ?orgName .
} group by ?orgName order by desc(?pathways)

Metabolomes

Human Metabolome

prefix ncbi:    <http://purl.obolibrary.org/obo/NCBITaxon_>

select distinct ?mb where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw wp:organism ncbi:9606 .
} order by ?mb

Arabidopsis thaliana Metabolome

prefix ncbi:    <http://purl.obolibrary.org/obo/NCBITaxon_>

select distinct ?mb where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw wp:organism ncbi:3702 .
} order by ?mb
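
The two species queries can be combined to look at the overlap. A sketch that lists metabolite datanodes found in both a human and an Arabidopsis thaliana pathway. It assumes that the same ?mb resource is shared between pathways, so metabolites annotated with different identifiers in the two species will be missed:

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix ncbi:    <http://purl.obolibrary.org/obo/NCBITaxon_>

select distinct ?mb where {
  # The same datanode must be part of a pathway of each species
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?humanPw, ?plantPw .
  ?humanPw wp:organism ncbi:9606 .
  ?plantPw wp:organism ncbi:3702 .
} order by ?mb
```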

Pathways with the most metabolites

select ?pathway (count(distinct ?mb) as ?mbCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
} group by ?pathway order by desc(?mbCount)
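
Pathway IRIs alone are hard to read. A sketch of a variant that adds the pathway title and keeps only the top ten; the limit of 10 is an arbitrary choice, and the prefix IRIs are assumed to match those predefined by the endpoint:

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

# sample() picks one title per pathway in case several are present
select ?pathway (sample(str(?titleLit)) as ?title) (count(distinct ?mb) as ?mbCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
  ?pathway dc:title ?titleLit .
} group by ?pathway order by desc(?mbCount) limit 10
```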

Metabolites in the most Pathways

Note that BridgeDb identifier mapping is not involved yet: the results are based on metabolite datanodes, not unique metabolites.

select ?mb (count(distinct ?pathway) as ?pwCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
} group by ?mb order by desc(?pwCount)
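
One possible way to get closer to unique metabolites is to group by a mapped identifier instead of by datanode. A sketch, assuming the wp:bdbWikidata mappings used in the Enzymatic reactions query on this page are available for metabolites; datanodes without such a mapping are dropped from the result:

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>

# Datanodes that map to the same Wikidata item are counted together
select ?wikidata (count(distinct ?pathway) as ?pwCount)
where {
  ?mb a wp:Metabolite ;
    wp:bdbWikidata ?wikidata ;
    dcterms:isPartOf ?pathway .
} group by ?wikidata order by desc(?pwCount)
```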

Enzymatic reactions

SELECT DISTINCT ?wpid ?catalyst ?source ?sourceDb ?target ?targetDb WHERE {
  ?pathway a wp:Pathway ;
      dc:identifier / dcterms:identifier ?wpid .
  # ?catalysis a wp:Catalysis .
  ?catalysis dcterms:isPartOf ?pathway ;
    wp:source / rdfs:label ?catalyst ;
    wp:participants ?reaction .
  ?reaction a wp:Interaction .
  ?reaction wp:source ?source .
  ?source a wp:Metabolite .
  OPTIONAL { ?source wp:bdbWikidata ?sourceDb . }

  ?reaction wp:target ?target .
  ?target a wp:Metabolite .
  OPTIONAL { ?target wp:bdbWikidata ?targetDb . }
} ORDER BY ASC(?source)
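
For a quick overview, the metabolite-to-metabolite interactions can also be counted per pathway. A sketch that leaves out the catalyst and assumes that interactions are linked to their pathway with dcterms:isPartOf, in the same way as the catalysis nodes in the query above:

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>

select ?pathway (count(distinct ?reaction) as ?reactions) where {
  ?pathway a wp:Pathway .
  # Assumption: interactions carry dcterms:isPartOf to their pathway
  ?reaction a wp:Interaction ;
    dcterms:isPartOf ?pathway ;
    wp:source ?source ;
    wp:target ?target .
  ?source a wp:Metabolite .
  ?target a wp:Metabolite .
} group by ?pathway order by desc(?reactions)
```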
