Wikidata links entities using predicates such as subclass of (P279). These links form a classification hierarchy, but because the statements come from many sources, the hierarchy may not follow the rules that curated ontologies do (for example, it can contain cycles).
OntoBio includes a Wikidata ontology factory, so we can transparently create an Ontology object from Wikidata and use the same methods that are available for any other ontobio ontology.
This example focuses on anxiety disorders.
from ontobio.ontol_factory import OntologyFactory
f = OntologyFactory()
## OntologyFactory recognizes the prefix wdq for wikidata queries;
## We use this to make a sub-ontology
## (there is currently no lazy wrapper for Wikidata, only an eager one, so we keep the sub-ontology small)
ont = f.create('wdq:Q544006') # Anxiety disorder
WARNING:rdflib.term: does not look like a valid URI, trying to serialize this will break.
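To make the idea of a P279-based sub-ontology more concrete, here is a minimal sketch of a SPARQL query that would fetch the subclass-of edges at or below a given item. The helper name `subclass_closure_query` is hypothetical, and this is not necessarily the query that ontobio sends for a `wdq:` handle; it only illustrates the shape of the data. The `wd:` and `wdt:` prefixes are the ones predefined by the Wikidata Query Service.

```python
def subclass_closure_query(qid):
    """Return a SPARQL query string for P279 edges at or below `qid`.

    Hypothetical helper for illustration only; not part of ontobio.
    """
    return (
        "SELECT ?child ?parent WHERE {\n"
        # ?child is the item itself or any transitive subclass of it
        "  ?child wdt:P279* wd:%s .\n"
        # each direct subclass-of edge leaving such a child
        "  ?child wdt:P279 ?parent .\n"
        "}" % qid
    )

query = subclass_closure_query("Q544006")  # Anxiety disorder
print(query)
```

Because the `P279*` path is unbounded, a query like this can return a very large result for a general item, which is one reason to start from a fairly specific one.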
## Find terms starting with Anxiety in the sub-ontology
qids = ont.search('Anxiety%')
qids
[rdflib.term.URIRef('http://www.wikidata.org/entity/Q544006')]
## Traverse up and down from query node in our sub-ontology
nodes = ont.traverse_nodes(qids, up=True, down=True)
labels = [ont.label(n) for n in nodes]
labels[:25]
['Aktualneurosen', 'cognitive disorder', 'Anti-French sentiment in the United States', 'acarophobia', 'Organic disease', 'identifier', 'Alektorophobia', 'Katagelasticism', 'answer', 'Counterphobic attitude', 'compulsive act', 'physical condition', 'Piblokto', 'blood phobia', 'category of being', 'Childhood phobias', 'ability', 'disposition', 'Entomophobia', 'physiological condition', 'property', 'Cynophobia', 'neurosis effects', 'bowel-control anxiety', 'Anxiety disorder']
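The label list mixes specific phobias with very general terms such as 'category of being' and 'property', presumably because `up=True` pulls in every ancestor of the query node. The following is a minimal sketch of that kind of traversal on a toy graph using networkx directly. The toy labels, the `traverse` helper, and the parent-to-child edge direction are assumptions of this sketch, not ontobio's implementation.

```python
import networkx as nx

# Toy hierarchy; edges point from parent to child (an assumption of this sketch).
toy = nx.DiGraph()
toy.add_edges_from([
    ("disease", "mental disorder"),
    ("mental disorder", "anxiety disorder"),
    ("anxiety disorder", "phobia"),
    ("phobia", "social phobia"),
    ("mental disorder", "mood disorder"),
])

def traverse(graph, start_nodes, up=True, down=True):
    """Return the start nodes plus their ancestors and/or descendants."""
    found = set(start_nodes)
    for n in start_nodes:
        if up:
            found |= nx.ancestors(graph, n)
        if down:
            found |= nx.descendants(graph, n)
    return found

both = traverse(toy, ["anxiety disorder"])
down_only = traverse(toy, ["anxiety disorder"], up=False)
print(sorted(both))
print(sorted(down_only))
```

Note that siblings of the query node ('mood disorder' here) are not reached, since they are neither ancestors nor descendants.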
## Test for cycles
import networkx as nx
g = ont.get_graph()
def show_cycle(nl):
    print(["{} {}".format(n, ont.label(n)) for n in nl])
cycles_list = list(nx.simple_cycles(g))
show_cycle(cycles_list[0])
['http://www.wikidata.org/entity/Q1347367 ability', 'http://www.wikidata.org/entity/Q151885 concept', 'http://www.wikidata.org/entity/Q9081 knowledge', 'http://www.wikidata.org/entity/Q3695082 sign', 'http://www.wikidata.org/entity/Q853614 identifier', 'http://www.wikidata.org/entity/Q937228 property']
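Enumerating every simple cycle can be expensive on a large graph. If we only need to know whether the graph is acyclic, or want a single example cycle, networkx offers cheaper checks. A minimal sketch on a toy graph (the node names are made up for illustration):

```python
import networkx as nx

toy = nx.DiGraph()
toy.add_edges_from([
    ("a", "b"), ("b", "c"), ("c", "a"),  # a three-node cycle
    ("c", "d"),                          # an acyclic branch
])

# Cheap yes/no test
is_dag = nx.is_directed_acyclic_graph(toy)

# Full enumeration, as in the cell above
cycles = list(nx.simple_cycles(toy))

# One example cycle as a list of edges; raises if the graph has none
try:
    first_cycle_edges = nx.find_cycle(toy)
except nx.NetworkXNoCycle:
    first_cycle_edges = []

print(is_dag, cycles, first_cycle_edges)
```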
## Show our extract of the sub-ontology as an ascii tree
## (note this is resilient to cycles)
## only traverse down from our query nodes
## (including ancestors causes multiple paths, and a verbose display)
nodes = ont.traverse_nodes(qids, up=False, down=True)
from ontobio.io.ontol_renderers import GraphRenderer
w = GraphRenderer.create('tree')
w.write_subgraph(ont, nodes, query_ids=qids)
. http://www.wikidata.org/entity/Q544006 ! Anxiety disorder *
% http://www.wikidata.org/entity/Q741713 ! panic disorder
% http://www.wikidata.org/entity/Q6374996 ! Katagelasticism
% http://www.wikidata.org/entity/Q845224 ! generalized anxiety disorder
% http://www.wikidata.org/entity/Q377493 ! selective mutism
% http://www.wikidata.org/entity/Q5354941 ! Elective mutism
% http://www.wikidata.org/entity/Q202387 ! post-traumatic stress disorder
% http://www.wikidata.org/entity/Q10547816 ! Counterphobic attitude
% http://www.wikidata.org/entity/Q13604751 ! lovesickness
% http://www.wikidata.org/entity/Q1316515 ! School refusal
% http://www.wikidata.org/entity/Q4386741 ! Olfactory Reference Syndrome
% http://www.wikidata.org/entity/Q424221 ! acute stress disorder
% http://www.wikidata.org/entity/Q1482034 ! combat disorder
% http://www.wikidata.org/entity/Q18967153 ! mixed disorder as reaction to stress
% http://www.wikidata.org/entity/Q18967156 ! acute stress reaction with predominant disturbance of consciousness
% http://www.wikidata.org/entity/Q178190 ! obsessive-compulsive disorder
% http://www.wikidata.org/entity/Q7458802 ! Sexual obsessions
% http://www.wikidata.org/entity/Q231624 ! compulsive act
% http://www.wikidata.org/entity/Q7310756 ! Relationship obsessive–compulsive disorder
% http://www.wikidata.org/entity/Q19000444 ! neurotic disorder
% http://www.wikidata.org/entity/Q181032 ! neurosis effects
% http://www.wikidata.org/entity/Q144119 ! hysteria
% http://www.wikidata.org/entity/Q336203 ! Abwehrhysterie
% http://www.wikidata.org/entity/Q1779438 ! Piblokto
% http://www.wikidata.org/entity/Q423509 ! Aktualneurosen
% http://www.wikidata.org/entity/Q2300749 ! separation anxiety disorder
% http://www.wikidata.org/entity/Q19000931 ! organic anxiety disorder
% http://www.wikidata.org/entity/Q175854 ! phobia
% http://www.wikidata.org/entity/Q560107 ! Tryophobia
% http://www.wikidata.org/entity/Q1343559 ! ochlophobia
% http://www.wikidata.org/entity/Q980010 ! Tokophobia
% http://www.wikidata.org/entity/Q5097985 ! Childhood phobias
% http://www.wikidata.org/entity/Q909355 ! Francophobia
% http://www.wikidata.org/entity/Q3427834 ! Anti-French sentiment in the United States
% http://www.wikidata.org/entity/Q174589 ! agoraphobia
% http://www.wikidata.org/entity/Q22906231 ! Afrophobia
% http://www.wikidata.org/entity/Q1363791 ! erythrophobia
% http://www.wikidata.org/entity/Q13 ! triskaidekaphobia
% http://www.wikidata.org/entity/Q2015728 ! specific phobia
% http://www.wikidata.org/entity/Q944108 ! animal phobia
% http://www.wikidata.org/entity/Q619261 ! Ornithophobia
% http://www.wikidata.org/entity/Q4694196 ! Agrizoophobia
% http://www.wikidata.org/entity/Q3321265 ! Fear of fish
% http://www.wikidata.org/entity/Q596505 ! Ophidiophobia
% http://www.wikidata.org/entity/Q4422074 ! Vermiphobia
% http://www.wikidata.org/entity/Q405385 ! Ailurophobia
% http://www.wikidata.org/entity/Q4297397 ! Fear of frogs
% http://www.wikidata.org/entity/Q2319444 ! Herpetophobia
% http://www.wikidata.org/entity/Q38579 ! Cynophobia
% http://www.wikidata.org/entity/Q5384517 ! Equinophobia
% http://www.wikidata.org/entity/Q2157130 ! Entomophobia
% http://www.wikidata.org/entity/Q2160101 ! Fear of bees
% http://www.wikidata.org/entity/Q2822642 ! acarophobia
% http://www.wikidata.org/entity/Q220783 ! arachnophobia
% http://www.wikidata.org/entity/Q3440772 ! Fear of mice
% http://www.wikidata.org/entity/Q16002436 ! Alektorophobia
% http://www.wikidata.org/entity/Q5439392 ! Fear of bats
% http://www.wikidata.org/entity/Q3381344 ! Blood-injection-injury type phobia
% http://www.wikidata.org/entity/Q886731 ! blood phobia
% http://www.wikidata.org/entity/Q6034425 ! Injury phobia
% http://www.wikidata.org/entity/Q169922 ! Fear of needles
% http://www.wikidata.org/entity/Q1127417 ! flying phobia
% http://www.wikidata.org/entity/Q3052614 ! nosophobia
% http://www.wikidata.org/entity/Q18557105 ! cancerophobia
% http://www.wikidata.org/entity/Q18557109 ! AIDS phobia
% http://www.wikidata.org/entity/Q281928 ! social phobia
% http://www.wikidata.org/entity/Q17147649 ! Specific social phobia
% http://www.wikidata.org/entity/Q1335831 ! paruresis
% http://www.wikidata.org/entity/Q612851 ! Telephone phobia
% http://www.wikidata.org/entity/Q7136497 ! Parcopresis
% http://www.wikidata.org/entity/Q2540262 ! Glossophobia
% http://www.wikidata.org/entity/Q3219948 ! bowel-control anxiety
% http://www.wikidata.org/entity/Q168995 ! Surdophobia
% http://www.wikidata.org/entity/Q1131359 ! Amaxophobia
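A tree writer over a graph that may contain cycles has to avoid recursing forever. One common approach, sketched below, is to track the nodes on the current path and stop expanding a node that is already on it. This is a toy illustration under that assumption, not ontobio's renderer: the `render_tree` helper, the toy data, and the "(cycle)" marker are hypothetical, and the "." and "%" markers only mimic the output above.

```python
def render_tree(children, root):
    """Return indented lines for the hierarchy under `root`.

    `children` maps a node to a list of its child nodes. A node that is
    already on the current path is printed but not expanded again, so a
    cycle cannot cause infinite recursion.
    """
    lines = []

    def visit(node, depth, path):
        marker = "." if depth == 0 else "%"
        if node in path:
            lines.append("{}{} {} (cycle)".format(" " * depth, marker, node))
            return
        lines.append("{}{} {}".format(" " * depth, marker, node))
        for child in children.get(node, []):
            visit(child, depth + 1, path | {node})

    visit(root, 0, frozenset())
    return lines

toy_children = {
    "anxiety disorder": ["phobia", "panic disorder"],
    "phobia": ["social phobia"],
    "social phobia": ["phobia"],  # deliberate cycle
}
for line in render_tree(toy_children, "anxiety disorder"):
    print(line)
```

Tracking the current path rather than a global visited set means a node with several parents is still shown under each of them, which matches the comment above about multiple paths making the display verbose.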
## Show as graph using GraphViz
## We can do this for both descendants and ancestors
nodes = ont.traverse_nodes(qids, up=True, down=True)
w = GraphRenderer.create('png')
w.outfile = 'output/anxiety-disorder.png'
w.write_subgraph(ont, nodes, query_ids=qids)
TODO: Drugs
## What proteins are associated with PTSD? (via GWAS)
[ptsd] = ont.search('post-traumatic stress disorder')
import ontobio.sparql.wikidata as wd
proteins = wd.canned_query('disease2protein', ptsd)
proteins
['UniProtKB:Q92831', 'UniProtKB:P17252', 'UniProtKB:Q8N9K7', 'UniProtKB:O75899', 'UniProtKB:Q92597', 'UniProtKB:P40145', 'UniProtKB:Q9HA38', 'UniProtKB:P42658', 'UniProtKB:Q9Y243', 'UniProtKB:Q9NUQ9', 'UniProtKB:Q9P272', 'UniProtKB:Q9BY07', 'UniProtKB:O43897', 'UniProtKB:A0A024R9G4', 'UniProtKB:Q4F7X0', 'UniProtKB:E5RIR1', 'UniProtKB:Q8IYG9', 'UniProtKB:A7E2E4']
## Find GO terms for all genes/products associated with all nodes in Anxiety sub-ontology
## First create a GO handle and get association sets for GO (in human)
go = f.create('go')
from ontobio.assoc_factory import AssociationSetFactory
afactory = AssociationSetFactory()
aset = afactory.create(ontology=go,
                       subject_category='gene',
                       object_category='function',
                       taxon='NCBITaxon:9606')
for n in ont.nodes():
    proteins = wd.canned_query('disease2protein', n)
    anns = [a for p in proteins for a in aset.annotations(p)]
    if len(anns) > 0:
        print("{} {}".format(n, ont.label(n)))
        for a in anns:
            print(" {} {}".format(a, go.label(a)))
http://www.wikidata.org/entity/Q202387 post-traumatic stress disorder
 GO:0007616 long-term memory
 GO:0006171 cAMP biosynthetic process
 GO:0007193 adenylate cyclase-inhibiting G-protein coupled receptor signaling pathway
 GO:0016021 integral component of membrane
 GO:0005524 ATP binding
 GO:0003091 renal water homeostasis
 GO:0005886 plasma membrane
 GO:0004016 adenylate cyclase activity
 GO:0004383 guanylate cyclase activity
 GO:0006182 cGMP biosynthetic process
 GO:0007165 signal transduction
 GO:0007190 activation of adenylate cyclase activity
 GO:0008294 calcium- and calmodulin-responsive adenylate cyclase activity
 GO:0008074 guanylate cyclase complex, soluble
 GO:0007189 adenylate cyclase-activating G-protein coupled receptor signaling pathway
 GO:0046872 metal ion binding
 GO:0007611 learning or memory
 GO:0071377 cellular response to glucagon stimulus
 GO:0016020 membrane
 GO:0035556 intracellular signal transduction
 GO:0034199 activation of protein kinase A activity
 GO:0008198 ferrous iron binding
 GO:0016706 oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors
 GO:0005634 nucleus
 GO:0005737 cytoplasm
 GO:0055114 oxidation-reduction process
 GO:0016300 tRNA (uracil) methyltransferase activity
 GO:0030488 tRNA methylation
 GO:0002098 tRNA wobble uridine modification
 GO:0000049 tRNA binding
 GO:0006400 tRNA modification
 GO:0008175 tRNA methyltransferase activity
http://www.wikidata.org/entity/Q741713 panic disorder
 GO:0003713 transcription coactivator activity
 GO:0030374 ligand-dependent nuclear receptor transcription coactivator activity
 GO:0043565 sequence-specific DNA binding
 GO:0044212 transcription regulatory region DNA binding
 GO:0005515 protein binding
 GO:0005634 nucleus
 GO:0007165 signal transduction
 GO:0045893 positive regulation of transcription, DNA-templated
 GO:0003682 chromatin binding
 GO:0001047 core promoter binding
 GO:0003712 transcription cofactor activity
 GO:0008022 protein C-terminus binding
 GO:0043231 intracellular membrane-bounded organelle
 GO:0045944 positive regulation of transcription from RNA polymerase II promoter
 GO:0030518 intracellular steroid hormone receptor signaling pathway
 GO:0006351 transcription, DNA-templated
 GO:0008013 beta-catenin binding
 GO:0070016 armadillo repeat domain binding
 GO:0010628 positive regulation of gene expression
 GO:0016055 Wnt signaling pathway
 GO:0005829 cytosol
 GO:0000790 nuclear chromatin
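With output like this, a natural next step is to ask which GO terms are shared between diseases. The sketch below shows one way to do that with a `Counter`, using a small hand-copied excerpt of the annotations printed above instead of live query results; the `annotations` dictionary and its layout are assumptions of this example.

```python
from collections import Counter

# A small excerpt of the annotations printed above, keyed by disease label.
annotations = {
    "post-traumatic stress disorder": {
        "GO:0007616", "GO:0005634", "GO:0007165", "GO:0005737",
    },
    "panic disorder": {
        "GO:0003713", "GO:0005634", "GO:0007165", "GO:0005829",
    },
}

# Count in how many diseases each GO term occurs
term_counts = Counter(t for terms in annotations.values() for t in terms)

# Keep the terms that occur in every disease
shared = sorted(t for t, c in term_counts.items() if c == len(annotations))
print(shared)
```

In this excerpt the shared terms are GO:0005634 (nucleus) and GO:0007165 (signal transduction). Both are very general terms, so the overlap by itself says little about shared biology.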