Intermine-Python: Tutorial 12: More about Queries

Welcome to your twelfth intermine-python tutorial.

This tutorial covers more features of queries. Queries are the basis of all research in InterMine, so it is useful to be able to manage them effectively.

In [1]:
from intermine.webservice import Service
In [2]:
service = Service("https://www.flymine.org/query/service")
query = service.new_query()
In [3]:
query.add_view("Gene.organism.name","Gene.symbol")
Out[3]:
<intermine.query.Query at 0x7fb6660a7e10>

Suppose the query is not a simple cumulative filter, in which every constraint must hold, and the user wants to combine constraints in other ways. For example, the user wants all genes whose symbol is either ‘eve’ or ‘zen’. This can be done with set_logic:

In [4]:
gene_is_eve = query.add_constraint("Gene.symbol", "=", "eve")
gene_is_zen = query.add_constraint("Gene.symbol", "=", "zen")
query.set_logic(gene_is_eve | gene_is_zen)
Out[4]:
<intermine.query.Query at 0x7fb6660a7e10>
In [5]:
for row in query.rows():
    print(row)
Gene: organism.name='Drosophila melanogaster' symbol='eve'
Gene: organism.name='Drosophila melanogaster' symbol='zen'
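
To see why set_logic is needed here: constraints are combined with AND unless the logic is changed, and no gene can have both symbols at once. The following plain-Python sketch mimics the two combinations on a small hand-written list. It does not call InterMine, and the records (including the third symbol, added purely for contrast) are illustrative rather than query output.

```python
# Hand-written records for illustration only; this is not query output.
genes = [{"symbol": "eve"}, {"symbol": "zen"}, {"symbol": "bcd"}]


def is_eve(gene):
    """Stand-in for the constraint Gene.symbol = 'eve' (code A)."""
    return gene["symbol"] == "eve"


def is_zen(gene):
    """Stand-in for the constraint Gene.symbol = 'zen' (code B)."""
    return gene["symbol"] == "zen"


# "A and B": both constraints must hold, so nothing can match.
matches_and = [g["symbol"] for g in genes if is_eve(g) and is_zen(g)]

# "A or B": either constraint may hold, which is what set_logic asks for above.
matches_or = [g["symbol"] for g in genes if is_eve(g) or is_zen(g)]

print("A and B:", matches_and)
print("A or B:", matches_or)
```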

Each result row can be converted into a dictionary with to_d():

In [6]:
for row in query.rows():
    print(row.to_d())
{'Gene.organism.name': 'Drosophila melanogaster', 'Gene.symbol': 'eve'}
{'Gene.organism.name': 'Drosophila melanogaster', 'Gene.symbol': 'zen'}

Similarly, row.to_l() converts a result row into a list.
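
Once rows are plain dictionaries, standard-library tools can work with them directly. As a minimal sketch, the snippet below writes the two dictionaries printed above to CSV text. The dictionaries are hard-coded copies of that output so the example runs without a network connection, and the variable names are illustrative.

```python
import csv
import io

# Copies of the dictionaries printed by row.to_d() above, hard-coded so the
# sketch runs offline.
rows = [
    {"Gene.organism.name": "Drosophila melanogaster", "Gene.symbol": "eve"},
    {"Gene.organism.name": "Drosophila melanogaster", "Gene.symbol": "zen"},
]

# Write the rows to an in-memory buffer, using the dictionary keys as headers.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

In a live session, the hard-coded list could be replaced with [row.to_d() for row in query.rows()].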

count() returns the total number of rows in the query's results:

In [7]:
query.count()
Out[7]:
2

to_xml() returns a readable XML serialisation of the query:

In [8]:
query.to_xml()
Out[8]:
'<query constraintLogic="A or B" longDescription="" model="genomic" name="" sortOrder="Gene.organism.name asc" view="Gene.organism.name Gene.symbol"><constraint code="A" op="=" path="Gene.symbol" value="eve"/><constraint code="B" op="=" path="Gene.symbol" value="zen"/></query>'
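
Because the serialisation is ordinary XML, it can be inspected with Python's standard library. The sketch below parses the string shown in Out[8] (copied in as a literal so it runs offline) and pulls out the constraint logic, the view columns and the constraints. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# XML string copied from the Out[8] cell above.
query_xml = (
    '<query constraintLogic="A or B" longDescription="" model="genomic" '
    'name="" sortOrder="Gene.organism.name asc" '
    'view="Gene.organism.name Gene.symbol">'
    '<constraint code="A" op="=" path="Gene.symbol" value="eve"/>'
    '<constraint code="B" op="=" path="Gene.symbol" value="zen"/>'
    '</query>'
)

root = ET.fromstring(query_xml)

# Attributes on the <query> element describe the logic and output columns.
logic = root.get("constraintLogic")
views = root.get("view").split()

# Each <constraint> child carries its code, path, operator and value.
constraints = {
    c.get("code"): (c.get("path"), c.get("op"), c.get("value"))
    for c in root.findall("constraint")
}

print(logic)
print(views)
for code, parts in sorted(constraints.items()):
    print(code, *parts)
```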

clear_view() clears the output column list. With no output columns selected, the rows printed below contain all attributes of Gene:

In [9]:
query.clear_view()
In [10]:
for row in query.rows():
    print(row)
Gene: briefDescription=None cytoLocation='46C10-46C10' description=None id=1007357 length=1477 name='even skipped' primaryIdentifier='FBgn0000606' score=None scoreType=None secondaryIdentifier='CG2328' symbol='eve'
Gene: briefDescription=None cytoLocation='84A5-84A5' description=None id=1007877 length=1331 name='zerknullt' primaryIdentifier='FBgn0004053' score=None scoreType=None secondaryIdentifier='CG1046' symbol='zen'

Together, set_logic, to_d(), to_l(), count(), to_xml() and clear_view() give finer control over how a query is built, inspected and read.