Intermine-Python: Tutorial 11: Combining Lists

This tutorial shows how to combine two lists in InterMine.

You will need to log in to your InterMine account so that the combined lists can be saved.

In [1]:
from intermine.webservice import Service

Enter your username and password in the line below, execute it, and then proceed with the rest of the tutorial.

In [2]:
service = Service("https://www.flymine.org/flymine/service",username="[email protected]",password="demo")
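
Hard-coding a password in a notebook makes it easy to leak. As a minimal sketch, the credentials could instead be read from environment variables. The variable names INTERMINE_USERNAME and INTERMINE_PASSWORD and the helper credentials_from_env are assumptions made for this example; they are not part of the intermine package.

```python
import os


def credentials_from_env(environ=None):
    """Build keyword arguments for Service() from environment variables.

    The variable names used here are hypothetical, chosen for this example.
    """
    environ = os.environ if environ is None else environ
    username = environ.get("INTERMINE_USERNAME")
    password = environ.get("INTERMINE_PASSWORD")
    if not username or not password:
        raise RuntimeError(
            "Set INTERMINE_USERNAME and INTERMINE_PASSWORD before running."
        )
    return {"username": username, "password": password}
```

The returned dictionary can then be unpacked into the constructor, for example Service("https://www.flymine.org/flymine/service", **credentials_from_env()).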

We begin by creating a list manager object, which we will use to combine lists.

In [3]:
lm=service.list_manager()

Let's say that you want to combine the most enriched genes in the adult fly brain and in the adult fly hindgut. These currently exist as two separate lists on FlyMine. We begin by fetching both lists.

In [4]:
l1=lm.get_list(name="PL FlyAtlas_brain_top")
In [5]:
l2=lm.get_list(name="PL FlyAtlas_hindgut_top")
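
If a list name is misspelled, the problem may only surface later, when the lists are combined. A small guard can catch it at fetch time. The sketch below assumes that get_list returns None when no list of that name exists, which is worth checking against your installed version; require_list is a hypothetical helper and not part of the library.

```python
def require_list(list_manager, name):
    """Fetch a list by name, failing immediately if it is missing.

    Assumes list_manager.get_list(name=...) returns None for an unknown name.
    """
    found = list_manager.get_list(name=name)
    if found is None:
        raise LookupError("No list named %r on this service" % name)
    return found
```

With the objects from this tutorial, it would be called as l1 = require_list(lm, "PL FlyAtlas_brain_top").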

There are a couple of ways to combine the two lists, i.e. to take their union. The first method is shown below: the addition operator combines both lists automatically.

In [6]:
l3=l1+l2
In [7]:
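# Delete any existing list named "combination-1" so the name can be reused.
lm.delete_lists(["combination-1"])
# Name the combined list so that it can be found again later.
l3.set_name("combination-1")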
In [8]:
for r in l3:
    print(r)
Gene(score = None,  symbol = 'beat-VI',  length = 56607,  name = 'beaten path VI',  description = None,  scoreType = None,  secondaryIdentifier = 'CG14064',  primaryIdentifier = 'FBgn0039584',  briefDescription = None,  cytoLocation = '98D1-98D2')
Gene(score = None,  symbol = 'GluRIA',  length = 11028,  name = 'Glutamate receptor IA',  description = None,  scoreType = None,  secondaryIdentifier = 'CG8442',  primaryIdentifier = 'FBgn0004619',  briefDescription = None,  cytoLocation = '65C1-65C1')
Gene(score = None,  symbol = 'CG43740',  length = 14143,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG43740',  primaryIdentifier = 'FBgn0263997',  briefDescription = None,  cytoLocation = '9B7-9B7')
Gene(score = None,  symbol = 'fred',  length = 88786,  name = 'friend of echinoid',  description = None,  scoreType = None,  secondaryIdentifier = 'CG31774',  primaryIdentifier = 'FBgn0051774',  briefDescription = None,  cytoLocation = '24C9-24D1')
Gene(score = None,  symbol = 'MFS3',  length = 4195,  name = 'Major Facilitator Superfamily Transporter 3',  description = None,  scoreType = None,  secondaryIdentifier = 'CG4726',  primaryIdentifier = 'FBgn0031307',  briefDescription = None,  cytoLocation = '21F1-21F1')
Gene(score = None,  symbol = 'SK',  length = 64903,  name = 'small conductance calcium-activated potassium channel',  description = None,  scoreType = None,  secondaryIdentifier = 'CG10706',  primaryIdentifier = 'FBgn0029761',  briefDescription = None,  cytoLocation = '4F5-4F9')
Gene(score = None,  symbol = 'CG7509',  length = 2198,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG7509',  primaryIdentifier = 'FBgn0035575',  briefDescription = None,  cytoLocation = '64B17-64B17')
Gene(score = None,  symbol = 'Fer1',  length = 5767,  name = '48 related 1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG33323',  primaryIdentifier = 'FBgn0037475',  briefDescription = None,  cytoLocation = '84C5-84C6')
Gene(score = None,  symbol = 'CG42269',  length = 5250,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG42269',  primaryIdentifier = 'FBgn0259164',  briefDescription = None,  cytoLocation = '65A5-65A5')
Gene(score = None,  symbol = 'CG12912',  length = 29154,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG12912',  primaryIdentifier = 'FBgn0033497',  briefDescription = None,  cytoLocation = '46F7-46F7')
Gene(score = None,  symbol = 'yellow-b',  length = 3948,  name = 'yellow-b',  description = None,  scoreType = None,  secondaryIdentifier = 'CG17914',  primaryIdentifier = 'FBgn0032601',  briefDescription = None,  cytoLocation = '36A14-36A14')
Gene(score = None,  symbol = 'comm3',  length = 34579,  name = 'comm3',  description = None,  scoreType = None,  secondaryIdentifier = 'CG42334',  primaryIdentifier = 'FBgn0259236',  briefDescription = None,  cytoLocation = '71E3-71E4')
Gene(score = None,  symbol = 'CG3604',  length = 631,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG3604',  primaryIdentifier = 'FBgn0031562',  briefDescription = None,  cytoLocation = '24B3-24B3')
Gene(score = None,  symbol = 'Irk2',  length = 5829,  name = 'Inwardly rectifying potassium channel 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CG4370',  primaryIdentifier = 'FBgn0039081',  briefDescription = None,  cytoLocation = '95A1-95A1')
Gene(score = None,  symbol = 'ine',  length = 9023,  name = 'inebriated',  description = None,  scoreType = None,  secondaryIdentifier = 'CG15444',  primaryIdentifier = 'FBgn0011603',  briefDescription = None,  cytoLocation = '24F4-24F4')
Gene(score = None,  symbol = 'mmd',  length = 26224,  name = 'mind-meld',  description = None,  scoreType = None,  secondaryIdentifier = 'CG42252',  primaryIdentifier = 'FBgn0259110',  briefDescription = None,  cytoLocation = '14A1-14A1')
Gene(score = None,  symbol = 'Dop1R1',  length = 49432,  name = 'Dopamine 1-like receptor 1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG9652',  primaryIdentifier = 'FBgn0011582',  briefDescription = None,  cytoLocation = '88A10-88A12')
Gene(score = None,  symbol = 'Nha1',  length = 9485,  name = 'Na[+]/H[+] hydrogen antiporter 1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG10806',  primaryIdentifier = 'FBgn0031865',  briefDescription = None,  cytoLocation = '27C1-27C1')
Gene(score = None,  symbol = 'GC',  length = 6123,  name = 'gamma-glutamyl carboxylase',  description = None,  scoreType = None,  secondaryIdentifier = 'CG13927',  primaryIdentifier = 'FBgn0035245',  briefDescription = None,  cytoLocation = '62A9-62A9')
Gene(score = None,  symbol = 'Nha2',  length = 17070,  name = 'Na[+]/H[+] hydrogen antiporter 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CG43442',  primaryIdentifier = 'FBgn0263390',  briefDescription = None,  cytoLocation = '94D6-94D7')
Gene(score = None,  symbol = 'Dop1R2',  length = 29681,  name = 'Dopamine 1-like receptor 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CG18741',  primaryIdentifier = 'FBgn0266137',  briefDescription = None,  cytoLocation = '99B5-99B6')
Gene(score = None,  symbol = 'Sh',  length = 138941,  name = 'Shaker',  description = None,  scoreType = None,  secondaryIdentifier = 'CG12348',  primaryIdentifier = 'FBgn0003380',  briefDescription = None,  cytoLocation = '16F3-16F5')
Gene(score = None,  symbol = 'CG15765',  length = 22265,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG15765',  primaryIdentifier = 'FBgn0029814',  briefDescription = None,  cytoLocation = '5C2-5C3')
Gene(score = None,  symbol = 'Cyp49a1',  length = 9543,  name = 'Cyp49a1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG18377',  primaryIdentifier = 'FBgn0033524',  briefDescription = None,  cytoLocation = '47A7-47A9')
Gene(score = None,  symbol = 'CG31371',  length = 2383,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG31371',  primaryIdentifier = 'FBgn0051371',  briefDescription = None,  cytoLocation = '99F6-99F6')
Gene(score = None,  symbol = 'Dop2R',  length = 58027,  name = 'Dopamine 2-like receptor',  description = None,  scoreType = None,  secondaryIdentifier = 'CG33517',  primaryIdentifier = 'FBgn0053517',  briefDescription = None,  cytoLocation = '19A4-19A4')
Gene(score = None,  symbol = 'rad',  length = 90456,  name = 'radish',  description = None,  scoreType = None,  secondaryIdentifier = 'CG44424',  primaryIdentifier = 'FBgn0265597',  briefDescription = None,  cytoLocation = '11D8-11D8')
Gene(score = None,  symbol = 'nAChRalpha7',  length = 21373,  name = 'nicotinic Acetylcholine Receptor alpha7',  description = None,  scoreType = None,  secondaryIdentifier = 'CG32538',  primaryIdentifier = 'FBgn0086778',  briefDescription = None,  cytoLocation = '18C2-18C3')
Gene(score = None,  symbol = 'byn',  length = 8227,  name = 'brachyenteron',  description = None,  scoreType = None,  secondaryIdentifier = 'CG7260',  primaryIdentifier = 'FBgn0011723',  briefDescription = None,  cytoLocation = '68E3-68E3')
Gene(score = None,  symbol = 'alrm',  length = 1717,  name = 'astrocytic leucine-rich repeat molecule',  description = None,  scoreType = None,  secondaryIdentifier = 'CG11910',  primaryIdentifier = 'FBgn0039332',  briefDescription = None,  cytoLocation = '96D2-96D3')
Gene(score = None,  symbol = 'CG44837',  length = 88228,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG44837',  primaryIdentifier = 'FBgn0266100',  briefDescription = None,  cytoLocation = '68E3-68E3')
Gene(score = None,  symbol = 'CG15465',  length = 169106,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG15465',  primaryIdentifier = 'FBgn0029746',  briefDescription = None,  cytoLocation = '4F2-4F2')
Gene(score = None,  symbol = 'NKCC',  length = 13198,  name = 'sodium potassium chloride cotransporter',  description = None,  scoreType = None,  secondaryIdentifier = 'CG31547',  primaryIdentifier = 'FBgn0051547',  briefDescription = None,  cytoLocation = '83A5-83A6')
Gene(score = None,  symbol = 'CG1545',  length = 2075,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG1545',  primaryIdentifier = 'FBgn0030259',  briefDescription = None,  cytoLocation = '10A3-10A3')
Gene(score = None,  symbol = 'CG7365',  length = 3084,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG7365',  primaryIdentifier = 'FBgn0036939',  briefDescription = None,  cytoLocation = '76F2-76F2')
Gene(score = None,  symbol = 'Pkg21D',  length = 4646,  name = 'Protein kinase, cGMP-dependent at 21D',  description = None,  scoreType = None,  secondaryIdentifier = 'CG3324',  primaryIdentifier = 'FBgn0000442',  briefDescription = None,  cytoLocation = '21E2-21E2')
Gene(score = None,  symbol = 'RYa-R',  length = 60956,  name = 'RYamide receptor',  description = None,  scoreType = None,  secondaryIdentifier = 'CG5811',  primaryIdentifier = 'FBgn0004842',  briefDescription = None,  cytoLocation = '97D14-97E1')
Gene(score = None,  symbol = 'hbn',  length = 6242,  name = 'homeobrain',  description = None,  scoreType = None,  secondaryIdentifier = 'CG33152',  primaryIdentifier = 'FBgn0008636',  briefDescription = None,  cytoLocation = '57B5-57B5')
Gene(score = None,  symbol = 'otp',  length = 19790,  name = 'orthopedia',  description = None,  scoreType = None,  secondaryIdentifier = 'CG10036',  primaryIdentifier = 'FBgn0015524',  briefDescription = None,  cytoLocation = '57B4-57B4')
Gene(score = None,  symbol = 'Crz',  length = 863,  name = 'Corazonin',  description = None,  scoreType = None,  secondaryIdentifier = 'CG3302',  primaryIdentifier = 'FBgn0013767',  briefDescription = None,  cytoLocation = '88B3-88B3')
Gene(score = None,  symbol = 'CG9993',  length = 1980,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG9993',  primaryIdentifier = 'FBgn0034553',  briefDescription = None,  cytoLocation = '57B3-57B3')
Gene(score = None,  symbol = 'CG17999',  length = 1960,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG17999',  primaryIdentifier = 'FBgn0034552',  briefDescription = None,  cytoLocation = '57B3-57B3')
Gene(score = None,  symbol = 'CG11353',  length = 6101,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG11353',  primaryIdentifier = 'FBgn0035557',  briefDescription = None,  cytoLocation = '64B9-64B9')
Gene(score = None,  symbol = 'CG15236',  length = 10005,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG15236',  primaryIdentifier = 'FBgn0033108',  briefDescription = None,  cytoLocation = '42D4-42D6')
Gene(score = None,  symbol = 'Cyp301a1',  length = 2877,  name = 'Cyp301a1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG8587',  primaryIdentifier = 'FBgn0033753',  briefDescription = None,  cytoLocation = '49B4-49B5')
Gene(score = None,  symbol = 'CG13285',  length = 1792,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG13285',  primaryIdentifier = 'FBgn0035611',  briefDescription = None,  cytoLocation = '64D3-64D3')
Gene(score = None,  symbol = 'natalisin',  length = 3273,  name = 'natalisin',  description = None,  scoreType = None,  secondaryIdentifier = 'CG34388',  primaryIdentifier = 'FBgn0085417',  briefDescription = None,  cytoLocation = '88C1-88C1')
Gene(score = None,  symbol = 'axo',  length = 57714,  name = 'axotactin',  description = None,  scoreType = None,  secondaryIdentifier = 'CG43225',  primaryIdentifier = 'FBgn0262870',  briefDescription = None,  cytoLocation = '64B12-64B13')
Gene(score = None,  symbol = 'CG7368',  length = 8173,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG7368',  primaryIdentifier = 'FBgn0036179',  briefDescription = None,  cytoLocation = '68C14-68C15')
Gene(score = None,  symbol = 'CG17781',  length = 2206,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG17781',  primaryIdentifier = 'FBgn0039196',  briefDescription = None,  cytoLocation = '95F15-95F15')
Gene(score = None,  symbol = 'CG32564',  length = 1448,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG32564',  primaryIdentifier = 'FBgn0052564',  briefDescription = None,  cytoLocation = '15F3-15F3')
Gene(score = None,  symbol = 'CG30053',  length = 1144,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG30053',  primaryIdentifier = 'FBgn0050053',  briefDescription = None,  cytoLocation = '49B12-49B12')
Gene(score = None,  symbol = 'CG13215',  length = 621,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG13215',  primaryIdentifier = 'FBgn0033592',  briefDescription = None,  cytoLocation = '47E1-47E1')
Gene(score = None,  symbol = 'CG4409',  length = 1635,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG4409',  primaryIdentifier = 'FBgn0034128',  briefDescription = None,  cytoLocation = '53C4-53C4')
Gene(score = None,  symbol = 'CG34109',  length = 31649,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG34109',  primaryIdentifier = 'FBgn0083945',  briefDescription = None,  cytoLocation = None)
Gene(score = None,  symbol = 'CG6337',  length = 1522,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG6337',  primaryIdentifier = 'FBgn0033873',  briefDescription = None,  cytoLocation = '50C6-50C6')
Gene(score = None,  symbol = 'AhcyL2',  length = 3720,  name = 'Adenosylhomocysteinase like 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CG8956',  primaryIdentifier = 'FBgn0015011',  briefDescription = None,  cytoLocation = '89E5-89E5')
Gene(score = None,  symbol = 'nAChRalpha6',  length = 92934,  name = 'nicotinic Acetylcholine Receptor alpha6',  description = None,  scoreType = None,  secondaryIdentifier = 'CG4128',  primaryIdentifier = 'FBgn0032151',  briefDescription = None,  cytoLocation = '30D1-30E1')
Gene(score = None,  symbol = 'Dscam2',  length = 30733,  name = 'Down syndrome cell adhesion molecule 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CG42256',  primaryIdentifier = 'FBgn0265296',  briefDescription = None,  cytoLocation = '65E6-65E6')
Gene(score = None,  symbol = 'unc79',  length = 15367,  name = 'uncoordinated 79',  description = None,  scoreType = None,  secondaryIdentifier = 'CG5237',  primaryIdentifier = 'FBgn0038693',  briefDescription = None,  cytoLocation = '91F12-91F12')
Gene(score = None,  symbol = 'CG14949',  length = 1241,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG14949',  primaryIdentifier = 'FBgn0035358',  briefDescription = None,  cytoLocation = '62E8-62E8')
Gene(score = None,  symbol = 'CG15537',  length = 8423,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG15537',  primaryIdentifier = 'FBgn0039770',  briefDescription = None,  cytoLocation = '99F4-99F4')
Gene(score = None,  symbol = 'Lgr1',  length = 7431,  name = 'Leucine-rich repeat-containing G protein-coupled receptor 1',  description = None,  scoreType = None,  secondaryIdentifier = 'CG7665',  primaryIdentifier = 'FBgn0016650',  briefDescription = None,  cytoLocation = '90C2-90C2')
Gene(score = None,  symbol = 'ChT',  length = 6136,  name = 'Choline transporter',  description = None,  scoreType = None,  secondaryIdentifier = 'CG7708',  primaryIdentifier = 'FBgn0038641',  briefDescription = None,  cytoLocation = '91C1-91C1')
Gene(score = None,  symbol = 'CG6867',  length = 4392,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG6867',  primaryIdentifier = 'FBgn0030887',  briefDescription = None,  cytoLocation = '16F5-16F5')
Gene(score = None,  symbol = 'nkt',  length = 808,  name = 'noktochor',  description = None,  scoreType = None,  secondaryIdentifier = 'CG14141',  primaryIdentifier = 'FBgn0036146',  briefDescription = None,  cytoLocation = '68B1-68B1')
Gene(score = None,  symbol = 'CG1143',  length = 1825,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG1143',  primaryIdentifier = 'FBgn0035359',  briefDescription = None,  cytoLocation = '62E8-62E8')
Gene(score = None,  symbol = 'CG12826',  length = 777,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG12826',  primaryIdentifier = 'FBgn0033207',  briefDescription = None,  cytoLocation = '43E9-43E9')
Gene(score = None,  symbol = 'CG13616',  length = 874,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG13616',  primaryIdentifier = 'FBgn0039200',  briefDescription = None,  cytoLocation = '96A1-96A1')
Gene(score = None,  symbol = 'CG14044',  length = 1564,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG14044',  primaryIdentifier = 'FBgn0031650',  briefDescription = None,  cytoLocation = '25B4-25B4')
Gene(score = None,  symbol = 'CG9657',  length = 3549,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG9657',  primaryIdentifier = 'FBgn0029950',  briefDescription = None,  cytoLocation = '7B2-7B2')
Gene(score = None,  symbol = 'CG18467',  length = 2187,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG18467',  primaryIdentifier = 'FBgn0034218',  briefDescription = None,  cytoLocation = '54B14-54B15')
Gene(score = None,  symbol = 'Cpr62Ba',  length = 7655,  name = 'Cuticular protein 62Ba',  description = None,  scoreType = None,  secondaryIdentifier = 'CG13934',  primaryIdentifier = 'FBgn0035279',  briefDescription = None,  cytoLocation = '62B6-62B6')
Gene(score = None,  symbol = 'CG14115',  length = 1023,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG14115',  primaryIdentifier = 'FBgn0036343',  briefDescription = None,  cytoLocation = '69F6-69F6')
Gene(score = None,  symbol = 'CG4459',  length = 2323,  name = None,  description = None,  scoreType = None,  secondaryIdentifier = 'CG4459',  primaryIdentifier = 'FBgn0038753',  briefDescription = None,  cytoLocation = '92B8-92B8')
Gene(score = None,  symbol = 'asRNA:Eig63F-2',  length = 1275,  name = 'antisense RNA:Ecdysone-induced gene 63F 2',  description = None,  scoreType = None,  secondaryIdentifier = 'CR32265',  primaryIdentifier = 'FBgn0004911',  briefDescription = None,  cytoLocation = '63F7-64A1')
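
Because the result is a set union, a gene that appears in both source lists appears only once in the combined list. The idea can be sketched with plain Python sets. The identifiers below are placeholders invented for illustration; they are not the real contents of the FlyAtlas lists.

```python
# Placeholder identifiers, not real list contents.
brain = {"gene-A", "gene-B", "gene-C"}
hindgut = {"gene-C", "gene-D"}

# Union: every gene that is in either list, with duplicates collapsed.
combined = brain | hindgut

# "gene-C" is in both inputs but is counted once in the result,
# so the combined size can be smaller than the sum of the input sizes.
print(sorted(combined))
print(len(brain), len(hindgut), len(combined))
```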

The second way of combining two lists is to first create a Python list containing the lists that you want to combine. I've called this temporary list "y". You can then use the union method of the list manager to take the set union and name the resulting list in one step, as shown below.

In [9]:
y=[l1,l2]
In [10]:
lm.delete_lists(["combination-2"])
lm.union(y,name="combination-2")
Out[10]:
<intermine.lists.list.List at 0x7fb534535eb8>

Similarly, if you want to find the intersection of two lists, you can use the intersect method of the list manager class. Note that you can combine lists and queries in the same way.
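
The delete-then-combine pattern used in the cells above can be wrapped so that the same code serves both operations. This is only a sketch: recombine is a hypothetical helper, and it relies on the list manager's delete_lists, union and intersect methods mentioned in this tutorial accepting the arguments shown.

```python
def recombine(list_manager, lists, name, how="union"):
    """Combine lists under a fixed name, replacing any earlier result.

    how: "union" keeps items found in any input list;
         "intersect" keeps only items found in every input list.
    """
    if how not in ("union", "intersect"):
        raise ValueError("how must be 'union' or 'intersect'")
    # Free the name first, as the tutorial cells do before each combination.
    list_manager.delete_lists([name])
    # Look up the requested operation on the list manager and apply it.
    combine = getattr(list_manager, how)
    return combine(lists, name=name)
```

With the objects from this tutorial, the genes common to both tissues could then be requested with recombine(lm, [l1, l2], "intersection-1", how="intersect"), where the list name "intersection-1" is likewise just an example.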