The bacterium P. pythonicus replicates once every hour in a 100 ml tube. Being a very unfriendly bacterium, it reaches stationary phase when there are 1,000,000 or more bacteria in the tube.
a) Write a program that calculates the number of bacteria after one hour, two hours, etc., until stationary phase is reached. The program receives the starter size (the number of bacteria to begin with) and starts calculating from there. At each time point, the following message should be printed:
< time > hours: < no. of bacteria > bacteria
starter = 5  # starter size (change to a value of your choice)
bacteria = starter
time = 0
while bacteria < 1000000:
    print(time,'hours:',bacteria,'bacteria')
    time = time + 1
    bacteria = bacteria * 2
0 hours: 5 bacteria
1 hours: 10 bacteria
2 hours: 20 bacteria
3 hours: 40 bacteria
4 hours: 80 bacteria
5 hours: 160 bacteria
6 hours: 320 bacteria
7 hours: 640 bacteria
8 hours: 1280 bacteria
9 hours: 2560 bacteria
10 hours: 5120 bacteria
11 hours: 10240 bacteria
12 hours: 20480 bacteria
13 hours: 40960 bacteria
14 hours: 81920 bacteria
15 hours: 163840 bacteria
16 hours: 327680 bacteria
17 hours: 655360 bacteria
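As a cross-check on the loop, the number of doublings can also be computed directly: the population first reaches the threshold after n = ceil(log2(threshold / starter)) hours. The sketch below is not part of the original solution; the function name doublings_to_stationary and its threshold parameter are illustrative, and it assumes a positive starter size.

```python
import math

def doublings_to_stationary(starter, threshold=1_000_000):
    """Return the smallest n such that starter * 2**n >= threshold.

    Hypothetical helper for checking the loop; assumes starter > 0.
    """
    if starter >= threshold:
        return 0  # already at stationary phase
    n = math.ceil(math.log2(threshold / starter))
    # Guard against floating-point rounding in log2 near exact powers of two.
    while starter * 2 ** n < threshold:
        n += 1
    while n > 0 and starter * 2 ** (n - 1) >= threshold:
        n -= 1
    return n

print(doublings_to_stationary(5))
```

With a starter of 5 this gives 18, which agrees with the loop: the last line printed is for hour 17 (655360 bacteria), and the next doubling crosses 1,000,000.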
b) It turns out that the growth rate of P. pythonicus is affected by temperature. Its replication time r (in hours) is a function of the temperature T:
$r = \frac{19 T (T - 70)}{2450} + 10$.
However, when the temperature is below 5 or above 50, the bacteria don't grow at all.
Write a program that receives the starter size and the growth temperature and calculates the time to reach stationary phase, printing the number of bacteria at each time point (as in part a). If the bacteria can't grow, print an appropriate message (and don't do any calculation).
starter = 5  # starter size (change to a value of your choice)
temp = 23  # growth temperature (change to a value of your choice)
# check temperature and calculate replication time
if temp < 5 or temp > 50:
    print("Bacteria can't grow at this temperature")
else:
    r = (19 * temp * (temp - 70))/2450 + 10
    # calculate and print growth
    bacteria = starter
    time = 0
    while bacteria < 1000000:
        print(time,'hours:',bacteria,'bacteria')
        time = time + r
        bacteria = bacteria * 2
0 hours: 5 bacteria
1.6167346938775502 hours: 10 bacteria
3.2334693877551004 hours: 20 bacteria
4.850204081632651 hours: 40 bacteria
6.466938775510201 hours: 80 bacteria
8.083673469387751 hours: 160 bacteria
9.700408163265301 hours: 320 bacteria
11.317142857142851 hours: 640 bacteria
12.933877551020402 hours: 1280 bacteria
14.550612244897952 hours: 2560 bacteria
16.167346938775502 hours: 5120 bacteria
17.784081632653052 hours: 10240 bacteria
19.400816326530602 hours: 20480 bacteria
21.017551020408153 hours: 40960 bacteria
22.634285714285703 hours: 81920 bacteria
24.251020408163253 hours: 163840 bacteria
25.867755102040803 hours: 327680 bacteria
27.484489795918353 hours: 655360 bacteria
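To see how the replication time changes with temperature, the formula and the growth limits can be wrapped in a small function. This is only a sketch: the name replication_time and the choice to return None for "no growth" are assumptions made here, and r is taken to be in hours because the solution adds it to the elapsed time.

```python
def replication_time(temp):
    """Replication time r (assumed to be in hours) at temperature temp.

    Returns None when the temperature is outside the 5 to 50 growth range.
    """
    if temp < 5 or temp > 50:
        return None
    return (19 * temp * (temp - 70)) / 2450 + 10

# Try temperatures just outside, on, and inside the growth range.
for temp in (4, 5, 23, 35, 50, 51):
    r = replication_time(temp)
    if r is None:
        print(temp, "-> no growth")
    else:
        print(temp, "->", round(r, 2), "hours per doubling")
```

Because T(T - 70) is smallest at T = 35, that is where replication is fastest (r = 0.5), and r increases towards both ends of the growth range.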
a) Here’s a short section of genomic DNA:
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT
It comprises two exons and an intron. The first exon runs from the start of the sequence to the sixty-third character, and the second exon runs from the ninety-first character to the end of the sequence.
Write a program that will print just the coding regions of the DNA sequence.
seq = 'ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT'
## your code goes here
exon1 = seq[:63]
exon2 = seq[90:]
coding = exon1 + exon2
print(coding)
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGAATCATCGATCGATATCGATGCATCGACTACTAT
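The exon boundaries in the question are given as 1-based, inclusive positions, whereas Python slices are 0-based and exclude the end index. A minimal sketch of the conversion, using a hypothetical helper slice_1based and a short toy string rather than the real sequence:

```python
def slice_1based(seq, first, last):
    """Return characters first..last of seq, counting from 1, both ends included."""
    # Position `first` is index first - 1; the slice end is exclusive,
    # so stopping at `last` still includes index last - 1.
    return seq[first - 1:last]

toy = "ABCDEFGHIJ"
print(slice_1based(toy, 1, 3))         # characters 1 to 3
print(slice_1based(toy, 8, len(toy)))  # character 8 to the end
```

Applied to the exercise, characters 1 to 63 correspond to seq[:63], and character 91 to the end corresponds to seq[90:].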
b) Using the sequence from a, write a program that will calculate what percentage of the DNA sequence is coding.
## your code goes here
percent_coding = len(coding)/len(seq)*100
print(percent_coding,"%","of the sequence is coding")
78.04878048780488 % of the sequence is coding
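The raw float prints with many decimal places. If one decimal place is enough, an f-string format specifier can round it for display. In this sketch, coding_length and total_length are illustrative stand-ins for len(coding) and len(seq): the two exons are 63 and 33 bases long, and the full sequence is 123 bases.

```python
coding_length = 96   # 63 bases of exon 1 + 33 bases of exon 2
total_length = 123   # full length of the genomic sequence
percent_coding = coding_length / total_length * 100

# ".1f" rounds to one decimal place for display only; the stored value is unchanged.
message = f"{percent_coding:.1f}% of the sequence is coding"
print(message)
```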
c) Using the data from a, write a program that will print out the original genomic DNA sequence with coding bases in uppercase and non-coding bases in lowercase.
## your code goes here
intron = seq[63:90]
print(exon1 + intron.lower() + exon2)
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGAtcgatcgatcgatcgatcgatcatgctATCATCGATCGATATCGATGCATCGACTACTAT
The list sequences contains a number of DNA sequences as strings. Each sequence starts with the same 14-base-pair fragment, a sequencing adapter that should have been removed.
Write a program that will trim this adapter and print the cleaned sequences to the screen. The program will then print the length of each cleaned sequence to the screen.
sequences = ['ATTCGATTATAAGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC', \
'ATTCGATTATAAGCACTGATCGATCGATCGATCGATCGATGCTATCGTCGT', \
'ATTCGATTATAAGCATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC', \
'ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA', \
'ATTCGATTATAAGCACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTATGCA']
## your code goes here
for seq in sequences:
    cleaned_seq = seq[14:]  # drop the 14-base adapter at the start
    print(cleaned_seq)
    print(len(cleaned_seq))
TCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC
42
ACTGATCGATCGATCGATCGATCGATGCTATCGTCGT
37
ATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC
48
ACTATCGATGATCTAGCTACGATCGTAGCTGTA
33
ACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTATGCA
47
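Slicing with seq[14:] removes the first 14 bases whether or not they really are the adapter. A slightly more defensive sketch checks for the adapter first with str.startswith. The helper name trim_adapter and the second, made-up read are assumptions for illustration; the adapter string is the 14-base prefix shared by the sequences above.

```python
adapter = "ATTCGATTATAAGC"  # the 14-base prefix shared by the sequences above

def trim_adapter(seq, adapter):
    """Remove adapter from the start of seq if present; otherwise return seq unchanged."""
    if seq.startswith(adapter):
        return seq[len(adapter):]
    return seq

# A read from the list above, which starts with the adapter.
print(trim_adapter("ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA", adapter))
# A made-up read without the adapter is left untouched.
print(trim_adapter("GGGGACTATCGATG", adapter))
```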
The string genomic_dna contains a section of genomic DNA.
The list exons contains the start/stop positions of exons.
Each exon is a separate list (within the list of exons) with two elements: the start and stop positions.
Write a program that will extract the exon segments from genomic_dna using the positions in exons, concatenate them, and print them to the screen.
genomic_dna = 'TCGATCGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGACTGATCGATCGATCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGATCGATCATATGTCAGTCGATGCATCGTAGCATCGTATAGTAGCTACGTAGCTACGATCGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTAGCTAGTACGATCGCGTAGCTAGCATGCTACGTAGATCGATCGATGCATGCTAGCTAGCTAGCTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCGATGCTACGTAGATCGATCGCTAGTAGATCGATCGCTAGCTAGCTGACTAGTACGCTGCTAGTAGTCAGCTAGATCGATGCTAGTCA'
exons = [[5, 58], [72, 133], [190, 276], [340, 398]] # [[start, stop], [start, stop], ...]
## your code goes here
coding = ""
for exon in exons:
    start = exon[0]
    stop = exon[1]
    exon_seq = genomic_dna[start:stop]
    coding = coding + exon_seq
print(coding)
print(len(coding))
CGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGACGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCG
258
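The same extraction can be written more compactly by unpacking each [start, stop] pair in the loop header and joining the slices in one step. The sketch below uses a made-up toy sequence (toy_dna, toy_exons) so the result is easy to check by eye, then uses the real exon positions only to compute the expected length of the coding sequence.

```python
toy_dna = "AAACCCGGGTTT"      # made-up sequence for illustration
toy_exons = [[0, 3], [6, 9]]  # same [start, stop] layout as exons

# Unpack each [start, stop] pair and join the slices in one step.
toy_coding = "".join(toy_dna[start:stop] for start, stop in toy_exons)
print(toy_coding)

# Sanity check on the real positions: the summed exon lengths should equal
# the length of the concatenated coding sequence.
exons = [[5, 58], [72, 133], [190, 276], [340, 398]]
expected_length = sum(stop - start for start, stop in exons)
print(expected_length)
```

The length check holds as long as every stop position lies within the sequence; a stop beyond the end would be silently truncated by slicing.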
Questions modified from Python for Biologists, a great book by Martin Jones.