The format of the IMDb file is as follows: fields are separated by the | character, and header lines start with the # character.
An example of an IMDb file with the header line and the top two records is shown below:
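A minimal sketch of how one record can be split into fields. It assumes the column positions used in the solutions further down (rating in column 1, runtime in seconds in column 3, genres in column 5, title in column 6). The sample line and the helper name `parse_record` are invented for illustration and are not taken from 250.imdb; columns 0, 2 and 4 are left as placeholders.

```python
# Hypothetical record: the values are made up, only the layout matters.
sample_line = "col0 | 8.5 | col2 | 7200 | col4 | Drama,War | Example Movie\n"


def parse_record(line):
    """Split one '|'-separated record into the fields used in the solutions."""
    cols = [col.strip() for col in line.split('|')]
    return {
        'rating': float(cols[1]),      # column 1: rating
        'runtime': int(cols[3]),       # column 3: runtime in seconds
        'genres': cols[5].split(','),  # column 5: comma-separated genres
        'title': cols[6],              # column 6: title
    }


record = parse_record(sample_line)
print(record)
```

Stripping each column after the split removes the spaces around the separator as well as the trailing newline.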
Using the data provided in 250.imdb, find the total number of unique genres. It is recommended to use set to help filter out duplicates.
Note: Be mindful of case sensitivity (e.g., "Action" and "action" should be considered the same genre).
Hint: The correct answer is 22.
Here you have to loop twice!
Correct answers:
Note: sys.argv for input/output
As everything is coding, there are many different ways of writing code that achieve the same end result. Below is one way of thinking about these problems; there are of course many others.
# Code Snippet for Finding the Movie with the Highest Rating
# Note that this is just one of the solutions
with open('../../downloads/250.imdb', 'r') as fh:
    movieList = []
    highestRating = -100
    for line in fh:
        if not line.startswith('#'):
            cols = line.strip().split('|')
            rating = float(cols[1].strip())
            title = cols[6].strip()
            movieList.append((rating, title))
            if rating > highestRating:
                highestRating = rating
    print("Movie(s) with highest rating " + str(highestRating) + ":")
    for i in range(len(movieList)):
        if movieList[i][0] == highestRating:
            print(movieList[i][1])
Movie(s) with highest rating 9.3:
The Shawshank Redemption
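As a possible alternative to tracking the maximum by hand, the built-in `max` can find the highest rating once the records are in a list. This is only a sketch: the `(rating, title)` pairs below are invented stand-ins for the parsed file, not real entries from 250.imdb.

```python
# Invented (rating, title) pairs standing in for the parsed file.
movies = [
    (8.1, "Example A"),
    (9.0, "Example B"),
    (9.0, "Example C"),
    (7.5, "Example D"),
]

# Find the highest rating, then collect every title that shares it.
highest = max(rating for rating, _ in movies)
best_titles = [title for rating, title in movies if rating == highest]

print("Movie(s) with highest rating " + str(highest) + ":")
for title in best_titles:
    print(title)
```

Collecting all titles with the top rating keeps ties, just as the loop-based solution does.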
# Code Snippet for finding the number of unique genres
# Note that this is just one of the solutions
with open('../../downloads/250.imdb', 'r') as fh:
    # empty list to start with
    genres_list = []
    # iterate over the file
    for line in fh:
        if not line.startswith('#'):
            # split the line into a list of columns at each |
            cols = line.strip().split('|')
            # extract the genres column and split it into a list of genres
            genres = cols[5].strip().split(',')
            # add each genre (lowercased) to the list if it is not already there
            for genre in genres:
                if genre.lower() not in genres_list:
                    genres_list.append(genre.lower())
    print(genres_list)
    print(len(genres_list))
['drama', 'war', 'adventure', 'comedy', 'family', 'animation', 'biography', 'history', 'action', 'crime', 'mystery', 'thriller', 'fantasy', 'romance', 'sci-fi', 'western', 'musical', 'music', 'historical', 'sport', 'film-noir', 'horror']
22
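The exercise text recommends set, while the solution above uses a list with a membership check. A sketch of the set-based variant follows; the input lines are invented records in the same |-separated layout (genres in column 5), so the snippet runs without 250.imdb.

```python
# Invented records; only the layout mirrors the real file.
lines = [
    "# header line\n",
    "c0|8.5|c2|7200|c4|Drama,War|Example A\n",
    "c0|8.0|c2|6000|c4|drama,Comedy|Example B\n",
]

unique_genres = set()
for line in lines:
    if not line.startswith('#'):
        cols = line.strip().split('|')
        for genre in cols[5].strip().split(','):
            # Lowercasing makes "Drama" and "drama" count as one genre;
            # the set silently ignores values it already contains.
            unique_genres.add(genre.strip().lower())

print(sorted(unique_genres))
print(len(unique_genres))
```

A set has no defined order, so the genres are sorted before printing to keep the output stable.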
# Code Snippet for calculating the average length of the movies (in hours and minutes) for each genre
# Note that this is just one of the solutions
with open('../../downloads/250.imdb', 'r') as fh:
    genreDict = {}
    for line in fh:
        if not line.startswith('#'):
            cols = line.strip().split('|')
            genre = cols[5].strip()
            glist = genre.split(',')
            runtime = cols[3]  # length of movie in seconds
            for entry in glist:
                if not entry.lower() in genreDict:
                    genreDict[entry.lower()] = []  # start an empty list for a new genre
                genreDict[entry.lower()].append(int(runtime))  # append runtime to the genre's list
# the with block closes the file, so no explicit fh.close() is needed
for genre in genreDict:  # loop over the genres in the dictionary
    average = sum(genreDict[genre])/len(genreDict[genre])  # calculate average length per genre
    hours = int(average/3600)  # whole hours in the average
    minutes = (average - (3600*hours))/60  # remaining seconds converted to minutes
    print('The average length for movies in genre '+genre\
          +' is '+str(hours)+'h'+str(round(minutes))+'min')
The average length for movies in genre drama is 2h14min
The average length for movies in genre war is 2h30min
The average length for movies in genre adventure is 2h13min
The average length for movies in genre comedy is 1h53min
The average length for movies in genre family is 1h44min
The average length for movies in genre animation is 1h40min
The average length for movies in genre biography is 2h30min
The average length for movies in genre history is 2h47min
The average length for movies in genre action is 2h18min
The average length for movies in genre crime is 2h11min
The average length for movies in genre mystery is 2h3min
The average length for movies in genre thriller is 2h11min
The average length for movies in genre fantasy is 2h2min
The average length for movies in genre romance is 2h2min
The average length for movies in genre sci-fi is 2h6min
The average length for movies in genre western is 2h11min
The average length for movies in genre musical is 1h57min
The average length for movies in genre music is 2h24min
The average length for movies in genre historical is 2h38min
The average length for movies in genre sport is 2h17min
The average length for movies in genre film-noir is 1h43min
The average length for movies in genre horror is 1h59min
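The solution above truncates the hours first and rounds the minutes afterwards, so an average just below a full hour could in principle print as 60min. One way to avoid that, sketched here with a hypothetical helper `format_runtime`, is to round to whole minutes first and then split with `divmod`. The input values are chosen for illustration only.

```python
def format_runtime(seconds):
    """Format a duration in seconds as e.g. '2h14min', rounding to whole minutes."""
    total_minutes = round(seconds / 60)
    hours, minutes = divmod(total_minutes, 60)
    return str(hours) + 'h' + str(minutes) + 'min'


print(format_runtime(8040))  # 134 minutes -> 2h14min
print(format_runtime(7190))  # 119.83 minutes rounds to 120 -> 2h0min
```

For the second value, truncating the hours before rounding would give 1 hour and 59.83 minutes, which rounds to 1h60min; rounding first carries the extra minute into the hour.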
Example code can be found at https://uppsala.instructure.com/courses/99844/modules/items/1111740