Like any good OpenSource project the MySQL Bugs website is open for anyone to search through. This ofcourse doesn't include the security bugs.
There is a second collection of bugs in the My Oracle Support and these bugs are only accesseble by customers with a support contract. Even when I have access to MOS I still prefer to use the community bugs site. For service requests etc. I would use MOS.
The openness of the bugs database is one of the topic the IOUG MySQL Council discusses with Oracle.
The bugs database has more to offer than just information about initial bugs:
For both there are some filter option for version and whether to include feature requests.
You might want to read life cycle of a MySQL bug by Valeriy Kravchuk if you're not familiar with the different states of MySQL bugs.
The bugs website offers search results not only in the HTML reports as shown on the website but also offers RSS and CSV format so it can be consumed by computer programs.
To demostrate this I wrote this script. It fetches the CSV for my bugs and generates some graphs. As you might (or might not) notice this doesn't include my older bug reports as I have two different reporter-id's as I had different email accounts for my MySQL account and my My Oracle account before MySQL/Sun was merged with Oracle.
This page is made with IPython Notebook, which is a tool which allows you to combine code, output, graphs and text. You can also view this notebook here.
%pylab inline
Populating the interactive namespace from numpy and matplotlib
import requests
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from StringIO import StringIO
from datetime import datetime
This is the URL as found in the bugs website. Feel free to modify the parameters.
bugs_url = "http://bugs.mysql.com/search-csv.php?status=all&severity=all&mine=9242646"
bugs_req = requests.get(bugs_url)
csv_file = StringIO()
csv_file.write(bugs_req.text)
csv_file.seek(0)
bugsdf = pd.read_csv(csv_file)
Let's have a look at the data.
bugsdf.head()
ID | Entered | Modified | Type | Status | Severity | Version | OS | Summary | Assign First | Assign Last | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 68132 | 2013-01-21 13:53:28 | 2013-03-26 16:14:20 | Server: Docs | Closed | S3 | 5.6, 5.5.29 | Any | Documentation about verification with GnuPG is... | Philip | Olson |
1 | 68199 | 2013-01-28 08:23:59 | 2013-05-13 13:10:00 | bugs.mysql.com | Closed | S3 | 5.7 | Any | Month-by-month (tide) stats for 5.7 absent | Sveta | Smirnova |
2 | 68200 | 2013-01-28 08:54:53 | 2013-03-10 16:03:34 | Server: Privileges | Closed | S3 | 5.6.9-rc | Any | Password logging doesn't work as documented | NaN | NaN |
3 | 68536 | 2013-03-01 10:29:46 | 2013-07-31 13:24:26 | MySQL Workbench | Closed | S3 | 5.2.47 | Any | Better support IPv6 addresses for new connections | NaN | NaN |
4 | 68571 | 2013-03-05 09:08:23 | 2014-02-26 09:41:24 | Monitoring: Advisors/Rules | Verified | S3 | NaN | Microsoft Windows (Win7 SP1) | False positive for 32-binary check on win64 | NaN | NaN |
Now we have the data, let's make a horizontal barplot for the number of bugs per category (the Type column).
We change the size of the figure as the default is too small to be readable.
fig = plt.figure(figsize=(8,10), dpi=100)
bugsdf.Type.value_counts(ascending=True).plot(kind='barh')
<matplotlib.axes.AxesSubplot at 0x394de90>
The Version column has a the text for the version. Let's grab the first three characters to get the major version of for the bug. This is not really perfect as it will only return 5.1 if the string is '5.1.30, 5.5.16', but it's good enough for now.
The function will be mapped to the Version column of the dataframe. And we will save the result in a new column called major_version.
def getversion(inputver):
if isinstance(inputver, float):
return inputver
return inputver[:3]
bugsdf['major_version'] = bugsdf.Version.map(getversion)
Besides Pandas we can use matplotlib's pyplot, which is a bit like MATLAB.
Let's create a heatmap for bug status and category
compstat = bugsdf.groupby('major_version').Type.value_counts().unstack().T
fig = plt.figure(figsize=(15,10), dpi=100)
plt.pcolor(compstat, vmin=0, vmax=5, cmap='Blues')
plt.yticks(np.arange(0.5, len(compstat.index), 1), compstat.index)
plt.xticks(np.arange(0.5, len(compstat.columns), 1), compstat.columns)
([<matplotlib.axis.XTick at 0x3c067d0>, <matplotlib.axis.XTick at 0x3d87410>, <matplotlib.axis.XTick at 0x3f2d550>, <matplotlib.axis.XTick at 0x3f2da50>, <matplotlib.axis.XTick at 0x3f2df50>, <matplotlib.axis.XTick at 0x3f15490>, <matplotlib.axis.XTick at 0x3f15990>, <matplotlib.axis.XTick at 0x3f15e90>, <matplotlib.axis.XTick at 0x40983d0>, <matplotlib.axis.XTick at 0x4098b10>, <matplotlib.axis.XTick at 0x3d8f2d0>], <a list of 11 Text xticklabel objects>)
Now we can create a heatmap which compares major versions and components.
fig = plt.figure(figsize=(8,10), dpi=100)
plt.pcolor(compstat, vmin=0, vmax=10, cmap='Blues')
plt.yticks(np.arange(0.5, len(compstat.index), 1), compstat.index)
plt.xticks(np.arange(0.5, len(compstat.columns), 1), compstat.columns)
([<matplotlib.axis.XTick at 0x3f15bd0>, <matplotlib.axis.XTick at 0x3f176d0>, <matplotlib.axis.XTick at 0x42bcad0>, <matplotlib.axis.XTick at 0x42bcfd0>, <matplotlib.axis.XTick at 0x4459d50>, <matplotlib.axis.XTick at 0x4455710>, <matplotlib.axis.XTick at 0x4451310>, <matplotlib.axis.XTick at 0x42cdc50>, <matplotlib.axis.XTick at 0x42c9810>, <matplotlib.axis.XTick at 0x42c51d0>, <matplotlib.axis.XTick at 0x42bdb10>], <a list of 11 Text xticklabel objects>)
Now we have the major version, let's filter on MySQL 5.6 and then graph the different values for the Status field.
bugsdf[bugsdf.major_version == '5.6'].Status.value_counts().plot(kind='bar')
<matplotlib.axes.AxesSubplot at 0x4451e10>
Or maybe check the status for all versions.
bugsdf.groupby('major_version').Status.value_counts().unstack().plot(kind='barh', stacked=True)
<matplotlib.axes.AxesSubplot at 0x446f790>
That's not really helpful, let's remove the Closed ones.
bugsdf[bugsdf.Status != 'Closed'].groupby('major_version').Status.value_counts().unstack().plot(kind='bar', stacked=True)
<matplotlib.axes.AxesSubplot at 0x4464650>
The Entered and Modified fields are not yet in a 'real' date format. So these must be coverted before we can use them.
bugsdf['Entered'] = pd.to_datetime(bugsdf.Entered)
bugsdf['Modified'] = pd.to_datetime(bugsdf.Modified)
Which are the oldest open bugs?
The - before the bugsdf.Status.isin reverses the result, so it behave like 'is not in'.
bugsdf[-bugsdf.Status.isin(['Closed', 'Duplicate', 'Won\'t fix', 'Can\'t repeat'])].sort(columns='Entered')[:5]
ID | Entered | Modified | Type | Status | Severity | Version | OS | Summary | Assign First | Assign Last | major_version | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 68571 | 2013-03-05 09:08:23 | 2014-02-26 09:41:24 | Monitoring: Advisors/Rules | Verified | S3 | NaN | Microsoft Windows (Win7 SP1) | False positive for 32-binary check on win64 | NaN | NaN | NaN |
5 | 68574 | 2013-03-05 13:42:26 | 2014-02-23 11:25:48 | Server: InnoDB | Verified | S3 | 5.5.30, 5.6.16 | Microsoft Windows (win7 sp1 ent) | No instrumentation for InnoDB files in P_S on ... | Marc | Alff | 5.5 |
19 | 68925 | 2013-04-11 05:57:29 | 2013-04-11 06:15:40 | Client | Verified | S3 | 5.6.10, 5.5.29 | Any | Compatibility issue with mysql history ("\040"... | NaN | NaN | 5.6 |
22 | 69147 | 2013-05-05 10:07:16 | 2013-07-28 14:43:57 | Server: I_S | Verified | S4 | 5.7 | Any | Variable defaults in information_schema | NaN | NaN | 5.7 |
33 | 69223 | 2013-05-14 06:22:48 | 2013-06-24 20:03:16 | Server: DDL | Verified | S4 | 5.6.10 | Any | Give a warning on CREATE TABLE without PRIMARY... | NaN | NaN | 5.6 |
Now let's find bugs which are open and not modified in a some time.
bugsdf[-bugsdf.Status.isin(['Closed', 'Duplicate', 'Won\'t fix', 'Can\'t repeat'])] \
[bugsdf.Modified < datetime(2013, 6, 1)] \
[bugsdf.Modified > datetime(1970, 1, 1)]
ID | Entered | Modified | Type | Status | Severity | Version | OS | Summary | Assign First | Assign Last | major_version | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
19 | 68925 | 2013-04-11 05:57:29 | 2013-04-11 06:15:40 | Client | Verified | S3 | 5.6.10, 5.5.29 | Any | Compatibility issue with mysql history ("\040"... | NaN | NaN | 5.6 |
35 | 69226 | 2013-05-14 10:31:09 | 2013-05-14 16:09:09 | Server: Options | Verified | S3 | 5.6.11 | Any | Status variable for SSL/TLS implementation | NaN | NaN | 5.6 |
36 | 69314 | 2013-05-24 21:33:48 | 2013-05-25 07:55:24 | Server: Options | Verified | S3 | 5.7.1 | Any | ignore-db-dir option doesn't work for database... | NaN | NaN | 5.7 |
Besides searching for bugs by reporter you can also use this to search on other conditions.