Analyzing Heavy Traffic Indicators on I-94¶

I'm going to analyze the westbound traffic on the I-94 Interstate highway. It is an east–west Interstate Highway connecting the Great Lakes and northern Great Plains regions of the United States.
John Hogue made the dataset available, and you can download it from the UCI Machine Learning Repository.
The goal of our analysis is to determine a few indicators of heavy traffic on I-94. These indicators can be weather type, time of the day, time of the week, etc. For instance, we may find out that the traffic is usually heavier in the summer or when it snows.
The dataset documentation mentions that a station located approximately midway between Minneapolis and Saint Paul recorded the traffic data.
The station only records westbound traffic (cars moving from east to west).

Exploring Rows & Columns¶

In [21]:

# Import pandas library

import pandas as pd

In [22]:

# Creating 'df' DataFrame by using pandas.

df = pd.read_csv('Metro_Interstate_Traffic_Volume.csv')

In [23]:

df.head(5)

Out[23]:

	holiday	temp	clouds_all	weather_main	weather_description	date_time	traffic_volume
0	None	288.28	40	Clouds	scattered clouds	2012-10-02 09:00:00	5545
1	None	289.36	75	Clouds	broken clouds	2012-10-02 10:00:00	4516
2	None	289.58	90	Clouds	overcast clouds	2012-10-02 11:00:00	4767
3	None	290.13	90	Clouds	overcast clouds	2012-10-02 12:00:00	5026
4	None	291.14	75	Clouds	broken clouds	2012-10-02 13:00:00	4918

In [24]:

df.tail(5)

Out[24]:

	holiday	temp	clouds_all	weather_main	weather_description	date_time	traffic_volume
48199	None	283.45	75	Clouds	broken clouds	2018-09-30 19:00:00	3543
48200	None	282.76	90	Clouds	overcast clouds	2018-09-30 20:00:00	2781
48201	None	282.73	90	Thunderstorm	proximity thunderstorm	2018-09-30 21:00:00	2159
48202	None	282.09	90	Clouds	overcast clouds	2018-09-30 22:00:00	1450
48203	None	282.12	90	Clouds	overcast clouds	2018-09-30 23:00:00	954

In [25]:

df.columns

Out[25]:

Index(['holiday', 'temp', 'rain_1h', 'snow_1h', 'clouds_all', 'weather_main',
       'weather_description', 'date_time', 'traffic_volume'],
      dtype='object')

In [26]:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48204 entries, 0 to 48203
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   holiday              48204 non-null  object 
 1   temp                 48204 non-null  float64
 2   rain_1h              48204 non-null  float64
 3   snow_1h              48204 non-null  float64
 4   clouds_all           48204 non-null  int64  
 5   weather_main         48204 non-null  object 
 6   weather_description  48204 non-null  object 
 7   date_time            48204 non-null  object 
 8   traffic_volume       48204 non-null  int64  
dtypes: float64(3), int64(2), object(4)
memory usage: 3.3+ MB

There are 9 columns in total.

Each column has 48,204 entries.

No column has null values.

The data spans about six years, from October 2012 to September 2018.
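The observations above can also be checked directly in code. The following is a minimal sketch on a hypothetical three-row sample (the `sample` DataFrame and its middle row are made up for illustration, not taken from the real CSV); it counts missing values per column and reads off the first and last timestamp.

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset (illustration only).
sample = pd.DataFrame({
    "date_time": ["2012-10-02 09:00:00", "2015-06-15 12:00:00", "2018-09-30 23:00:00"],
    "traffic_volume": [5545, 4100, 954],
})

# Count missing values in each column.
null_counts = sample.isnull().sum()

# Parse the timestamps and find the earliest and latest record.
timestamps = pd.to_datetime(sample["date_time"])
first, last = timestamps.min(), timestamps.max()

print(null_counts)
print(first, "to", last)
```

Running the same two checks on `df` should confirm the null counts reported by `df.info()` and the date range seen in `df.head()` and `df.tail()`.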

Plot Distribution of Traffic Volume¶

In [28]:

# Import matplotlib library to plot graph.

import matplotlib.pyplot as plt
%matplotlib inline

In [29]:

# plot histogram to see distribution of traffic volume column of DataFrame.

plt.hist(df["traffic_volume"])
plt.xlabel("Volume of Traffic")
plt.ylabel("Frequency")
plt.title("Distribution of Traffic")
plt.show()

In [30]:

df["traffic_volume"].describe()

Out[30]:

count    48204.000000
mean      3259.818355
std       1986.860670
min          0.000000
25%       1193.000000
50%       3380.000000
75%       4933.000000
max       7280.000000
Name: traffic_volume, dtype: float64

About 25% of the time, traffic is very light, at 1,193 vehicles or fewer.

About 25% of the time, traffic is heavy, at 4,933 vehicles or more.

There is a possibility that nighttime & daytime might influence the traffic volume.

We will divide the dataset into two parts:

- **Daytime data**: records with an hour value from 8 to 18, i.e. 8 a.m. up to 7 p.m. (11 hourly slots)
- **Nighttime data**: records with an hour value from 19 to 7, i.e. 7 p.m. up to 8 a.m. (13 hourly slots)
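As a minimal sketch of this kind of split, the example below builds a hypothetical one-day sample (24 hourly timestamps, made up for illustration) and separates it with a boolean mask. It assumes the same boundaries as the mask used later in this notebook, which keeps hours strictly greater than 7 and strictly less than 19, so hours 8 through 18 count as daytime.

```python
import pandas as pd

# Hypothetical sample: one timestamp per hour for a single day.
start = pd.Timestamp("2012-10-02 00:00:00")
sample = pd.DataFrame({
    "date_time": start + pd.to_timedelta(list(range(24)), unit="h"),
})

# Extract the hour and build an inclusive mask for hours 8..18.
hours = sample["date_time"].dt.hour
is_day = hours.between(8, 18)

# .copy() gives independent DataFrames that are safe to modify later.
day = sample[is_day].copy()
night = sample[~is_day].copy()

print(len(day), len(night))
```

To treat 7 a.m. as daytime as well, the mask would become `hours.between(7, 18)`, which gives an even 12/12 split.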

Divide Dataset Based on Time of Day¶

In [31]:

# Converting 'date_time' column into date_time object.

df["date_time"] = pd.to_datetime(df["date_time"])

In [32]:

# Creating new column 'time' to get time(in hrs) during record of data

df["time"] = df["date_time"].dt.hour

In [33]:

df["time"].dtype

Out[33]:

dtype('int64')

In [34]:

# Creating daytime data_set

combined = (df["time"] > 7) & (df["time"] < 19)
day_data = df[combined]

In [35]:

day_data.head()

Out[35]:

	holiday	temp	clouds_all	weather_main	weather_description	date_time	traffic_volume	time
0	None	288.28	40	Clouds	scattered clouds	2012-10-02 09:00:00	5545	9
1	None	289.36	75	Clouds	broken clouds	2012-10-02 10:00:00	4516	10
2	None	289.58	90	Clouds	overcast clouds	2012-10-02 11:00:00	4767	11
3	None	290.13	90	Clouds	overcast clouds	2012-10-02 12:00:00	5026	12
4	None	291.14	75	Clouds	broken clouds	2012-10-02 13:00:00	4918	13

In [36]:

# Creating nighttime data_set

combined = (df["time"] > 7) & (df["time"] < 19)
night_data = df[~combined]

In [37]:

night_data.head()

Out[37]:

	holiday	temp	clouds_all	weather_main	weather_description	date_time	traffic_volume	time
10	None	290.97	20	Clouds	few clouds	2012-10-02 19:00:00	3539	19
11	None	289.38	1	Clear	sky is clear	2012-10-02 20:00:00	2784	20
12	None	288.61	1	Clear	sky is clear	2012-10-02 21:00:00	2361	21
13	None	287.16	1	Clear	sky is clear	2012-10-02 22:00:00	1529	22
14	None	285.45	1	Clear	sky is clear	2012-10-02 23:00:00	963	23

Plot & Compare Day vs Night Time Traffic Volume¶

In [38]:

# Defining size of canvas.
plt.figure(figsize = (10,8))

# Creating daytime plot for frequency of traffic volume.
plt.subplot(1,2,1)
plt.hist(day_data["traffic_volume"])
plt.title("Day Traffic Volume")
plt.xlabel("Volume of Traffic")
plt.ylabel("Frequency")
plt.xlim(0,7000)

# Creating nighttime plot for frequency of traffic volume.
plt.subplot(1,2,2)
plt.hist(night_data["traffic_volume"])
plt.title("Night Traffic Volume")
plt.xlabel("Volume of Traffic")
plt.ylabel("Frequency")
plt.xlim(0,7000)
plt.show()

In [39]:

day_data["traffic_volume"].describe()

Out[39]:

count    21798.000000
mean      4764.132948
std       1021.369570
min          0.000000
25%       4271.000000
50%       4792.000000
75%       5410.000000
max       7280.000000
Name: traffic_volume, dtype: float64

In [40]:

night_data["traffic_volume"].describe()

Out[40]:

count    26406.000000
mean      2018.015375
std       1713.201969
min          0.000000
25%        581.000000
50%       1485.000000
75%       2934.000000
max       7260.000000
Name: traffic_volume, dtype: float64

During the day, about 50% of the readings fall between 4,271 and 5,410 vehicles (the 25th and 75th percentiles).

At night, about 75% of the readings are below 2,934 vehicles.

So we can conclude that traffic is much lighter at night than during the day. Since the goal is to find indicators of heavy traffic, the rest of the analysis uses the daytime data only.

Analyze & Plot Month vs Traffic Volume¶

In [41]:

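# Creating new column 'month' to get the month in which each record was taken.

day_data["month"] = day_data['date_time'].dt.month
# numeric_only=True keeps the mean from failing on text columns in newer pandas.
by_month = day_data.groupby('month').mean(numeric_only=True)
by_month = by_month.reset_index()
by_month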

<ipython-input-41-62ef0351e4d9>:3: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Out[41]:

	month	temp	rain_1h	snow_1h	clouds_all	traffic_volume	time
0	1	265.610396	0.015543	0.000692	58.614160	4499.832053	12.901207
1	2	267.190126	0.004091	0.000000	51.797986	4705.570170	12.873505
2	3	274.005655	0.016981	0.000000	56.890018	4896.060371	12.902570
3	4	280.040564	0.107413	0.000000	59.176874	4887.885428	13.008448
4	5	289.754275	0.138673	0.000000	57.174344	4901.648341	13.001981
5	6	295.068585	0.250047	0.000000	49.030409	4905.114035	13.000000
6	7	297.301008	4.780495	0.000000	42.222017	4595.017576	12.926457
7	8	295.664621	0.225874	0.000000	42.723892	4918.958227	12.933775
8	9	292.927130	0.276739	0.000000	45.394830	4870.988249	12.912456
9	10	284.455871	0.017481	0.000000	54.050625	4934.438125	13.035625
10	11	276.962591	0.006747	0.000000	57.025210	4698.226291	13.018607
11	12	267.920650	0.037282	0.002347	67.122122	4422.761261	12.885886
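The SettingWithCopyWarning shown with this cell appears because `day_data` was created by filtering `df` and a new column was then assigned to it. One common way to avoid the warning is to take an explicit copy when filtering. The sketch below shows the idea on a hypothetical three-row sample (`df_demo` and its values are illustrative, not the real data).

```python
import pandas as pd

# Hypothetical sample standing in for the full dataset.
df_demo = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2012-10-02 09:00:00",
        "2012-11-02 10:00:00",
        "2012-11-03 23:00:00",
    ]),
    "traffic_volume": [5545, 4516, 954],
})

# .copy() makes the filtered subset an independent DataFrame, so adding
# a column to it does not trigger SettingWithCopyWarning.
day_demo = df_demo[df_demo["date_time"].dt.hour < 19].copy()
day_demo["month"] = day_demo["date_time"].dt.month

# Mean traffic volume per month, computed on the copy.
by_month_demo = day_demo.groupby("month")["traffic_volume"].mean()
print(by_month_demo)
```

The original `df_demo` is left untouched, which is usually what we want when deriving helper columns for a subset.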

In [42]:

# Calculating variation between maximum & minimum mean traffic volume grouped 
# by month.

x = by_month["traffic_volume"].max() - by_month["traffic_volume"].min()
Variation = (x*100)/(by_month["traffic_volume"].min())
print( Variation )

11.5691721418583

In [43]:

# Plotting line graph for month vs number of vehicle.

plt.plot(by_month["month"],by_month["traffic_volume"],color = "Blue")
plt.xlim(1,12)
plt.xlabel("Months")
plt.ylabel("Number of Vehicles")
plt.title("Month vs Traffic")

Out[43]:

Text(0.5, 1.0, 'Month vs Traffic')

Traffic volume is high and almost constant from March to October, apart from July.

Traffic volume is slightly lower in the winter months (November to February).

July deviates from the trend and shows a sudden decrease in traffic volume.

The difference between the lowest and highest monthly average traffic volume is only about 11.57%.
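The percentage quoted here is the gap between the highest and lowest monthly mean, expressed relative to the lowest mean. As a small sketch, the helper below (`percent_spread` is a name introduced here for illustration) repeats that arithmetic on the rounded monthly means from the table above.

```python
def percent_spread(values):
    """Return (max - min) as a percentage of min."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) * 100 / lowest

# Monthly mean traffic volumes, January to December, copied from the
# by_month output shown earlier (rounded to six decimals).
monthly_means = [
    4499.832053, 4705.570170, 4896.060371, 4887.885428,
    4901.648341, 4905.114035, 4595.017576, 4918.958227,
    4870.988249, 4934.438125, 4698.226291, 4422.761261,
]

spread = percent_spread(monthly_means)
print(round(spread, 2))  # prints 11.57
```

The same helper can be reused for the day-of-week means, which keeps the two comparisons consistent.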

Analyze & Plot Day Of Week vs Traffic Volume¶

In [44]:

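# Creating new column 'day_week' to get the day of the week of each record
# (0 = Monday, 6 = Sunday).

day_data["day_week"] = day_data["date_time"].dt.dayofweek
# numeric_only=True keeps the mean from failing on text columns in newer pandas.
by_dayofweek = day_data.groupby("day_week").mean(numeric_only=True)
by_dayofweek = by_dayofweek.reset_index()
by_dayofweek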

<ipython-input-44-dd921e1f494b>:3: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Out[44]:

	day_week	temp	rain_1h	snow_1h	clouds_all	traffic_volume	time	month
0	0	282.558381	3.192710	0.000019	58.087024	4807.164138	12.998142	6.408486
1	1	282.637195	0.113711	0.000180	52.385495	5109.419471	12.869977	6.474028
2	2	282.559503	0.072698	0.001192	53.852568	5207.297083	12.956563	6.633164
3	3	282.620872	0.166149	0.000162	54.300779	5228.966580	12.984101	6.515574
4	4	282.541041	0.096456	0.000246	51.827497	5220.602140	12.972763	6.593709
5	5	282.694025	0.111649	0.000103	50.681539	4119.368251	12.916263	6.500485
6	6	282.705909	0.095725	0.000000	52.319871	3652.753150	12.945396	6.609047

In [45]:

# Calculating variation between maximum & minimum mean traffic volume grouped 
# by dayofweek.

x = by_dayofweek["traffic_volume"].max() - by_dayofweek["traffic_volume"].min()
Variation = (x*100)/(by_dayofweek["traffic_volume"].min())
print( Variation )

43.15138102874187

In [46]:

# Plotting line graph for day of week vs number of vehicles.

plt.plot(by_dayofweek["day_week"],by_dayofweek["traffic_volume"],color = "Blue")
plt.xlim(0,6)
plt.xlabel("day_of_week")
plt.ylabel("Number of Vehicles")
plt.title("Day vs Traffic")

Out[46]:

Text(0.5, 1.0, 'Day vs Traffic')

Traffic volume drops sharply on weekends (days 5 and 6, i.e. Saturday and Sunday).

On weekdays, traffic volume is almost constant, with the highest average on Thursday.

The difference between the lowest and highest daily average traffic volume is 43.15%.
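Because `dt.dayofweek` numbers the days from 0 (Monday) to 6 (Sunday), the table is easier to read once the numbers are mapped to names. The sketch below does this for the mean traffic volumes from the table above; `day_names` and `means` are helper names introduced here for illustration.

```python
import pandas as pd

# pandas numbers days of the week from 0 (Monday) to 6 (Sunday).
day_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Mean daytime traffic volume per day, copied from the by_dayofweek output.
means = pd.Series(
    [4807.164138, 5109.419471, 5207.297083, 5228.966580,
     5220.602140, 4119.368251, 3652.753150],
    index=day_names,
)

busiest = means.idxmax()
quietest = means.idxmin()
print(busiest, quietest)  # prints Thursday Sunday
```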

Analyze & Plot Traffic Volume on Business Days & Weekends¶

In [47]:

# Creating 'hour' column to get the hour at which each record was taken.
day_data['hour'] = day_data['date_time'].dt.hour

# Creating business days & weekend dataframe separately.
business_days = day_data[day_data['day_week'] <= 4].copy()
weekend = day_data[day_data['day_week'] >= 5].copy()

# Calculating mean values grouped by hour for business days & weekends.
# numeric_only=True keeps the mean from failing on text columns in newer pandas.
by_hour_business = business_days.groupby('hour').mean(numeric_only=True)
by_hour_weekend = weekend.groupby('hour').mean(numeric_only=True)

<ipython-input-47-f72c3f42cb28>:2: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

In [48]:

by_hour_business = by_hour_business.reset_index()

In [49]:

by_hour_business

Out[49]:

	hour	temp	rain_1h	snow_1h	clouds_all	traffic_volume	time	month	day_week
0	8	278.938443	0.144614	0.000135	53.666441	5503.497970	8.0	6.567659	1.989175
1	9	279.628421	0.156829	0.000139	53.619709	4895.269257	9.0	6.484386	1.981263
2	10	280.664650	0.113984	0.000033	54.781417	4378.419118	10.0	6.481283	1.957888
3	11	281.850231	0.151976	0.000000	52.808876	4633.419470	11.0	6.448819	1.979957
4	12	282.832763	0.090271	0.001543	53.855714	4855.382143	12.0	6.569286	1.989286
5	13	283.292447	0.092433	0.000370	53.325444	4859.180473	13.0	6.465237	1.982988
6	14	284.091787	0.102991	0.000746	55.326531	5152.995778	14.0	6.588318	1.990852
7	15	284.450605	0.090036	0.000274	54.168467	5592.897768	15.0	6.541397	1.962563
8	16	284.399011	0.118180	0.000632	54.444132	6189.473647	16.0	6.580464	1.995081
9	17	284.263033	7.299358	0.000000	55.204960	5784.827133	17.0	6.510576	1.994165
10	18	284.388061	0.121533	0.000125	54.183079	4434.209431	18.0	6.529126	1.988211

In [50]:

by_hour_weekend = by_hour_weekend.reset_index()

In [52]:

# Defining size of canvas.
plt.figure(figsize =(12,8))

# Plotting line graph for number of vehicles on business days.
plt.subplot(1,2,1)
plt.plot(by_hour_business["hour"],by_hour_business["traffic_volume"])
plt.xlabel("Hour")
plt.ylabel("traffic volume")
plt.ylim(2250,6250)
plt.title("Traffic vs Hour on Weekdays")

# Plotting line graph for number of vehicles on weekends.
plt.subplot(1,2,2)
plt.plot(by_hour_weekend["hour"],by_hour_weekend["traffic_volume"])
plt.xlabel("Hour")
plt.ylabel("traffic volume")
plt.ylim(2250,6250)
plt.title("Traffic vs Hour on Weekend")

Out[52]:

Text(0.5, 1.0, 'Traffic vs Hour on Weekend')

As expected, traffic volume on weekdays is higher than on the weekend.

On business days, traffic is comparatively low between 9 a.m. and 3 p.m., which matches normal business hours.

Traffic volume is very high in the morning and evening on business days, most likely because people are commuting to the office and back home.

On weekdays, the evening rush is generally heavier than the morning rush: the hourly average peaks at about 6,189 at 16:00, compared with about 5,503 at 8:00. One possible explanation is that people with flexible schedules, such as freelancers, also leave home in the evening.

On weekends, traffic volume increases until 12 noon, stays roughly constant until 4 p.m., and then decreases steadily.

This may be because people go out in the late morning and return in the evening.
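The morning and evening peaks on business days can be read off the hourly means programmatically. The sketch below uses the mean traffic volumes from the by_hour_business table and splits them at noon, which is an arbitrary cut-off chosen here for illustration. Note that the daytime subset starts at hour 8, so an earlier morning peak would not show up in it.

```python
# Mean business-day traffic volume per hour, copied from by_hour_business.
hourly_means = {
    8: 5503.497970, 9: 4895.269257, 10: 4378.419118, 11: 4633.419470,
    12: 4855.382143, 13: 4859.180473, 14: 5152.995778, 15: 5592.897768,
    16: 6189.473647, 17: 5784.827133, 18: 4434.209431,
}

# Split the hours at noon (an assumption made for this sketch).
morning = {h: v for h, v in hourly_means.items() if h < 12}
afternoon = {h: v for h, v in hourly_means.items() if h >= 12}

# The hour with the highest mean volume in each half of the day.
morning_peak = max(morning, key=morning.get)
afternoon_peak = max(afternoon, key=afternoon.get)
print(morning_peak, afternoon_peak)  # prints 8 16
```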

Correlation b/w Traffic Volume & Numerical Weather Columns¶

In [53]:

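# Correlating numeric columns only; text columns cannot be correlated.
df.select_dtypes(include='number').corr()['traffic_volume']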

Out[53]:

temp              0.130299
rain_1h           0.004714
snow_1h           0.000733
clouds_all        0.067054
traffic_volume    1.000000
time              0.352401
Name: traffic_volume, dtype: float64

In [54]:

df['temp'].corr(df['traffic_volume'])

Out[54]:

0.13029879817112658

In [55]:

by_temp = df.groupby('temp').mean(numeric_only=True)
by_temp = by_temp.reset_index()
by_temp

Out[55]:

	temp	rain_1h	snow_1h	clouds_all	traffic_volume	time
0	0.00	0.0	0.0	0.0	1318.2	5.1
1	243.39	0.0	0.0	1.0	1462.0	8.0
2	243.62	0.0	0.0	1.0	1037.0	7.0
3	244.22	0.0	0.0	1.0	800.0	6.0
4	244.82	0.0	0.0	11.0	483.0	4.0
...	...	...	...	...	...	...
5838	308.87	0.0	0.0	40.0	4798.0	18.0
5839	308.95	0.0	0.0	40.0	3812.0	15.0
5840	309.08	0.0	0.0	40.0	5314.0	17.0
5841	309.29	0.0	0.0	40.0	5902.0	16.0
5842	310.07	0.0	0.0	75.0	3810.0	16.0

5843 rows × 6 columns

The correlation between all weather factors and traffic volume is very weak.

Among the weather columns, the highest correlation is temperature vs traffic volume, at only 0.130299.

Let's analyze temperature vs traffic volume further.
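For reference, the values returned by `corr()` are Pearson correlation coefficients by default. The sketch below, on a hypothetical four-row sample (all values made up for illustration), computes the coefficient with pandas and again from the textbook formula to show that the two agree.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative, not from the dataset.
demo = pd.DataFrame({
    "temp": [270.0, 280.0, 290.0, 300.0],
    "traffic_volume": [3000, 4200, 3900, 5100],
    "weather_main": ["Snow", "Clouds", "Clear", "Clear"],
})

# Keep numeric columns only, then let pandas compute Pearson's r.
numeric = demo.select_dtypes(include="number")
r_pandas = numeric["temp"].corr(numeric["traffic_volume"])

# The same coefficient from its definition: the sum of products of the
# deviations divided by the square root of the product of squared deviations.
x = demo["temp"] - demo["temp"].mean()
y = demo["traffic_volume"] - demo["traffic_volume"].mean()
r_manual = (x * y).sum() / ((x ** 2).sum() * (y ** 2).sum()) ** 0.5

print(r_pandas, r_manual)
```

A coefficient near 0, like the ones in the output above, means there is little linear relationship; it does not rule out a non-linear one.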

Analyze & Plot Scatter Graph On Temperature vs Traffic Volume¶

In [56]:

# Plotting scatter plot of temperature vs traffic volume.

plt.scatter(by_temp["temp"],by_temp['traffic_volume'])
plt.xlabel("temp")
plt.ylabel("traffic volume")
plt.xlim(230,320)
plt.title("temp vs traffic volume")

Out[56]:

Text(0.5, 1.0, 'temp vs traffic volume')

The temperatures are concentrated between about 245 and 305 kelvin.

Because by_temp holds almost 6,000 points (one per distinct temperature value), the graph appears as a block.

Let's plot the scatter graph with fewer rows at a time.

In [57]:

# Defining size of the canvas.
plt.figure(figsize = (10,12))

# Plotting scatter plot with all rows(apprx. 6000).
plt.subplot(2,2,1)
plt.scatter(by_temp.loc[:,"temp"],by_temp.loc[:,'traffic_volume'])
plt.xlabel("temp")
plt.ylabel("traffic volume")
plt.xlim(240,320)
plt.title("temp vs traffic volume")

# Plotting scatter plot with first 2000 rows.
plt.subplot(2,2,2)
plt.scatter(by_temp.loc[0:2000,"temp"],by_temp.loc[0:2000,'traffic_volume'])
plt.xlabel("temp")
plt.ylabel("traffic volume")
plt.xlim(240,320)
plt.title("temp vs traffic volume")


#Plotting scatter plot with next 2000 rows.
plt.subplot(2,2,3)
plt.scatter(by_temp.loc[2000:4000,"temp"],by_temp.loc[2000:4000,'traffic_volume'])
plt.xlabel("temp")
plt.ylabel("traffic volume")
plt.xlim(240,320)
plt.title("temp vs traffic volume")


# Plotting scatter plot with last 2000 rows.
plt.subplot(2,2,4)
plt.scatter(by_temp.loc[4000:6000,"temp"],by_temp.loc[4000:6000,'traffic_volume'])
plt.xlabel("temp")
plt.ylabel("traffic volume")
plt.xlim(240,320)
plt.title("temp vs traffic volume")

Out[57]:

Text(0.5, 1.0, 'temp vs traffic volume')

From the above graphs, we can infer that traffic volume is spread fairly evenly across temperatures between 245 and 305 kelvin.

Overall, the numerical weather columns are not reliable indicators of heavy traffic.
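Another way to check whether traffic changes with temperature, without relying on a dense scatter plot, is to bin the temperatures and compare the mean traffic volume per bin. The sketch below is not part of the original analysis; it uses a hypothetical eight-row sample and arbitrary 20-kelvin bins chosen for illustration.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative, not from the dataset.
demo = pd.DataFrame({
    "temp": [250.0, 258.0, 266.0, 274.0, 282.0, 290.0, 298.0, 304.0],
    "traffic_volume": [4500, 4700, 4600, 4900, 4800, 5000, 4700, 4900],
})

# Bin edges in kelvin (an assumption made for this sketch).
bins = [245, 265, 285, 305]
demo["temp_bin"] = pd.cut(demo["temp"], bins=bins)

# Mean traffic volume and number of rows in each temperature bin.
by_bin = demo.groupby("temp_bin", observed=True)["traffic_volume"].agg(["mean", "count"])
print(by_bin)
```

If the bin means are close to each other, that supports the reading that temperature is not a useful indicator.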

In [77]:

# Creating DataFrame which contains the temperature range between 245 and 305 kelvin.

a = day_data['temp'] > 245
b = day_data['temp'] < 305
combined = a & b
x = day_data[combined]
x

Out[77]:

	holiday	temp	rain_1h	snow_1h	clouds_all	weather_main	weather_description	date_time	traffic_volume	time	month	day_week	hour
0	None	288.28	0.00	0.0	40	Clouds	scattered clouds	2012-10-02 09:00:00	5545	9	10	1	9
1	None	289.36	0.00	0.0	75	Clouds	broken clouds	2012-10-02 10:00:00	4516	10	10	1	10
2	None	289.58	0.00	0.0	90	Clouds	overcast clouds	2012-10-02 11:00:00	4767	11	10	1	11
3	None	290.13	0.00	0.0	90	Clouds	overcast clouds	2012-10-02 12:00:00	5026	12	10	1	12
4	None	291.14	0.00	0.0	75	Clouds	broken clouds	2012-10-02 13:00:00	4918	13	10	1	13
...	...	...	...	...	...	...	...	...	...	...	...	...	...
48194	None	283.84	0.00	0.0	75	Rain	proximity shower rain	2018-09-30 15:00:00	4302	15	9	6	15
48195	None	283.84	0.00	0.0	75	Drizzle	light intensity drizzle	2018-09-30 15:00:00	4302	15	9	6	15
48196	None	284.38	0.00	0.0	75	Rain	light rain	2018-09-30 16:00:00	4283	16	9	6	16
48197	None	284.79	0.00	0.0	75	Clouds	broken clouds	2018-09-30 17:00:00	4132	17	9	6	17
48198	None	284.20	0.25	0.0	75	Rain	light rain	2018-09-30 18:00:00	3947	18	9	6	18

21677 rows × 13 columns

In [76]:

day_data.shape

Out[76]:

(21798, 13)

So the 245-305 kelvin range contains only about 120 rows fewer than the full daytime dataset (21,677 versus 21,798).

Analyze & Plot Traffic Volume based on Main Weather¶

In [60]:

# Calculating mean traffic volume grouped by 'weather_main' column.
by_weather_main = day_data.groupby('weather_main').mean(numeric_only=True)
by_weather_main = by_weather_main.reset_index()

In [64]:

# Plotting horizontal bar graph for traffic volume vs weather type.
plt.barh(by_weather_main["weather_main"],by_weather_main["traffic_volume"])

Out[64]:

<BarContainer object of 11 artists>

The main weather type doesn't bring any significant change to the traffic volume.

However, Squall and Fog reduce the average traffic volume to nearly 4000.

Fog has the most negative impact on traffic volume among all weather types. It may be due to poor visibility during fog.

Snow, Mist and Haze also show minor decreases in traffic volume.
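Comparisons like these are easier to read when the groups are sorted by their mean before plotting. The sketch below shows the idea on a hypothetical six-row sample (the values are made up and do not reproduce the real means).

```python
import pandas as pd

# Hypothetical sample; the values are illustrative, not from the dataset.
demo = pd.DataFrame({
    "weather_main": ["Clear", "Clear", "Fog", "Fog", "Rain", "Rain"],
    "traffic_volume": [4800, 5000, 4300, 4100, 4900, 4700],
})

# Mean traffic volume per weather type, sorted from lowest to highest.
ranked = demo.groupby("weather_main")["traffic_volume"].mean().sort_values()
print(ranked)
```

Calling `ranked.plot.barh()` would then draw the bars in that order, so the weather types with the lowest traffic stand out at one end of the chart.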

Analyze & Plot Traffic Volume based on Weather Description¶

In [65]:

# Calculating mean traffic volume grouped by 'weather_description' column.
by_weather_description = day_data.groupby('weather_description').mean(numeric_only=True)
by_weather_description = by_weather_description.reset_index()

In [66]:

# Plotting horizontal bar graph for traffic volume vs weather description.
plt.figure(figsize = (10,20))
plt.barh(by_weather_description["weather_description"],by_weather_description["traffic_volume"])
plt.yticks(fontsize=14)
plt.show()

The weather description doesn't bring significant changes to the traffic volume, except during **Thunderstorm with Drizzle**, which reduces the traffic volume to close to **2000**.

Other descriptions associated with lower traffic include: Thunderstorm With Rain, Thunderstorm With Light Rain, Snow, Sleet, Proximity Thunderstorm With Rain, Light Snow, Mist, Light Shower Snow, Heavy Snow, Light Intensity Shower Rain, Fog, Freezing Rain, Squalls.

Some bad weather descriptions are associated with higher traffic volume, mainly: Heavy Rain, Shower Drizzle, Proximity Thunderstorm with Drizzle, and Light Rain and Snow.

This may be because these weather conditions are not too severe, so people still travel but prefer a car over other means such as cycling, walking, or waiting for a bus.
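A group mean is only as trustworthy as the number of rows behind it, and the notebook does not report how many rows each weather description has. As a hedged sketch, the example below reports the row count next to each mean and keeps only groups above a threshold; the sample data and the `min_rows` cut-off are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample; the values are illustrative, not from the dataset.
demo = pd.DataFrame({
    "weather_description": ["light rain"] * 4 + ["sky is clear"] * 3 + ["squalls"],
    "traffic_volume": [4900, 5000, 4800, 5100, 4700, 4800, 4600, 2000],
})

# Mean traffic volume and number of rows for each description.
summary = demo.groupby("weather_description")["traffic_volume"].agg(["mean", "count"])

# Keep only descriptions backed by at least min_rows observations.
min_rows = 3  # arbitrary threshold for this sketch
reliable = summary[summary["count"] >= min_rows]
print(reliable)
```

Applied to `day_data`, a table like this would show whether an extreme mean, such as the one for Thunderstorm with Drizzle, rests on many readings or only a few.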

Conclusion¶

In this project, we tried to find a few indicators of heavy traffic on the I-94 Interstate highway. We managed to find two types of indicators:

Time indicators

- The traffic is usually heavier during warm months (March–October) compared to cold months (November–February).
- The traffic is usually heavier on business days compared to the weekends.
- On business days, within the daytime hours analyzed (8 to 18), traffic peaks around 8 and 16.

Weather indicators

- Heavy Rain
- Shower Drizzle
- Light rain and snow
- Proximity thunderstorm with drizzle