By Ben Welsh
The Los Angeles Times conducted an analysis of federal data to evaluate the makeup and pay of the construction workforce.
The analysis found that wages have been in decline nationwide for several decades. In LA County, this has coincided with a shift to a workforce made up overwhelmingly of Latinos, which leads a nationwide trend in the industry toward employing more Latino workers.
Those results were published in the April 20, 2017 story "Immigrants flooded California construction. Worker pay sank. Here’s why".
The story cites other studies by UCLA, unionstats.com and other sources that are not reproduced here.
Downloaded data on wages from the Bureau of Labor Statistics' Current Employment Statistics Program and prepared it for analysis.
%%capture
%run bls/01-download.ipynb
%run bls/02-transform.ipynb
Downloaded U.S. Census data tracking the Hispanic ethnicity of adult workers in Los Angeles by industry from the University of Minnesota's IPUMS online data analysis system and prepared it for analysis.
import cpi
import calculate
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
Read in BLS data that tracks a number of metrics by industry
bls = pd.read_csv("./bls/output/bls_ce_transformed.csv")
Filter down to unadjusted data, as recommended by BLS staff
bls = bls[bls.seasonal == 'U']
Filter down to records that track large sectors of the economy the BLS calls "super sectors." Construction is one of them.
bls = bls[bls.supersector_name == bls.industry_name]
Filter down to data tracking the average hourly pay for non-supervisors in each supersector
bls = bls[
    bls.data_type_text == 'AVERAGE HOURLY EARNINGS OF PRODUCTION AND NONSUPERVISORY EMPLOYEES'
]
Filter down to annual averages, which the BLS stores under the period code M13
bls = bls[bls.period == 'M13']
Adjust the wages for inflation to 2016 dollars using the Consumer Price Index
bls['value_2016_dollars'] = bls.apply(
    lambda x: cpi.to_2016_dollars(x.value, x.year),
    axis=1
)
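The adjustment is a ratio of index values: a wage from a given year is multiplied by the target year's CPI and divided by that year's CPI. Below is a minimal sketch of that arithmetic. The function name `to_target_dollars` and the index values are illustrative placeholders, not the `cpi` module's actual implementation or official BLS figures.

```python
def to_target_dollars(value, cpi_source_year, cpi_target_year):
    """Scale a dollar amount by the ratio of two CPI index values."""
    if cpi_source_year <= 0:
        raise ValueError("CPI index values must be positive")
    return value * cpi_target_year / cpi_source_year


# Placeholder index values, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_CPI = {1990: 100.0, 2016: 200.0}

adjusted = to_target_dollars(10.0, HYPOTHETICAL_CPI[1990], HYPOTHETICAL_CPI[2016])
print(adjusted)  # 20.0
```

With prices doubling between the two placeholder years, a $10 wage becomes $20 in target-year dollars.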
Trim that data down and chart the change in Construction pay
bls_construction = bls[
    bls.supersector_name == 'Construction'
].set_index("year")
Output data for a graphic
bls_construction[['value_2016_dollars']][bls_construction.index > 1972].to_csv("./bls/output/graphic.csv")
What was the peak year?
bls_construction.sort_values(
    "value_2016_dollars",
    ascending=False
).value_2016_dollars.head(1)
year
1972    31.866958
Name: value_2016_dollars, dtype: float64
How much do they make today?
bls_construction.sort_index(
    ascending=False
).value_2016_dollars.head(1)
year
2016    25.97
Name: value_2016_dollars, dtype: float64
How much has pay declined between the two years?
max_construction_pay = bls_construction.at[1972, 'value_2016_dollars']
construction_pay_today = bls_construction.at[2016, 'value_2016_dollars']
construction_pay_today - max_construction_pay
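The subtraction gives the decline in dollars. To express it as a share of the peak as well, the same two figures can be divided. The sketch below hard-codes the values printed earlier in this notebook so it runs on its own; the variable `percent_change` is an addition for illustration and is not part of the original analysis.

```python
# Values printed by the cells above, in 2016 dollars per hour.
max_construction_pay = 31.866958   # 1972, the peak year
construction_pay_today = 25.97     # 2016

# Absolute change, then the change as a percentage of the peak
change = construction_pay_today - max_construction_pay
percent_change = change / max_construction_pay * 100

print(f"Change: {change:.2f} dollars per hour")
print(f"Percent change: {percent_change:.1f}%")
```

On these figures the decline is roughly $5.90 an hour, or about 18.5 percent of the 1972 peak.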
Pull the same time series for the private sector as a whole
bls_overall = bls[
    bls.supersector_name == 'Total private'
].set_index("year")
Compare the two
bls_comparison = pd.merge(
    bls_construction.reset_index()[['year', 'value_2016_dollars']],
    bls_overall.reset_index()[['year', 'value_2016_dollars']],
    on="year",
    suffixes=["_construction", "_private"]
).set_index("year")
Measure the difference
bls_comparison['diff'] = bls_comparison.apply(
    lambda x: x.value_2016_dollars_construction - x.value_2016_dollars_private,
    axis=1
)
What year was the gap the greatest?
What is the gap today?
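One way to answer both questions is with `idxmax` for the widest gap and a lookup on the latest year in the index. This is a sketch on a small stand-in table, not the notebook's own cells. The frame `comparison` and every number in it are hypothetical, and only the `diff` column name comes from the analysis above.

```python
import pandas as pd

# Hypothetical gaps (construction minus private sector), for illustration only.
comparison = pd.DataFrame(
    {"diff": [4.0, 6.5, 5.0, 3.5]},
    index=pd.Index([1980, 1990, 2000, 2016], name="year"),
)

# Year with the widest gap, and the size of that gap.
# Bracket access is used because DataFrame.diff is also a method name.
peak_gap_year = comparison["diff"].idxmax()
peak_gap = comparison.at[peak_gap_year, "diff"]

# Gap in the most recent year on record
latest_year = comparison.index.max()
latest_gap = comparison.at[latest_year, "diff"]

print(peak_gap_year, peak_gap)
print(latest_year, latest_gap)
```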
Extract the pay for 2011 and 2016
bls_11v16 = bls_comparison[
    (bls_comparison.index == 2011) |
    (bls_comparison.index == 2016)
]
Calculate the percentage increase for construction
Do the same for the overall private sector
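Both percentage changes can be computed in one step by subtracting the 2011 row from the 2016 row and dividing by the 2011 row. The sketch below uses plain pandas arithmetic rather than the `calculate` library imported at the top of the notebook. The frame `pay` and its dollar values are hypothetical stand-ins for `bls_11v16`.

```python
import pandas as pd

# Hypothetical inflation-adjusted hourly pay, for illustration only.
pay = pd.DataFrame(
    {
        "value_2016_dollars_construction": [25.0, 26.0],
        "value_2016_dollars_private": [20.0, 21.0],
    },
    index=pd.Index([2011, 2016], name="year"),
)

# Percentage change from 2011 to 2016, for every column at once
percent_change = (pay.loc[2016] - pay.loc[2011]) / pay.loc[2011] * 100
print(percent_change.round(1))
```

In this made-up example both sectors gain a dollar an hour, but the private sector's percentage increase is larger because it starts from a lower base.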
Read in construction industry worker totals by Hispanic status, retrieved from the University of Minnesota's IPUMS compilation of U.S. Census data.
la_hispanics = pd.read_csv("./ipums/output/hispanics_la_combined.csv")
Plot it by year
Output the totals
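For readers unfamiliar with how a share like `latino_percent` is derived from worker counts, here is a minimal sketch. It assumes the source table holds one count per group per year. The column names `latino`, `not_latino` and `total`, along with all of the numbers, are hypothetical and do not describe the actual IPUMS extract.

```python
import pandas as pd

# Hypothetical worker counts, for illustration only.
workers = pd.DataFrame(
    {
        "year": [1980, 2015],
        "latino": [25, 70],
        "not_latino": [75, 30],
    }
)

# Share of each year's workforce that is Latino, as a percentage
workers["total"] = workers["latino"] + workers["not_latino"]
workers["latino_percent"] = workers["latino"] / workers["total"] * 100

print(workers[["year", "latino_percent"]])
```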
Output data for a graphic
la_hispanics[['year', 'latino_percent']].to_csv("./ipums/output/graphic.csv", index=False)