So far we have seen how to use the database interface of Smart EDGAR. In this document I give a quick overview of the core functionality of the file-based API, which does not require any DBMS.
As a precondition, the files must already have been downloaded from EDGAR.
We install the Smart EDGAR library with the help of Maven. We also install the Jupyter-jdk extensions so that we can render ITable objects as BeakerX tables.
%classpath config resolver maven-public http://software.pschatzmann.ch/repository/maven-public/
%%classpath add mvn
ch.pschatzmann:smart-edgar:LATEST
ch.pschatzmann:jupyter-jdk-extensions:LATEST
import ch.pschatzmann.display._
import ch.pschatzmann.edgar.base._
import ch.pschatzmann.edgar.reporting.company._
import ch.pschatzmann.edgar.dataload.rss.RSSDataSource
import ch.pschatzmann.edgar.utils.Utils
import ch.pschatzmann.edgar.base.Fact._
Displayers.setup
true
We can access the information starting from the EdgarCompany class, which provides the list of all companies. Here we take the first ten entries from EdgarCompany.stream.
%%time
import scala.collection.JavaConverters._
val companies = EdgarCompany.stream.iterator.asScala.slice(0,10).toSeq
CPU times: user 0 ns, sys: 219 µs, total: 219 µs Wall Time: 410 ms
[[0001368622, 0001286181, 0001577898, 0000886136, 0000886137, 0001505155, 0000904918, 0001674335, 0001474042, 0001276262]]
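The cell above turns a Java stream into a Scala iterator and keeps only the first ten elements. The pattern can be tried without Smart EDGAR. The following is a minimal sketch that uses a plain Java stream of made-up numbers as a stand-in for EdgarCompany.stream; the names javaStream and firstTen are illustrative only.

```scala
import scala.collection.JavaConverters._
import java.util.stream.IntStream

// Stand-in for EdgarCompany.stream: a lazy Java stream of made-up ids.
val javaStream: java.util.stream.Stream[Integer] = IntStream.rangeClosed(1, 1000).boxed()

// asScala wraps the Java iterator; slice(0, 10) consumes only the first ten elements.
val firstTen: List[Int] = javaStream.iterator.asScala.slice(0, 10).map(_.intValue).toList
```

Because the iterator is consumed lazily, the remaining elements of the stream are never requested, which is presumably why the cell above returns quickly.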
val company = companies(0)
0001368622
company.getCompanyName
AeroVironment Inc
company.getSICDescription
AIRCRAFT
company.getStateLocation
CA
company.getStateIncorporation
DE
val filings = company.getFilings
[1368622-10-K-20120626, 1368622-10-K-20130625, 1368622-10-K-20130626, 1368622-10-K-20140708, 1368622-10-K-20150630, 1368622-10-K-20160628, 1368622-10-K-20160629, 1368622-10-K-20170627, 1368622-10-K-20170628, 1368622-10-K-20180627, 1368622-10-Q-20110908, 1368622-10-Q-20111206, 1368622-10-Q-20120306, 1368622-10-Q-20120307, 1368622-10-Q-20120905, 1368622-10-Q-20120906, 1368622-10-Q-20121205, 1368622-10-Q-20130305, 1368622-10-Q-20130827, 1368622-10-Q-20131126, 1368622-10-Q-20131127, 1368622-10-Q-20140305, 1368622-10-Q-20140903, 1368622-10-Q-20141125, 1368622-10-Q-20141126, 1368622-10-Q-20150303, 1368622-10-Q-20150901, 1368622-10-Q-20151208, 1368622-10-Q-20160308, 1368622-10-Q-20160830, 1368622-10-Q-20160831, 1368622-10-Q-20161207, 1368622-10-Q-20170307, 1368622-10-Q-20170308, 1368622-10-Q-20170829, 1368622-10-Q-20170830, 1368622-10-Q-20171205, 1368622-10-Q-20171206, 1368622-10-Q-20180306, 1368622-10-Q-20180307, 1368622-10-Q-20180905, 1368622-10-Q-20180906, 1368622-10-Q-20181129, 1368622-10-Q-20181130]
We can combine multiple filings into one XBRL data access object. The filings are selected with the help of a regular expression:
%%time
val xbrl = company.getXBRL(".*10-K.*")
CPU times: user 0 ns, sys: 229 µs, total: 229 µs Wall Time: 12 s
ch.pschatzmann.edgar.base.XBRL@64ba18cf
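To get a feeling for which filings a given pattern selects, the regular expression can be tried on the filing ids first. The sketch below uses plain Scala and a few ids copied from the listing above. It assumes that the pattern must match the whole id (as String.matches does); how Smart EDGAR applies the pattern internally is not confirmed here. The helper select is hypothetical.

```scala
// A few filing ids copied from the listing above.
val filingIds = Seq(
  "1368622-10-K-20120626",
  "1368622-10-K-20130625",
  "1368622-10-Q-20110908",
  "1368622-10-Q-20111206"
)

// Hypothetical helper. Assumption: the regex has to match the complete id.
def select(ids: Seq[String], regex: String): Seq[String] =
  ids.filter(_.matches(regex))

val annualReports = select(filingIds, ".*10-K.*")       // all 10-K filings
val annual2012    = select(filingIds, ".*10-K-2012.*")  // 10-K filings dated 2012
```

Narrowing the pattern this way keeps the number of parsed files small, which matters given the wall times reported for the cells in this section.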
...or we just use all files
%%time
val xbrlAll = company.getXBRL()
CPU times: user 0 ns, sys: 234 µs, total: 234 µs Wall Time: 63 s
ch.pschatzmann.edgar.base.XBRL@7efaf2d8
All attribute values are indexed automatically, so we can use the findValues method to search that index. To display the data in BeakerX as a table, we convert it to a Scala collection of Maps.
val cogs = xbrl.findValues("Cost of Goods Sold")
new TableDisplay(cogs.asScala.map(v => v.getAttributes.asScala.toMap))
val values = xbrl.findValues()
new TableDisplay(values.asScala.map(v => v.getAttributes.asScala.toMap))
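The idea of indexing facts by all of their attribute values can be illustrated with plain Scala collections. This is only a sketch of the concept, not the Smart EDGAR implementation: the attribute names, the sample values and the lookup helper are made up.

```scala
// Hypothetical facts, each represented by its attribute map (made-up values).
val facts: Seq[Map[String, String]] = Seq(
  Map("parameterName" -> "CostOfGoodsSold", "label" -> "Cost of Goods Sold", "value" -> "100"),
  Map("parameterName" -> "Revenues", "label" -> "Revenues", "value" -> "250"),
  Map("parameterName" -> "CostOfGoodsSold", "label" -> "Cost of Goods Sold", "value" -> "120")
)

// Build the index: every attribute value points to the facts that carry it.
val index: Map[String, Seq[Map[String, String]]] =
  facts
    .flatMap(fact => fact.values.map(value => value -> fact))
    .groupBy(_._1)
    .map { case (value, pairs) => value -> pairs.map(_._2).distinct }

// Hypothetical lookup helper: returns all facts that contain the given attribute value.
def lookup(key: String): Seq[Map[String, String]] = index.getOrElse(key, Seq.empty)

val cogsFacts = lookup("Cost of Goods Sold")
```

With such an index a search by label, parameter name or any other attribute value is a single map lookup instead of a scan over all facts.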
We should filter out all values that are not relevant for our purpose, e.g. values that belong to a segment, are not numeric, or are empty:
val values = xbrl.findValues().asScala
.filter(v => v.getContext.getSegments.isEmpty )
.filter(v => v.getDataType == DataType.number)
.filter(v => !v.getValue.isEmpty)
new TableDisplay(values.map(v => v.getAttributes.asScala.toMap))
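The effect of the three filters can be checked on a small hand-made data set. The sketch below uses a hypothetical case class SimpleFact as a stand-in for the fact values; its field names and the sample data are assumptions and are not part of Smart EDGAR.

```scala
// Hypothetical stand-in for a fact value; the field names are illustrative only.
case class SimpleFact(name: String, segments: Seq[String], isNumber: Boolean, value: String)

val raw = Seq(
  SimpleFact("Revenues", Seq.empty, isNumber = true, value = "250"),
  SimpleFact("Revenues", Seq("SegmentA"), isNumber = true, value = "90"),   // segment breakdown
  SimpleFact("DocumentType", Seq.empty, isNumber = false, value = "10-K"),  // not numeric
  SimpleFact("InventoryNet", Seq.empty, isNumber = true, value = "")        // empty value
)

// Same three steps as above: no segments, numeric data type, non-empty value.
val relevant = raw
  .filter(_.segments.isEmpty)
  .filter(_.isNumber)
  .filter(_.value.nonEmpty)
```

Only the first record survives all three filters, which is the kind of company-level numeric value we usually want to analyse.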
val labelAPI = xbrl.getLabelAPI()
labelAPI.getLabel("CostOfGoodsSold").getLabel
Cost of Goods Sold
Usually we want to access the numerical information. However, the getCombinedTextValues method also provides the consolidated text, which we can use as input for NLP functionality:
val values = xbrl.getCombinedTextValues
new TableDisplay(values.asScala.map(v => v.getAttributes.asScala.toMap))
We can also convert all the values to CSV by calling the toValueCSV method
Utils.setCSVDelimiter(",")
val file = Utils.createTempFile(xbrl.toValueCSV)
val table = new TableDisplay(new CSV().readFile(file.getAbsolutePath))
We can also convert the (numerical) data to an ITable object
xbrl.toTable
From the EdgarCompany object we can also access the CompanyEdgarValues class, which supports the calculation of KPIs. However, it is much more efficient to use the corresponding database functionality.
val values = company.getCompanyEdgarValues
.setUseArrayList(true)
.setAddTime(true)
.setFilter(new FilterYearly())
.setParameterNames("NetIncomeLoss","OperatingIncomeLoss","ResearchAndDevelopmentExpense",
"CashAndCashEquivalentsAtCarryingValue","AvailableForSaleSecuritiesCurrent","AccountsReceivableNetCurrent",
"Revenues","SalesRevenueNet","InventoryNet","AssetsCurrent","LiabilitiesCurrent","Assets","EarningsPerShareBasic",
"StockholdersEquity")
.addFormula("Revenue","Edgar.coalesce('Revenues', 'SalesRevenueNet')")
.addFormula("QuickRatio","(CashAndCashEquivalentsAtCarryingValue + AccountsReceivableNetCurrent + AvailableForSaleSecuritiesCurrent) / LiabilitiesCurrent")
.addFormula("CurrentRatio","AssetsCurrent / LiabilitiesCurrent")
.addFormula("InventoryTurnover","Revenue / InventoryNet")
.addFormula("NetProfitMargin","NetIncomeLoss / Revenue")
.addFormula("SalesResearchRatio%","ResearchAndDevelopmentExpense / Revenue *100")
.addFormula("NetIncomeResearchRatio%","ResearchAndDevelopmentExpense / NetIncomeLoss * 100")
.addFormula("NetIncomeChange%", "(NetIncomeLoss - Edgar.lag('NetIncomeLoss', -1)) / Edgar.lag('NetIncomeLoss', -1) * 100 ")
.addFormula("RevenueChange%", "Edgar.percentChange('Revenue')" )
.addFormula("ResearchAndDevelopmentChange%","Edgar.percentChange('ResearchAndDevelopmentExpense')" )
.removeParameterNames("Revenues","SalesRevenueNet")
val list = values.getTable
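The arithmetic behind the change and coalesce formulas can be sketched in plain Scala with made-up figures. The helpers lag, percentChange and coalesce below are hypothetical re-implementations for illustration. They assume that a lag returns the value of the previous period, that a percentage change is (current - previous) / previous * 100, and that coalesce returns the first available value; the actual behaviour of the Edgar.* functions is not confirmed here.

```scala
// Made-up yearly net income figures, oldest first.
val netIncome = Seq(200.0, 250.0, 150.0)

// Value of the previous period for each position; None for the first period.
def lag(xs: Seq[Double]): Seq[Option[Double]] =
  (None: Option[Double]) +: xs.dropRight(1).map(x => Option(x))

// Percentage change against the previous period.
def percentChange(xs: Seq[Double]): Seq[Option[Double]] =
  xs.zip(lag(xs)).map { case (current, previous) =>
    previous.map(p => (current - p) / p * 100)
  }

// First available value, similar in spirit to SQL COALESCE.
def coalesce(values: Option[Double]*): Option[Double] = values.flatten.headOption

val change  = percentChange(netIncome)
val revenue = coalesce(None, Some(250.0))

// Operator precedence matters: without parentheses around the difference,
// the division and multiplication are evaluated before the subtraction.
val withParentheses    = (250.0 - 200.0) / 200.0 * 100
val withoutParentheses = 250.0 - 200.0 / 200.0 * 100
```

The last two values differ considerably, so it is worth checking hand-written formulas against a small example like this one before applying them to all companies.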
Instead of accessing the information by company we can request all filings with the EdgarFiling.list() method
%%time
var filings = EdgarFiling.list(".*10-K.*")
filings.size
CPU times: user 0 ns, sys: 212 µs, total: 212 µs Wall Time: 3 s
53402
var filing = filings.get(0)
1750-10-K-20110713
var companyName = filing.getCompanyInfo.getCompanyName
AAR CORP
var xbrl = filing.getXBRL
xbrl.toTable
The data can be downloaded from EDGAR with the help of the RSSDataSource. If the history flag is set to false, we just download the most recent documents from https://www.sec.gov/Archives/edgar/usgaap.rss.xml. If the history flag is set to true, we download all available data back to 2005-04.
import scala.collection.JavaConverters._
var downloadData = new RSSDataSource().getData(false, "10-K.*").asScala
downloadData.foreach(d => d.download())
null
Last but not least, we can load an XBRL object directly from the EDGAR database via the Internet. The getXBRL method on the FeedInfoRecord parses the local XBRL file if it exists; otherwise the information is downloaded from the URL indicated in the FeedInfoRecord.
val first = downloadData.toSeq(0)
val xbrl = first.getXBRL
val values = xbrl.findValues().asScala
new TableDisplay(values.map(v => v.getAttributes.asScala.toMap))
first.getUriXbrl()
https://www.sec.gov/Archives/edgar/data/1000230/000143774918022311/0001437749-18-022311-xbrl.zip
first.getUrlHttp()
https://www.sec.gov/Archives/edgar/data/1000230/000143774918022311/0001437749-18-022311-index.htm