%load_ext sql
from django.conf import settings
connection_string = 'postgresql+psycopg2://{USER}:{PASSWORD}@{HOST}:{PORT}/{NAME}'.format(
**settings.DATABASES['default']
)
%sql $connection_string
u'Connected: ccdc@calaccess_website'
The documentation says that the records are unique on the following fields:
FILING_ID
AMEND_ID
LINE_ITEM
REC_TYPE
FORM_TYPE
REC_TYPE
is always the same value: S497
, so we can ignore this column.
FORM_TYPE
is either F497P1
or F497P2
, indicating in whether itemized transaction is listed under Part 1 (Contributions Received) or Part 2 (Contributions Made). I'll split these up into separate tables.
S497_CD
records actually unique on FILING_ID
, AMEND_ID
and LINE_ITEM
?¶Yes. And this is even true across the Parts 1 and 2 (Contributions Received and Contributions Made).
%%sql
SELECT "FILING_ID", "AMEND_ID", "LINE_ITEM", COUNT(*)
FROM "S497_CD"
GROUP BY 1, 2, 3
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
0 rows affected.
FILING_ID | AMEND_ID | LINE_ITEM | count |
---|
TRAN_ID
¶The S497_CD
table includes a TRAN_ID
field, which the documentation describes as a "Permanent value unique to this item".
TRAN_ID
ever NULL
or blank?¶No.
%%sql
SELECT COUNT(*)
FROM "S497_CD"
WHERE "TRAN_ID" IS NULL OR "TRAN_ID" = '' OR "TRAN_ID" = '0';
1 rows affected.
count |
---|
0 |
TRAN_ID
unique across filings?¶Decidedly no.
%%sql
SELECT "TRAN_ID", COUNT(DISTINCT "FILING_ID")
FROM "S497_CD"
GROUP BY 1
HAVING COUNT(DISTINCT "FILING_ID") > 1
ORDER BY COUNT(DISTINCT "FILING_ID") DESC
LIMIT 100;
100 rows affected.
TRAN_ID | count |
---|---|
1 | 14271 |
2 | 5893 |
3 | 3490 |
4 | 2336 |
5 | 1661 |
6 | 1139 |
7 | 848 |
8 | 659 |
01 | 627 |
9 | 534 |
10 | 460 |
11 | 383 |
INC1 | 347 |
EXP1 | 332 |
12 | 318 |
13 | 273 |
02 | 272 |
14 | 239 |
INC2 | 225 |
EXP2 | 218 |
15 | 208 |
INC3 | 187 |
16 | 182 |
INC4 | 176 |
EXP3 | 167 |
INC5 | 163 |
INC7 | 158 |
INC6 | 157 |
17 | 147 |
03 | 147 |
INC8 | 142 |
EXP4 | 142 |
18 | 139 |
EXP5 | 127 |
INC10 | 126 |
EXP7 | 125 |
19 | 124 |
INC12 | 124 |
INC9 | 117 |
EXP9 | 115 |
20 | 114 |
INC11 | 112 |
INC14 | 110 |
EXP6 | 108 |
INC15 | 108 |
INC13 | 108 |
21 | 106 |
EXP8 | 103 |
INC18 | 102 |
INC25 | 101 |
INC563 | 100 |
INC573 | 100 |
04 | 100 |
INC564 | 100 |
EXP11 | 98 |
INC26 | 98 |
INC670 | 97 |
INC714 | 97 |
INC583 | 97 |
INC24 | 96 |
INC809 | 96 |
INC625 | 96 |
INC16 | 96 |
INC609 | 96 |
INC20 | 95 |
INC778 | 95 |
INC575 | 95 |
EXP18 | 94 |
INC17 | 94 |
INC567 | 94 |
INC721 | 94 |
INC22 | 94 |
INC681 | 94 |
INC574 | 93 |
INC665 | 93 |
INC95 | 93 |
INC701 | 93 |
EXP16 | 93 |
INC32 | 93 |
INC887 | 92 |
INC832 | 92 |
INC570 | 92 |
INC561 | 92 |
INC766 | 92 |
INC716 | 92 |
22 | 92 |
INC738 | 92 |
INC23 | 91 |
INC565 | 91 |
INC591 | 91 |
INC722 | 91 |
INC646 | 91 |
INC764 | 91 |
INC606 | 91 |
INC584 | 91 |
INC705 | 91 |
INC651 | 91 |
INC736 | 91 |
INC664 | 91 |
INC640 | 90 |
But TRAN_ID
does appear to be unique within each filing amendment, and appears to be reused for each filing.
%%sql
SELECT "FILING_ID", "TRAN_ID", COUNT(DISTINCT "AMEND_ID") AS amend_count, COUNT(*) AS row_count
FROM "S497_CD"
GROUP BY 1, 2
ORDER BY COUNT(*) DESC
LIMIT 100;
100 rows affected.
FILING_ID | TRAN_ID | amend_count | row_count |
---|---|---|---|
1896576 | C10522528 | 6 | 6 |
1361544 | NON98 | 6 | 6 |
1361544 | NON92 | 6 | 6 |
1896576 | C10522532 | 6 | 6 |
1058082 | C1742A | 6 | 6 |
1361544 | NON100 | 6 | 6 |
1058082 | C1743 | 6 | 6 |
1361544 | NON95 | 6 | 6 |
1361544 | NON99 | 6 | 6 |
977789 | F497P10000244485 | 6 | 6 |
1361544 | NON102 | 6 | 6 |
1361544 | NON96 | 6 | 6 |
1896576 | C10522529 | 6 | 6 |
1361544 | NON101 | 6 | 6 |
1361544 | NON97 | 6 | 6 |
1896576 | C10522533 | 6 | 6 |
1896576 | C10522531 | 6 | 6 |
1299277 | 2007-0458 | 5 | 5 |
1169898 | C2674 | 5 | 5 |
1299277 | 2007-0459 | 5 | 5 |
1169898 | C2671 | 5 | 5 |
1161821 | 2006-0921 | 5 | 5 |
1161821 | 2006-0918 | 5 | 5 |
1161821 | 2006-0922 | 5 | 5 |
1161821 | 2006-0925 | 5 | 5 |
1169898 | C2670 | 5 | 5 |
1169898 | C2673 | 5 | 5 |
868371 | INC1066 | 5 | 5 |
868371 | NON1071 | 5 | 5 |
1299277 | 2007-0445 | 5 | 5 |
1169898 | C2672 | 5 | 5 |
1299277 | 2007-0460 | 5 | 5 |
1161821 | 2006-0908 | 5 | 5 |
1110656 | F497P10000340769 | 5 | 5 |
1161821 | 2006-0909 | 5 | 5 |
1058082 | C1749 | 5 | 5 |
960322 | C97966 | 5 | 5 |
960322 | C97962 | 5 | 5 |
977789 | F497P10000244544 | 5 | 5 |
1058082 | C1750 | 5 | 5 |
1058082 | C1745 | 5 | 5 |
1058082 | C1748 | 5 | 5 |
1058082 | C1753 | 5 | 5 |
1110656 | F497P10000340767 | 5 | 5 |
1161821 | 2006-0904 | 5 | 5 |
1161821 | 2006-0907 | 5 | 5 |
1161821 | 2006-0914 | 5 | 5 |
1161821 | 2006-0917 | 5 | 5 |
1161821 | 2006-0919 | 5 | 5 |
1161821 | 2006-0920 | 5 | 5 |
1161821 | 2006-0923 | 5 | 5 |
1161821 | 2006-0924 | 5 | 5 |
1161821 | 2006-0926 | 5 | 5 |
1161821 | 2006-0927 | 5 | 5 |
868371 | INC1058 | 5 | 5 |
868371 | INC1068 | 5 | 5 |
868371 | INC1059 | 5 | 5 |
868371 | INC1060 | 5 | 5 |
868371 | INC1069 | 5 | 5 |
868371 | NON1070 | 5 | 5 |
1169898 | C2675 | 5 | 5 |
1169898 | C2676 | 5 | 5 |
1161821 | 2006-0910 | 5 | 5 |
960322 | C97928 | 5 | 5 |
960322 | C97923 | 5 | 5 |
960322 | C97930 | 5 | 5 |
960322 | C97918 | 5 | 5 |
960322 | C97908 | 5 | 5 |
960322 | C97907 | 5 | 5 |
960322 | C97911 | 5 | 5 |
960322 | C97919 | 5 | 5 |
960322 | C97931 | 5 | 5 |
960322 | C97914 | 5 | 5 |
960322 | C97916 | 5 | 5 |
960322 | C97921 | 5 | 5 |
960322 | C97922 | 5 | 5 |
960322 | C97925 | 5 | 5 |
960322 | C97926 | 5 | 5 |
960322 | C97957 | 5 | 5 |
960322 | C97961 | 5 | 5 |
960322 | C97902 | 5 | 5 |
961164 | INC8906 | 5 | 5 |
960322 | C97877 | 5 | 5 |
960322 | C97903 | 5 | 5 |
1058082 | C1746 | 5 | 5 |
1058082 | C1747 | 5 | 5 |
1058082 | C1751 | 5 | 5 |
1058082 | C1752 | 5 | 5 |
1110656 | F497P10000340705 | 5 | 5 |
1110656 | F497P10000340707 | 5 | 5 |
1161821 | 2006-0902 | 5 | 5 |
1161821 | 2006-0903 | 5 | 5 |
1161821 | 2006-0905 | 5 | 5 |
1161821 | 2006-0906 | 5 | 5 |
1161821 | 2006-0912 | 5 | 5 |
1161821 | 2006-0913 | 5 | 5 |
1161821 | 2006-0915 | 5 | 5 |
1161821 | 2006-0916 | 5 | 5 |
960322 | C97875 | 5 | 5 |
960322 | C97874 | 5 | 5 |
There's one exception:
%%sql
SELECT "FILING_ID", "TRAN_ID", "AMEND_ID", COUNT(*)
FROM "S497_CD"
GROUP BY 1, 2, 3
HAVING COUNT(*) > 1;
1 rows affected.
FILING_ID | TRAN_ID | AMEND_ID | count |
---|---|---|---|
2072379 | EXP9671 | 1 | 2 |
Looks like this TRAN_ID
is duplicated across the two parts of the filing. So it was both a contribution both made and received?
%%sql
SELECT *
FROM "S497_CD"
WHERE "FILING_ID" = 2072379
AND "TRAN_ID" = 'EXP9671';
2 rows affected.
id | FILING_ID | AMEND_ID | LINE_ITEM | REC_TYPE | FORM_TYPE | TRAN_ID | ENTITY_CD | ENTY_NAML | ENTY_NAMF | ENTY_NAMT | ENTY_NAMS | ENTY_CITY | ENTY_ST | ENTY_ZIP4 | CTRIB_EMP | CTRIB_OCC | CTRIB_SELF | ELEC_DATE | CTRIB_DATE | DATE_THRU | AMOUNT | CMTE_ID | CAND_NAML | CAND_NAMF | CAND_NAMT | CAND_NAMS | OFFICE_CD | OFFIC_DSCR | JURIS_CD | JURIS_DSCR | DIST_NO | OFF_S_H_CD | BAL_NAME | BAL_NUM | BAL_JURIS | MEMO_CODE | MEMO_REFNO | BAL_ID | CAND_ID | SUP_OFF_CD | SUP_OPP_CD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
62013206 | 2072379 | 1 | 2 | S497 | F497P1 | EXP9671 | COM | McCarty for Assembly 2016 | Sacramento | CA | 95814 | None | 2016-08-25 | None | 1000.00 | 1373943 | EXP:A:9671 | ||||||||||||||||||||||||
62013205 | 2072379 | 1 | 1 | S497 | F497P2 | EXP9671 | COM | McCarty for Assembly 2016 | Sacramento | CA | 95814 | 2016-11-08 | 2016-08-25 | None | -1000.00 | 1373943 | Assembly Member Kevin McCarty | ASM | ASM | 07 | H | EXP:S497:9671 |
Looking at the PDF for the filing, it appears to be a check from the California Psychological Association PAC to the McCarty for Assembly 2016 committee, which was given and returned on 8/25/2016.
Regardless, because the combinations of FILING_ID
, AMEND_ID
and TRAN_ID
are unique within each part of the Schedule 497, we could substitute TRAN_ID
for LINE_ITEM
in the composite key when splitting up the contributions received from the contributions made.
The advantage is that the TRAN_ID
purportedly points to the same contribution from one amendment to the next, whereas the same LINE_ITEM
might not because the filers don't necessarily list transactions on the same line from one filing amendment to the next.
Here's an example: On the original Schedule 497 filing for Steven Bradford for Senate 2016, a $8,500.00 contribution from an AFL-CIO sub-committee is listed on line 1. But on the first and second amendments to the filing, it is listed on line 4.