For this post, I am assessing data on Alabamians’ contributions to the 1984 Reagan campaign, as provided by the Federal Election Commission (FEC). I downloaded the CSV file from the Commission and loaded it into Tableau to assist with the analysis.
Are there changes needed to make the data tidy?
No, the dataset is tidy.
Are there over- or under-represented groups?
- Contributions from 1984 outnumber contributions from 1983. This may simply be because more Alabamians contributed in 1984 than in 1983.
- Similarly, contributions received at the end of 1983 and the beginning of 1984 are more numerous than those from any other period.
- Far more contributors came from the Birmingham area than from any other city. Montgomery and Mobile had the next-largest numbers of contributors.
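I made these comparisons in Tableau, but the same tallies could be cross-checked in a few lines of Python. The sketch below is only illustrative: the column names `contribution_receipt_date` and `contributor_city`, the ISO-style date format, and the sample rows are all assumptions rather than values taken from the FEC file.

```python
import csv
import io
from collections import Counter


def count_by(rows, key_func):
    """Tally rows by whatever key_func extracts from each row."""
    return Counter(key_func(row) for row in rows)


def receipt_year(row):
    # Assumes a date string that starts with the year, e.g. "1984-01-15".
    return row["contribution_receipt_date"][:4]


def city(row):
    # Normalize case and whitespace so "Birmingham " and "BIRMINGHAM" count together.
    return row["contributor_city"].strip().upper()


# Invented sample rows, for illustration only (not from the FEC dataset).
SAMPLE_CSV = """contribution_receipt_date,contributor_city
1983-12-01,BIRMINGHAM
1984-01-15,Birmingham
1984-02-03,MOBILE
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(count_by(rows, receipt_year).most_common())
print(count_by(rows, city).most_common())
```

To run it against the real download, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the CSV and adjust the column names to match its header.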
Is there missing data, i.e., “null” values?
- Forty-eight variables/columns contain null values for every row.
- Many of these null columns seem to be for internal use (e.g., “line_number,” “entity_type,” and “original_sub_id”).
- Other null columns, such as “contributor_occupation” and “contributor_street1,” represent information that contributors may not have been asked for when they donated, or that was redacted to preserve some privacy.
- Still other null columns are irrelevant to the campaign in question, such as “candidate_office_state” (Reagan was running for president, not a state or district office), or are redundant given how I filtered the data. Since I limited my search to contributions to the 1984 Reagan campaign, the “candidate_name” would obviously be Ronald Reagan.
- However, the codebook provided for individual contribution datasets does not include explanations of many of the variables present in the dataset I am assessing. This may reflect changes to the form and information submitted by contributors and campaigns over time from the 1984 election to more recent elections, which the codebook acknowledges.
- Some columns contain null values only in certain rows, where the contributor may not have provided the information (for example, no response for “employer”).
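The distinction above between columns that are null in every row and columns that are null only in some rows can be checked mechanically. This is a minimal sketch, assuming that a null appears in the CSV as an empty field; the column names and rows in the sample are invented for illustration.

```python
import csv
import io


def null_counts(rows, fieldnames):
    """Return {column: number of rows where the value is missing or blank}."""
    counts = {name: 0 for name in fieldnames}
    for row in rows:
        for name in fieldnames:
            value = row.get(name)
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts


def split_null_columns(rows, fieldnames):
    """Separate columns that are empty in every row from those empty in only some."""
    counts = null_counts(rows, fieldnames)
    total = len(rows)
    all_null = [c for c in fieldnames if total and counts[c] == total]
    some_null = [c for c in fieldnames if 0 < counts[c] < total]
    return all_null, some_null


# Invented sample rows, for illustration only (not from the FEC dataset).
SAMPLE_CSV = """contributor_name,contributor_employer,line_number
JANE DOE,ACME CO,
RICHARD ROE,,
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
all_null, some_null = split_null_columns(rows, reader.fieldnames)
print("Null in every row:", all_null)
print("Null in some rows:", some_null)
```

On the full file, the length of `all_null` should match the count of entirely empty columns reported above, if empty fields are indeed how the export represents nulls.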
Does the data pass the “sniff test”? Does anything look wonky?
- For the most part the data does not look weird.
- A few contributions are negative amounts, however, and neither the dataset nor the codebook explains them.
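To pull out the negative amounts for a closer look, a filter like the following could be used. It is a sketch under assumptions: the amount column is taken to be named `contribution_receipt_amount`, amounts are taken to be plain decimal strings, and the sample rows are invented.

```python
from decimal import Decimal


def negative_contributions(rows, amount_column="contribution_receipt_amount"):
    """Return the rows whose amount parses to a value below zero."""
    negatives = []
    for row in rows:
        # Drop thousands separators and surrounding whitespace before parsing.
        raw = (row.get(amount_column) or "").replace(",", "").strip()
        if not raw:
            continue  # Skip blank amounts rather than guessing a value.
        if Decimal(raw) < 0:
            negatives.append(row)
    return negatives


# Invented sample rows, for illustration only (not from the FEC dataset).
sample_rows = [
    {"contributor_name": "JANE DOE", "contribution_receipt_amount": "500.00"},
    {"contributor_name": "RICHARD ROE", "contribution_receipt_amount": "-250.00"},
    {"contributor_name": "ALEX POE", "contribution_receipt_amount": ""},
]

for row in negative_contributions(sample_rows):
    print(row["contributor_name"], row["contribution_receipt_amount"])
```

Listing the flagged rows next to their other columns may help show whether the negative entries share a pattern, which the codebook does not document.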
Where did the data come from?
Individual contribution filing forms (F3P) from 1983 and 1984.
Who collected it?
The Federal Election Commission (FEC).
The dataset was loaded into the FEC database on June 17, 2017.
How was it collected?
From information provided on the F3P contribution form for each contributor. There is a degree of self-reporting for some variables.
Why was it collected?
The FEC collects this information as part of enforcing federal campaign finance laws and safeguarding the integrity of campaign financing.
Whose goals are prioritized?
The FEC’s goals of increasing campaign transparency and regulating campaign contributions are prioritized over, say, contributors’ privacy: names, cities, zip codes, and employers are all listed.
Who benefits and who is overlooked?
- People researching campaign accountability benefit from the data, while those concerned about privacy are overlooked.
- Within the data itself, cities and towns other than Birmingham, Montgomery, and Mobile are overlooked because they have far fewer contributors; however, this may be because people outside these cities did not contribute as much to Reagan’s 1984 campaign.
- Another important point is that contributions under $500 are not reported by the FEC, so smaller-amount contributors are overlooked.