
Forecasting the 2020 U.S. Elections With Decision Desk HQ: Methodology for Modern American Electoral Dynamics

Published on Oct 27, 2020

Abstract

Øptimus has constructed models to predict the outcomes of the 2020 presidential and congressional general elections in collaboration with Decision Desk HQ. The model is an iteration from its 2018 U.S. Congressional model designed to predict the outcome of the election as if it were held today. The congressional model predicts the probability of a Republican (GOP) victory in individual House and Senate elections, as well as the number of aggregate seats expected to be won by each party (to predict partisan control of each chamber). The presidential model uses a similar framework to estimate vote shares and probabilities of victory for each major party candidate in each of the states.1 These estimates are then used to proxy electoral college predictions that determine who is elected as the next President of the United States. We provide a survey of features, feature engineering techniques, models, and ensembling techniques. We also provide some empirical results.

Keywords: elections, political science, government, machine learning


Media Summary

We start with a data set of 200+ base features spanning economic indicators, political environment measures (both national and local), candidate traits, campaign finance reports, and engineered variables designed to draw context-specific information into the model. This data set is refreshed on a rolling basis. Not every feature makes it into every model; a large number of these features are most fruitful when paired with certain other features and models. We spend a considerable amount of effort engineering features that reflect some aspect of historic election outcomes that are not quite captured in the raw data. Then we pair features with models (both manually and automatically), and build a set of base models. Occasionally, these base models have target variables that are slightly different from our final dependent variables. We then ensemble these base models together, either by taking a weighted average of the prediction of each model, or by applying a stacking classifier on top of the base model. These predictions are then blended together with probabilities derived from current polling. Using these poll-informed predictions, we run 14,000,605 simulations for the Senate/House forecasts and 140,605 simulations for the Presidential forecast to determine the range of possible outcomes.

We’re constantly iterating on our modeling workflow - trying out new features, different ensembling techniques, and configurations. It is likely that our model will exist in a slightly different form by the end of the 2020 election cycle.


1. Target Variables

Our model attempts to accurately forecast the outcome of both the US Presidential and Congressional elections in the 2020 election cycle. In both the US Senate and House of Representatives, we provide a probability of each party’s winning each particular seat. Using these seat-by-seat probabilities, we provide an overall probability of each party winning control of each chamber. We adopt a similar approach for the Presidential election. Our model estimates the likelihood of each candidate winning each state. Using these state-by-state probabilities as a starting point, we simulate possible outcomes in the Electoral College. From these simulations, we derive an overall probability of victory for each candidate.

2. Data

There are two broad classes of features in our model: raw and engineered features.2 Raw features are external data fed directly to the model, while engineered features are raw data modified in some way to be more useful to the model. Both categories span various domains, including candidate fundraising, demographic information, economic indicators, electoral history, and political environment.

Many of the features we incorporate are broadly recognized in the political science sphere and beyond. For example, it is well understood that congressional candidates of the President’s party are strongly impacted by presidential approval ratings (Edwards, 2009). Demographic variables also fall into this category. The political dynamics of a locality depend strongly on the African-American share of the total population for instance.

In addition to primary source data, we also engineer several features. For instance, in addition to routine financial data provided by the Federal Election Commission (FEC), we incorporate a formula to compare GOP and Democratic campaign finance numbers in each district/state, as well as an indicator for whether a House race has surpassed $3 million in contributions or a Senate race has surpassed $20 million. These thresholds are derived from an internal empirical analysis of what counts as an ‘expensive race.’ Races that cross this fundraising threshold are typically the most competitive, and are governed by different dynamics than less competitive races. Polling data for each race is consolidated using a weighted average that accounts for recency, sample size, and pollster quality. Other engineered features not only take into account the total money raised by each candidate, but also level off as one candidate surpasses the other in fundraising. This incorporates the well-known political science insight that additional fundraising only provides an additional benefit up to a certain point (e.g., Barutt and Schofield, 2016).
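The fundraising features described above can be sketched as follows. Only the $3 million and $20 million thresholds come from the text; the function names, the two-party share used as the comparison formula, the tanh-of-log-ratio form used for the level-off effect, and the reading of the threshold as applying to a race's combined contributions are all assumptions made for illustration.

```python
import math

# Thresholds quoted in the text for an 'expensive race' (in dollars).
EXPENSIVE_THRESHOLD = {"house": 3_000_000, "senate": 20_000_000}


def is_expensive_race(chamber, total_contributions):
    """Flag a race whose contributions exceed the chamber's threshold.

    Assumption: the threshold applies to the race's combined contributions.
    """
    return total_contributions > EXPENSIVE_THRESHOLD[chamber]


def gop_finance_share(gop_raised, dem_raised):
    """GOP share of two-party fundraising (hypothetical comparison formula)."""
    total = gop_raised + dem_raised
    return 0.5 if total == 0 else gop_raised / total


def leveled_off_advantage(gop_raised, dem_raised):
    """Fundraising advantage with diminishing returns (hypothetical form).

    The log-ratio is squashed by tanh, so the value stays in (-1, 1) and
    each additional dollar moves it less once one side is far ahead.
    """
    log_ratio = math.log((gop_raised + 1.0) / (dem_raised + 1.0))
    return math.tanh(log_ratio)
```

Under this form, moving from parity to a 2:1 advantage changes the feature far more than moving from 20:1 to 21:1, which is the qualitative behavior the text describes.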

We refresh the data set on a rolling basis to ensure that any and all changes to individual races are accounted for quickly. This includes adding any new individual race polling, changes in the national environment and special election environment variables, quarterly and 48-hour FEC reports, new economic indicators, primary election outcomes, and candidate status changes.

3. Feature Selection

Our approach strikes a balance between explanatory power and predictive power. The structure of our model is such that we can not only accurately determine the winner of a congressional race, for example, but also identify which features provided to the model are driving the projected outcome. The model's performance during the 2018 midterm elections, when we correctly predicted the outcome in 97% of House races, demonstrates the value of this approach.

To this end, our final ensemble model is composed of base models whose features are drawn from different sources. The feature sets in some models emerge from political science theory, while others are derived from a more strictly machine-learning driven approach. This approach bolsters the interpretability of our model, and reduces the risk of overfitting. With these concerns in mind, we explicitly avoid a ‘kitchen sink’ approach to feature selection: including every possible feature may provide a small gain to model accuracy, but only at the cost of model interpretability. There are three key elements of our feature selection approach:

  1. Starting with an approach informed by political science literature, we hand-pick a set of features. Academic models predicting congressional results span the latter half of the 20th century (e.g. Lewis-Beck and Rice, 1984; Stokes and Miller, 1962; Tufte, 1975) with relatively simple quantitative analyses, and are still relevant and accurate through the work of contemporary scholars (e.g. Campbell, 2010; Lewis-Beck and Tien, 2014). We chose to include many of the same variables that these researchers find most important, such as incumbency (Abramowitz, 1975; Erikson, 1971), district partisanship (Brady et al., 2000), and whether a given year is a midterm or presidential cycle (Erikson, 1988; Lewis-Beck and Tien, 2014). At the same time, we exclude minor variables that lack a strong theoretical grounding. For example, FEC reports include detailed information about offsets to campaign expenditures, and refunded individual campaign contributions. These variables—alongside many others provided by the FEC—do not provide the model with useful new information, beyond what is already captured by a campaign's overall fundraising. As a result, we exclude variables like these from the feature set. We do include a ratio of GOP and Democratic contributions to incorporate FEC data at large because, while scholars typically fail to find a general causal linkage between raising more money and winning, it does appear to be a considerably predictive variable for challenger success (Jacobson, 1978).

  2. Beginning from this feature set derived from the political science literature, we conduct feature selection by determining linear dependence of features with one another and weeding out variables that are highly correlated. We use ANOVA F-values in order to determine variance between features (Pedregosa et al., 2011).

  3. Finally, leaving behind the explanatory power afforded by political science theory, we conduct a randomized feature selection and optimize over accuracy and ROC-AUC (the area under the receiver operating characteristic curve). We use a combination of ridge and lasso regression in this stage (the elastic net approach; Zou and Hastie, 2003).
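The second step above (ranking features by ANOVA F-value and weeding out highly correlated ones) can be sketched in plain Python. The text's scikit-learn citation and the 'Select K Best' label suggest a library routine was used in practice; the version below is a standard-library stand-in, and the function names, the greedy pruning rule, and the 0.9 correlation cutoff are assumptions for illustration.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across the outcome classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = 0.0
    for g in groups.values():
        g_mean = mean(g)
        within += sum((v - g_mean) ** 2 for v in g)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_features(features, labels, k, max_corr=0.9):
    """Keep up to k features, best F-value first, skipping near-duplicates.

    features maps a feature name to its list of values (one per race);
    max_corr is a hypothetical cutoff for 'highly correlated'.
    """
    ranked = sorted(features, key=lambda name: anova_f(features[name], labels),
                    reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[other])) < max_corr
               for other in kept):
            kept.append(name)
        if len(kept) == k:
            break
    return kept
```

A feature that merely rescales one already kept is dropped even if its F-value is high, which is how this sketch keeps the selected set from being dominated by redundant variables.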

After determining the relevant feature sets, we pair them with various models and back-test them to see which feature-model pairs are ideal. Example pairings might include a Random Forest model paired with features generated via elastic net feature selection, and a logistic regression using a hand-picked feature set. There is a good deal of caution applied at this stage in order to ensure that we are not merely overfitting to historical data: a model tailored too closely to the 2016 election may not perform well in another year, for instance. In this manner, our final predictions incorporate information from the best features identified by both political science and machine learning, while mitigating the shortcomings of each approach.

4. Model Choice

Most election forecasting models use either a Bayesian or frequentist approach to predict the outcome of an election. We find that empirically, both perform quite well and have different strengths with respect to inference. Because different models complement one another in this way, our modeling process adopts an ensemble approach, incorporating different kinds of models including Bayesian logistic regressions, logistic regressions, Random Forests, XGBoost models, and Elastic Nets. Because of the quantity of congressional data available to us (435 races every two years extending back to 1992 in the House of Representatives, for example), the prior for a Bayesian regression has little influence, and the regression performs similarly to one conducted in a frequentist framework. The predictions produced by each model and its associated feature set are then averaged together into the final overall ensemble prediction. While more sophisticated ensembling algorithms based on model ‘boosting’ are well-known in the literature, our simpler approach accurately predicts the outcome in 95% of 2018 congressional elections. The individual models composing the ensemble produce comparable accuracies, between 90 and 95%.

Including a variety of models and variable subsets in our ensemble reduces error in two ways. First, ensembles have proven to be more accurate on average than their constituent models alone. Second, they are less prone to making substantial errors (i.e., if they miss, they miss by smaller margins on average); see Montgomery et al., 2012. Individual models produce good results, but give different estimates for each race. Individual models typically produce similar accuracy and F1 scores, but produce better estimates when averaged together. Our empirical results from 2018 illustrate the success of this approach.
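A minimal sketch of the averaging step, assuming each base model outputs a GOP win probability per race. The model names and all numbers are made up for illustration. The Brier score comparison at the end illustrates one well-known reason averaging helps: because squared error is convex, the Brier score of the averaged forecast can never exceed the mean Brier score of its members.

```python
def ensemble_probability(model_probs, weights=None):
    """Weighted average of base-model GOP win probabilities for one race."""
    if weights is None:
        weights = [1.0] * len(model_probs)
    return sum(w * p for w, p in zip(weights, model_probs)) / sum(weights)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0/1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Hypothetical predictions from three base models for four races.
models = {
    "logit_pol_sci": [0.90, 0.40, 0.70, 0.20],
    "random_forest": [0.80, 0.60, 0.40, 0.10],
    "xgboost": [0.95, 0.30, 0.60, 0.40],
}
outcomes = [1, 0, 1, 0]  # 1 = GOP win (made-up results)

# Average across models, race by race.
ensemble = [ensemble_probability(race) for race in zip(*models.values())]

ensemble_brier = brier_score(ensemble, outcomes)
mean_member_brier = sum(brier_score(p, outcomes) for p in models.values()) / len(models)
```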

In the House model, we combine two separate ensemble models—one based on candidate party affiliation, and the other based on incumbency—and then add recent polling information. In the Senate, a single party-oriented ensemble model is sufficient to produce accurate results, and is later combined with polls to make a final prediction.

In the Presidential model, we adopted a different approach from the House and Senate. This is a result of data availability: usable historical data for the presidency extends back only to 1992. This time window encompasses only seven Presidential elections on which to train a model. This makes Presidential models particularly prone to overfitting. Combine this with the fact that the national environment is extraordinarily volatile, and one has a recipe for uncertainty. We overcame this problem by implementing a stacking ensemble that incorporates a collection of different submodels. Because these constituent models differ in both feature set and model-type (logit, SVM, Random Forest, and XGBoost), we were able to avoid severe overfitting, given the limited amount of training data available. This approach back-tested better than any of the alternatives, especially with regards to model calibration.

5. Polls

Poll results are a key ingredient in our model. Each individual poll in a race is converted to a probability representing the likelihood of a GOP win. This probability is generated by sampling from a posterior normal distribution centered on GOP, the share of the vote received by the Republican candidate in a particular poll. The variance V of the normal distribution is determined predominantly by the sample size of the poll and the typical methodology of the pollster. We simulate election outcomes from each poll by drawing a GOP vote share R from the resulting normal distribution:

R ~ N(GOP, V)

From this simulated Republican vote share R, we simulate a Democratic vote share D as:

D = GOP + DEM - R

where DEM is the Democratic vote share reported by the poll. By comparing each set of simulated vote shares, we determine a probability of Republican victory. If the GOP candidate's vote share is greater than that of their Democratic opponent, a GOP win is recorded. The number of GOP wins divided by the total number of draws represents a simulated probability of a GOP win, given the poll's margin.
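The poll-to-probability conversion described above can be sketched directly from the two formulas. The function names, the number of draws, and the choice to express vote shares in percentage points are assumptions. Because D = GOP + DEM - R, the condition R > D is equivalent to R > (GOP + DEM) / 2, so the simulated probability can be checked against a closed form from the normal distribution.

```python
import math
import random


def poll_win_probability(gop, dem, variance, draws=100_000, seed=0):
    """Simulated probability of a GOP win implied by a single poll."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    wins = 0
    for _ in range(draws):
        r = rng.gauss(gop, sd)   # R ~ N(GOP, V)
        d = gop + dem - r        # D = GOP + DEM - R
        if r > d:
            wins += 1
    return wins / draws


def poll_win_probability_exact(gop, dem, variance):
    """Closed form of the same quantity: P(R > (GOP + DEM) / 2)."""
    z = (gop - dem) / (2.0 * math.sqrt(variance))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

For a hypothetical poll with the Republican at 48, the Democrat at 46, and V = 9 (a standard deviation of 3 points), both functions give a GOP win probability of roughly 0.63.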

Public polls make up the bulk of polling in our model. For clients, we commission private polling with turnout modeling and consistent data collection methodology. Because private polls typically have larger sample sizes than public polls - and are typically concentrated in key battleground states - they play a significant role in improving our model performance. Private polls also frequently sample individuals using a registration-based (RBS) methodological approach, in contrast to the random digit dialing (RDD) often used in public polling. Existing literature has found polls based on RBS to often provide more accurate results in congressional races (Green and Gerber, 2006).

The spread of the sampling distribution is based on the estimated total survey error of the poll. Since a poll's reported margin of error often does not adequately capture its uncertainty (Shirani-Mehr et al., 2018), we perform an adjustment to better reflect the true uncertainty of a GOP win. Using an empirical distribution of polling errors gathered from House and Senate races dating back to 2006 as a baseline, we adjust the margin of error associated with each poll. These adjustments vary by poll, and depend on both the methodology of a poll and its proximity to the election. The margin of error on higher-quality polls, and on polls conducted closer to the election, is adjusted downward. The typical pollster-based margin of error adjustment is approximately 20-30%. The individual probabilities are then ensembled.

Weights are based on a poll’s proximity to the election, as well as on the pollster's FiveThirtyEight rating. A linear decay function is applied to the poll's date as well as to the poll's rating. Polls with higher pollster ratings that are closer to the election are weighted more heavily. The final probability we project for a given race is a weighted average of the poll and non-poll probabilities, with the weight of the poll probability increasing as the election draws closer. Polling weights were developed by Øptimus during the 2018 election cycle, and were successful in back-testing on previous election cycles. The approach incorporates well-known insight from political science regarding the reliability of polls at different points in the election cycle. Our method also allows the model to separate lower-quality polls from those likely to be higher in quality.
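The weighting and blending logic can be sketched as below. The text specifies only that the decay is linear in both date and rating and that the poll weight grows as the election approaches; the 60-day and 180-day horizons, the numeric `rating_rank` stand-in for a pollster grade, the 0.8 cap on the poll weight, and all function names are hypothetical.

```python
def linear_decay(value, horizon):
    """Return 1.0 at value 0, falling linearly to 0.0 at value >= horizon."""
    return max(0.0, 1.0 - value / horizon)


def poll_weight(days_before_election, rating_rank, max_days=60, worst_rank=10):
    """Weight of one poll: recent polls from well-rated pollsters count most.

    rating_rank is a hypothetical numeric stand-in for a pollster grade,
    with 0 the best rating and worst_rank the cutoff for zero weight.
    """
    return (linear_decay(days_before_election, max_days)
            * linear_decay(rating_rank, worst_rank))


def combined_poll_probability(polls):
    """polls: iterable of (gop_win_probability, days_before_election, rating_rank)."""
    polls = list(polls)
    weights = [poll_weight(days, rank) for _, days, rank in polls]
    total = sum(weights)
    if total == 0:
        return None  # no usable polls for this race
    return sum(w * p for w, (p, _, _) in zip(weights, polls)) / total


def final_probability(model_prob, poll_prob, days_to_election,
                      ramp_days=180, max_poll_weight=0.8):
    """Blend poll and non-poll probabilities; polls gain weight near election day."""
    if poll_prob is None:
        return model_prob
    w = max_poll_weight * linear_decay(days_to_election, ramp_days)
    return w * poll_prob + (1.0 - w) * model_prob
```

With these illustrative settings, a race with no polls falls back entirely on the non-poll ensemble, while a well-polled race on election day draws 80% of its probability from polling.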

6. Election Simulations

Figure 1. A simulated distribution of electoral college outcomes using the modeling approach we describe. As of this writing (Summer 2020), we project Donald Trump to receive an average of 235 electoral votes to 303 for Joe Biden. Joe Biden defeats Donald Trump 84% of the time.

Using the computed probabilities for each House, Senate, and Presidential race, we predict the aggregate number of seats we expect the GOP to win and the probability of maintaining control of the House and Senate. We use each seat’s predicted probabilities to run simulations of the 2020 Congressional elections.

The final outcomes in different races are strongly correlated with one another. In 2016 for example, we saw this occur in the upper midwest: Trump not only outperformed in Wisconsin, but also in states like Michigan, Iowa, and Pennsylvania that share similar demographic profiles. Polling errors are mildly correlated across races within an election cycle due to various sources of error, which can result in systematic bias. However, across election cycles going back several decades, the mean partisan bias computed over all polls is approximately zero (Shirani-Mehr et al., 2018). Because the overall partisan bias of polling in a given year is not a priori known, this is not explicitly corrected for within our model.

The mechanism of a wave election—an election in which one party performs overwhelmingly better than the other—is simulated by treating our predicted probabilities as beta random variables. Each race is assigned a beta distribution centered on the predicted probability, with shape parameters chosen to reflect the volatility of toss-up races in wave elections and conversely, the relative resilience of non-competitive races. Within a given simulation, as the outcome in each state is sequentially determined, the probability of victory for each party in each remaining state is modified in reaction. Thus—as a candidate rises or falls in a particular simulation—their fortunes elsewhere rise or fall. In this manner, state-to-state correlations are explicitly incorporated into our simulation framework.
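One way such a simulation could be structured is sketched below. The text does not give the shape parameters or the rule by which remaining races react to earlier results, so the fixed `concentration`, the log-odds `coupling` update, and the random ordering of races are all assumptions chosen only to illustrate the idea of beta-distributed probabilities with sequential feedback.

```python
import math
import random
from statistics import mean, pvariance


def logit(p):
    return math.log(p / (1.0 - p))


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_gop_seats(probs, n_sims=10_000, concentration=20.0,
                       coupling=0.1, seed=0):
    """Return one simulated GOP seat total per simulation."""
    rng = random.Random(seed)
    # Clamp so logits and beta shape parameters stay finite and positive.
    base_logits = [logit(min(max(p, 1e-6), 1.0 - 1e-6)) for p in probs]
    totals = []
    for _ in range(n_sims):
        order = list(range(len(probs)))
        rng.shuffle(order)      # races are decided in a random order
        shift = 0.0             # running log-odds swing toward the GOP
        seats = 0
        for i in order:
            center = logistic(base_logits[i] + shift)
            center = min(max(center, 1e-6), 1.0 - 1e-6)
            # Beta distribution whose mean is the current win probability.
            p = rng.betavariate(center * concentration,
                                (1.0 - center) * concentration)
            gop_win = int(rng.random() < p)
            seats += gop_win
            # A surprise in this race nudges every race still to be decided.
            shift += coupling * (gop_win - center)
        totals.append(seats)
    return totals


def control_probability(totals, seats_needed):
    """Share of simulations in which the GOP reaches seats_needed."""
    return sum(t >= seats_needed for t in totals) / len(totals)


def seat_summary(totals):
    """Mean and variance of the simulated seat totals."""
    return mean(totals), pvariance(totals)
```

Setting `coupling=0.0` recovers independent races; raising it fattens the tails of the seat distribution, which is the qualitative signature of a wave election.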

We perform over 10 million simulations to create a distribution of potential outcomes. This approach allows us to qualitatively analyze individual ‘scenarios’ for a more narrative-backed description of how the election will turn out. For example, we can find the most likely path to victory for a candidate, contingent on them winning or losing in a specific set of states.

For the Presidential race, we draw from a Binomial distribution for each state and then calculate electoral college totals in order to determine the overall distribution of electoral votes. Attempts to force certain correlations between states did not produce significantly different simulation results.
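The presidential simulation can be sketched as one Bernoulli draw (a Binomial with a single trial) per state, followed by a sum of electoral votes. The function names and the three-state toy map below are hypothetical and are not real state probabilities.

```python
import random


def simulate_electoral_college(states, n_sims=50_000, seed=0):
    """states maps a name to (electoral_votes, p_gop).

    Returns one simulated GOP electoral vote total per simulation.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        # One independent Bernoulli draw per state.
        ev = sum(votes for votes, p_gop in states.values()
                 if rng.random() < p_gop)
        totals.append(ev)
    return totals


def win_probability(totals, votes_needed):
    """Share of simulations in which the candidate reaches votes_needed."""
    return sum(t >= votes_needed for t in totals) / len(totals)


# Hypothetical toy map: 21 electoral votes in total, 11 needed to win.
toy_states = {"A": (10, 0.9), "B": (6, 0.5), "C": (5, 0.2)}
toy_totals = simulate_electoral_college(toy_states)
toy_gop_win = win_probability(toy_totals, votes_needed=11)
```

On the real map the same functions would be called with every state's electoral votes and a threshold of 270; the toy map just keeps the arithmetic easy to verify by hand.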

7. Empirical Results

This cycle’s model3 is an iteration of a model we released in 2018. In 2018, we publicly released our House and Senate predictions beginning in June and updating regularly until Election Day. Our final House prediction had Democrats at a 95.9% chance of taking control of the chamber. The mean prediction was 233 Democratic seats to 202 GOP seats, and the 90% confidence interval spanned from 218 to 248 Democratic seats. Control of the House was called early in the night by most outlets. Ultimately, Democrats won 235 seats, and Republicans won 200. This outcome produced an overall accuracy of 97% for our model, with predictions within the 90% confidence interval of the model typically producing an accuracy between 93% and 100%.

Because congressional incumbents are overwhelmingly reelected, a good baseline for comparison is provided by simply assuming that incumbents are all reelected, and retirements result in no change of partisan control. This simplistic model would incorrectly predict the outcome in 45 US House races held during the 2018 election, producing an overall accuracy of 90%, 7 percentage points worse than our model. The difference is even more stark when we only examine the 31 House seats we identified as toss-ups. Among this subset, our model achieves a 67% accuracy, while simply assuming incumbent victory would result in only a 32% accuracy. Because House control is determined largely by the outcomes in these kinds of competitive races, this may be a better baseline for comparison.

On the Senate side, our final prediction gave Republicans a 91.9% chance of keeping control of the chamber. The mean seat prediction was 52 GOP seats to 48 Democratic seats, with a 90% confidence interval spanning from 49 GOP seats to 55 GOP seats. Our GOP chance of keeping the majority peaked above 89% at three different points: in mid-August, in mid-October, and right before the election. As with the House race, chamber control was decided early. The final outcome in the Senate was 53 GOP seats to 47 Democratic seats. Among all 35 Senate seats, our model correctly predicted 33, for an overall accuracy of 94%. In contrast, a baseline model assuming incumbent-party victory would have incorrectly forecast 6 Senate races that changed partisan control, for an accuracy of only 83%.

Table 1 contains individual race performance metrics for the Øptimus House 2018 model. For the 434 races called4, the Øptimus House model called 421/434 races correctly, an accuracy of 97.0%. Among the 31 toss-ups, the model predicted 21/31 races correctly, a 67.7% accuracy. Excluding the toss-ups, the House model predicted 400 out of 403 races correctly, or 99.26% accuracy. The orientation of the metrics is based on the Republican win percentage. A true positive is a correctly predicted Republican victory, while a false positive is a predicted Republican victory that was actually a Democratic win.

Table 1. 2018 House model performance.

| | All Seats | Excluding Toss-Ups | Toss-Ups Only |
| --- | --- | --- | --- |
| Number of Seats | 434 | 403 | 31 |
| Accuracy | 97.00% | 99.26% | 67.74% |
| Total Misses | 13 | 3 | 10 |
| False Negatives | 3 | 0 | 3 |
| False Positives | 10 | 3 | 7 |
| True Negatives | 225 | 210 | 15 |
| True Positives | 196 | 190 | 6 |
| Brier Score | 0.034 | 0.019 | 0.235 |
| Matthew's Correlation | 0.94 | 0.985 | 0.321 |
| AUC | 0.996 | 0.998 | 0.763 |
| F1 | 0.968 | 0.992 | 0.546 |
| F2 | 0.978 | 0.997 | 0.612 |
| Precision | 0.952 | 0.985 | 0.462 |
| Recall | 0.985 | 1 | 0.667 |
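Most of the summary metrics in Table 1 follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions of precision, recall, F-beta, and the Matthews correlation coefficient; the function name is an assumption. Fed the 'All Seats' counts (196 true positives, 225 true negatives, 10 false positives, 3 false negatives), it agrees with the reported values to within about 0.001. The Brier score and AUC need the underlying probabilities and are not computed here.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Positive class = Republican win, matching the table's orientation.
    Assumes every denominator below is non-zero.
    """
    total = tp + tn + fp + fn

    def f_beta(beta):
        b2 = beta * beta
        return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": f_beta(1.0),
        "f2": f_beta(2.0),   # weights recall more heavily than precision
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


house_all_seats = classification_metrics(tp=196, tn=225, fp=10, fn=3)
```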

Table 2 contains the performance scores for the Øptimus Senate model. The Øptimus Senate model predicted 33 out of 35 races correctly, an accuracy measure of 94.29%. Among the 4 toss-ups called, the model correctly predicted 3 out of 4 races. Out of the 31 non-toss-ups, the only race missed by the Senate model is the Florida Senate seat. As with Table 1, the orientation of the metrics is based on the Republican win percentage.

Table 2. 2018 Senate model performance.

| | All Seats | Excluding Toss-Ups | Toss-Ups Only |
| --- | --- | --- | --- |
| Number of Seats | 35 | 31 | 4 |
| Accuracy | 94.29% | 96.77% | 75.00% |
| Total Misses | 2 | 1 | 1 |
| False Negatives | 2 | 1 | 1 |
| False Positives | 0 | 0 | 0 |
| True Negatives | 24 | 22 | 2 |
| True Positives | 9 | 8 | 1 |
| Brier Score | 0.056 | 0.031 | 0.246 |
| Matthew's Correlation | 0.869 | 0.922 | 0.577 |
| AUC | 0.985 | 1.000 | 0.500 |
| F1 | 0.9 | 0.941 | 0.667 |
| F2 | 0.849 | 0.909 | 0.556 |
| Precision | 1.000 | 1.000 | 1.000 |
| Recall | 0.818 | 0.889 | 0.500 |

Tables 3 (House) and 4 (Senate) display the accuracy and total number of incorrect predictions made by every individual model included in our ensemble in 2018, as well as by the final ensemble of individual models and polls. In these tables, we note the performance of each individual model included in the ensemble (Random Forests, logits, XGBoost, and Bayesian MCMC) and the corresponding ensemble performance. The Bayesian MCMC calculations were performed using the JAGS and PyMC3 computational packages.

Models using the ‘Pol Sci’ feature set rely upon a set of features widely regarded as crucial variables in political science literature. The ‘Select K Best’ feature sets are determined algorithmically, using ANOVA F-values to minimize collinearity between all features included in the set. The ‘Elastic Net’ feature sets are similarly produced, using the elastic net approach to algorithmically identify the best features to include.

As expected, the ensemble performs better on average than the individual models that compose the ensemble. For example, in Table 3 the main ensemble model incorrectly predicted only 21 of the 434 called House races in 2018, in contrast to 22-41 misses each for the individual models composing the ensemble. While some individual models outperform the overall ensemble with respect to accuracy, most do not. Because there is no way to determine a priori which constituent models will outperform, the ensemble remains the best overall choice. Creating an ensemble of individual models helps to minimize the systematic bias in each of the models. In this way, weaknesses of individual models can be compensated for by combining them.

The final rows of both Tables 3 and 4 indicate the performance of the ensemble when combined with polling data. A linear combination of the chances of GOP victory based on the ensemble of individual models and on polls gives the best-performing model. We have observed this in our out-of-sample back-tests (2016, 2014, 2010, and 2006) as well. For example, the inclusion of polls in the 2018 House ensemble boosts accuracy by around 2 percentage points, while polling data boosts the accuracy of the 2018 Senate ensemble by more than 8 percentage points. Because polling is typically more prevalent in the Senate than in the House, it is unsurprising that polling does more to improve the Senate model.

Table 3. 2018 House performance by model.

| Feature Selection | Model | Number of Variables | All Seats (434): Accuracy | All Seats (434): Total Misses | Non Toss Ups (403): Accuracy | Non Toss Ups (403): Total Misses | Toss Ups (31): Accuracy | Toss Ups (31): Total Misses |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Select K Best | Random Forest | | 92.40% | 33 | 96.77% | 13 | 35.48% | 20 |
| Pol Sci | Logistic Regression | 26 | 94.01% | 26 | 97.52% | 10 | 48.39% | 16 |
| Pol Sci | Random Forest | 26 | 92.63% | 32 | 96.53% | 14 | 41.94% | 18 |
| Select K Best | Bayesian Modeling - MCMC (JAGS) | 30 | 94.93% | 22 | 97.27% | 11 | 64.52% | 11 |
| Select K Best | Logistic Regression | 31 | 92.63% | 32 | 95.78% | 17 | 51.61% | 15 |
| Select K Best | Bayesian Modeling - MCMC (PyMC3) | 31 | 94.47% | 24 | 97.02% | 12 | 61.29% | 12 |
| Select K Best | Random Forest | 31 | 91.94% | 35 | 96.03% | 16 | 38.71% | 19 |
| Select K Best | XGBoost | 31 | 94.70% | 23 | 97.52% | 10 | 58.06% | 13 |
| Elastic Net | Random Forest | 33 | 90.55% | 41 | 94.29% | 23 | 41.94% | 18 |
| Select K Best | Logistic Regression | 61 | 91.71% | 36 | 93.55% | 26 | 67.74% | 10 |
| Select K Best | Random Forest | 61 | 91.71% | 36 | 96.03% | 16 | 35.48% | 20 |
| Select K Best | XGBoost | 61 | 94.24% | 25 | 97.27% | 11 | 54.84% | 14 |
| Main Ensemble Only | | | 95.16% | 21 | 98.01% | 8 | 58.06% | 13 |
| Main + Incumbency Ensemble | | | 95.16% | 21 | 98.26% | 7 | 54.84% | 14 |
| Ensembles + Polls | | | 97.00% | 13 | 99.26% | 3 | 67.74% | 10 |

Table 4. 2018 Senate performance by model

| Feature Selection | Model | Number of Variables | All Seats: Accuracy | All Seats: Total Misses | Non Toss Ups: Accuracy | Non Toss Ups: Total Misses | Toss Ups: Accuracy | Toss Ups: Total Misses |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elastic Net | Bayesian Modeling - PyMC3 | 10 | 82.86% | 6 | 90.32% | 3 | 25.00% | 3 |
| Pol Sci | Logistic Regression | 20 | 80.00% | 7 | 83.87% | 5 | 50.00% | 2 |
| Pol Sci | Random Forest | 20 | 80.00% | 7 | 83.87% | 5 | 50.00% | 2 |
| Select K Best | Logistic Regression | 26 | 80.00% | 7 | 90.32% | 3 | 0.00% | 4 |
| Elastic Net | Bayesian Modeling - JAGS | 27 | 80.00% | 7 | 83.87% | 5 | 50.00% | 2 |
| Elastic Net | Elastic Net | | 85.71% | 5 | 90.32% | 3 | 50.00% | 2 |
| Ensemble | | | 85.71% | 5 | 90.32% | 3 | 50.00% | 2 |
| Ensembles + Polls | | | 94.29% | 2 | 96.77% | 1 | 75.00% | 1 |


Acknowledgments

The authors would like to thank Don Green for his feedback to the forecasting model during its development. Additionally, we are grateful for the essential contributions of Neha Bora and Jakob Grimmius, former Øptimus modeling team members and pioneers of the 2018 forecast. Finally, we would also like to thank Olivia Blute, Austin Kim, and Alexander Podkul for supporting the modeling team in years past and present.

Disclosure Statement

Every author of this article either is employed, or has recently been employed, by Øptimus Analytics, a data science firm specializing in predictive modeling across the public and private sector.


Appendix

Table A1. List of variables used in our models.

Name

House

Senate

President

Description

Source

3 Month Net Change in Weekly Wage

T

T

F

Net change in weekly wage over previous 3 months

Federal Reserve Economic Data

3 Month Percent Change in Weekly Wage

T

T

F

Percent change in weekly wage over previous 3 months

Federal Reserve Economic Data

Adjusted PVI

T

T

F

PVI+national environment

Calculated in-house

Asian Pct

F

F

T

Asian population percent

US Census Bureau

Average Weekly Wage

T

T

F

Average weekly wage in the previous quarter

Federal Reserve Economic Data

Bachelor’s Degree Pct

T

T

T

Bachelor’s degree percent

US Census Bureau

Black Pct

F

F

T

Black population percent

US Census Bureau

CFG Involvement

T

T

T

CGF spent money T/F

Federal Election Commission

CFG Percent

T

T

T

Percent of spending from CFG

Federal Election Commission

CLF Involvement

T

F

T

CLF spent money T/F

Federal Election Commission

CLF Percent

T

F

T

Percent of spending from CLF

Federal Election Commission

Congressional District or Senate Class

T

T

F

Congressional District or Senate Class

Historical election results

CPI

F

F

T

Consumer price index

Federal Reserve Economic Data

D 2 Party Pct

F

F

T

Democratic percentage of two-party vote

Historical election results

D Candidate Ideology

F

F

T

Democratic candidate ideology

Database on Ideology, Money in Politics, and Elections

D Consecutive Terms

F

F

T

Number of consecutive Democratic terms

Historical election results

D Home State

F

F

T

Home state of Democratic presidential candidate

Historical election results

D IEM Price

F

F

T

Closing price for Democratic candidate in winner-take-all market on day before election

Iowa Electronic Markets

D Incumbent Candidate

F

F

T

Democratic incumbent running

Historical election results

D Incumbent Party

F

F

T

Democratic incumbent running

Historical election results

D Overall Pct

F

F

T

Democratic percentage of overall vote

Historical election results

D President Net Approval

F

F

T

Net approval rating for Democratic president

The American Presidency Project

D Primary Margin

F

F

T

Difference in overall primary popular vote percentage between Democratic nominee and closest primary challenger

Historical election results

D VP Home State

F

F

T

Home state of Democratic vice presidential candidate

Historical election results

D Win

F

F

T

Democratic win

Historical election results

DCCC Involvement

T

F

F

DCCC spent money T/F

Federal Election Commission

DCCC Percent

T

F

F

Percent of spending from DCCC

Federal Election Commission

Decade

T

F

F

>2010 or <2010

Historical election results

Dem CFG Oppose

T

T

T

Amount spent by CFG opposing Democrat

Federal Election Commission

Dem CFG Support

T

T

T

Amount spent by CFG supporting Democrat

Federal Election Commission

Dem CLF Oppose

T

F

T

Amount spent by CLF opposing Democrat

Federal Election Commission

Dem CLF Support

T

F

T

Amount spent by CLF supporting Democrat

Federal Election Commission

Dem DCCC Oppose

T

F

F

Amount spent by DCCC opposing Democrat

Federal Election Commission

Dem DCCC Support

T

F

F

Amount spent by DCCC supporting Democrat

Federal Election Commission

Dem Debts Or Loans Owed By

T

F

T

Democratic debts or loans owed by committee

Federal Election Commission

Dem Debts Or Loans Owed To

T

F

T

Democratic debts or loans owed to committee

Federal Election Commission

Dem Ending Cash On Hand

T

F

T

Democratic ending cash on hand

Federal Election Commission

Dem HMP Oppose

T

F

T

Amount spent by HMP opposing Democrat

Federal Election Commission

Dem HMP Support

T

F

T

Amount spent by HMP supporting Democrat

Federal Election Commission

Dem Ind Expenditure Oppose

T

T

T

Independent expenditures opposing Democratic candidate

Federal Election Commission

Dem Ind Expenditure Percent

T

T

T

Democratic percent of independent expenditures

Federal Election Commission

Dem Ind Expenditure Support

T

T

T

Independent expenditures supporting Democratic candidate

Federal Election Commission

Dem Individual Refunds

T

F

T

Democratic individual refunds

Federal Election Commission

Dem Itemized Individual Contributions

T

F

T

Democratic itemized individual contributions

Federal Election Commission

Dem Last Vote Count

T

F

F

Democratic vote count from previous cycle (same cd)

Historical election results

Dem Last Vote Percent

T

F

F

Democratic vote percent from previous cycle (same cd)

Historical election results

Dem Loans Made By Candidate

T

F

T

Democratic loans made by candidate

Federal Election Commission

Dem NAOR Oppose

T

T

T

Amount spent by NAOR opposing Democrat

Federal Election Commission

Dem NAOR Support

T

T

T

Amount spent by NAOR supporting Democrat

Federal Election Commission

Dem NRCC Oppose

T

F

F

Amount spent by NRCC opposing Democrat

Federal Election Commission

Dem NRCC Support

T

F

F

Amount spent by NRCC supporting Democrat

Federal Election Commission

Dem Num Opponents

T

T

F

Number of opponents in Democratic primary

Historical election results

Dem Offsets To Operating Expenditures

T

F

T

Democratic offsets to operating expenditures

Federal Election Commission

Dem Operating Expenditures

T

F

T

Democratic operating expenditures

Federal Election Commission

Dem Other Committee Contributions

T

F

T

Democratic other committee contributions

Federal Election Commission

Dem Other Committee Refunds

T

F

T

Democratic other committee refunds

Federal Election Commission

Dem Other Disbursements

T

F

T

Democratic other disbursements

Federal Election Commission

Dem Other Loan Repayments

T

F

T

Democratic other loan repayments

Federal Election Commission

Dem Other Loans

T

F

T

Democratic other loans

Federal Election Commission

Dem Other Receipts

T

F

T

Democratic other receipts

Federal Election Commission

Dem Outspend

T

F

T

Whether the Democratic candidate outspent the Republican

Federal Election Commission

Dem Party Committee Contributions

T

F

T

Democratic party committee contributions

Federal Election Commission

Dem Political Party Refunds

T

F

T

Democratic Party refunds

Federal Election Commission

Dem Pres Net Approve

T

T

F

Net presidential approval (approval rating minus disapproval rating) for Democratic presidents, coded 0 if the opposite party controls the presidency

Gallup

Dem Primary HHI

T

T

F

Herfindahl-Hirschman index (HHI) using vote share distribution in Democratic primary

Calculated in-house

Dem Quarterly Itemized

T

F

T

Democratic quarterly itemized contributions

Federal Election Commission

Dem Quarterly Unitemized

T

F

T

Democratic quarterly unitemized contributions

Federal Election Commission

Dem Raised

T

F

T

Democratic total raised

Federal Election Commission

Dem Spent

T

F

T

Democratic total spent

Federal Election Commission

Dem Spent Ind Support Oppose

T

T

T

Democratic total spent + Democratic independent expenditures supporting + Republican independent expenditures opposing

Federal Election Commission

Dem Total Contribution Refunds

T

F

T

Democratic total contributions and refunds

Federal Election Commission

Dem Total Contributions

T

F

T

Democratic total contributions

Federal Election Commission

Dem Total Individual Contributions

T

F

T

Democratic total individual contributions

Federal Election Commission

Dem Total Loan Repayments

T

F

T

Democratic total loan repayments

Federal Election Commission

Dem Total Loans Received

T

F

T

Democratic total loans received

Federal Election Commission

Dem Transfers From Other Authorized Committees

T

F

T

Democratic transfers from other authorized committees

Federal Election Commission

Dem Transfers To Other Authorized Committees

T

F

T

Democratic transfers to other authorized committees

Federal Election Commission

Dem Unitemized Individual Contributions

T

F

T

Democratic unitemized individual contributions

Federal Election Commission

Dem Vote Count Last3

T

T

F

Democratic vote count from previous 3 cycles

Historical election results

Dem Vote Percent Last3

T

T

F

Democratic vote percent from previous 3 cycles

Historical election results

Democrat Gender

T

T

F

Gender of Democratic candidate

Database on Ideology, Money in Politics, and Elections

Democrat Ideology Score

T

T

F

Ideal point estimate of Democratic candidate ideology based on campaign finance records (positive values are more conservative, negative values are more liberal; the further away from 0 a value is, the more extreme their ideology)

Database on Ideology, Money in Politics, and Elections

Democratic candidate contributions

T

F

T

Democratic candidate contributions

Federal Election Commission

Democratic loan repayments

T

F

T

Democratic loan repayments

Federal Election Commission

Effective Federal Funds Rate

F

F

T

Effective Federal Funds Rate

Federal Reserve Economic Data

EV

F

F

T

Number of Electoral Votes available

n/a

Freshman Incumbent

T

F

F

0 = not freshman, 1 = freshman elected in previous general election, 2 = freshman elected in a special election more than 1 year earlier, 3 = freshman elected in a special election during election year, 9 = seat not defended by major party incumbent

Historical election results

GDP

F

F

T

Gross Domestic Product (GDP)

Federal Reserve Economic Data

Generic Ballot National Environment

T

T

F

15-day average of the generic congressional ballot (D vs. R); positive values favor R, negative values favor D

RealClearPolitics

Geo Class

T

T

F

Description of how rural/urban the district is (e.g. "quite_rural", "extremely_urban", "semi_urban_rural")

US Census Bureau

GNP

F

F

T

Gross National Product (GNP)

Federal Reserve Economic Data

GOP Candidate Contributions

T

F

T

Republican candidate contributions

Federal Election Commission

GOP Candidate Contributions Score

F

T

T

Difference between Republican and Democratic candidate contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Candidate Loan Repayments

T

F

T

Republican loan repayments

Federal Election Commission

GOP Candidate Loan Repayments Score

F

T

T

Difference between Republican and Democratic candidate loan repayments (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP CFG Oppose

T

T

T

Amount spent by CFG opposing Republican

Federal Election Commission

GOP CFG Support

T

T

T

Amount spent by CFG supporting Republican

Federal Election Commission

GOP CLF Oppose

T

F

T

Amount spent by CLF opposing Republican

Federal Election Commission

GOP CLF Support

T

F

T

Amount spent by CLF supporting Republican

Federal Election Commission

GOP DCCC Oppose

T

F

F

Amount spent by DCCC opposing Republican

Federal Election Commission

GOP DCCC Support

T

F

F

Amount spent by DCCC supporting Republican

Federal Election Commission

GOP Debts Or Loans Owed By

T

T

T

Republican debts or loans owed by committee

Federal Election Commission

GOP Debts Or Loans Owed By Score

T

T

T

Difference between Republican and Democratic debts or loans owed by committee (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Debts Or Loans Owed To

T

F

T

Republican debts or loans owed to committee

Federal Election Commission

GOP Debts Or Loans Owed To Score

F

T

T

Difference between Republican and Democratic debts or loans owed to committee (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Ending Cash On Hand

T

F

T

Republican ending cash on hand

Federal Election Commission

GOP Ending Cash On Hand Score

F

T

T

Difference between Republican and Democratic ending cash on hand (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP HMP Oppose

T

F

T

Amount spent by HMP opposing Republican

Federal Election Commission

GOP HMP Support

T

F

T

Amount spent by HMP supporting Republican

Federal Election Commission

GOP Ind Expenditure Oppose

T

T

T

Independent expenditures opposing Republican candidate

Federal Election Commission

GOP Ind Expenditure Percent

T

T

T

Republican percent of independent expenditures

Federal Election Commission

GOP Ind Expenditure Support

T

T

T

Independent expenditures supporting Republican candidate

Federal Election Commission

GOP Individual Refunds

T

F

T

Republican individual refunds

Federal Election Commission

GOP Individual Refunds Score

F

T

T

Difference between Republican and Democratic individual refunds (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Itemized Individual Contributions

T

F

T

Republican itemized individual contributions

Federal Election Commission

GOP Itemized Individual Contributions Score

F

T

T

Difference between Republican and Democratic itemized individual contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Last Vote Count

T

F

F

GOP vote count from previous cycle (same cd)

Historical election results

GOP Last Vote Percent

T

F

F

GOP vote percent from previous cycle (same cd)

Historical election results

GOP Loans Made By Candidate

T

F

T

Republican loans made by candidate

Federal Election Commission

GOP Loans Made By Candidate Score

F

T

T

Difference between Republican and Democratic loans made by candidate (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP NAOR Oppose

T

T

T

Amount spent by NAOR opposing Republican

Federal Election Commission

GOP NAOR Support

T

T

T

Amount spent by NAOR supporting Republican

Federal Election Commission

GOP NRCC Oppose

T

F

F

Amount spent by NRCC opposing Republican

Federal Election Commission

GOP NRCC Support

T

F

F

Amount spent by NRCC supporting Republican

Federal Election Commission

GOP Num Opponents

T

T

F

Number of opponents in GOP primary

Historical election results

GOP Offsets To Operating Expenditures

T

F

T

Republican offsets to operating expenditures

Federal Election Commission

GOP Offsets To Operating Expenditures Score

F

T

T

Difference between Republican and Democratic offsets to operating expenditures (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Operating Expenditures

T

F

T

Republican operating expenditures

Federal Election Commission

GOP Operating Expenditures Score

F

T

T

Difference between Republican and Democratic operating expenditures (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Committee Contributions

T

F

T

Republican other committee contributions

Federal Election Commission

GOP Other Committee Contributions Score

F

T

T

Difference between Republican and Democratic other committee contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Committee Refunds

T

F

T

Republican other committee refunds

Federal Election Commission

GOP Other Committee Refunds Score

F

T

T

Difference between Republican and Democratic other committee refunds (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Disbursements

T

F

T

Republican other disbursements

Federal Election Commission

GOP Other Disbursements Score

F

T

T

Difference between Republican and Democratic other disbursements (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Loan Repayments

T

F

T

Republican other loan repayments

Federal Election Commission

GOP Other Loan Repayments Score

F

T

T

Difference between Republican and Democratic other loan repayments (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Loans

T

F

T

Republican other loans

Federal Election Commission

GOP Other Loans Score

F

T

T

Difference between Republican and Democratic other loans (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Other Receipts

T

F

T

Republican other receipts

Federal Election Commission

GOP Other Receipts Score

F

T

T

Difference between Republican and Democratic other receipts (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Party Committee Contributions

T

F

T

Republican party committee contributions

Federal Election Commission

GOP Party Committee Contributions Score

F

T

T

Difference between Republican and Democratic party committee contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Political Party Refunds

T

F

T

Republican Party refunds

Federal Election Commission

GOP Political Party Refunds Score

F

T

T

Difference between Republican and Democratic party refunds (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Pres Net Approve

T

T

F

Net presidential approval (approval rating minus disapproval rating) for Republican presidents, coded 0 if the opposite party controls the presidency

Gallup

GOP Primary HHI

T

T

F

HHI using vote share distribution in GOP primary

Calculated in-house

GOP Quarterly Itemized

T

F

T

Republican quarterly itemized contributions

Federal Election Commission

GOP Quarterly Itemized Score

F

T

T

Difference between Republican and Democratic quarterly itemized contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Quarterly Unitemized

T

F

T

Republican quarterly unitemized contributions

Federal Election Commission

GOP Quarterly Unitemized Score

F

T

T

Difference between Republican and Democratic quarterly unitemized contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Raised

T

F

T

Republican total raised

Federal Election Commission

GOP Raised Score

T

T

T

Difference between Republican and Democratic total raised (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Spent

T

F

T

Republican total spent

Federal Election Commission

GOP Spent Ind Support Oppose

T

T

T

Republican total spent + Republican independent expenditures supporting + Democratic independent expenditures opposing

Federal Election Commission

GOP Spent Score

T

T

T

Difference between Republican and Democratic total spent (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Total Contribution Refunds

T

F

T

Republican total contributions and refunds

Federal Election Commission

GOP Total Contribution Refunds Score

F

T

T

Difference between Republican and Democratic total contributions and refunds (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Total Contributions

T

F

T

Republican total contributions

Federal Election Commission

GOP Total Contributions Score

F

T

T

Difference between Republican and Democratic total contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Total Individual Contributions

T

F

T

Republican total individual contributions

Federal Election Commission

GOP Total Individual Contributions Score

F

T

Difference between Republican and Democratic total individual contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Total Loan Repayments

T

F

T

Republican total loan repayments

Federal Election Commission

GOP Total Loan Repayments Score

F

T

T

Difference between Republican and Democratic total loan repayments (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Total Loans Received

T

F

T

Republican total loans received

Federal Election Commission

GOP Total Loans Received Score

F

T

T

Difference between Republican and Democratic total loans received (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Transfers From Other Authorized Committees

T

F

T

Republican transfers from other authorized committees

Federal Election Commission

GOP Transfers From Other Authorized Committees Score

F

T

T

Difference between Republican and Democratic transfers from other authorized committees (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Transfers To Other Authorized Committees

T

F

T

Republican transfers to other authorized committees

Federal Election Commission

GOP Transfers To Other Authorized Committees Score

F

T

T

Difference between Republican and Democratic transfers to other authorized committees (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Unitemized Individual Contributions

T

F

T

Republican unitemized individual contributions

Federal Election Commission

GOP Unitemized Individual Contributions Score

F

T

T

Difference between Republican and Democratic unitemized individual contributions (larger differences have values approaching 1, smaller differences have values approaching 0)

Federal Election Commission

GOP Vote Count Last3

T

T

F

Republican vote count from previous 3 cycles

Historical election results

GOP Vote Percent Last3

T

T

F

Republican vote percent from previous 3 cycles

Historical election results

GOP Win

T

T

F

Response variable; Boolean indicating whether the Republican won

Historical election results

Grn Last Vote Percent

T

F

F

Green Party vote percent from previous cycle (same cd)

Historical election results

Hispanic Pct

F

F

T

Hispanic population percent

US Census Bureau

HMP Involved

T

F

T

HMP spent money T/F

Federal Election Commission

HMP Percent

T

F

T

Percent of spending from HMP

Federal Election Commission

Incumbent

T

T

F

1 = GOP incumbent, 0 = no incumbent, -1 = Dem incumbent

Historical election results

Ind Last Vote Percent

T

F

F

Independent party vote percent from previous cycle (same cd)

Historical election results

Index of Consumer Sentiment

F

F

T

Index of Consumer Sentiment

Federal Reserve Economic Data

Industrial Production Index

F

F

T

Industrial Production Index

Federal Reserve Economic Data

Last D 2 Party Pct

F

F

T

Democratic percentage of two-party vote in previous election

Historical election results

Last D Overall Pct

F

F

T

Democratic percentage of overall vote in previous election

Historical election results

Last D Pres Percent

T

F

F

Democrat's share of two-party vote, previous election

Historical election results

Last GOP Percent

F

T

F

Same as "GOP Last Vote Percent" variable, but for the Senate

Historical election results

Last R 2 Party Pct

F

F

T

Republican percentage of two-party vote in previous election

Historical election results

Last R Overall Pct

F

F

T

Republican percentage of overall vote in previous election

Historical election results

Lib Last Vote Percent

T

F

F

Libertarian Party vote percent from previous cycle (same cd)

Historical election results

Lib Vote Percent Last3

T

T

F

Libertarian vote percent from previous 3 cycles

Historical election results

Median Age

F

F

T

Median age

US Census Bureau

Midterm

F

T

F

Midterm election T/F

Historical election results

NAOR Involved

T

T

T

NAOR spent money T/F

Federal Election Commission

NAOR Percent

T

T

T

Percent of spending from NAOR

Federal Election Commission

NASDAQ

F

F

T

NASDAQ Composite

Federal Reserve Economic Data

National Polls

T

T

T

Average support in national ballot test polling

Compiled in-house

Non-Farm Pay

F

F

T

Nonfarm payrolls

Federal Reserve Economic Data

Nonhispanic White Pct

T

T

T

Non-Hispanic white population percent

US Census Bureau

NRCC Involved

T

F

F

NRCC spent money T/F

Federal Election Commission

NRCC Percent

T

F

F

Percent of spending from NRCC

Federal Election Commission

Of Prespty

T

T

F

If GOP candidate is of the president's party

Historical election results

Of Prespty By Midterm

T

T

F

1 = GOP candidate is of sitting president's party and it is a midterm election year, 2 = pres party and presidential election year, 3 = not pres party and midterm, 4 = not pres party and pres year

Calculated in-house

Open Seat

T

T

F

Indicates whether this is an open seat election (current election)

Historical election results

Over 20 Million

F

T

T

Indicates whether more than $20 million in total was spent by all candidates combined

Federal Election Commission

Over 3 Million

T

F

T

Indicates whether more than $3 million in total was spent by all candidates combined

Federal Election Commission

Per Capita Income

F

F

T

Per capita personal income

Federal Reserve Economic Data

Personal Consumption Expenditures

F

F

T

Personal consumption expenditures

Federal Reserve Economic Data

Pop Density

T

F

T

Population density of a cd/state

US Census Bureau

Pres By Midterm

F

T

F

Gives party of sitting president and indicates whether election cycle is midterm (e.g. "R1" if Republican president and midterm election year; "D0" if Democratic president and not a midterm election year)

Calculated in-house

Prespty

T

T

F

Party of current president

Historical election results

Previous Party

T

T

F

Names the party that previously held the seat

Historical election results

Primary HHI

T

T

F

HHI using primary voters in the Democratic and GOP primaries combined

Calculated in-house

PVI

T

T

T

Cook Partisan Voting Index (positive values are R+, negative are D+)

Calculated in-house based on Cook formula

PVI Adjusted Net Approval

F

F

T

PVI minus net approval for Democratic president, plus net approval for Republican president

Calculated in-house

R 2 Party Pct

F

F

T

Republican percentage of two-party vote

Historical election results

R Candidate Ideology

F

F

T

Republican candidate ideology

Database on Ideology, Money in Politics, and Elections

R Consecutive Terms

F

F

T

Number of consecutive Republican terms

Historical election results

R Home State

F

F

T

Home state of Republican presidential candidate

Historical election results

R IEM price

F

F

T

Closing price for Republican candidate in winner-take-all market on day before election

Iowa Electronic Markets

R Incumbent Candidate

F

F

T

Republican incumbent running

Historical election results

R Incumbent Party

F

F

T

Republicans are the incumbent party

Historical election results

R Overall Pct

F

F

T

Republican percentage of overall vote

Historical election results

R President Net Approval

F

F

T

Net approval rating for Republican president

The American Presidency Project

R Primary Margin

F

F

T

Difference in overall primary popular vote percentage between Republican nominee and closest primary challenger

Historical election results

R VP Home State

F

F

T

Home state of Republican vice presidential candidate

Historical election results

R Win

F

F

T

Republican win

Historical election results

Race ID

T

T

T

Unique identifier reflecting office, year, state, district/class

n/a

Real Personal Income

F

F

T

Real personal income

Federal Reserve Economic Data

Redistricted

T

F

F

Indicates redistricting since last election

Calculated in-house

Republican Gender

T

T

F

Gender of GOP candidate

Database on Ideology, Money in Politics, and Elections

Republican Ideology Score

T

T

F

Ideal point estimate of Republican candidate ideology based on campaign finance records (positive values are more conservative, negative values are more liberal; the further away from 0 a value is, the more extreme their ideology)

Database on Ideology, Money in Politics, and Elections

State

T

T

T

State the election is being held in

Historical election results

State Ideology

F

F

T

State/district ideology

American Ideology Project

State Polls

T

T

T

Average support in state-level ballot test polling

Compiled in-house

Total Money in Race

T

T

T

Total money spent by GOP and Dem

Federal Election Commission

Turnout Count Last

F

F

F

Total turnout from previous cycle

Historical election results

Turnout Count Last3

T

T

F

Total voter turnout for last 3 cycles

Historical election results

Unemployment Rate

T

T

T

Unemployment rate

Federal Reserve Economic Data

Unemployment Rate Net Change

T

T

F

State unemployment rate net change over year

Federal Reserve Economic Data

Unemployment Rate Percent Change

T

T

F

State unemployment rate percent change over year

Federal Reserve Economic Data

Unopposed Democrat

T

F

F

Whether Democrat is unopposed in this election

Historical election results

Unopposed Democrat Last Cycle

T

F

F

Whether Democrat ran unopposed in previous election

Historical election results

Unopposed Republican

T

F

F

Whether GOP is unopposed in this election

Historical election results

Unopposed Republican Last Cycle

T

F

F

Whether GOP ran unopposed in previous election

Historical election results

Urban Pop Density

F

F

T

Percent of population in urban areas

US Census Bureau

Urban Population Percent

T

T

F

Percent urban population

US Census Bureau

Year

T

T

T

Calendar year election occurs within

Historical election results
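
The three HHI features in the table above are described as a Herfindahl-Hirschman index over the primary vote share distribution. As a minimal sketch, assuming the standard definition (the sum of squared shares) on a 0 to 1 scale, the calculation could look like the following. The function name `primary_hhi` and the choice of scale are illustrative assumptions, not the authors' in-house code.

```python
def primary_hhi(votes):
    """Herfindahl-Hirschman index of a primary field.

    `votes` holds one nonnegative vote count (or share) per candidate.
    Assumption: shares are normalized to sum to 1, so the index runs from
    1/n (n evenly matched candidates) up to 1.0 (a single candidate).
    """
    total = sum(votes)
    if total <= 0:
        raise ValueError("at least one candidate must have votes")
    return sum((v / total) ** 2 for v in votes)


# Hypothetical primaries, not real results.
print(primary_hhi([100]))         # uncontested primary
print(primary_hhi([50, 50]))      # two evenly matched candidates
print(primary_hhi([60, 30, 10]))  # one clear front-runner
```

Under this convention a higher value indicates a less competitive primary, which is one plausible reason to carry the index as a feature.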

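Several of the in-house features listed above are simple recodings of other columns. The sketch below follows the table's descriptions for the party-specific net approval features, `Of Prespty By Midterm`, `Pres By Midterm`, and `PVI Adjusted Net Approval`. All function names, argument names, and input values are hypothetical, and treating the party-specific approval inputs to the PVI adjustment as zero-coded is an assumption.

```python
def net_approval_by_party(approve, disapprove, president_party):
    """Return (Dem, GOP) net approval; the party out of power is coded 0."""
    net = float(approve - disapprove)
    dem = net if president_party == "D" else 0.0
    gop = net if president_party == "R" else 0.0
    return dem, gop


def of_prespty_by_midterm(gop_is_president_party, is_midterm):
    """Four-level code from the table: 1 and 2 = GOP is president's party."""
    if gop_is_president_party:
        return 1 if is_midterm else 2
    return 3 if is_midterm else 4


def pres_by_midterm(president_party, is_midterm):
    """String code such as "R1" (Republican president, midterm year)."""
    return f"{president_party}{int(is_midterm)}"


def pvi_adjusted_net_approval(pvi, dem_net, gop_net):
    """PVI minus Democratic net approval, plus Republican net approval."""
    return pvi - dem_net + gop_net


# Hypothetical inputs: Republican president, 45% approve, 52% disapprove,
# presidential election year, state with PVI of R+3.
dem_net, gop_net = net_approval_by_party(45, 52, "R")
print(dem_net, gop_net)
print(of_prespty_by_midterm(True, False))
print(pres_by_midterm("R", False))
print(pvi_adjusted_net_approval(3.0, dem_net, gop_net))
```

Because PVI is signed so that positive values favor Republicans, subtracting Democratic net approval and adding Republican net approval keeps the adjusted feature on the same orientation.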

References

Abramowitz, A. I. (1975). Name familiarity, reputation, and the incumbency effect in a congressional election. Western Political Quarterly, 28(4), 668–684. https://doi.org/10.2307/447984

Barrut, B., & Schofield, N. (2016). Measuring campaign spending effects in post-Citizens United congressional elections. In The Political Economy of Social Choices (pp. 205–232). https://doi.org/10.1007/978-3-319-40118-8_9

Brady, D. W., D’Onofrio, R., & Fiorina, M. P. (2000). The nationalization of electoral forces revisited. In D. W. Brady, J. F. Cogan, & M. P. Fiorina (Eds.), Continuity and Change in House Elections (pp. 130–148).

Campbell, J. E. (2010). The seats in trouble forecast of the 2010 elections to the US House. PS: Political Science & Politics, 43(4), 627–630. https://doi.org/10.1017/S1049096510001095

Edwards, G. C. (2009). Presidential approval as a source of influence in Congress. Oxford Handbook of the American Presidency. https://doi.org/10.1093/oxfordhb/9780199238859.003.0015

Erikson, R. S. (1971). The advantage of incumbency in congressional elections. Polity, 3(3), 395–405. https://doi.org/10.2307/3234117

Erikson, R. S. (1988). The puzzle of midterm loss. The Journal of Politics, 50(4), 1011–1029. https://doi.org/10.2307/2131389

Green, D., & Gerber, A. S. (2006). Can registration-based sampling improve the accuracy of midterm forecasts? Public Opinion Quarterly, 70(2), 197–223. https://doi.org/10.1093/poq/nfj022

Jacobson, G. C. (1978). The effects of campaign spending in congressional elections. American Political Science Review, 72(2), 469–491. https://doi.org/10.2307/1954105

Lewis-Beck, M. S., & Rice, T. W. (1984). Forecasting U.S. House elections. Legislative Studies Quarterly, 9, 475–486. https://doi.org/10.2307/439492

Lewis-Beck, M. S., & Tien, C. (2014). Congressional election forecasting: structure-X models for 2014. PS: Political Science & Politics, 47(4), 782–785. https://doi.org/10.1017/S1049096514001267

Montgomery, J. M., Hollenbach, F. M., & Ward, M. D. (2012). Improving predictions using ensemble Bayesian model averaging. Political Analysis, 20(3), 271–291. https://doi.org/10.1093/pan/mps002

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12(85), 2825–2830. https://www.jmlr.org/papers/v12/pedregosa11a.html

Shirani-Mehr, H., Rothschild, D., Goel, S., & Gelman, A. (2018). Disentangling bias and variance in election polls. Journal of the American Statistical Association, 113(522), 607–614. https://doi.org/10.1080/01621459.2018.1448823

Stokes, D. E., & Miller, W. E. (1962). Party government and the saliency of Congress. Public Opinion Quarterly, 26(4), 531–546. https://doi.org/10.1086/267126

Tufte, E. R. (1975). Determinants of the outcomes of midterm congressional elections. American Political Science Review, 69(3), 812–826. https://doi.org/10.2307/1958391

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x


©2020 Kiel Williams, Mukul Ram, Matthew Shor, Sreevani Jarugula, Dan DeRemigi, Alex Alduncin, and Scott Tranter. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
