The Primary Model and the 2008 Presidential Election

The PRIMARY MODEL predicted that in a race of New Hampshire Primary winners, Democrat Hillary Clinton would narrowly defeat Republican John McCain in the November general election (50.5 to 49.5 percent of the two-party vote). The predicted margin of victory, however, was so small that the confidence attached to this forecast was less than 60 percent, given the size of the forecast standard error (2.5).

In match-ups between the Republican primary winner and Democratic primary losers, McCain was forecast to end up in a virtual tie with Barack Obama (49.9 to 50.1 percent) while defeating John Edwards (52.2 to 47.8 percent) by a margin close to one unit of the forecast standard error (2.6). At the same time, in match-ups between the Democratic primary winner and Republican primary losers, Clinton would have dispatched Mitt Romney, Mike Huckabee, and Rudolph Giuliani by margins well beyond that error range.

Finally, in match-ups between primary losers, both Obama and Edwards were forecast to beat any of the Republicans, and quite handily so in most cases. Candidates not listed in the forecast table were forecast to do no better than the weakest one in their respective parties.

Voting results for the New Hampshire Primary, Jan. 8, 2008:

Republicans: McCain (37.2), Romney (31.6), Huckabee (11.2), Giuliani (8.6)

Democrats: Clinton (39.2), Obama (36.4), Edwards (16.9)


The Primary Model


American elections in November are typically preceded by “primary” elections earlier in the year. So is the voting in presidential primaries a leading indicator of the vote in November? Remarkably so, as it turns out. How well presidential candidates do in primary elections foretells their prospects in the November election with great accuracy.

In addition to primary elections, the forecast model relies on a cyclical dynamic detected in presidential elections. The model estimates are based on presidential elections going back as far as 1912, the first year of presidential primaries, with an adjustment applied to partisanship for pre-New Deal elections. The primary performance of the incumbent-party candidate and that of the opposition-party candidate enter as separate predictors. For elections since 1952, the primary-support measure relies solely on the New Hampshire primary.

The absence of a sitting president from the race is no problem. The model does not use presidential approval or the state of the economy as predictors. Instead it relies on the performance of the presidential nominees in primaries (hence the sobriquet PRIMARY MODEL). (1)

The PRIMARY MODEL treats the primary performance of the incumbent-party candidate and that of the opposition-party candidate as two separate predictors. For elections since 1952, the primary-support measure relies solely on the New Hampshire primary; for elections before that date, it uses all primaries. The model is estimated with data from presidential elections going back as far as 1912, the first year of presidential primaries, with an adjustment applied to partisanship for pre-New Deal elections. In addition to primaries, the forecast model takes advantage of a cycle in presidential elections that appears related to the two-term limit of the American presidency. The out-of-sample forecasts of the PRIMARY MODEL pick the popular-vote winner in all but one of the 24 elections from 1912 to 2004.

With the New Hampshire Primary decided, the model is capable of making a final forecast for any match-up in November between Democratic and Republican candidates. In the race between the two primary winners, Democrat Hillary Clinton would edge Republican John McCain by a narrow margin: 50.5 to 49.5 percent of the two-party vote. Barack Obama would win by the narrowest of margins over McCain: 50.1 to 49.9 percent. Needless to say, both of these forecasts are well within one unit of the forecast standard error.

Candidate Support in Primaries

Ever since presidential primaries were introduced, in 1912, the ultimate nominees have played a key role in those contests. Only once (1920) did neither party give the presidential nomination to its primary winner. How primary support for a presidential nominee translates into general election support is best examined separately for the party with a president in the White House and the out-party. In the party holding the White House at the time of a presidential election, many of the nominees, of course, are presidents seeking reelection. Or they are incumbent vice presidents winning their party’s nomination (e.g., Al Gore in 2000), turning the presidential contest into a “succession election” (Weisberg and Hill 2004). During the period of interest (1912-2004), it was quite rare for the incumbent party to nominate a presidential candidate lacking any official connection to the outgoing administration (e.g., Democrats in 1952).

Until 1952, no single state with a primary managed to play the leading role in the presidential-selection drama. That changed with swift and lasting impact when the state holding the first contest decided to put presidential candidates rather than convention delegates on the ballot. Since 1952, New Hampshire has allowed primary voters to check their preferences for would-be presidents rather than delegates. That switch “gave presidential hopefuls an opportunity to demonstrate early strength” (Buell 2000, 93), and they seized on it immediately. The beauty-contest format also propelled New Hampshire into the most coveted spot of the primary season, attracting more media attention than any other state (Adams 1987). Since 1952, a win in New Hampshire, however small and unrepresentative the state, has meant a boost for a presidential hopeful that a victory in no other state could match. At the same time, many of the subsequent primaries have lacked competition, proving little about the electoral appeal of the leading candidate in the general election. So, beginning with 1952, only the vote in the New Hampshire primary is used, whereas for elections from 1912 to 1948 the vote of all primaries is used.

Looking for predictors of presidential elections, one can see the advantage of primary outcomes in the entries in Table 1. (2) The first presidential election with primaries set a remarkable precedent. In that year, the sitting president (Taft) was challenged for the nomination of the Republican Party and barely mustered one-third of the Republican primary vote. His chief rival, Teddy Roosevelt, a former president, beat Taft soundly with 51.5 percent. (3) Yet while Roosevelt failed to be rewarded with the party nomination at the Republican national convention, the nominee (Taft, after all) lost the general election. Denying the primary winner the nomination proved costly for the incumbent party. At the same time, the Democrats in 1912 nominated Woodrow Wilson, who had won the primaries and went on to victory in the general election. Message: the party that nominates its primary winner wins the general election over the party that does not. Ditto in the following presidential election, when Wilson (a primary winner, again) beat the Republican candidate Hughes (a primary loser). Is this a pattern that has held up since then in presidential contests?

Figure 1 plots the vote in the general election against primary support in the incumbent party. To establish a standard measure, the primary support of the nominee in each party is computed as the nominee’s share of the combined vote received by that candidate and his chief primary rival (the one with the next most votes, or the leading vote-getter if the nominee did not win the primary battle). That rule will also be applied to primary contests of the opposition party, where it is actually far more compelling. As for the vote in the general election, the share of the incumbent-party candidate is based on the major-party vote only; votes for third-party candidates are excluded. (4)
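The standard measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's own code: the function name primary_support and the dictionary layout are assumptions, and the measure is read here as the nominee's percentage share of the combined nominee-plus-chief-rival vote. The inputs are the 2008 New Hampshire results listed at the top of this article.

```python
def primary_support(votes, nominee):
    """Return the nominee's share (in percent) of the votes cast for the
    nominee and the chief primary rival.

    votes   -- dict mapping candidate name to primary vote (counts or percent)
    nominee -- name of the candidate whose support is being measured
    """
    rivals = {name: v for name, v in votes.items() if name != nominee}
    # The chief rival is the top vote-getter among everyone else: the
    # runner-up if the nominee won, the winner if the nominee lost.
    chief_rival = max(rivals.values())
    return 100.0 * votes[nominee] / (votes[nominee] + chief_rival)


# New Hampshire primary, Jan. 8, 2008 (percent of the vote, from the text).
republicans = {"McCain": 37.2, "Romney": 31.6, "Huckabee": 11.2, "Giuliani": 8.6}
democrats = {"Clinton": 39.2, "Obama": 36.4, "Edwards": 16.9}

for party in (republicans, democrats):
    for name in party:
        print(f"{name:9s} {primary_support(party, name):5.1f}")
```

Under this reading, a primary winner always scores above 50 and a primary loser below 50, which is what makes the 50-percent mark a natural dividing line in Figures 1 and 2.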

As shown by Figure 1, primary support offers a strong, though not perfect, predictor of electoral support for incumbent-party candidates in the general election. Any time primary support falls below 50% (by the standard measure adopted here), the presidential candidate of the party holding the White House loses in the general election (getting less than 50% of the major-party vote). The precedent was set by President William Howard Taft in 1912: he lost the primary battle and went on to lose the general election. By the same token, nearly every time primary support exceeds 50%, the candidate of the White House party goes on to victory in November. That precedent was set by President Woodrow Wilson in 1916: he won the primary battle and went on to win the general election. But there are exceptions to this rule. Several times an incumbent candidate was defeated in the general election despite winning most of the primary support.

A case in point is President George H. W. Bush in 1992: ahead in the primary count, but behind in November. It appears that for sitting presidents 50 percent is not a safe mark. Significant opposition in the primaries hints at trouble for re-election. Yet regardless of whether or not a sitting president is running, incumbent-party candidates appear to gain little further in general-election safety once they reach about 70 percent of support in the primary battle. In other words, the predictive relationship between primary support and November vote is not linear, or is linear only within a restricted range of primary support. That is a point to consider for the estimation of the forecast model.

Turning to the primary battle within the out-party, Figure 2 suggests that the better the opposition-party candidate does in primaries, the worse the incumbent party fares in the November election. Primary success and general election victory go hand in hand for the out-party. That was the precedent set by Wilson in 1912. But it did not always hold. Most notably, it did not do so for Al Smith (1928) or Michael Dukakis (1988); but then in each of those instances the incumbent party had nominated its primary winner. By the same token, the electoral prospects for the out-party in November are gloomy when the primary support of its candidate falls short of 50%. In most of those elections, the incumbent party has won the general election. One exception is the 1920 election. The Democrats lost the White House in a landslide to a Republican candidate (Warren Harding) who barely registered in his party’s primaries that year. As Figure 2 makes clear, the 1920 case is an outlier, but it is also the only outlier. Even in the old days—contrary to much conventional wisdom—winning general elections without strong primary support was not common. In sum, the predictive power of primary success for general election performance is impressive for the out-party, competing with that of the incumbent party in some cases and complementing it in others.

The Forecast Model

In addition to primaries, the PRIMARY MODEL also enlists a cyclical dynamic of the presidential vote that is useful for forecasting all by itself (Midlarsky 1984, Norpoth 1995). A compelling explanation for that dynamic is the existence of a term limit in presidential elections (Norpoth 2002). Except for FDR, American presidents eschewed running for more than two terms, and they have been barred from doing so ever since. The rule guarantees that incumbent presidents are missing from those contests in some periodic fashion, as will be the case in 2008. In many such instances the absence of a sitting president with a high degree of popularity may improve the chances of the opposition party of capturing the White House. Given his high approval rating, Bill Clinton’s ineligibility in 2000 probably hurt the Democratic prospects that year, although the absence of a much less popular George W. Bush in 2008 may be a blessing for the GOP. In any event, elections without a sitting president in the race tend to favor the opposition party more than elections with an incumbent running for another term. We can model this periodicity in presidential elections by means of a second-order autoregressive process, as proposed long ago by Yule (1927, 1971). All it takes is a positive sign of the coefficient for the vote in the preceding presidential election and a negative sign of the coefficient for the vote in the presidential election two terms back.
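To see why a positive first-lag coefficient combined with a negative second-lag coefficient produces a cycle, the following sketch iterates a second-order autoregressive recursion with no random shocks and no primary-support terms. It is an illustration under stated assumptions rather than the estimated model: the two coefficients are the ones that appear in the article's 2008 prediction equation, the function name ar2_path is hypothetical, and the 5-point starting surge is an arbitrary choice.

```python
# Lag coefficients as they appear in the article's 2008 prediction equation.
PHI_1 = 0.368    # vote in the preceding election (positive sign)
PHI_2 = -0.383   # vote two elections back (negative sign)


def ar2_path(first, second, steps):
    """Iterate x[t] = PHI_1 * x[t-1] + PHI_2 * x[t-2] with no random shocks.

    Values are deviations (in points) from the long-run vote level.
    """
    path = [first, second]
    for _ in range(steps):
        path.append(PHI_1 * path[-1] + PHI_2 * path[-2])
    return path


# Hypothetical starting point: at the long-run level, then a 5-point surge.
path = ar2_path(0.0, 5.0, 8)
for t, x in enumerate(path):
    print(f"election {t}: {x:+.2f}")
```

The printed deviations stay positive for a couple of elections, turn negative, and then swing back while shrinking toward zero, which is the damped back-and-forth between the parties that the cyclical predictor is meant to capture.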

Finally, the forecast model includes an adjustment for long-term partisanship. While there is much dispute over a certifiable realignment during the last half century, there is no question about the reality of the New Deal realignment. The 1930s, by all accounts, witnessed a major shift of the baseline of partisan support, as recently confirmed by a time series analysis of the congressional vote from 1828 on (Norpoth and Rusk 2007). The forecast model incorporates this historic shift of the partisan baseline, but no further ones. As shown below, the partisan baseline in presidential elections since the 1930s stays very close to the point of equal division.

The parameters of the model are statistically estimated with data from presidential elections since 1912. Note that the dependent variable is the Democratic percentage of the major-party vote, regardless of whether that party was in the White House or not. As a result, the primary-support variables had to be inverted for elections with Republicans in control. (5) The evidence in Table 2 confirms that all predictors prove significant. The effect of primary support for the incumbent-party candidate is enormous and far stronger than is the effect of primary support for the opposition-party candidate. Hence whatever is happening in the Republican primary race carries far more weight for the ultimate election outcome than what happens in the Democratic race. There is also strong evidence for the cyclical dynamic. The estimates for the two autoregressive vote parameters translate into an expected periodicity of 5.3 for presidential elections. Put simply, a party can expect to hold the White House for about two and a half terms. Going for a third term, as Republicans are trying to do in 2008, would seem to be an even bet. Finally, the partisan adjustment pays off handsomely. The pre-New Deal level of partisanship put the Democrats at a sizable disadvantage in presidential elections during the early period covered here. Yet, as indicated by the estimate for the constant right at the 50-mark, the partisan competition in presidential elections since then has been very even, notwithstanding the lead that Democrats enjoyed in party identification for much of that period.

The 2008 Forecast

So what outcome does this vote model forecast for the 2008 presidential election? All of the information required by the model is known by now—the vote in the last two elections, the outcome of the New Hampshire Primary, along with the partisan adjustment. Hence we can offer unconditional forecasts for any match-up of candidates. The only uncertainty is which of those match-ups will be on the November ballot. If past experience is any guide, we won’t have to wait until the national conventions to know the identities of the nominees for sure, especially with the heavily front-loaded schedule of primaries this year.

The prediction equation for the presidential vote in 2008 (expressed as the Democratic share of the major-party vote) is:

.361 (RPRIM – 55.6) (-1) + .124 (DPRIM – 47.1) + .368 (48.8) – .383 (50.3) + 50.7

= .361 (RPRIM – 55.6) (-1) + .124 (DPRIM – 47.1) + 49.4

where RPRIM and DPRIM represent the primary support of the Republican (incumbent party) and Democratic (opposition party) nominees for President, capped within a 30-70 percent range. (6) It can be quickly seen that at mean levels of primary support for both candidates (55.6 for the incumbent-party Republican and 47.1 for the opposition-party Democrat), the model predicts a narrow defeat for the Democratic ticket with 49.4 percent of the vote, albeit within one forecast standard error (2.5). Put another way, this would be the forecast derived solely from the cyclical dynamic with candidate strength held constant.
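As a worked check, the prediction equation can be evaluated directly. The sketch below is a hedged illustration, not the author's code: the function names clamp, forecast_2008, and support are hypothetical, and primary support is assumed to be the nominee's percentage share of the combined vote of the nominee and the chief primary rival, using the New Hampshire percentages reported in this article.

```python
def clamp(value, low=30.0, high=70.0):
    """Cap primary support within the 30-70 percent range."""
    return max(low, min(high, value))


def forecast_2008(rprim, dprim):
    """Democratic share of the major-party vote, per the equation in the text."""
    cycle = 0.368 * 48.8 - 0.383 * 50.3 + 50.7   # about 49.4
    return (-0.361 * (clamp(rprim) - 55.6)
            + 0.124 * (clamp(dprim) - 47.1)
            + cycle)


def support(nominee, rival):
    """Nominee's percent share of the nominee-plus-chief-rival vote."""
    return 100.0 * nominee / (nominee + rival)


mccain = support(37.2, 31.6)     # New Hampshire, Republican side
clinton = support(39.2, 36.4)    # New Hampshire, Democratic side
obama = support(36.4, 39.2)

print(round(forecast_2008(55.6, 47.1), 1))       # mean support: 49.4
print(round(forecast_2008(mccain, clinton), 1))  # 50.5
print(round(forecast_2008(mccain, obama), 1))    # 50.1
```

With these inputs the sketch returns the 49.4 baseline at mean support and the Clinton and Obama figures against McCain quoted in the article, which suggests the assumed definition of primary support is at least close to the one used for the forecasts.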

For primary support above and below those means, consider the following scenarios: One, a match-up of the primary winners; two, a match-up between the primary winner in one party and a loser in the other party; and three, a match-up between primary losers. Table 3 presents 2008 forecasts for match-ups between each of the tier-one candidates in both parties (Democrats Clinton, Obama, and Edwards vs. Republicans McCain, Romney, Huckabee, and Giuliani). In reading the forecasts in Table 3, keep in mind that these percentages refer to the Democratic share of the major-party vote; hence the Republican share is simply the complement of 100 percent. Conditional forecasts for a grid of primary-support levels are presented in the Appendix (see below).

The PRIMARY MODEL predicts that in a race of New Hampshire Primary winners, Democrat Hillary Clinton would narrowly defeat Republican John McCain in the November general election (50.5 to 49.5 percent of the two-party vote). The predicted margin of victory, however, is so small that the confidence attached to this forecast is less than 60 percent, given the size of the forecast standard error (2.5). In match-ups between the Republican primary winner and Democratic primary losers, McCain would end up in a virtual tie with Barack Obama (49.9 to 50.1 percent) while defeating John Edwards (52.1 to 47.9 percent) by a margin close to one unit of the forecast standard error (2.6). At the same time, in match-ups between the Democratic primary winner and Republican primary losers, Clinton would dispatch Mitt Romney, Mike Huckabee, and Rudolph Giuliani by margins way beyond that error range. Finally, in match-ups between primary losers, both Obama and Edwards would beat any of the Republicans, and quite handily so in most cases.

That is no sign of partisan bias. Rather, it has to do with the PRIMARY MODEL assigning more weight to the primary performance of incumbent-party candidates than to the performance of out-party candidates. Nominating a primary loser, or even a candidate with a lackluster primary showing, costs the incumbent party more dearly than it does the out-party. Candidates not listed in the forecast table would do no better than the weakest one in their respective parties.
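The "less than 60 percent" confidence attached to the Clinton-McCain forecast can be roughly reproduced from the forecast and its standard error. The sketch below assumes normally distributed forecast errors, which is a simplification; the article does not state which distribution was used, and the function name win_probability is hypothetical.

```python
import math


def win_probability(forecast, std_error, threshold=50.0):
    """P(actual share > threshold) if the forecast error is normal."""
    z = (forecast - threshold) / std_error
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(f"{win_probability(50.5, 2.5):.3f}")  # Clinton vs. McCain
print(f"{win_probability(50.1, 2.5):.3f}")  # Obama vs. McCain
```

A forecast only half a point above the 50-mark sits a fifth of a standard error from a tie, so the implied chance of a Democratic popular-vote win is only modestly better than a coin flip.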

Forecast Diagnostics

How much confidence should one have in the model producing these forecasts? Earlier versions of this model predicted the popular-vote victories for Clinton in 1996, Gore in 2000, and Bush in 2004 (Campbell and Garand 1996, 8; Norpoth 2001, 45; Norpoth 2004). While all these versions relied on the cyclical dynamic, primary support has been adapted in several ways. In addition to primaries in the incumbent party, the opposition party has been included as well. What is more, instead of using a simple win-lose dichotomy, the relative share of primary support of each party’s nominee has been employed; the latest version of the forecast model constrains such support within a 30-70 range. And the model has also incorporated a partisan adjustment for the pre-New Deal level of long-term partisanship. Judging by the model standard error (2.38), the latest version tops all its predecessors in fitting the outcomes of presidential elections covered.

A key diagnostic test of a forecast model lies in its ability to come up with accurate out-of-sample predictions. This involves re-estimating the model for (n-1) elections and then using the respective model estimates to predict the omitted case. Table 4 presents such forecasts along with forecast standard errors and deviations of actual from predicted outcomes for all elections in the time frame covered here (1912-2004). There is only one election where the forecast misses the popular-vote winner, and even that miss (1960) is debatable. (7) Only one of the forecasts is off by more than two units of the forecast standard error, and that was an election (1972) which ended up in a landslide almost five standard errors from the 50-mark. To be sure, the out-of-sample forecast for 2000 does not pick George W. Bush as the winner of the election. The forecast model is strictly a popular-vote model, and that is what George W. Bush certainly did not win. (8)
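The out-of-sample procedure described above (drop one election, re-estimate, predict the omitted case) can be sketched generically. The example below is a simplified stand-in, not the PRIMARY MODEL itself: it fits a one-predictor least-squares line rather than the full set of predictors, the function names are hypothetical, and the numbers are toy values chosen for illustration, not the election data in Table 4.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out(xs, ys):
    """Predict each case from a model fitted to the other n - 1 cases."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * xs[i])
    return predictions


# Toy numbers for illustration only; these are NOT the election data.
support = [35.0, 45.0, 55.0, 60.0, 70.0]
vote = [44.0, 47.5, 52.0, 53.0, 56.5]

for x, y, y_hat in zip(support, vote, leave_one_out(support, vote)):
    print(f"support {x:4.1f}  actual {y:4.1f}  out-of-sample {y_hat:5.2f}")
```

Because each prediction comes from a fit that never saw the case being predicted, the deviations give a more honest picture of forecasting accuracy than in-sample residuals do.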

Even though the model got it right for 2004, that forecast ranks as one of its lesser accomplishments. First posted the day after the 2004 New Hampshire Primary, it saw a far easier Bush victory ahead (with 54.7 percent of the two-party vote) than what ultimately happened. There were undeniable warning signals, to be sure. The most obvious was Bush’s anemic approval rating, dipping into the low 40’s in some polls and rarely settling above the 50-point mark. Only Truman in 1948 managed to overcome such an obstacle. It was undeniable that the mood of the country was beginning to sour over the war in Iraq, fed by an incessant stream of bad news. In midyear Bush was trailing Kerry in the horse race polls.

For all the support George W. Bush enjoyed among his partisans in the electorate, as captured by his performance in the Republican Primary in New Hampshire and polls throughout the election year, he struggled with support from voters outside his party. In the end, he managed to attract just 11 percent of Democrats and split the Independents about evenly with Kerry (as shown by the exit poll). On top of his near-unanimous backing from Republicans, that was enough to make the forecast of a Bush victory come true, though just barely. To reach the margin of the forecast, Bush needed to secure the support of at least six of every ten Independents and/or more defections among Democrats. In past elections sitting presidents with strong primary showings (such as Clinton in 1996, Reagan in 1984, and Nixon in 1972) all succeeded in making deep inroads among the other partisans and Independents in November. It remains to be seen whether this is a sign of a deepening polarization in the American electorate requiring some model revision or simply a special case of the Bush Presidency.


What is unique about the forecast model presented here is the reliance on primary elections as a predictor of the vote in the general election. The advantages of primaries as a vote predictor are several: One, it puts the model estimation on a firmer footing by letting us include elections all the way back to 1912. Two, it allows one to include both incumbent and opposition candidates. Granted, the incumbent candidate’s performance proves more powerful, but the out-party’s primary showing is not negligible. Three, primary support is not just a proxy or a trial heat, but a real-life test of the candidates’ electoral performance. And finally, the use of primaries as a predictor permits an unconditional forecast of the November vote at a very early moment. Once the New Hampshire primary contests have been decided, final forecasts are available for all possible match-ups in November. After that, the only uncertainty remaining is which of those match-ups it will be.

With either Hillary Clinton or Barack Obama facing off against John McCain in November, the forecast is for a nail-biter, not a sure time-for-a-change vote. Why such a close contest at a time when a Republican president with low approval ratings and a sagging economy portend a sure Democratic victory? For one thing, this is an election without the sitting president on the ballot. His legacy counts for a lot less than it would with him on the ballot. Historically, in elections without a sitting president the outcome in November is very close. Remember 2000, or 1960! That is the message of the cyclical predictor of the model. What is more, the GOP nominee is the primary winner—McCain coming in first in New Hampshire—and that makes the incumbent party competitive in the November election. The ability of a party to rally around an early-primary winner says a lot about its electoral strength in the general election. Plus, in this particular instance the primary winner has proven that he can appeal to voters beyond the partisan base.


1) Portions of the paper have appeared in PS: Political Science & Politics, 2004. For an excellent overview of forecast models of presidential elections, see Jones 2002, as well as Lewis-Beck and Rice 1992, and Campbell and Garand 2000. For forecasts in 2004, see PS: Political Science & Politics, Oct. 2004, and Jan. 2005.

2) The elections of 1952 and 1968 rely on the primary vote received by sitting presidents (Truman and Johnson, respectively) who later withdrew from the race. The ultimate nominees (Stevenson and Humphrey, respectively) did not compete in primaries.

3) The support for “rival” in Table 1 refers to the primary vote received by whatever rival candidate for the nomination was in second or first place in primary voting, depending on whether the nominee was the primary winner. In a few cases rival support refers to the “uncommitted” category or the sum of all other candidates.

4) For the 1912 election, the two-party vote was approximated through a regression of the congressional vote on the presidential vote. The intrusion of Teddy Roosevelt’s third-party campaign was so severe that the Republican candidate ended up in third place with only 23.2% of the total popular vote while Wilson, the Democrat, won with 41.8%. Using a regression of the House vote on the presidential vote in the 10 elections preceding and following the 1912 case (1872-1952), I derived an estimate of the two-party Republican vote in the 1912 presidential election (56.3%) that was used in this analysis. Note that the correlation between the two-party vote for president and House in that period was extremely high (.95).

5) The inversion was done around the means of the variables: 60.0 for incumbent-party candidates who were sitting presidents; 55.6 for other incumbent-party candidates; and 47.1 for out-party candidates.

6) The measure for the Republican candidate is inverted (-1) because the Democratic vote is used as the dependent variable. Note that there is no need to include the partisan adjustment in the prediction equation since this variable is scored 0 for all post-1932 elections.

7) It is by no means certain that Kennedy won the popular vote. The format of the presidential ballot in Alabama makes it nearly impossible to determine the popular vote for Kennedy and Nixon in that state. Alabama voters were able to vote for each of the 11 electors separately rather than cast a single vote for a whole slate of partisan electors. Of the 11 Democratic electors, five ran as pledged to support the official Democratic nominee (Kennedy) in the electoral college, while the other six ran as “free” electors and wound up voting for Harry F. Byrd in the electoral college. The “official” count of the popular vote for Alabama lists the votes received by the top Democratic elector—a free elector, who voted for Byrd, not Kennedy, in the electoral college. Given all these complications, it might seem justified to award Kennedy no more than 5/11 of the average vote cast for Democratic electors in Alabama (the share of pledged electors). In that event, Nixon wins the national popular vote in 1960. See Gaines (2001). 

8) The close fit of the current forecast model for 2000 is especially pleasing since the earlier version used for an advance forecast in 2000 overstated the Gore vote by nearly two standard errors. The switch from a win-lose measure of primary support to one relying on relative strength, albeit with constraints, appears to have paid off.


Adams, William C. 1987. “As New Hampshire Goes . . .” In Media and Momentum, eds. Gary Orren and Nelson Polsby. Chatham: Chatham House.

Buell, Emmett H. 2000. “The Changing Face of the New Hampshire Primary.” In Pursuit of the White House 2000, ed. William Mayer. New York: Chatham House.

Campbell, James E., and James C. Garand. 2000. Before the Vote. Thousand Oaks: Sage Publications.

Gaines, Brian J. 2001. “Popular Myths about Popular Vote-Electoral College Splits.” PS: Political Science & Politics 34: 72-75.

Jones, Randall. 2002. Who Will Be in the White House? Predicting Presidential Elections. New York: Longman.

Lewis-Beck, Michael S., and Tom W. Rice. 1992. Forecasting Elections. Washington, D.C.: CQ Press.

Midlarsky, Manus I. 1984. “Political Stability of Two-Party and Multiparty Systems: Probabilistic Bases for the Comparison of Party Systems.” American Political Science Review 78: 929-951.

Norpoth, Helmut. 1995. “Is Clinton Doomed? An Early Forecast for 1996.” PS: Political Science & Politics 28: 201-07.

Norpoth, Helmut. 2001. “Primary Colors: A Mixed Blessing for Al Gore.” PS: Political Science & Politics 34: 45-48.

Norpoth, Helmut. 2004. “From Primary to General Election: A Forecast of the Presidential Vote.” PS: Political Science & Politics 37: 737-740.

Norpoth, Helmut. 2002. “On a Short Leash: Term Limits and Economic Voting.” In The Context of Economic Voting, eds. Han Dorussen and Michael Taylor. London: Routledge, 121-136.

Norpoth, Helmut, and Jerrold Rusk. 2007. “Electoral Myth and Reality: Realignments in American Politics.” Electoral Studies 2007, forthcoming.

Pomper, Gerald M. et al. 2001. The Election of 2000. New York: Chatham House.

Weisberg, Herbert F., and Timothy G. Hill. 2004. “The Succession Presidential Election of 2000: The Battle of Legacies.” In Models of Voting in Presidential Elections: The 2000 U.S. Election, eds. Herbert F. Weisberg and Clyde Wilcox. Stanford: Stanford University Press, 27-48.

Yule, G. U. 1971. “On a Method of Investigating Periodicities in Disturbed Series with Special Reference to Wolfer’s Sunspot Numbers.” In Statistical Papers of George Udny Yule, eds. A. Stuart and M. Kendall. New York: Hafner Press, 389-420. (Originally published 1927.)