With the least favored major-party candidates in history, the 2016 presidential election has left many feeling unsatisfied and has stirred up vocal support for minor-party candidates. There are compelling reasons to support minor parties, especially insofar as they can align closely with individual principles in a way that large, generic parties cannot. On the other hand, a vote cast on principle still influences, by its absence, the practical contest between the major parties. How do you resolve these competing forces and defend your decision to rabid Facebook friends? Luckily, there’s a principle in mathematics that already unites these considerations: *maximizing expected value*.

Consider another probabilistic choice: deciding whether to play the lottery. Say the cost of a ticket is one dollar, while the prize is one million. The decision is obvious when made only on the *value* represented by each choice: having one million dollars is much better than having one dollar! But, as high school math teachers across the country shake their heads and explain, while the represented value of one million dollars is certainly better, the *expected value* of buying a ticket is nearly always worse.

The expected value of a random variable \(x\) given a choice \(i\) is the sum over all possible values of \(x\) weighted by the probability \(P(x\mid i)\) of the value given that choice, or
\[
\langle x\mid i\rangle=\sum_x x\,P(x\mid i)
\]
and represents the value you should expect on average. Say the probability \(P_{\text{win}}\) that your numbers come up is one in ten million, regardless of whether or not you buy a ticket. The expected value that results from buying a ticket
\[
\langle\text{money}\mid\text{you bought a ticket}\rangle
=\$1\,000\,000\cdot P_{\text{win}} + \$0\cdot(1-P_{\text{win}})
=\$0.1
\]
is much less than the expected value of abstaining
\[
\langle\text{money}\mid\text{you didn’t buy a ticket}\rangle
=\$1\cdot P_{\text{win}} + \$1\cdot(1-P_{\text{win}})
=\$1
\]
which is usually the case, and is why most mathematicians don’t play the lottery. Occasionally the expected value of buying a ticket *is* greater than that of abstaining, and mathematicians everywhere split hairs trying to stick to their principles.
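The lottery arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper `expected_value` and the constant names are my own, and exact fractions are used so that rounding doesn't muddy the comparison.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum value * probability over (value, probability) pairs."""
    if sum(p for _, p in outcomes) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)


# Numbers taken from the text: a $1 ticket, a $1,000,000 prize,
# and a one-in-ten-million chance that the ticket's numbers come up.
P_WIN = Fraction(1, 10_000_000)
PRIZE = 1_000_000
TICKET = 1

bought = expected_value([(PRIZE, P_WIN), (0, 1 - P_WIN)])
abstained = expected_value([(TICKET, P_WIN), (TICKET, 1 - P_WIN)])

print(float(bought))     # 0.1
print(float(abstained))  # 1.0

# In this simple model, buying only beats abstaining when
# prize * P_WIN > TICKET, i.e. when the prize exceeds TICKET / P_WIN.
break_even_prize = TICKET / P_WIN
print(break_even_prize)  # 10000000
```

With these numbers the prize would have to exceed ten million dollars before buying a ticket came out ahead.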

Though quantifying the value of a presidency isn’t straightforward, the same ideas carry through qualitatively. Voting for a candidate to advance their principles is like buying a lottery ticket to advance your pocketbook: a misleading fallacy. In fact, there are deep mathematical reasons why voting based on candidate preference can *never* faithfully represent a population’s preferences. However, given information about how others plan to vote, you can work to maximize the expected value of your choice.

Say we label the candidates A (Hillary Clinton), B (Donald Trump), and C (Gloria La Riva), and I assign them relative values of \(v_{\text A}=2\), \(v_{\text B}=-5\) and \(v_{\text C}=4\). I can estimate the expected value of a vote for candidate \(i\), or \(\langle v\mid i\rangle\), for \(i=\text A,\text B,\text C\):
\begin{align}
\langle v\mid i\rangle
&=v_{\text A}P(\text A\mid i)+v_{\text B}P(\text B\mid i)+v_{\text C}P(\text C\mid i)\\
&=2P(\text A\mid i)-5P(\text B\mid i)+4P(\text C\mid i)
\end{align}
where again \(P(a\mid i)\) is the probability that candidate \(a\) will be elected given that I vote for candidate \(i\). The largest result will decide my vote.

A vote for candidate B will obviously lower the expected value, so let’s focus on the difference between voting for candidates A and C. Voting for candidate C over A will increase the probability that candidate C is elected, decrease the probability that candidate A is elected, and increase the probability that candidate B is elected. How large are these differences relative to each other? In a swing state, where candidates A and B have nearly the same support, the increase in probability of electing candidate C is negligible compared to the decrease in probability of electing candidate A. In that situation, even given their value difference, my expected value is maximized by voting for candidate A. However, I live in New York, where candidate A is highly favored. In this case, changing my vote changes all probabilities only very slightly, and likely by similar amounts. This model seems to indicate that a vote for C maximizes my expected value.
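To make the comparison concrete, here is a rough sketch in Python. The candidate values are the ones I assigned above, but every probability in it is invented purely for illustration, not an estimate of any real election, and the function names are my own. Switching a vote from A to C is modeled as moving a small amount of probability `d_b` from A to B and `d_c` from A to C.

```python
from fractions import Fraction

# Relative values from the text.
VALUES = {"A": 2, "B": -5, "C": 4}


def expected_value(probabilities):
    """<v | i> for one table of P(candidate | my vote)."""
    if sum(probabilities.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(VALUES[c] * p for c, p in probabilities.items())


def if_i_vote_c(if_i_vote_a, d_b, d_c):
    """Shift d_b of probability from A to B and d_c from A to C."""
    return {
        "A": if_i_vote_a["A"] - d_b - d_c,
        "B": if_i_vote_a["B"] + d_b,
        "C": if_i_vote_a["C"] + d_c,
    }


def best_vote(if_i_vote_a, d_b, d_c):
    ev_a = expected_value(if_i_vote_a)
    ev_c = expected_value(if_i_vote_c(if_i_vote_a, d_b, d_c))
    return "C" if ev_c > ev_a else "A"


# Hypothetical swing state: A and B are tied, and C's chances barely move.
swing = {"A": Fraction(1, 2), "B": Fraction(1, 2), "C": Fraction(0)}
print(best_vote(swing, d_b=Fraction(1, 10**7), d_c=Fraction(1, 10**12)))  # A

# Hypothetical safe state: every shift is tiny and of similar size.
safe = {"A": Fraction(99, 100), "B": Fraction(1, 100), "C": Fraction(0)}
print(best_vote(safe, d_b=Fraction(1, 10**9), d_c=Fraction(4, 10**9)))  # C
```

Under these assumptions the change in expected value from switching works out to \(2\,d_c-7\,d_b\), so the answer depends only on how the two shifts compare, not on how likely each candidate was to begin with.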

This schematic model isn’t everything, though. For instance, I could consider more nuanced outcomes. Let’s split candidate A’s presidency into two possibilities: A1 if candidate A wins with a clear majority, and A2 if candidate A wins otherwise. Given the potential ramifications of A2, I value it less than A1, say \(v_{\text{A2}}=1\). My vote can now potentially affect the expected value in a nontrivial way, despite the fact that I live in a safe state for candidate A. The value of your outcomes might be influenced by similarly nuanced considerations: say, if you value the social capital gained from performatively voting for a minor candidate.
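A small sketch shows how splitting an outcome can flip the decision. Two things here are assumptions on my part: that A1 keeps the original value \(v_{\text{A1}}=2\), and every probability shift, which is made up for illustration. Each table lists how the probability of an outcome changes if I switch my vote from A to C.

```python
from fractions import Fraction

# v_A2 = 1 is from the text; v_A1 = 2 is assumed to keep A's original value.
VALUES = {"A1": 2, "A2": 1, "B": -5, "C": 4}


def expected_value_change(shifts):
    """Change in expected value when the outcome probabilities shift."""
    if sum(shifts.values()) != 0:
        raise ValueError("probability shifts must sum to zero")
    return sum(VALUES[outcome] * s for outcome, s in shifts.items())


UNIT = Fraction(1, 10**9)  # hypothetical scale of one vote's influence

# Safe state where my vote never changes whether A's win is a clear majority.
coarse = {"A1": -5 * UNIT, "A2": 0 * UNIT, "B": 1 * UNIT, "C": 4 * UNIT}

# Same state, but switching also moves some probability from A1 to A2.
nuanced = {"A1": -7 * UNIT, "A2": 2 * UNIT, "B": 1 * UNIT, "C": 4 * UNIT}

for name, shifts in [("coarse", coarse), ("nuanced", nuanced)]:
    change = expected_value_change(shifts)
    print(name, "-> vote C" if change > 0 else "-> vote A")
```

With these invented numbers the coarse table favors C and the nuanced one favors A, even though the shifts toward B and C are identical in both.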

In practice, this approach can look very strange. I did not participate in the New York primary this year, but if I had, it might have been best to register as a Republican and vote for John Kasich. After all, since I weigh the Democratic candidates about the same and see stark differences between the Republican candidates (not to mention that a Republican vote is more influential in this state), my expected outcome would probably have been maximized by decreasing the probability of Trump!

I don’t pretend to know how you value each election outcome, or even how you measure the probabilities as influenced by your vote. Come to your own conclusions, as quantitative or as hand-wavy as you would like. But don’t ignore the philosophy of voting laid out here: vote in a way that maximizes expected value. You can vote blindly based on preference, but you’ll be no better than the rube who keeps buying lotto tickets because they like money.