
The Presidential Race: A look at election speeches and what personality wins

Kanav Hasija
Tue 28 July 2015

Our text analytics group recently enjoyed a coffee break during a light monsoon. We discussed the idea that leaders have a unique personality trait that makes them lead. It did not take much time to convince ourselves to carry out a pet project in our free time to prove it right or otherwise. We are data science geeks, and it is in our blood to extract intelligence from data. So we began our hunt for data.

The U.S. presidential race was an ideal choice, because no leader in today’s world is bigger than the U.S. President. We discovered a worthwhile niche project at the University of California, Santa Barbara, called The American Presidency Project. Established in 1999, it claims to host the most comprehensive archive of American presidencies, with more than 100,000 documents, including the campaign speeches of presidential candidates in six U.S. elections. That was it: the data was in our hands, and now it was time to tame it!

Within an hour, we had about 1,500 speeches across all states for seven candidates in six presidential elections. It was time to extract interesting features from these speeches. Our hypothesis was that certain characteristics of spoken text affect listeners and influence their vote, so we set out to test whether features of the candidates’ speeches influence voting outcomes.

We extracted quantified measures of the Big Five personality traits (Extraversion, Emotional Stability, Agreeableness, Conscientiousness, and Openness to Experience) using a machine learning model based on Mairesse et al. (2007). The use of past, present, and future verbs in these speeches indicates whether a candidate projects a forward-looking vision or dwells on comparisons with the past; we quantified this by examining the types of verbs used. Finally, we measured optimism and pessimism in the speeches using a verified list of dictionaries to index these measures.
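As a rough illustration of the dictionary-based indexing described above, here is a minimal Python sketch that turns a speech into optimism, pessimism, future, and past rates. Everything specific in it is an assumption made for illustration: the tiny word lists, the `speech_features` name, and the shortcut of counting auxiliary words such as "will" or "was" as tense markers. The post does not publish its dictionaries or its verb-typing method, and a real pipeline would more likely rely on a part-of-speech tagger.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (assumptions for illustration;
# they are not the dictionaries used in the study).
OPTIMISM_WORDS = {"hope", "better", "promise", "opportunity", "succeed"}
PESSIMISM_WORDS = {"fear", "crisis", "failure", "decline", "worse"}
FUTURE_MARKERS = {"will", "shall"}              # crude proxy for future tense
PAST_MARKERS = {"was", "were", "had", "did"}    # crude proxy for past tense


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def speech_features(text):
    """Return each word list's share of all tokens in the speech."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input

    def rate(words):
        return sum(counts[w] for w in words) / total

    return {
        "optimism": rate(OPTIMISM_WORDS),
        "pessimism": rate(PESSIMISM_WORDS),
        "future": rate(FUTURE_MARKERS),
        "past": rate(PAST_MARKERS),
    }


features = speech_features("We will build a better future. Hope will win.")
```

Normalizing by speech length keeps long and short speeches comparable, which matters when the scores are later used as regression inputs.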

Obama and Kennedy had strikingly unique winning characteristics compared with the losing candidates in all three elections, shown below. However, this doesn’t prove anything until we run a statistically valid model on the data.


Next, we looked into what helps a candidate win elections. We ran four models with four assumptions:

  1. State – Votes: A speech in a particular state affects the percentage of votes in that state only (multivariate linear regression).
  2. State – Outcome: A speech in a particular state affects the winning outcome in that state only (logit regression).
  3. Nation – Votes: A speech in a particular state affects the percentage of votes across the nation.
  4. Nation – Outcome: A speech in a particular state affects the winning outcome across the nation.
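To make the first two model types concrete, here is a minimal, dependency-free Python sketch of a linear regression on vote share and a logit regression on win/loss, both fitted with the same batch gradient descent routine. It is not the study's code: the `fit` helper, the two features (optimism and past focus, scaled to 0..1), and every data point are hypothetical values invented for illustration, and the post does not say how its models were estimated.

```python
import math


def identity(z):
    return z


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(rows, targets, link, lr=0.5, epochs=5000):
    """Batch gradient descent; returns [intercept, coef_1, ..., coef_k].

    With link=identity this minimizes squared error (linear regression);
    with link=sigmoid it minimizes log loss (logit regression). Both
    losses share the gradient form sum((prediction - target) * x) / n.
    """
    n, k = len(rows), len(rows[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, targets):
            pred = link(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = pred - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


# Hypothetical state-level data: (optimism, past_focus) per speech.
speech_rows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
vote_share = [0.45, 0.40, 0.53, 0.48]  # made-up vote fractions

# Model 1 (State - Votes): linear regression on vote share.
linear_w = fit(speech_rows, vote_share, identity)

# Model 2 (State - Outcome): logit regression on win (1.0) or loss (0.0).
# Each hypothetical speech profile appears four times with a made-up win count.
logit_rows, won = [], []
for profile, wins in [((0.0, 0.0), 2), ((0.0, 1.0), 1),
                      ((1.0, 0.0), 3), ((1.0, 1.0), 2)]:
    for i in range(4):
        logit_rows.append(profile)
        won.append(1.0 if i < wins else 0.0)

logit_w = fit(logit_rows, won, sigmoid)
```

In this toy data the optimism coefficient comes out positive and the past-focus coefficient negative in both fits, purely by construction. With real speeches one would also want standard errors and significance tests, which a statistics package provides and this sketch does not.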

The outcomes below indicate three things very clearly:


The results demonstrate that presidential elections are also personality battles. Voters reward carefully drafted campaign speeches, optimism, and forward-looking candidates, while they punish pessimism and dwelling on the past. Similar text analysis can be applied to sales calls, customer service calls, and corporate leadership speeches.

Text analytics is as powerful a tool for political parties as it is for brands. It is real data, in real time, from real people, and it can give you the intelligence to vote. When you look at words as data, you can draw interesting insights from various intersections of transcribed spoken text, and these open up new realms of possibility. Perhaps the day is not far off when a self-learning machine will be able to compose a speech on its own and deliver it without the audience knowing.


The author of this blog is Kanav Hasija, Co-founder and Chief Research & Strategy Officer of Innovaccer Inc.

We at Innovaccer are working towards newer and better algorithms, technologies, and systems, which can help us derive the right information from a piece of text, a powerful tool for any industry.

Learn more about us at

