Tom Inglesby, editor
Big Data. In 2014—ages ago in internet time—Hugh J. Watson of the Department of MIS at the University of Georgia explained, “From an evolutionary perspective, big data is not new. A major reason for creating data warehouses in the 1990s was to store large amounts of data. Back then, a terabyte was considered big data.”
In fact, Teradata, a leading data warehousing vendor, has more than 35 customers, such as Walmart and Verizon, with data warehouses over a petabyte in size. Even four years ago, eBay captured a terabyte of data per minute and maintained over 40 petabytes, the most of any company in the world.
In 2012, President Obama’s campaign was seeking ways to overcome the Democrats’ serious losses in the 2010 mid-term elections. One approach was to analyze voters’ data, big data, in order to find where the Obama coalition from 2008 had gone. Dan Wagner had been hired as the “targeting director” for the Democratic National Committee (DNC) in January 2009, and he became responsible for collecting voter information and analyzing it to help the committee approach individual voters by direct mail and phone. According to Sasha Issenberg, writing in MIT Technology Review in 2012, Wagner appreciated that the raw material he was feeding into his statistical models amounted to a series of surveys on voters’ attitudes and preferences. He asked the DNC’s technology department to develop software that could turn that information into tables, and he called the result Survey Manager.
Issenberg reports that, by the 2012 election, Chris Wegrzyn, a database applications developer, had become the DNC’s lead targeting developer and had overseen a series of acquisitions, all intended to free the party from its traditional dependence on outside vendors. The committee installed a Siemens Enterprise System phone-dialing unit that could put out 1.2 million calls a day to survey voters’ opinions. Later, party leaders signed off on a $280,000 license to use Vertica software from Hewlett-Packard, which allowed their servers to access not only the party’s 180-million-person voter file but also all the data about volunteers, donors, and those who had interacted with Obama online.
That wasn’t the beginning of Big Data in elections, but it turned out to be the defining moment for this generation. By 2016, Big Data had become a common term, used by millions with twice as many definitions. However, as Jason Shultz wrote for Surefire Data Solutions, “… big data should not be a scapegoat to hold any electoral woes or aggressions. Data is a tool, and it needs to be used properly. In the future, data of all sizes needs to be looked at more objectively, to give citizens a clear view of the possibilities. We all also need to remember that ‘data is not a substitute for innovation.’ Winning takes not only the right tools, but strategy.”
And next up: 2018 mid-terms!