DNC Rolls Out New Data Warehouse
April 2, 2019
After many months of preparation and intense development, the DNC is rolling out a brand-new, state-of-the-art Data Warehouse. It is one of the biggest steps to date toward fulfilling Chairman Tom Perez’s commitment to overhaul the party’s tech and data infrastructure.
The Data Warehouse is the centerpiece of our tech efforts and will allow campaigns and committees at all levels of Democratic politics to better store, access, and analyze their data. It will also help us continue to innovate, expand, and improve our data capabilities in 2019, 2020, and beyond. The new Data Warehouse replaces Vertica, which was established as a temporary solution in 2011, and will make it dramatically easier to run smart, data-driven campaigns no matter what office a Democrat is seeking. This is a game changer.
From WIRED: “Inside the Democrats’ Plan to Fix Their Crumbling Data Operation”
Since 2011, Vertica has been the Democratic Party’s central repository for data—a place to store every state’s voter file, every door knock and phone call organizers make, and every bit of commercially available data campaigns collect. It played an important role in President Obama’s successful bid for reelection in 2012, and established having a strong data operation as core to modern-day campaigning. After just a few years, however, the system was already showing its age, and many Democrats feared that the lack of a strong data operation could handicap their candidates in 2020 and beyond.
Krikorian started hearing what he calls “war stories” about Vertica almost immediately, as he interviewed former campaign staffers like Robby Mook, Clinton’s campaign manager, and Stephanie Hannon, a former Googler and Clinton’s chief technology officer. The system was famous for crashing for 16 hours at a time. One data director in North Carolina told him she used to nap in her car just waiting for Vertica to come back online. Mook, Krikorian recalls, likened Vertica to Beirut—when the system got overloaded, as it almost always did, it would just shut down until the shelling stopped.
“It’s not the system’s fault it wasn’t working,” Mook tells WIRED. “It wasn’t built to last a long time or have the number of users it ended up having.”
For Krikorian, Vertica seemed like the main impediment to technological progress within the party. “I came in with a whole set of lofty goals of things we wanted to achieve at the party,” Krikorian says. “Once I peeled the onion, it all sort of came down to, well, we can’t do Interesting Thing X until Vertica’s fixed.”
So, in the months before the 2018 midterms, a make-or-break election for Democrats, he made the risky bet to divide his 40-person tech staff into two teams. One team would need to keep Vertica alive through Election Day; the other would be in charge of building whatever came next.
Now, Krikorian’s team is preparing to pull the plug on Vertica and stand up a new, more powerful system called, simply, the Data Warehouse. It will be backed by Google’s analytics tool called BigQuery, a cloud-based platform capable of handling massive datasets at the scale and speed necessary for an organization the size of the Democratic party.
“One of my top priorities has been to overhaul the party’s tech and data infrastructure and make sure we put the 2020 nominee and all of our candidates in the best possible position to take on the GOP and win,” DNC chair Tom Perez told WIRED in a statement. “The DNC’s Data Warehouse is the centerpiece of our tech efforts and will allow campaigns and committees to better store, access, and analyze their data.”