The Incredible Challenge and Joy of Building the data.world Community

On July 11, 2016, we launched data.world to the public. Before that date, we were in stealth mode with only this quote on our home page:

If the universe of data were suddenly made available, it would unleash the creativity of problem-solvers to combine different data sets — public and private — to develop innovative solutions to innumerable challenges.

— Mikael Hagstrom, EVP at SAS (at the World Economic Forum)

It was revealing and very inspirational, if you knew what we were up to. And if you walked into our office, you would have seen our mission statement hanging proudly in the lobby:

To build the most meaningful, collaborative, and abundant data resource in the world.

Last week, we celebrated our one-year anniversary since going live, and I want to take a moment to reflect on the incredible challenge and joy of building the data.world community over the past twelve months.

The past year has been one of the most challenging and exhilarating of my life as an entrepreneur (this is my sixth business and I’ve backed many others). Challenging because building a platform business is daunting: you are subjecting your platform to the court of public opinion and hoping to win in the largest arena on Earth (a nod to one of my favorite quotes as an entrepreneur: The Man in the Arena by Theodore Roosevelt). We live in a globally networked society without historical precedent, and that is what makes a business as ambitious as data.world possible in the first place, but the pressure to get it right is immense. Exhilarating because the speed of feedback is incredible: you can literally watch community members interact with new functionality as soon as you deploy it. Exhilarating because of the immensity and importance of our mission. Exhilarating because we are democratizing access to one of the most challenging (read: esoteric) but important technology stacks in history, the Semantic Web (watch Sir Tim Berners-Lee’s TED talk on linked data if you want to understand why this is so exciting). And exhilarating because of our amazing team, partners, and, of course, community members.

We’ve accomplished a lot in the past year on behalf of our community. Allow me to highlight some of our most notable milestones.

Although we’re not currently sharing numbers about the size of our community, I can tell you that we’ve grown faster than GitHub did in its first year of being live. Given how impressive a business GitHub has become, we are very proud of that. Public affirmation is huge in building a community like data.world, and it tells us that we are, in fact, getting it right. That has helped us raise over $33 million from a very impressive list of investors, who are constantly helping us. But let me tell you: we always feel like we have so far to go because data.world has a huge and hugely important mission.

Related to that growth, we have had dozens of non-profits, universities, and government entities join the data.world community. This is a critical measure for me personally because it is the focus of my Henry Crown Fellowship Project. If it weren’t for the Henry Crown Fellowship, which is the flagship Fellowship of The Aspen Institute, I’m not sure I would have started data.world. I was pretty happy investing in startups and playing Super Dad and husband. The Henry Crown Fellowship really gets under your skin and makes you think a lot about your utility and duty to the world (my daughter was also a major influence, as I wrote about in this Lucky7 post). But I don’t regret starting data.world one bit: this is the most meaningful entrepreneurial mission of my life.

One of the nonprofits that we were able to help is the Anti-Defamation League, which had been documenting an historic increase in hate crimes. When presented with the need by Jonathan Greenblatt, ADL’s CEO and also a Henry Crown Fellow, we rallied in a 24-hour period and helped them launch data.world/ADL. Recently Jonathan testified to the Senate about this historic increase in hate crimes, and he mentioned his use of data.world to shine a spotlight on the data underlying this issue. It was a very proud moment for us, to say the least. This important mission is at our core.

One output of our work with ADL.

On the university front, over the past year we have contributed to closing the huge skills gap in the data science and analytics field by helping instructors at universities including The Wharton School, University of Texas at Austin, The University of North Carolina at Chapel Hill, The Ohio State University, Northwestern University, NC State University, and Drexel University, as well as at alternative higher education programs including Galvanize, Trilogy Education Services, and General Assembly. Professors and students have used data.world to collaborate on data projects and find interesting data for assignments, and students in particular have begun to build portfolios of their data work.

Carolee demoing to the University of Chicago Crime Lab team.

We created the first open and Linked Data representation of the US Census, along with the schema and ontology, to make the US Census easily joinable to other data. The Census is one of the most linkable data assets in the US. We publicly launched this initiative with AWS in April 2017. Our work on this was partially funded by the National Science Foundation (NSF), specifically the South Big Data Hub’s DataStart program. Here’s a blog post with more details. Along the way we met amazing people, like Jeff Meisel, CMO of the US Census, who started out as a Presidential Innovation Fellow and was instrumental in this important Linked Data launch.

Len Fishman, Jonathan Ortiz, and Alexandra Barker from Census at Open Data Science Conference.

Because of our work in government, we were chosen to serve as the collaborative workspace for the Pentagon’s first open data portal, data.mil. Alongside them, we helped publish millions of digitized aerial mission records from World War I through the Vietnam War. You can access the data, discussions, and some interesting community insights and visualizations at data.world/datamil. We also worked with many others in the federal government, such as The Opportunity Project (out of The White House) and NASA.

Data viz by Noah Rippner based on data.mil’s THOR data.

In October 2016 we were selected as a Partner in the Commerce Department’s NTIS (National Technical Information Service) Joint Venture to accelerate data innovation within the federal government. We are the youngest company in the JV and proudly join industry leaders like Amazon Web Services, IBM, Palantir, Deloitte Consulting, and Booz Allen.

I joined the Board of Data Coalition and Data Foundation. My co-founder and our CTO, Bryon Jacob, joined the Advisory Board as well. The work here has been really rewarding, and Hudson Hollister, founder and CEO of both organizations, has built a very strong team and inroads into the federal government. The most recent major product of his work is the implementation of the DATA Act of 2014. Now federal spending data is becoming machine readable and easily accessible. Read Hudson’s blog post for more and check out beta.usaspending.gov.

We launched a pilot with The Associated Press to help AP members find local stories within national datasets. The workspace brings together data journalists and subject matter experts within a shared, collaborative context (there’s more about this in this blog post). The articles coming out on topics like US housing inventory being at near twenty-year lows are more data-driven than most articles from local news outlets. We were inspired to do more for data journalists and launched an open-sourced FOIA application (based on a lot of data, of course), available here. Also related to data journalism and FOIA requests, we collected and published metadata on 930,000 declassified CIA documents in collaboration with journalist Michael Best. This metadata release helps journalists, analysts, and others fully understand and navigate the corpus of 13M pages. You can access this dataset here.

Excerpt from one of the documents in the CIA Crest archive.

Efforts like these have led to newsrooms publicly using our platform to support their journalism, such as NJ.com at data.world/njdotcom. We were even cited last month in Fortune as the repository for data about how diverse Fortune 500 companies actually are. This is only the beginning of our important work with data journalists.

We developed a new dialect of the SQL query language named dwSQL, which bridges the gap between linked data and relational databases. This is a significant piece of IP and we’ve pursued patents on it (as well as quite a bit of our other core technology innovations). You can read about it in two separate posts here and here and also at docs.data.world.

We’ve won various awards, including being named to Austin Business Journal’s Best Places to Work for two years in a row and winning an Austin A-List award. Culture really matters, and we’ve built a really terrific and diverse team (28% women and 25% ethnically diverse for a 44% gender- and ethnically-diverse workforce). We are committed to doing even better in this area. As a platform business that is rapidly scaling, we have been very focused on how we build our team. We’ve hired proven managers who still maintain their great individual contributor ability. This is rarer than it sounds, and it is a huge benefit of having an experienced co-founder team and early team members from companies like Bazaarvoice, HomeAway, Indeed, and Trilogy. Our recruiting network is strong. The hiring breaks thankfully went our way (candidates chose us over competing offers) and we couldn’t be happier with how our team has come together. It is the strongest early-stage team I’ve ever been a part of.

Accepting the Austin A-List Award with data.world community member Dr. Philip Cannata of UT Computer Science.

We were certified as a B Corporation in November 2016. We converted from a C Corporation to a Public Benefit Corporation right before we launched on July 11, 2016. One hundred percent of our shareholders signed off on this conversion, and for many of them we are the first B Corp that they’ve been involved in. As a Public Benefit Corporation, we are legally required to explain our mission, which you can find here. We also regularly discuss how we are fulfilling our mission at our Board of Directors meetings. We even got to play a hand in helping Public Benefit Corporation legislation get passed here in Texas recently. Having raised one of the largest financing rounds for a B Corporation, we were approached to help Rep. Gina Hinojosa, a freshman Texas House Representative out of Austin, in her efforts to pass legislation to allow for Public Benefit Corporations in Texas. Ariane Chan, our General Counsel, worked with Rep. Hinojosa’s staff on the drafts of the bill and responses to critics of it. Ariane was also one of two supporters to testify at the House committee hearing on the bill. We were thrilled when the bill passed and was signed by the governor. The new law will go into effect on September 1, 2017, when Texas will join 32 other states in allowing this type of society-benefiting corporate structure.

We’ve received terrific media coverage and also authored guest articles in Fast Company, NewCo Shift (x2), TED.com, Gigaom, FedScoop, Sunlight Foundation, TechCrunch (x2), Inc., IBT (x2), Business Insider, Lifehacker, KDnuggets (x2), Government Technology, BuzzFeed News, MediaShift, Poynter, governmentCIO, and many other publications. My personal favorite is still John Battelle’s article about our launch last year because it places data.world so well in historical context. We also had a Priceonomics article based on a few datasets go viral, and it was featured on Lifehacker. If you haven’t read it yet, check out “Do State Department Travel Warnings Reflect Real Danger?”. It was picked up and written about all over the world, including by journalists in Cuba.

From “Do State Department Travel Warnings Reflect Real Danger?”

We have spoken at so many events in the past year, including at The White House, World Web Forum, UN World Data Forum, SXSW, NICAR, Techonomy, Wolfram Data Summit, Open Data Institute, and dozens of other important events.

We’ve launched powerful integrations with Tableau, Python, R, Java/JDBC, CKAN, Open Knowledge International, and now Vega, which is an open-source visualization project by the founder of D3. If you are a Tableau customer, check out our tutorial on that integration.

Screenshot of our Tableau integration.

On our one-year anniversary, we launched our most significant new feature yet, Data Projects. Data Projects represents hundreds of hours of listening to our community members on the challenges of successfully collaborating on and completing a data project. It is the most significant change to our platform to date, because previously everything revolved around datasets. You can read all about it here.

Data Projects screenshot.

Also on our one-year anniversary, we opened up our Slack community to all data.world community members (previously it had been for our Community Member Advisory Council only). This is an important move to amplify the voices of our community members and open up our aperture even wider. You can access it at slack.data.world and all of the conversational archives are at datadotworldcommunity.slackarchive.io. We are just getting started with this, and you can expect to see more from us here (we have a lot of ideas on how to leverage this more).

Whew, that is a lot to highlight from the past year, and it makes me feel really proud to recap it all! I feel very fortunate to be on this journey, alongside such a strong team and community, and on such an important mission. We are changing the way that the world works with data forever. We believe that data should be social, linked, and integrated. Down with the silos and the massive waste of human effort, finally.

Thank you so much for being on this mission with us, and we are looking forward to the amazing year ahead.

Sincerely,

Brett (data.world/databrett)