Digging into Open Data

Data Insights, 2012-02-02

Rufus Pollock @rufuspollock[.org] – @okfn[.org]

Licensed under cc-by v3.0 (any jurisdiction)

Open Knowledge Foundation Shuttleworth Foundation

The Open Knowledge Foundation

Community-based not-for-profit founded in 2004

The Foundation now has projects and partnerships throughout the world and is especially active in Europe

We build tools and communities to create, use and share open knowledge - content and data that everyone can use, share and build on.

The DataHub powered by CKAN


Making it easy to get, use and share data.



Mapping (public) money worldwide.

Many more - see OKFNLabs.org




Sumer, Mesopotamia, 5000 years ago


The UK Census (1801)


The Hollerith Tabulator (US 1890 Census)


An IBM (1960s)

Today We Find Ourselves in the Midst of a Revolution

Driven by
Info Complexity [Necessity]
Info Tech [Opportunity]

We are Innovating
Opening Up

Government is Opening up Data

Open government initiatives around the world: 2.5 years ago there were roughly none. Now: UK, US, Finland, Kenya, Netherlands, ...

Companies are Opening up Data

Like Nike, who are opening up supply chain and sustainability data


Open Data: What?

What does Open Mean?



"A piece of content or data is open if anyone is free to use, reuse and redistribute it - subject only, at most, to a requirement to attribute or share-alike."

Anyone means Anyone!
(So no restrictions on commercial use etc)

What Data?

Transport, Geodata, Statistics, Electoral-Legal ...

Key point: Non-Personal Data!

(E.g. Train times, station locations, spending breakdowns, national laws ...)

Open Data: Why?

A Story

(About Medicine Gone Wrong)

Better Understanding
Better Governance
Better Research
Better Economy

The Challenge
the Opportunity

Challenge: Exploding Info Complexity

In the 1820s, all UK bank clearing was done in a single room in London, once a day. Today there are millions of transactions a minute.

=> componentization to divide and conquer complexity

Opportunity: Info Technology

Today a smartphone has as much computing power as the system for the Apollo moon landings. 1TB of storage is around $100; in 1994 this would have cost ~ $400,000.

=> Mass participation in information access, processing and production. Decentralization.
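
To make the storage figures above concrete, here is a minimal sketch of the arithmetic. It assumes "today" means 2012 (the date of this talk) and that the price fell at a steady exponential rate, which is a simplification; the function names are illustrative only.

```python
import math

# Approximate figures quoted on the slide, in US dollars per terabyte.
COST_1994 = 400_000
COST_2012 = 100
YEARS = 2012 - 1994  # assumption: "today" is the year of the talk


def fold_decrease(old_cost, new_cost):
    """How many times cheaper storage has become."""
    return old_cost / new_cost


def halving_time_years(old_cost, new_cost, years):
    """Years per halving of price, assuming a steady exponential decline."""
    halvings = math.log2(old_cost / new_cost)
    return years / halvings


print(fold_decrease(COST_1994, COST_2012))
print(round(halving_time_years(COST_1994, COST_2012, YEARS), 2))
```

On these numbers storage became 4,000 times cheaper, with the price halving roughly every year and a half.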

Claim: Openness is Key

Openness and Scaling
(Closed Data Doesn't Scale!)

We're Weaving Data Together
To Scale We Need to Componentize
But We Need to Put Humpty-Dumpty Together Again - Not Possible if Closed

Information is Special: Non-Rivalrous

Very cheaply copied ~ zero cost

Giving me a 'copy' of your car is a problem, giving me a copy of your data isn't

In Products and Services

The Best Thing to do With (Your) Data will be Thought of by Someone Else

Fixing is Faster with Open Data
(And You Don't Repeat Yourself)

"Given enough eyeballs, all bugs are shallow"

An estimated 6% of all bus stops in NaPTAN are wrongly located

Transparency and Efficiency

Where Does My Money Go?

Building the Open Data Ecosystem

How Do We Scale?
(In the 'Open' Community)

Sharing, Reworking, Improving, Learning



Small (and Medium) Data

Rather than "Big Data"


'Data Management Systems' (like a CMS, but for data) such as CKAN and the DataHub

CKAN and the DataHub
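
As an illustration of getting data out of a CKAN-powered catalogue, here is a minimal sketch that searches for datasets through CKAN's version 3 action API (`package_search`). The base URL is an assumption, so substitute the catalogue you actually use. The helper names `search_url`, `dataset_titles` and `search` are hypothetical and not part of CKAN.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: any CKAN site that exposes the version 3 action API.
BASE_URL = "https://demo.ckan.org"


def search_url(query, rows=5, base_url=BASE_URL):
    """Build the URL for a package_search request."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Pull dataset titles out of a decoded package_search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    datasets = response["result"]["results"]
    # Fall back to the machine name when a dataset has no title.
    return [d.get("title") or d["name"] for d in datasets]


def search(query, rows=5, base_url=BASE_URL):
    """Fetch matching dataset titles (this makes a network request)."""
    with urlopen(search_url(query, rows, base_url)) as resp:
        return dataset_titles(json.load(resp))
```

Calling `search("bus stops")` would return up to five dataset titles from the chosen catalogue. The URL building and response parsing are kept separate from the network call so they can be checked offline.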

ETL Tools like ScraperWiki and OpenSpending


Mixins for Data


Open Data is Here
And Will Just Get Bigger

Data is a Platform – You Build on It Rather than Sell It




