Now consider an even more current search
application, the Google Mobile App for the iPhone.
The application detects the movement of the phone
to your ear, and automatically goes into speech
recognition mode. It uses its microphone to listen
to your voice, and decodes what you are saying by
referencing not only its speech recognition database
and algorithms, but also the correlation to the most
frequent search terms in its search database. The
phone uses GPS or cell-tower triangulation to detect
its location, and uses that information as well. A
search for “pizza” returns the result you most likely
want: the name, location, and contact information
for the three nearest pizza restaurants.
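The flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not how Google's application actually works: the function names (`rescore`, `haversine_km`, `nearest`), the scoring rule (acoustic log-probability plus a smoothed log prior taken from query counts), and the data layout are all hypothetical.

```python
import math

def rescore(hypotheses, query_counts):
    """Choose a transcript by combining speech and search evidence.

    hypotheses: list of (text, acoustic_log_prob) pairs (hypothetical
    recognizer output). query_counts: how often each query was searched.
    """
    total = sum(query_counts.values())
    vocab = len(query_counts)

    def score(hypothesis):
        text, acoustic_log_prob = hypothesis
        # Add-one smoothing keeps unseen queries from scoring log(0).
        prior = (query_counts.get(text, 0) + 1) / (total + vocab + 1)
        return acoustic_log_prob + math.log(prior)

    return max(hypotheses, key=score)[0]

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest(places, here, category, k=3):
    """Return the k places of the given category closest to `here`."""
    matches = [p for p in places if p["category"] == category]
    return sorted(matches, key=lambda p: haversine_km(here, p["location"]))[:k]
```

In this sketch, a recognizer that slightly prefers "pisa" on acoustic grounds is overruled when "pizza" is a far more common search, and the phone's reported position then decides which three restaurants come back.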
All of a sudden, we’re not using search via a
keyboard and a stilted search grammar; we’re talking
to and with the Web. It’s getting smart enough
to understand some things (such as where we are)
without us having to tell it explicitly. And that’s just
the beginning.
And while some of the databases referenced by the
application—such as the mapping of GPS coordinates
to addresses—are “taught” to the application, others,
such as the recognition of speech, are “learned” by
processing large, crowdsourced data sets.
Clearly, this is a “smarter” system than what we
saw even a few years ago. Coordinating speech
recognition and search, search results and location,
is similar to the “hand-eye” coordination the baby
gradually acquires. The Web is growing up, and we
are all its collective parents.
In our original Web 2.0 analysis, we posited that
the future “internet operating system” would
consist of a series of interoperating data
subsystems. The Google Mobile Application
provides one example of how such a data-driven
operating system might work.
In this case, all of the data subsystems are
owned by one vendor—Google. In other cases,
as with Apple’s iPhoto ’09, which integrates
Flickr and Google Maps as well as Apple’s own
cloud services, an application uses cloud
database services from multiple vendors.
As we first noted back in 2003, data is the
“Intel Inside” of the next generation of computer
applications. That is, if a company has control
over a unique source of data that is required for
applications to function, it will be able to
extract monopoly rents from the use of that data.
In particular, if a database is generated by user
contribution, market leaders will see increasing
returns as the size and value of their database
grows more quickly than that of any new entrants.
We see the era of Web 2.0, therefore, as a
race to acquire and control data assets. Some of
these assets (the critical mass of seller listings
on eBay, or the critical mass of classified
advertising on craigslist) are application-specific.
But others have already taken on the
characteristic of fundamental system services.
Take for example the domain registries of the
DNS, which are a backbone service of the
Internet. Or consider CDDB, used by virtually
every music application to look up the metadata
for songs and albums. Mapping data from
providers like Navteq and TeleAtlas is used by
virtually all online mapping applications.
There is a race on right now to own the social
graph. But we must ask whether this service is
so fundamental that it needs to be open to all.
It’s easy to forget that only 15 years ago, email
was as fragmented as social networking is today,
with hundreds of incompatible email systems
joined by fragile and congested gateways. One of
those systems—internet RFC 822 email—
became the gold standard for interchange.
We expect to see similar standardization in
key internet utilities and subsystems. Vendors
who are competing with a winner-takes-all
mindset would be advised to join together to enable
systems built from the best-of-breed data
subsystems of cooperating companies.