Alexa’s Search Platform

John Battelle writes about a potentially game-changing service by Alexa (owned by Amazon).

Anyone can also use Alexa’s servers and processing power to mine its index to discover things – perhaps, to outsource the crawl needed to create a vertical search engine, for example. Or maybe to build new kinds of search engines entirely, or …well, whatever creative folks can dream up. And then, anyone can run that new service on Alexa’s (er…Amazon’s) platform, should they wish.

It’s all done via web services. It’s all integrated with Amazon’s fabled web services platform. And there are no licensing fees. Just “consumption fees” which, at first glance, seem pretty reasonable. (“Consumption” meaning consuming processor cycles, or storage, or bandwidth.)

The fees? One dollar per CPU hour consumed. $1 per gig of storage used. $1 per 50 gigs of data processed. $1 per gig of data uploaded (if you are putting your new service up on their platform).
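To get a feel for what a real workload might cost under these rates, here is a minimal back-of-the-envelope calculator. The fee schedule is taken from the figures above; the workload numbers in the example are entirely invented for illustration.

```python
def alexa_cost(cpu_hours, storage_gb, processed_gb, uploaded_gb):
    """Estimate fees in dollars under the published consumption rates."""
    return (
        1.00 * cpu_hours              # $1 per CPU-hour consumed
        + 1.00 * storage_gb           # $1 per GB of storage used
        + 1.00 * (processed_gb / 50)  # $1 per 50 GB of data processed
        + 1.00 * uploaded_gb          # $1 per GB uploaded
    )

# Hypothetical small vertical-search crawl: 100 CPU-hours, 20 GB stored,
# 500 GB of the index processed, 5 GB of service code/data uploaded.
cost = alexa_cost(cpu_hours=100, storage_gb=20, processed_gb=500, uploaded_gb=5)
print(f"${cost:.2f}")  # $135.00
```

At these rates, even a fairly serious experiment runs to hundreds of dollars rather than the millions a crawl-and-cluster build-out would cost — which is presumably the point.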

In other words, Alexa and Amazon are turning the index inside out, and offering it as a web service that anyone can mash up to their heart’s content. Entrepreneurs can use Alexa’s crawl, Alexa’s processors, Alexa’s server farm…the whole nine yards.

Michael Parekh adds: “…it IS an out-of-the-box move that potentially enables a fountain of creative search services that emerge to take advantage of what I’d call an “Index Utility” business model that Amazon/Alexa proposes.”

Om Malik writes: “The argument thus far has been that it is tough to do the indexing, build the infrastructure and stay competitive. Only a handful have been able to compete with the GYM gang. Gigablast (my personal favorite) is one such search service. Still, no one has pulled an Alexa. Interesting move, but quite understandable. Amazon knows it has little or no chance of being a player in the search game. John thinks that by offering an outsourced index it becomes a player. I see it slightly differently – Amazon is trying to inflict death by a thousand cuts on rivals, including the GYM Gang.”

Nicholas Carr writes:

Whether Amazon’s looking to make money or make it harder for rivals to make money, the move does look like something of a watershed. What’s interesting is that it separates, or unbundles, the “engine” that, in a real sense, powers the web from the applications of that engine. And it turns the engine into a cheap commodity. It’s not hard to think of what happened when another engine – the steam engine – became a commodity a couple hundred years ago. An incredible number of applications of steam power were rapidly invented. Now, the search engine is far from the steam engine, but the example shows what can happen when you commoditize a basic piece of commercial infrastructure, giving a lot of people and companies access to it. The big question is: Are there a whole bunch of incredibly valuable search applications to be invented, or will this just set off an explosion of cute mashups? We’ll see.

Danny Sullivan comments: “I guess I get to be the underwhelmed one about Alexa announcing a new Alexa Web Search Platform that’s available to anyone willing to pay a fee…Pay a fee for what? You can create your own search engine by tapping into the 4 billion web pages Alexa has indexed over time. You can search against the entire index or just a selected set, in case you want to make your own vertical search engine. It’s hardly new territory.”

India’s Year

The Economist writes:

In the coming year India will enjoy greater international prestige than at any time since independence in 1947. With an economy that could grow at more than 7% for the fourth year in succession, that prestige is based in part on sheer clout. Over the next half-century India will emerge as one of the world’s biggest economies and great powers. But India’s standing also reflects its greatest achievement: the almost uninterrupted preservation of democratic rule in a poor country of 1.1 billion people.

Hypermediation 2.0

Nicholas Carr writes:

The hypermediation phenomenon is continuing in the Web 2.0 world of online media. We’re seeing the emergence of another new set of diverse intermediaries focused on content rather than commerce: blog subscription services like Bloglines, headline aggregators like Memeorandum, blog search engines like Technorati, ping servers, community platforms like MySpace and TagWorld, tag aggregators like tRuTag, podcast distributors like iTunes, and of course blogs of blogs like Boing Boing. (Many of the most popular blogs in fact play more of a content-mediation role than a content-generation one.) Despite the again-common feeling that the web is a force for disintermediation in media, connecting content providers and consumers directly, the reality is that the internet continues to be a rich platform for intermediation strategies, and it’s the intermediaries who stand to skim off most of the profits to be made from Web 2.0.

It’s no coincidence that the most profitable internet businesses – eBay, Google, Yahoo – play intermediary roles. They’ve realized that, when it comes to making money on the web, what matters is not controlling the ultimate exchange (of products or content or whatever) but controlling the clicks along the way. That’s become even more true as advertising clickthroughs have become the main engine of online profits. Who controls the most clicks wins.

Free Grows Revenue

Jonathan Schwartz of Sun writes:

Free software creates volumes that lead the demand for deployments – which generate license and support revenues just as they did before the products were free. Free software grows revenue opportunities.

Opening up Solaris and giving it away for free has led to the single largest wave of adoption Solaris has ever seen – some 3.4 million licenses since February this year (most on HP, curiously). It’s been combined with the single largest expansion in its revenue base. I believe the same will apply to the Java Enterprise System, its identity management and business integration suites specifically. Why?

Because no Fortune 2000 customer on earth is going to run the heart of their enterprise with products that don’t have someone’s home number on the other end. And no developer or developing nation, presented with an equivalent or better free and open source product, is going to opt for a proprietary alternative.

Those two points are the market’s reality. And having reviewed them today at length at a customer conference, with some of the largest telecommunications customers on earth, I only heard the strongest agreement. They all, after all, are prolific distributors of free handsets.

Betting against FOSS is like betting against gravity. And free software doesn’t mean no revenue, it means no barriers to revenue. Just ask your carrier.

Dark Fibre and Opportunities

morph writes:

The abundance of dark fiber is the result of fiber network build-outs that included laying substantial extra fiber-optic cables at the time the trenches were opened, as a non-specific plan for the future and as a hedge to limit the likelihood that new trenches would have to be dug. I emphasize the availability of this commodity to make the point that new forms of communication that require incredible bandwidth are viable now because of the absurd availability of nascent bandwidth infrastructure.

To fill this bandwidth, many things are possible, with more in the future. Video telephony is one obvious opportunity; it is already being done in 3G networks around the world.

But with the amount of available bandwidth, one can consider not just video telephony as per 3G networks, but multi-channel, HD-quality video telephony. Multi-channel video, along with some substantial, local image processing, allows for some very interesting things, such as 3D or interactive scene “fly through” at the control of the receiving end. This would be interesting for extended family interaction at the holidays, but if such a service becomes available, who knows how it will be used? Another application I have heard a lot about is in the medical field. 3D medical-imaging devices, such as Tom Cruise’s baby sonogram, can send streaming, ultra-high-res images to a remote facility, which could control the instrument remotely. Remote medical procedures like this are being used hospital-to-hospital even now, but they are viable over the Net to the home in some neighborhoods without additional infrastructure, due to the abundance of the dark fiber.
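The bandwidth arithmetic behind this argument is easy to sketch. The per-stream bitrate below is an assumption (roughly MPEG-2-era HD, on the order of 20 Mbit/s; modern codecs do much better), and the four-camera setup is a hypothetical stand-in for the multi-channel “fly through” scenario described above.

```python
# Rough bandwidth budget for multi-channel HD video telephony
# over a single gigabit fiber link. All figures are assumptions.
HD_STREAM_MBPS = 20   # assumed bitrate of one HD channel (codec-dependent)
CHANNELS = 4          # hypothetical four-camera multi-channel session
LINK_MBPS = 1000      # one gigabit of provisioned fiber capacity

session_mbps = HD_STREAM_MBPS * CHANNELS       # bandwidth per session
sessions_per_link = LINK_MBPS // session_mbps  # concurrent sessions per link

print(f"{session_mbps} Mbit/s per session; "
      f"{sessions_per_link} sessions per gigabit link")
```

Even with these generous per-stream assumptions, a single lit fiber pair carries dozens of such sessions — which is why the excerpt argues the limiting factor is lighting the fiber, not laying it.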

TECH TALK: The Best of 2005: Google

3. Robert Cringely on Google

Google has been the flavour of the year. [In fact, I wouldn’t be surprised if it (or its founders) is named as Time magazine’s Person of the Year.] There have been numerous stories — and a couple of books — on Google and its intentions. For me, the ones which made me think most were the set of three articles by Robert Cringely in November-December.

The first article was entitled Google-Mart and was about how Google has learned more from Wal-Mart than from Microsoft. The real focus was on Google’s plans with its dark fibre purchases.

Google’s strengths are searching, development of Open Source Internet services, and running clusters of tens of thousands of servers. Notice on this list there is nothing about operating systems.

The same follows for the rumor that Google, as a dark fiber buyer, will turn itself into some kind of super ISP. Won’t happen. And WHY it won’t happen is because ISPs are lousy businesses and building one as anything more than an experiment (as they are doing in San Francisco with wireless) would only hurt Google’s earnings.

So why buy up all that fiber, then?

The probable answer lies in one of Google’s underground parking garages in Mountain View. There, in a secret area off-limits even to regular GoogleFolk, is a shipping container. But it isn’t just any shipping container. This shipping container is a prototype data center. Google hired a pair of very bright industrial designers to figure out how to cram the greatest number of CPUs, the most storage, memory and power support into a 20- or 40-foot box. We’re talking about 5000 Opteron processors and 3.5 petabytes of disk storage that can be dropped-off overnight by a tractor-trailer rig. The idea is to plant one of these puppies anywhere Google owns access to fiber, basically turning the entire Internet into a giant processing and storage grid.
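The density Cringely describes is worth putting in perspective. A quick calculation using only the figures from the excerpt (5,000 Opteron processors and 3.5 petabytes of disk per container) shows how much storage each processor would be paired with; the per-container framing is mine, not Cringely’s.

```python
# Density implied by the prototype container data center described above.
PROCESSORS = 5000       # Opteron CPUs per shipping container
STORAGE_PB = 3.5        # petabytes of disk per container

storage_tb = STORAGE_PB * 1000          # 3,500 TB of disk in one box
tb_per_cpu = storage_tb / PROCESSORS    # disk paired with each processor

print(f"{storage_tb:.0f} TB total; {tb_per_cpu:.2f} TB per CPU")
```

By 2005 standards — when a large disk was a few hundred gigabytes — 0.7 TB per processor in a truck-deliverable unit was an aggressive storage-heavy design, consistent with the "storage grid" framing in the excerpt.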

The second article in the series looked at the endpoints which would connect to these data centers – the Google Box.

…the most important reason for Google to distribute its data centers in this way is to work most efficiently with a hardware device the company is thinking of providing to customers. This embedded device, for which I am afraid I have no name, is a small box covered with many types of ports – USB, RJ-45, RJ-11, analog and digital video, S-video, analog and optical sound, etc. Additional I/O that can’t be seen is WiFi and Bluetooth. This little box is Google’s interface to every computer, TV, and stereo system in your home, as well as linking to home automation and climate control. The cubes are networked together wirelessly in a mesh network, so only one need be attached to your broadband modem or router. Like VoIP adapters (it does that too, through the RJ-11 connector) the little cubes will come in the mail and when plugged in will just plain work.

Think about the businesses these little gizmos will enable. The trouble with VoIP in the home has been getting the service easily onto your home phone. Then get a box for each phone. The main hurdle of IP TV is getting it from your computer to your big screen TV. Just attach a box to every TV and it is done, with no PC even required. Sounds like Apple’s Video Express, eh? On top of entertainment and communication the cubes will support home alarm and automation systems – two businesses that are huge and also not generally on the radar screens of any Google competitors.

The third article extrapolated to The Sweet Spot and Google’s plans to win the broadband game.

Parked at the peering point, sitting on the same SONET ring as the local telephone company, Google will have done as much as it possibly can to reduce any network disadvantage. By leveraging its own fiber backbone Google not only further avoids such interference, it has a chance to gain a step or two through better routing or more generous backbone provisioning. What’s stored IN the data centers is important, but how they are CONNECTED is equally important.

The other part of the strategy is the gBox or gCube or — how about this one, the gSpot? — Google’s interface device, which might be Google’s version of the “Home Gateway.” Another example would be France Telecom’s Livebox (or the number two French ISP Free’s Freebox, which is even better), integrating video, Internet, and VoIP. And if you check out the latest Xbox or PS/2 releases, you’ll see everyone is heading that same way, from different starting points in the home. But the gSpot strategy is completely different. Where the company is deliberately deciding NOT to compete against the infrastructure builders on the street corner, they plan to overwhelm all players inside homes and businesses.

Who is going to win the triple play? It doesn’t matter. Who is going to win the game? Any player with deep pockets and no particular technological dependency. At this point that could be Yahoo or Microsoft or AOL or some new player altogether, but it probably means Google.

Whether this is indeed what Google does, only time will tell. But there’s plenty of food for thought on how to build the infrastructure for tomorrow’s world, especially in emerging markets.

Tomorrow: Mobility

Continue reading