India as Innovation Hub

Wired News writes:

Generations of Indians have grown up recounting jokes about how the only contribution their nation has made to the world is the invention of zero. Innovation was something other people did.

That’s no longer the case. At research labs across the country, Indians are creating technologies specifically designed for the nation’s multilingual masses and its poor. In doing so, the country is emerging as a research hub for technologies geared to the Third World.

The article also discusses a number of examples of innovation happening in India. I will be writing a series as part of my Tech Talk columns on Innovation in India sometime soon.

Tomorrow’s Semantic Web

AlwaysOn Network has a column by Dr. Deborah McGuinness, associate director and senior research scientist of the Knowledge Systems Laboratory at Stanford University:

We’re moving to a web where I know what was meant instead of I know what was input, where the web can understand the meaning of the terms on the page instead of just the text size and color that it should present those words in. It knows a little bit about the user background. We’re facilitating interoperability so you don’t have to have all these stand-alone applications that don’t understand each other. The web is moving to be programmable by normal people instead of just geeks like me.

This is becoming important for outward-facing applications. If you’re going to trust the answers from something, you’ve got to be able to understand why you should trust them. The web is also moving to being explainable, more capable of filtering, and more capable of executing services.

While RDF Schema and RDF/XML just got to Recommendation status from the W3C, the basics were already there; ontology and forward is where we’re focusing these days. Potentially the most important dimension is the level of sophistication, ranging from the very simple notion of a catalog, a unique index for a term, to things like a glossary, where those terms have natural-language descriptions. A glossary might be understandable by humans but not very operational for agents unless they’re capable of full natural-language understanding.

We then move to a more structured realm, where we see a thesaurus and the notion of a narrower or more general term. There you start to get structured relationships between terms that agents can start to use. I captured this notion of what I call an informal isa [“is a”].
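
McGuinness’s spectrum — catalog, glossary, thesaurus, informal “is a” — maps fairly directly onto today’s RDF vocabularies. Here is a minimal sketch using the rdflib library and a made-up example.org vocabulary; it is my illustration, not code from the column:

```python
# A sketch of the "spectrum of sophistication" McGuinness describes, using
# the rdflib library and a made-up example.org vocabulary.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS, SKOS

EX = Namespace("http://example.org/terms/")
g = Graph()
g.bind("ex", EX)
g.bind("skos", SKOS)

# Catalog: a unique identifier (URI) and label for each term.
g.add((EX.Sedan, RDFS.label, Literal("sedan")))

# Glossary: a natural-language description a human can read, but an agent
# cannot do much with unless it understands natural language.
g.add((EX.Sedan, RDFS.comment, Literal("A passenger car with a separate trunk.")))

# Thesaurus: broader/narrower links, the first relationships agents can use.
g.add((EX.Sedan, SKOS.broader, EX.Car))

# Informal "is a": more structure than a thesaurus link; the formal version
# would be an ontology axiom such as rdfs:subClassOf.
g.add((EX.Sedan, EX.isA, EX.Car))

print(g.serialize(format="turtle"))
```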

Google Search Appliance

InfoWorld writes about Google’s plans for the enterprise networks:

“We want to index and allow users to search on the broadest set of enterprise documents, while retaining the simplicity of use that people have come to expect from Google,” said Dave Girouard, Google’s enterprise general manager.

That means turning the Search Appliance from a tool that today is primarily used to index and search intranets and external company Web sites to a tool with a much broader reach into enterprise data sources, he said.

While Girouard declined to specify the data sources the Search Appliance might support, he said the product at some point could be extended to tap enterprise databases, desktop PCs, content management systems and CRM (customer relationship management) applications.

Extending the Search Appliance in this manner “makes more than perfect sense” because most enterprise data resides in a variety of formats that aren’t Web-accessible, said Whit Andrews, a Gartner Inc. analyst. “How much (corporate data) gets turned into an HTML file? Relatively little,” Andrews said. “If Google wants to continue to grow that (Search Appliance) business opportunity, then it absolutely must be in the business of dealing with a large variety of enterprise document formats and data repositories.”

As such, Google is positioning itself to compete with vendors such as Verity Inc., Autonomy Corp. PLC and Fast Search & Transfer ASA, Andrews said. One thing that sets Google apart is the ease of use and interface simplicity of its online services, and this is something that the company should always strive to replicate in the enterprise search space as it pushes forward, he said. “Ease of use is something Google has really excelled at,” he said. “That’s its hallmark.”

The Search Appliance can crawl and index documents in more than 250 file formats, as long as the documents are accessible via HTTP or HTTPS.
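
The HTTP/HTTPS requirement is the crux of Andrews’ point: anything that isn’t already Web-accessible has to be put behind a web front end before the appliance can see it. Below is a minimal sketch of that idea, assuming a hypothetical read-only file share at /srv/documents — an illustration only, not Google’s actual connector mechanism:

```python
# Hypothetical gateway: serve a non-web document repository over HTTP so a
# crawler (such as the Search Appliance) can reach it. The path and port
# are made up for illustration.
import functools
import http.server
import socketserver

REPOSITORY = "/srv/documents"   # hypothetical read-only file share
PORT = 8080

Handler = functools.partial(
    http.server.SimpleHTTPRequestHandler, directory=REPOSITORY)

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    # Point the crawler at http://<host>:8080/ and let it fetch whatever
    # document formats it finds in the repository.
    httpd.serve_forever()
```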

Search could be a good extension to our Emergic MailServ.

Sun’s Java Desktop System

ONJava has an article by Sam Hiser:

While Java is important to JDS, there should be no mistaking that JDS is a complete and thoroughgoing Linux distribution. In fact, JDS is based on Novell’s SuSE Linux distro, employs the GNOME user interface, and carries a complete selection of desktop applications. Many, if not most, of JDS’s components, too, are open source software.

JDS, a hardened enterprise Linux distro, contrasts dramatically with what we call “popular Linux.” The latter is exemplified by Fedora Core 2 (or now, 3), SuSE Linux Desktop (9.x), Slackware, Gentoo, Debian, and others, which have the very latest Linux kernel and the latest versions of open source applications. JDS, being built upon older, more stable components, has faced criticism from open source users who are accustomed to the latest Free Software components and are willing to live with the attendant instability and incompleteness of the application toolset. But such critics are consistently out of touch with enterprise software demands, often unable to see the necessity, or the value proposition to large organizations, of completeness and integration over currency.

It’s important also to note that JDS will be available by the end of 2004 to run on Solaris workstations and on the Sun Ray thin-client system, as well as Linux.

Alan Cox on Writing Better Software

Krzysztof Kowalczyk points to a Ping Wales report on the presentation:

Even though there has been a movement for some time to introduce traditional engineering concepts such as quality assurance to software development, Cox sees today’s software engineering as “the art of writing large bad programs rather than small bad programs”.

Of the much-vaunted ‘holy grail’ of reusable objects, Cox said, “As far as I’m concerned these all generally suck too. Part of the problem is that they’re sold as products and the original ideas behind a lot of reusable products is that you wrote it once. If you write it once, it has to do everything. If it does everything it’s complicated, and if it’s complicated, it’s broken. That’s not always the case but it is quite frequently the case.”

As for QA, “Everybody in the real world will agree – the moment a project is behind deadline, quality assurance tends to go out the window. People go through the specification and everything marked ‘optional’ becomes ‘version 2’, and everything marked ‘QA needed’ becomes, ‘we’ll find out from the users if it works,’” Cox said.

Another factor that’s led to the current state of affairs is that of canny software companies which shift bad software as quickly as possible, on the basis that once the end user has one piece of software for the job it becomes harder to switch to another one – in that context, Cox considers Microsoft’s release of early versions of MS Windows as a very sound economic and business decision.

Compounding the situation even further is the incentive for businesses to deny all knowledge and point fingers when software errors are uncovered. If there are several parties responsible for the maintenance of a piece of software, he said, it’s in everybody’s interests that the other person fixes the bug because the customer will assume that whoever fixes the bug was responsible for it. Most businesses, particularly SMEs, don’t have that luxury.

How does one make the world a better place by writing better software? For starters, Cox says, we need to accept that humans are fallible and that software engineers, no matter how well trained, will make large numbers of mistakes in their software – so we should start using the right tools to keep the error count as low as possible.
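
As a small illustration of “the right tools” (my example, not Cox’s): type annotations let a static checker such as mypy reject a bad call before it ever ships, and an assertion turns a silently wrong answer into a loud failure during testing.

```python
# Illustration only: lean on tools (a type checker, assertions, tests)
# rather than programmer vigilance to keep the error count down.
def average(values: list[float]) -> float:
    assert values, "average() of an empty list is undefined"
    return sum(values) / len(values)

print(average([1.0, 2.0, 3.0]))   # 2.0
# average("123") would be flagged by a type checker before it ever runs,
# because a str is not a list[float].
```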

TECH TALK: Web 2.0 Conference: Observations

Jeff Jarvis summarized the conference:

Trust is an organizing principle. In our world of instant access to everything, we’ll get what we want with a little help from our friends — via links as a measure of trust (see Google and Technorati and more to come).

We want to control our data. There was much discussion of big, bad companies’ efforts to keep us by keeping control of our data: the roach motel strategy, as Steve Gillmor called it. They get our email (Yahoo) or our reputation (eBay) or our IM (AOL) and don’t want us to export or sync it with anyone else. But that is clearly a losing strategy.

Open source rules: Whether via Kim Polese’s new open-source-integrator business … or a couple of wiki businesses out to replace expensive enterprise software … or talk of the web, indeed, becoming our operating system … or calls for interoperable and open standards on phone OSs … or talk of the big, old software industry’s days being over … it’s clear that open source is both the architecture and the culture of technology today.

RSS has arrived. I know, it had arrived before. But the RSS session in which I participated was jammed. RSS kept coming up in every tech presentation. There were lots of RSS vendors: Feedburner, Topix, Rojo.

Podcasting will arrive: Much buzz about the new platform for radio.

Martin Tobias had his own observations:

1. The web as a platform. This is the basic contention of web 2.0. If web 1.0 was about making the web safe for people, web 2.0 is about making it safe for computers. The next generation of web applications will leverage the shared infrastructure of the web 1.0 companies like eBay, PayPal, Google, Amazon, and Yahoo, not just the “bare bones transit” infrastructure that was there when we started all those companies in the late 90s (UUNet, Exodus, DoubleClick, etc.). This fact is fundamental to the next generation of entrepreneurs thinking about companies today.

2. Search is going to see major innovation over the next 12 months. Just when you thought Google had it all tied up, here comes the next generation. Some are calling it personalization, some local search; whatever you call it, you can think about it like this: the mass of web links was tamed and organized into a card catalog by Yahoo, but it was still too hard to find relevant stuff, so Google reduced the process to one box with superior results. Having simplified the massively complex web to one box, we are now aching for more control, more customization, etc. Next-generation search will move away from simplification toward personalization and user control of the search results. [A toy sketch of this kind of re-ranking follows the list below.]

3. China is the most important country in the internet’s future. Basically, if you are not paying attention to the China internet market, you are not watching the future of the internet.

4. There are many people ready and willing to run up the hill again. After the meltdown of 2000-2003, I had some serious concerns about who would be stupid enough to run up the innovation hill again after such a bloodbath. Luckily, the spirit of entrepreneurship is very resilient and the successful are coming back (as well as a new crop).
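
Here is a toy sketch of the re-ranking idea in Tobias’s point 2 — the result list, topics and profile weights are all invented for illustration, not anything demonstrated at the conference:

```python
# Toy personalization: re-rank one-box search results with a per-user
# profile instead of returning the same order to everyone.
results = [
    {"url": "http://example.com/python-tutorial", "score": 0.92, "topic": "programming"},
    {"url": "http://example.com/snake-care",      "score": 0.95, "topic": "pets"},
    {"url": "http://example.com/flask-guide",     "score": 0.88, "topic": "programming"},
]

# Hypothetical interest weights learned from this user's history.
user_profile = {"programming": 1.5, "pets": 0.6}

def personalized_score(result: dict) -> float:
    # Blend the global relevance score with the user's topical interest.
    return result["score"] * user_profile.get(result["topic"], 1.0)

for r in sorted(results, key=personalized_score, reverse=True):
    print(f'{personalized_score(r):.2f}  {r["url"]}')
```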

Tomorrow: Observations (continued)
