Topix and Findory

Excerpts from a SearchEngineWatch interview with Rich Skrenta about Topix, which “combines an excellent news search engine with two other hot technologies: local search and personalization. The Topix database includes full text news stories from over 4,000 sources, including a great deal of content that’s difficult to quickly access elsewhere. The real power of this nifty news search engine comes from its easy-to-use pre-built pages that aggregate news and other information into more than 150,000 topic-specific pages.”

Rather than starting with a full web crawl, which has 4 billion+ pages, we started with news, which has 4,000 sources, and is very dynamic and high quality content. We don’t cover everything in the world yet, but we do have every place in the U.S., every sports team, music artist, movie personality, health condition, public company, business vertical, and many other topics.

We developed separate software modules to crawl, cluster and categorize articles. The heart of our system is a proprietary AI categorizer that uses a massive Knowledge Base (KB) to determine the geographic location and subject categorization for each story. The final step is the Robo-Editor, which picks the best stories for display.

We have a commercial feed business for companies that want to enhance their own website offerings with deeply categorized news content. Topix offers an extremely rich newsfeed: in addition to the standard URL, title, and summary, we have the latitude/longitude of the news source, the latitude/longitude for the subjects of the story, the prominence of the news source, the subject categorizations, and more. We can also “geo-spin” any subject category, to produce a locally focused version. These features give us a lot of flexibility to customize feeds for clients.
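To make the feed description concrete, here is a minimal Python sketch of what one record might look like. The class name, field names, and the `geo_spin` helper are all hypothetical. The interview does not describe the actual schema or how geo-spinning is implemented, so the filter below (keep stories in a category whose subject lies within some distance of a point) is only one plausible reading.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class FeedItem:
    """Hypothetical feed record built from the fields named in the interview."""
    url: str
    title: str
    summary: str
    source_location: LatLon                      # where the news source is
    subject_locations: List[LatLon] = field(default_factory=list)
    source_prominence: float = 0.0               # scale is an assumption
    categories: List[str] = field(default_factory=list)


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geo_spin(items: List[FeedItem], category: str,
             center: LatLon, max_km: float) -> List[FeedItem]:
    """Assumed meaning of "geo-spin": a category narrowed to stories
    whose subject is within max_km of the given point."""
    return [
        item for item in items
        if category in item.categories
        and any(haversine_km(center, loc) <= max_km
                for loc in item.subject_locations)
    ]
```

Calling `geo_spin(items, "sports", (47.61, -122.33), 50)` would then return only the sports stories whose subjects are within 50 km of Seattle.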

In addition to newspapers, Topix is crawling radio and TV station websites, college papers, and some high school papers and weblogs. We’re also crawling government websites with “newsy” public information, such as police department crime alerts, health department reports, OSHA violation announcements, coast guard notices, and news releases from other city, county and state level government entities. We are crawling and including press releases too.

Our focus is on hyperlocal deep coverage of the U.S. We love police blotters and little papers with extremely local coverage. If your local PTA has online meeting minutes, that’s the kind of source we want to add.

The Seattle Times has an interview with Greg Linden of Findory, which “offers free, instant personalization of news searches at [its website]. It learns from the news you select to read and finds articles that match your interests.”

What’s the point? There’s a glut of news out there, and Findory News is one way to find some focus within that information, Linden said. The Web site keeps a record of the articles you’ve read in the past and uses that information to automatically pick the articles you would likely be interested in.

How does it work? Each visitor is assigned an anonymous identifier, a random number that’s part of a cookie, a piece of data that tracks a visitor’s preferences. It associates news searches with the individual identifiers. The service is anonymous in that it doesn’t know anything about users other than the articles they’ve read.
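The mechanism described above (a random identifier in a cookie, plus a record of what that identifier has read) can be sketched in a few lines of Python. This is only an illustration: the function names, the in-memory store, and the tag-overlap ranking are assumptions, and the article does not say how Findory actually picks articles.

```python
import secrets
from collections import Counter, defaultdict

# Hypothetical in-memory store: visitor id -> topic tags of articles read.
read_history = defaultdict(list)


def get_or_create_visitor_id(cookies: dict) -> str:
    """Return the anonymous id stored in the cookie, creating one if absent."""
    if "visitor_id" not in cookies:
        cookies["visitor_id"] = secrets.token_hex(16)  # random, carries no identity
    return cookies["visitor_id"]


def record_read(visitor_id: str, article_tags: list) -> None:
    """Remember the tags of an article this visitor has read."""
    read_history[visitor_id].extend(article_tags)


def rank_articles(visitor_id: str, candidates: dict) -> list:
    """Order candidate titles by overlap with the visitor's past reading.

    candidates maps title -> list of tags. Tag overlap is a stand-in
    scoring rule, not Findory's actual method.
    """
    interest = Counter(read_history[visitor_id])
    return sorted(
        candidates,
        key=lambda title: sum(interest[tag] for tag in candidates[title]),
        reverse=True,
    )
```

The only thing tied to the identifier is reading history, which matches the article's point that the service knows nothing about users beyond the articles they have read.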

Why news? Lots of companies are developing their own personalization services, and Linden could make a nice living as an executive geek. He said he picked the news business because it has a redeeming social value. “If you make it easier for people to read the news, to spend less time and be more informed, that actually has a lot of value,” he said. “People make better decisions.”

India’s IT Sector

The Indian Express has some statistics about what to expect in the coming year. [USD 1 billion = Rs 4,400 crore]

India’s domestic IT industry is expected to record a growth of around 11 per cent in 2003-04, touching a revenue of Rs 33,700 crore, according to Nasscom.

The domestic IT industry has three segments: software and services; hardware, peripherals and networking; and training. Software services exports are expected to grow to Rs 55,510 crore in 2003-04.

The software and services segment is expected to touch revenues of Rs 15,400 crore in 2003-04, up from Rs 13,400 crore in 2002-03, a growth of 14.8 pc, which is lower than the 23 per cent growth witnessed by the segment in 2002-03.

The hardware, networking and peripherals segment is expected to register a growth of eight per cent to reach a revenue of Rs 17,200 crore in 2003-04; the training segment is likely to recover in 2003-04 with revenues increasing six per cent to Rs 1,200 crore, Nasscom said.

The software and services segment can be further divided into packaged software and software services.

The packaged software segment (ERP, SCM, CRM and e-commerce) is estimated to grow five per cent to reach Rs 2,100 crore in 2003-04 from Rs 2,000 crore in 2002-03.

The services segment (customised software services, turnkey projects, and consulting and other services) is estimated to grow 18 per cent from Rs 8,500 crore in 2002-03 to Rs 10,000 crore in 2003-04.
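For readers who want to check the arithmetic or think in dollars, here is a small Python sketch. It uses the conversion quoted at the top of this item (USD 1 billion = Rs 4,400 crore) and the revenue figures from the excerpt; the function names are my own.

```python
CRORE_PER_USD_BILLION = 4400  # conversion rate quoted at the top of this item


def crore_to_usd_billion(crore: float) -> float:
    """Convert a rupee amount in crore to billions of US dollars."""
    return crore / CRORE_PER_USD_BILLION


def growth_pct(previous: float, current: float) -> float:
    """Year-on-year growth as a percentage."""
    return (current - previous) / previous * 100


# Figures in Rs crore, taken from the excerpt above.
domestic_revenue = 33_700
exports = 55_510

print(f"Domestic IT: about ${crore_to_usd_billion(domestic_revenue):.2f} billion")
print(f"Software services exports: about ${crore_to_usd_billion(exports):.2f} billion")
print(f"Software and services growth: {growth_pct(13_400, 15_400):.1f}%")
print(f"Packaged software growth: {growth_pct(2_000, 2_100):.1f}%")
print(f"Services growth: {growth_pct(8_500, 10_000):.1f}%")
```

The software and services growth works out to roughly 14.9 per cent against the quoted 14.8, presumably because the rupee figures in the article are rounded.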

Meanwhile, small and medium enterprises have shown a significant increase in the usage of information technology with about 81 per cent of them having their own websites in 2003-04 as against 73 pc in the previous year.

According to a snap poll conducted by CII, all the respondents were availing IT facilities in their organisation, with 17 per cent of them describing the IT penetration as excellent compared to 12 pc in 2002-03. Last year, 19 pc of the respondents had described this phenomenon as marginal, which has come down by six per cent this year, a CII release said.

Further, all the respondents have indicated an increase in their turnover due to the use of IT. While 55 pc of those polled indicated an increase of up to five pc in their turnover, 37 pc claimed a rise between 5-25 pc.

According to the survey, there has also been an increase in the annual expenditure on IT by SMEs. On an average, 64 pc of the respondents spend up to five per cent of their turnover on IT hardware, software and manpower.

Regarding the returns on investments on their websites, 46 pc of the organisations claimed they were satisfied while 12 pc said they had realised more than their expectations. The CII-SME survey is fourth in the series of surveys undertaken by the CII Small Industry Desk during 2003-04.

US Yellow Page Battles

The US Yellow Pages market has been largely controlled by the Baby Bells. Now, Yellow Book is upsetting the applecart with its aggressive pricing. WSJ has more:

Mr. Walsh, the 41-year-old chief executive of Yellow Book USA, a unit of Yell Group PLC, is riling the once-sleepy yellow-pages industry with a simple formula: selling cheaper ads that attract more advertisers. In just 10 years, Yellow Book has grown from a small local directory service to the publisher of more than 500 directories with a total distribution of 72 million in 42 states.

While Wall Street is fixated on the brutal competition to provide phone service, there’s an equally brutal competition under way in the $14 billion yellow-pages industry. As recently as 1995, Baby Bell phone companies and other incumbents snared around 96% of yellow-pages revenue, according to the Kelsey Group, which analyses the industry. Now, they get about 86%.

The Bells badly want to protect this lucrative franchise from further inroads. While yellow-pages advertising accounts for a relatively small part of the revenue for the former Baby Bells, it accounts for a much larger share of their profits.

For now, Mr. Walsh’s Yellow Book USA appears to be the most ominous threat to the Bells. Through a combination of acquisitions and internal growth, the company’s revenue in 2003 hit $1 billion, compared with $46 million 10 years ago. Today, three out of four Americans live in a market with Yellow Book directories, and Mr. Walsh aims to push that number higher.

It’s not just acquisitions that make Yellow Book tick. Mr. Walsh’s strategy is to go after smaller customers. Where Verizon charges around $3,300 for a full-page ad in Philadelphia, Mr. Walsh charges less than $1,900 for the same. Mr. Walsh says that he can live with Yellow Book’s profit margins being a lot lower than those of the Bells’ directory services, which can run well in excess of 50%.
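A quick back-of-the-envelope check of the numbers in the WSJ excerpt, as a Python sketch. The inputs are the figures quoted above; the variable names are mine, and since Yellow Book's ad price is given only as "less than $1,900", the discount computed here is a lower bound.

```python
# Revenue growth: $46 million ten years ago to $1 billion in 2003.
start_revenue_musd = 46
end_revenue_musd = 1000
years = 10
cagr = (end_revenue_musd / start_revenue_musd) ** (1 / years) - 1

# Full-page ad in Philadelphia: Verizon vs. Yellow Book (upper bound on price).
verizon_price = 3300
yellow_book_price_max = 1900
min_discount = 1 - yellow_book_price_max / verizon_price

# Share of the $14 billion market held by non-incumbents, from 96% and 86%.
market_busd = 14
challenger_share_then = 1 - 0.96
challenger_share_now = 1 - 0.86

print(f"Compound annual revenue growth: {cagr:.1%}")
print(f"Discount versus Verizon: at least {min_discount:.0%}")
print(f"Challengers' share: {challenger_share_then:.0%} then, {challenger_share_now:.0%} now")
print(f"Challengers' revenue now: about ${market_busd * challenger_share_now:.2f} billion")
```

In other words, Yellow Book has grown revenue at roughly 36 per cent a year while undercutting the incumbent by more than 40 per cent on a comparable ad.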

Tomorrow’s Office

Business Week looks ahead to the offices we will work in:

Many scientists at universities and government labs — and at companies such as IBM, Microsoft, and even office-furniture maker Steelcase — [are working on] changing the office environment. They’re developing desk chairs that will sense when you’re stressed and, perhaps, tell your boss to offload some of your work; PCs that can figure out during your senior moments where you’ve seen a particular name; and desktops that, with a push of a button, transform themselves into computer monitors to help facilitate discussion during a roundtable meeting.

All of these ideas have one goal in common: To raise white-collar productivity — or at least preserve the huge gains of recent years while avoiding employee burnout. The idea is to build upon the innovations that have transformed offices over the past 15 years. As recently as 1990, voice mail was still being introduced in Corporate America, e-mail was largely self-contained within companies, and attending a meeting in another city meant going there.

Since then, Net-based forms of communication — such as e-mail, instant messaging, and videoconferencing, abetted by lighter and more versatile cell phones and laptop computers — have sped up both work and business decisions. The tools have improved so rapidly that “customers are starting to feel that office technology has come about as far as it can,” says Tom Gruver, a marketing manager at Microsoft. “There are no more expectations of productivity increases.”

Experts, however, swear that office innovation is about to take another leap. One reason they cite is technology advances, such as the ability to make larger computer screens. More powerful than that, however, is the need for technologies to help keep an aging workforce spry, that can compensate for the growing complexity of many jobs, and that meet the needs of increasingly mobile employees. Taken together, such technologies will ultimately change the definition of the modern office.

In fact, the idea that an office is an enclosure with walls is already disappearing, thanks to technologies such as Wi-Fi, which provides high-speed access to a network or the Internet from any place a connected employee chooses to wander, be it down the hall or to a café, airport, or hotel.

Simputer Learnings

Sydney Morning Herald has a story by Supriya Singh:

Professor Swami Manohar, one of the original group of designers of the Simputer, says: “We have learnt that ideas and technology alone do not make a success . . . You have to succeed in the market.”

Manohar is a computer scientist with the Indian Institute of Science in Bangalore. He is also the chief executive of PicoPeta Simputers, one of the two companies developing the Simputer. He says publicity that it was a simple, cheap computer for the poor and illiterate “gave it the stigma of ‘ration rice’. Even the poor see the more expensive rice as the better rice. It is marketing and packaging that influence the decision makers.”

Marketing did not, however, convince the decision makers in and out of government who would have funded the purchase of the Simputer, so now its supporters are going directly to the retail market.

Looking back over the five years since the concept was articulated, he says: “We should have marketed it as world-class technology that makes computing simpler for everyone. World-class development is the right thing also for the poor.”

It is the “daughter syndrome”, says Professor S. Sadagopan, director of the Indian Institute of Information Technology, also in Bangalore. “You love your daughter so much that you cannot stand to give her away in marriage. Hence they did not make the deals that came to them from foreign companies.”

Others say the timing is wrong. PCs in 2001 were selling for 50,000 Indian rupees ($1500). PDAs and mobile phones were not as popular. But now some universities are using PCs with Linux and a Taiwanese chip, costing less than $300. Mobile phones in India are now among the cheapest in the world. So why would somebody spend $450 on a hand-held computer?

The Simputer team is betting that the quality of the technology, aided by better marketing, will make the difference. “It is the device you want to have,” Manohar says.

TECH TALK: As India Develops: Distribution Hubs (Part 4)

Continuing from the RISC paper by Atanu Dey and Vinod Khosla:

The solution presented in effect solves a constrained optimization problem where the constraints are:

  • Limited resources
  • Large population of 700 million people in 600,000 villages spread over a large subcontinent
  • Very poor infrastructure in terms of power, roads, telecommunications
  • Very low per capita income
  • Very low literacy rates

    With 700 million people in about 600,000 villages, every cluster of 100 villages will have approximately 100,000 people. With 5000 such clusters one can cover most of the rural population. Geographically, if one was to draw 20-kilometer circles one could cover the whole country with about 5000 circles. Most of the population in each circle would be about 10 km from the cluster center, well within a bicycle commute of such a center. 5000 such cluster centers could provide the basis for small, but critical mass towns around which the rural economy could develop. They could provide the infrastructure for power, communication, healthcare, education and government, the services to kick start market economy, sufficient demand for a diversity of services to emerge, and in general be the catalyst, the mixing bowl for our soup so the system can become autocatalytic. The idea is to make available at the center of such a circle all the services and functions that are normally only available in a city. The services are available to the entire population just a bicycle commute away, with a majority of the population within 10 km, and most within 20 km of this center.
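The cluster arithmetic in this paragraph is easy to verify with a short Python sketch. Two inputs are not from the paper and are my assumptions: India's total area of roughly 3.29 million square kilometres, and reading "20-kilometer circles" as circles of 20 km radius (which fits the later remark that most people would be within 20 km of the center).

```python
import math

# Figures from the paper.
RURAL_POPULATION = 700_000_000
VILLAGES = 600_000
VILLAGES_PER_CLUSTER = 100
CLUSTERS = 5000

# Assumptions not stated in the paper.
CIRCLE_RADIUS_KM = 20          # "20-kilometer circle" read as a radius
INDIA_AREA_KM2 = 3_287_000     # approximate total area of India

people_per_village = RURAL_POPULATION / VILLAGES                 # about 1,167
people_per_cluster = people_per_village * VILLAGES_PER_CLUSTER   # about 117,000
villages_covered = CLUSTERS * VILLAGES_PER_CLUSTER               # 500,000
share_of_villages = villages_covered / VILLAGES                  # about 83%

circle_area_km2 = math.pi * CIRCLE_RADIUS_KM ** 2                # about 1,257
total_circle_area_km2 = CLUSTERS * circle_area_km2               # about 6.3 million

print(f"People per village: {people_per_village:,.0f}")
print(f"People per 100-village cluster: {people_per_cluster:,.0f}")
print(f"Share of villages in 5000 clusters: {share_of_villages:.0%}")
print(f"Combined circle area vs. India: {total_circle_area_km2 / INDIA_AREA_KM2:.1f}x")
```

So a 100-village cluster holds a little over 100,000 people, 5000 clusters take in about five-sixths of the villages, and under the radius assumption the circles' combined area is nearly twice the country's, which leaves room for overlap and uneven spacing.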

    It should be noted that the exact number of such centers is not important and neither is their exact location. Much of this infrastructure exists around existing small Tier III/IV towns (about 4000 of them), the 5000 or so railway stations in the country, the 5000 haats or informal weekly markets that exist. The proposal here is to reinforce these sites with a focusing of most rural investment around these locations rather than scattering them in individual villages. The notion is that this focusing of the investment will result in a critical mass center for each cluster of 100 villages or 100,000 villagers rather than a larger number of sub-critical mass individual villages. The money will get a substantially higher rate of return, spurring economic growth, relative to an even more distributed model. We are in essence proposing that between the village and the megacity, there is an optimal size around 100,000 people (actually a range from 50,000 to less than 100,000) where current investment should be directed.

    The costs of providing basic governmental and non-governmental services like housing, police protection, legal, education, information, communication, regulatory, are much lower, and provision is easier, than in villages. Most people can have access to these services, but they are available at a scale where, if demand is small, resources are not wasted as they would be in the context of the individual village.

    A partial list of essential services (henceforth referred to as user services) would include

  • Market making and access to markets
  • Supply aggregation of agricultural and non-agricultural outputs
  • Demand aggregation of agricultural and non-agricultural inputs
  • Diversity of services to make an autocatalytic soup
  • Education and library
  • Health care
  • Banking and financial intermediation
  • Telecommunications and internet access
  • Governance
  • Entertainment
  • Legal
  • Charity and social services
  • Market information
  • Weather and agricultural information

    The cost of these services will depend on, among other things, the cost of the inputs to provide the services and on the total amount of the services supplied. The costs of the user services (and consequently their prices) depend on the cost of core infrastructural services such as

  • Power
  • Telecommunications
  • The physical plant including building, water, sanitation, security, HVAC
  • Transportation
  • Finance

    If the core infrastructural services are reliably available at low prices, the prices of the user services will be correspondingly low.

    The basic function of a RISC is to provide the core set of infrastructural services reliably and inexpensively so that user services that require these as part of their inputs can be efficiently provided and optimally priced. The contention is that reliable, ubiquitous, easily accessible infrastructural services form the platform that can support a full set of appropriate services critical for rural economic development. The critical mass of consumers and producers together with cost effective infrastructure which reduces the cost of services will achieve autocatalytic criticality and hence significantly enhanced economic growth.

  • Tomorrow: Distribution Hubs (continued)
