Emergic: Rajesh Jain's Blog


The Future of Databases

May 31st, 2005

ACM Queue has an article co-authored by Jim Gray of Microsoft:

We live in a time of extreme change, much of it precipitated by an avalanche of information that otherwise threatens to swallow us whole. Under the mounting onslaught, our traditional relational database constructs, always cumbersome at best, are now clearly at risk of collapsing altogether.

In fact, rarely do you find a DBMS anymore that doesn't make provisions for online analytic processing. Decision trees, Bayes nets, clustering, and time-series analysis have also become part of the standard package, with allowances for additional algorithms yet to come. Also, text, temporal, and spatial data access methods have been added, along with associated probabilistic logic, since a growing number of applications call for approximated results. Column stores, which store data column-wise rather than record-wise, have enjoyed a rebirth, mostly to accommodate sparse tables, as well as to optimize bandwidth.
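
To make the column-store point concrete, here is a minimal Python sketch of my own (not from the article). The table contents and the helper names `to_columns` and `column_sum` are hypothetical. Each column is kept as its own map from row position to value, so a missing value in a sparse table takes no space and an aggregate touches only the one column it needs.

```python
# Illustrative sketch only: the sample rows and function names are assumptions.

# A sparse table stored record-wise: not every record has every field.
rows = [
    {"id": 1, "name": "a", "price": 10.0},
    {"id": 2, "name": "b"},
    {"id": 3, "price": 30.0},
]


def to_columns(records):
    """Convert records to a column-wise layout.

    Each column maps row position -> value, so absent values are simply
    not stored.
    """
    columns = {}
    for position, record in enumerate(records):
        for field, value in record.items():
            columns.setdefault(field, {})[position] = value
    return columns


def column_sum(columns, field):
    """Sum one column without reading any other column."""
    return sum(columns.get(field, {}).values())


columns = to_columns(rows)
total_price = column_sum(columns, "price")
```

A real column store would add compression and a packed on-disk layout; the sketch only shows why reading one attribute no longer requires scanning whole records.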

Is it any wonder classic relational database architectures are slowly sagging to their knees?

But wait, there's more! A growing number of application developers believe XML and XQuery should be treated as our primary data structure and access pattern, respectively. At minimum, database systems will need to accommodate that perspective. Also, as external data increasingly arrives as streams to be compared with historical data, stream-processing operators are of necessity being added. Publish/subscribe systems contribute further to the challenge by inverting the traditional data/query ratios, requiring that incoming data be compared against millions of queries instead of queries being used to search through millions of records. Meanwhile, disk and memory capacities are growing significantly faster than corresponding capabilities for reducing latency and ensuring ample bandwidth. Accordingly, the modern database system increasingly depends on massive main memory and sequential disk access.
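
The publish/subscribe inversion is easier to see in code. The sketch below is my own illustration, not the article's design: the class `SubscriptionIndex`, its methods, and the sample subscribers are all hypothetical, and it assumes subscriptions are simple equality tests on a single field. The stored queries are indexed, and each incoming event is looked up against that index, rather than indexing the data and running queries over it.

```python
# Illustrative sketch only: names and the equality-only query model are assumptions.
from collections import defaultdict


class SubscriptionIndex:
    """Index standing queries so that each incoming event finds its subscribers."""

    def __init__(self):
        # (field, value) -> set of subscriber ids interested in that exact match
        self._by_predicate = defaultdict(set)

    def subscribe(self, subscriber_id, field, value):
        """Register a standing query of the form `field == value`."""
        self._by_predicate[(field, value)].add(subscriber_id)

    def match(self, event):
        """Return every subscriber whose query matches the event.

        Cost grows with the number of fields in the event, not with the
        number of stored queries.
        """
        matched = set()
        for field, value in event.items():
            matched |= self._by_predicate.get((field, value), set())
        return matched


index = SubscriptionIndex()
index.subscribe("alice", "symbol", "MSFT")
index.subscribe("bob", "symbol", "IBM")
index.subscribe("carol", "exchange", "NYSE")

notified = index.match({"symbol": "IBM", "exchange": "NYSE", "price": 75.0})
```

Range predicates and multi-field conjunctions would need a richer index, but the role reversal between data and queries stays the same.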

This all will require a new, much more dynamic query optimization strategy as we move forward. It will have to be a strategy that's readily adaptable to changing conditions and preferences. The option of cleaving to some static plan has simply become untenable. Also note that we'll need to account for intelligence increasingly migrating to the network periphery. Each disk and sensor will effectively be able to function as a competent database machine. As with all other database systems, each of these devices will also need to be self-managing, self-healing, and always-up.
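
As a toy picture of what abandoning a static plan could mean, here is a small Python sketch of my own. The article does not describe a specific algorithm; the class `AdaptiveFilter`, its `reorder_every` parameter, and the sample predicates are hypothetical. It tracks how often each predicate passes and periodically moves the most selective predicate to the front, so the evaluation order follows the data actually seen.

```python
# Illustrative sketch only: one simple way to adapt evaluation order at run time.


class AdaptiveFilter:
    """Evaluate a conjunction of predicates, reordering them by observed pass rate."""

    def __init__(self, predicates, reorder_every=100):
        # Each entry is [predicate, times_evaluated, times_passed].
        self._stats = [[predicate, 0, 0] for predicate in predicates]
        self._reorder_every = reorder_every
        self._processed = 0

    def accepts(self, row):
        """Return True if the row passes every predicate (short-circuiting)."""
        self._processed += 1
        result = True
        for entry in self._stats:
            entry[1] += 1
            if entry[0](row):
                entry[2] += 1
            else:
                result = False
                break
        # Periodically put the predicate that rejects the most rows first.
        if self._processed % self._reorder_every == 0:
            self._stats.sort(key=lambda e: e[2] / e[1] if e[1] else 1.0)
        return result

    def order(self):
        """Names of the predicates in their current evaluation order."""
        return [entry[0].__name__ for entry in self._stats]


def is_positive(x):
    return x > 0


def is_multiple_of_50(x):
    return x % 50 == 0


adaptive = AdaptiveFilter([is_positive, is_multiple_of_50], reorder_every=100)
kept = [x for x in range(1, 1001) if adaptive.accepts(x)]
```

After the first hundred rows the filter learns that `is_multiple_of_50` rejects almost everything and starts checking it first. A production optimizer would also weigh the cost of each predicate, not just its selectivity.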

Tags: Software
