Internet Data
Although poets and idealists may disagree, the future of the Internet is already written, with the high-tech industry pushing it towards becoming a platform designed for and dominated by business-to-business information sharing.
In this model, the Internet is effectively a large public WAN, with organizations interconnecting their business systems into indistinguishable, cooperative facilities on an ad hoc basis, as required by their current missions. When Company X goes into partnership with Corporation Z, the two firms won't just swap phone numbers but instead will exchange user accounts and passwords, opening their systems to their new partners on a totally unprecedented scale.
This kind of thing has been going on for years, of course, but only for those organizations with the money and expertise to build and manage their own private WANs (automobile dealer networks are a good example). But with the Internet, these barriers to entry have been significantly reduced. The cost of building a virtual WAN over the Internet is considerably less than it was when you had to buy your own copper, and the number of administrators with the required expertise has also increased substantially, allowing these kinds of cooperative computing agreements to increase dramatically.
So in the future, we can expect to see a lot more small organizations taking advantage of these "virtual company" capabilities, building dynamic networks with highly distributed access and authority. One example of this might be a small hardware vendor who provides tools for their resellers to enter sales and shipping orders directly into the vendor's databases, allowing for reduced time-to-ship windows. At the other end of the opportunity range, another example might be that same hardware vendor opening their systems to their outsourcing providers (such as accounting and other service firms), who would work on the local systems directly rather than batching data back and forth.
These scenarios also illustrate how many of the underlying problems have already been resolved by Internet technologies, bringing this vision closer to reality. Rather than asking which protocols the network will use, IP's pervasiveness makes the question moot. Similarly, HTTP and HTML have been proven to work for distributing access to functionally rich applications, allowing them to be exposed without much worry. As for security, there are many options to choose from there as well.
Wot? No Data?
In fact, there's only one component missing from this picture, although its absence is so significant that it threatens the entire vision. The thing that's missing is a vendor-independent, database-specific, application-layer protocol that would allow users at Company X to access the databases on Corp. Z's network, regardless of the database system in use at either location.
In my opinion, this hole is the most significant hurdle to implementing true inter-organization communications. Oh sure, companies can open up their internal web-based applications, but that kind of environment requires duplicate data entry or batch transfers, neither of which provides much in the way of efficiency.
In order to truly reduce the time-to-ship window, the reseller from the example above must be able to integrate the hardware vendor's remote database into their own local applications, and they must be able to do so seamlessly. And in order for the vendor to minimize their time-to-payment window, they have to be able to incorporate EDI in the same manner. In other words, if the small players are to take advantage of the newly leveled playing field, they have to be able to incorporate the same database-sharing technologies as their larger competitors.
Without these capabilities, the smaller firms will simply not be able to compete with the big boys and their WANs. This will, in turn, keep them from adopting Internet technologies as a competitive tool. If we are all to benefit from the lowered barriers to entry that Internet technologies provide, then we must collectively work to resolve this fundamental issue. That means developing an IETF-sanctioned application-layer standard for vendor-independent database connectivity.
Such a standard would allow for many things. At one end, it would allow remote users to access multiple remote databases without having to load vendor-specific agents or write vendor-specific application code. At the other end, it would allow organizations to interconnect their back-end servers directly, with updates and joins being applied to all of them simultaneously. The latter of these two examples is the more compelling, I think, because it allows for true partnership at the corporate level.
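To make that second scenario concrete, here is a purely illustrative sketch of what a cross-organization update might look like from the calling side. The idp_connect() helper, the host names and the table are all hypothetical; they simply stand in for whatever client interface a real vendor-neutral standard would eventually define.

    def idp_connect(host):
        # Hypothetical: open a session with a standards-based database
        # listener on the named host. No such protocol exists yet --
        # which is exactly the point of this article.
        raise NotImplementedError("vendor-neutral database protocol not yet defined")

    def record_shipment(order_id):
        # One routine, one verb set, two organizations' databases.
        vendor = idp_connect("db.hardware-vendor.example")   # Company X's server
        reseller = idp_connect("db.reseller.example")        # Corp. Z's server
        statement = ("UPDATE orders SET status = 'shipped' "
                     "WHERE order_id = %d" % order_id)
        for conn in (vendor, reseller):
            conn.execute(statement)   # the same statement applied to both systems
            conn.commit()

The interesting part isn't the code itself, of course; it's that neither connection would require an Oracle-specific or Sybase-specific driver on the caller's side.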
Current Solutions
But before we run off half-cocked and start writing a new Internet Data Protocol, we need to examine how some of the current implementations work (or don't work), and the issues that they present. In general, there are four technologies that are of interest here, although it is my belief that none of them is appropriate as the foundation for a new protocol:
- Vendor-Specific Implementations: For those of you who aren't heavily involved in database stuff, the first thing you have to realize is that every vendor uses different protocols, commands and feature sets for their network services. Oracle's software and command set are totally incompatible with Sybase's, which are totally incompatible with Informix's, ad nauseam. This is the problem that must be resolved.

  All kinds of issues crop up in this kind of environment. What version of Oracle's SQL*Net are you running, versus what version is the server running? Even worse, what if you need to connect to multiple systems using different flavors of SQL*Net? Compound this by adding in multiple flavors of Sybase, Informix, et al., and the situation becomes very, very complicated. And we haven't even talked about the logistics of distribution and licensing in this environment.

- ODBC: Microsoft's Open Database Connectivity API was the first really successful attempt at resolving this problem. By providing a single generic database interface, ODBC allows desktop applications to talk to multiple database providers simultaneously, whether they are an Access file on a floppy disk or a DB2 dataset on a remote mainframe. However, ODBC does not provide the actual connectivity to those databases, instead relying on vendor-specific drivers to handle the authentication, connectivity and data-management services that are specific to each of the products. Thus the problem has not been resolved, but instead has been relocated.

  Come to think of it, the desktop is probably the worst place to put the focus of network-wide data integration. Rather than equipping every client with a complex database gateway, we would be better off putting an open interface on the servers instead. This is similar in concept to IMAP and LDAP, which offer much better functionality when they are implemented on the servers directly. We also need a solution that removes Microsoft (or any vendor, for that matter) from the single point of control. ODBC is very proprietary, and is used as part of a larger campaign that isn't necessarily geared towards vendor-independent usage.

- RDA: I am not the first person to think that a vendor-neutral, network-specific database protocol would serve the user community well. Indeed, there exists today an OSI-specific extension to the SQL specification called Remote Database Access (RDA), which already offers much of the functionality that I'm calling for.

  However, RDA has many shortcomings. First and foremost, there is very little support for it within the vendor community (Sybase is the only major player that really supports it, based on my research). I'm also of the strong opinion that RDA should not simply be ported to IP; as we've seen from X.509 and LDAP, such straight ports rarely work well. We need something similar, but also something different, designed specifically for Internet usage.

- XML-over-HTTP: Nor am I the first person to think that a generic data-exchange mechanism would be a good thing for the Internet. Indeed, many vendors are already working with XML, hoping to use it as a pseudo-protocol for exchanging data over the Internet.

  There are of course problems here as well. First of all, much of XML's promise is based on the premise that users and systems will exchange data using text strings, which simply will not work with complex databases. Period. Secondly, HTTP does not offer an adequate verb set for managing database records. GET and POST simply won't provide the functionality of INSERT, UPDATE, DELETE and the other verbs that make SQL work (a sketch of this mismatch follows the list). Finally, XML's structure is decidedly focused on server-to-client transactions, and is not geared towards the specific intricacies that server-to-server communications require. Although people are suggesting that XML can be made to work, I'm dubious of their prospects.
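To illustrate the verb-set complaint, consider what tunneling a simple database change through HTTP actually looks like. The host name, gateway path and XML wrapper below are all made up for the example; the point is that whether the operation is an INSERT, UPDATE or DELETE, the HTTP layer sees nothing but a POST.

    import http.client

    statement = "UPDATE orders SET status = 'shipped' WHERE order_id = 1234"
    body = "<query><sql>%s</sql></query>" % statement   # hypothetical XML wrapper

    conn = http.client.HTTPConnection("db.example.com")
    # INSERT, UPDATE and DELETE would all be carried the same way: as a POST.
    conn.request("POST", "/xml-gateway", body, {"Content-Type": "text/xml"})
    response = conn.getresponse()
    print(response.status, response.read())
    conn.close()

None of the database semantics survive into the protocol itself, which is precisely the kind of information a purpose-built data protocol ought to carry.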
After looking at these existing technologies, we can gain a pretty clear understanding of the features that an Internet Data Protocol would need to provide. It would need to offer a consistent implementation (TCP port number, character set, etc.), and a consistent set of verbs (such as INSERT, UPDATE and DELETE).
In fact, I would think that the first version of such a protocol could be something as simple as a listener that did nothing more than accept standard SQL input, returning data and errors in a predefined form. Such an implementation would likely be a five-page RFC: "support SQL 92 commands and object types on TCP port XX, returning data and errors using this format."
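As a rough sketch of what that five-page RFC might describe, here is a toy listener along those lines. SQLite stands in for whatever backend a vendor would actually expose, port 9999 is a placeholder for the hypothetical "TCP port XX", and the response format (a status line followed by tab-delimited rows) is likewise invented for the example.

    # Toy sketch: accept one SQL statement per connection, run it against
    # the local database, and return rows or an error in a fixed,
    # vendor-neutral text form.
    import socketserver
    import sqlite3

    IDP_PORT = 9999        # placeholder for the as-yet-unassigned port
    DB_PATH = "local.db"   # the backend this listener exposes

    class IDPHandler(socketserver.StreamRequestHandler):
        def handle(self):
            statement = self.rfile.readline().decode("utf-8").strip()
            conn = sqlite3.connect(DB_PATH)
            try:
                cursor = conn.execute(statement)
                rows = cursor.fetchall()
                conn.commit()
                # fixed response form: status line, then one tab-delimited row per line
                self.wfile.write(b"OK %d\r\n" % len(rows))
                for row in rows:
                    line = "\t".join(str(col) for col in row) + "\r\n"
                    self.wfile.write(line.encode("utf-8"))
            except sqlite3.Error as err:
                # errors come back in a predictable form too
                self.wfile.write(("ERR %s\r\n" % err).encode("utf-8"))
            finally:
                conn.close()

    if __name__ == "__main__":
        with socketserver.TCPServer(("", IDP_PORT), IDPHandler) as server:
            server.serve_forever()

Trivial as it is, a client written against this listener neither knows nor cares which database engine sits behind it, which is the whole idea.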
Future versions would of course need to add support for arbitration, feature negotiation, version control, asynchronous operations and vendor-specific extensions. But the first version could be very, very simple. Simple enough to help drive the adoption of business-to-business information sharing in the small- and medium-enterprise markets, anyway.
Making It Happen
Writing new standards isn't easy, at least not for folks like me who have no real pull in the market. In order for this kind of thing to really fly, a variety of things need to happen:
- The IETF has to sanction the effort: We can't leave this effort up to the vendor community. We've tried that for the past twenty years, and are no closer now than we were then. The only way that this type of effort will succeed is if the IETF sanctions an IDP working group. It is my bet that vendors will fall all over themselves to get involved with such an effort at the preliminary stages. As a result of this involvement, products will ship that support the new specification.
- Vendors have to involve themselves: Rather than being strong-armed into participation, the database vendors ought to be looking for ways that they can leverage such a standard to their own benefit. History teaches us that the companies who embrace a technology and push it along are the ones who benefit, and not those who resist the paradigm shift. Surely one or two of the major players can see this as an opportunity to establish themselves as the leaders in a new market.
- Developers have to support it: It's one thing for vendors to support a standard, but it's another for application developers to do so. Rather than writing to ODBC and vendor-specific drivers, the ISV market will have to start supporting the new IDP service if they wish to see its adoption by their customers.
- Users have to demand it: This should be self-explanatory. If you don't buy the products, then no vendor will be motivated to build this support into their offerings. You are going to be the primary beneficiaries here, so do your part and don't leave it up to the next guy. Think about this: would SMTP have succeeded if you hadn't used it? Why is this any different?
- The press has to enforce it: Last but not least, the press has to cover such a technology if the masses are to learn about it. Let's face it: the majority of the user community consists of lemmings who implement based on word-count coverage (see: Windows NT), so you'll have to give this technology the same space you gave LDAP and IMAP if you also want the technology to move ahead.
The good news here is that I've already had some preliminary talks with the Area Directors of the IETF's Applications Area, and have received very positive responses so far. Taking this to the next step involves finding somebody with influence to chair a working group and bring version one of the protocol together quickly.
As stated, I believe that this person should come from the vendor community. There are people on this newsletter's mailing list at Oracle, Sybase and Solid Technologies, among others, any one of whom I would think could lead this effort easily. If you are interested, please contact me and we'll get the ball rolling. I, as a user, am most anxious to see this technology come to life quickly.