by Jon Udell

Managing data the XML way

analysis
Sep 19, 2002 · 4 mins

XML documents, and the ability to transform them, are vital to the business Web

The XPath and XSLT techniques I mentioned last week will become as fundamental to this generation of developers as SQL was to the last. Likewise XQuery, the query language for XML data that’s finally emerging from years of gestation. Likewise a yet-unspecified XQuery extension (or companion) for the declarative updating of XML. But just as SQL’s table-oriented approach to data management took a long time to sink in, so too will these document-oriented disciplines.

We’re long overdue for standard ways to manage data that sits in documents, in addition to data that sits in relational stores. Even after SQL won its long struggle for acceptance, it had failed to capture a lot of the pre-relational data that legacy systems — to this day — continue to manage. But all of our operational data put together are just a drop in the bucket. Documents are the lifeblood of the information economy. Recognizing this reality, the Web services stack is migrating away from a remote procedure-call model and toward a document-exchange model.

Is the Web page you visit in order to buy an airline ticket or enroll for a conference a document or an application? The Web’s genius was to forever blur that distinction. Documents are applications. They are also self-contained mobile databases, the design of which has been, historically, the province of an elite cadre of SGML geeks. Often, there’s been no design at all, as Tim Bray recently pointed out:

“Many on this list will find it shocking, but lots of important XML dialects don’t have any DTDs or schemas. Particularly in the application-glue space. People email back and forth some examples, they cut some code, and then everything’s working and they’re too busy to go back and write a schema.” [1]

This is a wonderfully pragmatic statement from one of XML’s primary inventors who, as I mentioned on my Weblog [2], has also taken a long and patient view of the transition from HTML to XML. Nevertheless, the XML document is a species of database. XML technologies (XPath, XSLT, XQuery) are among the tools we’ll use to manage these databases.
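To make the document-as-database idea concrete, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree module, which supports only a limited subset of XPath, so treat it as an illustration rather than a full XPath engine. The sample document and its element names (orders, order, customer, total) are invented for this example and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented.
ORDERS_XML = """
<orders>
  <order id="1001" status="shipped">
    <customer>Acme</customer>
    <total>250.00</total>
  </order>
  <order id="1002" status="pending">
    <customer>Globex</customer>
    <total>75.50</total>
  </order>
  <order id="1003" status="shipped">
    <customer>Initech</customer>
    <total>120.25</total>
  </order>
</orders>
"""


def shipped_customers(xml_text):
    """Return customer names for orders whose status attribute is 'shipped'."""
    root = ET.fromstring(xml_text)
    # A path expression with an attribute predicate, loosely analogous to
    # SELECT customer FROM orders WHERE status = 'shipped'.
    path = "./order[@status='shipped']/customer"
    return [node.text for node in root.findall(path)]


def shipped_total(xml_text):
    """Sum the <total> values of shipped orders."""
    root = ET.fromstring(xml_text)
    path = "./order[@status='shipped']/total"
    return sum(float(node.text) for node in root.findall(path))


if __name__ == "__main__":
    print(shipped_customers(ORDERS_XML))
    print(shipped_total(ORDERS_XML))
```

The point of the sketch is that the query works with no schema or DTD in sight, which is exactly the situation Bray describes: the path expression is the only contract between the code and the document.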

SQL, it shouldn’t need to be said (but does need to be said), isn’t going away. Object databases presented themselves as alternatives, but weren’t. More recently XML databases present themselves as alternatives, but aren’t. With each release of the RDBMS engines from IBM, Oracle, and Microsoft, XML path expressions and transformations become more integral. There’s debate about whether documents sit as blobs in the database or dissolve into their constituent elements and are digested more completely. One way or another, wrangling of semi-structured documents side by side with relational tables is becoming a core competency.

As I pointed out in a May 2002 InfoWorld article entitled “Hyperlinks Matter” [3], XSLT nicely complements the pipelined architecture of the Web. Useful services are sometimes just transformations of other services. Transformation itself is a crucial service. I’ve been relying on the W3C’s public XSLT service to produce a variant of my Weblog’s RSS feed. Amazon has fielded another public XSLT service in conjunction with its recently created Web Services API [4].
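The idea that a useful service can be just a transformation of another service can be sketched as follows. Python's standard library has no XSLT processor, so this example swaps in plain ElementTree code for the stylesheet; it is not how the W3C or Amazon services work internally. The feed content, URLs, and the function names filter_feed and titles are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS input; titles and URLs are placeholders.
RSS_XML = """
<rss version="0.92">
  <channel>
    <title>Example Weblog</title>
    <item><title>XML as database</title><link>http://example.com/a</link></item>
    <item><title>Lunch notes</title><link>http://example.com/b</link></item>
    <item><title>XML pipelines</title><link>http://example.com/c</link></item>
  </channel>
</rss>
"""


def filter_feed(rss_text, keyword):
    """Return a new RSS document keeping only items whose title contains keyword."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    # Copy the item list first so removal does not disturb iteration.
    for item in list(channel.findall("item")):
        title = item.findtext("title", default="")
        if keyword.lower() not in title.lower():
            channel.remove(item)
    # XML in, XML out: the result can feed another transformation.
    return ET.tostring(root, encoding="unicode")


def titles(rss_text):
    """List the item titles in an RSS document."""
    root = ET.fromstring(rss_text)
    return [node.text for node in root.findall("./channel/item/title")]


if __name__ == "__main__":
    xml_only = filter_feed(RSS_XML, "xml")
    print(titles(xml_only))
    print(titles(filter_feed(xml_only, "pipelines")))
```

Because each stage consumes and produces an XML document, stages chain freely. That composability is the property that makes a transformation service fit the pipelined architecture of the Web.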

The educational value of these public services is huge. Why don’t IBM, Oracle, and Microsoft offer similar services? While they’re at it, why not showcase their emerging technologies for storage and retrieval of XML by giving every developer 10MB of Xperanto [5], Yukon [6], and SQL/XML [7] storage to play with? Like Microsoft’s TerraServer, which aims to demonstrate NT and SQL Server scalability, this would be a great marketing play — and more. We’re all going back to school to learn this stuff. Tuition-free online labs are in everyone’s best interest.

1. https://lists.xml.org/archives/xml-dev/200209/msg00056.html

2. https://weblog.infoworld.com/udell/2002/09/09.html#a406

3. /articles/pl/xml/02/05/20/020520pllinks.xml

4. https://associates.amazon.com/exec/panama/associates/join/developer/faq.html

5. https://xperanto.dfw.ibm.com/demo/

6. https://msdn.microsoft.com/vstudio/productinfo/roadmap.asp#section4

7. https://otn.oracle.com/tech/xml/xmldb/htdocs/sql_xml_codeexamples.html