The power behind the SOA repository

The nature of SOA data requires a native XML data management server

The basic tenet of a service-oriented architecture (SOA) is to provide loose coupling between applications. Data is produced by and for these applications, so it is imperative that this data be stored and handled optimally. Given the pervasive nature of each application in an SOA, the way this data is stored is typically location-dependent and specific to the application.

An SOA repository is a mechanism that handles the persistence of distributed SOA data. It is a complex and sophisticated enterprise-grade technology that not only handles persistence and caching, but also enables lifecycle management, security, discovery, and transformation of distributed data from diverse service-oriented applications such as silo applications, Web portals, business processes, and mobile applications.

SOA data is basically transient and streaming in nature. It thus necessitates a native XML data store that aggregates the data relevant to a specific service, regardless of the applications used, rather than assigning the data to the individual applications that make up that service. Otherwise, data becomes difficult to access and cost-prohibitive to store and replicate.

SOA data is typically stored in relational databases and filesystems, but these are not entirely capable of handling SOA data. Elliotte Harold, in his article “Managing XML Data: Native XML Databases,” (IBM developerWorks, June 2005) clearly addresses the need for and benefits of a native XML database. In his words, “When your only tool is a hammer, everything looks like a nail. When your only tool is a relational database, everything looks like a table. Reality, however, is more complicated than that. Data often isn’t tabular and can benefit from a tool that more closely fits its natural structure. When that data is XML, the appropriate tool for managing it might well be a native XML database.”

Being fundamentally XML, SOA data cannot be easily modeled in relational databases. The inflexibility of relational database schemas does not lend itself well to the ever-evolving nature of schemas in an SOA, and more so when trading partners collaborate across enterprises. Filesystems also do not provide advanced querying and management capabilities, which is a typical need in an SOA. For these compelling reasons, we strongly believe that data created as XML should be persisted, managed, and treated as XML.

Consider the complex and ever-evolving list of Web services standards. They include a number of OASIS initiatives such as Web Services Business Process Execution Language (WSBPEL), Web Services Security, Web Services Distributed Management (WSDM), ebXML Collaboration Protocol Profile and Agreement (CPPA), and Web Services Policy Framework (WS-Policy), as well as numerous World Wide Web Consortium initiatives and REST-based XML artifacts. Wading through this alphabet soup of standards, one realizes that, at their core, these standards are represented by XML Schemas such as the WS-Policy XML Schema, the Collaboration Protocol Profile (CPP) XML Schema, and the Collaboration Protocol Agreement (CPA) XML Schema. This further strengthens the case that if SOA data is created in XML, it should be persisted, managed, and treated as XML.

Consider Figure 1’s WS-Policy Schema file, ws-policy.xsd, associated with the WS-Policy Framework initiative, which standardizes how policies are to be communicated between service consumers and providers.

As shown in Figure 1, data and metadata associated with WS-Policy can be represented in XML and, hence, stored and managed with ease in an XML persistence mechanism. Similarly, CPP, CPA, and Web Services Security details can also be natively stored and managed in an XML persistence mechanism.

Mid-tier caching in an SOA

SOAs need persistence mechanisms to persist information such as the state of a business step in an application, the state of a long-running business process in execution, Web services management, monitoring information, lists of available Web services, and more. Often, much of this information is frequently requested and accessed, thus making the case for caching in the middle tier, which also alleviates the performance bottleneck that can be caused by multiple requests to the same information store.

With SOA data and metadata being XML, we propose a simple, yet effective, mid-tier caching architecture that includes an XML database as a mid-tier cache along with a number of XQuery-powered services. An SOA repository can enable increased performance, reliability, functionality, and usability of SOA artifacts through an effective mid-tier caching architecture powered by a number of important services as follows:

Policy-based caching service: For increased performance and quality of service (QoS)

A policy-based caching service can enable the setup of XQuery-based policies to cache result sets of low-performing services. These policies can also be constructed to include the time-to-live before the cache is refreshed. Policies based on time-of-day requests can determine if the data in the cache is valid for this request or if the originating source must be used. Also, policies based on service availability ensure that if the service is not available, results are obtained from the cache. A cache can be refreshed based on time and other configurable parameters by letting policies trigger the XML persistence mechanism. The design can also include dynamic just-in-time trace logging for service calls made by the XML persistence mechanism.
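As a rough illustration of a time-to-live policy, the following sketch decides whether a cached result may be served or must be refreshed. The cacheEntry and cachePolicy elements, their attributes, and the local:is-fresh function are hypothetical names invented for this example, not part of any product or standard, and the current time is passed in as a fixed value so that the outcome is repeatable. A real policy would more likely use fn:current-dateTime().

```xquery
(: Hypothetical cache entry and policy; all element and attribute names are illustrative. :)
declare variable $entry :=
  <cacheEntry service="getPODetail" cachedAt="2005-06-01T09:00:00Z">
    <result><po id="1001"/></result>
  </cacheEntry>;

declare variable $policy :=
  <cachePolicy service="getPODetail" timeToLive="PT15M"/>;

(: An entry is fresh while "now" is earlier than cachedAt + timeToLive. :)
declare function local:is-fresh($e as element(cacheEntry),
                                $p as element(cachePolicy),
                                $now as xs:dateTime) as xs:boolean {
  $now lt xs:dateTime($e/@cachedAt) + xs:dayTimeDuration($p/@timeToLive)
};

(: Serve from the cache when fresh; otherwise signal that the source must be called. :)
if (local:is-fresh($entry, $policy, xs:dateTime("2005-06-01T09:10:00Z")))
then $entry/result/*
else <refresh service="{ $policy/@service }"/>
```

With the sample values, ten minutes have elapsed against a fifteen-minute time-to-live, so the query returns the cached po element. Policies based on time of day or service availability could be expressed as further conditions in the same function.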

Data repurposing service: For richer functionality and improved performance

A data repurposing service can enable additional filtering and search criteria on content returned from a given service. Additionally, XQuery can be used to drive transformations for repurposing the content and provide analytics and reporting on returned content. XQuery can also deliver portions of result sets and create a final result set based on aggregation of content from multiple services.
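The filtering and aggregation described above might look like the following sketch, which joins the results of two services and reports only customers whose orders exceed a threshold. The orders and customers documents are hypothetical inline data standing in for service responses that a repository would normally fetch with doc() or collection().

```xquery
(: Hypothetical result sets returned by two different services. :)
declare variable $orders :=
  <orders>
    <order id="1001" customer="C1" total="250.00"/>
    <order id="1002" customer="C2" total="75.50"/>
    <order id="1003" customer="C1" total="120.25"/>
  </orders>;

declare variable $customers :=
  <customers>
    <customer id="C1" name="Acme"/>
    <customer id="C2" name="Globex"/>
  </customers>;

(: Aggregate orders per customer and keep only totals above 100. :)
<report>
{
  for $c in $customers/customer
  let $own := $orders/order[@customer = $c/@id]
  where fn:sum($own/xs:decimal(@total)) gt 100
  order by $c/@name
  return
    <customer name="{ $c/@name }"
              orderCount="{ fn:count($own) }"
              total="{ fn:sum($own/xs:decimal(@total)) }"/>
}
</report>
```

For the sample data, the report contains a single customer element for Acme, with two orders totaling 370.25.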

Data abstraction service: For easier deployment and maintenance

A data abstraction service can eliminate the need for Web services to be aware of individual datasources. Figure 2 shows a better use of Web services by eliminating the need to develop separate clients and Web services for each operation. Datasource management for disparate datasources such as JDBC, HTTP, WSDL, and filesystems can be enabled using this service.

In addition, since services can run on any system, an SOA repository can be used to enable the federation of services in an SOA. It can also alleviate performance issues for central-tier processes that abstract remote services by collocating the data as close to the data processing as possible. As a persistence layer at the central tier, an SOA repository can store transactional information for many purposes, including analysis and integrity management tasks such as logging. Handling abstracted and composite data elements at the central tier also enables a centralized repository for SOA data.

Exciting new SOA technologies such as enterprise service buses and orchestration engines can employ an SOA repository for state management, workflow persistence, and message persistence. An SOA repository can also provide the persistence backbone in SOA registries, whether they are UDDI (Universal Description, Discovery, and Integration) or ebXML registries, to enable the discovery, publishing, and subscription of services.

The need for complex and sophisticated XML data management for an SOA

As already discussed, Web services and SOAs create huge amounts of complex and sophisticated new data in the form of data-rich XML messages exchanged between applications, which must be stored so they can be effectively audited and analyzed. When we look at the various technologies that enable and empower an SOA, it is apparent that an SOA’s key characteristics and benefits form the basis for many vendor offerings in this space. As shown in Figure 3, the following functionalities form the core SOA and Web services infrastructure:

Web services management
Web services monitoring
SOA governance
Web services security
SOA persistence and caching
SOA discovery, publishing, and subscription

We can map the use of XML data management as an enabling and/or empowering technology at various points in this infrastructure in the following ways:

SOA metadata persistence
SOA discovery, publishing, and subscription
Persistence of Web services management data
Acceleration caching
Service aggregation
Web services policy caching and management

XML data management in an SOA can also enable and/or empower:

Persistence of monitoring, logging/auditing
Persistence of security capabilities
SOA governance
SOA OLAP data and metadata transformations and persistence
Trading partner profile and agreement persistence
Message persistence
State management
Schema versioning

Native XML data management server overview

An XML data management server (XDMS) is much more than a data store for XML data. An XDMS is a sophisticated system that must be designed with flexibility, scalability, and performance in mind. The reality is that most XML data management servers do not measure up to these exacting demands. Typically with an XDMS, no prior knowledge of the XML document’s structure is necessary. Any valid XML document, such as a Web Services Description Language (WSDL) file, a CPPA document, an XML Schema, or an Extensible Stylesheet Language Transformations (XSLT) stylesheet, can be inserted at will, and the native XML data management server will automatically create the internal structures required to store it.

In addition, XML data management servers support transactions, indexing, schema or DTD validation (some support schema versioning), extended connectivity, user- and group-based security, plus backup/restore and server mirroring. An XDMS solution must also be able to store non-XML data (such as binary data), thus providing a solution for storing any other content you may require.

The native XML interface for SOA repository operations is XQuery. To tap into the full potential of XML databases, XQuery is the way to create, manipulate, examine, and manage XML data. XQuery also provides a standard way to unify disparate datasources and make them all appear to be a single server.

Introducing XQuery

XQuery is a functional language; as such, expressions are composed and combined to create arbitrarily complex queries over one or more sets of XML data. XQuery offers both strongly-typed mechanisms using XML Schema and DTD, and weakly-typed mechanisms for handling raw XML data.

The XQuery data model

The XQuery data model is more extensive than the standard XML data model of XML Infoset and Post-Schema Validation Infoset (PSVI). XQuery is defined in terms of operations on the data model, but it does not restrict how documents and instances in the data model are constructed. The data model consists of the XML data being queried, any intermediate values, and the final query results. It supports intermediate expressions that can result in values that are not XML (for example, a list of integers or strings), XML fragments, and both typed and untyped data.

XQuery and XML Schema have the same type concept for XML data. XQuery provides built-in types based on XML Schema and support for user-defined Schema types. XQuery also supports additional data types outside the existing XML Schema data types.

The components of the XQuery data model and type system are as follows:

Items and sequences

An item is a single node or atomic value (a singleton), which is equivalent to a sequence of length one. A sequence is made up of a series of items. Sequences can be empty, but cannot contain other sequences. Every value in the data model is a sequence of zero or more items.
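A short query can make these rules concrete. In this sketch, which uses only standard functions and made-up values, a nested sequence is flattened, the empty sequence contributes nothing, and a single item compares equal to a sequence of length one.

```xquery
(: Nested and empty sequences are flattened: four items remain. :)
let $nested := (1, (2, 3), (), "four")
return
  <sequence-facts>
    <count>{ fn:count($nested) }</count>
    <singleton-equals-item>{ fn:deep-equal(5, (5)) }</singleton-equals-item>
    <empty-count>{ fn:count(()) }</empty-count>
  </sequence-facts>
```

The three elements in the result contain 4, true, and 0 respectively.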

Items in an XQuery data model can be:

Typed: Receive their type annotations from XML Schema or DTD documents.
Untyped: Employ untyped semantics to coerce string values into their desired typed values.

Atomic values

Atomic values are singletons with an atomic type derived from the XQuery type xs:anyAtomicType (named xdt:anyAtomicType in earlier working drafts of the specification).
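The difference between typed and untyped items, and the place of atomic values in the type hierarchy, can be sketched as follows. The quantity element is hypothetical, and the example assumes a processor that follows the final XQuery 1.0 Recommendation, in which the untyped and base atomic types live in the xs namespace.

```xquery
(: Hypothetical element; no schema is applied, so its content is untyped. :)
let $qty := <quantity>5</quantity>
return (
  (: Atomizing an unvalidated element yields xs:untypedAtomic. :)
  fn:data($qty) instance of xs:untypedAtomic,
  (: Untyped semantics: the string "5" is coerced to a number for arithmetic. :)
  $qty + 1,
  (: An explicit cast produces a typed atomic value. :)
  xs:integer($qty) instance of xs:integer,
  (: Every atomic type derives from xs:anyAtomicType. :)
  xs:integer($qty) instance of xs:anyAtomicType
)
```

The query evaluates to the sequence true, 6, true, true.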

Nodes

As in XML, XQuery has seven types of nodes. Each node has a unique identity and an inherent ordering in the document (document order).

Types of nodes

Types of nodes that build an XML tree in the data model are:

Document

Represents the entire XML document, including basic information (base URI, children, unparsed entities, document URI).

Element

Represents elements within a document, including basic information (base URI, node name, parent, type, children, attributes, namespaces).

Attribute

Represents attributes within a document, including basic information (node name, string value, parent, and type).

Text

Represents XML character content within a document, including basic information (content, parent).

Namespace

Represents namespaces within a document, including basic information (prefix, URI, parent). Namespace nodes are used to map namespace prefixes to URIs.

Processing instruction

Represents processing instructions within a document, including basic information (target, content, base URI, parent). Contains instructions for applications in documents and starts with a target to identify the application where the instruction is directed.

Comment

Represents comments within a document, including basic information (content, parent).

XQuery expressions have a static type and a dynamic type. The static type pertains to the expression and is applied at compile-time. The dynamic type pertains to the value that results from the expression and is applied at runtime.

Static typing versus dynamic typing:

Static: Set of type-reference rules that match the query to the document.
Dynamic: Set of value-reference rules that govern how the query is processed.

XQuery syntax

Several types of XQuery expressions can be used in a query:

Primary expressions

Basic primitives, which include literals, variables, function calls, and parenthesized expressions.

Path expressions

Used to pattern-match arbitrary nodes according to their name and type by navigating through the hierarchical structure and locating nodes.
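As an illustration, the following sketch navigates a small WSDL-like fragment by name, by predicate, and by kind test. The fragment is hypothetical inline data, and the service and operation names are borrowed from the examples later in this article.

```xquery
declare namespace wsdl = "http://schemas.xmlsoap.org/wsdl/";

(: Hypothetical inline fragment standing in for a stored WSDL document. :)
declare variable $defs :=
  <wsdl:definitions name="PurchaseOrderWS">
    <wsdl:portType name="PurchaseOrderPort">
      <wsdl:operation name="getPODetail"/>
      <wsdl:operation name="submitPO"/>
    </wsdl:portType>
  </wsdl:definitions>;

(
  (: Child steps by name: the names of all operations. :)
  $defs/wsdl:portType/wsdl:operation/@name/fn:string(),
  (: Descendant step with a predicate, then the parent axis. :)
  $defs//wsdl:operation[@name = "getPODetail"]/../@name/fn:string(),
  (: Kind test: count every descendant element regardless of name. :)
  fn:count($defs//element())
)
```

The result is the sequence "getPODetail", "submitPO", "PurchaseOrderPort", 3.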

Direct and computed constructors

Used to create nodes and provide structure for the XQuery result. These expressions are capable of composing arbitrary results into new XML documents.
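The two constructor styles can be compared side by side. In this sketch, with hypothetical element names and values, the same Service element is built once with a direct constructor and once with computed constructors.

```xquery
let $status := "up"
(: Direct constructor: XML-like syntax with enclosed expressions in braces. :)
let $direct :=
  <Service name="PurchaseOrderWS"><Status>{ $status }</Status></Service>
(: Computed constructors: node kinds, names, and content given as expressions. :)
let $computed :=
  element Service {
    attribute name { "PurchaseOrderWS" },
    element Status { $status }
  }
return ($direct, $computed, fn:deep-equal($direct, $computed))
```

Both constructors produce the same element, so the final item in the result is true. Computed constructors are the more useful form when the element or attribute name is itself only known at runtime.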

FLWOR expressions

The for and let clauses bind variables to values, which the rest of the expression then uses. A for clause iterates over the items of a sequence, binding its variable to each item in turn, and can be nested; a let clause binds its variable to an entire intermediate result without iterating.
Where clauses contain one or more predicates used to filter a set of values and limit it to only those values that meet the required criteria.
Order-by clauses sort values in a result stream.
Return clauses use the values to build the results.
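The following sketch uses all five clauses in one expression. The services element and its avgMillis attribute are hypothetical monitoring data invented for this example.

```xquery
(: Hypothetical monitoring data for three services. :)
declare variable $services :=
  <services>
    <service name="inventory" avgMillis="420"/>
    <service name="billing" avgMillis="95"/>
    <service name="shipping" avgMillis="310"/>
  </services>;

for $s in $services/service               (: iterate over each service :)
let $ms := xs:integer($s/@avgMillis)      (: bind an intermediate value :)
where $ms gt 100                          (: keep only the slow services :)
order by $ms descending                   (: slowest first :)
return <slow name="{ $s/@name }" avgMillis="{ $ms }"/>
```

The query returns two slow elements, inventory (420) followed by shipping (310).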

Functions and operators

Includes:

Arithmetic operators
Comparison operators
Node sequence operators
Logical operators
Built-in functions (accessor, numeric, string, Boolean, date, time, duration, anyURI, QName, node, sequence, aggregate, context)
User-defined functions
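A few of these can be combined in a short sketch. The local:endpoint-host function is a hypothetical helper written for this example, and the URL is a placeholder; the other calls are standard built-in functions and operators.

```xquery
(: Hypothetical user-defined function: extract the host from an endpoint URL. :)
declare function local:endpoint-host($location as xs:string) as xs:string {
  fn:substring-before(fn:substring-after($location, "://"), "/")
};

(
  local:endpoint-host("http://example.com/services/PurchaseOrderWS"),
  fn:upper-case("soap"),     (: built-in string function :)
  fn:avg((2, 4, 9)),         (: built-in aggregate function :)
  (1, 2) = (2, 3)            (: general comparison: true if any pair matches :)
)
```

The result is the sequence "example.com", "SOAP", 5, true.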

Conditional expressions

If-then-else expressions used with Boolean conditions.

Logical expressions

The expressions and and or.

Expressions on SequenceType

Describe an XQuery value when referring to a type in an XQuery expression.
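SequenceType syntax appears in function signatures and in expressions such as instance of, typeswitch, and castable as. The sketch below uses a hypothetical local:describe function to show how one value can be matched against several sequence types.

```xquery
(: Hypothetical helper: classify a value by the first sequence type it matches. :)
declare function local:describe($v as item()*) as xs:string {
  typeswitch ($v)
    case empty-sequence() return "empty sequence"
    case xs:integer       return "one integer"
    case xs:string+       return "one or more strings"
    case element()        return "one element"
    default               return "something else"
};

(
  local:describe(()),
  local:describe(42),
  local:describe(("a", "b")),
  local:describe(<Policy/>),
  (1, 2, 3) instance of xs:integer+,
  "2005-06-01" castable as xs:date
)
```

The result is the sequence "empty sequence", "one integer", "one or more strings", "one element", true, true.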

Furthermore, XQuery implementations can be extended. More sophisticated implementations include support for such datasources as filesystems, HTTP, Web services, Java Database Connectivity (JDBC), Java Message Service, and more.

In Figure 4, XQuery provides the glue for a native XML data management server used as a flexible and standards-based persistence mechanism.

We recommend the use of a native XML data management server, as illustrated in Figure 4, as a best-practice approach for persistence within an SOA. A native XML data management server can be used to enable the federation of services in an SOA, as services are location-independent by nature. David S. Linthicum, in a blog post titled “The Importance of Persistence within a SOA,” has called out federation of services, performance issues, storage, and management of transactional data and centralized metadata as key aspects of information-oriented integration using services. Federating services can also alleviate performance issues by enabling the collocation of data closer to the composites, where the actual data processing occurs. Having a native XML data management server at the central tier also allows architects to store transactional information for analysis and logging.

SOA powered by XQuery examples

The power of XQuery in an SOA can be realized by many sophisticated queries, examples of which follow below. These examples demonstrate how a specific implementation of XQuery can be coupled with a specific native XML data management server as the technology for seamlessly interacting with SOA artifacts.

Example 1. Use XQuery to insert operations into a WSDL

...
declare namespace wsdl = "http://schemas.xmlsoap.org/wsdl/";
insert
<wsdl:operation name="testInsertOperation" parameterOrder="id">
      <wsdl:input message="impl:testInsertOperationRequest" name="testInsertOperationRequest"/>
      <wsdl:output message="impl:testInsertOperationResponse" name="testInsertOperationResponse"/>
    </wsdl:operation>
after doc("xxx:///SOARepository/webservices/PurchaseOrderWS.xml")
   /wsdl:definitions/wsdl:portType/wsdl:operation[@name="getPODetail"]
...

Example 2. Use XQuery to check the availability of a service and update the status in the registry

...
declare namespace wsdl = "http://schemas.xmlsoap.org/wsdl/";
declare namespace wsdlsoap = "http://schemas.xmlsoap.org/wsdl/soap/";
(: Report each registered service whose WSDL endpoint cannot be retrieved :)
for $i in collection('xxx:///SOARepository/webservices')/wsdl:definitions
where fn:not(fn:doc-available(
   fn:concat($i/wsdl:service/wsdl:port/wsdlsoap:address/@location, '?wsdl')))
return
<Service>
   <TimeStamp>{fn:current-date()}</TimeStamp>
   <Status>down</Status>
</Service>
...

Example 3. Use XQuery to measure the reliability of a SOAP message

...
declare namespace wsdl = "http://schemas.xmlsoap.org/wsdl/";
declare namespace msg = "http://schemas.xyz.com/msg";
declare namespace SOAP = "http://schemas.xmlsoap.org/soap/envelope/";
<ExpiredMessage>
{
for $i in doc('file:///c:/SOARepository/ReliableMessaging/RM14.xml')/SOAP:Envelope
where $i/SOAP:Header/ReliableMessage/TimeToLive < current-dateTime()
return
   $i
}
</ExpiredMessage>
...

Example 4. Use XQuery to retrieve all TokenTypes used in Policy document

...
declare namespace wsp="http://schemas.xmlsoap.org/ws/2004/09/policy/";
declare namespace wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd";
<Tokens>
{
 for $policy in collection('xxx:///SOARepository/WSPolicy')/wsp:Policy
  return  $policy/wsp:ExactlyOne/wsp:All/wsse:SecurityToken/wsse:TokenType
}
</Tokens>
...

Conclusion

In this article, we have presented a native XML data management server as the most logical approach for storing SOA data, as this data is basically XML. Some may argue that such data and metadata can just as well be stored and managed in a relational database management system. However, data and metadata born as XML must then be transformed into a relational representation, leading to a sizeable overhead in mapping and management. As the amount of SOA data and metadata increases in an enterprise and across trading partner boundaries, this complexity becomes even more of an issue.

Furthermore, since this data is frequently accessed and consumed, a native XML data management server powered by XQuery can be used to provide a standards-based mid-tier cache to reduce performance overhead and increase scalability and reliability in an SOA.

We would like to thank Miko Matsumura, vice president of marketing at Infravio, former Java Evangelist at Sun Microsystems, and co-creator of SOA Blueprints; Sacha Schlegel, an expert on ebXML and software engineer with Cyclone Commerce; and Frank Cohen, director of solutions engineering with Raining Data Corporation, for technically reviewing this article.

Ash Parikh is the director of technology and development for the Enterprise Applications Group at Raining Data Corporation. He is a named expert in the field of SOA and distributed computing and has presented and authored abstracts for OASIS Symposium 2005, Delphi BPX Summit 2004, Delphi Enterprise On-Demand 2004, JavaOne 2004, JavaOne 2003, BEA e-World 2002, and JavaOne 2002. Parikh has more than 15 years of IT experience and is an active member on a number of Java Specification Requests in the Java Community Process and in OASIS technical committees. He is also the president of the Bay Area Chapter of the Worldwide Institute of Software Architects. Parikh is the collaborating author of Oracle9iAS Building J2EE Applications (Osborne Press, November 2002), and has also authored several technical articles in journals such as JavaWorld, XML-Journal, Java Pro, Web Services Journal, ADTmag, Softwaremag.com, and Java Skyline.

Robert Smik is a lead architect/team lead with the Enterprise Applications Group at Raining Data Corporation. He has been involved in application and software design and development for more than 15 years. His experience includes design and development of highly complex database systems, architecting multitier Web environments, and architecting and developing various connectivity solutions, products, and smart cards, in addition to SOA and data aggregation tools. Smik is an active member of HL7 and CDISC. He has also co-authored articles in XML-Journal and JavaWorld.

Premal Parikh is a lead architect/team lead with the Enterprise Applications Group at Raining Data Corporation. He has more than 10 years of experience in the software industry, which includes design and architecting products, along with prototyping, analysis, project modeling, and development of portals for the B2B marketplace. Parikh is also an active member on a number of OASIS Web services standards technical committees.
