by Dave Linthicum

When Thinking SOA, Think DFSS

analysis
Jun 29, 2007
2 mins

Those who are creating an SOA are jumping right to services, and in most cases that’s a huge mistake. I understand that the S in SOA stands for services; however, without a solid understanding of the underlying data, the services you create won’t be of much use. Indeed, the foundation of a good service design is a good data design, or the use of data abstraction. Without that, you’re putting lipstick on a very ugly pig, and you’ll have to revisit the data issue at some point in the future. Thus, when you’re thinking SOA, think DFSS, or Data First, Services Second.

The point of DFSS is that you can’t deal with information you don’t understand, and that includes information bound to behavior (services). Thus, it is extremely important for you to identify all application semantics—metadata, if you will—that exist in your domain, which will allow you to properly deal with that data within the context of your SOA.

Understanding application semantics establishes the way and form in which a particular application refers to properties of the business process. For example, the very same customer number in one application may have a completely different value and meaning in another application. Understanding the semantics of each application helps ensure that no contradictory information arises when the application is integrated with other applications at the information or service levels.
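To make the customer number example concrete, here is a minimal sketch in Python of a semantic catalog that records how each application defines a field and flags fields whose definitions disagree. The application names (`billing`, `crm`), the field names, and the `FieldSemantics` and `find_conflicts` helpers are all hypothetical and exist only for illustration; a real catalog would likely track far more metadata.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSemantics:
    """How one application defines a business field (all names hypothetical)."""
    application: str
    field: str
    data_type: str
    meaning: str


def find_conflicts(entries):
    """Return fields whose type or meaning differs between applications."""
    by_field = defaultdict(list)
    for entry in entries:
        by_field[entry.field].append(entry)

    conflicts = {}
    for field, definitions in by_field.items():
        # A field is in conflict when applications disagree on type or meaning.
        signatures = {(d.data_type, d.meaning) for d in definitions}
        if len(signatures) > 1:
            conflicts[field] = definitions
    return conflicts


# Hypothetical catalog entries gathered from two candidate systems.
catalog = [
    FieldSemantics("billing", "customer_number", "integer", "account that is invoiced"),
    FieldSemantics("crm", "customer_number", "string", "individual contact at a company"),
    FieldSemantics("billing", "invoice_date", "date", "date the invoice was issued"),
    FieldSemantics("crm", "invoice_date", "date", "date the invoice was issued"),
]

conflicts = find_conflicts(catalog)
for field, definitions in conflicts.items():
    print(f"{field}: conflicting definitions")
    for d in definitions:
        print(f"  {d.application}: {d.data_type}, {d.meaning}")
```

In this sketch only `customer_number` is reported, because the two applications agree on `invoice_date`. Catching that kind of disagreement before any service is designed is the point of putting data first.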

Defining application semantics is a tough job since many of the existing systems you’ll be dealing with are older, proprietary, or perhaps both. The first step in identifying and locating semantics is to create a list of candidate systems. This list will make it possible to determine which data repositories exist in support of those candidate systems.

Any technology that can reverse-engineer existing physical and logical database schemas will prove helpful in identifying semantics within the problem domains. However, while the schema and database model may give insight into the structure of the database or databases, they cannot determine how that information is used within the context of the application or service.
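As a small illustration of what reverse-engineering a physical schema can look like, the sketch below uses Python's standard `sqlite3` module to list the tables and columns of a database. The in-memory database, the `customer` and `invoice` tables, and the `describe_schema` helper are assumptions made up for this example; an older or proprietary system would need its own catalog queries or a dedicated tool.

```python
import sqlite3


def describe_schema(conn):
    """Return {table: [(column, declared_type), ...]} for a SQLite database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        # Table names come from sqlite_master here, not from user input.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical stand-in for an existing system's data repository.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, cust_no INTEGER, total REAL)")

schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

The result shows that `customer.status` is a text column, but nothing in the schema says which status codes exist or what they mean to the application. That gap between structure and usage is the limitation described above.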