SOAs and Meta Data Management
By David Marco
Service oriented architecture (SOA) is one of the hottest buzzwords in information technology (IT) today. Indeed, it is difficult to pick up any IT related magazine and not discover three, four or more articles on this topic. However, while a good number of senior executives are clamoring for SOAs, I find that there is a great deal of misinformation on this topic and even less knowledge on what it truly takes to implement this environment.
SOA refers to the use of “loosely coupled” (meaning that they should be independent, sharable and technology agnostic) services that are built to provide reusable business processes that enable the communication between systems and the creation of entire applications within an organization. These services will typically pass data between each other as they perform some predefined common process. The process could be as simple as basic data movement or much more intricate processes like data transformations, data cleansing or the use of multiple services to create a multi-step, complex process.
The goal of a SOA environment is to make these common services readily available so that they can be accessed without knowledge of their underlying technology platform implementation or programming language. In addition, the SOA environment should monitor the use and performance measurements of these individual services within the environment. In other words, a good SOA environment has standardized, sharable, common services that are well understood and highly efficient.
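To make the two goals above concrete, here is a minimal sketch of a service registry, assuming a single-process Python setting rather than a real messaging bus. Callers invoke a service by name without knowing how it is implemented, and the registry records how often each service is used and how long it runs. The names `ServiceRegistry`, `register`, `call`, `usage` and `cleanse_name` are hypothetical and chosen only for illustration.

```python
import time
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical registry: looks services up by name and tracks their usage."""

    def __init__(self):
        self._services = {}
        self._call_counts = defaultdict(int)
        self._total_seconds = defaultdict(float)

    def register(self, name, handler):
        # The handler can be any callable; callers never see how it is built.
        self._services[name] = handler

    def call(self, name, payload):
        handler = self._services[name]
        start = time.perf_counter()
        try:
            return handler(payload)
        finally:
            # Record usage and elapsed time even when the service raises.
            self._call_counts[name] += 1
            self._total_seconds[name] += time.perf_counter() - start

    def usage(self, name):
        return {
            "calls": self._call_counts[name],
            "total_seconds": self._total_seconds[name],
        }


def cleanse_name(payload):
    # Illustrative data cleansing service: trim whitespace and fix casing.
    return {"name": payload["name"].strip().title()}


registry = ServiceRegistry()
registry.register("cleanse_name", cleanse_name)
result = registry.call("cleanse_name", {"name": "  ada lovelace "})
```

In a real SOA environment the registry, transport and monitoring would be supplied by the messaging infrastructure; the sketch only shows the separation between the name of a service and its implementation.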
SOA Driving Factors
Many factors are driving the adoption and implementation of SOA in organizations. Chief among these factors is that the current IT application architectures are far too rigid, inflexible and inefficient. In fact, a survey by the Business Performance Management Institute found that 36% of companies’ IT departments state that they have “significant difficulties” (27%) or “can’t keep up at all” (9%) with business demand. In addition, only 11% of executives say they’re able to keep up with business demand for changes to their technological processes.
SOA promises to give companies a portfolio of services (think common processes) that can be mixed and matched without difficulty, to create automated business processes with the hope of reducing application development time and costs. This is the promise that has been sold into many companies. Let’s look at what the reality has been.
SOA Defined in Real Terms
OK, your company’s key executives have bought into the suggested value of having an SOA environment, but what are the actual steps that companies typically take in building this environment? In my experience, these companies are building messaging buses utilizing technologies like TIBCO and WebMethods, along with a host of standards like OASIS, XML, CORBA, DCOM and many others.
How successful are these messaging bus implementations? Most organizations I have seen are not building standardized processes with their messaging buses but rather using them to mimic point-to-point interfaces. This approach is successful in slowing down already overburdened point-to-point interfaces; however, it completely undermines the entire goal of SOA, which is to simplify overly complex IT environments.
For example, I recently visited an executive of one of our clients, and in between our various meetings of the day, we stopped by an internal event that they were holding called a “Solutions Showcase.” In this Showcase they had set up a vendor floor-like area where many of their large IT project teams could discuss their efforts. One of the booths featured the messaging bus that they were currently implementing. The executive and I approached the booth (neither one of us had name tags so we were incognito), which was being staffed by the large consulting vendor that was building the bus. The featured attraction at this booth was a mock messaging bus that was constructed using LEGOs™. Yes, those same LEGOs™ that you and I played with as children. The LEGO™ set was supposed to move the LEGO™ blocks between the different LEGO™ built buckets (I guess this was supposed to be what a messaging bus does!). Before the vendor started this demo he looked at the executive and me, and then proudly stated that they used over 2,000 LEGOs™ to build the demo. As soon as he turned on the demo, the little LEGO™ pieces that were supposed to be gently moved between the buckets started flying all over the vendor floor to the dismay of the 10 or so people watching this catastrophe.
At this point they immediately stopped the demo and I have to confess that a little devil sitting on my shoulder made me offer aloud the comment “that is the most realistic demo I have ever seen!” After the audience laughter settled down, I began to ask the vendor a couple of very simple questions on their SOA architecture, including how they will monitor their environment and track the efficiency of their messages. The vendor looked me in the eye and said that the key to their project was their “data meta repository.” That is not a misprint. “Data Meta!” That same little devil on my shoulder then coerced me into asking “could you please tell me more about your ‘data meta’ repository!” With intelligence at this level, it is not surprising that this project has not delivered on its promises.
Please do not misinterpret this article to be anti-SOA. A SOA environment can provide tremendous value to an organization; however, properly implemented meta data management is an absolute necessity to define the common business processes which make the SOA environment a reality. Defining an organization’s common reusable business processes isn’t as easy as some may think or as vendors may portray. Indeed, the vast majority of SOA vendors just ASSUME that this process has been completed and that they can just walk in and start building a service oriented architecture.
Quite recently, a multi-billion dollar government supplier was having an information technology (IT) “Vendor Day.” This is where the supplier invites 20 IT vendors to come in and present the software or services that they provide. This large government supplier received over 300 requests from vendors to present at this event. Of these requests, EWSolutions was one of the firms that they selected. During the course of this day each vendor was provided 20 minutes to explain who they are and what they do.
In looking at the list of vendors that they chose, essentially they picked 19 vendors, mostly software, whose core business is based on SOA. They had vendors that provide SOA testing software, SOA message brokers and so on. EWSolutions was literally the only vendor that doesn’t focus on SOA that was invited to present.
On the morning of the event I flew out early in the day and I had the opportunity to hear three other vendors’ presentations. Each presentation was focused heavily on SOA; however, I noticed an interesting and subtle theme. As each vendor spoke of the “magic” of SOA and how they could revolutionize the government supplier’s business, they would each state “well, once you’ve defined your centralized processes and you have a catalog of your data then we do…(insert specific vendor’s pitch).”
When it was our turn to present, I began by saying that we are not an SOA company per se, but you know all that “stuff” you need to do before you can implement SOAs? “We do all that stuff.”
That “stuff” includes implementing a Managed Meta Data Environment (MME) that would capture all of the meta data needed to define and manage your SOA, and it includes building a fully operational data governance organization to interact with the MME to capture and maintain the business rules around your common processes.
Meta data management and data governance are REQUIRED precursors to any significant enterprise SOA effort. In fact, it is not possible to create centralized, enterprise-wide processes when you don’t have meta data on your systems, processes and data. As this article series continues we will learn a great deal more about the types of meta data that need to be captured, stored and distributed in order to support SOAs.
Before we can understand how to create common processes, we need to first understand some basic IT terminology, specifically terms like System, Process, Data and Business Rules. Figure 1 illustrates the interrelationships between these concepts.
Figure 1: IT Component Relationships
“System” is a term that the vast majority of IT professionals believe they understand (and why wouldn’t they, as they have probably worked on dozens of individual applications). However, bringing a group of IT professionals together to write an enterprise definition of what a system is can be quite revealing. Invariably, people just keep naming systems within the company as opposed to defining what “system” actually means. It turns out “a system is a collection of processes with an arbitrary beginning point and end point.” Some may question whether or not a system has an “arbitrary” beginning and end point; however, that is exactly what it has. For example, an order entry application most likely is connected, through interfaces, with many other systems. At what point does each system begin and end? Basically this is something that someone decides. In other words, it is arbitrary.
A process inputs data from a source or creates the data itself, then performs a designed sequence of tasks on the data, which it then outputs. These tasks can include transformations, data derivations, data movement, calculations, formulas and other data processing related activities.
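This definition can be sketched in a few lines, assuming that records are plain dictionaries and that each task is a function that takes a record and returns a new one. The function `run_process`, the two sample tasks and the field names are hypothetical and serve only as an illustration.

```python
from typing import Callable, Iterable, List

# A task takes one record and returns the transformed record.
Task = Callable[[dict], dict]


def run_process(records: Iterable[dict], tasks: List[Task]) -> List[dict]:
    """Input data, apply the designed sequence of tasks in order, output the result."""
    output = []
    for record in records:
        for task in tasks:
            record = task(record)
        output.append(record)
    return output


def trim_customer_name(record: dict) -> dict:
    # Transformation: remove stray whitespace from a text field.
    return {**record, "customer_name": record["customer_name"].strip()}


def derive_line_total(record: dict) -> dict:
    # Data derivation: compute a new field from two existing ones.
    return {**record, "line_total": record["quantity"] * record["unit_price"]}


source = [{"customer_name": " Ann ", "quantity": 2, "unit_price": 5}]
processed = run_process(source, [trim_customer_name, derive_line_total])
```

The order of the task list is the “designed sequence”: changing the list changes the process without touching the individual tasks.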
Data consists of numbers, characters, symbols, images, audio, video, descriptors and other information that represents a specific meaning.
Business rules come in many flavors and varieties. For the purpose of this article I will only discuss their uses in processes and data. Business rules in a process refer to quality rules, tasks, definitions and constraints that apply to the data within the process. Business rules for data include domain values, definitions, data heritage and data lineage. For example, suppose that you have a process that calculates order cost. There would be business rules around the entire process. The rules may state order_cost = ((product_quantity x product_cost) + order_tax + shipping). Within this rule there will be business rules that govern the data elements; for example, product_cost must be in the currency of the United States (dollars), numeric and a positive number.
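The order cost example can be sketched as follows, with the process-level rule expressed as a function and the data-level rules on product_cost checked before the calculation runs. The `Order` class, the `currency` field and the function `check_product_cost_rules` are assumptions made for illustration; the article specifies only the formula and the three constraints on product_cost.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    product_quantity: int
    product_cost: Decimal
    order_tax: Decimal
    shipping: Decimal
    currency: str = "USD"  # hypothetical field used to check the currency rule


def check_product_cost_rules(order: Order) -> None:
    """Data-level rules: product_cost is in US dollars, numeric and positive."""
    if order.currency != "USD":
        raise ValueError("product_cost must be in US dollars")
    if not isinstance(order.product_cost, Decimal):
        raise TypeError("product_cost must be numeric")
    if order.product_cost <= 0:
        raise ValueError("product_cost must be a positive number")


def order_cost(order: Order) -> Decimal:
    """Process-level rule: (product_quantity x product_cost) + order_tax + shipping."""
    check_product_cost_rules(order)
    return (order.product_quantity * order.product_cost) + order.order_tax + order.shipping


order = Order(
    product_quantity=3,
    product_cost=Decimal("10.00"),
    order_tax=Decimal("2.40"),
    shipping=Decimal("5.00"),
)
total = order_cost(order)  # 3 x 10.00 + 2.40 + 5.00 = 37.40
```

Keeping the data rules in their own function mirrors the distinction in the text: the formula belongs to the process, while the constraints belong to the data element and could be reused by any other process that touches product_cost.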
Meta Data Management
“Meta data is all physical data (contained in software and other media) and knowledge (contained in employees and various media) from inside and outside an organization, including information about the physical data, technical and business processes, rules/constraints of the data, and structures of the data used by a corporation.”[1]
When we talk about meta data, we are really talking about knowledge: knowledge of our business, the systems that support it, the data that is housed in those systems and the people that are using that data. In other words, we are talking about all of the information that is required to define our common business processes.
MMEs gather, retain and disseminate meta data, and unless a company has built one, it will be very difficult to initially define its common SOA processes. By having a fully functional MME, along with a strong data governance program, a company can truly capture the promises of SOA.
[1] David Marco, Building and Managing the Meta Data Repository, Page 5, John Wiley & Sons (2000)
About the Author
Mr. Marco is an internationally recognized expert in the fields of enterprise information management, data warehousing and business intelligence, and is the world’s foremost authority on meta data management. He is the author of several widely acclaimed books including “Universal Meta Data Models” and “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide”. Mr. Marco has taught at the University of Chicago and DePaul University. In 2004 he was selected to the prestigious Crain’s Chicago Business “Top 40 Under 40,” and he is the chairman of the Enterprise Information Management Institute (www.EIMInstitute.org). He is the founder and President of EWSolutions, a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class solutions using data warehousing, enterprise architecture, data governance and managed meta data environment technologies (www.EWSolutions.com). He may be reached directly via email at DMarco@EWSolutions.com.