The following is an excerpt from Architecture and patterns for IT service management, resource planning, and governance:
Making shoes for the cobbler's children, written by Charles T. Betz. It is printed with permission from Morgan Kaufmann, a division of Elsevier; Copyright 2006. The excerpt is taken from the chapter "Business process management and IT process entities."
Table of contents
- Part 1: Understanding metrics for business process management
- Part 2: A guide to conceptual data models for IT managers
- Part 3: Business process management and IT process entities
We start with the first subgrouping, Strategy, and related entities.
A Strategy is a top-level organizational direction or guidance toward the overall mission. The term Strategy is used generically here and might include concepts such as mission, goal, and objective detailed into a more concrete framework.
Strategies have two avenues into lower-level IT data: they drive Programs and Projects to implement new functionality, and they require the support of Business Processes to achieve ongoing success. (Notice that for graphical simplicity the Strategy–Business Process and Release–Configuration Item links were not drawn in the main data model in Figure 3.2 and appear as thinner lines. There will be other cases of such omissions.)
Strategies are related to other Strategies (this is the meaning of the "U"-shaped line on the left side of the Strategy entity).
Strategies should be measurable using Metrics; this relationship is critical to the establishment of digital dashboards.
A Program is an ongoing, large-scale organizational commitment and corresponding investment toward meeting a major goal or objective of the enterprise. A Program typically consists of one or more Projects.
An Idea is an initial, typically business-generated, opportunity for IT services. It is minimally qualified.
An Idea becomes a Demand Request after going through some form of IT assessment for sizing or capacity impacts and preliminary feasibility. A Demand Request is a fully qualified request for an IT service change, awaiting full funding authorization to become a Project.
A Project is a defined set of manageable activities to achieve a well-specified mission (e.g., Demand Request fulfillment), usually represented by some set of deliverables or enumerated changes, with explicitly allocated resources (time, money, staff), executed and measured within the scope of those resources. A Project has one or more Releases (see the "Release" section). Projects in many cases are constrained to a fiscal year. A Project should always be associated with a Demand Request.
Projects may be non-IT (e.g., construction projects), but that usage is out of scope for this book.
A Project before it is approved may be considered a Demand Request.
Projects relate to Configuration Items either directly or (more rigorously) through defined, named Releases. This ambiguity can be seen in Figure 3.3.
Projects may be grouped into larger Programs (not represented in the model).
Figure 3.3 Strategy and related entities.
Strategy–Program–Project versus Idea–Demand
The model graphically depicts two competing paradigms: one from the top down, the other from the bottom up. A traditional top-down IT planning model would state that Strategy drives Program drives Project. However (especially when executed using an annual time frame), this is not an agile method for responsive IT. An event-driven, business-responsive demand process is also necessary. Aligning these two paradigms will be a different exercise for every organization; commonly, Demand Requests are evaluated against the annual strategy baseline.
(Request for) Change
A Change is an authorization to alter the state of some Configuration Item. Information Technology Infrastructure Library (ITIL) defines Change as follows:
The addition, modification or removal of approved, supported or baselined hardware, network, software, application, environment, system, desktop build or associated documentation.
It defines request for change (RFC) as follows:
Form, or screen, used to record details of a request for a Change to any CI within an infrastructure or to procedures and items associated with the infrastructure.
There is much additional discussion of Change in ITIL. However, the scope of Change in this framework is somewhat more limited; business-driven RFCs are Demand Requests.
This model does not distinguish between Changes and RFCs. However, an operational configuration management tool may detect unapproved Changes for which there are no RFCs; these can be considered Events and potentially Incidents.
Figure 3.4 Change and Release context.
This is perhaps the most important relationship in all of Information Technology Service Management (ITSM). Simply, a Change by definition affects configuration items (CIs), and CIs are objects under change control. This is far simpler to state and to model than to execute in the real world. A naïve approach to implementing this concept will result in unmanageable data. Clearly, it is not optimal for a Change record to have to be related to 1500 individual CIs, yet this is what a simplistic approach will arrive at (e.g., in putting in an initial release of a software package with many separate binary assets).
There are various techniques for mitigating and simplifying this, mostly involving encapsulation and abstraction. If a logical Application CI is defined, for example, it can be presumed to include all lower-level physical binary Components. Whether or not to inventory those binaries in the CMDB is one of the most critical decisions the ITSM implementer faces. High-security organizations may do so, but it is questionable whether organizations running lower-criticality information systems truly require it, especially in a world of purchased software where the physical architecture of a software product is less and less of a concern for the package vendor's customers.
Alternatively, the concept of assembly CI (which is also a CI) can be used. An Application plus its Datastores and Deploy Points might be a logical assembly CI. This is where the issue of Logical versus Physical CI comes in, pointing up the importance of having a defined process for maintaining logical Applications and related assembly CIs. It is not recommended to allow individuals the ability to create high-visibility logical CIs; this results in a chaotic environment. Everyone must agree that there is one Application (e.g., Quadrex), composed of, for example, these 50 Components.
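The encapsulation idea can be sketched in a few lines of code. This is a minimal illustration rather than the book's implementation: the class name `CI`, its `members` field, and the component names under Quadrex are assumptions made for the example. The Change record names a single assembly CI, and the affected lower-level CIs are derived by walking the assembly instead of being listed on the Change one by one.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CI:
    """A configuration item; an assembly CI is simply a CI with members."""
    name: str
    members: list[CI] = field(default_factory=list)

    def flatten(self) -> list[str]:
        """Return the names of this CI and every CI nested beneath it."""
        names = [self.name]
        for member in self.members:
            names.extend(member.flatten())
        return names


# Hypothetical assembly: one logical Application and its parts.
quadrex = CI("Quadrex", [
    CI("quadrex-db"),  # a Datastore
    CI("quadrex-app-server", [CI("quadrex.jar"), CI("quadrex-ui.war")]),
])

# The Change record references one CI; the detail is derived on demand.
change_targets = [quadrex]
affected = [name for ci in change_targets for name in ci.flatten()]
```

The design choice here is that the membership of the assembly is maintained once, under a defined process, and every Change that cites the assembly inherits it.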
Changes may require a Service Request to implement, for example, if database administration services are part of the service catalog and the addition of a new table is handled as a Service Request. This will depend on the maturity of the IT organization.
Changes are tied to Releases. In this framework, a Release is typically associated with a Project and results in one or more RFCs to add or alter CIs for a given IT service.
Production Change and the Software Development Life Cycle
RFCs in this architecture, and the concept of Change generally, are not applied to project deliverables. This is in keeping with the ITIL philosophy that "changes to any components that are under the control of an applications development project—for example, applications software, documentation or procedures—do not come under Change Management but would be subject to project Change Management procedures…. [The] Change Management process manages Changes to the day-to-day operation of the business. It is no substitute for the organisation-wide use of methods…to manage and control projects." While the project change management concepts are similar, they are managed in a project context that is quite different from production operations and out of scope for this book because they are extensively covered in the project management literature.
In this framework, a Release is the gateway from the software development life cycle into the ITSM world. It is one of the most important concepts for which to develop an enterprise approach. A Release is (if narrowly defined) a distinct package of new or changed functionality deployed to production, usually enabling new capabilities and/or addressing known Problems.
ITIL says "a Release should be under Change Management and may consist of any combination of hardware, software, firmware and document CIs…. The term 'Release' is used to describe a collection of authorised Changes to an IT service."
Releases, like Changes, should be transactional, although their larger grain makes this more challenging.
The concept of assembly CI may be helpful in supporting a Release's various elements. However, some consider a Release to primarily be a dependent entity of an Application.
Note that release management as an overall capability includes planning and harmonizing all Releases in the environment, not just managing Releases for an individual Project or Program (the enterprise release managers should interface with the program or project release managers).
The relationship between Project and Release can work two ways: a Project may have several (smaller-grained) Releases, and a large-grained enterprise Release may coordinate across multiple Projects. This flexibility of interpretation, coupled with narrower and broader scopes for Release, makes it a particularly difficult concept from a conceptual modeling perspective.
A Release may have a number of Changes associated with it, but a Change should be "owned by" only one Release. That is to say, two different Releases should not be cited as justification for one Change. (See the "Justify Change" pattern in Chapter 5.) A Release usually affects multiple CIs; however, CIs can be grouped, as with the assembly CI.
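The "owned by only one Release" rule is a one-to-many constraint, and it may help to see it enforced. The sketch below is an assumption-laden illustration: the class names, the `rfc_id` attribute, and the choice to raise an error are all hypothetical, and a real tool might enforce the rule in the database schema instead.

```python
class Change:
    def __init__(self, rfc_id):
        self.rfc_id = rfc_id
        self.release = None  # at most one owning Release


class Release:
    def __init__(self, name):
        self.name = name
        self.changes = []

    def add_change(self, change):
        """Attach a Change, refusing one already owned by another Release."""
        if change.release is not None and change.release is not self:
            raise ValueError(
                f"{change.rfc_id} is already owned by {change.release.name}"
            )
        change.release = self
        if change not in self.changes:
            self.changes.append(change)


# Hypothetical usage: a second Release cannot claim the same Change.
r1, r2 = Release("R1"), Release("R2")
rfc = Change("RFC-1")
r1.add_change(rfc)
try:
    r2.add_change(rfc)
    rejected = False
except ValueError:
    rejected = True
```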
Project, Release, and Change
The ITIL conception of the relationship between Project, Release, and Change is presented in Figure 3.5.
Note that in ITIL terms, an RFC precedes the establishment of a Project, in theory. The Release might also result in smaller-grained RFCs for change control (e.g., actual physical deployments); thus, there is a conceptual difficulty in distinguishing Change granularity, which ITIL calls out as a risk but does not present a systematic framework for resolving.
This may be problematic in terms of language and culture for organizations with a strong tradition of change control, possibly including a function named Change Management. They will not want their process (and system) "contaminated" with RFCs more focused on Project initiation; a forward schedule of change is as far as they may wish to go.
An alternate view is presented in Figure 3.6.
The controversy is primarily linguistic. The ITIL intent behind front-loading the RFC is presumably so that it is suitably assessed by all stakeholders. This is also the objective of the demand and portfolio management processes (as well as the function of enterprise architecture), and there is arguably more maturity in their conceptions of how to do this.
Whether you subscribe to the ITIL view or this book's framework, these issues should be clarified in any large IT organization.
Figure 3.5 ITIL representation of RFC, Project, and Release.
Figure 3.6 Alternate representation of RFC, Project, and Release.
An Event is raw material. It is any operational signal emitted by any Production CI. Only a small fraction of Events are meaningful to ITSM, and an even smaller fraction result in Incidents. Events are one basis for Metrics, which in turn drive Agreements and Contracts.
One important type of Event is emitted by change control and detection systems, and that is the identification of physical change. This Event specifically indicates that for a given CI a state change has occurred that is of management interest. Change Events may be generated automatically by the CI in question or detected by active probing (e.g., tools such as Tripwire that compare the current state with a known baseline). The most sophisticated IT operations reconcile such change detection Events with the RFC process.
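Reconciling change detection Events with the RFC process amounts to a comparison of two record sets. The following is a simplified sketch under stated assumptions: the record layouts, identifiers, and the function name `unapproved_changes` are hypothetical, and matching is done on CI identifier alone, whereas a production reconciliation would presumably also compare the approved change window with the time of the detected change.

```python
# Hypothetical approved RFCs, each naming the CI it is authorized to alter.
approved_rfcs = [
    {"rfc": "RFC-101", "ci": "web01"},
    {"rfc": "RFC-102", "ci": "db01"},
]

# Hypothetical change Events as a detection tool might report them.
change_events = [
    {"event": "E1", "ci": "web01"},
    {"event": "E2", "ci": "app07"},
]


def unapproved_changes(events, rfcs):
    """Return change Events whose CI is not covered by any approved RFC."""
    authorized_cis = {rfc["ci"] for rfc in rfcs}
    return [event for event in events if event["ci"] not in authorized_cis]


suspect = unapproved_changes(change_events, approved_rfcs)
```

Events returned by this check are candidates for treatment as Incidents, since they represent state changes for which no authorization exists.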
ITIL implies that an Event is equivalent to an automatically detected Incident. Anyone who has experienced an autogenerated "ticket storm" will know that this definition is not suitable. Most Events are not Incidents; extensive and well-architected correlation and filtering are required.
Of course, in the broadest sense, an Event can apply to any entity undergoing a state change of any kind. In this sense, a Contract might "raise" a logical Event when it expires. However, this is so broad that it's not a focus of this model.
Figure 3.7 Event, Incident, Problem, and Known Error context.
Note that Events can be related to both discrete physical CIs, such as Servers and Datastores, and to logical Services. This is characteristic of monitoring correlation architectures, business service management (BSM) and end-to-end transaction monitoring. Rather than monitoring an individual, granular CI, the major Event of interest is an aggregation or derivation of multiple internal Events within the Service (e.g., expressed as overall transaction response time or customer-visible service failure).
Events are also indicators of capacity consumption and support measurements for that purpose: hardware utilization, memory, transactions, and so forth. Financial chargeback may depend on event management.
Advanced IT providers and infrastructure systems are starting to work with statistical analysis of Events, for example, to determine whether a certain repeated Problem has an identifiable Event signature that may help resolve it. This gets into cutting-edge research into pattern detection across large data sets, related to data mining.
A best practice for all operational Events is the embedding of an appropriate CI identifier. By definition, an Event must have had a CI that emitted it—it cannot arise out of the ether. This reinforces the case for managing unique and terse CI naming conventions, because many Event data structures will not be able to support long identifiers. See the "Application ID and Alias" pattern in Chapter 5.
The change Event is discussed further in the "Configuration Management" section in Chapter 4.
ITIL defines Incident as "any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service." ITIL also states that a Service Request is a type of Incident, which seems perverse. (A Service Request is not an interruption unless you are trying to build a culture of hostile customer service!) This line of thinking is not supported here.
Service requests may be tied to Incidents through the CI against which the Incident is reported. In this interpretation, Incidents are independent of their mode of detection; this is necessary to support Incidents that may be reported or derived through enterprise monitoring without ever being reported through the centralized service desk.
An Incident has to be experienced. It is an occurrence. This distinguishes it from the Known Error concept used for knowledge management for the help/service desk (an error being a known condition in the abstract).
A Service Request may occur in response to an Incident. Incidents (especially when generated from monitoring tools) often require correlation and root cause analysis, which are supported through the relationship of Incidents and Events to each other.
A Change may be in response to an Incident, without going through the more formal and heavyweight Release process. Alternatively, an Incident might be the result of a poorly executed Change. This means that the relationship between Change and Incident should probably have a type attribute so that it is clear which caused which (see the section on intersection entities later in this chapter).
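A typed relationship between Change and Incident could look like the following sketch. The names `LinkType`, `ChangeIncidentLink`, and the identifiers are assumptions for illustration only; the point is that the direction of causation is stored on the link itself, not inferred.

```python
from dataclasses import dataclass
from enum import Enum


class LinkType(Enum):
    CHANGE_RESOLVES_INCIDENT = "change resolves incident"
    CHANGE_CAUSED_INCIDENT = "change caused incident"


@dataclass(frozen=True)
class ChangeIncidentLink:
    """Intersection record between one Change and one Incident."""
    change_id: str
    incident_id: str
    link_type: LinkType


# Hypothetical data: one Change caused the Incident, another fixed it.
links = [
    ChangeIncidentLink("RFC-200", "INC-17", LinkType.CHANGE_CAUSED_INCIDENT),
    ChangeIncidentLink("RFC-201", "INC-17", LinkType.CHANGE_RESOLVES_INCIDENT),
]


def changes_that_caused(incident_id, all_links):
    """Return the Changes recorded as the cause of the given Incident."""
    return [
        link.change_id
        for link in all_links
        if link.incident_id == incident_id
        and link.link_type is LinkType.CHANGE_CAUSED_INCIDENT
    ]
```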
Problem and Known Error
In ITIL, a Problem is "the unknown underlying cause of one or more Incidents," and a Known Error is "a Problem that is successfully diagnosed and for which a Work-around is known."
However, this leaves a hole for Problems with known underlying causes that nevertheless have no workaround, so the ITIL specification won't do as a data definition. The definition here is that a Problem is generally a (known or unknown) root cause of many Incidents, although in the current model it is possible for an Incident to be caused by several Problems.
ITIL further states, "A Problem can result in multiple Incidents, and it is possible that the Problem will not be diagnosed until several Incidents have occurred, over a period of time. Handling Problems is quite different from handling Incidents and is therefore covered by the Problem Management process."
A Known Error is a knowledge management hook—it is an entity that can house the known resolution techniques for a given Problem.
Problem–Release and Problem–RFC
Problems may be addressed by Releases, which might solve multiple Problems. An individual Problem might also be addressed by one or several RFCs. One possible approach is to say that Problems are generally handled by Releases (using demand management), and Incidents are handled directly by RFCs (when called for). Ideally, an RFC should be able to reference both Incidents (tactical) and Problems (longer term). This will depend on the capabilities of incident management and its degree of integration with Problem and Change.
A Service Request is a logged interaction between an individual and the service desk that requires follow-up. Service requests may have various types, such as the following:
- Hardware or software request
- Incident report (i.e., the request is "resolve this incident")
- Configuration change request (the Service Request is the actual work request, not the authorization request)
- Security request
Figure 3.8 Problem, Release, and RFC context.
Figure 3.9 Service Request context.
A critical distinction is that between Service Request and Project initiation. The service management architects will need to pay close attention to the differences among Service Offerings that may be straightforward products, Service Offerings that are more open ended (analogous to professional services or consulting), and work requests that should not be framed as Service Requests but should be routed to demand management. Alternatively, the architects might view a Demand Request as a type of Service Request and drive to a more generalized approach (the "single pane of glass" philosophy).
See the "Clarify Service Entry Points" pattern in Chapter 5.
A Service Request is not a CI. It has a defined life cycle and typically figures in only one Business Process—its own fulfillment.
Service Request–Service Offering
A common relationship pattern is that Service Requests turn Service Offerings into Services.
A Service Request may occur with respect to an already-delivered Service. See the discussion later in this chapter.
A Risk is a known possibility of adverse events, usually described by 1) likelihood of happening and 2) cost of occurrence. Risks are best seen as directly applying to CIs; a deficiency of modern risk management software is that it is often designed in a vacuum, with the risk management team entering their own representations of CIs, such as Application and Process, and not looking to a common system of record for this reference data. See the CMDB-based risk management pattern in Chapter 5.
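As an illustration of Risks attached to CIs, the sketch below keys each Risk to a CI identifier drawn from the common system of record and ranks Risks by likelihood multiplied by cost. That product is a widely used convention, not something this text prescribes, and the class, field names, and figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    ci: str            # identifier of the CI in the common system of record
    description: str
    likelihood: float  # assumed probability of occurrence, between 0 and 1
    cost: float        # estimated cost if the Risk occurs

    @property
    def exposure(self) -> float:
        """Likelihood times cost, one common way to compare Risks."""
        return self.likelihood * self.cost


# Hypothetical Risks referencing CIs by their shared identifiers.
risks = [
    Risk("db01", "Storage array failure", 0.5, 3000.0),
    Risk("Quadrex", "Vendor ends support", 0.25, 40000.0),
]

ranked = sorted(risks, key=lambda risk: risk.exposure, reverse=True)
```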
Figure 3.10 Risk context.
Risks may theoretically be associated with virtually any entity in the model, but the primary targets should be CIs, Projects, and Change requests.
Account and Cost
An Account is a financial construct. According to Wikipedia, it is "a record of an amount of money owned or owed by or to a particular person or entity, or allocated to a particular purpose." Other terms are "cost center" and "charge code."
The relationships of Account were not included in the main data model because of graphical complexity issues. Account is typically tied to a number of different entities, depending on the financial management approach being used (Figure 3.11).
Account might also be tied to any arbitrary CI, but this can imply considerable complexity.
Cost is an attribute, not an entity, and therefore does not appear in the conceptual model. Cost might be an attribute on any of the entities surrounding Account in Figure 3.11 and others (e.g., lower-level entities supporting Service, such as Application or database). A CMDB technically might allow any entity (not just CIs as defined in this book) to have an associated cost, and determining which CIs might appropriately have a cost would be an important implementation task.
Figure 3.11 IT accounting relationships.
Figure 3.12 Account and wholly owned item.
One common issue is allocation. If a given entity instance is related to one and only one Account, it "rolls up" and financial management is simpler: the account holders know that they bought the whole item. This is represented as a one-to-many relationship (Figure 3.12).
However, if the costs for a given IT item are to be split across multiple accounts, it turns the relationship into many-to-many, requiring resolution with a specific allocation percentage (Figure 3.13).
For example, if a network Service is shared across several accounts, a percentage allocation must be established for each Account (Figure 3.14).
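The allocation relationship can be made concrete with a small worked example. This is a sketch only: the account codes, the monthly cost, and the function name `allocate` are hypothetical. Each entry in `percentages` corresponds to one row of the intersection between the Service and an Account, and the check that the percentages total 100 is an assumed business rule.

```python
from decimal import Decimal


def allocate(cost, percentages):
    """Split a cost across Accounts according to allocation percentages."""
    total = sum(percentages.values())
    if total != 100:
        raise ValueError(f"allocation percentages total {total}, expected 100")
    return {account: cost * pct / 100 for account, pct in percentages.items()}


# Hypothetical network Service shared by three Accounts.
network_service_cost = Decimal("12000")
network_allocation = {
    "ACCT-A": Decimal("50"),
    "ACCT-B": Decimal("30"),
    "ACCT-C": Decimal("20"),
}

shares = allocate(network_service_cost, network_allocation)
```

`Decimal` is used instead of floating point so that monetary amounts are not subject to binary rounding.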
Figure 3.13 Model for allocating across accounts.
Figure 3.14 Example of allocated service.
Direct versus allocated (or indirect) costs are a substantial management challenge in IT. The desire for financial visibility runs into the issue of "dollars chasing dimes": the costs of managing the direct allocations outweigh the benefits in having granular visibility. In ITIL's words, the risk is that "the IT Accounting and Charging processes are so elaborate that the cost of the system exceeds the value of the information produced." This book takes no position on what is an appropriate level of complexity but rather seeks to describe the general case capabilities needed to support a variety of approaches—one thing architects can be sure of is that requirements will change.
As Jeff Kaplan notes,
Each IT service component (development, integration, help desk, network management, data center operations, maintenance, etc.) has a unit cost. Unit cost is the cost of providing one unit of service at predetermined service levels. Examples include cost per call, cost per connection, and so on. The specific units used are less important than is measuring each service's variance from the standard cost. Using cost accounting, organizations should set a standard cost per unit for each service and project, based on the expected cost of providing an incremental unit of service.
This passage, although informative, requires some thought to interpret as a requirements specification. First, the distinction between orderable and nonorderable services becomes important. A nonorderable service by definition has a large fixed cost that can be allocated arbitrarily against a user base, but doing so might not be advisable. For example, consider an investment in a high-capacity customer-facing online order system. This system must be kept running regardless of workload, and the marginal cost for heavy use as opposed to no use may be negligible. In naïve chargeback models, cost to the customer will vary inversely with usage, and this does not help IT credibility. (Even worse is when a unit's cost goes up, despite stable consumption, because another unit has decreased its consumption.)
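A short calculation shows the effect. The figures below are invented for illustration, and `naive_chargeback` is a hypothetical function that splits a fixed monthly cost across business units in proportion to their transaction counts.

```python
FIXED_MONTHLY_COST = 120_000  # hypothetical cost of keeping the system running


def naive_chargeback(usage):
    """Split the fixed cost across units in proportion to their transactions."""
    total = sum(usage.values())
    return {
        unit: FIXED_MONTHLY_COST * count / total
        for unit, count in usage.items()
    }


# Month 1: both units process 6,000 transactions.
month_1 = naive_chargeback({"Unit A": 6_000, "Unit B": 6_000})
# Month 2: Unit A is unchanged; Unit B drops to 2,000 transactions.
month_2 = naive_chargeback({"Unit A": 6_000, "Unit B": 2_000})

rate_1 = FIXED_MONTHLY_COST / 12_000  # cost per transaction in month 1
rate_2 = FIXED_MONTHLY_COST / 8_000   # cost per transaction in month 2
```

Under these assumed numbers, Unit A's consumption is identical in both months, yet its charge rises from 60,000 to 90,000 because Unit B used less. The cost per transaction also rises from 10 to 15 as total usage falls.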
The concept of activity-based costing is a significant departure from older costing approaches. This book's interpretation of activity-based costing requirements applied to IT is that a concept of the business transaction is needed (this is the true "activity").
The core data model has no Roles or people in it. This is deliberate. Organizational approaches to managing the processes and their data will vary, titles will change, and in general the human organization will be more fluid than the core ITSM and metadata concepts. Therefore, the Role structure is generalized; Parties (people or groups composed of other parties) have Roles with respect to any entity in the model.
Party, Person, and Group
A Party is either a group or a person, people are members of groups, and groups can contain other groups. The following are all Parties:
- Oracle Incorporated
- Bill Smith
- Support group APPL-2-CNS
- IT Service Management Forum
Party is a controversial concept in data modeling, because business users do not understand it. They understand concepts like "administrator" or "steward." However, these are Roles. (These are well-understood issues in data modeling.)
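The Party structure is a familiar composite, and a brief sketch may make the distinction between Party and Role easier to see. The class names and the `RoleAssignment` record are assumptions for illustration, as are the enclosing group and the person "Ann Lee"; the other names come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Party:
    name: str


@dataclass
class Person(Party):
    pass


@dataclass
class Group(Party):
    members: list[Party] = field(default_factory=list)

    def people(self) -> list[Person]:
        """Return every Person in this Group, including nested Groups."""
        found = []
        for member in self.members:
            if isinstance(member, Group):
                found.extend(member.people())
            elif isinstance(member, Person):
                found.append(member)
        return found


@dataclass
class RoleAssignment:
    """A Party holding a Role with respect to some entity in the model."""
    party: Party
    role_type: str
    entity: str


bill = Person("Bill Smith")
support = Group("Support group APPL-2-CNS", [bill])
it_org = Group("IT organization", [support, Person("Ann Lee")])

# "Support group" is the Role; the Group is merely the Party that holds it.
assignment = RoleAssignment(support, "Support group", "Application: Quadrex")
```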
Figure 3.15 Role model.
Here are some example Role types and the entities they might interact with. Note that ITIL and other industry sources, such as the Enterprise Computing Institute, go into some depth about this, so this section doesn't include an exhaustive survey.
|Role type|Related entities|Description|
|---|---|---|
|Requester|Service request (as related to Service Offering or Service)|A requester can request a new instance of a Service Offering (which becomes a new service) or can request a Change to an existing Service.|
|Support group|Usually Application|A support group would usually be a group associated with one or more Applications. Sometimes, a support group might be associated with a Technology Product (e.g., a Windows Engineering group).|
|Developer|Project (preferably related to Release and Application)|A developer carries known expertise on a given system. For any Application, a complete record of all developers (especially at the senior level) who worked on it is recommended. To provide value, this list might be sorted by hours worked on the system; those who spent the most time on the system would be of highest interest. Other software development roles (e.g., architect, tester, and analyst) could be handled analogously.|
|Release manager|Project, Release, Change|A release manager is responsible for coordinating the output of a project into releases to be accepted into production.|
|Change coordinator|Change|A change coordinator is responsible for the successful execution of one or more Changes. They may be part of a specific capability team or part of an enterprise change team.|
|Operational change approval group|Operational CIs|An operational change approval group is often seen as a dynamic entity, composed of representatives from the support groups associated with the CIs in question, as well as overall change coordination from a central enterprise group. Often, the change approval group may have standing representation from major technology product areas (e.g., Unix engineering or network engineering) or other operational capabilities (e.g., security).|
Here is a common Role type that may be problematic:
|Role type|Related entities|Description|
|---|---|---|
|Change Advisory Board|Any CI|ITIL calls for a unitary Change Advisory Board, admitting that the composition of that group may vary even within a single meeting. However, different CIs may have radically different stakeholders. For example, if a Contract is a CI, it should be under change control, but the change approvers would be the senior IT executives, the contract office, and legal; your engineers would not be involved. The concept of a Change Advisory Board becomes so general that its usefulness is questionable. The better understood use of change approver is with respect to Production CIs. See the "Clarify Service Entry Points" pattern in Chapter 5 and related discussions throughout.|
Support roles for a Service (e.g., an Application) may be ordered, which requires an escalation path (Figure 3.16).
Escalation paths may be of several types, typically functional and hierarchical; a functional escalation path is, for example, from level 1 to level 2 to level 3 support, and a hierarchical escalation path might walk the organization chart from application manager to director to vice president. Specialized escalation paths to technical subject matter experts (e.g., database administrators and senior software engineers) may also exist; alternatively, the escalation path may become a tree with decision points and not just a linear progression.
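The linear case can be sketched as an ordered list of support roles. The function name `escalate` and the list representation are assumptions for illustration; a tree with decision points would need a richer structure than the one shown here.

```python
# Two linear escalation paths, using the examples given in the text.
functional_path = ["Level 1 support", "Level 2 support", "Level 3 support"]
hierarchical_path = ["Application manager", "Director", "Vice president"]


def escalate(path, current):
    """Return the next step in a linear escalation path, or None at the top."""
    index = path.index(current)
    return path[index + 1] if index + 1 < len(path) else None


next_functional = escalate(functional_path, "Level 1 support")
top_of_hierarchy = escalate(hierarchical_path, "Vice president")
```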
Figure 3.16 Escalation.
Figure 3.17 Classification taxonomy.
Taxonomies are used extensively in IT information management, for the same reasons they are used in science and other fields requiring knowledge management. A hierarchical tree structure is an intuitive and effective way to manage complexity. Typical taxonomies encountered in internal IT systems are functional decompositions, data subject hierarchies, application and technology categorizations, and so forth. There are commercial providers of taxonomies.
There is overlap between this entity and other treelike structures. The differentiation is that a classification taxonomy is merely a lightweight conceptual structure. Each node is of the same basic type. One does not typically establish dependencies between the taxonomy nodes or assign extensive attributes to them.
A valuable use of the taxonomy concept is to identify overlap or redundancy, for example, in an application portfolio. See the "Taxonomy-Based Rationalization" pattern in Chapter 5.
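As a rough sketch of how a taxonomy can surface overlap, the example below classifies each application to one taxonomy node and reports the nodes that hold more than one application. The application names other than Quadrex, the node labels, and the function `overlap_candidates` are hypothetical. Sharing a node indicates only a candidate for review, not proven redundancy.

```python
from collections import defaultdict

# Hypothetical application portfolio, each entry classified to one node.
classification = {
    "PayrollPlus": "HR/Payroll",
    "PayMaster": "HR/Payroll",
    "Quadrex": "Finance/General Ledger",
}


def overlap_candidates(app_to_node):
    """Group applications by taxonomy node and keep nodes with several."""
    by_node = defaultdict(list)
    for app, node in app_to_node.items():
        by_node[node].append(app)
    return {
        node: sorted(apps)
        for node, apps in by_node.items()
        if len(apps) > 1
    }


candidates = overlap_candidates(classification)
```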