Data Maturity


Introduction

What's the big deal about data "maturity"? Well, in any large organization, "data" is defined, collected, transformed, used, summarized, and reported for the purpose of making business decisions. The value of such data is commensurate with the amount of resources that went into its definition, collection, and use. Data that is collected in an ad-hoc manner with no standard method is of far less value than data defined for a business purpose that has been thought through and agreed on by all affected business staff. Data collected using a common standard can be thought of as being more "mature" and more useful to the organization.

For example, send five children out to a playground to "count weeds". The five answers received will depend on what each child thinks is a weed (one child might consider grass a weed; another might consider a dandelion a flower), whether they cover the whole playground, how well they can count, what the weather was like (if rotten they might quit early), etc. The value of any one response is low, since none of the collection parameters were well defined. On the other hand, it's very easy to send a child out to count the weeds; you get a number back in short order.

This example illustrates the trade-off we often make in business in the data that we collect and use. It's very easy for a ministry employee to simply go out and measure some trees, but much more difficult to have every ministry employee (or consultant, etc.) do the measuring in the same way. It takes considerable resources (training, communication, auditing...) to ensure that measurement is done consistently across the whole ministry.

To collect more reliable, accurate data, agreement must be reached on what to collect (definition), how to collect (collection method), and how much to collect (boundaries of scope). In large organizations this agreement is often difficult to get because, after all, we're human and we each like to have our own opinions. In the Ministry of Forests, we've assigned ultimate responsibility for defining sets of business information to a Data Custodian, as outlined in ministry policy.

For these reasons, it is useful to know what kind of data you are dealing with when trying to make a business decision: was the data collected in an ad-hoc manner, or was it collected to a recognized and agreed-to standard? The Data Administration group has defined categories of data maturity (see also the summary of a presentation on Information Maturity Categories, given in November 1995 to the Corporate Spatial Database project team, and a more detailed description of the corporate data categories). These categories are not separated by any one factor, but instead should be thought of as points along a continuum. Generally, the only thing separating one category from the next is the amount of resources (energy) put into the decision processes around the data's definition, coordination, and ongoing management.


The short definitions for Corporate data (data maturity) categories:

More detailed definitions of the Corporate Data categories are available.
  • Non-corporate data: Completely ad-hoc; no common standards whatsoever. Example: meeting notices, agendas, even minutes; personal calendars.
  • Local Corporate data: Data that has meaning in the context of the local program only. It may be collected to standards that often differ, yet still be shared among multiple districts (with the associated formatting and translation problems!). Staff using this data generally know its limitations and are careful. What sets Local data apart from Non-corporate data is the fact (or staff judgement) that the data is vital to the ministry's business, and is permanent.
  • Extended Corporate data: What sets Extended data apart from Local data is that some energy has been put into defining it in a province-wide context, and that it is stored on a corporate platform. For example, the definition and attributes would have been published to all districts and the data is stored on each district LAN (i.e. NOT a personal workstation C: drive!!). (We don't yet have an easy way for district staff to publish their definitions, but the design of the new Corporate Spatial Database -- INCOSADA -- will provide it.) With Extended data, the negotiation to move the data to a province-wide standard is just beginning.
  • Full Corporate data: Full Corporate data is the most mature data, where a Data Custodian has defined ministry standards for its definition, collection, entry, and use. Full Corporate data can be further broken into Shared (used by several business areas) and Program-Specific (all collection and use is within the same line program -- e.g. lightning location data). Most Full Corporate data is Shared.
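
The four categories above can be sketched as an ordered scale. The following Python fragment is illustrative only and is not a ministry tool: the names Maturity, DataSet, and classify, and the four yes/no flags, are assumptions introduced here. It also reduces the continuum described earlier to simple rules, which real assessments would not do.

```python
from dataclasses import dataclass
from enum import IntEnum


class Maturity(IntEnum):
    """Corporate data categories, ordered from least to most mature."""
    NON_CORPORATE = 0
    LOCAL = 1
    EXTENDED = 2
    FULL = 3


@dataclass
class DataSet:
    """Hypothetical description of a data set; the flags are assumptions."""
    name: str
    vital_and_permanent: bool = False    # judged vital to ministry business, and permanent
    defined_province_wide: bool = False  # definition published in a province-wide context
    on_corporate_platform: bool = False  # stored on a corporate platform (e.g. district LAN)
    custodian_standards: bool = False    # a Data Custodian has defined ministry standards


def classify(ds: DataSet) -> Maturity:
    """Map the flags to a category using deliberately simplified rules."""
    if not ds.vital_and_permanent:
        return Maturity.NON_CORPORATE
    if ds.custodian_standards:
        return Maturity.FULL
    if ds.defined_province_wide and ds.on_corporate_platform:
        return Maturity.EXTENDED
    return Maturity.LOCAL


if __name__ == "__main__":
    examples = [
        DataSet("meeting agenda"),
        DataSet("district plot notes", vital_and_permanent=True),
        DataSet("lightning location data", vital_and_permanent=True,
                custodian_standards=True),
    ]
    for ds in examples:
        print(f"{ds.name}: {classify(ds).name}")
```

Because the scale is ordered, categories can be compared directly (for instance, Maturity.LOCAL < Maturity.FULL), which fits the idea that each step up reflects more energy put into definition and management.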

Summary

The framework of Corporate Data categories will help us assess

  • the state of the information in use in the ministry;
  • how much sharing is possible (or, cost-justified); and
  • who is responsible for each set of information.

This document introduces the idea of data structures and collection processes together having different maturity levels. Original ideas about Corporate Data were published in January 1993 in our Systems Development Guide S35, Management Guide to Custodianship (it now needs updating, and that is planned for early 1999). There is slightly more detail on data maturity in the summary of a presentation on Information Maturity Categories: it shows the main categories, briefly describes each, and also includes a brief description of the roles necessary for Corporate data. And finally, there is yet another document on the corporate data categories, somewhat similar to the presentation but with more detailed explanations.

For more information please contact J. Janzen, I.S.P., Data Administrator.