Corporate Data Categories


Introduction

This document describes Corporate data in detail. A separate overview document on Data Maturity outlines the ministry's need for defining data as Corporate or Non-corporate (see also the summary of the November 20/95 presentation on Information Maturity (Corporate Data Categories)).

We formally define "Corporate data" to distinguish what information is important to the ministry (see ministry policy 7.3, Corporate Information Custodianship). Data is Corporate if it is:

  • vital to the ministry (i.e. critical to the ministry's business)
  • of a permanent or lasting nature (i.e. kept for a significant period of time).

Therefore, Non-corporate data is data that does not meet one or both of the above points. Non-corporate data may have a significant impact on an individual (e.g. if they lose/delete their personal calendar), but the organization's business on the whole won't be affected.
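As an illustration only, the two-part test above can be written as a short Python sketch. The `DataSet` record, its field names, and the `is_corporate` function are hypothetical names chosen for this example and are not part of any ministry system. Whether data is vital or permanent remains a staff judgement; the sketch only shows how the two answers combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """Hypothetical record of the two judgements made about a piece of data."""
    name: str
    vital: bool      # critical to the ministry's business
    permanent: bool  # kept for a significant period of time


def is_corporate(data: DataSet) -> bool:
    """Data is Corporate only if it is both vital and permanent."""
    return data.vital and data.permanent


# A personal calendar matters to its owner but meets neither criterion.
calendar = DataSet("personal calendar", vital=False, permanent=False)

# Forest Client information is cited later in this document as Full Corporate data.
forest_client = DataSet("Forest Client information", vital=True, permanent=True)

# Assumed case for illustration: important today, but not kept for long.
temp_sheet = DataSet("temporary spreadsheet", vital=True, permanent=False)
```

Failing either criterion is enough to make data Non-corporate, which is why the function uses `and` rather than `or`.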

Corporate data is further broken down into categories: Local Corporate, Extended Corporate, and Full Corporate (see diagram). Any ministry staff member can decide what is Local Corporate or Extended Corporate; however, a Data Custodian is accountable for defining Full Corporate data. The decision is not a light one for anybody -- once you make the decision that something falls within the corporate domain, you have to manage it! The current district term "operational data" refers to Non-corporate, Local, and Extended data.

These categories can be used to assess the state of the information we deal with in the Ministry of Forests.

Corporate Data Categories

The following are detailed definitions of the Corporate Data category terms (Non-corporate, Local Corporate, Extended Corporate, and Full Corporate); an overview of the categories is also available.

Non-Corporate Data

Non-corporate data is defined and collected in a completely ad-hoc manner; no common standards are developed whatsoever. Quick results are desired, so undocumented (and therefore unrepeatable) processes are developed to deliver the data. The data has totally unpredictable quality (the quality depends completely on the person who defines and collects it). Non-corporate data is usually not shared beyond the person or group who creates it. Examples of Non-corporate data are meeting notices, informal agendas, personal calendars, information on scraps of paper, temporary files, or temporary spreadsheets.

Non-corporate data might be created to help answer tactical questions for which operational-level Corporate data is not in a suitable format. Non-corporate data could also be a copy of Corporate data, taken at some point in time and manipulated outside the corporate infrastructure (i.e. without regard to the defined standards for use and update of corporate information). As soon as the copy of Corporate data is taken, the copy becomes Non-corporate data.

Non-corporate data is still useful, but not relied on for making critical decisions or providing specific information (the intent is quick results). Care must be taken in the use of such data: since there is little investment in ensuring Non-corporate data is accurate (no standardization in collection procedures, etc.), it should not be relied on without extremely well-informed judgment. If the use of Non-corporate data might lead to erroneous or inaccurate assumptions, then it should not be used. Staff are accountable for how they use Non-corporate data in their decision-making processes and for copies of Corporate data they or others have transferred to their workstation -- if the Corporate data copy is corrupted, and the workstation user makes a decision based on the erroneous data, they are still professionally responsible for the outcome.

Local Corporate Data

Local data has meaning in the context of the local program only. Similar information with different definitions or interpretations may exist elsewhere, but the data still may be shared among multiple districts. Because of the different definitions or collection standards, when different districts try to share the data there are formatting and translation problems. Trying to share Local data among multiple districts therefore requires significant effort before actual sharing can take place. This effort must happen each time sharing is attempted.

Local data is not defined or measured to a published standard. The staff who create Local data have, however, put some effort into standardization -- the data is produced by a repeatable but not well-defined process. In other words, the process is repeatable (defined enough for a person to redo the same thing and get the same results), but since it is not fully defined and published, other groups, sections, or district offices will not be able to use it properly.

Staff using this data generally know its limitations and are careful. What sets Local data apart from Non-corporate data is the fact (or, staff judgement) that the data is vital to the ministry's business, and is permanent (i.e. has lasting value).

Extended Corporate Data

Extended data refers to the extra data that a district may need to create, to general standards within the district or across a few districts. This data is often identified and collected because of an operational need at the district level, where staff choose not to wait for the Custodian to negotiate full provincial agreement on what exactly to collect. Different districts will often collect different things under the same name, or the same things under different names, since there is no ministry-wide agreement. Because of this quasi-agreement and only limited published understanding, staff may use the data in activities not suited to the manner or purpose of collection. The data may generate answers that are acceptable for that specific time and that specific district, but the answers cannot be generalized (if they are generalized unknowingly, flawed business decisions can be made).

The thing that sets Extended data apart from Local is that some (more) energy has been put into defining Extended data from a province-wide context. For example, the definition and attributes would have been published to all districts. (We don't yet have an easy way for district staff to do this, but the design of the new Corporate Spatial Database -- INCOSADA -- will provide it, and the Data Administration group is working on an interim solution.) With Extended data, the negotiation process is just beginning, to move the data to a province-wide standard.

The difference between Extended and Full Corporate data is that Extended data has not yet gone through the negotiation and validation process necessary with business stakeholders (normally facilitated by the Data Custodian) to reach full province-wide agreement on the data standards.

Full Corporate Data

Full Corporate data is the most mature data, where a Data Custodian has defined ministry standards for its definition, collection, entry, and use. Full Corporate data can be further broken into Shared (used by several business areas -- e.g. Forest Client information) and Program-Specific (all collection and use is within the same line program -- e.g. lightning location data). Most Full Corporate data is Shared -- it is potentially used by many staff or by different programs.

The major point that sets Full Corporate data apart from the other types (Extended Corporate, Local Corporate) is that it is managed rigorously; it must meet the most stringent ministry-wide standards for accuracy and use, because so many people rely on it to make business decisions. Some of the ways corporate data is rigorously managed are:

  • A Data Custodian is identified who is responsible for ensuring the data meets ministry information needs.
  • A primary source is identified for the data, from which all requests or copies originate.
  • Standards for entry, update, use, and disposal of corporate data are defined by the Data Custodian, along with procedures for ensuring those standards are followed properly by all staff.
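The distinctions drawn between the four categories can be summarized in a short Python sketch. This is a simplification under stated assumptions: the `Category` enum, the `categorize` function, and its four yes/no inputs are hypothetical names invented for this example, and real decisions rest on staff and Data Custodian judgement rather than on simple flags.

```python
from enum import Enum


class Category(Enum):
    NON_CORPORATE = "Non-corporate"
    LOCAL = "Local Corporate"
    EXTENDED = "Extended Corporate"
    FULL = "Full Corporate"


def categorize(vital: bool,
               permanent: bool,
               definition_published: bool,
               custodian_standards_agreed: bool) -> Category:
    """Place data in a category using the distinctions described in this document.

    All four parameters are illustrative simplifications of staff judgements.
    """
    # Data that is not both vital and permanent is Non-corporate.
    if not (vital and permanent):
        return Category.NON_CORPORATE
    # Full: a Data Custodian has defined ministry-wide standards for the data.
    if custodian_standards_agreed:
        return Category.FULL
    # Extended: definition published to districts, province-wide agreement pending.
    if definition_published:
        return Category.EXTENDED
    # Local: vital and permanent, but meaningful only in the local program.
    return Category.LOCAL
```

For example, `categorize(True, True, True, False)` returns `Category.EXTENDED`: the data is vital and permanent and its definition has been published, but the negotiation toward a province-wide standard is not yet complete.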

Rationale
(Why this is so important)

In the case of data used from a corporate perspective (that is, potentially used by many staff or by different programs), ensuring that the data is accurate, timely, and correct has quite a significant cost. For example, to ensure data about something is entered properly (accurately, correctly) requires appropriate ongoing training programs across the province for all staff, so they have the knowledge to act appropriately. These kinds of training programs alone are expensive to create and administer; and yet, they are only part of the cost of maintaining data. An example of the direct cost of data is that the ministry often hires contractors to gather and input data from the field; while this frees ministry staff to perform their other vital roles, the contractors do cost money.

If we don't stringently manage our corporate data, we cannot rely on that data when the time comes to make business decisions, because we won't know how accurate the information is. These days, tough questions are continually being asked; we are expected not only to have the answers, but the right answers. Also, each day different (unanticipated) questions are being asked.

Shared Full Corporate Data

Corporate data is generally, though not always, shared across one or more programs or between several offices. The designation "Shared" means the information must be treated with the highest standards of care and cooperation between ministry programs. Shared data generally affects several different program areas, and any change to such data invariably affects multiple computer applications, policies, procedures, and/or ministry business itself.

Program-Specific Full Corporate Data

On the other hand, there is vital data in the ministry that requires careful management, but is not shared at all with other program areas, or is only minimally shared. This data must still be designated Full Corporate since it is so vital to the ministry. However, since the effect on other areas resulting from decisions to change the data is slight, it may not need to be managed with quite so much rigor. The program area involved (the Data Custodian) consciously decides what the level of risk is, acts accordingly, and remains responsible for any outcome.

Summary

The quality of data in any organization is directly dependent on the effort that is put into its definition and management. If we all do the work necessary to agree on how to define and use certain data, we can all (across the ministry, in each and every district) be confident that we can share data. However, if we each don't put that effort in (i.e. if each district defines their own set of data), then a huge amount of discussion, negotiation, translation, and conversion will be required during operational projects before sharing is possible.

Currently, ministry staff have no easy way of publishing Extended Corporate data definitions province-wide. This capability is being developed within the INCOSADA project (the ministry's corporate spatial and attribute database, currently under development) and by Data Administration staff.


Acknowledgements

Jeremey gratefully acknowledges the prior work done by the following people. Without their insightful theories and thought, the concept of the Corporate Data Categories would never have made any sense.

  • Capability-Maturity Model (process maturity) from the Software Engineering Institute at Carnegie Mellon University
  • Information/Process Maturity -- from the "Method for Establishment of Strategic Improvement Opportunities", developed by Guy Friswell (then of DMR, now of QVI Consulting Group, Victoria) and Gerry Moore of the BC Ministry of Transportation & Highways
  • And thanks to Richard Dzobia for bringing these works to my attention and explaining them to me

Jeremey Janzen, I.S.P.