Corporate Data Categories


First presented at the
Corporate Spatial Database Meeting
November 20/21, 1995
To request a live, in-person presentation at your site, please contact:
Jeremey Janzen, Data Administrator, Information Management Group
(250) 387-8449; email jeremey.janzen@gov.bc.ca
We encourage re-use of the ideas contained here, provided you acknowledge the source.
 


Remember Einstein's quote: "The significant problems we face cannot be solved at the same level of thinking we were at when we created them." This presentation outlined a framework that provides the new level of thinking required if we in the Ministry of Forests are going to solve the problems we face. It will not happen without some pain - some "letting go" of current thinking patterns.


A framework of data types, or information categories (non-Corporate, Local Corporate, Extended Corporate, Full Corporate) and roles (Data Custodian [/Data Standards Manager], Steward, Application Custodian, Data Resource Manager) will help us assess the state of our information and who is responsible for which parts.

Corporate Data Categories

Corporate data is data that is permanent and vital to the ministry's operations. The test is: "Is this information important, critical, vital enough to collect on behalf of the ministry? Is it important enough to store?" If it is, then it is corporate data. There are three defined levels of Corporate data, as well as a fourth category, "non-Corporate".

See a diagram of the Corporate Data categories (20kb); read it from left to right for increasing data sharing.

Non-Corporate

  • Produced by ad hoc, undocumented processes for quick results; unpredictable quality
  • Examples: daytimers, transient working files or spreadsheets, rough notes to plan a meeting, personal calendars

Local Corporate

  • Temporary, used for local (office) decision-making only
  • Not measured or defined to standards
  • Similar information with different definitions / interpretations may exist elsewhere
  • Produced by a repeatable, but not well defined process
  • Example: staff in a district may agree to collect estimated area, location, date for pest infestations if they happen to spot one while in the field - "10ha pest infestation 6km from Big Bear Creek, June 13/95" - but the accuracy of "area" and "location" will vary depending on who collects the data, and the definition is not correlated with other districts (i.e., another district may decide that area is not important but that the infestation intensity should be collected).

Extended Corporate

  • NOTE: The ministry does not yet have a data dictionary where district staff can enter their "operational data" definitions and process definitions. One of the goals of the INCOSADA (Integrated Corporate Spatial and Attribute Database) project is to provide a district-updatable Extended Data dictionary that will make the publishing of such definitions fairly easy. Until that is available, the Data Administration group within ISB is working on an interim solution.
  • On the road to Full Corporate, maybe
  • Becomes Extended when the definition is published province-wide, and the data itself is stored on a corporate platform (not a PC). For the definitions, this means that at least some energy has been put into a ministry definition -- the definition may not be fully correlated to definitions from other sources (i.e. there still may be multiple different definitions throughout the province), but it's a start.
  • Produced by a defined process (as part of the rigor of becoming Extended), but not measured or optimized; not used for province-wide decision-making.
  • A district may choose to publish its definition of a particular information need, or work with one or more other districts to agree on the definition. The more energy and negotiation that goes into the definition, the farther along the data is (when the appropriate Branch is involved, the data is moving to Full Corporate).
  • Example: staff in a district have agreed to collect estimated area, location, date in the same way using the same codes -- "bark beetle damage, 12ha, severity 9, location 56" -- publish their definition so that all or nearly all other districts can see it (e.g. in a province-wide discipline expert meeting; as a published report; etc.), and store the data on a corporate platform (e.g. a LAN server, NOT a PC).
  • The key behind Extended Corporate Data is that its definitions are visible to all staff and kept in a place that has ongoing data management integrity. The good thing about the visibility part is that it allows the negotiation process to begin among all ministry staff (about what a good data structure is to meet the ministry's business needs).

Full Corporate

  • Common definition across the ministry, collected by well-defined, standard, formal, and measurable processes, such that it can be easily, reliably, and confidently linked to other Corporate data
  • Consistently used across the ministry (everyone collects and uses it in the same way)
  • Permanent and vital, for both operational and strategic decision making - relied on to be accurate, meaningful, and available
  • One source, one designated Custodian, and normally shared access
  • Ongoing training, support, and management of the data itself and the underlying data structures
  • Example, Shared Full Corporate: Client data; ISIS data; tenure information...
  • Example, Program-Specific Full Corporate: Lightning Location information (this data is vital to the ministry and of a lasting nature but only the Protection program uses it).

Notes:

  • It takes energy (staff resources, $, time) to move data towards Full Corporate, and it also takes energy to keep data there. The status quo costs us as well: for example, the translation effort district staff go through whenever two districts want to share their Local data, e.g. for a Local Resource Use Plan on a common watershed.
  • As the data definition is negotiated throughout the province (and more districts agree), sharing becomes easier for everybody - the effort of agreeing on a common definition has a payback. It is certainly worth it for some data (but maybe not everything).
  • Right now we think we have lots of Full Corporate data, but in many cases we (the ministry Custodians and data users) haven't put in the full energy required to make it so.
  • The district term "operational data" includes Local and Extended data.
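
The four categories above can be read as a small set of decision rules. The following is a minimal sketch in Python, not a ministry standard: the field names are hypothetical paraphrases of the criteria listed above, and it simplifies Full Corporate to "common definition plus one designated Custodian" (the real test also involves permanence, measurable processes, and ongoing support).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    NON_CORPORATE = "Non-Corporate"
    LOCAL = "Local Corporate"
    EXTENDED = "Extended Corporate"
    FULL = "Full Corporate"


@dataclass
class DataSet:
    # All field names are hypothetical; they paraphrase criteria from the text.
    name: str
    repeatable_process: bool = False          # collected by a repeatable process
    definition_published: bool = False        # definition published province-wide
    on_corporate_platform: bool = False       # stored on a corporate platform, not a PC
    common_ministry_definition: bool = False  # one definition used across the ministry
    has_custodian: bool = False               # one source, one designated Custodian


def classify(ds: DataSet) -> Category:
    """Apply the category rules from least to most shared."""
    if not ds.repeatable_process:
        return Category.NON_CORPORATE
    # Extended requires BOTH a published definition and corporate storage.
    if not (ds.definition_published and ds.on_corporate_platform):
        return Category.LOCAL
    if ds.common_ministry_definition and ds.has_custodian:
        return Category.FULL
    return Category.EXTENDED


# Examples taken from the category descriptions above.
rough_notes = DataSet("rough notes to plan a meeting")
pest_sightings = DataSet("district pest sightings", repeatable_process=True)
beetle_codes = DataSet(
    "bark beetle damage, coded",
    repeatable_process=True,
    definition_published=True,
    on_corporate_platform=True,
)
client_data = DataSet(
    "Client data",
    repeatable_process=True,
    definition_published=True,
    on_corporate_platform=True,
    common_ministry_definition=True,
    has_custodian=True,
)

for ds in (rough_notes, pest_sightings, beetle_codes, client_data):
    print(f"{ds.name}: {classify(ds).value}")
```

Note that a published definition alone does not move data out of Local in this sketch; the data must also be stored on a corporate platform, as the Extended Corporate description requires.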

Roles

Corporate data has a Data Custodian

  • By definition, a Data Custodian (DC) is a branch director responsible for implementing the relevant legislative requirement of the ministry. This is not a district staff role (districts do not have the resources to take a province-wide look at things).
  • Defines business data standards from a provincial or ministry perspective (not departmental or program-only), including:
    • what to collect (how much, how detailed, how accurate; as per RIC, or business standards from another source)
    • how to use
    • where to store/retrieve
  • There may also be a Steward involved:
    • A Steward is already a Data Custodian; therefore this is never a district staff role.
    • The Steward provides a processing platform and digital standards, for another program to store data.
    • The original Data Custodian (e.g. Range) is still responsible for defining the business standards (e.g. what is a Range Unit); the Steward (e.g. Resource Inventory) only deals with digital standards (i.e. Steward would designate that "a Range Unit is stored as Feature Code xx" and "has the following line weights", etc.).
  • A Data Custodian may (will probably) delegate the day-to-day data and issue management duties to a Data Standards Manager (this is a generic title and does not mean the staff have to be in a management classification).

Application Custodian

  • An Application Custodian (AC) builds applications to access Corporate Data.
  • The AC may be the same as the DC (e.g. Financial Management Branch is the Data Custodian of Client data and also Application Custodian of the Client Management System).
  • The AC may be different from the DC (e.g. for Operations Division applications such as ISIS or FTAS, Business Design Branch is the AC, and the respective DCs are Silviculture and Resource Tenures & Engineering). In these cases the AC and DC must team up to provide the best solutions: the DC decides what to collect; the AC decides how to deliver it.
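
The custodian roles described so far can be recorded as a simple registry. This is a hedged sketch only: the `Custodianship` class and its field names are hypothetical, and the entries just restate the examples given in the text (Client data, ISIS, FTAS, Range Units).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Custodianship:
    # Hypothetical record layout; not an actual ministry data structure.
    data: str
    data_custodian: str                          # defines business standards
    steward: Optional[str] = None                # defines digital standards, if any
    application: Optional[str] = None
    application_custodian: Optional[str] = None  # builds the application

    def needs_teaming(self) -> bool:
        """True when the AC and DC are different groups and must work together."""
        return (
            self.application_custodian is not None
            and self.application_custodian != self.data_custodian
        )


REGISTRY = [
    Custodianship(
        "Client data",
        data_custodian="Financial Management Branch",
        application="Client Management System",
        application_custodian="Financial Management Branch",
    ),
    Custodianship(
        "ISIS data",
        data_custodian="Silviculture",
        application="ISIS",
        application_custodian="Business Design Branch",
    ),
    Custodianship(
        "FTAS data",
        data_custodian="Resource Tenures & Engineering",
        application="FTAS",
        application_custodian="Business Design Branch",
    ),
    Custodianship(
        "Range Unit",
        data_custodian="Range",
        steward="Resource Inventory",
    ),
]

for entry in REGISTRY:
    note = "AC and DC must team up" if entry.needs_teaming() else "no teaming needed"
    print(f"{entry.data}: DC = {entry.data_custodian} ({note})")
```

Keeping the Steward as a separate optional field reflects the split described above: the Data Custodian keeps the business definition, while the Steward only sets digital standards.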

Data Resource Manager (DRM)

  • The staff responsible for collecting data at the field level are the DRMs, although ultimate accountability rests with the senior manager of the office (e.g. District Manager). They are accountable for collecting Corporate data to the standards set by the Data Custodian. The DRM is a key role for the ministry: where Full Corporate data does not exist, they will determine what data is important enough to begin moving into the Extended Corporate category.
  • In the new ministry organization, the LIM (Land Information Management) team is in many cases being asked to act in that capacity. This role is not yet clear or well defined, but it is being fully defined in the next update of ministry policy 7.3, Corporate Information Custodianship (target publish date spring 1998).

Coordinator

  • This role used to be the Steward's representative in the districts (in the pre-1994 ministry organization). The Coordinator (e.g. Inventory) would enter data from another program (e.g. Range), and would be accountable for ensuring the digital entry standards are followed (e.g. spatial numbering conventions). The Coordinator is not accountable for business standards, such as how accurate a Range Unit polygon measurement is. With the new LIM organization structure(s) in the districts, it is not clear whether this generic "coordinator" role is still relevant.

Issue, Current Technology

Spatial "modelling" standards (how to document spatial data requirements and designs before building the database) do not exist in the systems industry. In current spatial technology, the logical data model (oriented to "business data requirements") is part of the physical data model (how a particular vendor's technology implements spatial data structures). Contrast this with relational databases, where a standard SQL statement (logical) is essentially the same no matter which vendor platform (physical) you run it on.

The impact of this is that the work to design and input data requirements in a GIS is vendor dependent (i.e. switch vendor technology and all that business design effort has to be redone). And, because of instability in the computer industry, and especially in the spatial industry, we don't know which vendors or which technologies will still be around several years from now. This makes it very risky in the short term to invest heavily in any single vendor.
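
The relational contrast drawn above can be illustrated with a short sketch. The query text is the logical layer, and only the connection is tied to a particular engine. The table `pest_infestation`, its columns, and the rows are illustrative assumptions, and SQLite is used only because it ships with Python. In practice vendor dialects do differ at the edges, so this shows the principle rather than a portability guarantee.

```python
import sqlite3

# Logical layer: plain standard SQL, with no vendor-specific syntax.
QUERY = """
SELECT pest, SUM(area_ha) AS total_area_ha
FROM pest_infestation
GROUP BY pest
ORDER BY pest
"""


def total_area_by_pest(conn):
    """Run the logical query on any Python DB-API 2.0 connection."""
    cur = conn.cursor()
    cur.execute(QUERY)
    return cur.fetchall()


# Physical layer: an in-memory SQLite database (an arbitrary choice).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pest_infestation (pest TEXT, area_ha REAL, severity INTEGER)"
)
# Illustrative rows only; the '?' placeholder style is specific to this driver.
conn.executemany(
    "INSERT INTO pest_infestation VALUES (?, ?, ?)",
    [("bark beetle", 12.0, 9), ("bark beetle", 10.0, 7), ("budworm", 4.0, 3)],
)

print(total_area_by_pest(conn))
```

Swapping the engine would mean changing only the connection setup; `QUERY` and `total_area_by_pest` would stay the same. That separation is what the text says spatial technology lacked at the time.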