Appendix 1 Operational adjustment to site index study (OASIS): sampling design for estimating proportional areas of site series. Report prepared by A.F. Linnell Nemec for BC Ministry of Forests, June 1997.
Operational Adjustment to Site Index Study (OASIS): Sampling Design for Estimating Proportional Areas of Site Series
Revised: June 12, 1997
Resources Inventory Branch B.C. Ministry of Forests Victoria, B.C.
Prepared by Amanda F. Linnell Nemec I.S.R. Corporation P.O. Box 496, Brentwood Bay, B.C. V8M 1R3
1. INTRODUCTION
Reliable estimates of site index are essential for forecasting timber supplies. Site index, a measure of the potential productivity of a site, is related to climate, topography, soil type, vegetation, and other characteristics embodied in the provincial Biogeoclimatic Ecosystem Classification (BEC) system (Meidinger and Pojar 1991). Therefore, estimates of the proportional areas occupied by different site series, and their associated site indices, can be used to make inferences about the productivity of a timber supply area (TSA) or other geographic region. This approach exploits empirical relationships derived from the Site Index/BEC (SIBEC) study (B.C. Ministry of Forests 1996) and is an alternative to the paired-plot method of adjustment that was used in the Old Growth Site Index (OGSI) project.
The purpose of this report is to describe a survey design for apportioning a geographic area to site series. Section 2 outlines the main elements of the design. Computational formulas for the estimated proportions and their standard errors are given in Section 3. Section 4 describes a pilot study of the proposed sampling and estimation procedures, which will be conducted in the Bulkley TSA. The report concludes with a summary in Section 5.
2. SURVEY DESIGN
2.1 Target Population
The first step in designing a sampling plan is identification of the target area. This will usually correspond to a TSA but can be any area with well-defined boundaries. Let R denote the region of interest and let A denote its total area in hectares. For present purposes, the region R is assumed to be made up of non-productive land (e.g., urban areas, roads, railway tracks, water bodies), for which neither site series nor site index is defined, and potentially productive land, which can in theory be assigned both a site series and a site index. These two classes are approximately equivalent to the "non-vegetated" and "vegetated" (Level 1) categories of the BC Land Classification Scheme (B.C. Ministry of Forests 1995). Productive land encompasses all types of operable and inoperable areas, including parks, private land, and crown land. This class can be divided into BEC zones, subzones, and variants, which can be further sub-divided into site series (Figure 1).
Figure 1. Land classification by site series.
The objective of the survey is to estimate the proportional area occupied by each site series that occurs in R, that is

$$ p_{ij} = \frac{A_{ij}}{A}, \qquad i = 1, \ldots, I; \quad j = 1, \ldots, m_i \qquad [1] $$

where: $A_{ij}$ is the total area (in hectares) of Site Series j in Subzone/Variant i; $m_i$ is the total number of site series in Subzone/Variant i; and I is the total number of subzones/variants in R.
2.2 Sampling Units
There are three main types of sampling units that can be used to estimate map areas: points, lines, or fixed-area plots (Figure 2). If points are used then each unit is classified according to whether or not it falls within the site series of interest (shaded area in Figure 2). In the case of lines or plots, the intersection length or area of overlap is measured. Circular plots with a fixed area of 400 m² (i.e., radius 11.28 m) will be used to estimate site-series proportions for OASIS. If a site series has a typical patchy spatial distribution then plot estimates are expected to be more precise (i.e., have smaller variances) than estimates based on the same number of points or lines. On the other hand, plots tend to be more time-consuming to measure. A 400 m² plot is consistent with SIBEC standards (Nigh et al. 1997) and represents a compromise between the relatively high cost of measuring even larger plots and the reduced precision of smaller plots or points.
Figure 2. Sampling units for estimating area: (a) points, (b) lines, and (c) fixed-area plots.
2.3 Sample Selection
Three possible plans for locating the sample plots were considered: simple random sampling, stratified random sampling, and systematic random sampling.
In simple random sampling, n plots are randomly located within the target region R. This plan is simple to implement and the associated estimators have well-known statistical properties. Simple random sampling has the additional advantage that extra sample plots can be added (e.g., to improve the precision of the estimates) without compromising the design. The main disadvantage is a potential loss of efficiency relative to stratified random sampling and systematic random sampling. Thus, the standard errors of the estimated proportions may be larger than the corresponding errors for a stratified or systematic random sample of the same size n.
In stratified random sampling, the region R is divided into non-overlapping strata (or sub-regions) and a simple random sample of plots is selected from each. If the variation in the proportional area of a site series is significantly greater between strata than it is within strata then stratified sampling is more efficient than simple random sampling (see Cochran 1977). This condition is likely to hold for strata defined by subzone, forest cover, or other key predictors of site series. However, pre-stratification by these variables is generally not feasible because subzone and forest cover boundaries are uncertain before the sample is drawn. Post-stratification by the same variables is possible. Another alternative is pre-stratification by geographic location (e.g., longitude and latitude), which is known a priori and is likely to account for a substantial fraction of the variation.
A systematic random sample is a sample that is selected in an orderly fashion starting with a randomly selected unit. For instance, to select a systematic sample from R, sample plots might be located on a uniform grid with a random orientation.
Systematic spatial sampling results in an even distribution of sample points over the target region and is often more efficient than random sampling and stratified random sampling (refer to Das 1950, Dunn and Harrison 1993). Its main drawback is inflexibility in accommodating additional samples of arbitrary size (i.e., supplementary plots can be added only at fixed intervals between the original sample plots). Another disadvantage is an absence of exact formulas for calculating error variances, although various approximations have been suggested (see Dunn and Harrison 1993).
After considering the pros and cons of simple random sampling, stratified random sampling, and systematic random sampling, simple random sampling from a uniform grid was adopted for the OASIS project. Sampling locations will be randomly selected without replacement from a 100 m × 100 m provincial grid, which is aligned with the 20 km × 20 km National Inventory Grid established by Natural Resources Canada. For a target region R, the sampling frame consists of N ≈ A plots (i.e., one plot per ha) representing approximately 4% of the total area (each 400 m² plot covers 4% of its 1 ha grid cell). Owing to the random placement of the provincial grid and the relatively small grid size, whenever A is large (e.g., more than 100,000 ha), the difference between the sampling frame and the target population is expected to be negligible, and will be ignored in the subsequent discussion.
The primary advantage of using the provincial grid is compatibility with the Vegetation Resources Inventory (VRI), which uses the same grid. This provides a direct link between OASIS and the VRI, thereby facilitating data handling and the sharing of information through a common Geographic Information System (GIS). Sampling from the grid also ensures that the sample plots are a minimum of 100 m apart so that at least some of the advantages of systematic sampling are realized.
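The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target region is a hypothetical rectangle, the function names `build_frame` and `select_plots` are invented for the example, and a real sampling frame would be the provincial grid clipped to R in a GIS.

```python
import math
import random

GRID_SPACING_M = 100.0  # provincial grid spacing (from the text)
PLOT_AREA_M2 = 400.0    # area of one circular plot (from the text)


def build_frame(width_m, height_m, spacing_m=GRID_SPACING_M):
    """Return grid points (x, y) covering a hypothetical rectangular region."""
    nx = int(width_m // spacing_m)
    ny = int(height_m // spacing_m)
    return [(i * spacing_m, j * spacing_m) for i in range(nx) for j in range(ny)]


def select_plots(frame, n, seed=None):
    """Simple random sample of n grid points, drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical 5 km x 4 km region: 50 x 40 grid points, one per hectare.
frame = build_frame(5000.0, 4000.0)
plots = select_plots(frame, 60, seed=1997)

# Fraction of each 1 ha grid cell covered by a plot, and the plot radius.
sampled_fraction = PLOT_AREA_M2 / GRID_SPACING_M ** 2
plot_radius_m = math.sqrt(PLOT_AREA_M2 / math.pi)

print(len(frame), len(plots), sampled_fraction, round(plot_radius_m, 2))
```

Because `random.sample` never returns the same element twice, every selected plot center is a distinct grid point, so any two plots are at least one grid spacing apart.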
2.4 Measurements
Sample plot centers will initially be located on maps and aerial photographs to determine whether field measurements are required. Plots that lie entirely within large bodies of water, urban centers, or other areas that have no site series will be classified as 100% non-productive and no further information (other than location) will be recorded. All other plots, including border plots and plots with uncertain classifications, will be assessed in the field and the following data will be collected:
- plot location (longitude, latitude, and UTM coordinates of plot center);
- BEC zone, subzone, variant, and site series;
- proportional areas occupied by site series and non-productive land; and,
- for plots that have uniform ecosystems (i.e., one site series) and meet other applicable SIBEC standards (see Nigh et al. 1997), site-index data (i.e., species, height, and breast-height age of three suitable site trees).
The assessment of BEC zone, subzone, variant, and site series will follow standard procedures given in regional field guides for site identification and interpretation. Site series and non-productive land will be mapped for each plot and their areas recorded as percentages of the total plot area (Figure 3). Site-index data will be collected in accordance with SIBEC standards.
Figure 3. Measurement of areas of site series. All site series and non-productive land are mapped within the sample plot (circle) and the corresponding areas of overlap (a+b+c = 100%) are recorded.
To ensure that the estimated site-series proportions are as reliable as possible, a complete set of measurements must be obtained for all plots. Plot access will be established during an initial inspection of maps and aerial photographs. Permission to gain access to private property will be requested. If access is denied, or plots are for some other reason inaccessible (e.g., too hazardous to locate), then the following options, in order of preference, will be adopted: (i) ground measurements will be replaced with helicopter observations ("air calls"); (ii) missing data will be replaced with photo-estimates or the best available information from other sources; or (iii) ground measurements will be recorded as missing data. If either of the first two options is used then the source of data for the inaccessible plots will be documented for future reference.
2.5 Sample Size
There are two main requirements that determine sample size. First, the sample size should be large enough to ensure that all site series occupying some minimum area are sampled at least once. Second, the estimated proportional areas of the site series should be sufficiently precise to meet the survey objectives. For simple random sampling, the probability that a site series overlaps at least one of the n sample plots is

$$ \text{Selection probability} = 1 - (1 - p_1)^n \qquad [2] $$
where p1 is the probability that a single plot hits the site series. In general, p1 depends on the spatial configuration of the site series, the total area of the site series, and the size and shape of the sample plots. A conservative estimate of sample size can be obtained by assuming that p1 is proportional to the total area of the site series. Under this assumption, there is a 78% chance that a site series that occupies only 0.5% of the total area will be sampled at least once when the sample size is 300. This increases to 92%, 97%, and 99% when the number of sample plots is increased to 500, 700, and 1000 (Figure 4).
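The percentages quoted above follow from the complement rule: the chance of missing the site series with all n independent plots is (1 − p1)^n. A minimal Python sketch, assuming as in the text that p1 equals the proportional area of the site series (the function names are illustrative, not part of the report):

```python
import math


def selection_probability(p1, n):
    """Probability that at least one of n plots hits the site series."""
    return 1.0 - (1.0 - p1) ** n


def min_sample_size(p1, target):
    """Smallest n whose selection probability is at least `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p1))


# Site series occupying 0.5% of the total area.
for n in (300, 500, 700, 1000):
    print(n, round(100 * selection_probability(0.005, n)))

n_95 = min_sample_size(0.005, 0.95)
print(n_95)
```

Under the same assumption, `min_sample_size` inverts the relationship to give the number of plots needed for a chosen selection probability.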
Figure 4. Minimum selection probability [2] versus area of site series.
The minimum sample size required to achieve a desired level of precision in the estimated site-series proportions is given by a standard formula for simple random sampling:

$$ n = \left( \frac{t_{\alpha/2,\,n-1}\; s}{E} \right)^2 \qquad [3] $$

where $t_{\alpha/2,\,n-1}$ is the upper α/2 × 100 percentile of Student's t distribution with n − 1 degrees of freedom, s is the standard deviation of the measured area (percentage overlap with sample plots) of the site series of interest, and E is the maximum acceptable error margin when the confidence level is (1 − α) × 100%. Because n also enters through the degrees of freedom, this equation is solved for n iteratively after specifying α and the error margin E, and supplying an estimate of s (refer to Section 4.2).
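As a rough illustration of the sample-size calculation, the sketch below replaces the Student's t percentile with the standard normal percentile, which is an assumption made here to avoid iteration and is reasonable only when n is in the hundreds. The function name and the input value s = 30 percentage points are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(s, E, alpha=0.05):
    """Approximate n = (z * s / E)^2, using z in place of the t percentile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * s / E) ** 2)


# Illustrative input: s = 30 percentage points, error margin of 2 points.
n_required = sample_size(s=30.0, E=2.0)
print(n_required)
```

Both s and E must be expressed in the same units (here, percentage points of plot area).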
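3. PARAMETER ESTIMATION
Estimates of the population proportions $p_{ij}$ [1] are obtained by averaging the plot proportions for each site series, that is,

$$ \hat{p}_{ij} = \frac{1}{n} \sum_{k=1}^{n} y_{ijk} \qquad [4] $$

where $y_{ijk}$ is the proportional area (%) of Plot k that is occupied by Site Series j in Subzone/Variant i (see Figure 3). If the site series does not occur in Plot k then $y_{ijk} = 0$ and the numerator is simply the sum over those plots that overlap the site series. The proportional area of non-productive land $p_0$ can be estimated in the same way:

$$ \hat{p}_{0} = \frac{1}{n} \sum_{k=1}^{n} y_{0k} $$

where $y_{0k}$ is the percentage area of Plot k that is non-productive. The estimated standard error (s.e.) of $\hat{p}_{ij}$ is

$$ \mathrm{s.e.}(\hat{p}_{ij}) = \frac{s_{ij}}{\sqrt{n}} \qquad [5] $$

where

$$ s_{ij}^2 = \frac{1}{n-1} \sum_{k=1}^{n} \left( y_{ijk} - \hat{p}_{ij} \right)^2 \qquad [6] $$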
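For a single site series, the estimate is the sample mean of the plot percentages and its standard error is the sample standard deviation divided by the square root of n. A minimal Python sketch, with hypothetical plot data and an illustrative function name (the finite population correction is ignored here):

```python
import math
from statistics import mean, stdev


def estimate_proportion(y):
    """Return (estimated proportional area, standard error) for one site series.

    y holds one percentage per sample plot; plots that do not overlap
    the site series contribute a value of 0.
    """
    n = len(y)
    p_hat = mean(y)
    se = stdev(y) / math.sqrt(n)  # stdev uses the n - 1 divisor
    return p_hat, se


# Hypothetical percentages of plot area for ten sample plots.
y = [0, 0, 100, 40, 0, 0, 60, 0, 0, 0]
p_hat, se = estimate_proportion(y)
print(p_hat, round(se, 2))
```

Zero-valued plots must stay in the list: dropping them would average only over plots that hit the site series and overstate its proportional area.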
The finite population correction $1 - n/N$ is assumed to be approximately one (i.e., n << N) and has been omitted from the above expressions.
3.1 Post-Stratification
Post-stratification of the n plots according to leading species, age class, or other information contained in the map labels may improve the precision of the estimated site-series proportions. Moreover, if the strata match the appropriate units of analysis then the results can be applied to the problem of forecasting timber supplies. Assume that the target region R is divided into H non-overlapping strata (see Figure 5) and that every plot in the sampling frame belongs to a single stratum (i.e., there are no border plots). A stratified estimate of the proportional area of each site series is obtained by calculating the weighted average of the stratum proportions:

$$ \hat{p}_{ij,\mathrm{st}} = \sum_{h=1}^{H} W_h \, \hat{p}_{ijh} $$

where the weight $W_h$ for Stratum h is proportional to its area $A_h$ (or the total number of plots $N_h$):

$$ W_h = \frac{A_h}{A} \approx \frac{N_h}{N} $$

and the estimated stratum proportion:

$$ \hat{p}_{ijh} = \frac{1}{n_h} \sum_{k=1}^{n_h} y_{ijhk} $$
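is based on the $n_h$ sample plots that belong to Stratum h. If the stratum weights are known, or can be estimated from the sample with sufficient precision (i.e., $W_h \approx n_h/n$ and $n_h > 50$ for every stratum), then the approximate standard error of the stratified estimate is:

$$ \mathrm{s.e.}(\hat{p}_{ij,\mathrm{st}}) \approx \sqrt{ \sum_{h=1}^{H} \frac{W_h^2 \, s_{ijh}^2}{n_h} } $$

where

$$ s_{ijh}^2 = \frac{1}{n_h - 1} \sum_{k=1}^{n_h} \left( y_{ijhk} - \hat{p}_{ijh} \right)^2 $$

and $y_{ijhk}$ is the measured proportional area (%) for the kth sample plot in Stratum h.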
Figure 5. Post-stratification of sample plots. In this example, the target region (shaded area) is divided into four homogeneous strata (e.g., BEC zones, leading species, age groups, geographic sub-regions). Stratum membership is determined after the sample plots (circles) have been measured.
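The post-stratified calculation can be sketched as a weighted average of stratum means. The standard error below uses the usual stratified-sampling form (sum of squared weights times stratum variance over stratum sample size); treating post-strata this way is an assumption that presumes the weights are known and the strata are well sampled. The data, stratum names, and function name are hypothetical, and the tiny strata are for illustration only (the text asks for more than 50 plots per stratum).

```python
import math
from statistics import mean, variance


def stratified_estimate(strata):
    """Weighted stratum means and an approximate standard error.

    strata maps a stratum label to (W_h, list of plot percentages),
    where the weights W_h sum to one.
    """
    p_st = sum(w * mean(y) for w, y in strata.values())
    var_st = sum(w ** 2 * variance(y) / len(y) for w, y in strata.values())
    return p_st, math.sqrt(var_st)


# Hypothetical strata with weights 0.6 and 0.4.
strata = {
    "stratum_1": (0.6, [0, 0, 20, 20]),
    "stratum_2": (0.4, [50, 70, 60, 60]),
}
p_st, se_st = stratified_estimate(strata)
print(round(p_st, 2), round(se_st, 2))
```

The gain over the unstratified estimator comes from the within-stratum variances being smaller than the overall variance when the strata differ in their site-series composition.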
3.2 Inaccessible Plots
The usual estimators [4, 5] can be used without modification when the missing data for inaccessible plots are replaced with approximate values (i.e., air calls or photo-estimates) and the number of inaccessible plots is small. Alternatively, plots with missing site series can be omitted and the sample size reduced by the number of such plots, in which case the revised estimator is:

$$ \hat{p}_{ij} = \frac{1}{n'} \sum_{k=1}^{n'} y_{ijk} $$

where the summation is over the remaining $n'$ plots (n less the number of omitted plots), with estimated standard error

$$ \mathrm{s.e.}(\hat{p}_{ij}) = \frac{s_{ij}}{\sqrt{n'}} $$

where

$$ s_{ij}^2 = \frac{1}{n' - 1} \sum_{k=1}^{n'} \left( y_{ijk} - \hat{p}_{ij} \right)^2 $$

4. BULKLEY TSA PILOT STUDY
A pilot study of the proposed survey design and estimation procedures (Sections 2 and 3) will be conducted in the Bulkley TSA. The study area has 11 subzones/variants (with up to ~10 site series per subzone) and a total area of 758,432 hectares (Table 1). The objectives of the pilot study are: (i) to develop and test field procedures for future surveys, (ii) to obtain improved variance estimates for calculating sample sizes, (iii) to determine the frequency of occurrence of inaccessible plots, and (iv) to test the application of the methodology in a timber supply analysis.
Table 1. Subzone areas for Bulkley TSA (Data Source: Bobby Love).

| Subzone | Area (ha) | Area (%) |
| --- | --- | --- |
| Atp | 103642 | 13.67 |
| CWHws2 | 18036 | 2.38 |
| ESSFmc | 198526 | 26.18 |
| ESSFmk | 5704 | 0.75 |
| ESSFmv3 | 353 | 0.05 |
| ESSFwv | 60428 | 7.97 |
| ICHmc1 | 27304 | 3.60 |
| ICHmc2 | 20770 | 2.74 |
| MHmm2 | 16589 | 2.19 |
| SBSdk | 68843 | 9.08 |
| SBSmc2 | 238237 | 31.41 |
| Total Area | 758432 | 100 |

4.1 Sampling Procedures
Plot centers corresponding to approximately 600 plots, or a minimum of 400 field plots, will be randomly selected, without replacement, from the provincial grid and entered into the GIS. Access will be determined and a suitable sampling schedule drawn up for all plots requiring ground measurements. Field procedures for locating and measuring the sample plots will be developed and tested during the pilot study. Appropriate safeguards (e.g., training programs, audits) will be implemented to ensure that measurements are made in a consistent and reliable manner.
Table 2 is a summary of the data to be collected. All ecosystem data (excluding site-index measurements) will be recorded on the "TEM Visual Inspection Form." The form FS882 and the Vegetation Environment NexUS (VENUS) data entry module will be used to record and enter data for plots that meet SIBEC standards and require the measurement of site trees. Details of the sampling procedures are given in the Appendix.
Table 2. Summary of sample data to be collected.
- Date of Survey
- Survey Crew
- Plot Identification
  - plot number (should reflect the order of measurement so that time trends can be analyzed)
  - longitude, latitude, and UTM coordinates of plot center
  - plot size (11.28 m radius)
- Ecosystem
  - BEC zone, subzone, and variant
- Site Series
  - plot map, estimated percentage areas
- Site Trees (3 suitable trees in plots that meet SIBEC standards)
  - species
  - crown class
  - tree height
  - age at breast height (based on increment bore)
- Miscellaneous
  - assessment of operability
  - species composition
  - time to locate and measure plot (times should be recorded if possible so they can be used to estimate sampling costs)
4.2 Sample Size
Estimates of the mean and variance of the measured site-series proportions yijk (Figure 3) were calculated using audit data supplied by Gord Nigh (Research Branch). The data, which were obtained during a previous survey of the Bulkley TSA, comprise site-series areas for a total of 50 clusters of four 100 m² plots (not all subzones were surveyed; e.g., Atp was excluded). To investigate the effects of plot size, the sample mean [4] and variance [6] were computed for a single plot from each cluster (50 plots in total) and for all four plots combined. Owing to the proximity of the four plots in a cluster, the latter should be approximately equivalent to a sample of fifty 400 m² plots.
A total of six subzones/variants were sampled regardless of the sampling unit (plot or cluster). Twenty-seven site series were identified in the fifty 100 m² plots, 72% of which had only one site series, with an average of 1.3 site series per plot. This increased to a total of 35 site series, with a mode of three site series per cluster (i.e., 52% of the clusters had 3 site series) and an average of 2.9 site series per cluster, when all four plots were pooled.
The observed relationship between the sample mean and variance is illustrated in Figure 6. Notice that when the plot area is 100 m² the variance is well approximated by the binomial formula, that is s² ≈ p(1 − p), where p is the proportional area of the site series, but when the sampled area is increased to 400 m² the variance is substantially reduced. A further reduction in variance might be expected for even larger plots; however, the optimum plot size could not be determined with this data set (because no cost estimates or plot maps were available).
The sample variances (Figure 6) were used to assess the adequacy of the proposed sample size of ~600 plots for the Bulkley TSA pilot study. A maximum error margin of ± 2% and a confidence level of 95% were assumed.
The resulting estimated sample sizes [3] are shown in Figure 7 for 100 m² plots and for clusters of four 100 m² plots. For site series occupying less than 10% of the area, a sample size of 400 (or fewer) appears to be sufficient when all four plots are included. The required sample size is approximately double when the sampling unit is reduced in area to 100 m². For site series that cover between 10% and 15% of the study area, the respective sample sizes are 600-700 and 900-1000. Thus, the proposed sample size of 600 (400 m²) plots appears to be adequate for ensuring that the error margin is around 2% (for site series covering 0-15% of the total area) and that site series occupying at least 0.5% of the area are likely to be sampled (Figure 4).
5. SUMMARY
A survey design for estimating proportional areas of site series in a TSA, or other well-defined region, was described. The sampling plan consists of 400 m² circular plots centered on randomly selected 100 m grid points. This design has several advantages over others. First, plots are generally more efficient (for estimating map areas) than points or lines, although the optimal plot size depends on spatial pattern. Second, locating the sample plots on a standard grid provides a link to the VRI and incorporates at least some of the desirable features of systematic spatial sampling. Finally, the theory of simple random sampling provides a sound basis for making inferences about the population parameters of interest, with potential gains in precision achievable through post-stratification.
The sampling design, estimation methods, and application to a timber supply analysis will be tested in a pilot study to be conducted in the Bulkley TSA in the summer of 1997. Field data will be collected for ~400 plots (total sample size n ≈ 600 plots). The results will provide better variance estimates for future sample-size calculations, data on the frequency of occurrence of inaccessible plots and the associated difficulties in obtaining measurements, and various other information that can be used to refine the sampling design. The pilot study will also serve as a test of the field procedures, which will undoubtedly lead to improvements. The pilot study is not designed to determine optimum plot size, although the effect of reducing plot size below 400 m² can, in theory, be evaluated by analyzing (using plot maps) the spatial distribution of site series within plots.
Figure 6. Sample variance versus estimated proportional area of site series in the Bulkley TSA. Sample variances (dots) are based on 100 m² plots (upper panel) or clusters of four 100 m² plots (lower panel). The theoretical variance for a binomial distribution, s² = p(1 − p), is plotted as a solid line. (Data Source: Gord Nigh, Research Branch).
Figure 7. Number of plots required to estimate site-series areas (%) to within ± 2% with 95% confidence. Sample sizes are based on variance estimates (Figure 6) for 100 m² plots (open circles) and clusters of four 100 m² plots (filled circles).
6. REFERENCES
B.C. Ministry of Forests. 1995. BC Land Classification Scheme. Vegetation Inventory Draft Photo Interpretation, Chapter 1 (Internet document), Resources Inventory Branch, B.C. Min. For., Victoria, B.C.
B.C. Ministry of Forests. 1996. Average site index for select tree species and ecosystems in British Columbia: first approximation. SIBEC Guidebook (Draft #1, August 1996), B.C. Min. For., Victoria, B.C.
Cochran, W.G. 1977. Sampling techniques (3rd edition). John Wiley, New York.
Das, A.C. 1950. Two dimensional systematic sampling and the associated stratified and random sampling. Sankhya, 10: 95-108.
Dunn, R. and A.R. Harrison. 1993. Two-dimensional systematic sampling. Appl. Statist., 42: 585-601.
Meidinger, D.V. and J. Pojar (editors). 1991. Ecosystems of British Columbia. B.C. Special Rep. Series No. 6, B.C. Min. For., Victoria, B.C.
Nigh, G., D. Meidinger, and A. Mirza. 1997. SIBEC sampling and data standards. B.C. Min. For., Victoria, B.C.
Last Modified: 2003 APR 17. Ministry contact: Gord Nigh. Webmaster: For.Prodres@gov.bc.ca