As required for layer stacks, the imagery was resampled to a single scale using cubic convolution. The increase in pixel size brought the grain of the images and that of the field evaluations of vegetation composition to a similar scale, one that was appropriate for the degree of geopositional error in our measurements.
To develop and test the classification methods examined in this study, we used ground-truth points that represented locations with known geographic coordinates in patches in which vegetation was nearly pure weeds or forage. These points were extracted according to specific criteria from a broader multi-year database of vegetation points that included points collected both randomly and to represent particular vegetation types or features across a large watershed. The measures at each georeferenced point included cover of key vegetation groups within a 1-m radius circle, as well as cover of individual species of interest. Cover estimates were made using Daubenmire classes: <5%, 5.1–25%, 25.1–50%, 50.1–75%, 75.1–95% and 95.1–100%. To create and ground truth the image classifications for our study area, we selected all data points from our vegetation database that were in our study area and for which the vegetation composition was strongly dominated either by invasive weedy species or by annual grass forage species, as represented by Daubenmire cover classes of 5 or 6, or equivalent, with no other species present in substantial amounts. In effect, these “pure” points represent the purest patches of each vegetation group we could locate and thus provide the clearest characterization of vegetation properties. Such dense near-monospecific patches are also important targets of weed control efforts.
We identified 98 such points for the 2008 analysis, and 119 for 2009. For each year, the set of “pure” points was stratified by ranch property, and the points within each of the three properties were randomly divided in half between a single combined training set, used to support initial vegetation classification, and a corresponding test set, used for later evaluation of classification effectiveness. This stratified approach ensured that training and test ground truth sets shared similar geographic distributions.
The success of phenological-based mapping depends on the identification of phenological differences in the spectral properties of target vegetation groups. This study was motivated by field observations indicating that the invasive weedy grass species generally remain green a little longer than the forage grasses at the end of the growing season, although this window of difference can be very short. To characterize and differentiate the weedy grasses’ phenological signature, we compared NDVI values from the imagery at locations of known weed and forage patches in peak spring and at the end of the season in both study years. To test whether the signal from weed-dominated vegetation was distinct, we used repeated measures MANOVA with NDVI values in March and May as the time-repeated response variables, with between-subject factors of year, vegetation type, property, and year × vegetation type, and within-subject factors of month, month × year, month × vegetation type, month × property, and month × year × vegetation type. In the primary analysis, we compared values using the mid-May image for 2009, which was most similar to the date of image acquisition in May 2008. However, because weed senescence may occur quickly in May, we also evaluated how NDVI values differed between two dates in May 2009. Finally, in addition to considering the March and May NDVI characteristics of the vegetation types, we also considered the March–May NDVI differences.
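The stratified assignment of “pure” points to training and test sets described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes each ground-truth point is a record with a `property` field (a hypothetical name, since the source does not describe its data layout), and it sends the extra point of any odd-sized stratum to the test set, which is also an assumption.

```python
import random
from collections import defaultdict


def stratified_split(points, key="property", seed=0):
    """Split points in half within each stratum (here, ranch property).

    `points` is a list of dicts; `key` names the stratifying field.
    Both names are illustrative assumptions, not taken from the source.
    Returns (train, test) lists pooled across all strata.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    by_stratum = defaultdict(list)
    for point in points:
        by_stratum[point[key]].append(point)

    train, test = [], []
    for stratum in sorted(by_stratum):
        members = list(by_stratum[stratum])
        rng.shuffle(members)  # random assignment within the stratum
        half = len(members) // 2
        train.extend(members[:half])
        # Assumption: with an odd count, the extra point goes to the test set.
        test.extend(members[half:])
    return train, test
```

Because the split is made separately within each property, the pooled training and test sets necessarily cover the same properties in nearly equal numbers.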
After characterizing the phenological signature of weed-dominated vegetation, the next step was to determine whether a phenological-based approach could perform consistently enough over time to serve as a reliable tool for multi-year detection of weed persistence, expansion or contraction. Our first questions centered on the choice of imagery inputs. Could a single NDVI image provide enough classification power? If so, which month for image acquisition would be best: March, when vegetation is most likely to be green, or May, when weed-dominated patches are most visible to a field observer? Alternatively, would using two images improve classification accuracy enough to merit the extra costs and processing time? If so, was it more effective to stack the two images and evaluate the two-layer set simultaneously as two bands of a single image, or to create a difference image that would highlight the phenological changes in which we were most interested? ΔNDVI contains less information than stacked NDVI, but that information focuses specifically on temporal NDVI changes relevant to a phenology-based analysis. To address these questions, we compared the robustness of classifications that used four different types of imagery inputs, all used after conversion to NDVI-analogues: March NDVI alone; May NDVI alone; a two-layer stack of March and May NDVI, classified together as two bands of a single image; and ΔNDVI, a single-band difference image made, as previously described, by subtracting May NDVI from March NDVI. With these four types of inputs, we tested both unsupervised and supervised classification methods to delineate vegetation types. For simplicity, all classifications relied solely on NDVI imagery and ground truth data; none used additional information. Throughout, the same mask was used to remove water, trees, roads, and structures from the classification.
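To make the difference between the two-date inputs concrete, the sketch below builds a ΔNDVI image (March minus May, as defined above) and a two-layer stack from two co-registered NDVI rasters. It is an illustration under stated assumptions: plain nested lists stand in for raster arrays, and the function names are hypothetical rather than taken from the study's software.

```python
def _check_same_shape(a, b):
    """Raise if the two rasters do not share the same dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("NDVI rasters must have identical dimensions")


def delta_ndvi(march, may):
    """Single-band difference image: March NDVI minus May NDVI, per pixel."""
    _check_same_shape(march, may)
    return [[m - y for m, y in zip(row_m, row_y)]
            for row_m, row_y in zip(march, may)]


def stack_ndvi(march, may):
    """Two-band image: each pixel holds the pair (March NDVI, May NDVI)."""
    _check_same_shape(march, may)
    return [[(m, y) for m, y in zip(row_m, row_y)]
            for row_m, row_y in zip(march, may)]


# Illustrative values only (not data from the study).
march = [[0.75, 0.5], [0.625, 0.25]]
may = [[0.25, 0.375], [0.5, 0.125]]
difference = delta_ndvi(march, may)
stacked = stack_ndvi(march, may)
```

The stack keeps both dates for every pixel, whereas the difference image collapses them to one value, which is why ΔNDVI carries less information but isolates the seasonal change.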
To conduct a supervised classification on raster data, the operator must provide the software with the information necessary to determine the vegetation type represented by each pixel, often by providing georeferenced “training” sites that exemplify the properties of the target vegetation to be identified. We conducted parallelepiped supervised classifications on all image sets using the training set of weed- and forage-dominated “pure” ground truth points previously described. Iterative testing indicated that the most effective classifications were produced when we used a standard deviation of 2.0 for the weed classes and 1.0 for the forage classes, although a small number of pixels fell outside the standard deviation constraints and were left unclassified. We conducted additional supervised classifications in ENVI using maximum likelihood classification, which required at least two image layers per analysis; the maximum likelihood classification was thus conducted only with the two-layer stacked NDVI image inputs. Results were not strongly sensitive to threshold choice; we used 70% for consistency with the unsupervised approach.
Unsupervised classification. Unsupervised classification is easier than supervised classification for the operator to initiate, as the computer simply generates the specified number of map classes from imagery inputs using one of several algorithms. However, the operator must then determine which, if any, of the computer-generated map classes best represents the target vegetation type. Here we conducted unsupervised isodata classifications in ENVI 4.7 on all image sets. Each classification was run for 40 iterations with a pixel change threshold of 2.0% to create 8 classes, settings previously determined in iterative tests to be effective.
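The parallelepiped rule used for the supervised classifications can be illustrated with a short Python sketch. It is a generic version of the technique, not a reproduction of the software's implementation: each class gets a box of mean ± k standard deviations per band, with k = 2.0 for weeds and 1.0 for forage as in the text. The use of the sample standard deviation, the rule that a pixel matching several boxes takes the first match, and all function names are assumptions.

```python
from statistics import mean, stdev


def fit_boxes(training, multipliers):
    """Build one (low, high) interval per band for each class.

    training:    {class_name: [tuple of band values per training point]}
    multipliers: {class_name: number of standard deviations for the box}
    """
    boxes = {}
    for name, samples in training.items():
        k = multipliers[name]
        box = []
        for band in zip(*samples):  # iterate band by band
            centre, spread = mean(band), stdev(band)  # sample SD (assumption)
            box.append((centre - k * spread, centre + k * spread))
        boxes[name] = box
    return boxes


def classify_pixel(pixel, boxes):
    """Return the first class whose box contains the pixel in every band.

    Returns None for pixels outside all boxes (left unclassified).
    First-match tie-breaking is an assumption made for this sketch.
    """
    for name, box in boxes.items():
        if all(low <= value <= high for value, (low, high) in zip(pixel, box)):
            return name
    return None
```

With a single-band input such as ΔNDVI each box is just an interval; with the two-layer stack it is a rectangle in March–May NDVI space. Pixels returned as `None` correspond to the small unclassified fraction noted above.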
After the unsupervised classifications were produced, we assigned vegetation types to the computer-generated map classes by comparing the distribution of the classes to the same training set of “pure” ground truth points used to produce the supervised classifications. A map class was designated as weed-dominated if at least 70% of the “pure” ground truth points falling within its extent were weed-dominated points. All other classes were designated “Non-Weeds,” which included both forage-dominated pixels and heterogeneous weed-forage mixes. In a few cases, unsupervised classification produced a map class that did not contain any ground-truth points, in which case that class was assigned the identity of the majority class surrounding it.
Metrics for comparison of classification accuracies. To quantify classification accuracy, we compared the weed maps produced for each combination of imagery and classification approach with the “test” or “validation” set of ground-truth data points, which were distinct from the training points used to produce the classification. Following the classic methods of Congalton, we used an error matrix to calculate four metrics, based on the distribution of weed and non-weed class pixels and ground truth points. Overall accuracy and the Kappa statistic describe the general accuracy of a classification. Overall accuracy is the percentage of test points for which map classes and field data agree, across all map classes. The Kappa statistic adjusts overall accuracy to take into account agreement that might occur solely by chance. It is calculated as $\kappa = (\mathrm{Observed} - \mathrm{Expected}) / (1 - \mathrm{Expected})$, where Observed = Overall Accuracy. Expected is the agreement expected by chance, calculated from the product matrix of the error matrix's row and column totals: the sum of its diagonal divided by the cumulative sum of the product matrix. In the Kappa analysis, we used the Z-test to determine whether each classification was better than random at α = 0.05. For the Kappa statistic, values above 0.60 indicate good to excellent agreement between the classification and the ground truth data.
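The 70% rule for labeling unsupervised map classes, described at the start of this passage, can be sketched as follows. This is a minimal Python illustration that assumes the training points have already been overlaid on the classified map and reduced to (map class, vegetation type) pairs; that input format and the function name are hypothetical. Classes with no training points are returned as `None`, because assigning them the surrounding majority class requires spatial information not modeled here.

```python
from collections import Counter


def label_map_classes(point_hits, n_classes=8, threshold=0.70):
    """Label each unsupervised map class as "Weeds" or "Non-Weeds".

    point_hits: list of (map_class_id, vegetation) pairs, one per training
                point, where vegetation is "weed" or "forage".
    A class is "Weeds" if at least `threshold` of its points are weed points.
    """
    totals, weeds = Counter(), Counter()
    for class_id, vegetation in point_hits:
        totals[class_id] += 1
        if vegetation == "weed":
            weeds[class_id] += 1

    labels = {}
    for class_id in range(1, n_classes + 1):
        if totals[class_id] == 0:
            # No ground-truth points fall in this class; the text resolves
            # these from the surrounding majority class (not modeled here).
            labels[class_id] = None
        elif weeds[class_id] / totals[class_id] >= threshold:
            labels[class_id] = "Weeds"
        else:
            labels[class_id] = "Non-Weeds"
    return labels
```

For example, a map class containing 8 weed points and 2 forage points (80% weeds) would be labeled “Weeds”, while one containing 1 weed point and 4 forage points would be labeled “Non-Weeds”.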
Producer’s accuracy measures the percentage of a specific target vegetation on the ground that the map properly describes. It is calculated as the percentage of ground-truth points correctly identified as the target vegetation out of the total set of ground-truth points for that vegetation type. For example, if there are 100 ground-truth points that represent weed-dominated vegetation and only 80 of them fall within the map’s weed class, then the producer’s accuracy for the weed class in that map is 80% and the map has missed 20%. User’s accuracy describes the purity of a specific map class. For example, if the map says that a particular area is best classified as weed-dominated, how true is this in the field? What percentage of the vegetation in the field area corresponding to the map class “Weeds” is in fact weed-dominated? User’s accuracy for a target vegetation type is calculated as the percentage of all ground-truth points within the field area corresponding to a map class that correctly match the map class type. For example, if within the field area delineated by the map’s weed-dominated class there are 80 ground-truth points representing weed-dominated vegetation, but also 40 ground-truth points representing vegetation dominated by other species or by vegetation mixes, then the user’s accuracy for weed-dominated cover is 80/(80 + 40), or 66.7%. To identify which mapping approaches were most robust to changing environmental conditions, we calculated the accuracy metrics for each combination of classification method × NDVI image input type for each of the two years and the two May dates. For each combination, we then calculated the mean and coefficient of variation of the accuracy metrics that resulted from using imagery inputs from these different dates.
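The four error-matrix metrics and the consistency summary can be sketched in Python as below. The sketch follows the standard error-matrix definitions, with rows as map classes and columns as ground-truth classes. The example matrix combines the two worked examples above (80 of 100 weed points mapped as weeds; 40 non-weed points inside the weed class); the remaining cell of 60 is a hypothetical value added only to complete the matrix. Using the sample standard deviation for the coefficient of variation is an assumption.

```python
from statistics import mean, stdev


def accuracy_metrics(matrix, target=0):
    """Overall accuracy, Kappa, and producer's/user's accuracy for `target`.

    matrix[i][j] = number of test points mapped as class i whose
    ground-truth class is j (rows: map classes, columns: ground truth).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    observed = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement: diagonal of the row-by-column product matrix
    # divided by the sum of the whole product matrix (which equals n**2).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return {
        "overall": observed,
        "kappa": (observed - expected) / (1 - expected),
        "producers": matrix[target][target] / col_totals[target],
        "users": matrix[target][target] / row_totals[target],
    }


def mean_and_cv(values):
    """Mean and coefficient of variation (%) across dates; needs >= 2 values."""
    centre = mean(values)
    return centre, 100 * stdev(values) / centre


# Rows: mapped Weeds, mapped Non-Weeds; columns: true weeds, true non-weeds.
# The value 60 is hypothetical; the other counts come from the text's examples.
example = [[80, 40], [20, 60]]
metrics = accuracy_metrics(example)
```

For this matrix the producer's accuracy for weeds is 80/100 and the user's accuracy is 80/120, matching the worked examples, and a lower coefficient of variation across dates indicates a more consistent mapping approach.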
In our judgement, the best mapping approach would combine high accuracy with strong consistency, which would support its application in studies of cover change over time.
After determining which mapping approach had the greatest and most consistent accuracy, we used this approach to compare weed distributions in 2008 and 2009 across the study site. We evaluated the percentages of the landscape that were dominated by weeds in each year and how much gain or loss of weed-dominated area occurred between years. We then analyzed cover distributions by management unit. The study site included four separate management units that represented a serendipitous pre-existing gradient of grazing intensity from west to east. At the time of imagery acquisition, the westernmost management unit (M1) was used for turkey hunting and for more than five years had experienced no grazing, except by an occasional animal that broke through a neighboring fence. A second, central management unit (M2) on a separate property had likewise been set aside for most of the preceding five years, and had been grazed only briefly on a few occasions by a small number of sheep. In contrast, another unit on that same property (M3) had been moderately grazed by sheep and cattle on a regular basis, and the eastern unit (M4) had been moderately to intensely grazed for more than 8 years by sheep, cattle, and goats, largely with managed intensive rotational grazing. For M3 and M4, the estimated mean stocking rates were 0.4–1.3 animal units ha⁻¹; short-term stocking rates in sub-areas of M4 were higher during rotations. As a case study to evaluate application, we used the most effective mapping approach to compare the extent of weed-dominated cover in the largely ungrazed management units with that in the regularly grazed units.
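The year-to-year comparison of weed-dominated cover can be illustrated with a short Python sketch. It assumes two co-registered classified maps stored as nested lists in which `True` marks a weed-dominated pixel, `False` a non-weed pixel, and `None` a masked pixel (water, trees, roads, structures). This representation and the function name are assumptions made for illustration, not a description of the study's workflow.

```python
def weed_change(map_first, map_second):
    """Percent weed cover in each year, plus gain and loss between years.

    All percentages are relative to the pixels that are unmasked in both maps.
    """
    valid = weeds_first = weeds_second = gain = loss = 0
    for row_a, row_b in zip(map_first, map_second):
        for a, b in zip(row_a, row_b):
            if a is None or b is None:
                continue  # skip masked pixels
            valid += 1
            weeds_first += a
            weeds_second += b
            if b and not a:
                gain += 1  # became weed-dominated
            elif a and not b:
                loss += 1  # no longer weed-dominated
    if valid == 0:
        raise ValueError("no unmasked pixels shared by both maps")
    return {
        "pct_first": 100 * weeds_first / valid,
        "pct_second": 100 * weeds_second / valid,
        "pct_gain": 100 * gain / valid,
        "pct_loss": 100 * loss / valid,
    }


# Illustrative maps only (not data from the study).
map_2008 = [[True, True, False], [False, None, False]]
map_2009 = [[True, False, True], [True, None, False]]
change = weed_change(map_2008, map_2009)
```

Applying the same function to the pixels of each management unit separately would give the per-unit comparison between largely ungrazed and regularly grazed areas.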
While the units we studied were not established with experimental research in mind and replication was limited, each grazing category spanned similar soils and topography and included management units from two different properties. Moreover, the study site offered an opportunity for realistic application: the management units were large and part of independent working ranches managed for diverse commercial purposes.