Spatiotemporal Analysis of Rice Yield Variability in two California Fields
Alvaro Roel and Richard E. Plant* Alvaro Roel1 , Graduate Group in Ecology, University of California, Davis, CA 95616, USA; Richard E. Plant, Departments of Agronomy and Range Science and Biological and Agricultural Engineering, University of California, Davis, CA 95616, USA
1
Present address: Instituto Nacional de Investigaciones Agropecuarias, Treinta y Tres,
Uruguay Received -- -- -- * Corresponding author:
[email protected] Acknowledgements We are very grateful to the cooperating farmers, Charley Mathews and Charley Mathews, Jr., for allowing us to carry out research on their farm. We are grateful to David Clay and to an anonymous reviewer for many helpful comments. This research was supported by the California Rice Research Board, by the Instituto Nacional de Investigaciones Agropecurarias of Uruguay, and by a William F. Golden Fellowship to A. Roel.
1
1
Spatiotemporal Analysis of Rice Yield Variability in two California Fields
2 3 4
Abstract Currently, little is known about the spatial and temporal variability of rice (Oryza
5
Sativa L.) yield patterns. This information is needed before implementing any site-
6
specific management strategy. The objective of this research was to characterize the
7
spatial and temporal yield variability of rice grown in commercial fields in California.
8
Rice cultivars M-202 and Koshihikari were grown and managed by a cooperating farmer,
9
who collected yield monitor data over a 4- year period. Alternative methods of data
10
quality analysis were applied to the data. To evaluate temporal variability yields from
11
different years must be placed on a common grid. The appropriate size of for these grids
12
was tested. Large-scale spatial structure was determined using median polish, while
13
small-scale spatial structure was evaluated by computing variograms of the yield
14
residuals after subtracting the trends. Temporal variability was determined using 2
15
approaches: 1) computing the variance among years of the original, trend and residual
16
yield values at fixed points in the field; and 2) cluster analysis of the standardized trend
17
yield values. Results from the study showed that the grid density necessary to capture the
18
spatial variability was site and year dependent. Trend surface spatial behaviors were year-
19
dependent, indicating a lack of temporal stability. Variograms showed strong spatial
20
structure of yield residuals. Cluster analysis reduced the considerable complexity in a
21
sequence of yield maps of these fields to a few general patterns of among year variations.
22
2
1
Abbreviations: DGPS: differential global positioning system; GR: grid resolution; MC:
2
Moran coefficient, RIC: relative information criterion; UTM: universal transverse
3
mercator.
4
Introduction
5
Yield monitoring and mapping technology that can measure, georeference, and
6
record grain yields makes it possible to document the location and magnitude of yield
7
variability with a spatial precision of meters. If the causes of this variability can be
8
identified then corrective action may be implemented to reduce costs, increase yield,
9
and/or reduce environmental impacts by adopting site-specific management practices
10
(Lowenberg- DeBoer and Erickson, 2000). Moreover, the availability of high precision
11
measurements may permit researchers to more efficiently test hypotheses by precisely
12
measuring crop response to environmental conditions as these conditions vary in the field
13
(Bhatti et al., 1991; Grondona and Cressie, 1991; Long, 1998). Both of these uses of
14
precision measurement technology require statistical methods that until now have more
15
commonly been employed in ecological, epidemiological, and econometric research
16
(Long, 1998; Griffith and Layne, 1999; Bongiovani and Lowenberg-DeBoer, 2001).
17
Yield map data quality is an important initial issue for farmers and even more so
18
for scientists and engineers who wish to use these data in the course of a scientific study.
19
Blackmore and Marshall (1996) identify six primary sources of error in yield map data:
20
unknown crop width at the header, time lag in the harvester, GPS error, grain surge, grain
21
losses, and sensor accuracy and calibration. Birrell et al. (1996), Blackmore and Marshall
22
(1996), Doerge (1999), Colvin and Arslan (2001), and Haneklaus et al. (2001) provide
23
discussions of the errors associated with yield maps.
3
1
It is a well-known property of spatial data that their statistical properties depend
2
on the scale at which they are represented. This is often called the “modifiable areal unit
3
problem” (Openshaw and Taylor, 1981; Wong, 1995).Yield monitor data are generally
4
recorded periodically (e.g., once per second). To compare data from different sources,
5
(e.g., yield monitor data from different years, bulk soil electrical conductivity data, and
6
aerial image data) these data must be converted to a common grid. In choosing the grid
7
size there is a tradeoff between maintaining spatial precision by selecting a fine grid and
8
reducing noise and making the data more manageable by selecting a coarser grid (Wong,
9
1995; Long, 1998). Since variability may be studied at any spatial scale, the choice of
10
grid size is dependent of the aims of the investigation. In making this choice the
11
investigator is aided by knowledge of how much information is lost in moving from one
12
scale to a larger one. One way of determining the effect of increasing grid size is to
13
examine the experimental variograms of the measured quantities (Isaaks and Srivastava,
14
1989). Long (1998) examined the change in correlation coefficients between yield and a
15
second variable as the grid size increased. Chen et al. (1995) developed the Relative
16
Information Criterion (RIC) to quantify efficiency of data representation. They defined
17
the RIC as the correlation coefficient between block-kriged estimates based on the
18
highest grid density and estimates obtained at the same points based on reduced grid
19
densities.
20
One important application of yield map data is the study of the spatiotemporal
21
properties of yield distribution. These properties should be understood before
22
implementing any site-specific management strategy (Bakhsh et al., 2000). From a
23
scientific perspective understanding spatiotemporal variation in yield may help to
4
1
determine the factors underlying yield variability. Several studies of spatiotemporal
2
patterns in terrestrial crops have been carried out. In corn (Zea mays L.) and soybean
3
(Glycine max L.) Jaynes and Colvin (1997) reported that yields displayed substantial
4
spatial and temporal variability. This variability may be due to interactions between
5
climatic growing conditions, soil properties, and the crop (Porter et al., 1998).
6
There have been few studies on the spatiotemporal variability of rice yields
7
(Doberman et al., 1995). Rice is one of the world’s most important staple crops. The
8
development of effective site-specific management techniques for rice production could,
9
if it increases production efficiency, contribute to increased yields (Doberman et al.,
10
2002). The study of spatiotemporal variability in rice production also provides
11
scientifically useful information as a comparis on with terrestrial crops. Eghball and
12
Power (1995) found that rice yields displayed less year-to-year variability than terrestrial
13
crops. Since rice is grown as a monoculture in California, this production system
14
provides a particularly good environment to analyze the evolution of yields in space and
15
time.
16
Yield variability is often defined in terms of summary statistics such as temporal
17
variance and spatial variance (Whelan and McBratney, 2000). To fully characterize the
18
spatiotemporal behavior of a field, however, one must also understand the tendency of
19
yield patterns to persist season after season, that is, the temporal stability of the spatial
20
pattern. Correlation coefficients between years are often relatively small (Jaynes and
21
Colvin, 1997). Moreover, determining the significance of a correlation coefficient is
22
complicated due to the effect of spatial autocorrelation of the data (Cliff and Ord, 1981).
23
One way to evaluate temporal variability is to compute the temporal variances of yield
5
1
values at fixed points in the field (Whelan and McBratney, 2000). By comparing the
2
among seasons (temporal) variances with the within season (spatial) variances the
3
importance of both components can be estimated. Cluster analysis of annual yields can
4
also be used to assess temporal patterns (Lark and Stafford, 1997; Lark et al., 1999;
5
Perez-Quezada et al., 2003). Cluster analysis may provide an objective quantification of
6
the spatial structure of yield patterns as well as an indication of the consistency of these
7
patterns from year to year (Perez-Quezada et al., 2003).
8
Our study was carried out to perform a spatial and temporal analysis of four years
9
of rice yield monitor data collected in two California fields. The first objective of this
10
study was to examine different methods for evaluating the quality of the data set. The
11
second objective was to determine the grid resolution that captures enough information to
12
represent yield spatial variability at a scale appropriate to management. The third
13
objective was to assess the usefulness of summary statistics in quantification of the
14
spatiotemporal variability. Finally, the fourth objective was to examine the use of
15
clustering to delineate areas in the field with different spatiotemporal yield behaviors.
16
Materials and Methods
17
The study was carried out from 1998 through 2001 in two rice fields
18
approximately 2 km apart, one of 38 ha (denoted Field 1) and one of 52 ha (denoted Field
19
2), located near Marysville, CA (UTM Zone 10, coordinates: E: 627,102, N: 4,340,769;
20
and E: 624,970, N: 4,341,076 for Field 1 and 2, respectively). The soils of the study fields
21
consist of approximately 45% Kimball loam (fine, mixed, active, thermic Mollic
22
Palexerafls), 30% San Joaquin loam (fine, mixed, active, thermic, Abruptic Durixeralfs),
23
and 25% Bruella loam (fine- loamy, mixed, Ultic Palexeralfs). Medium grain rice (Oryza
6
1
Sativa L.) cultivar M-202 and short grain cultivar Koshihikari were grown and managed
2
by the cooperator in Fields 1 and 2, respectively, using standard practices for the area
3
(Hill et al., 1992). The fields were selected based on conversations with the cooperating
4
grower with the object that one field would tend to have spatially uniform properties and
5
the second would be very heterogeneous. Fig. 1 shows gray scale renditions of false color
6
infrared aerial images taken of the two fields during the first year of the experiment.
7
These images were taken using Kodak 2443 infrared film on August 18, 1998 and are
8
shown prior to georeferencing. The dark areas in Field 2 correspond to areas of very
9
sparse vegetation. The field had recently been laser leveled and brought into production,
10
and these dark areas in the image were areas where considerable topsoil had been cut off
11
in the leveling process. The fields were managed without regard to spatial variability with
12
one exception: the consistently poor yielding portion of Field 2 was observed by the
13
grower to mature earlier than the rest of the field, and so the grower harvested this area
14
first. In 2001 the grower harvested this portion of the field five days earlier than the rest
15
of the field.
16
Rice was harvested using a stripper harvester combine equipped with a John
17
Deere Green Star® yield mapping system with real- time differential global positioning
18
system (DGPS) receiver in all years. The harvester followed a back-and- forth north to
19
south harvest pattern in Field 1 and a series of concentric patterns in Field 2. Combine
20
speed ranged from 1.1 to 1.25 m sec-1 , header width ranged from 5.5 to 6.7 m, grain
21
flows were recorded once per sec, and moisture content was recorded once per 15 sec.
22
Yield map data files (yield, grain, moisture, longitude and latitude) were collected and
23
imported into the ArcView® (ESRI, Redlands, CA) geographic information system for
7
1
analysis. The yield monitor was calibrated at the start of each harvest season by
2
comparing with a sample of known weight, and yield measurements were adjusted based
3
on this calibration. Yield monitor data points for a distance of approximately 50 m from
4
field edges were deleted from the data set to remove border effects and end-of- field yield
5
monitor errors. Aerial images were taken approximately mid July and mid August of each
6
year, using Kodak 2443 infrared film in the first year and a four band multispectral digital
7
camera in each other year, and were georeferenced to boundary files made using a
8
Trimble Ag 132 GPS.
9
Data quality analysis
10
Yield monitor data were imported into ArcView GIS as point shapefiles (each
11
point representing the grain flow and moisture measurements in an area of size
12
approximately 1m by 6m). GPS accuracy and consistent header width were verified by
13
visual inspection of the records. The John Deere Green Star yield monitor contains built-
14
in proprietary software to correct for time lag in the harvester (D. Goebel, John Deere
15
Co., personal communication), so we did not attempt to further correct for this. We could
16
not measure errors due to grain surge or losses, so this left sensor calibration errors as the
17
remaining source of inaccuracy we could detect.
18
Grain flow rates and moisture content were converted to yield at constant 14%
19
moisture. Considering the sensor record as a time series, we tested for two ways in which
20
sensor calibration could change during the course of the harvest of a single field: a
21
gradual trend and a sudden change. A convenient method of trend analysis in time series
22
data is by studying the differences between successive records (Kendall and Ord, 1990).
23
The data sets consisted of between approximately 50,000 and 90,000 values. We first
8
1
removed outliers by visual inspection of the data record. To make the data more
2
manageable and remove the statistical problems associated with very large data sets
3
(Matloff, 1991) we next selected every tenth point from the data records for time series
4
analysis.
5
The result ing data sets each consisted of a sequence of time series in which
6
discontinuities in the GPS clock time indicated points in which the harvest had been
7
stopped and then restarted. Yield data were plotted against GPS clock time. The resulting
8
plot was then inspected first for sudden changes in calibration and second for evidence of
9
a trend (Haneklaus et al., 2001). We assumed that sudden changes in calibration would
10
occur at discontinuity in the GPS clock time (indicating that the harvest had been stopped
11
and restarted). Because in Field 2 the cooperating grower harvested the low yielding
12
areas first and then the better yielding areas, an abrupt change in yield at a gap in GPS
13
time in Figure 2 does not necessarily represent a calibration artifact. To determine which
14
if any of the records showed an evident change in yield monitor calibration the yield
15
records were visually compared with false color infrared aerial images of the field, with a
16
change in calibration being indicated if the change in yield did not correspond to a
17
change in vegetation intensity in the aerial image.
18
The comparison was carried out as follows. In each year and for each field the
19
raw data values were displayed as points in the GIS. The display was examined for
20
evidence of abrupt changes in yield trend that occurred at discontinuities in the GPS time.
21
A false color aerial image of the field was then examined to determine if there was a
22
corresponding change in infrared reflectance at this location. Because of the difficulties
23
in estimating yield from aerial images (Plant et al., 2001), this process was done visually
9
1
rather than statistically. If there was no change in reflectance properties corresponding to
2
a change in yield, the change was assumed to be caused by a sudden change in calibration
3
of the yield monitor. In this case the yield data after the jump in GPS time was adjusted
4
by multiplying by a constant value to bring the yield trends before and after the change
5
into visual alignment.
6
To test for gradual drift or trend in the data the yield records were differenced and
7
the differences were tested against the null hypothesis of zero mean using the sign test.
8
The third data quality test was carried out in Field 1 to take advantage of the back and
9
forth harvest pattern. The test was carried out at 10 randomly selected locations in the
10
field. It consisted of comparing the mean of a sequence of 10 yield values with the mean
11
of the sequence of 10 values at points immediately to the left of the first 10. Confidence
12
intervals were computed for the means based on the standard deviation of the 10
13
sequential yield values. These confidence intervals are not exact due to the
14
autocorrelation of the data.
15
Spatial resolution analysis
16
Yield point shapefile data were interpolated to a fixed 5×5 m grid using inverse
17
distance weighted interpolation with power 2 and number of neighbors 12. This grid
18
provided a set of locations of the yield data that was consistent over the four years and
19
approximated the spacing of the original yield data. These interpolated grids were used as
20
the starting point for the analyses. Our primary interest was in studying variability on a
21
scale of tens of meters to eliminate very short-range effects while still maintaining the
22
ability to distinguish important patterns of spatial variability. Therefore, three different
23
regularly spaced square point grids of size 90 m, 60 m, and 30 m were used to extract the
10
1
values from the interpolated yield maps in both fields. These three grids form resolutions
2
with numbers of grid points: n=36, n=76, n=292 and n=42, n=93, n= 402 for the 90, 60
3
and 30 m grid in Field 1 and 2 respectively. Each larger grid was made up of a subset of
4
the smaller grids.
5
Large-sale variability (trend) and small-scale variability were separated by
6
detrending the data, us ing the median polish technique (Cressie, 1991; Jaynes and Colvin
7
1997; Bakhsh et al, 2000). Median polish by rows and columns may not capture the
8
entire large-scale trend, as the trend orientation is not known a priori (Cressie 1991).
9
Therefore, an additional term was included in the median polish equation to detect any
10
further trend in the polished data as described by Cressie (1991) and Jaynes and Colvin
11
(1997). No significant further trends were detected in any case in which this procedure
12
was used in our data sets.
13
For the characterization of the small-scale variation experimental variograms
14
were calculated using the residual yields from the median polish. Variography analyses
15
assume the data have a Gaussian distribution. Therefore, the distributions of the yield
16
residuals for the four years were checked using stem and leaf plots and normal
17
probability plots. Outliers, indicated by stem and leaf plots, were removed and replaced
18
by kriged estimates following the analysis of Bakhsh et al. (2000). In Field 1, 15, 22, 11,
19
and 12 values for the 30 m grid and 9, 4, 0 and 0 values in the 60 m grid were removed in
20
the 1998, 1999, 2000 and 2001 yield data sets, respectively. In Field 2, 11, 13, 9 and 22
21
values for the 30 m grid were removed in the 1998, 1999, 2000 and 2001 yield data sets,
22
respectively. In the 60 m grid 4 values in 1998 and 13 in 2001 yield data sets were
23
removed. Experimental variograms were calculated from the yield values. All sample
11
1
variogram computations and model fitting were performed using GS+ v. 5.0 software
2
(Gamma Design Software, Plainwell, MI) assuming isotropic conditions. The isotropic
3
assumptions were verified by variography analysis in different orientations. Only lags
4
less than half the total grid length were computed. A theoretical variogram model was fit
5
to each experimental variogram. Different models were tested for fitting the data (Isaaks
6
and Srivastava, 1989), and the isotropic spherical model provided a good fit in each case.
7
One method for estimating the appropriate resolution for spatial analysis was to base it on
8
the ranges of the fitted variograms.
9
A second method for determining the appropriate cell size for analysis is the
10
relative information criterion (RIC) (Chen et al., 1995). In this paper we use the square of
11
the RIC as defined by Chen et al. (1995). The RIC 2 is the square of the product moment
12
correlation coefficient of kriged residuals from the same locations in the field, viz.: RIC 2 (i ) = r 2 ( k 3 , k i ), i = 1,2
13 14
In this equation r2 (k j,k i)denotes the square of the correlatio n coefficient of the block-
15
kriged yield residual estimates using GR i and j. The RIC 2 as we use it gives the fraction
16
of the sample variation using GR j that is “explained” if it is related to GR i using a
17
simple linear regression.
18
Block kriging was used to interpolate the residuals from the different grids.
19
Following Chen et al. (1995), block kriging with a block size equal to a 4 ×4- m plot was
20
used to produce block means of yield residuals, from which distribution maps of yield
21
residuals were developed. The block size was chosen to approximate as closely as
22
possible the original cell size of the interpolated grid, which could not be fit exactly due
23
to limitations of the software. The search radius in the kriging procedure was 350 m in 12
1
both fields. In this procedure we first estimated the variogram as described above and
2
block-kriged using all locations in both fields, corresponding to a grid resolution (GR) of
3
30 m (i = 3). This procedure was repeated at GR of 60 and 90 m (i = 2 and 1).
4
Computation of summary statistics
5
All temporal analysis was carried out on the 30 m grid. The four seasons worth of
6
data do not provide the opportunity to study large versus small scale temporal variability;
7
any long term variability would only be evident with a longer temporal record (Eghball
8
and Power, 1995). We therefore focused on the stability of the spatial structure over the
9
four years. We computed the mean temporal variance and the mean spatial variance of
10
each field. These statistics are defined as follows. Let N be the total number of grid cells
11
and T total number of years. Let Y(n,t) be the yield in grid cell n in year t, and let YT (n)
12
be the mean yield of cell n over all years. The temporal yield variance σ T2 ( n ) of cell n is
13
the variance in yield over the T years,
σ T ( n) = 2
14
1 T 2 [Y ( n, t ) − YT ( n)] , ∑ T − 1 n=1
and the mean temporal variance is the average of these grid cell variances over all cells,
σT = 2
15
(1)
1 N
N
∑σ
2 T
( n) .
(2)
i =1
The spatial yield variance is the yield variance over all grid cells,
σ S (t ) = 2
1 N ∑ [Y (n, t ) − YN (t )]2 , N − 1 n=1
13
(3)
1
where Y N (t ) is the mean yield in year t over the N cells, and the mean spatial variance is
2
the average of these spatial variances over all years, given by
σS = 2
1 T
T
∑σ
2 S
(t) .
(4)
t =1
3
Finally, it is of interest to determine the extent to which temporal variability is distributed
4
over a field. The summary statistic for this is the standard deviation of the temporal
5
variance, given by
sT = {
6
1 [σ T2 ( n ) − σ T2 ]}1/ 2 . ∑ N −1
(5)
Cluster Analysis
7
We used the k- means clustering algorithm (Jain and Dubes, 1984). This algorithm
8
forms a fixed number k of cluster groups by selecting those clusters that minimize within
9
group variances and maximize the between group variance. Cluster analysis was carried
10
out on the trend data obtained by median polish. Yields were standardized by subtracting
11
the mean and dividing by the standard deviation to avoid emphasis on data from higher
12
yielding years. Cluster analysis was carried out using Statistica (StatSoft, Tulsa, OK).
13
Results and Discussion
14
Data quality analysis
15
Figures 2 and 3 show the yield monitor records presented as time series, or, more
16
precisely, as connected sequences of time series, plotted in the order in which the data
17
were entered into the record. The ordinate of each axis is the difference between the GPS
18
time in seconds of the corresponding record and the lowest GPS time of all the records in 14
1
the data set. Arrows mark the points at which there is an abrupt change in the GPS time,
2
indicating that the harvest had been stopped and resumed. Visual inspection of the figures
3
indicate no obvious anomalies in any of the data sets except that of Field 2 in 2001. This
4
data set shows evidence of poor quality, including apparently truncated values in the
5
early part of the data set. Based on this evidence we elected to remove the Field 2 2001
6
data set from most of the analysis.
7
Based on the analysis for abrupt changes in calibration we concluded that the
8
abrupt changes in Field 2 all corresponded to actual abrupt changes in yield, but that the
9
abrupt change in yield in Field 1 in 1998 at to the second GPS time gap did represent a
10
calibration change. This record was adjusted accordingly. Specifically, all data values
11
occurring after the second GPS time gap were multiplied by 1.37, which brought the data
12
set into visual alignment.
13
The second test for data quality was the check for gradual drift of the calibration.
14
Tests of the null hypothesis m = 0 where m is the median of the yield differences yt+1 – yt
15
were not significant (p > 0.05) for Field 1 in any year. They were significant (p < 0.05)
16
for Field 2 in 1999 in 2001. The 2001 data set had been removed from analysis already.
17
Comparison of 1999 data with the corresponding aerial image led us to conclude that the
18
trend was due to actual conditions in the field rather than sensor drift.
19
Figure 4 shows the means and confidence intervals for the years with the most (5
20
in 2000) and fewest (2 in 1998) overlapping confidence intervals in 10 randomly selected
21
sequences of 10 points. In general the agreement between subsequent passes of the yield
22
monitor at the same location was poor. This same high level of local spatial variability in
23
the data may be observed in the data presented by Searcy et al. (1989) for grain yield and
15
1
was reported by Searcy et al. (2000) for cotton yield and mentioned by Perez-Munoz and
2
Colvin (1996) for grain yield. It was also observed in data recorded for a variety of crops
3
by Perez-Quezada et al. (2003), and indicates that yield monitor data may be an imprecise
4
predictor of yield at specific locations in the field.
5
Grid Resolution
6
Figures 5 and 6 show the sequences of yield maps of the four years for Field 1
7
and 2, respectively, with data corrected as described in the Methods section and
8
interpolated to a 30 m grid. Yield in Field 1 has no consistent visually apparent pattern
9
over the years, while in Field 2 there are persistent low and high yielding areas. The low
10
yielding areas correspond to cut areas of the laser leveling process, which are the dark
11
areas in Figure 1(b). In both fields the 30 m grid appeared to visually correspond to
12
patterns of spatial variability observed in infrared aerial images of the field.
13
Figure 7 shows the variograms of the median polish residuals for the 30 and 60 m
14
GR of the 1998 data sets in Field 1 The variograms for Field 2 are almost identical. These
15
are consistent with the variograms for the other years, which are not shown. To facilitate
16
comparison the sill and nugget are scaled to the sample variance for each variogram. The
17
variogram for the 90 m grid, which consisted of only 3 points, displayed no structure
18
(pure nugget). The 90m grid was therefore not considered adequate for representing the
19
spatial structure of yield variability.
20
Table 1 gives the parameters of the variogram fit to the median polish residuals
21
using the spherical model for each of the years for the 30 and 60 m grid resolutions for
22
Fields 1 and 2 respectively. Table 2 gives the RIC 2 values for each field. RIC 2 values in
23
both fields vary from year to year in a manner that corresponds to the variogram ranges
16
1
displayed in Table 1. The values in the Table 2 may be interpreted as the portion of the
2
variance of the data in the 30 m grid explained by a simple linear regression of 60 m on
3
the 30 m grid. Comparison of the data in Table 1 with the data of Table 2 indicates that
4
that efficiency of representation is of the grid is directly related to the spatial continuity
5
of the variable, i.e., that the 60 m grid captures more of the information of the finer grid
6
in cases where the range of the variogram is smaller. Our results agree with those of Chen
7
et al. (1995). Since RIC 2 analysis indicated that a substantial amount of information may
8
be lost in some years by reducing the grid density all further analysis were done only with
9
the 30 m grid.
10
Summary statistics
11
Table 3 contains the spatial variances σs2 (t) of the total yield, the median polish,
12
residuals, and the covariance between median polish and residual values. The variances
13
of the residuals, which are a combination of sampling error and short-range effects, are of
14
the same magnitude as those of the median polish. There is a negative relationship
15
between large and small- scale values in Field 2, reflecting a tendency of the lower
16
yielding areas to have smaller short-range variability. Table 4 gives the mean temporal
17
variance (equation 2), standard deviation of the temporal variance (equation 5) and the
18
ratio between the mean temporal and the mean spatial variance (equation 4) for the
19
original, trend and residual yield values. Computations for Field 2 were carried out using
20
the years 1998 - 2000 only. Mean temporal variance values over the seasons are larger
21
than the mean spatial variances for the original and trend (large-scale) yield values in
22
Field 1. Field 2 presented both higher spatial and temporal variance values.
17
1
Figures 7 a and b show the spatial distribution of the temporal variance (equation
2
1) computed with the trend yield values in Field 1 and 2, respectively. To visualize
3
different variance behaviors, the magnitude of the variances were classified in three
4
different categories: values less than one, between one and two, and greater than two (Mg
5
ha-1 )2 . These may be considered as representing low, intermediate and high variability
6
behaviors, respectively. To aid in visualization the figures are drawn using Thiessen
7
polygons surrounding each grid point. Visual inspection of the figures shows that the
8
number of locations with high temporal variability was very small in Field 1 and that
9
most of the points with high temporal variability were concentrated in the southwest
10
corner of the field. Most of the rest of the field (184 of the 292 polygons or 63%) had a
11
small temporal variance (< 1 Mg ha-1 ). Figure 7b indicates that Field 2 displays very
12
different behavior from Field 1. In this field most of the polygons (296 of the 402 or
13
73%) had a large temporal variance (> 2 Mg ha -1 ).
14
Cluster analysis
15
Figure 8 shows the results of the k- means cluster analysis, with k equal 3. The
16
figure gives the values of the standardized yields in each of the clusters. In Field 1 cluster
17
1 is composed of locations of the field with yield performance above field yield average
18
in 1998, around field yield average in 1999 and below field yield average in 2000 and
19
2001; cluster 2 is composed of locations in the field that presented low yield performance
20
in 1998 and 1999, average performance in 2000 and were above field yield average in
21
2001; and cluster 3 is composed of locations in the field with yield performances similar
22
to or above the annual field average in all years. The three clusters in Field 2 consist of a
18
1
consistently high yielding area, a consistently moderate yielding area, and a consistently
2
low yielding area (Figure 8b).
3
Figures 9a and b show the spatial location of the three different cluster groups in
4
Figure 8a and 8b, respectively. These figures allow the visualization of locations in the
5
field classified in each cluster and give an estimation of the importance of the spatial
6
configuration of these clusters. The clustering procedure does not use any information
7
regarding the spatial location of the data sets. Nevertheless, Figure 9 shows that the three
8
different cluster groups tend to be spatially grouped. The Moran Coefficient (MC) can be
9
used to test the null hypothesis of zero spatial autocorrelation (Long, 1998). The
10
distribution of cluster values was significantly different from random in both fields
11
(MC=0.673, Z value = 72.0, p = 0.0029 in Field 1 and MC 0.620, Z value = 69.0, p=
12
0.0031 in Field 2).
13
Comparing Figures 5, 8a, and 9a shows that although Field 1 appears to have
14
highly disorganized yield behavior, there are yield patterns that emerge from the cluster
15
analysis. Comparing Figures 6, 8b, and 9b indicates that the organization of clusters in
16
this field closely matches the consistent pattern observed in the original data. This
17
indicated that yield patterns in Field 1 were associated with transient phenomena, while
18
yield patterns in Field 2 were associated with more permanent factors.
19
Summary and Conclusions
20
The study of spatial yield variation of field crops in commercial fields may be
21
addressed at several scales. Yield data collected with commercial yield monitor clearly
22
cannot be used to study variation at a scale smaller than the resolution of the harvesting
23
equipment. The results of our study, in agreement with those reported by others, indicate
19
1
that commercial yield monitors may not provide useful information at the scale of one or
2
two multiples of the width of the harvester. Nevertheless, our results provide an
3
indication that at a somewhat larger scale (30 m in our case) the commercial yield
4
monitor can provide useful information about patterns of relative yield. In Field 2 the
5
pattern of consistently low yield in certain areas matched observations of aerial infrared
6
photographs and was explainable in terms of the laser leveling history of the field. The
7
behavior in Field 1 was more complex and dynamic but cluster analyses indicated a
8
strong spatial pattern.
9
Our analysis was based on the interpolation of yield mo nitor data into a regular
10
grid, and then examining the properties of this grid. This is a commonly used procedure
11
(Birrell et al., 1996; Swindell, 1997; Long 1998; Perez-Quezada et al., 2002). Although
12
this is technically a geostatistical procedure, and has been analyzed by Long (1998) as
13
such, the dense pattern of the data also justify considering it from a lattice perspective. In
14
this context each data point may be considered as an estimate of the mean of the yield
15
distribution within a lattice cell whose width is the width of the harvester and whose
16
length is the distance traveled. The interpolation process then becomes one of changing
17
the scale and extent of the lattice cells to a regular grid. Openshaw and Taylor (1981) and
18
Long (1998) have investigated the effects of such changes and have shown that they
19
profoundly affect the autocovariance structure. Thus conclusions about spatial
20
autocorrelation of such resampled data apply only to aggregation at the scale in which the
21
study was carried out.
22 23
Variogram and RIC 2 analysis indicated that the grid density required to capture the spatial variability was different for different years. The low RIC 2 values in both fields
20
1
coincide with the smaller variogram ranges observed in those years. Visual inspection of
2
the yield maps of the two fields (Figures 5 and 6) supports the idea that Field 1 had a less
3
consistent spatial structure than Field 2. Trend surfaces, which followed the same patterns
4
as Figures 5 and 6, were very dissimilar among years for Field 1. For Field 2 trend
5
surfaces were quite similar from 1998 to 2000. The estimates of temporal variance over
6
the four seasons are larger than or similar to the mean spatial variances for the original
7
and trend (large-scale) yield values in both fields. This indicates that the magnitude of
8
both components of the spatiotemporal variance can be of equal importance in these
9
fields. Thus, the temporal component of the spatial variability must be taken into account
10
to make effective site-specific management decisions. Altho ugh both fields presented
11
similar temporal to spatial ratios, there were differences between them. Field 2 presented
12
both higher spatial and temporal variance values than Field 1, although it had an
13
obviously persistent spatial pattern. This indicates that simple summary statistics may not
14
capture the spatiotemporal structure of the field with sufficient descriptiveness to make
15
them useful by themselves.
16
The visually persistent spatiotemporal pattern displayed by Field 2 was captured
17
by a k- means cluster analysis. Moreover, the cluster analysis also captured a significant
18
spatial structure in Field 1, even though this field did not display a corresponding
19
persistent pattern. This indicates that cluster analysis may play a useful role in organizing
20
and simplifying the complex spatiotemporal behavior of yield trends over time. This may
21
aid in identifying the underlying causes of this behavior.
22
21
1
REFERENCES
2
Bakhsh, A., D. B. Jaynes, T. S. Colvin, and R. S. Kanwar. 2000. Spatio-temporal analysis
3 4
of yield variability for a corn-soybean field in Iowa. Trans. ASAE 43: 31-38. Bhatti, A. U., D. J. Mulla, F. E. Koehler, and A. H. Gurmani. 1991. Identifying and
5
removing spatial correlation from yield experiments. Soil Sci. Soc. of Am. J. 55:
6
1523-1528.
7
Birrell, S. J., K. A. Sudduth, and S. C. Borgelt. 1996. Comparison of sensors and
8
techniques for crop yield mapping. Computers and Electronics in Agric. 14:215-
9
233.
10
Blackmore, B. S., and C. J. Marshall. 1996. Yield mapping: errors and algorithms. p.
11
403-416 In: P. C. Robert, Rust et al. (ed.). Proc. Third Int. Conf. on Precision
12
Agriculture. ASA, Madison, WI.
13
Bongiovani, R., and J.Lowenberg-DeBoer. 2001. Nitrogen management in corn using
14
site-specific crop response estimates from a spatial regression model. unpaginated
15
CD In P. C. Robert et al. (ed.), Proc. Fifth Int. Conf. on Precision Agriculture.
16
ASA, Madison, WI.
17 18 19 20
Chen, J., J. W. Hopmans, and G.E. Fogg. 1995. Sampling design for soil moisture measurements in large field trails. Soil Sci. 159: 155-161. Cliff, A. D., and J. K. Ord. 1981. Spatial Processes: Models and Applications. Pion Ltd., London, UK.
21
Colvin, T. S., and S. Arslan. 2001. A review of yield reconstruction and sources of errors
22
in yield maps. In P. C. Robert, Rust, R. H., and Larson, W. E., (ed.), unpaginated
22
1
CD In P. C. Robert et al. (ed.), Proc. Fifth Int. Conf. on Precision Agriculture.
2
ASA, Madison, WI.
3
Cressie, N. A. C. 1991. Statistics for Spatial Data. Wiley, New York.
4
Dobermann, A.; M. F. Pampolino, and H. U. Neue 1995. Spatial and temporal variability
5 6
of transplanted rice at the field scale. Agron. J. 87:712-720 Dobermann, A., C. Witt, D. Dawe, S. Abdulrachman, H. C. Gines, R. Nagarajan, S.
7
Satawathananont, T. T. Son, P. S. Tan, G. H. Wang, N. V. Chien, V. T. K. Thoa,
8
C. V. Phung, P. Stalin, P. Muthukrishnam, V. Ravi, M. Babu, S. Chatuporn, J.
9
Sookthongsa, Q. Sun, R. Fu, G. C. Simbahan, and M. A. A. Adviento. 2002. Site-
10
specific nutrient management for intensive rice cropping systems in Asia. Field
11
Crops Res. 74: 37-66.
12
Doerge, T. A. Yield map interpretation. 1999. J. Prod. Agric. 12:54-61.
13
Eghball, B., and J. F. Power. 1995. Fractal description of temporal yield variability of 10
14 15 16 17 18 19
crops in the United States. Agron. J. 87: 152-156. Griffith, D. A., and L. J. Layne. 1999. A Casebook for Spatial Statistical Data Analysis. Oxford University Press, Oxford, UK. Grondona, M. O., and N. Cressie. 1991. Using spatial considerations in the analysis of experiments. Technometrics 33: 381-392. Haneklaus, S., H. Lilienthal, E. Schnug, K. Panten, and E. Haveresch. 2001. Routines for
20
efficient yield mapping. unpaginated CD In P. C. Robert et al. (ed.), Proc. Fifth
21
Int. Conf. on Precision Agriculture. ASA, Madison, WI.
22 23
Hill, J. E., S. R. Roberts, D. M. Brandon, S. C. Scardaci, J. F. Williams, C. M. Wick, W. M. Canevari, and B. L. Weir. 1992. Rice Production in California. University of
23
1
California Division of Agriculture and Natural Resources Publication 21498,
2
Oakland, CA.
3 4 5 6 7 8 9 10 11
Isaaks, E. H., and R. M. Srivastava. 1989. An Introduction to Applied Geostatistics. Oxford University Press, Oxford, U.K. Jain, A. K., and R. C. Dubes. 1984. Algorithms for Clustering Data. Prentice-Hall, Upper Saddle River, NJ. Jaynes, D. B., and T. S. Colvin. 1997. Spatiotemporal variability of corn and soybean yield. Agron. J. 89: 30-37. Kendall, M., and J. K. Ord. 1990. Time Series. Edward Arnold Publishing, Kent, Great Britain. Lark, R. M., H. C. Bolam, T. Mayr, R. I. Bradley, R. G. O. Burton, and P. M. R.
12
Dampney. 1999. Analysis of yield maps in support of field investigations of soil
13
variation. p. 151-161 In J. V. Stafford (ed.), Precision Agriculture '99. Sheffield
14
Academic Press, Sheffield, UK.
15
Lark, R. M., and J. V. Stafford. 1997. Classification as a first step in the interpretation of
16
temporal and spatial variability of crop yield. Annals of Applied Biol. 130: 111-
17
121.
18 19 20 21 22 23
Long, D. S. 1998. Spatial autoregression modeling of site-specific wheat yield. Geoderma 85: 181-197. Lowenberg-DeBoer, J., and K. Erickson. 2000. Precision Farming Profitability. Purdue University, West Lafayette, IN. Matloff, N. Statistical hypothesis testing: problems and alternatives. 1991. Environmental Entomology. 20:1246-1250.
24
1
Openshaw, S., and P. Taylor. 1981. The modifiable areal unit problem. p. 60-69 In N.
2
Wrigley and R. J. Bennett (ed). Quantitative Geography: A British View.
3
Routledge and Kegan, London, UK.
4 5 6
Perez-Munoz, F., and T. S. Colvin. 1996. Continuous grain yield monitoring. Trans. ASAE 39: 775-783. Perez-Quezada, J. F., G. S. Pettygrove, and R. E. Plant. 2003. Spatial- temporal analysis
7
of yield and the influence of soil factors in two four-crop-rotation fields in the
8
Sacramento Valley, California. Agron. J. 95:676-687.
9
Plant, R. E., D.S. Munk, B.A. Roberts, R.N. Vargas, D.W. Rains, R.L. Travis, and R.B.
10
Hutmacher, R. B. 2001. Application of remote sensing to strategic cotton
11
management. J. Cotton Sci. 5:30-41.
12
Porter, P. M., J. G. Lauer, D. R. Huggins, E. S. Oplinger, and R. K. Crookston. 1998.
13
Assessing spatial and temporal variability of corn and soybean yields. J. Prod.
14
Agric. 11: 359-363.
15 16 17
Searcy, S. W., J. K. Schueller, Y. H. Bae, S. C. Borgetl, and B. A. Stout. 1989. Mapping of spatially variable yield during grain combining. Trans. ASAE 32:826-829. Searcy, S. W., A. D. Beck, and J. P. Roades. 2001. Cotton yield mapping: Texas
18
experiences in 2000. p. 342-344 In Proc. 2001 Beltwide Cotton Conf. National
19
Cotton Council, Memphis, TN.
20
Swindell, J. E. G. 1997. Mapping the spatial variability in the yield potential of arable
21
land through GIS analysis of sequential yield maps. p. 827-834 In J. V. Stafford,
22
(ed.), Precision Agriculture '97: Papers Presented at the First European
25
1
Conference on Precision Agriculture, BIOS Scientific Publishes Ltd., Oxford,
2
UK.
3 4 5
Whelan, B. M., and A. B. McBratney. 2000. The "null hypothesis" of precision agriculture management. Prec. Agric. 2: 265-279. Wong, D. W. S. 1995. Aggregation effects in geo-referenced data. p. 83-106 In S. L.
6
Arlinghaus et al. (ed.), Practical Handbook of Spatial Statistics. CRC Press, Boca
7
Raton, FL.
8
26
Table 1. Range, sill, and nugget of variograms fit using the spherical model to 30 m and 60 m grid resolutions of the yield median polish residuals for both fields. Field 1
Field 2
Year
Grid Res. (m)
Nugget
Sill
Range (m)
R2
Nugget
Sill
Range (m)
R2
1998
30
0.26
0.99
53.7
0.75
0.55
1.00
100.0
0.89
1998
60
0.22
1.05
88.0
0.29
0.15
1.10
100.0
0.92
1999
30
0.50
1.10
181.0
0.85
0.45
1.09
160.0
0.99
1999
60
0.57
1.16
211.0
0.85
0.45
1.00
160.0
0.95
2000
30
0.44
1.02
90.9
0.93
0.51
1.03
200.0
0.98
2000
60
0.24
0.95
89.0
0.78
0.45
1.01
244.6
0.97
2001
30
0.20
1.17
156.7
0.93
0.59
0.95
95.0
0.80
2001
60
0.40
1.11
143.4
0.91
0.50
0.98
120.0
0.95
27
Table 2. Squared relative information criteria (RIC 2 ) between the 30 m and 60 m GR for Fields 2 and 2 for each of the 4 years. 1998 1999 2000 2001 Field 1 0.15
0.63
0.52
0.70
Field 2 0.32
0.60
0.61
0.13
Table 3. Spatial variance in total yield, yield trend obtained through median polish, and small scale yield variation obtained as median polish residuals, and covariance between median polish and residuals. Units are (Mg ha -1 )2 .
Field 1 Year
Yield
Med.
Residuals
Field 1 Covariance
Yield
Polish
Med.
Residuals
Covariance
Polish
1998
2.65
1.02
1.86
-0.12
2.41
1.00
1.79
-0.19
1999
1.47
0.64
0.99
-0.08
6.28
2.39
4.71
-0.41
2000
0.38
0.18
0.22
-0.01
3.85
1.63
2.74
-0.26
2001
3.70
1.75
1.54
0.41
Avg.
2.05
0.90
1.15
0.0
4.18
1.67
3.08
-0.39
28
Table 4. Mean temporal variance (eq. 2), standard deviation (eq. 5) and the ratio between the mean temporal and the mean spatial variance (eq. 4) for the years 1998-2001 for Field 1, and 1998-2000 for Field 2. Mean Temporal
Std. Dev. †
Ratio:
Variance (Mg ha -1 )2
(Mg ha-1 )
(Temporal/Spatial)
Grain Yield
2.46
2.04
2.46 / 2.05 = 1.20
Median Polished (Large-
1.38
0.85
1.38 / 0.90 = 1.53
1.13
1.56
1.13 / 1.15 = 0.72
Grain Yield
4.07
5.00
4.07 / 4.18 = 0.97
Median Polish
1.87
2.73
1.87 / 1.67 = 1.12
Residuals
2.18
1.78
2.18 / 3.08 = 0.70
Field 1 (1998-2001)
scale) Residuals (Small-scale) Field 2 (1998-2000)
†
Standard deviation
29
List of Figures
Figure 1. Gray scale renditions of false color infrared aerial images taken on August 8, 1998 of the two study fields. (a) Field 1. Note that in the image the field becomes darker as one moves from west to east. This corresponds to higher near infrared values in the original image. (b) Field 2. The dark areas in this image correspond to areas of very sparse vegetation in the field. The field had recently been laser leveled and brought into production, and these dark areas in the image were areas where considerable topsoil had been cut off in the leveling process.
Figure 2. Yield monitor data (every tenth data point) of Field 1 plotted in the order in which records were entered into the database. Abscissa is the difference between the GPS clock time for that record and the lowest GPS time of the data set. Ordinate is yield (kg ha-1 ) converted to common moisture content. Arrows placed at gaps in the GPS time record, indicating that the harvest had been stopped and restarted.
Figure 3. Yield monitor data (every tenth data point) of Field 2 plotted in the order in which records were entered into the database. Abscissa is the difference between the GPS clock time for that record and the lowest GPS time of the data set. Ordinate is yield (kg ha-1 ) converted to common moisture content. Arrows placed at gaps in the GPS time record, indicating that the harvest had been stopped and restarted.
30
Figure 4. Relationship between the means of sequences of 10 yield monitor data points recorded next to each other in Field 1. Error bars show 95% confidence intervals based on standard deviation of the 10 data values.
Figure 5. Sequence of maps of Field 1 data as analyzed for each of the years 1998 through 2001. Yield data in kg ha -1 . Outliers have been deleted and data have been resampled to a common grid representing 30 meter square cells (Grid 1).
Figure 6. Sequence of maps of Field 2 data as analyzed for each of the years 1998 through 2001. Yield data in kg ha -1 . Outliers have been deleted and data have been resampled to a common grid representing 30 meter square cells (Grid 1).
Figure 7. Experimental and fitted variograms of yield data in Field 1. Values of the fitted variogram sill, co+c1 , range, Ao and model fitness, r2 , are given in the figures. a) 30m, b) 60m sampling grid. Variograms for Field 2 are very similar.
Figure 8. Spatial distribution of the temporal variance (Mg ha -1 )2 .(a) Field 1, (b) Field 2. Each figure consists of Thiessen polygons surrounding each sample point.
Figure 9. (a) Cluster behaviors defined by standardized yields in four years in Field 1. (b) Cluster behaviors defined by standardized yields over the three years 1998 – 2000 in Field 2.
31
Figure 10 (a) Thiessen polygons interpolated clusters in Field 1 (1998 – 2001). (b) Thiessen polygons interpolated clusters in Field 2 (1998 – 2000).
32
(a)
(b)
Figure 1
33
Field 1, 1998
Yield (kg/ha)
25000 20000 15000 10000 5000 0 43
8580
17159
86349
94558
102877 175802 184507 193145 202090 210769 Time (sec)
Field 1, 1999
Yield (kg/ha)
25000 20000 15000 10000 5000 0 93
6773
13255
19742
26486
91743
98062
109000
252708
258991
265293
Time (sec)
Field 1, 2000
Yield (kg/ha)
25000 20000 15000 10000 5000 0 18
6454
12852
19234
81332 Time (sec)
Field 1, 2001
Yield (kg/ha)
25000 20000 15000 10000 5000 0 22
6348
12709
18992
1516
Time (sec)
Figure 2 34
7850
14135
20481
Field 2, 1998
Yield (kg/ha)
25000 20000 15000 10000 5000 0 79
13874
24220
14589
23537
97489
107343
Time (sec)
Field 2, 1999
Yield (kg/ha)
30000 25000 20000 15000 10000 5000 0 157
7841
2592
9335
169232
168382
510386
517052
524168
Time (sec)
Field 2, 2000
Yield (kg/ha)
25000 20000 15000 10000 5000 0 184
7212
5537
13699
21174
6500
12796
19971
5402
Time (sec)
Field 2, 2001
Yield (kg/ha)
25000 20000 15000 10000 5000 0 9776
23014
12514
12890
19840
14793
Time (sec)
Figure 3 35
24104
30914
35104
Field 1, 1998
Field 1, 2000 18000
15000 #1
10000
#2
5000
Mean Yield (kg/ha)
Mean Yield (kg/ha)
20000
16000 14000
#1 #2
12000 10000 8000
0 1
4
7
1
10
Sample Number
4
7
Sample Number
Figure 4
36
10
Figure 5.
37
Figure 6.
38
a)
b)
Figure 7.
39
a)
b)
Figure 8.
40
a)
Standardized Yield
1.5 1 0.5
Cluster 1
0
Cluster 2 Cluster 3
-0.5 -1 -1.5 1998
1999
2000
2001
Year
b)
Standardized Yield
1.5 1 0.5
Cluster 1
0
Cluster 2 Cluster 3
-0.5 -1 -1.5 1998
1999
2000
Year
Figure 9.
41
(a)
(b) Figure 10
42