Many factors can influence the accuracy of soil test results, ranging from field sampling technique and sample preparation to quality control in the laboratory. Many people expect that if a field is sampled more than once, the soil test results should be identical. When successive samplings do not give identical results, concern about soil test reliability is often expressed. We have analyzed soil test results from some controlled field experimental sites. These results help explain the variability that can occur naturally in the field, how various field sampling techniques influence the soil test readings obtained, and how closely laboratories duplicate readings from the same samples tested on different dates. The examples discussed here represent only a few of the many scenarios that can affect soil test results.

Vol. 28, No. 9, 1995

Variability In Soil Testing

K.L. Wells and Vern Case

**Natural Variability In Soil Test Values**

Natural variability occurs in soil, both horizontally and vertically, and differs from field to field. To illustrate this, Figure 1 presents a set of data showing vertical (0 to 30 inches depth) and horizontal variation in soil pH along a traverse of only 12 ft in a Maury soil on which bluegrass had been grown continuously for many years. At each of seven depths (1.5, 5.5, 9.5, 13.5, 17.5, 21.5, and 25.5 inches), 99 samples were taken along this 12-ft transect. Soil pH varied by more than 2 pH units among the 99 samples taken within a distance of only 12 feet. Although the range of variation narrowed somewhat with depth (reflecting less influence from surface management effects), it was still wide.

Assuming that variation in soil pH is normally distributed, the proportion of samples occurring within certain ranges can be estimated. The ranges shown in Figure 1 are those associated with deviation from the mean value for the set of 99 samples measured at each depth. For the surface samples, a large proportion (68%) occurred within the range described by the sample mean plus or minus one standard deviation unit (pH 5.8 to 6.4). Widening the range to plus or minus 1.5 standard deviation units (pH 5.6 to 6.6) included 86% of the samples. The mean plus or minus 2 standard deviation units contained 95% of the samples, and 100% were contained within the range of the mean plus or minus 3 standard deviation units.

No single absolute pH value describes the soil along a 12-ft transect of Maury soil at this site. As an estimate, we use the sample mean to describe the pH. For this example, the mean of the 99 surface soil samples was 6.1, and we would interpret lime needs based on that value. In reality, however, the surface pH of 68% of the samples fell within the range 5.8 to 6.4, and, if 100% of the samples were included, the pH range widened to 5.1 to 7.1. So, in a very detailed sampling along a 12-ft traverse, we use a mean pH value of 6.1 to represent all the values measured within a range of 5.1 to 7.1. The point to keep in mind is that the pH measured on a soil sample submitted to a lab is assumed to be the mean pH value for an entire field, and the pH at any one location within the field may deviate from that single soil test value. This raises the question of how intensively a field should be sampled to obtain a reliable estimate of its soil test values.
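The relationship between standard deviation units and the proportion of samples covered can be sketched in a few lines of Python. This is only an illustration under the normal-distribution assumption stated above. The mean of 6.1 comes from the article; the standard deviation of 0.33 is an assumption, inferred from the reported ranges rather than stated in the source, and the function names are hypothetical.

```python
import math

def normal_coverage(k):
    """Fraction of a normal population within mean +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))

def ph_range(mean, sd, k):
    """Bounds of mean +/- k standard deviations, rounded to 0.1 pH unit."""
    return (round(mean - k * sd, 1), round(mean + k * sd, 1))

MEAN_PH = 6.1  # mean of the 99 surface samples, as reported
SD_PH = 0.33   # assumption: inferred from the reported ranges, not stated

for k in (1, 1.5, 2, 3):
    low, high = ph_range(MEAN_PH, SD_PH, k)
    share = normal_coverage(k)
    print(f"+/- {k} SD: pH {low} to {high}, about {share:.1%} of samples")
```

The theoretical share within 3 standard deviation units is slightly below 100%, which is consistent with all 99 observed samples falling inside that range.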

**Effect of the Number of Samples Taken Within a Field**

An experiment conducted in a 3.4-acre field of Shelbyville silt loam soil with 2 to 6% slope included a very intensive, systematic soil sampling of the area. These data made it possible to estimate how soil test values would be affected by the different ways in which the field could be sampled. Table 1 shows these effects on estimating the soil test level for phosphorus (P). A randomly collected soil sample from the field prior to detailed sampling showed the P level to be 25 lbs/A. A rigid, systematic sampling scheme resulted in 162 separate samples from within the 3.4-acre field, with an average soil test P value of 25.2 lbs/A. Selection of samples on a 40-ft x 50-ft or a 200-ft x 100-ft grid within the field gave average P values of 27.4 and 27.1 lbs/A, respectively. A random, zig-zag method of sampling 9 locations, starting from either side of the field, resulted in identical average soil test P values of 24.9 lbs/A. Sampling the field in 3 alternate longitudinal 40-ft swaths, at 3 locations in each swath, gave an average soil test P value of 24 lbs/A. Sampling at 6 locations along a diagonal transect across the field gave an average soil test P value of 29.7 lbs/A in one diagonal direction and 26.7 lbs/A in the opposite diagonal direction.

The variation in soil test P levels estimated by the different sampling procedures was minimal and, with one exception, all would have received the same fertilizer recommendation. The one exception would have had a slightly lower recommendation. The precision of the estimated average, however, increased as the number of samples increased. Based on the standard error of the mean, the average value of 25.2 from the 162 samples varied by only plus or minus 0.7 lbs/A, meaning that the average fell within a range of 24.5 to 25.9 lbs P/A. Variation about the mean for 54 samples increased to plus or minus 1.6 lbs/A and increased further as the number of sampling sites dropped to 9 or 6. Despite the variation in precision of the average P test values, there was little effect on the rate of P2O5 fertilizer recommended based on the mean soil test P value from the various sampling procedures. On this uniformly lying ridgetop field of 3.4 acres used in a corn, wheat, and soybean rotation, sampling at 9 sites in a random zig-zag traverse through the field was sufficient to accurately estimate the soil test P level.
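The standard error of the mean used above is the sample standard deviation divided by the square root of the number of samples, which is why precision improves as more sites are sampled. A minimal sketch of that calculation follows. The P values are hypothetical and are not data from this experiment, and `mean_and_sem` is an illustrative name.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of soil test values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate variability")
    mean = statistics.mean(values)
    # Standard error = sample standard deviation / sqrt(number of samples)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem

# Hypothetical soil test P values (lbs/A) from 9 sites; not from the article.
p_values = [22, 31, 18, 27, 25, 35, 20, 24, 23]
mean, sem = mean_and_sem(p_values)
print(f"mean = {mean:.1f} lbs/A, plus or minus {sem:.1f} lbs/A")
```

Because the divisor is the square root of the sample count, halving the standard error takes roughly four times as many samples, assuming the spread among samples stays about the same.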

**Variability Related to Different People Sampling the Same Area**

Another small experimental area of 0.56 A of nearly level Shelbyville silt loam soil was divided into 24 blocks, and each block was sampled separately by taking 6 cores from within it. To test soil sampling procedures, two different people each collected a sample on the same day from the entire 0.56 A area by taking 10 random cores. Results obtained are summarized in Table 2. Soil test values for P are not shown, since most measured over 240 lbs/A, which was the upper limit reported by the lab. The means obtained from averaging the 24 samples were very precise, and results from samplers A and B were remarkably similar (compare wk. 1 for A to wk. 1 for B). Although pH values from the 10-core samples taken by A and B were almost identical, they did exceed the range shown by the intensive sampling to contain 68% of the samples, and exceeded the mean value by about 0.3 pH unit. Buffer pH averages of A and B would have fallen within the range shown to contain 68% of the samples, and were identical to the mean of the intensive sampling. Based on the pH difference, the 10-core samples taken by A and B would have underestimated lime needs by about 1 T/A. The variations shown in average soil test potassium (K) values would have had little effect on the amount of potash (K2O) recommended. Results from both the intensive sampling and from sampler B would have resulted in a recommendation of 30 lbs K2O/A, while that from sampler A would have been 40 lbs K2O/A. Compared to the average from sampling the 0.56 A area in 24 blocks (144 total cores), the results from a random sampling of 10 cores by A and B were favorable.

**Variability Due to Splitting Soil Samples and Sending Them to the Lab At Different Times**

Samples taken by samplers A and B from the Shelbyville soil described above were thoroughly mixed by hand and then divided into two samples. One of the two samples was sent to the lab in week 1 and the other a week later. These results are also contained in Table 2 (compare results from wk. 1 to results from wk. 2 for each sampler). Although pH values for samples sent in week 1 were about 0.3 pH unit higher than those reported for samples sent in week 2, buffer pH and K values were very similar. Based on the pH range containing 68% of the samples in the intensive sampling, it would appear that the pH values reported for the samples sent by A and B in week 2 were a better estimate than the values reported for samples sent in week 1. The difference in pH readings obtained on the split samples could have resulted from inadequate mixing of the samples before they were split, or from variations in pH meter calibration in the lab.

Variability can and does occur in the laboratory. In the University of Kentucky's soil testing lab, variability is estimated from repeated measurements of specially prepared soil samples used for quality control. Routinely, one sample in every group of 20 soil samples is a control sample that is randomly placed within the group. For example, from March 30 to August 4, 1995, six different quality control soil samples were analyzed 100 times. All measurements are included for accurate calculation of lab variability. However, if results for the quality control sample are more than 2 standard deviations above or below the mean of the last 100 measurements, results for the entire group of 20 samples are not accepted. The 20 samples are re-done (scooped, extracted, and analyzed again) until the quality control sample results are within 2 standard deviations of the mean for the last 100 measurements. Table 3 shows some of the results for samples RS0137 and RS0142. The range for all 100 results was similar to the range based on calculating plus and minus 2 standard deviations from the mean. These results show that the difference of 0.3 pH unit between weeks 1 and 2 in Table 2 is at or beyond the limit of the usual range for our soil lab measurements.
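The quality control rule described above can be sketched as a simple check of a control reading against the mean and standard deviation of the most recent measurements. This is an illustration only, not the lab's actual software. The function name, its parameters, and the pH readings are all hypothetical.

```python
import statistics

def control_within_limits(history, new_value, window=100, k=2.0):
    """Check a control reading against mean +/- k SD of the most recent readings."""
    recent = history[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two earlier readings")
    mean = statistics.mean(recent)
    sd = statistics.stdev(recent)
    return abs(new_value - mean) <= k * sd

# Hypothetical pH readings for one control sample; not data from the article.
history = [6.0, 6.2] * 50
for reading in (6.15, 6.5):
    ok = control_within_limits(history, reading)
    print(reading, "accept group" if ok else "re-run the group of 20 samples")
    # Every reading is kept, so the record reflects true lab variability.
    history.append(reading)
```

Keeping rejected readings in the history mirrors the statement that all measurements are included when lab variability is calculated; the acceptance decision alone determines whether a group of samples is re-done.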

**Minimizing Soil Test Variability**

Take separate samples from areas of a field that you know are physically different or have received different lime and fertilizer applications. This is particularly important where several fields have been combined into one larger unit. If row application of fertilizer has been made, do not sample from the old rows. Sample depth should be 3 to 4 inches for no-till, pasture, and hay fields, and 6 to 8 inches for fields to be plowed with a moldboard plow. Do not vary sample depth while the field is being sampled. Sample on a predetermined traverse through the field; either a grid or a random zig-zag pattern will work, but be consistent.

Collect subsamples in a clean plastic bucket and thoroughly mix them after completing the field. If the subsamples are too moist to easily crumble by hand, air dry them enough to do so before mixing the subsamples. Do not try to mix muddy cores. After thoroughly mixing, subsample the composite to get about a pint of soil for sending to the lab (soil sample bags or boxes hold about a pint).

Sample each field at about the same time of year. Samples taken in the fall will usually test lower than samples taken in the spring. So, in order to compare sample results over a period of years, be sure you take them at about the same time each year.

**Summary**

There will be variability even under the best of conditions, and if you split samples, don't expect to get exact duplication of results. Variation can occur from sampling procedures, time of year sampled, within lab, and between labs. Small variation will rarely cause differences in lime and fertilizer recommendations. The major objective in controlling variation is to minimize the effect of the factors mentioned above which can cause large variation.

K.L. Wells, Extension Soils Specialist