Systematic sampling

Author:

Abstract

This paper gives an account of the results of an investigation into one-dimensional systematic sampling, i.e. the sampling of sequences of quantitative values by the use of sampling points equally spaced along the sequence. New methods, using what are termed partial systematic samples, are evolved for estimating the systematic sampling error from short sections of sequences of completely enumerated numerical material. This gets over the difficulty, which previously existed, that the only estimates of the systematic sampling error of a numerical sequence, even when completely enumerated, were those provided by the actual deviations of the systematic samples of the whole sequence. Such deviations are few in number and by no means independent. Simple end-corrections are proposed for eliminating the errors, due to trend, which are otherwise inherent in randomly located systematic samples.

It is demonstrated that it is impossible to make any fully reliable estimate of the sampling error from the systematic sampling results themselves, though if the continuous components of variation are not too marked, the sum of sets of terms taken alternately positive and negative, with suitable end adjustments, will provide a moderately satisfactory estimate, which will always be an overestimate provided there are no periodicities. This estimate is substantially better than the customary estimate based on successive differences. In other cases supplementary sampling is required to furnish an estimate of error, and methods are described whereby estimates can be derived from supplementary samples at half-spacing, or at half and quarter spacing.

The performance of systematic sampling is investigated theoretically for certain mathematical functions, and also by the numerical analysis of certain numerical sequences. The mathematical functions investigated are (1) the two-valued function, $f(x) = 0$ or $1$, corresponding to sampling for attributes, (2) the normal error function, which corresponds to sampling for density with material normally distributed about a point in a line, and (3) the one-term autoregressive function $y_{r+1} = b y_r + a_{r+1}$. In the case of the two-valued function the relative performance of systematic and random samples is shown to depend on the lengths of the intervals of the function relative to the sampling interval. If these are small, all forms of sampling are of about equal accuracy, but if they are large, systematic sampling is on the average twice as accurate as random sampling with one point per block, which is again twice as accurate as random sampling with two points per block. Similar results hold for the autoregressive function when $b \to 1$. In the case of the normal function, numerical analysis shows that systematic sampling over the whole of the curve is remarkably accurate in determining the integral of the curve. Mathematical reasons why this should be so are put forward. The sampling of part of the curve by systematic sampling is also investigated, and is used to demonstrate the value of end-corrections. The effect on the sampling errors of departures of actual density distributions from the normal form due to random variations in the material is evaluated.

Numerical analyses are made of five numerical sequences: (1) 288 altitudes at 0.1 mile intervals along a grid line of a 1 in. O.S. map, (2) yields of 96 rows of potatoes, (3) 192 daily maximum screen temperature readings, (4) 192 soil temperature readings (9 a.m.) at 4 in., (5) 192 similar readings at 12 in. These analyses confirm the findings of the theoretical part of the investigation, and show that for these types of material the gain in precision with systematic sampling over stratified random sampling of the same intensity with one point per block is of the same order as the gain in precision with stratified random sampling with one point per block over stratified random sampling of the same intensity with two points per block, though the former tends to be larger in material of the more continuous type. The actual average ratios of the variances for the five sequences range from 1.26 to 2.99 in the first case, and 1.31 to 1.90 in the second. The relation between the gain in precision and the gain in efficiency is evaluated. The latter is always smaller owing to decrease in accuracy per point for a given method of sampling with decrease in intensity. Consideration of the relation between sampling costs and the losses due to errors in the sampling results shows, however, that with a more precise method of sampling greater accuracy should be demanded in the results.

The danger of using systematic sampling in material about which nothing is known, or on material which may be subject to periodicities, is stressed, as is the importance in large-scale sampling investigations of making a preliminary investigation before instituting systematic sampling and of arranging for adequate control of error in the form of error estimates, with supplementary observations if necessary, in systematic sampling or stratified random sampling with one point per block. Control of this type should of course also be employed in stratified random sampling with two or more points per block, but in this case no special provisions are necessary, since valid estimates of error are always available from the sampling results themselves.
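The three sampling schemes compared in the abstract can be illustrated with a minimal Python sketch. It is not the paper's own computation: the function names, the synthetic autoregressive sequence (192 values, b = 0.9, standard normal innovations), the sampling interval k = 4, and the choice to draw the two points per block without replacement from blocks of length 2k are all assumptions made for illustration. For a completely enumerated sequence, each function returns the exact variance of the sample mean under one scheme.

```python
import random
import statistics


def ar1_sequence(length, b, seed=0):
    """Synthetic sequence y[r+1] = b*y[r] + a[r+1], a ~ N(0, 1) (illustrative only)."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(length - 1):
        y.append(b * y[-1] + rng.gauss(0.0, 1.0))
    return y


def var_systematic(y, k):
    """Variance of the systematic-sample mean over all k possible starting points."""
    assert len(y) % k == 0, "sequence length must be a multiple of k"
    true_mean = statistics.fmean(y)
    means = [statistics.fmean(y[start::k]) for start in range(k)]
    return sum((m - true_mean) ** 2 for m in means) / k


def var_stratified_one(y, k):
    """Variance of the mean with one random point in each block of k values."""
    assert len(y) % k == 0, "sequence length must be a multiple of k"
    n = len(y) // k
    blocks = [y[i * k:(i + 1) * k] for i in range(n)]
    # One uniform draw per block: its variance is the block's population variance.
    return sum(statistics.pvariance(block) for block in blocks) / n ** 2


def var_stratified_two(y, k):
    """Variance of the mean with two random points in each block of 2k values.

    Assumption: the two points are drawn without replacement.
    """
    size = 2 * k
    assert len(y) % size == 0, "sequence length must be a multiple of 2k"
    m = len(y) // size
    blocks = [y[i * size:(i + 1) * size] for i in range(m)]
    # Mean of 2 units drawn without replacement from `size` units:
    # variance = (S^2 / 2) * (1 - 2 / size), with S^2 using divisor size - 1.
    per_block = [statistics.variance(block) / 2 * (1 - 2 / size) for block in blocks]
    return sum(per_block) / m ** 2


if __name__ == "__main__":
    y = ar1_sequence(192, b=0.9)
    k = 4
    print("systematic:               ", var_systematic(y, k))
    print("stratified, 1 per block:  ", var_stratified_one(y, k))
    print("stratified, 2 per block:  ", var_stratified_two(y, k))
```

A single synthetic realization need not reproduce the ratios quoted in the abstract, which are averages over real sequences. For a pure linear trend such as 0, 1, ..., 7 with k = 2, the systematic variance from this sketch (0.25) exceeds the one-point-per-block variance (0.0625), which is consistent with the abstract's remark that uncorrected systematic samples carry an error due to trend.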

Publisher

The Royal Society

Subject

General Engineering
