SOCI208 Module 8 - Statistical Sampling

An important new distinction is that data sets can be treated as either populations or samples. Example: the 1998 General Social Survey is a sample of the adult population in the U.S., but for purposes of sampling experiments the 1998 GSS data set can be treated as the population, from which one draws a random sample of size n = 100, say.  (So you might truthfully say that the population/sample distinction is socially constructed!)

1.  Populations

A population (or universe) is the total set of elements of interest for a given study.

1.  Finite Populations

Example:

2.  Infinite Populations

An infinite population usually has elements that consist of all the outcomes of a process if the process were to operate indefinitely under the same conditions.  The infinite population is represented by an RV with its associated probability distribution.
Example:

2.  Censuses and Samples

1.  Census

A census is a study of a finite population that includes every element of the population.
Example:
NOTE: A census is not possible with an infinite population.

2.  Sample

A sample is a part of the population selected so that inferences can be drawn from it about the population.
The process of designing and executing a study based on a sample is called a sample survey.
NOTE: "survey" does not necessarily imply the use of a questionnaire.

Example

3.  Reasons for Sampling

Reasons for using sampling rather than a census with a finite population are

4.  Sampling and Nonsampling Errors

5.  Probability and Judgement Samples

Example:

3.  Simple Random Sampling From a Finite Population

1.  Definition

A (simple) random sample from a finite population is a sample selected so that each possible sample combination of the specified size has equal probability of being chosen.

NOTE:

  • the number of different possible samples of size n from a population of N elements is given by the formula N!/(n!(N - n)!)  (it may be a BIG number!)
  • the definition of a simple random sample implies that each element of the population has an equal probability of being selected; but equal probability of selection of elements is not a sufficient condition for a simple random sample - Q - Why?  (See NWW p. 241 bottom.)
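
As a hedged illustration of the notes above, the following Python sketch counts the possible samples with the standard-library function `math.comb` and enumerates them for a toy population. The population labels A through E and the sizes N = 5, n = 2 are made up for illustration. The sketch also tallies how often each element appears across all equally likely samples, which gives each element's probability of selection (n/N).

```python
from itertools import combinations
from math import comb


def count_samples(N, n):
    """Number of distinct samples of size n from N elements: N!/(n!(N - n)!)."""
    return comb(N, n)


# Hypothetical toy population (labels are illustrative only).
population = ["A", "B", "C", "D", "E"]
n = 2

# Every possible sample combination of size n.
all_samples = list(combinations(population, n))

# Under simple random sampling each combination has probability
# 1/len(all_samples), so an element's selection probability is the
# share of combinations that contain it.
inclusion_prob = {
    e: sum(e in s for s in all_samples) / len(all_samples) for e in population
}

if __name__ == "__main__":
    print(count_samples(5, 2))      # 10
    print(count_samples(100, 10))   # a BIG number
    print(inclusion_prob)
```

Each element appears in 4 of the 10 possible samples, so its selection probability is 4/10 = n/N.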
2.  Selection of Simple Random Sample

    Selecting a simple random sample requires a frame. The general procedure for selecting a simple random sample is to select elements sequentially without replacement:
    1.  Sample Selection Using a Table of Random Numbers
    See the following exhibit (county lawyers)
    Exhibit: (NWW Table 9.1 p. 244) [m8001.gif]
    2.  Sample Selection Using Computer-Generated Numbers
    <show example of programs in SYSTAT and STATA>
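
The SYSTAT and STATA programs are not reproduced in these notes. As a stand-in, here is a minimal Python sketch of the same idea: given a frame, select n elements without replacement using computer-generated random numbers. The frame of ID numbers 1 to 500, the sample size, and the seed are hypothetical values chosen for illustration.

```python
import random


def select_simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame, without replacement.

    random.Random.sample gives every combination of n elements the
    same chance of being chosen, which matches the definition of a
    simple random sample from a finite population.
    """
    if n > len(frame):
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)  # seeded so the selection can be reproduced
    return rng.sample(frame, n)


# Hypothetical frame: ID numbers 1..500 (illustrative only).
frame = list(range(1, 501))
sample = select_simple_random_sample(frame, n=10, seed=208)

if __name__ == "__main__":
    print(sorted(sample))
```

Fixing the seed is a design choice for reproducibility; omit it to get a fresh selection on each run.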

    4.  Simple Random Sampling From an Infinite Population

    1.  Definition

    The n random variables X1, X2, ..., Xn generated by a process constitute a simple random sample from an infinite population if
    1. they are independent and
    2. they come from the same probability distribution (i.e., they are "identically distributed"); the common probability distribution for X1, X2, ..., Xn is the infinite population.
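
A small Python sketch may make the definition concrete. It assumes, purely for illustration, that the infinite population is a normal distribution with mean 0 and standard deviation 1. Each draw is generated independently from that same distribution, so the n values satisfy both conditions above.

```python
import random


def draw_iid_sample(n, mean=0.0, sd=1.0, seed=None):
    """Generate n independent draws from one normal distribution.

    The normal distribution is an assumption made for this example;
    any single fixed distribution would serve as the infinite population.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


if __name__ == "__main__":
    x = draw_iid_sample(5, seed=1)
    print(x)
```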

    2.  Diagnostic Procedures for Checking Randomness of Data

    See the following exhibits:
    Exhibit: (NWW Figure 9.1 p. 246) [m8002.gif]
    Exhibit: (NWW Figure 9.2 p. 247) [m8003.gif]
    Exhibit: (NWW Figure 9.3 p. 248) [m8004.gif]
    NOTE: in the social sciences an important example of a population treated as infinite is the error term in a statistical model, as in a regression model of Y as a function of variables Xk and an error term. Diagnostic procedures for checking randomness may then be applied to an estimate of the error term, called the residual.
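
The exhibits are not reproduced here, and the sketch below is not necessarily the procedure they show. As an assumed illustration, it computes two common numerical checks on a sequence of observations (or residuals): the lag-1 autocorrelation, and the number of runs above and below the median. A random sequence should have autocorrelation near 0 and many runs, while a trending sequence has autocorrelation near 1 and very few runs.

```python
import random
from statistics import mean, median


def lag1_autocorrelation(x):
    """Correlation between each observation and the one that follows it."""
    m = mean(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def count_runs(x):
    """Number of runs above/below the median (values equal to it are dropped)."""
    med = median(x)
    signs = [v > med for v in x if v != med]
    if not signs:
        return 0
    return 1 + sum(signs[i] != signs[i - 1] for i in range(1, len(signs)))


# Hypothetical series: one random, one with a steady upward trend.
rng = random.Random(208)
random_series = [rng.gauss(0, 1) for _ in range(200)]
trend_series = list(range(50))

if __name__ == "__main__":
    print(lag1_autocorrelation(random_series), count_runs(random_series))
    print(lag1_autocorrelation(trend_series), count_runs(trend_series))
```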

    5.  Sample Statistics and Population Parameters

    1.  Sample Statistics

    Sample statistics consist of summary measures calculated from a sample of n observations X1, X2, ..., Xn from a population, such as the sample mean, sample variance, and sample standard deviation.
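
As a hedged sketch, the following Python code computes a few familiar summary measures from a small made-up sample using the standard-library `statistics` module. The data values are hypothetical, and the choice of statistics is an assumption, since these notes do not list them. Note that `statistics.variance` and `statistics.stdev` use the divisor n - 1, which is the usual convention for a sample.

```python
from statistics import mean, median, stdev, variance


def sample_statistics(x):
    """Return common summary measures for a sample of observations."""
    return {
        "n": len(x),
        "mean": mean(x),
        "median": median(x),
        "variance": variance(x),  # divisor n - 1
        "sd": stdev(x),           # square root of the variance above
    }


# Hypothetical sample (illustrative values only).
sample_values = [2, 4, 4, 5, 7, 8]

if __name__ == "__main__":
    print(sample_statistics(sample_values))
```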

    2.  Population Parameters

    Definitions of population parameters differ depending on whether the population is finite or infinite.
     
    Table 1.  Population Parameters for Finite and Infinite Populations

    Parameter                       Finite population                                   Infinite population
                                    (observations X1, X2, ..., XN)                      (represented by RV X)
    Population Mean                 $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$               $\mu = E\{X\}$
    Population Variance             $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i-\mu)^2$   $\sigma^2 = \sigma^2\{X\}$
    Population Standard Deviation   $\sigma = (\sigma^2)^{1/2}$                         $\sigma = (\sigma^2)^{1/2}$

    NOTE: See also box in NWW p. 251 for demonstration that "the population mean $\mu$ and variance $\sigma^2$ for a finite population correspond, respectively, to the expected value and variance of the RV associated with the equal-probability selection of one population element."
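
The correspondence in the note can be checked numerically. The Python sketch below uses a made-up finite population of N = 6 values (an assumption for illustration). It computes the population mean and variance with divisor N, then computes the expected value and variance of the RV defined by selecting one element with probability 1/N each. Exact fractions are used so the two results can be compared without rounding error.

```python
from fractions import Fraction
from math import sqrt


def population_parameters(values):
    """Mean, variance (divisor N), and standard deviation of a finite population."""
    N = len(values)
    mu = Fraction(sum(values), N)
    sigma2 = sum((x - mu) ** 2 for x in values) / N
    return mu, sigma2, sqrt(sigma2)


def one_draw_moments(values):
    """E{X} and variance of X when one element is drawn with probability 1/N."""
    p = Fraction(1, len(values))
    ex = sum(x * p for x in values)
    var = sum((x - ex) ** 2 * p for x in values)
    return ex, var


# Hypothetical finite population (integer values, illustrative only).
population_values = [2, 4, 4, 5, 7, 8]

if __name__ == "__main__":
    mu, sigma2, sigma = population_parameters(population_values)
    print(mu, sigma2, sigma)
    print(one_draw_moments(population_values))
```

For this population both approaches give a mean of 5 and a variance of 4, so the standard deviation is 2.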
     

    3.  Definition of Statistical Inference

    Statistical inference is the use of probability theory to make inferences about population parameters using information obtained from a sample.



    Last modified 28 Sep 2002