Introduction to Significance
© Copyright 2004 Herbert J. Bernstein
When we perform an experiment, we often pose a hypothesis that the
mean of certain data values is different from the mean of certain
other data values, or different from some given value (e.g.
0 or 1/2). How do we determine whether we have collected enough data
to prove or disprove such a hypothesis?
Given random data, we are not likely to be able to prove anything.
A truly fair coin could happen to give us a run of a million
heads in a row. All we can do is to estimate probabilities
that the means are or are not different.
In order to do that we use the characteristics of two
important distributions: the normal distribution
(see
mathworld.wolfram.com/NormalDistribution.html) and the
Poisson distribution (see
mathworld.wolfram.com/PoissonDistribution.html).
- The Normal Distribution
- The bell-shaped curve
- If the total deviation from the mean is +/- one standard deviation,
then 68% of the cases will be included.
- If the total deviation from the mean is +/- two standard deviations,
then 95% of the cases will be included.
- If the total deviation from the mean is +/- three standard deviations,
then 99.7% of the cases will be included.
- If the total deviation from the mean is +/- four standard deviations,
then 99.99% of the cases will be included.
- If the total deviation from the mean is +/- six standard deviations,
then 99.9999998% of the cases will be included.
- If two independently measured means are 2 standard deviations
apart, then about 1/6th of the cases from one distribution could overlap
with the one-sigma confidence interval of the other and vice versa. If we
had used 6 standard deviations, then about 0.13% of the cases from one
distribution could overlap with the 3-sigma confidence interval of the
other and vice versa. At 12 standard deviations separation (6-sigma
confidence intervals), only about one case in a billion overlaps. These
fractions are reproduced in the sketch below.
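The coverage and overlap fractions above follow directly from the
normal cumulative distribution function. A minimal Python sketch using
only the standard library (math.erf) reproduces them; the separations
and interval widths are the ones quoted above.

    import math

    def coverage(k):
        # Fraction of a normal distribution within +/- k standard deviations.
        return math.erf(k / math.sqrt(2.0))

    def upper_tail(z):
        # One-sided fraction of a normal distribution beyond z standard deviations.
        return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

    for k in (1, 2, 3, 4, 6):
        print(f"+/- {k} sigma covers {100.0 * coverage(k):.7f}% of cases")

    # Two means separated by 2k sigma: a case from one distribution falls
    # inside the k-sigma interval of the other only if it lies more than
    # k sigma from its own mean, i.e. the one-sided tail beyond k sigma.
    for k in (1, 3, 6):
        print(f"means {2 * k} sigma apart: overlap fraction {upper_tail(k):.2e}")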
- The Poisson distribution
- A one-sided distribution
- What we get from counting the random arrival of events at some
rate λ
- The rate λ is estimated by the number of counts N
divided by the time interval over which they were collected.
- The standard deviation of N is estimated by the square
root of N
- We use these standard deviations as if they had come from the
normal distribution in estimating significance, as in the sketch below.
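A minimal Python sketch of this procedure, assuming a purely
illustrative event rate and counting time (both numbers are made up
for the example, not taken from the source):

    import math
    import random

    random.seed(17)
    true_rate = 4.0   # events per second (illustrative assumption)
    T = 100.0         # seconds of counting (illustrative assumption)

    # Draw a Poisson count by accumulating exponential inter-arrival times.
    N, t = 0, 0.0
    while True:
        t += random.expovariate(true_rate)
        if t > T:
            break
        N += 1

    rate_hat = N / T           # lambda estimated as N divided by the interval
    sigma_N = math.sqrt(N)     # standard deviation of N estimated as sqrt(N)
    sigma_rate = sigma_N / T   # the same uncertainty propagated to the rate

    print(f"N = {N}, estimated rate = {rate_hat:.3f} +/- {sigma_rate:.3f} per second")

    # Treating the estimate as normal, test a hypothesized rate of 3.5 per second:
    z = (rate_hat - 3.5) / sigma_rate
    print(f"z = {z:.2f} standard deviations from the hypothesized rate")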
- Treatments
- Define the terms of the hypothesis
- Selection of definitions of variables or transformations of the data
- Treatments chosen to decouple variables and identify dependent
and independent variables
- Treatments chosen to allow well-understood distributions to
be used.
- The Null Hypothesis: the hypothesis that there is no significant difference
- If we fail to reject the null hypothesis, we cannot conclude that
the means are the same, just that we need to collect more data to make
the decision.
- Type I error: rejecting the null hypothesis when it is true
(saying we have a difference when we don't).
- Type II error: accepting the null hypothesis when it is false
(saying we don't have a difference when we do).
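Both kinds of error can be seen in a Monte Carlo experiment. The
following minimal Python sketch uses assumed, illustrative parameters
(25 unit-variance normal samples, a two-sided z-test of the mean at the
5% level, and a true mean of 0.5 when the null hypothesis is false):

    import math
    import random

    random.seed(17)
    n, trials, z_crit = 25, 10_000, 1.96   # 1.96 sigma ~ 5% two-sided level

    def reject_null(true_mean):
        # Two-sided z-test of H0: mean = 0, with known sigma = 1.
        xs = [random.gauss(true_mean, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        z = xbar / (1.0 / math.sqrt(n))
        return abs(z) > z_crit

    # Type I: H0 is true (the mean really is 0) but we reject it.
    type_i = sum(reject_null(0.0) for _ in range(trials)) / trials
    # Type II: H0 is false (the mean is 0.5) but we fail to reject it.
    type_ii = sum(not reject_null(0.5) for _ in range(trials)) / trials

    print(f"Type I error rate  ~ {type_i:.3f} (expected ~0.05)")
    print(f"Type II error rate ~ {type_ii:.3f} (depends on n and the true mean)")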
- Tests of significance (see
http://www.stat.yale.edu/Courses/1997-98/101/sigtest.htm)
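As a concrete example of such a test, here is a minimal Python sketch
of a two-sample z-test of whether two means differ; all the summary
statistics (means, standard deviations, sample sizes) are invented
purely for illustration:

    import math

    mean1, sd1, n1 = 10.3, 2.1, 40   # illustrative sample 1
    mean2, sd2, n2 = 9.1, 2.4, 35    # illustrative sample 2

    # Standard error of the difference between the two means.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    # Two-sided p-value under the normal approximation.
    p = 1.0 - math.erf(abs(z) / math.sqrt(2.0))

    print(f"z = {z:.2f}, two-sided p ~ {p:.4f}")
    # If p is below the chosen significance level (e.g. 0.05), we reject
    # the null hypothesis that the two means are the same.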