Also see our experimental design and statistical analysis programme.
Well designed and correctly analysed experiments can lead to a reduction in animal use whilst increasing the scientific validity of the results. The number of animals used should be the minimum number that is consistent with the aims of the experiment.
A well designed experiment should be:

1. Unbiased

When two or more treatment groups are compared, the animals in the groups should be kept in identical environments and be similar in every way apart from the applied treatments. Bias can be minimised by randomly allocating animals to treatments and by blinding the investigator to the treatment allocation wherever possible.
2. Adequately powered (i.e. use sufficient animals)
Powerful experiments are those that have the maximum chance of detecting a true treatment effect. Power is achieved by using an adequate sample size, controlling inter-individual variation and minimising measurement error, as discussed below.
Sample size should be determined using a formal method such as power analysis or using the resource equation method (see below). Although power is increased by increasing sample size, an unnecessarily large experiment will waste animals and scientific resources.
Variation is controlled by using animals of similar genotype, weight and age which have experienced a similar environment throughout their lives. Variation due to circadian rhythms or fluctuations in the environment can often be reduced by appropriate experimental design, for example by using randomised block or Latin square designs.
Measurement error should be minimised by careful technique, good instrumentation, and by blinding the researcher to the treatment allocation.
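The randomised block idea above can be sketched in a few lines of Python (an illustration, not from the original source): treatments are shuffled independently within each block, such as a cage, shelf or day, so that every block contains each treatment exactly once.

```python
import random

def randomised_blocks(treatments, n_blocks, seed=0):
    """Return one random ordering of the treatments per block, so that
    block-to-block variation (cage, day, position) is balanced out."""
    rng = random.Random(seed)
    design = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)  # independent random order within each block
        design.append(block)
    return design

for i, block in enumerate(randomised_blocks(["A", "B", "C", "D"], 3), 1):
    print(f"block {i}: {block}")
```

Because each block receives the complete set of treatments, differences between blocks can be separated from treatment effects in the analysis.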
Power analysis: A power analysis for comparing two groups, for example, requires the following information: the chosen significance level (usually 5%), the desired power (often 80–90%), whether a one- or two-sided test will be used, the size of the effect it is important to detect, and an estimate of the standard deviation of the outcome measure.
The StatPages.org website offers online calculations of sample size combining the above factors.
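As an illustration (not part of the original source), the standard normal-approximation formula for a two-sided, two-sample comparison of means, n = 2((z₁₋α/₂ + z₁₋β)·σ/d)² per group, can be sketched using only Python's standard library:

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_groups(effect_size, sd, alpha=0.05, power=0.9):
    """Approximate animals needed per group for a two-sided, two-sample
    comparison of means, using the normal approximation
    n = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sd / effect_size) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. 1.28 for power = 0.9
    n = 2 * ((z_alpha + z_beta) * sd / effect_size) ** 2
    return ceil(n)

# Detecting a difference of one standard deviation with 90% power:
print(sample_size_two_groups(effect_size=1.0, sd=1.0))  # 22 per group
```

Note how the required sample size grows with the square of sd/effect_size: halving the effect size to be detected quadruples the number of animals needed, which is why controlling variation matters so much.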
The resource equation: E = N − T

where N = the total number of experimental units (e.g. individual animals, or groups/cages of animals) and T = the number of treatment combinations. E (the error degrees of freedom) should be approximately between 10 and 20.
For example, an experiment comparing four treatments, using six rats per treatment, would have N = 24 (6 x 4) and T = 4, therefore E = 24 − 4 = 20. This is within the acceptable range. However, there may be good reasons for going above this upper limit. If E is 30 or 40, the experiment may be too large and may waste resources.
This equation is most appropriate for small, non-routine and more complex animal experiments likely to be analysed using the analysis of variance statistical method (ANOVA).
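The resource equation check is easily scripted; this small Python sketch (illustrative, not from the source) computes E and flags experiments that fall outside the suggested 10–20 range:

```python
def resource_equation(animals_per_treatment, treatments):
    """Return E = N - T for the resource equation, where
    N = total number of animals and T = number of treatments."""
    n_total = animals_per_treatment * treatments
    return n_total - treatments

# The example from the text: four treatments, six rats per treatment.
e = resource_equation(6, 4)
print(e)                 # 24 - 4 = 20
print(10 <= e <= 20)     # within the suggested range
```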
3. Have a wide range of applicability
It is often useful to find out whether similar results are obtained in males and females, in different strains, or as a result of different diets or environments. Similarly, the response to a drug may depend on prior treatment, the effects of other drugs, or the route of administration. These effects can be studied efficiently using factorial experimental designs.
Factorial experimental designs: These can be used to investigate the effect of a drug on both males and females without doing two separate experiments or using twice as many animals. Put simply, in each of the two experimental groups half the subjects are male and half female. An adequately powered factorial experiment will show whether or not the two sexes respond in the same way, which is not possible if they are studied in separate experiments.
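A minimal sketch (illustrative only, with made-up animal identifiers) of such a 2 x 2 allocation, crossing treatment with sex so that both main effects and their interaction can be estimated from a single experiment:

```python
import random

def factorial_allocation(animal_ids_by_sex, treatments, seed=0):
    """Assign animals to a sex x treatment factorial layout: within each
    sex, animals are shuffled and split evenly across treatments, so
    every cell of the design is equally represented."""
    rng = random.Random(seed)
    allocation = {}
    for sex, ids in animal_ids_by_sex.items():
        ids = list(ids)
        rng.shuffle(ids)
        per_cell = len(ids) // len(treatments)
        for i, treatment in enumerate(treatments):
            allocation[(sex, treatment)] = ids[i * per_cell:(i + 1) * per_cell]
    return allocation

# 8 males and 8 females across drug vs control: four cells of four animals.
animals = {"male": [f"M{i}" for i in range(8)],
           "female": [f"F{i}" for i in range(8)]}
cells = factorial_allocation(animals, ["control", "drug"])
print({cell: len(ids) for cell, ids in cells.items()})
```

The same sixteen animals thus provide information on the treatment effect, the sex effect, and whether the treatment effect differs between the sexes.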
4. Be simple and efficient
Experiments should not be so complicated that mistakes are made in their execution, or the statistical analysis becomes unduly complicated.
Small pilot studies should be used before starting a major experiment to ensure that the experiment is logistically efficient and to give some preliminary indication of likely results.
All experiments should be pre-planned, and should not be changed while they are in progress.
5. Indicate the range of uncertainty
Each experiment should be statistically analysed so that the results can be used in planning future experiments. An appropriate statistical analysis should indicate the range of uncertainty in the results, or the measure of variation, usually indicated by significance levels or confidence intervals.
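As a small illustration (with assumed example data, not from the source), a 95% confidence interval for a difference in group means can be computed with Python's standard library:

```python
from statistics import NormalDist, mean, stdev

def diff_ci_95(group_a, group_b):
    """Normal-approximation 95% confidence interval for the difference in
    means between two independent groups (a simple sketch; a t-based
    interval would be more appropriate for small samples)."""
    diff = mean(group_a) - mean(group_b)
    se = (stdev(group_a) ** 2 / len(group_a)
          + stdev(group_b) ** 2 / len(group_b)) ** 0.5
    z = NormalDist().inv_cdf(0.975)  # ~1.96
    return diff - z * se, diff + z * se

treated = [5.1, 4.8, 5.6, 5.0, 5.3]   # hypothetical outcome measures
control = [4.2, 4.5, 4.1, 4.6, 4.0]
low, high = diff_ci_95(treated, control)
print(f"difference CI: ({low:.2f}, {high:.2f})")
```

An interval that excludes zero, as here, indicates a treatment effect; its width conveys the precision of the estimate, which is more informative for planning future experiments than a significance level alone.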
These factors are discussed in more detail by Dell et al. (2002), Festing et al. (2002) and more briefly by Festing and Altman (2002) (see 'References' above).