Short Notes on Interpreting Studies: Bias, Chance, Confounding and Causation


Bias can be design specific (i.e. inherent in the design utilised) or can occur in the production of research.

Design Specific Biases

For cohort studies: losses to follow up

For case-control studies: selection, recall

For cross-sectional studies: response rate

Types of Biases

Measurement bias (calibrate instrument, train people)

1. Exposure measurement error = difference between measured and the true exposure

2. Exposure measurement error can be due to a faulty instrument, errors or omissions in the instructions for using the instrument, poor execution of the protocol, limitations of the individual, or errors during data entry

3. Measurement error can be intra-subject, intra-observer, inter-observer, instrument or measurement


Respondent or information bias (non-response, desirability, recall)

With recall bias, there can be the danger that cases are highly motivated and may recall past exposure more completely than controls


Researcher bias

Researchers usually have a hypothesis they hope to confirm and are often not blinded


Selection bias

Occurs when the subjects studied are not representative of the target population about which conclusions are to be drawn:


1. Simpson's Paradox (confounding): the baseline is affected by self-selecting behaviour, i.e. what people select themselves for. E.g. the success rate for university applications may appear to differ between men and women, but on closer inspection women may be applying for courses that are harder to get into.

2. In RCTs, selection bias can be volunteer bias (people who select themselves for e.g. an exercise programme are likely to be at low risk) and selection bias in studies can be affected by refusals to take part

3. Incidence-Prevalence bias. E.g. are those on oral contraceptives more likely to get cervical cancer, or more likely to survive if they get it?

4. Berkson's Bias (admission rate bias). Case-control studies done in a hospital context can be misleading. Berkson's bias is logical but hard to demonstrate. Basically, if you take a funny sample, like people who have had post-mortems done on them, they are not representative. Cases are peculiar so it affects the results even if sampling is random. To deal with Berkson's Bias, return to the base population from which the cases emerged. In case-control, you need to think back to the original cohorts and select your controls from them - i.e. people at risk of something and some develop it/exposed to it and others haven't. To avoid this, THINK OF ALL CASE CONTROL STUDIES AS NESTED WITHIN A COHORT.

5. Other biases similar to Berkson's are referral bias and unmasking bias. Unmasking bias happens when something is done to a patient which unmasks a disease, e.g. there was an argument that oral contraceptives were linked to cancer of the uterus, but women who took oral contraceptives had different bleeding patterns and so were more likely to be investigated.
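The university admissions example under Simpson's Paradox can be made concrete with a small sketch. All counts below are hypothetical and chosen only to show the pattern: women do better than men within each course, yet worse overall, because most women apply to the harder course.

```python
# Hypothetical admissions data: (applicants, admitted) per course and sex.
applications = {
    "easier course": {"men": (800, 480), "women": (200, 130)},
    "harder course": {"men": (200, 40), "women": (800, 200)},
}


def rate(applied, admitted):
    """Proportion of applicants who were admitted."""
    return admitted / applied


# Success rate within each course (the stratified view).
for course, by_sex in applications.items():
    for sex, (applied, admitted) in by_sex.items():
        print(f"{course:14s} {sex:6s} {rate(applied, admitted):.0%}")

# Success rate pooled over both courses (the crude view).
overall = {}
for sex in ("men", "women"):
    applied = sum(by_sex[sex][0] for by_sex in applications.values())
    admitted = sum(by_sex[sex][1] for by_sex in applications.values())
    overall[sex] = rate(applied, admitted)
    print(f"{'overall':14s} {sex:6s} {overall[sex]:.0%}")
```

The crude comparison favours men only because course choice is associated with both sex and the chance of admission.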


Chance can be assessed with confidence intervals, p-values and hypothesis tests. Consider whether the study was hypothesis testing or hypothesis generating, and whether the hypothesis was clearly stated in advance or formed post hoc. Public health importance depends on: (1) clinical importance (2) population attributable risk (3) scope for prevention/amelioration (cause and effect?) (4) public perceptions
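As an illustration of how a confidence interval and a p-value quantify the role of chance, here is a minimal sketch for a risk ratio from a cohort study. It assumes the usual large-sample approximation on the log scale; the function name and the counts are hypothetical.

```python
import math


def risk_ratio_with_ci(exposed_cases, exposed_total,
                       unexposed_cases, unexposed_total, z=1.96):
    """Risk ratio, approximate 95% CI and two-sided p-value (log method)."""
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Standard error of log(RR) under the large-sample approximation.
    se_log = math.sqrt(1 / exposed_cases - 1 / exposed_total
                       + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    # Two-sided p-value for the null hypothesis RR = 1 (normal approximation).
    z_stat = math.log(rr) / se_log
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return rr, lower, upper, p_value


# Hypothetical cohort: 30/200 exposed and 15/200 unexposed develop the outcome.
rr, lower, upper, p_value = risk_ratio_with_ci(30, 200, 15, 200)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p_value:.3f}")
```

An interval that excludes 1 suggests chance alone is an unlikely explanation, but it says nothing about bias or confounding.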


Confounding is a distortion of an exposure-outcome association brought about by the association of another factor with both outcome and exposure. A confounder must not be an effect of the exposure. You can deal with confounding in (1) design and (2) analysis



Design: match, restrict, randomise or stratify

Analysis: stratify, standardise, multivariate analysis


Matching

1. Choosing the comparison subjects for a study so that they are similar to the cases or the exposed subjects with respect to specified confounding factors

2. Usually 1:1, with each pair analysed together (e.g. a 45-year-old man matched with another 45-year-old man)

3. Can improve efficiency

4. Can improve validity in cohort studies

5. But matched variables cannot be studied (except as effect modifiers), as their risk ratio becomes 1

6. Matched controls may be harder (and more expensive) to find



1. Increases the efficiency of the study. (Individuals in whom the disease cannot occur or is unlikely to occur are not included.)

2. Control for confounding using individual matching. In cohort studies, matching will not introduce complications. In case-control, special techniques that consider the data in matched form are required.

3. Improve comparability
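One common way to analyse 1:1 matched case-control data "in matched form" is to use only the discordant pairs. The sketch below assumes each pair is recorded as (case exposed, control exposed); the function name and the pair counts are hypothetical.

```python
from collections import Counter


def matched_pair_odds_ratio(pairs):
    """Odds ratio and McNemar chi-square from 1:1 matched pairs.

    Each pair is (case_exposed, control_exposed). Concordant pairs carry
    no information about the association, so only discordant pairs are used.
    """
    counts = Counter(pairs)
    case_only = counts[(True, False)]     # case exposed, control not
    control_only = counts[(False, True)]  # control exposed, case not
    if control_only == 0:
        raise ValueError("odds ratio undefined: no control-only pairs")
    odds_ratio = case_only / control_only
    mcnemar_chi2 = (case_only - control_only) ** 2 / (case_only + control_only)
    return odds_ratio, mcnemar_chi2


# Hypothetical study of 100 matched pairs.
pairs = ([(True, True)] * 10 + [(True, False)] * 30
         + [(False, True)] * 10 + [(False, False)] * 50)
odds_ratio, chi2 = matched_pair_odds_ratio(pairs)
print(f"matched OR = {odds_ratio:.1f}, McNemar chi-square = {chi2:.1f}")
```

Breaking the pairs and analysing the same data as an ordinary 2x2 table would generally give a different, biased estimate.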


When matching, it can be difficult to adjust for some factors (e.g. case and his twin or sibling - genes or socio-economic factors will be similar)



1. It is a fairly complicated technique

2. The factor for which matching has been performed cannot be used in the analysis to assess exposure and outcome (only use for known confounders)

3. Overmatching, i.e. matching on factors for which there is no need to match (which decreases efficiency) or matching on a factor that is on the causal pathway

4. Inefficient - finding an appropriate match may be difficult and if one member of the pair does not respond appropriately, the pair has to be excluded


Restriction: we can exclude those with the confounding variable. Restriction occurs for administrative reasons, but it can leave groups too small for analysis.


Randomisation is appropriate for assessing the impact of interventions that are feasible and ethical, but may be inappropriate for studying causes of disease.


Stratification is analysis by group, but each of the groups may be too small for useful analysis. To solve this problem, you can do one of the following:

1. Stratification and adjustment using the Mantel-Haenszel method, which provides a weighted average across strata and can be used in follow-up and case-control studies.

2. Direct standardisation

3. Indirect standardisation (standardised mortality ratios)
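The pooling methods listed above can be sketched in a few lines. The example shows the Mantel-Haenszel odds ratio and a standardised mortality ratio (indirect standardisation); function names, strata and rates are hypothetical, and each stratum is assumed to be a 2x2 table given as (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Weighted average of stratum odds ratios (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def smr(observed, person_years, reference_rates):
    """Observed deaths divided by the deaths expected at reference rates."""
    expected = sum(py * r for py, r in zip(person_years, reference_rates))
    return observed / expected


# Hypothetical data: two strata of a confounder, each with an odds ratio of 1.
strata = [(40, 60, 4, 6), (1, 9, 10, 90)]
crude = odds_ratio(*(sum(column) for column in zip(*strata)))
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")

# Hypothetical cohort: 22 deaths observed over two age bands.
example_smr = smr(22, [1000, 2000], [0.001, 0.005])
print(f"SMR = {example_smr:.2f}")
```

In this constructed example the crude odds ratio is about 4 while the adjusted one is 1, so the crude association is entirely due to the stratifying factor.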


Advantages of stratification:

1. You can get a stable estimate without heavy reliance on technology

2. The investigator can visualise the distribution of subjects by exposure, disease and potential confounder

3. The consumer of research can visualise distributions and can check calculations

4. Fewer assumptions are needed for a stratified analysis, reducing the possibility of a biased result


How stratification deals with confounding:

1. When assessing confounding, the factor biases only the overall association between exposure and outcome; the association does not vary between strata.

2. In effect measure modification, there is sufficient difference between strata that we cannot assume homogeneity between them. (If there is an interaction, is there an alternative model that can explain it? If not, how do we present the results?)

Effect modifier

Two factors interact to magnify or diminish overall effect on outcome
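The difference between confounding and effect modification shows up when crude and stratum-specific estimates are put side by side. The following sketch uses hypothetical counts, with each stratum given as (exposed cases, exposed total, unexposed cases, unexposed total).

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def summarise(strata):
    """Return the crude risk ratio and the list of stratum-specific ones."""
    pooled = [sum(column) for column in zip(*strata)]
    return risk_ratio(*pooled), [risk_ratio(*s) for s in strata]


# Hypothetical data where the third factor is a confounder only:
# the stratum-specific risk ratios agree, but the crude one differs.
confounded = [(20, 100, 40, 400), (160, 400, 20, 100)]

# Hypothetical data where the third factor is an effect modifier:
# the stratum-specific risk ratios differ from each other.
modified = [(10, 100, 10, 100), (40, 100, 10, 100)]

for label, strata in (("confounding", confounded),
                      ("effect modification", modified)):
    crude, by_stratum = summarise(strata)
    formatted = ", ".join(f"{rr:.2f}" for rr in by_stratum)
    print(f"{label}: crude RR = {crude:.2f}, stratum RRs = {formatted}")
```

With confounding alone a single adjusted estimate is a fair summary; with effect modification the stratum-specific estimates should be reported separately.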


Bradford Hill's criteria (ACCESS PTB!)

Analogy (analogies are plentiful)

Consistency (repeated demonstration in different populations)

Coherence (absence of conflict with other knowledge)

Experimental evidence


Strength (most powerful criterion for showing causality, does removal of factor decrease/prevent presumed outcome?)

Specificity

Plausibility

Temporality (effect after exposure)

Biological gradient (dose effect)


1. Several causes affecting one disease may have a greater effect if they occur together than the simple sum of their individual effects.

2. Associations that are not causal can occur by chance and if selection or measurement procedures are biased.

3. Confounding variables may erroneously appear to link causes directly with disease. Common confounding variables are time, place and poverty.

