Secondary Statistical Analysis

 

What is secondary statistical analysis?

Secondary statistical analysis is the analysis of data that have been collected by others. It may be an analysis of official statistics or of data collected by other researchers (typically large general-purpose datasets such as the General Household Survey, other social surveys or one-off surveys). It is commonly done on datasets produced by the Office for National Statistics, such as the Labour Force Survey or Social Trends. You can also use European or international datasets such as the International Social Survey Programme (ISSP). Secondary statistical analysis is a quantitative methodology.

Secondary statistical analysis is so called because it is done by researchers who will probably not have been involved in collecting the data and who will be using the data for purposes other than those for which they were originally intended. It enables you to explore research questions without having to go through the process of collecting the data yourself.

Advantages

  1. It is cost-effective. You may have to pay to access a general dataset like the General Household Survey (usually housed in universities), but the fee will be nothing compared to what it would have cost to collect the data yourself!
  2. It saves time. Data collection is time-intensive, so you get the added bonus of going straight to the analysis.
  3. High-quality data. Sampling has been rigorous, the organisations involved have well-established methods of reducing the likelihood of non-response, and many of the datasets have been generated nationally by highly experienced researchers.
  4. Opportunity for longitudinal analysis.
  5. Good for subgroup analysis.
  6. Opportunity for cross-cultural analysis.
  7. More time for data analysis.
  8. Reanalysis may offer new interpretations.
  9. No ethics committees to be dealing with!

Disadvantages

  1. Lack of familiarity with data.
  2. Complexity of data. General datasets contain huge numbers of cases and variables, and this can be daunting to manage. It does take time to 'clean up' your dataset before starting your analysis, i.e. to break it down into something more manageable.
  3. No control over data quality.
  4. Key variables that you want to use may be absent. (You have little control over what is there, but you can transform some variables into something else!)

Overall, however, secondary statistical analysis presents relatively few disadvantages. You do get to use high-quality datasets that are based on large, reasonably representative samples.
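The remark in the list above about transforming variables can be illustrated with a small sketch. It assumes a hypothetical case where the dataset records age in years but the analysis needs age bands. The function name `age_group`, the band boundaries and the sample records are all made up for illustration and do not come from any real survey.

```python
# Hypothetical example: the dataset records age in years, but the analysis
# needs an age-group variable that the original survey did not provide.

def age_group(age):
    """Recode age in years into an ordinal age band; None marks missing data."""
    if age is None:
        return None
    if age < 25:
        return "16-24"
    if age < 45:
        return "25-44"
    if age < 65:
        return "45-64"
    return "65+"

# Made-up respondents, not taken from any real survey.
respondents = [
    {"id": 1, "age": 19},
    {"id": 2, "age": 37},
    {"id": 3, "age": None},   # age not reported
    {"id": 4, "age": 70},
]

# Add the derived variable alongside the original rather than overwriting it.
for person in respondents:
    person["age_group"] = age_group(person["age"])

print([person["age_group"] for person in respondents])
```

Keeping the original variable next to the derived one means you can always check the recode, or choose different bands later.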


How do you access these datasets?

There is a data archive at the University of Essex, which holds the British Household Panel Survey, General Household Survey, Family Expenditure Survey, British Social Attitudes Survey, National Child Development Study, Expenditure and Food Survey, Labour Force Survey, National Food Survey, British Crime Survey and ONS Omnibus Survey. Various European centres also hold datasets, such as LISER in Luxembourg (previously called CEPS/INSTEAD), which has the European Household Panel Survey amongst others. Libraries are also a good source, and many universities have arranged access to national datasets.


Housekeeping Tips For Secondary Statistical Analysis

1. Always keep a copy of the original dataset you have acquired. Keep this separate from your work.

2. At the outset, make a copy of the original dataset and delete the variables you don't need. This makes your dataset far more manageable. For instance, if you're doing an analysis of 5 countries, delete the other countries. You'll have already decided on your hypotheses, so it is easy to see which variables you don't need.

3. Note that if you do sub-group analysis in SPSS using the option that deletes unselected cases instead of the one that filters them, make sure you switch it off before exiting, otherwise you may lose all those cases permanently!

4. Run simple descriptives and frequencies to get a feel for your data before you start any inferential statistics.

5. When creating new variables, do fill in the variable labels properly, including the level of measurement (interval, ordinal or nominal). You'd be amazed at the number of people who don't do this and end up producing rubbish in their output.
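The tips above refer to SPSS, but the same habits carry over to any tool. The following is a minimal sketch in Python using only the standard library. The records, variable names, chosen countries and labels are all made up for illustration and do not reproduce any real dataset.

```python
import statistics
from collections import Counter

# Made-up records standing in for a dataset obtained from an archive.
ORIGINAL = [
    {"country": "IE", "sex": "F", "age": 34, "income": 28000, "interviewer": "A7"},
    {"country": "IE", "sex": "M", "age": 51, "income": 35000, "interviewer": "A7"},
    {"country": "FR", "sex": "F", "age": 45, "income": 31000, "interviewer": "B2"},
    {"country": "LU", "sex": "M", "age": 29, "income": 42000, "interviewer": "C1"},
    {"country": "DE", "sex": "F", "age": 62, "income": 30000, "interviewer": "D4"},
]

KEEP_VARIABLES = ["country", "sex", "age", "income"]   # hypothetical choice
KEEP_COUNTRIES = {"IE", "FR", "LU"}                    # hypothetical choice

# Tips 1 and 2: leave ORIGINAL untouched and build a slimmer working copy
# made of new records that hold only the variables and countries needed.
working = [
    {name: row[name] for name in KEEP_VARIABLES}
    for row in ORIGINAL
    if row["country"] in KEEP_COUNTRIES
]

# Tip 3: select a subgroup by filtering, so the working copy keeps every case.
women = [row for row in working if row["sex"] == "F"]

# Tip 4: frequencies and simple descriptives before any inferential work.
country_counts = Counter(row["country"] for row in working)
mean_age = statistics.mean(row["age"] for row in working)

# Tip 5: record a label and a level of measurement for every variable.
VARIABLE_INFO = {
    "country": ("Country of residence", "nominal"),
    "sex": ("Sex of respondent", "nominal"),
    "age": ("Age in years", "interval"),
    "income": ("Annual income", "interval"),
}

print(len(ORIGINAL), len(working), len(women))
print(country_counts)
print(mean_age)
for name, (label, level) in VARIABLE_INFO.items():
    print(f"{name}: {label} ({level})")
```

In practice the original file would sit in a separate, read-only location, and the working copy would be saved under a different name.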


Conducting Secondary Statistical Analysis

To use this methodology, you need to be able to do the following:

  1. Understand the principles of quantitative research
  2. Understand statistical thinking
  3. Have a good knowledge of statistical models
  4. Be able to implement standard statistical procedures (multivariate analysis as well as descriptive statistics) using a computer program such as Stata, R or SPSS.


Further Reading

Agresti, A. and Finlay, B. Statistical Methods for the Social Sciences. (Prentice-Hall, 2007).

Babbie, E. The Practice of Social Research (Wadsworth, 1995).

Babbie, E. Survey Research Methods (Wadsworth, 1990).

Black, T.R. Understanding Social Research. 2nd edition. (Sage publications, 2001).

De Vaus, D. Analyzing Social Science Data: 50 Key Problems in Data Analysis. (Sage, 2002).

De Vaus, D. A. Surveys in Social Research. 3rd Edition. (UCL Press, 1991).

De Vaus, D. Research Design in Social Research (Sage, 2001).

Neuman, W.L. Social Research Methods: Qualitative and Quantitative Approaches. (Allyn & Bacon, 1997).

Norusis, M. SPSS 17.0 Guide to Data Analysis. (Prentice Hall, 2009).

Pallant, J. SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows (Version 15). (Open University Press, 2007).

Rose, D. and Sullivan, O. Introducing Data Analysis for Social Scientists, 2nd ed. (Open University Press, 1993).

I've used these methods in previous papers such as:

Heffernan C. 2005. Gender, Cohabitation and Marital Dissolution: Are changes in Irish family composition typical of European countries? IRISS working paper.

Heffernan C. 2004. Who's at Risk? The Use of Secondary Data Analysis in the Assessment of 'High Risk' Behaviours for Sexually Transmitted Infections. Graduate Journal of Social Science.