Now that you have collected quantitative data on different variables (the things you are measuring), you need to ask yourself: is there a link between these variables? Are they related? The link between variables is called ‘correlation’.
Definition
Correlation is a “statistical measure that expresses the extent to which two variables are linearly related (meaning they change together at a constant rate)” (JMP Statistical Discovery 2023). Note: Correlation does not show whether one variable causes the other.
Let’s use an example to illustrate what we mean by correlation. Imagine you have two sets of data. One set shows how much time you spent on your phone each day for a week. The other set shows how anxious you felt each day. You want to know if there is a link between the two. When you spend more time on your phone, does your anxiety increase or decrease?
To find out what the correlation is between variables, you carry out ‘correlation analysis’.
Definition
Correlation analysis is a “statistical method that is used to discover if there is a relationship between two variables/datasets, and how strong that relationship may be” (James 2021). Note: In this method, you use mathematical tools to investigate what patterns exist between your variables.
Correlation analysis allows you to establish what is called a ‘correlation coefficient’. This is a number between –1 and 1 and is most commonly referred to as ‘r’. The value of ‘r’ tells you the nature of the link between your variables. Specifically, ‘r’ gives us two things:
- the direction of the link
- the strength of the link
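As a rough illustration of where ‘r’ comes from, the sketch below calculates Pearson’s correlation coefficient by hand in Python. The function name `pearson_r` and the phone and anxiety figures are made up for this example; they are not taken from any of the sources cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How far each observation sits from the mean of its own variable
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: do the two variables move away from their means together?
    co_movement = sum(a * b for a, b in zip(dx, dy))
    # Denominator: rescales the result so it always lands between -1 and 1
    spread = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("r is undefined when a variable never changes")
    return co_movement / spread


# Hypothetical data for one week (invented for illustration only)
hours_on_phone = [2, 3, 1, 4, 5, 3, 6]
anxiety_level = [4, 5, 3, 6, 7, 4, 8]

r = pearson_r(hours_on_phone, anxiety_level)
print(f"r = {r:.2f}")
```

With these invented numbers, r comes out close to 1, because the days with more phone time are also the days with higher anxiety scores.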
Let’s discuss each in more detail.
1. Direction of the link
Direction shows how your variables are connected. If one variable increases, does the other increase or decrease? Or is there no effect? In other words, is the link between your variables positive (upwards), negative (downwards) or non-existent?
If the correlation coefficient (r) is above 0, it is a positive correlation, and the closer it is to 1, the clearer the upward pattern. This means when one variable goes up, the other tends to go up too.
- Example: If you spend more time on your phone, you might feel more anxious. If you spend less time on your phone, you might feel less anxious. On a scatter plot, the points rise from left to right.
If the correlation coefficient (r) is below 0, it is a negative correlation, and the closer it is to -1, the clearer the downward pattern. This means when one variable goes up, the other tends to go down.
- Example: If you spend more time on your phone, you might feel less anxious. If you spend less time on your phone, you might feel more anxious. On a scatter plot, the points fall from left to right.
If the correlation coefficient (r) is 0, there is no linear connection between the two variables. They do not seem to change together. This means when one variable goes up, the other shows no consistent tendency to go up or down.
- Example: If you spend more time on your phone, your feelings of anxiety do not change in any consistent way. On a scatter plot, the points show no upward or downward trend.
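The three directions can also be checked numerically. The following is a minimal sketch, assuming Pearson’s r is the coefficient being used; the helper names `pearson_r` and `direction` and all of the numbers are hypothetical and chosen only to make each case visible.

```python
from math import isclose, sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom


def direction(r):
    """Describe the direction of the link from the sign of r."""
    # Treat values within a tiny rounding error of 0 as "no linear link"
    if isclose(r, 0.0, abs_tol=1e-9):
        return "none"
    return "positive" if r > 0 else "negative"


hours = [1, 2, 3, 4, 5]  # hypothetical hours spent on phone

# Three invented sets of anxiety scores, one per direction
rising = [2, 3, 5, 6, 8]    # anxiety climbs as hours climb
falling = [8, 6, 5, 3, 2]   # anxiety drops as hours climb
no_trend = [5, 7, 1, 7, 5]  # anxiety moves, but not with hours

for label, anxiety in [("rising", rising), ("falling", falling), ("no trend", no_trend)]:
    r = pearson_r(hours, anxiety)
    print(f"{label}: r = {r:.2f}, direction = {direction(r)}")
```

Real data will almost never give an r of exactly 0, so in practice a very small r is usually read as “no clear link” rather than tested against 0 like this.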
2. Strength of the link
Strength shows to what extent your variables are connected. If one variable increases, how responsive is the other variable to that increase? Another way to describe this is, “How good would a straight line be at describing your data?” (Benedict 2014).
The closer the correlation coefficient (r) is to -1 or 1, the stronger the connection between the two variables. If it is exactly -1 or 1, it is a strong, ‘perfect’ connection. This means “a change in one variable is accompanied by a perfectly consistent change in the other” (Frost 2023).
- Example: The correlation coefficient (r) between ‘hours spent on phone’ and ‘level of anxiety’ is 1. This means that as the number of ‘hours spent on phone’ increases, ‘level of anxiety’ will always increase by the same amount (at a constant rate!) for each extra ‘hour spent on phone’.
So, if ‘hours spent on phone’ increases from 1 to 2, the ‘level of anxiety’ could increase from 5 to 7. This would mean that if ‘hours spent on phone’ increased from 2 to 3, the ‘level of anxiety’ must increase from 7 to 9.
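The numbers in this example can be checked directly. The sketch below uses the three points from the example above (1, 2 and 3 hours with anxiety levels of 5, 7 and 9) and, for contrast, a hypothetical variation where the last anxiety value is 8 instead of 9. The function name `pearson_r` is an assumption made for this illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom


hours = [1, 2, 3]

# From the example: anxiety rises by exactly 2 for every extra hour
constant_rate = [5, 7, 9]
r_perfect = pearson_r(hours, constant_rate)

# Hypothetical variation: the last step rises by 1 instead of 2
uneven_rate = [5, 7, 8]
r_uneven = pearson_r(hours, uneven_rate)

print(f"constant rate: r = {r_perfect:.3f}")
print(f"uneven rate:   r = {r_uneven:.3f}")
```

The first result is 1 because the three points lie exactly on a straight line. In the second case the points no longer line up perfectly, so r falls just below 1.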
If the correlation coefficient (r) is closer to 0, like 0.3 or -0.3, it is a weak connection. A change in one variable means a less consistent change in the other.
- Example: The correlation coefficient (r) between ‘hours spent on phone’ and ‘level of anxiety’ is -0.3. This means that as the number of ‘hours spent on phone’ increases, the ‘level of anxiety’ will tend to decrease (but not always!) for each extra ‘hour spent on phone’.
So, if ‘hours spent on phone’ increases from 1 to 2, the ‘level of anxiety’ could decrease from 7 to 5. Then, if ‘hours spent on phone’ increased from 2 to 3, the ‘level of anxiety’ could then change in a few different ways, including: decreasing from 5 to 4, decreasing from 5 to 3, increasing from 5 to 6, or even staying the same at 5. However, the ‘level of anxiety’ is still most likely to decrease.
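To see how an inconsistent pattern weakens r, the sketch below compares two invented data sets: one where anxiety falls by the same amount every time, and one where it falls overall but jumps around from day to day. All values and the name `pearson_r` are hypothetical, and the weak result is close to, but not exactly, the -0.3 used in the example above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom


hours = [1, 2, 3, 4, 5, 6, 7]  # hypothetical hours spent on phone

# Anxiety falls by exactly 1 for every extra hour: a perfect negative link
consistent = [9, 8, 7, 6, 5, 4, 3]

# Anxiety is a little lower at the high end, but rises and falls along the way
inconsistent = [7, 5, 6, 3, 7, 4, 5]

r_strong = pearson_r(hours, consistent)
r_weak = pearson_r(hours, inconsistent)

print(f"consistent decrease:   r = {r_strong:.2f}")
print(f"inconsistent decrease: r = {r_weak:.2f}")
```

The second r is still negative, so the direction has not changed. It is much closer to 0, which is what a weak connection looks like in numbers.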
The above points are reflected in the scatter plots below (Robertson 2023). They show that the closer the points are to the straight line, the stronger the correlation.
Summary
- The purpose of correlation analysis is “to increase our understanding of how different variables are related and to identify patterns in those relationships” (Mcleod 2023).
- Through correlation analysis, we obtain a correlation coefficient ‘r’. This is a value that goes from –1 to 1. It tells us the direction and strength of the relationship between our two variables.
To learn more about correlation analysis, please see the resources below.
(Author: Julia Mathews)
What is it?
Videos:
Correlation – the basic idea explained by Benedict (2014)
This video is an introduction to correlation. It guides you on how to read scatter plots. Then, using these plots, it explains what the correlation coefficient is. It also talks about the limits of correlation analysis.
(Academic reference: Benedict. (2014, April 11). Correlation – the basic idea explained [Video]. YouTube. https://youtu.be/qC9_mohleao)
Articles:
Interpretation of correlations in clinical research by Man Hung, Jerry Bounsanga, & Maren Wright Voss (2017)
This is an advanced article. It explains what correlation is, how to use it, and how to interpret it. It is helpful because it shows why knowing more about correlation can lead to better research results.
(Academic reference: Hung, M., Bounsanga, J., & Voss, M. W. (2017). Interpretation of correlations in clinical research. Postgraduate Medicine, 129(8), 902–906. https://doi.org/10.1080/00325481.2017.1383820)
Books:
Correlational research by Paul C. Price (2017)
In section 6.2 of this book, the authors explain what correlational research is. Why carry out correlation analysis rather than other types of analysis? What are correlation coefficients? What is the difference between correlating and causing? These questions are answered in depth. There are examples, figures and links to further resources.
(Academic reference: Price, P. C., Jhangiani, R. S., Chiang, A. I., Leighton, D. C., & Cuttler, C. (2017). 6.2 Correlational research. Research methods in psychology (3rd ed.). https://opentext.wsu.edu/carriecuttler/chapter/correlational-research/)
How is it done?
Videos:
Correlation coefficient by The Organic Chemistry Tutor (2020)
This video is a step-by-step guide. It shows you how to calculate the correlation coefficient. Using graphs, it gives a brief explanation of the types of correlation and then it demonstrates how to get the values you need to do the calculation. It concludes by showing you how to use Pearson’s correlation coefficient formula.
(Academic reference: The Organic Chemistry Tutor. (2020, June 25). Correlation coefficient [Video]. YouTube. https://youtu.be/11c9cs6WpJU)
Correlation coefficient | types, formulas & examples by Pritha Bhandari (2023)
This guide is useful when selecting which correlation coefficient you should use. The guide talks about this in the ‘Types of correlation coefficients’ section where it discusses the differences between coefficients. It shows different formulas and explains them. It also focuses on Pearson’s ‘r’ and Spearman’s ‘rho’. In other sections, the guide explains what a correlation coefficient is. It shows how to visualise correlation via scatter plots.
(Academic reference: Bhandari, P. (2023). Correlation coefficient | types, formulas and examples. Scribbr. https://www.scribbr.com/statistics/correlation-coefficient/)
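One way to see the difference between Pearson’s ‘r’ and Spearman’s ‘rho’ is to calculate both on the same data. The sketch below is an illustration only, not the method shown in the guide: it treats Spearman’s rho as Pearson’s r applied to ranks (tied values share their average rank). The function names and the data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r calculated on the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Here rho is 1 because y rises every time x rises, while r is slightly below 1 because the points follow a curve rather than a straight line.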
How to calculate correlation coefficient in Excel (2 easy ways) by Sumit Bansal (n.d.)
This guide shows you two different ways to calculate the correlation coefficient in Excel. Firstly, it explains how to use Excel’s CORREL formula. Secondly, it gives a step-by-step guide on how to enable and use Excel’s Data Analysis Toolpak. There are 7 steps on how to enable it, and 9 steps on how to use it.
(Academic reference: Bansal, S. (n.d.). How to calculate correlation coefficient in Excel (2 easy ways). TrumpExcel. https://trumpexcel.com/correlation-coefficient-excel/)
Websites:
Correlation studies in psychology research by Kendra Cherry (2023)
This website is a useful guide to correlation research. It first explains the traits of a correlation study. Then, it talks about 3 different types of correlation research you can do. It gives pluses and minuses for each type. The site then concludes with potential pitfalls you might face and FAQs.
(Academic reference: Cherry, K. (2023). Correlation studies in psychology research. Verywell Mind. https://www.verywellmind.com/correlational-research-2795774)
Correlation calculator by Math is Fun (2018)
On this website, you can enter your data to find the correlation. It will also create a scatter plot with your data. To use the site, first click on the ‘Table’ button. You can then enter all your observations. For each observation, put your X value in the left column and Y value in the right column. You can see your data on a scatter plot by clicking on the ‘Graph’ button. The value of your correlation coefficient (r) shows up in the ‘Correlation’ box.
(Academic reference: Math is Fun. (2018). Correlation calculator. https://www.mathsisfun.com/data/correlation-calculator.html)
Method in action
Articles:
Meaning in life and hope as predictors of positive mental health: Do they explain residual variance not predicted by personality traits? by Peter Halama and Maria Dědová (2007)
In this article, the authors use correlation analysis on lots of different variables. They aim to find the link between people’s traits, sense of meaning in life, hope and good mental health. It is a useful read to see how correlation is used when there are lots of variables.
(Academic reference: Halama, P., & Dedova, M. (2007). Meaning in life and hope as predictors of positive mental health: Do they explain residual variance not predicted by personality traits? Studia Psychologica, 49(3), 191. https://www.researchgate.net/publication/279899436_Meaning_in_life_and_hope_as_predictors_of_positive_mental_health_Do_they_explain_residual_variance_not_predicted_by_personality_traits)
The correlation of social support with mental health: A meta-analysis by Tayebeh Fasihi Harandi, Maryam Mohammad Taghinasab, and Tayebeh Dehghan Nayeri (2017)
This article compares information from 64 correlation studies. The authors examine how social support is linked to mental health in Iran. It is a helpful read if you want to see how researchers use correlation in a meta-analysis.
(Academic reference: Harandi, T. F., Taghinasab, M. M., & Nayeri, T. D. (2017). The correlation of social support with mental health: A meta-analysis. Electronic Physician, 9(9), 5212–5222. https://doi.org/10.19082/5212)