Data, Maths & Descriptive Statistics

In this section we will be covering:

Understand and evaluate the different types of data including: raw data, quantitative and qualitative, primary and secondary
Understand how to round up/down significant figures
Understand how to calculate estimates of data
Understand how to calculate and convert percentages and fractions including: calculating the number of a percentage, calculating the percentage of a number, convert a percentage to a decimal and vice versa, convert a decimal to a fraction and vice versa
How to calculate and simplify a ratio
Understand how to analyse, calculate and evaluate the descriptive statistics including: mean, median and mode
Understand how to analyse, calculate and evaluate the measures of dispersion including: range and standard deviation

Raw data:

Data that psychologists have collected from an investigation but that has not yet been processed or analysed, for example the number of ‘yes’ responses to a question. To record this data, psychologists put it into a data table.

Checklist for a raw data table:

  • A title outlining what the table is about.
  • Rows and columns are clearly labelled.
  • Unit measurements such as percentages should be labelled in the heading, not put next to every score.

Quantitative and Qualitative Data

Tip for remembering them:

Quantitative data = numbers

Qualitative data = language

Primary and Secondary Data

When a researcher collects data either by witnessing an event or by carrying out an experiment or questionnaire, this is known as PRIMARY data. It can be quantitative or qualitative; the key to it being primary data is that it is collected first hand by the researcher.

By contrast, when data is collected second hand, which is through the analysis of pre-existing data, we call this secondary data.  When we use statistics or refer to existing research to develop our own theories, this is secondary data.

Tip for remembering them:

Primary = first (first hand) and Secondary = second (second hand).

Strengths of primary data:

  • Is gathered first hand, therefore there is more certainty on how valid it is, as the researcher themselves knows the strengths and weaknesses of their own research.
  • If collected objectively, with careful planning and sampling, controls in place and other features of methodology adhered to, then they’re likely to be scientifically gathered for the stated aim of the study. This means they are more credible.
  • New research and ideas can be discovered through primary data, as it may not have been explored before.

Weaknesses of primary data:

  • Expensive to obtain because each researcher or research team has to start from the beginning of a study and follow the whole study through, finding participants, organising materials and running the study.
  • Time consuming, due to the above.
  • Limited to the time, place and number of participants etc., whereas secondary data can come from different sources to give more range and detail.
  • May be biased due to the researcher wanting to find certain results.

Strengths of secondary data:

  • Doesn’t take long to collect as the research has already been carried out.
  • Can gather lots of data in a short space of time.
  • Can help to build an idea of what most research is presenting in certain areas.

Weaknesses of secondary data:

  • You don’t always know where it has come from or how reliable it is.
  • Might not be relevant to what you are researching. This can lead to spending lots of time trawling through journals and research papers.
  • Data can be overcomplicated and more difficult to understand, because the research has been written by someone else.
  • Sometimes the data can be out of date.

Maths 

Rounding up/down significant figures

Rules:
• The first non-zero digit reading from left to right is the first significant figure.
• If the digit after the last significant figure you are keeping is 5 or above, we round up.
• If that digit is 4 or below, we round down.

Worked examples:

1 significant figure: 423,249 = 400,000 (rounded down)

1 significant figure: 0.00379 = 0.004 (rounded up). The first significant figure is the first digit after the leading zeros.

2 significant figures: 0.0040352 = 0.0040 (rounded down). The first and second significant figures are 4 and 0; the leading zeros are ignored.

  1. The world’s oldest living plant is the Tasmanian King’s Holly at 43,600 years old. 2 significant figures = 44,000
  2. 1,143,552 paper bags are used in the USA every hour. 3 significant figures = 1.14 million
  3. There are 635,013,559,599 possible hands in a game of bridge. 2 significant figures = 640 billion
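
The rounding rules above can be sketched in Python. This is a minimal sketch rather than a definitive method: round_sig is a hypothetical helper name, and it uses the standard decimal module with ROUND_HALF_UP so that a 5 always rounds up, as in the rules above (Python's built-in round sends halves to the nearest even digit instead).

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value, sig):
    """Round value to `sig` significant figures (hypothetical helper)."""
    number = Decimal(str(value))
    if number == 0:
        return number
    # adjusted() gives the position of the first significant figure,
    # e.g. 5 for 423249 and -3 for 0.00379
    first_position = number.adjusted()
    # Place value of the last significant figure we want to keep
    last_place = Decimal(1).scaleb(first_position - sig + 1)
    return number.quantize(last_place, rounding=ROUND_HALF_UP)

for value, sig in [(423249, 1), (0.00379, 1), (0.0040352, 2), (43600, 2)]:
    # The "f" format prints the result without scientific notation
    print(value, "to", sig, "s.f. =", format(round_sig(value, sig), "f"))
```

The result is a Decimal, which keeps a trailing zero such as the one in 0.0040 that an ordinary float would drop.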

Make estimations from data collected

When making estimations, you may want to round figures to one digit (one significant figure). For example, with the sum 234 x 39.78 you might just want to know “very roughly” what sort of value you are expecting rather than knowing the precise answer. So we do an “order of magnitude” calculation which means rounding the numbers to 1 digit (1 significant figure), so we get: 200 x 40 = 8000.
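
As a rough illustration of the order-of-magnitude idea, the sketch below rounds each number to 1 significant figure and then multiplies. The function names are hypothetical, and it assumes non-zero inputs. It relies on Python's built-in round, which treats an exact half differently from the classroom rule, so it is only meant for quick estimates like the one above.

```python
from math import floor, log10

def one_sig_fig(x):
    """Round a non-zero number to 1 significant figure (hypothetical helper)."""
    # floor(log10(x)) is the power of ten of the first digit:
    # 2 for 234 (hundreds) and 1 for 39.78 (tens)
    power = floor(log10(abs(x)))
    return round(x, -power)

def estimate_product(a, b):
    """Order-of-magnitude estimate: round both numbers first, then multiply."""
    return one_sig_fig(a) * one_sig_fig(b)

print("Estimate:", estimate_product(234, 39.78))
print("Exact:   ", 234 * 39.78)
```

Comparing the two printed values shows how far the rough estimate is from the exact answer.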

Assessment activity – remember the rounding up and down rule

  1. 574 x 29 =
  2. 333 x 14 =
  3. 88 x 9 =

Calculating and converting percentages (%) and fractions

These will all be familiar to you from Maths GCSE. However, that was quite some time ago, so you may need a refresher. Remind yourself of these simple conversions and then attempt the questions.

Calculating the number of a percentage

Find 32% of 50

Divide the percentage that you want to find by 100 = 32/100 = 0.32

Then multiply that value by the number wanted

0.32 x 50 = 16

Calculating the percentage of a number

Find the percentage of people who said ‘yes’ to wanting to switch from Apple to Android in a survey of 30 people, where 18 said ‘yes’.

In this question, you would need to divide 18 by the total and multiply by 100

18/30 = 0.6 and 0.6 x 100 = 60%
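
The two percentage calculations can be checked with a short Python sketch. The function names percent_of and as_percentage are made up for illustration, and the results are printed to one decimal place because computer arithmetic with decimals is not always exact.

```python
def percent_of(percent, number):
    """Find `percent`% of `number`: divide by 100, then multiply."""
    return percent / 100 * number

def as_percentage(part, total):
    """Express `part` as a percentage of `total`: divide, then multiply by 100."""
    return part / total * 100

print(f"32% of 50 = {percent_of(32, 50):.1f}")
print(f"18 out of 30 = {as_percentage(18, 30):.1f}%")
```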

Converting decimals to percentages

Multiply by 100

Add % sign

0.045 x 100 = 4.5%

Converting percentages to decimals

Remove % sign

Divide by 100

75% = 75/100 = 0.75
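
The same two conversions, written as a small Python sketch. The function names are hypothetical, and the percentage is assumed to be given as text ending in a % sign.

```python
def decimal_to_percent(value):
    """Multiply by 100 and add the % sign."""
    # The "g" format drops unnecessary trailing zeros, e.g. 4.5 not 4.500000
    return f"{value * 100:g}%"

def percent_to_decimal(text):
    """Remove the % sign and divide by 100."""
    return float(text.rstrip("%")) / 100

print(decimal_to_percent(0.045))
print(percent_to_decimal("75%"))
```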

Converting decimal to fraction

Write the digits after the decimal point over 100 for 2 decimal places, or over 1000 for 3 decimal places

0.75 has 2 decimal places, so 0.75 = 75/100, and 0.125 has 3 decimal places, so 0.125 = 125/1000

Find the highest number that divides equally into both numbers of the fraction

In this case 25 goes into both 75 and 100. You then work out how many times 25 fits into 75, and how many times it fits into 100

25 fits into 75 three times and into 100 four times, so 75/100 = 3/4

Converting fraction to decimal

This is much easier! You just divide the top number by the bottom number

1/5 = 0.2
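
The decimal-to-fraction steps can be sketched in Python as follows. decimal_to_fraction is a hypothetical helper that assumes a non-negative decimal written as text (so the number of decimal places can be counted); the standard library's Fraction type is used only as a cross-check.

```python
from fractions import Fraction
from math import gcd

def decimal_to_fraction(text):
    """Convert a decimal written as text, e.g. "0.75", to (top, bottom)."""
    whole, _, digits = text.partition(".")
    bottom = 10 ** len(digits)   # 100 for 2 decimal places, 1000 for 3
    top = int(whole + digits)    # "0" + "75" gives 75
    highest = gcd(top, bottom)   # highest number that divides into both
    return top // highest, bottom // highest

print(decimal_to_fraction("0.75"))
print(decimal_to_fraction("0.125"))
print(Fraction("0.75"))  # the standard library simplifies in the same way
print(1 / 5)             # fraction to decimal: top divided by bottom
```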

Ratios

A ratio is how much of one thing there is compared to another thing. For example, 8:10 means a ratio of 8 to 10. Ratios can be simplified like fractions, so in this case both numbers can be divided by 2 and the ratio is simplified to 4:5.

A table to show the number of participants who perceived an ambiguous image as a monkey or as a teapot from both conditions: image presented with animals and image presented with kitchen items.

                                Perceived as a monkey    Perceived as a teapot
Presented with animals                   15                       10
Presented with kitchen items              5                       12

  a) Identify and simplify the ratio of the number of participants who perceived a monkey in the first condition and the number who perceived a monkey in the second condition. [2]
  b) Identify and simplify the ratio of the number of participants who perceived a teapot in the first condition and the number who perceived a teapot in the second condition. [2]

Answer – a) identified = 15:5 and simplified = 3:1 b) identified = 10:12 and  simplified = 5:6

Explaining the Answer

The question asks for two things, identifying the ratio and simplifying it, and one mark is awarded for each. In condition one, 15 participants perceived the image as a monkey compared to 5 in condition two, therefore the ratio is identified as 15:5.

In order to simplify a ratio, you divide the numbers by the greatest common factor, which is the largest number that both numbers in the ratio can be divided by. In this case both 15 and 5 can be divided by 5. 15/5 = 3 and 5/5 = 1, therefore the ratio is 3:1.

The same principle applies to question b. The question asks for the ratio of the number who perceived a teapot in the first condition, which is 10, and the number who perceived a teapot in the second condition, which is 12, therefore the answer is 10:12.

Simplifying 10:12 is again done by finding the highest common factor, which is 2. Therefore, you divide both numbers by 2. 10/2 = 5 and 12/2 = 6, so your answer is 5:6.
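
Simplifying a ratio by its greatest common factor can be sketched in Python using math.gcd from the standard library. simplify_ratio is a hypothetical helper name, and the sketch assumes whole-number counts.

```python
from math import gcd

def simplify_ratio(first, second):
    """Divide both sides of a ratio by their greatest common factor."""
    common = gcd(first, second)
    return first // common, second // common

# Ratios taken from the monkey/teapot table above
for first, second in [(15, 5), (10, 12), (8, 10)]:
    a, b = simplify_ratio(first, second)
    print(f"{first}:{second} simplifies to {a}:{b}")
```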

Descriptive statistics

When analysing data, descriptive statistics are used to describe the basic features of the data. They provide a summary of the results and are the first step in any data analysis.

There are two types of descriptive statistics: measures of central tendency and measures of dispersion.

Measures of central tendency: Mean, Median & Mode

The MEAN is the average of the numbers. It is calculated by adding up all the scores and dividing by the total number of scores.

For example,

6 + 9 + 9 + 13 + 15 + 21 + 24 + 24 + 28 + 32 = 181

181/10 (as there are 10 scores) = 18.1
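
The same calculation as a short Python sketch, using the ten scores from the example above.

```python
scores = [6, 9, 9, 13, 15, 21, 24, 24, 28, 32]

total = sum(scores)          # add up all the scores
mean = total / len(scores)   # divide by the number of scores

print("Total:", total)
print("Number of scores:", len(scores))
print("Mean:", mean)
```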

Strengths:

  • Most informative as it takes every score into account

Weaknesses:

  • Any score that is much larger or smaller than the rest of the data (an extreme score) can distort the mean
  • Sometimes the mean doesn’t make sense in terms of what the data is about e.g. the mean number of children in a family = 2.4

The MEDIAN is the middle number. It is calculated by finding the middle score after placing all the scores in numerical order.
If there is an odd number of scores, the median is the middle number.

For example,

4, 7, 8, 9, 14, 21, 28, 29, 34: Median = 14

If there is an even number of results, the median is the mean of the two central numbers.

4, 7, 8, 9, 14, 21, 23, 28, 29, 34: (14 + 21)/2 = 35/2, so Median = 17.5
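
Both cases can be sketched in one Python function. This is an illustrative version written out step by step; the function name median is our own choice here.

```python
def median(scores):
    """Middle score once the scores are placed in numerical order."""
    ordered = sorted(scores)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd number of scores: take the single middle one
        return ordered[middle]
    # Even number of scores: take the mean of the two central ones
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([4, 7, 8, 9, 14, 21, 28, 29, 34]))
print(median([4, 7, 8, 9, 14, 21, 23, 28, 29, 34]))
```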

Strengths:

  • It is less affected by extreme scores

Weaknesses:

  • It is not suited to small sets of data, especially if they contain widely varying scores, e.g. for 7, 8, 9, 102, 121 the median is 9, although a value nearer 60 would arguably be more representative of the data!

The MODE is the value that appears most frequently in a set of data.
When two numbers tie for appearing the most frequently, we call the data set bimodal (with more than two, multimodal).

For example,

6, 9, 9, 13, 15, 21, 24, 24, 28, 32: the modes are 9 and 24
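
Finding the mode or modes amounts to counting how often each score appears, which can be sketched with collections.Counter. The helper name modes is hypothetical, and returning an empty list when no score repeats is a design choice made for this sketch.

```python
from collections import Counter

def modes(scores):
    """Return the most frequent score(s), or [] if no score repeats."""
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        # Every score appears once, so there is no most frequent score
        return []
    return sorted(score for score, count in counts.items() if count == highest)

print(modes([6, 9, 9, 13, 15, 21, 24, 24, 28, 32]))
```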

Strengths:

  • Is not affected by extreme scores
  • Gives a good idea of how often something is occurring e.g. what mobile phone is selling the most

Weaknesses:

  • A set of data may not have a most frequent score

Measures of Dispersion

Measures of dispersion measure how spread out a set of data is and include the range, variance and standard deviation.

The RANGE is the difference between the lowest and highest values. It is calculated by subtracting the lowest score from the highest score in a data set.

For example:

3, 6, 8, 11, 14, 17, 18, 22, 23

23 is the highest score

3 is the lowest score

So the range is 20 (23-3)
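
In Python the range calculation is a one-line subtraction. The name data_range is used in this sketch only to avoid clashing with Python's built-in range.

```python
scores = [3, 6, 8, 11, 14, 17, 18, 22, 23]

highest = max(scores)
lowest = min(scores)
data_range = highest - lowest  # highest score minus lowest score

print(f"Range = {highest} - {lowest} = {data_range}")
```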

The STANDARD DEVIATION tells us about the spread of scores around the mean. So a small standard deviation would imply that the scores are all similar and close to the mean. A large standard deviation would indicate that the scores are at a larger distance from the mean.

Example

If we calculated the mean weather temperature throughout the summer in the UK, the mean may be 15 degrees. If we then calculated the standard deviation as being small, this would show that the temperature remained very consistent throughout the period. If, however, the standard deviation was very large, this would tell us that the weather varied greatly from very cold to very hot on some days.
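
To make the temperature example concrete, here is a minimal Python sketch. The two lists of temperatures are made up purely for illustration and both have a mean of 15. The notes do not give a formula, so this sketch assumes the population version of the standard deviation (dividing by the number of scores); the sample version divides by one less than the number of scores.

```python
from math import sqrt

def standard_deviation(scores):
    """Population standard deviation (assumed formula, see note above)."""
    mean = sum(scores) / len(scores)
    # How far each score is from the mean, squared so negatives do not cancel
    squared_differences = [(score - mean) ** 2 for score in scores]
    return sqrt(sum(squared_differences) / len(scores))

consistent = [14, 15, 15, 16, 15]  # made-up temperatures, mean 15
varied = [5, 25, 10, 20, 15]       # made-up temperatures, mean 15

print(f"Consistent weather: SD = {standard_deviation(consistent):.2f}")
print(f"Varied weather:     SD = {standard_deviation(varied):.2f}")
```

Both lists have the same mean, so only the standard deviation shows that the second set of temperatures is far more spread out.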

Activity

Table 1: The mean number of aggressive acts displayed by children in two different nurseries and the standard deviations for nursery one and nursery two.

                      Nursery one    Nursery two
Mean                       4              3
Standard deviation        1.6            0.21

What do the standard deviations tell us about the results?

The standard deviation is a measure of how spread out the scores are. In this case, it indicates that the results are more spread out in nursery one than in nursery two. This implies that some children in nursery one displayed a high amount of aggression and some showed low amounts, whereas in nursery two, because the SD is much lower, the children all had similar levels of aggression.

 

Strengths and weaknesses of measures of dispersion:

Range:

Strengths:

  • Easy to calculate
  • Takes into consideration extreme scores

Weaknesses:

  • Only uses two scores in the data set and ignores the rest
  • The extreme scores could distort the range

Standard Deviation:

Strengths:

  • Is less affected by extreme scores than the range
  • It uses the whole data set and gives a more accurate idea of how the data is distributed
  • It shows how much data is clustered around a mean value

Weaknesses:

  • Takes a long period of time to calculate
  • Can’t be used with categorical data
  • Assumes a normal distribution, which the data may not follow.