Statistics For Data Science Course

Central Tendencies: Mean, Median, and Mode

What is a Central Tendency?

Central tendency is a summary of a dataset as a single value that represents the centre, or location, of the data distribution.

Let’s take an example. Suppose we are given a dataset of the sex ratio of all the states of India. In this case, a central tendency is a single number that denotes the “centre” of the data. We can say that the mean sex ratio of all states is 940. We can also say that the median sex ratio of all Indian states is 946.6. Here, 940 or 946.6 is a single number that represents the centre or location of our dataset and is known as a measure of central tendency. While measures of central tendency are good for describing the location of the data, they tell us nothing about its spread, that is, how dispersed or how tightly clustered the data points are in the distribution. Central tendency is most commonly measured using:

  • Mean – the calculated average
  • Median – the middle value
  • Mode – the most frequently occurring value

Mean

The mean is the simple arithmetic average of the data points: the sum of all the values divided by the number of values.

In our previous example, we may ask the following question –

  • What is the average sex ratio of all the states of India?
We can answer this by taking the average of the sex ratios of all the states. In the following graph, each brown point represents the sex ratio of one state, while the blue line represents the central tendency, which is the mean in this case.

[Figure: Mean_SexRatio, the sex ratio of each state (brown points) with the mean (blue line)]
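
The calculation behind the blue line can be sketched in a few lines of Python. This is only a minimal illustration: the values in `sex_ratios` are hypothetical numbers chosen for the example, not the actual state-level figures, and the helper name `mean` is our own.

```python
# Hypothetical sex-ratio values for a handful of states (illustrative only,
# NOT the real figures behind the graph above).
sex_ratios = [930, 945, 960, 925, 940]


def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


average_sex_ratio = mean(sex_ratios)
print(average_sex_ratio)  # 940.0
```

In practice, Python's built-in `statistics.mean` does the same job; the hand-written version is shown only to make the formula explicit.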

Median

The median is the value that divides the dataset into two equal halves. That is, it separates the lower half of the data from the upper half. To calculate the median, we sort the N data points in increasing order and then take:
  • the ((N+1)/2)th term, if N is odd
  • the average of the two middle terms, ((N/2)th term + (N/2 + 1)th term)/2, if N is even
The following picture depicts an example for each case.

[Figure: Median_Calculation, an example of the median for odd N and for even N]
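
The two cases can also be written out as a short Python sketch. The function name `median` and the sample lists are assumptions made for illustration. Note that the formulas above count terms from 1, while Python lists are indexed from 0, so the ((N+1)/2)th term sits at index N // 2.

```python
def median(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)          # step 1: sort in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # N is odd: the ((N+1)/2)th term, which is index N // 2 when counting from 0
        return ordered[mid]
    # N is even: average of the (N/2)th and (N/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9]))      # 7    (odd N: middle of 3, 7, 9)
print(median([4, 1, 3, 2]))   # 2.5  (even N: average of 2 and 3)
```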

Mean Vs Median

The mean does not give a correct picture when the data has outliers or the distribution is skewed.

For example, let’s say the pocket money of 4 students, in thousands of rupees, is 5, 7, 8, and 12. Then the average pocket money is 8k. But when an outlier joins them, say a very rich student named Mark whose pocket money is 100,000k, the average becomes about 20,000k (100,032k / 5). This is a gross misrepresentation, as 4 out of 5 students have very little pocket money compared to Mark. The median pocket money in this case, however, is 8k, which makes sense.

The median gives a better idea of central tendency when the distribution is skewed.

The following diagram summarizes mean vs median with another example.

[Figure: MeanVsMedian]
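
The pocket-money example above can be checked with Python's standard `statistics` module. This is a small sketch using the figures from the text (5, 7, 8, 12, and Mark's 100,000, all in thousands of rupees); the variable names are our own.

```python
from statistics import mean, median

# Pocket money in thousands of rupees, as given in the example above.
pocket_money = [5, 7, 8, 12]
with_mark = pocket_money + [100_000]   # Mark, the outlier, joins the group

mean_before = mean(pocket_money)       # 8
median_before = median(pocket_money)   # 7.5
mean_after = mean(with_mark)           # 20006.4
median_after = median(with_mark)       # 8

print(f"before Mark: mean={mean_before}, median={median_before}")
print(f"after Mark:  mean={mean_after}, median={median_after}")
```

A single extreme value moves the mean from 8k to about 20,000k, while the median only moves from 7.5k to 8k. This is why the median is usually the safer summary for skewed data.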

Mode

The mode is the most frequent value in our dataset. Let’s say our dataset has the following data points:

  • 1, 2, 3, 3, 3, 4, 4, 5, 3, 7, 3

In this case, 3 occurs more often than any other value, and hence 3 is the mode of our dataset. The mode is used when we want to know the most likely outcome in our dataset. It also works for categorical data. For example, consider the religion of people in India: about 80% of people living in India are Hindus, and hence Hindu is the mode of our data.
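
Counting occurrences is easy to sketch with `collections.Counter` from the Python standard library. The helper name `mode` and the list `labels` are assumptions made for illustration; `labels` uses placeholder categories rather than real survey data.

```python
from collections import Counter


def mode(values):
    """Return the most frequent value; on a tie, the one seen first wins."""
    if not values:
        raise ValueError("the mode of an empty dataset is undefined")
    counts = Counter(values)                  # value -> number of occurrences
    value, _count = counts.most_common(1)[0]  # the single most common entry
    return value


scores = [1, 2, 3, 3, 3, 4, 4, 5, 3, 7, 3]    # the data points from the text
print(mode(scores))                           # 3 (it appears five times)

# The same function works on categorical data (hypothetical labels).
labels = ["A", "B", "A", "C", "A"]
print(mode(labels))                           # A
```

A dataset can have more than one mode when several values share the highest count. This sketch returns only one of them; `statistics.multimode` returns all of them.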

The following video deals with measures of central tendency in detail.
