Measures of central tendency

To give a quick summary of data we use several mathematical methods. These methods usually try to determine where the center of the dataset lies.

There are three common measures, and each has its own characteristics:

  1. Mean

  2. Median

  3. Mode

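Mean is simply the average of the whole dataset: the sum of all values divided by the number of values. It is strongly affected by outliers.

Median is the middle point of the ordered dataset (for an even number of values, the average of the two middle ones). It is largely unaffected by outliers.

Mode is the element that occurs most frequently. A dataset can have more than one mode.

Range is the difference between the maximum and minimum values. Strictly speaking, it describes the spread of the data rather than its center.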
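
The definitions above can be tried out with a small sketch using Python's standard `statistics` module. The sample values are made up for illustration; the single large value (100) plays the role of an outlier.

```python
import statistics

# Hypothetical sample: six typical values plus one outlier (100).
data = [2, 3, 3, 4, 5, 6, 100]

mean = statistics.mean(data)         # sum of values / number of values
median = statistics.median(data)     # middle value of the sorted data
mode = statistics.mode(data)         # most frequent value
value_range = max(data) - min(data)  # spread between the extremes

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)

# Drop the outlier and recompute to see which measure moves the most.
trimmed = data[:-1]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)
print("mean without outlier:", trimmed_mean)
print("median without outlier:", trimmed_median)
```

With this sample, removing the outlier changes the mean far more than the median, which is the sense in which the median is robust to outliers.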

The measure of central tendency

If you are seeing a graph of this kind for the first time, there is no need to be overwhelmed by the shape of the curve.

The graph represents the distribution of the data: the X-axis shows the values that occur in the dataset, and the Y-axis shows how frequently each of those values occurs.
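
As a rough illustration of what such a graph encodes, the sketch below counts how often each value occurs and prints a text histogram. The `values` list is a made-up example, and `collections.Counter` is just one convenient way to tally frequencies.

```python
from collections import Counter

# Hypothetical dataset; each number is one observation.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Map each distinct value (X-axis) to its frequency (Y-axis).
freq = Counter(values)

# Print one row per value, with one '#' per occurrence.
for value in sorted(freq):
    print(f"{value}: {'#' * freq[value]}")
```

The longest row corresponds to the peak of the curve, which is also the mode of the dataset.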

 

Skewness (a measure of asymmetry)

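Skewness measures how asymmetric a distribution is: it indicates whether the data is more concentrated on one side than the other, and by how much.

If median < mean, the distribution is right (positively) skewed: most values are bunched at the lower end and a long tail of larger values stretches to the right, so the outliers are on the right. Those large values pull the mean above the median.

If median > mean, the distribution is left (negatively) skewed: most values are bunched at the higher end and a long tail of smaller values stretches to the left, so the outliers are on the left. Those small values pull the mean below the median.

In short, the skew is where the tail is (that is, where the outliers are).

When median = mean, the distribution is treated as having zero skew and is roughly symmetric. Comparing the mean and the median is a rule of thumb; it holds for typical single-peaked data but can fail for some distributions.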
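
The mean-versus-median comparison can be checked against a direct skewness calculation. The sketch below uses the moment-based coefficient (third central moment divided by the 1.5 power of the second), which is one common definition and not necessarily the one this post has in mind. The function name `sample_skewness` and the three sample lists are hypothetical.

```python
import statistics

def sample_skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (population moments)."""
    n = len(data)
    mean = statistics.mean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples: a long right tail, its mirror image, and a symmetric set.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10, 20]
left_skewed = [-x for x in right_skewed]
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    print(name,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data),
          "skewness:", sample_skewness(data))
```

For these samples the sign of the coefficient agrees with the rule of thumb: positive where the mean exceeds the median, negative where it falls below, and zero for the symmetric set.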

Skewness

 

 

Here are the links to all Maths 101 posts:

Maths 101: Part 1: Data Types and their visualization

Math 101: Part 2: Measures of central tendency and Skewness

Maths 101: Part 3: Random variables and Normal Distribution

Maths 101: Part 4: PDF, Central Limit Theorem and Chebyshev’s inequality

