It can be difficult to see patterns and trends in a large dataset. The mean, median, and mode are measures of center that summarize the data concisely.
Terms like “average”, “most common”, and “typical value” are all examples of how measures of center are expressed in day-to-day life.
Population: This is the data from ALL individuals. For example, data on ALL individuals who have climbed Mt. Everest. Population data is fixed and complete.
Sample Data: This is data from SOME of the individuals. Data may vary from sample to sample. Sample data is variable and incomplete.
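To make the difference concrete, here is a minimal sketch that draws a few samples from one population and compares their means. The population (10,000 simulated heights in cm), the sample size of 50, and the variable names are all assumptions made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in cm (an assumption)
population = [random.gauss(170, 10) for _ in range(10_000)]

# The population mean is one fixed number
population_mean = statistics.mean(population)

# Each sample of 50 individuals gives a slightly different mean
sample_means = [
    statistics.mean(random.sample(population, 50)) for _ in range(3)
]

print("Population mean:", round(population_mean, 2))
print("Sample means:", [round(m, 2) for m in sample_means])
```

The population mean stays the same no matter how often it is computed, while the three sample means differ from one another and from the population mean.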
$$ \LARGE \mu = \frac{\sum_{i=1}^{n}x_i}{n} $$
The mean is the arithmetic average of the dataset and is often used as its representative value. To calculate the mean, sum all the values and divide by the number of values.
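The calculation can be sketched in a few lines of Python. The `values` list is hypothetical data chosen for illustration; `statistics.mean` from the standard library should give the same result as the manual sum-and-divide.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # hypothetical data

# Sum all the values, then divide by the number of values
manual_mean = sum(values) / len(values)

# The standard library performs the same calculation
library_mean = statistics.mean(values)

print(manual_mean)   # 18.0
print(library_mean)
```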
<aside> 💡
Disadvantage of mean: It is sensitive to outliers (extremely high or extremely low values). The mean always shifts towards outliers, so it is ideal only when the data is not influenced by outliers.
</aside>
<aside> 💡
The mean can be misleading when the data has outliers.
</aside>
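A small sketch of this effect, using hypothetical salaries (in thousands) that are assumptions for illustration: adding a single extreme value pulls the mean far above almost every value in the data.

```python
import statistics

salaries = [30, 32, 35, 38, 40]   # hypothetical salaries, in thousands
with_outlier = salaries + [500]   # one extremely high value added

mean_before = statistics.mean(salaries)      # 35
mean_after = statistics.mean(with_outlier)   # 112.5

# Count how many values fall below the new mean
below_mean = sum(1 for s in with_outlier if s < mean_after)

print(mean_before, mean_after)
print(below_mean, "of", len(with_outlier), "values are below the mean")
```

With the outlier included, five of the six values sit below the mean, so the mean no longer describes a typical salary in this made-up dataset.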

Skewed to the right

Skewed to the left

Symmetric data
The median is the middle value in our dataset. Follow the steps below to calculate the median: