Data Science ::: Python Measuring Central Tendency

Python - Measuring Central Tendency

Mathematically central tendency means measuring the center or distribution of location of values of a data set. It gives an idea of the average value of the data in the data set and also an indication of how widely the values are spread in the data set. That in turn helps in evaluating the chances of a new input fitting into the existing data set and hence probability of success.

There are three main measures of central tendency which can be calculated using the methods in pandas python library.

· Mean - It is the Average value of the data which is a division of sum of the values with the number of values.

· Median - It is the middle value in distribution when the values are arranged in ascending or descending order.

· Mode - It is the most commonly occurring value in a distribution.

Calculating Mean and Median

The pandas functions can be directly used to calculate these values.

import pandas as pd

#Create a Dictionary of series

d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack',

   'Lee','Chanchal','Gasper','Naviya','Andres']),

   'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]),

   'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}

#Create a DataFrame

df = pd.DataFrame(d)

print "Mean Values in the Distribution"

print df.mean()

print "*******************************"

print "Median Values in the Distribution"

print df.median()

Its output is as follows −

Mean Values in the Distribution

Age       31.833333

Rating     3.743333

dtype: float64

*******************************

Median Values in the Distribution

Age       29.50

Rating     3.79

dtype: float64

Calculating Mode

Mode may or may not be available in a distribution depending on whether the data is continous or whether there are values which has maximum frquency. We take a simple distribution below to find out the mode. Here we have a value which has maximum frequency in the distribution.

import pandas as pd

#Create a Dictionary of series

d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack',

   'Lee','Chanchal','Gasper','Naviya','Andres']),

   'Age':pd.Series([25,26,25,23,30,25,23,34,40,30,25,46])}

#Create a DataFrame

df = pd.DataFrame(d)

print df.mode()

Its output is as follows −

     Age      Name

0   25.0    Andres

1    NaN  Chanchal

2    NaN    Gasper

3    NaN      Jack

4    NaN     James

5    NaN       Lee

6    NaN    Naviya

7    NaN     Ricky

8    NaN     Smith

9    NaN     Steve

10   NaN       Tom

11   NaN       Vin

24loader, Home of exclusive blog and update music and entertaiment

Header Ads

Data Science ::: Python Measuring Central Tendency

Calculating Mean and Median

Calculating Mode

No comments

Facebook

Videos

Popular

Recent

Comments

Tags

Related Post No.

ads

Popular Posts

Photography

Recent in Sports

Custom Widget

Related Post No.