# Mean, median, mode and other statistical functions

Many students have tried this problem and found it difficult. Please be patient and don't give up!!

This is a problem set for you to work through.[1]

Some of these problems are easy; others are far more difficult. The purpose of these problem sets is to HELP YOU THINK THROUGH problems. The solution is at the bottom of this page, but please don't look at it until you have tried (and failed) at least three or four times.

## What is this problem set trying to do?

You are going to use a number of built-in methods here. If you complete this problem set, you will have shown me you understand:

2. counting occurrences in a list
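
As a quick reminder of how counting works, here is a minimal sketch using the built-in `count` method of a list. The `grades` list and the variable names are made up for this illustration and are not part of the problem set.

```python
# hypothetical data, made up for this illustration
grades = [7, 5, 7, 6, 7, 5]

sevens = grades.count(7)  # how many times the value 7 appears
eights = grades.count(8)  # a value that never appears gives a count of 0

print(sevens)
print(eights)
```

Note that `count` walks through the whole list every time it is called, which is fine for small lists like the ones used here.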

## The Problem

1. Mean: For a data set, the terms arithmetic mean, mathematical expectation, and sometimes average are used synonymously to refer to a central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values.[2]
2. Mode: The mode is the value that appears most often in a set of data.[3]
3. Median: In statistics and probability theory, a median is the number separating the higher half of a data sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one (e.g., the median of {3, 3, 5, 9, 11} is 5). If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.[4]
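
If you want a reference to check your own functions against, one option is Python's standard `statistics` module. The sketch below is not part of the original problem set. It applies the three definitions above to the small example set from the median definition; the names `sample` and `even_sample` are made up for illustration.

```python
import statistics

# The example set from the median definition above.
sample = [3, 3, 5, 9, 11]

print(statistics.mean(sample))    # sum of the values / number of values
print(statistics.median(sample))  # middle value of the sorted data
print(statistics.mode(sample))    # the most frequent value

# With an even number of observations, the median is the mean
# of the two middle values (here 3 and 5).
even_sample = [3, 3, 5, 9]
print(statistics.median(even_sample))
```

Use this only to verify your results: the point of the problem set is to write the three functions yourself.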

## Some Code to Get You Started

```python
# "data" is used instead of "list" so that the built-in list type is not overwritten
data = [2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5]

def mean(values):
    ...  # replace the dots with your own code
    return mean

def mode(values):
    frequency = {}
    highest = max(values)
    lowest = min(values)
    # in this loop, we update our dictionary named "frequency" with the count of each value.
    # we use highest + 1 because the range function doesn't include the last value.
    for i in range(lowest, highest + 1):
        ...  # replace the dots with your own code
    return mode

def median(values):
    new_list = sorted(values)
    ...  # replace the dots with your own code
    return median

# your program must return the correct answers for the questions below:

print("the mean of list is: " + str(mean(data)))
print("the median of list is: " + str(median(data)))
print("the mode of list is: " + str(mode(data)))
```

## Take This Further

1. plot (graphically, with ASCII art) the range of numbers
2. calculate the standard deviation of a range of numbers

## A few different possible solutions

One possible solution is shown below, but do NOT read it before you have tried and failed!

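```python
# "data" is used instead of "list" so that the built-in list type is not overwritten
data = [2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5]

def mean(values):
    # the sum of the values divided by the number of values
    mean = sum(values) / len(values)
    return mean

def mode(values):
    frequency = {}
    highest = max(values)
    lowest = min(values)
    # in this loop, we update our dictionary named "frequency" with the count of each value.
    for i in range(lowest, highest + 1):
        frequency.update({i: values.count(i)})
    # dictionary views cannot be indexed in Python 3, so convert them to lists first
    counts = list(frequency.values())
    keys = list(frequency.keys())
    # if several values share the highest count, the smallest one is returned
    mode = keys[counts.index(max(counts))]
    return mode

def median(values):
    new_list = sorted(values)
    # integer division, so that the result can be used as a list index
    middle = len(new_list) // 2
    if len(new_list) % 2 == 1:
        median = new_list[middle]
    else:
        # even number of observations: take the mean of the two middle values
        median = (new_list[middle - 1] + new_list[middle]) / 2
    return median

print("the mean of list is: " + str(mean(data)))
print("the median of list is: " + str(median(data)))
print("the mode of list is: " + str(mode(data)))
```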

The example below includes standard deviation and a bar graph.

```python
import matplotlib.pyplot as plt

# a longer data set; you can pass it to the functions below instead of "numbers"
data = [2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5,5]
numbers = [1,2,3,4]

def graph(graph_list):
    count = []
    sorted_list = sorted(graph_list)
    highest = max(sorted_list)
    lowest = min(sorted_list)
    # the bin edges run from lowest to highest + 1, so every integer gets its own bar
    for i in range(lowest, highest + 2):
        count.append(i)
    # align='left' centers each bar on the integer it counts
    plt.hist(sorted_list, bins=count, align='left')
    plt.ylabel('Occurrences')
    plt.xlabel('Number')
    plt.show()

def mean(mean_list):
    # use the list that was passed in, not the global "numbers"
    mean = sum(mean_list) / len(mean_list)
    return mean

def mode(mode_list):
    frequency = []
    count = []
    sorted_list = sorted(mode_list)
    highest = max(sorted_list)
    lowest = min(sorted_list)
    # frequency[k] holds the number of occurrences of the value count[k]
    for i in range(lowest, highest + 1):
        frequency.append(sorted_list.count(i))
        count.append(i)
    highest_frequency = max(frequency)
    index_count = frequency.index(highest_frequency)
    mode = count[index_count]
    return mode

def median(median_list):
    sorted_list = sorted(median_list)
    if len(sorted_list) % 2 == 1:
        # integer division, so that the result can be used as a list index
        index = len(sorted_list) // 2
        median = sorted_list[index]
    else:
        # even number of observations: take the mean of the two middle values
        index_1 = len(sorted_list) // 2
        index_2 = len(sorted_list) // 2 - 1
        median = []
        median.append(sorted_list[index_1])
        median.append(sorted_list[index_2])
        median = mean(median)
    return median

def standev(stdev_list):
    squared_differences = []
    mean_num = mean(stdev_list)
    for i in range(0, len(stdev_list)):
        # square each difference so that positive and negative differences do not cancel out
        squared_differences.append((stdev_list[i] - mean_num) ** 2)
    # the variance is the mean of the squared differences
    variance = mean(squared_differences)
    # population standard deviation: the square root of the variance
    stdev = variance ** 0.5
    return stdev

# your program must return the correct answers for the questions below:

print("the mean of list is: " + str(mean(numbers)))
print("the median of list is: " + str(median(numbers)))
print("the mode of list is: " + str(mode(numbers)))
print("the standard deviation of list is: " + str(standev(numbers)))
graph(numbers)
```

The code below approaches mode differently from the examples above: instead of storing every count in a dictionary or a list, it only keeps track of the highest count seen so far.

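```python
# "data" is used instead of "list" so that the built-in list type is not overwritten
data = [2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5]

def mean(values):
    mean = sum(values) / len(values)
    return mean

def median(values):
    new_list = sorted(values)
    # integer division, so that the result can be used as a list index
    middle = len(new_list) // 2
    if len(new_list) % 2 == 1:
        median = new_list[middle]
    else:
        # even number of observations: take the mean of the two middle values
        median = (new_list[middle - 1] + new_list[middle]) / 2
    return median

def mode(values):
    current_top = 0
    highest = max(values)
    lowest = min(values)
    for i in range(lowest, highest + 1):
        new_possible_top = values.count(i)
        # strictly greater: when two values tie, the smaller one is kept
        if new_possible_top > current_top:
            current_top = new_possible_top
            mode = i
    return mode

# your program must return the correct answers for the questions below:
print("the mean of list is: " + str(mean(data)))
print("the median of list is: " + str(median(data)))
print("the mode of list is: " + str(mode(data)))
```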
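
The mode functions above return a single value, even when several values share the highest count. As a possible extension (an addition, not one of the original solutions), the sketch below returns every tied value. It swaps the `range` loop for `collections.Counter` from the standard library; the function name `modes` is made up for this illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest count, in ascending order."""
    counts = Counter(values)    # maps each value to its number of occurrences
    top = max(counts.values())  # raises ValueError if values is empty
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3, 3]))   # two values tie for the highest count
print(modes([3, 3, 5, 9, 11]))  # a single most frequent value
```

Unlike the `range(lowest, highest + 1)` loops, this version only counts values that actually occur, so it also works for data that are not integers.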