Mean, median, mode and other statistical functions
This is a problem set. Some of these are easy, others are far more difficult. The purpose of these problem sets is to HELP YOU THINK THROUGH problems. The solution is at the bottom of this page, but please don't look at it until you have tried (and failed) at least three or four times.
What is this problem set trying to do
You are going to use a number of built-in methods here. If you complete this problem set, you will have shown me you sort of understand:
- sorting lists
- counting occurrences in a list
- modulo, an old friend you will return to
- the max function
- the min function
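As a quick warm-up, the built-ins listed above can be tried on a small list. This is only an illustrative sketch; the list `sample` and the other variable names are made up for the example and are not part of the problem set.

```python
# A hypothetical sample list, used only for this warm-up.
sample = [5, 3, 9, 3, 1]

ordered = sorted(sample)        # returns a new sorted list; sample itself is unchanged
threes = sample.count(3)        # how many times the value 3 occurs in the list
is_odd = len(sample) % 2 == 1   # modulo tells us whether the length is odd
largest = max(sample)           # biggest value in the list
smallest = min(sample)          # smallest value in the list

print(ordered, threes, is_odd, largest, smallest)
```

None of these calls change `sample`, so they can be combined freely inside your own functions.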
The Problem
Please program the following functions:
- Mean: For a data set, the terms arithmetic mean, mathematical expectation, and sometimes average are used synonymously to refer to a central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values.[2]
- Mode: The mode is the value that appears most often in a set of data.[3]
- Median: In statistics and probability theory, a median is the number separating the higher half of a data sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one (e.g., the median of {3, 3, 5, 9, 11} is 5). If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.[4]
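Once your own functions are written, one way to check them is to compare their answers with Python's standard `statistics` module. The sketch below uses the small data set {3, 3, 5, 9, 11} from the median definition; the names `data` and `even_data` are assumptions made for this illustration, and the second list is only there to exercise the even-length case.

```python
import statistics

# The example data set from the median definition.
data = [3, 3, 5, 9, 11]

print(statistics.mean(data))    # sum of the values (31) divided by the number of values (5)
print(statistics.median(data))  # middle value once the data is sorted
print(statistics.mode(data))    # value that appears most often

# A hypothetical even-length list: there is no single middle value,
# so the median is the mean of the two middle values (5 and 9).
even_data = [3, 5, 9, 11]
print(statistics.median(even_data))
```

Treat the module as a way to verify your work, not as a replacement for writing the functions yourself.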
Some Code to Get You Started
list=[2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5]

def mean(list):
    answer = sum(list)
    ...
    return mean

def mode(list):
    frequency = {}
    highest = max(list)
    lowest = min(list)
    # in this loop, we simply update our dictionary named "frequency" with the count of values.
    for i in range(lowest,highest+1):
        frequency.update({i:list.count(i)})
    ...
    return mode

def median(list):
    new_list = sorted(list)
    ...
    return median

print("the mean of list is: " + str(mean(list)))
print("the median of list is: " + str(median(list)))
print("the mode of list is: " + str(mode(list)))
Take This Further
- plot (graphically - with ascii art) the range of numbers
- calculate the standard deviation of a range of numbers
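If you get stuck on these extensions, the sketch below shows one possible starting point. It assumes the population standard deviation (dividing by the number of values, not by one less) and assumes the values are whole numbers for the plot. The names `standard_deviation`, `ascii_plot` and `numbers` are made up for this example.

```python
import math

def standard_deviation(values):
    # Population standard deviation: the square root of the
    # average squared distance of each value from the mean.
    average = sum(values) / len(values)
    squared_distances = [(v - average) ** 2 for v in values]
    return math.sqrt(sum(squared_distances) / len(values))

def ascii_plot(values):
    # One row per whole number between the lowest and highest value,
    # with one star for each time that number occurs.
    rows = []
    for i in range(min(values), max(values) + 1):
        rows.append(str(i) + " | " + "*" * values.count(i))
    return "\n".join(rows)

# A hypothetical list of numbers, used only for this sketch.
numbers = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(numbers))
print(ascii_plot(numbers))
```

A sample standard deviation would divide by `len(values) - 1` instead; which one is wanted depends on whether the list is the whole population or a sample of it.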
How you will be assessed
Every problem set is a formative assignment. Please click here to see how you will be graded
References
One Possible Solution
One possible solution follows, but do NOT read it before you have tried and failed!
list=[2,3,3,2,3,2,3,9,7,3,4,8,1,2,8,7,6,5,8,9,1,2,3,2,1,4,3,2,1,4,5,4,1,6,9,6,1,4,2,3,5]

def mean(list):
    answer = sum(list)
    mean = answer / len(list)
    return mean

def mode(list):
    frequency = {}
    highest = max(list)
    lowest = min(list)
    # in this loop, we simply update our dictionary named "frequency" with the count of values.
    for i in range(lowest,highest+1):
        frequency.update({i:list.count(i)})
    # pick the key with the highest count; if several values tie, the smallest one is returned.
    mode = max(frequency, key=frequency.get)
    return mode

def median(list):
    new_list = sorted(list)
    # integer division, because a list index must be a whole number
    middle = len(new_list) // 2
    if len(new_list) % 2 == 1:
        # odd number of values: take the middle one
        median = new_list[middle]
    else:
        # even number of values: take the mean of the two middle ones
        median = (new_list[middle - 1] + new_list[middle]) / 2
    return median

print("the mean of list is: " + str(mean(list)))
print("the median of list is: " + str(median(list)))
print("the mode of list is: " + str(mode(list)))
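The mode function above returns a single value even when several values are tied for most frequent. Below is a minimal sketch of one way to report every tied value, using `collections.Counter` from the standard library. The function name `all_modes` and its test list are made up for this example and are not part of the solution.

```python
from collections import Counter

def all_modes(values):
    # Count how often each value occurs.
    counts = Counter(values)
    # Find the highest count, then keep every value that reaches it.
    highest_count = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest_count)

# 2 and 3 both occur twice, so both are reported.
print(all_modes([1, 2, 2, 3, 3]))
```

Returning a list keeps the result the same shape whether there is one mode or several.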