Things to do:
1. Collect a fair and representative sample (using one of the three methods). In the two courseworks so far the data has been given.
2. Talk about your hypothesis.
3. Decide on groups and class intervals; you will need the sample size and range.
4. Sort: tally the data and construct a frequency table.
5. Display the data using one of: pie chart, bar chart or histogram, stem and leaf, pictogram, line graph, frequency polygon.
6. Calculate the mean and compare the data.
7. Talk about the measures of spread and draw a cumulative frequency curve and box plot. This will enable you to calculate the IQR.
8. Go on to the third measure of spread, which is the standard deviation.
9. Draw a table with all your results and use common sense to discuss the data.
10. Extend the task.
Statistics Coursework 1
The title of my investigation is 'The Weight of a Sample of Mice which are Randomly Selected'. As the title suggests, I am going to investigate the weights of a set of randomly selected male and female mice. I have chosen this topic because I am required by this subject to produce a coursework assignment.
This leaves me with the question of how heavy mice should be. Is there a real weight difference between male and female mice, or is my sample misleading, so that further testing may be required? Is there a reason why my sample shows differences, and is this true of all field mice? Is there a certain weight expected at certain times of the year for these field mice? In short, my aim for this investigation is to find the average weight for males and females, which will let me judge the weights in my sample. Then I will use probability to assess the chance of a male or female being a certain average weight.
To determine a population for my coursework, I am going to take the whole field as my population. My sample of mice has been provided: 107 female observations and 94 male observations.
It is nevertheless useful to be able to cross-reference this information, so I will show it in a table.
Table to show sample size and range
Mice     Observations of mice   Highest weight (gms)   Lowest weight (gms)   Range (gms)
Female   107                    29                     5                     24
Male     94                     29                     9                     20

Sorting data before I start to compare the information:
I am going to group my data because the sample size is too big to analyse in a long list. I know my classes will start at 5 gms because that is the lightest weight and end at 29 gms because that is the heaviest weight.
Grouping of weight: 5 to 29 gms, in classes of 5 gms (W = weight, the variable).
When drawing graphs it is better to show the classes as continuous so there are no gaps.
Weights (gms, in 5s)   Continuous class   Tally
5 to 9                 5 ≤ W < 10         Everything less than 10, so up to 9.999...
10 to 14               10 ≤ W < 15        Everything less than 15, so up to 14.999...
15 to 19               15 ≤ W < 20        Everything less than 20, so up to 19.999...
20 to 24               20 ≤ W < 25        Everything less than 25, so up to 24.999...
25 to 29               25 ≤ W < 30        Everything less than 30, so up to 29.999...

Now that I have decided to group my data, I will lose some accuracy because I will be using the mid-weight of each group to calculate the mean and standard deviation. However, the value of sorting the information into a table outweighs the small loss of accuracy.
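The grouping rule can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and the example weights are made up, not taken from the coursework data. It shows that a weight on a boundary, such as 10 gms, goes into the class that starts at 10, and that the mid-weight stands in for every weight in its class.

```python
# Class boundaries from the text: five classes of width 5, from 5 up to 30 gms.
CLASS_EDGES = [5, 10, 15, 20, 25, 30]
CLASSES = list(zip(CLASS_EDGES, CLASS_EDGES[1:]))  # (lower, upper) pairs


def class_of(weight):
    """Return the (lower, upper) class with lower <= weight < upper."""
    for lower, upper in CLASSES:
        if lower <= weight < upper:
            return (lower, upper)
    raise ValueError(f"weight {weight} is outside 5 <= W < 30")


def mid_weight(lower, upper):
    """Mid-weight of a class, used in place of the individual weights."""
    return (lower + upper) / 2


def tally(weights):
    """Count how many weights fall in each class."""
    counts = {c: 0 for c in CLASSES}
    for w in weights:
        counts[class_of(w)] += 1
    return counts


# Hypothetical weights for illustration only (not the coursework data).
example_weights = [5, 9, 10, 14, 17, 22, 29]
example_tally = tally(example_weights)
for (lower, upper), count in example_tally.items():
    print(f"{lower} <= W < {upper}: {count} mice, mid-weight {mid_weight(lower, upper)}")
```

With real data, `example_weights` would be replaced by the full list of recorded weights for one sex.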
Tally the data
Female (107 observations)
W = weight, F = frequency (count), w = mid-weight; tally bars are grouped in fives.
Weight W (gms)   Frequency F (number of mice)   Mid-weight w (gms)   Tally
5 ≤ W < 10       19                             7.5                  ||||| ||||| ||||| ||||
10 ≤ W < 15      29                             12.5                 ||||| ||||| ||||| ||||| ||||| ||||
15 ≤ W < 20      7                              17.5                 ||||| ||
20 ≤ W < 25      25                             22.5                 ||||| ||||| ||||| ||||| |||||
25 ≤ W < 30      27                             27.5                 ||||| ||||| ||||| ||||| ||||| ||
Total observations: 107

Male (94 observations)
W = weight, F = frequency (count), w = mid-weight; tally bars are grouped in fives.
Weight W (gms)   Frequency F (number of mice)   Mid-weight w (gms)   Tally
5 ≤ W < 10       1                              7.5                  |
10 ≤ W < 15      20                             12.5                 ||||| ||||| ||||| |||||
15 ≤ W < 20      45                             17.5                 ||||| ||||| ||||| ||||| ||||| ||||| ||||| ||||| |||||
20 ≤ W < 25      21                             22.5                 ||||| ||||| ||||| ||||| |
25 ≤ W < 30      7                              27.5                 ||||| ||
Total observations: 94

Now I will use some graphs to pick out any peaks or trends. There are many types of graph one can use to display information (pie chart, bar chart or histogram, stem and leaf, pictogram, line graph and frequency polygon). I have decided to use a frequency polygon because this will show two curves on one graph and pick out peaks and trends. I will also use a stem and leaf diagram to compare any extreme differences, and finally a histogram for an accurate display.
Frequency polygon (show AV)
Figure: female frequency polygon and male frequency polygon drawn on one graph, with frequency (0 to 45) on the vertical axis and weights (5 to 30 gms) on the horizontal axis.
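The points of a frequency polygon are the mid-weight of each class plotted against its frequency. As a rough sketch in Python (the variable and function names are my own, and the frequencies are the ones from the tally tables), the points and the highest peak of each polygon can be listed like this:

```python
# Mid-weights and frequencies taken from the tally tables.
MID_WEIGHTS = [7.5, 12.5, 17.5, 22.5, 27.5]
FEMALE_FREQ = [19, 29, 7, 25, 27]
MALE_FREQ = [1, 20, 45, 21, 7]


def polygon_points(mids, freqs):
    """Pair each mid-weight with its frequency: the points to plot and join."""
    return list(zip(mids, freqs))


def peak(mids, freqs):
    """Return the (mid-weight, frequency) point with the highest frequency."""
    return max(zip(mids, freqs), key=lambda point: point[1])


female_points = polygon_points(MID_WEIGHTS, FEMALE_FREQ)
male_points = polygon_points(MID_WEIGHTS, MALE_FREQ)

print("Female points:", female_points)
print("Male points:", male_points)
print("Female peak:", peak(MID_WEIGHTS, FEMALE_FREQ))
print("Male peak:", peak(MID_WEIGHTS, MALE_FREQ))
```

The male polygon has a single clear peak in the 15 to 20 gms class, while the female polygon dips in that class and is higher on either side of it.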

Stem and leaf
To get a visual idea of the spread of my data, I decided to represent it in a stem and leaf diagram:
Female Male
To display a histogram I will need to use a table to calculate the frequency density (the height of each bar).
Female
Weight W (gms)   Frequency F   Class interval CI   Workings F ÷ CI   Frequency density
5 ≤ W < 10       19            5                   19 ÷ 5 =          3.8
10 ≤ W < 15      29            5                   29 ÷ 5 =          5.8
15 ≤ W < 20      7             5                   7 ÷ 5 =           1.4
20 ≤ W < 25      25            5                   25 ÷ 5 =          5
25 ≤ W < 30      27            5                   27 ÷ 5 =          5.4
Total            107
Male
Weight W (gms)   Frequency F   Class interval CI   Workings F ÷ CI   Frequency density
5 ≤ W < 10       1             5                   1 ÷ 5 =           0.2
10 ≤ W < 15      20            5                   20 ÷ 5 =          4
15 ≤ W < 20      45            5                   45 ÷ 5 =          9
20 ≤ W < 25      21            5                   21 ÷ 5 =          4.2
25 ≤ W < 30      7             5                   7 ÷ 5 =           1.4
Total            94
Histograms
Histograms are similar to bar charts apart from the consideration of areas. In a bar chart, all of the bars are the same width and the only thing that matters is the height of the bar. In a histogram, the area of each bar is the important thing, because the area represents the frequency.
When drawing a histogram, the y-axis is labelled 'relative frequency' or 'frequency density'. You must work out the frequency density before you can draw a histogram. To do this, first choose a standard width for the groups, then divide each frequency by its class width. Here every class is 5 gms wide.
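The frequency density calculation can be checked with a short Python sketch. This is only an illustration with names of my own choosing; it uses the frequencies from the tally tables and the class width of 5 gms, and then confirms that the area of each bar (height times width) gives back the frequency.

```python
CLASS_WIDTH = 5  # every class in this investigation is 5 gms wide

FEMALE_FREQ = [19, 29, 7, 25, 27]
MALE_FREQ = [1, 20, 45, 21, 7]


def frequency_density(freqs, width=CLASS_WIDTH):
    """Height of each histogram bar: frequency divided by class width."""
    return [f / width for f in freqs]


def bar_areas(densities, width=CLASS_WIDTH):
    """Area of each bar, which should equal the class frequency."""
    return [d * width for d in densities]


female_density = frequency_density(FEMALE_FREQ)
male_density = frequency_density(MALE_FREQ)

print("Female frequency density:", female_density)
print("Male frequency density:", male_density)
print("Female bar areas:", bar_areas(female_density))
```

Because all the classes here have the same width, the histogram has the same shape as a bar chart of the frequencies; the division matters most when class widths differ.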
Now, to get a better view of my data, I will calculate the estimated mean. This is an estimate because I have tallied the data in groups, so I cannot add up all the individual weights and divide by the number of items in my survey. Instead I will use the mid-weight of each group to calculate the mean.
To find the total weight I will use a table:
Female
Weight W (gms)   Mid-weight w (gms)   Frequency F   Working (mid-weight × F)   Total weight (gms)
5 ≤ W < 10       7.5                  19            7.5 × 19 =                 142.5
10 ≤ W < 15      12.5                 29            12.5 × 29 =                362.5
15 ≤ W < 20      17.5                 7             17.5 × 7 =                 122.5
20 ≤ W < 25      22.5                 25            22.5 × 25 =                562.5
25 ≤ W < 30      27.5                 27            27.5 × 27 =                742.5
Total                                 107 mice      Total weight =             1932.5
Mean weight female = 1932.5 ÷ 107
= 18.06 gms

Male
Weight W (gms)   Mid-weight w (gms)   Frequency F   Working (mid-weight × F)   Total weight (gms)
5 ≤ W < 10       7.5                  1             7.5 × 1 =                  7.5
10 ≤ W < 15      12.5                 20            12.5 × 20 =                250.0
15 ≤ W < 20      17.5                 45            17.5 × 45 =                787.5
20 ≤ W < 25      22.5                 21            22.5 × 21 =                472.5
25 ≤ W < 30      27.5                 7             27.5 × 7 =                 192.5
Total                                 94 mice       Total weight =             1710.0
Mean weight male = 1710.0 ÷ 94
= 18.19 gms
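The estimated mean from grouped data is the total of (mid-weight × frequency) divided by the total frequency. The following Python sketch repeats that working as a check; the function name is my own, and the figures are the mid-weights and frequencies from the tables.

```python
MID_WEIGHTS = [7.5, 12.5, 17.5, 22.5, 27.5]
FEMALE_FREQ = [19, 29, 7, 25, 27]
MALE_FREQ = [1, 20, 45, 21, 7]


def estimated_mean(mids, freqs):
    """Estimated mean of grouped data: sum of (mid-weight x frequency) / total frequency."""
    total_weight = sum(m * f for m, f in zip(mids, freqs))
    return total_weight / sum(freqs)


female_mean = estimated_mean(MID_WEIGHTS, FEMALE_FREQ)
male_mean = estimated_mean(MID_WEIGHTS, MALE_FREQ)

print(f"Female estimated mean: {female_mean:.2f} gms")
print(f"Male estimated mean: {male_mean:.2f} gms")
```

Rounded to two decimal places this gives 18.06 gms for the females and 18.19 gms for the males, matching the working in the tables.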







Mice     Mean (gms)   Range (gms)
Female   18.06        24
Male     18.19        20

As my table shows, the mean weights of both are very similar, with the males slightly heavier on average. Looking at the spread of the data, we can see that the male weights have less variation (a range of 20 gms against 24 gms for the females). The range is a very simple measure of spread as it only takes account of the highest and lowest values. A better measure of spread is the I.Q.R. (interquartile range), which measures the spread of the middle half of the data, between the lower quartile and the upper quartile. I will need to draw a cumulative frequency graph to obtain this.
I have now decided to construct a cumulative frequency table so that I can draw a cumulative frequency graph. This will enable me to draw a box and whisker plot, and therefore see any outliers in my data. Here follows my table.
Female
Weight W (gms)   Frequency F   Cumulative weight (gms)   Cumulative frequency
-                -             Less than 5               0
5 ≤ W < 10       19            Less than 10              19
10 ≤ W < 15      29            Less than 15              48
15 ≤ W < 20      7             Less than 20              55
20 ≤ W < 25      25            Less than 25              80
25 ≤ W < 30      27            Less than 30              107
Total            107 mice

Male
Weight W (gms)   Frequency F   Cumulative weight (gms)   Cumulative frequency
-                -             Less than 5               0
5 ≤ W < 10       1             Less than 10              1
10 ≤ W < 15      20            Less than 15              21
15 ≤ W < 20      45            Less than 20              66
20 ≤ W < 25      21            Less than 25              87
25 ≤ W < 30      7             Less than 30              94
Total            94 mice

Cumulative graphs:
Mice     LQ (gms)   UQ (gms)   IQR (gms)
Female   11.5       25.5       14
Male     15         22         7





Again this has confirmed that the male weights are more consistent than the female weights, as there may be some expecting mothers among the females (the lower the measure of spread, the more consistent the data).
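The cumulative frequencies and the quartiles can also be estimated by calculation rather than read from a graph. The Python sketch below is only an illustration with names of my own choosing. It assumes the quartiles sit at positions n ÷ 4 and 3n ÷ 4 and joins the cumulative frequency points with straight lines, so its answers will be close to, but not exactly the same as, values read off a hand-drawn curve.

```python
from itertools import accumulate

LOWEST = 5                          # lower boundary of the first class
UPPER_BOUNDS = [10, 15, 20, 25, 30]  # upper boundary of each class
FEMALE_FREQ = [19, 29, 7, 25, 27]
MALE_FREQ = [1, 20, 45, 21, 7]


def cumulative(freqs):
    """Running total of the frequencies ('less than' each upper boundary)."""
    return list(accumulate(freqs))


def weight_at(position, freqs):
    """Weight at a cumulative position, by straight-line interpolation in its class."""
    lower, below = LOWEST, 0
    for upper, f in zip(UPPER_BOUNDS, freqs):
        if f > 0 and position <= below + f:
            return lower + (position - below) / f * (upper - lower)
        lower, below = upper, below + f
    raise ValueError("position is beyond the total frequency")


def quartiles(freqs):
    """Return (LQ, UQ, IQR), assuming positions n/4 and 3n/4."""
    n = sum(freqs)
    lq = weight_at(n / 4, freqs)
    uq = weight_at(3 * n / 4, freqs)
    return lq, uq, uq - lq


female_lq, female_uq, female_iqr = quartiles(FEMALE_FREQ)
male_lq, male_uq, male_iqr = quartiles(MALE_FREQ)

print("Female cumulative:", cumulative(FEMALE_FREQ))
print("Male cumulative:", cumulative(MALE_FREQ))
print(f"Female LQ {female_lq:.2f}, UQ {female_uq:.2f}, IQR {female_iqr:.2f}")
print(f"Male LQ {male_lq:.2f}, UQ {male_uq:.2f}, IQR {male_iqr:.2f}")
```

Whichever way the quartiles are obtained, the male IQR comes out well below the female IQR, so the comparison between the two groups does not depend on the reading method.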
An even better measure of spread is the standard deviation (s.d.). The range only looks at the two extremes and the IQR only uses the lower and upper quartiles, whereas the s.d. looks at every item in the survey.
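To calculate the s.d.:
$\text{Variance} = \dfrac{\sum f w^{2}}{\sum f} - \text{mean}^{2}$, and $\text{s.d.} = \sqrt{\text{Variance}}$
Mice     Mean (gms)   mean^{2}
Female   18.06        326.1636
Male     18.19        330.8761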

A table to calculate f × w^{2} (where w is the variable mid-weight)
Female
Mid-weight w (gms)   w^{2} (square of mid-weight)   w^{2} × f
7.5                  56.25                          56.25 × 19 = 1068.75
12.5                 156.25                         156.25 × 29 = 4531.25
17.5                 306.25                         306.25 × 7 = 2143.75
22.5                 506.25                         506.25 × 25 = 12656.25
27.5                 756.25                         756.25 × 27 = 20418.75
Total                                               Total = 40818.75
$\text{Variance} = \dfrac{\sum f w^{2}}{107} - \text{mean}^{2}$
Variance female = 40818.75 ÷ 107 − 326.1636
Variance female = 381.48364 − 326.1636 = 55.32
$\text{Standard deviation} = \sqrt{\text{Variance}}$
s.d. female = $\sqrt{55.32}$ = 7.4 gms
Male
Mid-weight w (gms)   w^{2} (square of mid-weight)   w^{2} × f
7.5                  56.25                          56.25 × 1 = 56.25
12.5                 156.25                         156.25 × 20 = 3125.0
17.5                 306.25                         306.25 × 45 = 13781.25
22.5                 506.25                         506.25 × 21 = 10631.25
27.5                 756.25                         756.25 × 7 = 5293.75
Total                                               Total = 32887.5
$\text{Variance} = \dfrac{\sum f w^{2}}{94} - \text{mean}^{2}$
Variance male = 32887.5 ÷ 94 − 330.8761
Variance male = 349.8670 − 330.8761 = 18.99
$\text{Standard deviation} = \sqrt{\text{Variance}}$
s.d. male = $\sqrt{18.99}$ = 4.36 gms
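The standard deviation working can be repeated in Python as a check. This is a sketch under the same assumptions as the tables (mid-weights stand in for the individual weights, and the variance is the mean of the squares minus the square of the mean); the function name is my own. It uses the unrounded mean, so the last digit may differ slightly from a hand calculation that uses the mean rounded to two decimal places.

```python
from math import sqrt

MID_WEIGHTS = [7.5, 12.5, 17.5, 22.5, 27.5]
FEMALE_FREQ = [19, 29, 7, 25, 27]
MALE_FREQ = [1, 20, 45, 21, 7]


def grouped_sd(mids, freqs):
    """Standard deviation of grouped data: sqrt(sum(f*w^2)/n - mean^2)."""
    n = sum(freqs)
    mean = sum(f * w for w, f in zip(mids, freqs)) / n
    mean_of_squares = sum(f * w * w for w, f in zip(mids, freqs)) / n
    return sqrt(mean_of_squares - mean ** 2)


female_sd = grouped_sd(MID_WEIGHTS, FEMALE_FREQ)
male_sd = grouped_sd(MID_WEIGHTS, MALE_FREQ)

print(f"Female s.d.: {female_sd:.2f} gms")
print(f"Male s.d.: {male_sd:.2f} gms")
```

The point of the check is the comparison: the male standard deviation is clearly smaller than the female one, which agrees with the range and the IQR.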
LOOKING AT ALL THREE MEASURES OF SPREAD AND THE AVERAGE
Mice     Mean (gms)   Range (gms)   IQR (gms)   SD (gms)
Female   18.06        24            14          7.4
Male     18.19        20            7           4.36

All three measures of spread show that the male weights are more consistent, and therefore the average weight of 18.19 gms is what one would expect for males. The female weights typically vary by about 7.4 gms around their average, perhaps depending on whether the females were expecting or had young ones.
The quality of the investigation could be improved by using a more varied population; in an ideal world this would be an Internet site recording every field mouse. Further testing could also be done with regard to the time of year, by looking at three different seasons.