Sumit Chawan
Student id: 18200549

Question: Suppose a hospital tested the age and body fat data for 18 randomly selected adults with the following results.

Age    23    23    27    27    39    41    47    49    50
%Fat   9.5   26.5  7.8   17.8  31    4     25.9  27.4  31.2

Age    52    54    54    56    57    58    58    60    61
%Fat   34.6  42.5  28.8  33.4  30.2  34.1  32.9  41.2  35.7

Q1) Calculate the mean, median and standard deviation of age and %fat.

Mean

$\text{Mean(Age)} = \frac{1}{\text{Total number of records}} \times \text{sum of Ages}$
$\text{Mean(Age)} = \frac{23+23+27+27+39+41+47+49+50+52+54+54+56+57+58+58+60+61}{18} = \frac{836}{18} = 46.4444$

$\text{Mean(\%Fat)} = \frac{1}{\text{Total number of records}} \times \text{sum of \%Fat}$
$\text{Mean(\%Fat)} = \frac{9.5+26.5+7.8+17.8+31+4+25.9+27.4+31.2+34.6+42.5+28.8+33.4+30.2+34.1+32.9+41.2+35.7}{18} = \frac{494.5}{18} = 27.4722$

Median

The median is the middle value of the ordered data.
Note: The data must be ordered before the median is calculated.
Since the number of records is even (18), the median is the average of the 9th and 10th terms of the ordered data.

$\text{Median(Age)} = \frac{\text{9th term} + \text{10th term}}{2} = \frac{50 + 52}{2} = 51$
$\text{Median(\%Fat)} = \frac{\text{9th term} + \text{10th term}}{2} = \frac{30.2 + 31}{2} = 30.6$

Standard Deviation

The standard deviation describes how the measurements of the group spread about the mean value of the data set. Here the sum of squared deviations is divided by the number of records (18).

$\text{Std(Age)} = \sqrt{\frac{(23-46.44)^2 + (23-46.44)^2 + (27-46.44)^2 + \dots + (60-46.44)^2 + (61-46.44)^2}{18}} \approx 12.85$

$\text{Std(\%Fat)} = \sqrt{\frac{(9.5-27.47)^2 + (26.5-27.47)^2 + (7.8-27.47)^2 + \dots + (41.2-27.47)^2 + (35.7-27.47)^2}{18}} \approx 10.63$

Q2) Box Plot

Box plot summary for Age: Q1 = 36, Q3 = 57.25, Mean = 46.44, Median = 51 (as calculated in Q1), Min value = 23, Max value = 61.

Box plot summary for %Fat: Q1 = 23.875, Q3 = 34.225, Mean = 27.47, Median = 30.6 (as calculated in Q1), Min value = 9.5, Max value = 42.5.

Q3) Scatter plot and Q-Q plot

[Figure: scatter plot based on Age and %Fat]
[Figure: code snippet for the Q-Q plot in R]
[Figure: Q-Q plot based on Age and %Fat]

Q4) Normalize the two variables based on Z-score normalization.

Z-score normalization converts all the values in a dataset to a common scale with a mean of zero and a standard deviation of 1. The z-score is calculated as:

$z = \frac{x - \mu}{\sigma}$

where $\mu$ is the mean of x and $\sigma$ is the standard deviation of x.

Example: For x = 23, with Mean(Age) = 46.44 and Std(Age) = 12.85:

$z = \frac{23 - 46.44}{12.85} \approx -1.82$

The z-scores for all values of the two variables are calculated in the same way. The results are listed in the table below.
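The z-score step in Q4 can be sketched in a few lines of Python (used here for illustration; it is not the R code from the original submission). The sketch assumes the data exactly as listed in the question and the population standard deviation (dividing by 18), which is the form used in Q1. The function name `z_scores` is a hypothetical helper, not something from the assignment.

```python
from statistics import mean, pstdev

# Data exactly as listed in the question (18 records each).
age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31, 4, 25.9, 27.4, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def z_scores(values):
    """Return (x - mean) / std for every x (hypothetical helper).

    Assumption: std is the population standard deviation (divide by n).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


age_z = z_scores(age)
fat_z = z_scores(fat)

# Print one row per record: raw values followed by their z-scores.
for a, f, za, zf in zip(age, fat, age_z, fat_z):
    print(f"{a:>3} {f:>5} {za:>8.4f} {zf:>8.4f}")
```

After normalization each column should have a mean of about 0 and a population standard deviation of about 1, which is a quick way to check the table.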

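The Q1 and Q2 figures (mean, median, standard deviation, quartiles) can be cross-checked with a short Python sketch. Several choices here are assumptions rather than something the assignment states: the standard deviation divides by n (population form), the quartiles use the "exclusive" interpolation method of `statistics.quantiles`, and the box plot whiskers follow the conventional 1.5 × IQR rule. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median, pstdev, quantiles

# Data exactly as listed in the question (18 records each).
age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31, 4, 25.9, 27.4, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def summarize(values):
    """Return the summary statistics used in Q1 and Q2 (hypothetical helper)."""
    # Assumption: "exclusive" quartiles, i.e. position (n + 1) * p in the sorted data.
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    # Assumption: whiskers end at the most extreme points within 1.5 * IQR of the box.
    inside = [x for x in values if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]
    return {
        "mean": mean(values),
        "median": median(values),
        "std": pstdev(values),  # population form: divide by n
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


for name, data in (("Age", age), ("%Fat", fat)):
    stats = summarize(data)
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

Under these assumptions the quartiles come out as 36 and 57.25 for Age and 23.875 and 34.225 for %Fat. The lower whisker for %Fat ends at 9.5 because the two smallest values fall below the 1.5 × IQR fence.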
Q5) Calculate the correlation coefficient.

The Pearson correlation coefficient measures the strength of the linear relationship between two variables. It is calculated as:

$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \cdot \sum_i (y_i - \bar{y})^2}}$

where r is the correlation coefficient, $\bar{x}$ is the mean of the x values (Age) and $\bar{y}$ is the mean of the y values (%Fat).

$\bar{x} = 46.44$ and $\bar{y} = 27.47$ (calculated in Q1)

$\sum_i (x_i - \bar{x})^2 \cdot \sum_i (y_i - \bar{y})^2 = 6044262.007$

$\sqrt{\sum_i (x_i - \bar{x})^2 \cdot \sum_i (y_i - \bar{y})^2} = 2458.508086$

From the table, $\sum_i (x_i - \bar{x})(y_i - \bar{y}) = 2376.724$.

Hence $r = \frac{2376.724}{2458.508086} \approx 0.97$.

Q6) Are “age” and “%Fat” positively or negatively correlated?

Ans:
1) The two variables “Age” and “%Fat” are positively correlated.
2) The Pearson correlation coefficient of about 0.97 indicates a strong relationship between the two variables.

That is, as age increases, %Fat also tends to increase, and vice versa.
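The Pearson formula from Q5 can be written out directly as a small Python sketch (Python is used for illustration only). It assumes the data as listed in the question and keeps each age paired with the %Fat value printed beneath it. The numerator depends on that pairing, so a different row order in the working table (for instance, sorting each column independently) changes the resulting r. The function name `pearson_r` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean

# Data exactly as listed in the question; position i in both lists is one adult.
age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31, 4, 25.9, 27.4, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def pearson_r(xs, ys):
    """Pearson correlation of paired observations (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Numerator: sum of products of paired deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the two sums of squares.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(age, fat)
print(f"r = {r:.4f}")
```

A positive r supports the Q6 conclusion that the two variables move in the same direction. How strong the relationship is depends on the pairing used for the numerator.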