elena1234

cut, population mean and sample size in Python

Jun 7th, 2022 (edited)
# Question 4
#
# Partition the sample based on 10-year age bands, i.e. the resulting groups
# will consist of people with ages from 18-28, 29-38, etc. Construct 95%
# confidence intervals for the difference between the mean BMI for females
# and for males within each age band.

import numpy as np
import pandas as pd

# `da` is the survey DataFrame loaded earlier in the notebook (not shown in this paste).
df = da.copy()
# pd.cut builds right-closed intervals: (18, 28], (28, 38], ...
# Age 18 itself falls outside the first bin unless include_lowest=True is passed.
df['agegrp'] = pd.cut(x=df.RIDAGEYR, bins=[18, 28, 38, 48, 58, 68, 78, 88])
df['gender'] = df.RIAGENDR.map({1: 1, 2: 2})  # 1 = male, 2 = female

# Keep only the columns needed and drop rows with missing values
dk = df[['agegrp', 'gender', 'BMXBMI']].dropna()
dk

condition = dk['gender'] == 1  # males only
dk_male = dk[condition]
dk_male

dz_male = dk_male.groupby('agegrp').agg({'BMXBMI': ['mean', 'std']})
dz_male  # mean and std of BMI for males in each age band

dk_male.groupby('agegrp').size()  # sample size n for each male group

condition = dk['gender'] == 2  # females only
dk_female = dk[condition]
dk_female

dz_female = dk_female.groupby('agegrp').agg({'BMXBMI': ['mean', 'std']})
dz_female  # mean and std of BMI for females in each age band

dk_female.groupby('agegrp').size()  # sample size n for each female group

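# Function for the difference in means (female minus male)

def find_limits_male_female(mean_male, mean_female, std_male, std_female, n_male, n_female):
    # 1.984 is the t critical value for about 100 degrees of freedom;
    # with groups this large, the normal value 1.96 gives nearly the same interval.
    margin = 1.984 * np.sqrt((std_male**2 / n_male) + (std_female**2 / n_female))
    diff = mean_female - mean_male  # the question asks for females minus males
    low = diff - margin
    upper = diff + margin
    return (low, upper)
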
group1 = find_limits_male_female(27.058186, 28.01943, 6.679515, 8.048854, 452, 494)
group1

group2 = find_limits_male_female(29.697180, 29.943443, 6.726690, 7.959097, 461, 488)
group2

group3 = find_limits_male_female(29.514646, 31.003733, 6.104950, 8.044642, 396, 509)
group3

group4 = find_limits_male_female(29.385132, 30.787361, 6.151534, 7.647590, 417, 451)
group4

group5 = find_limits_male_female(29.232462, 31.054664, 5.959024, 7.779502, 459, 461)
group5

group6 = find_limits_male_female(28.720270, 30.537818, 5.336652, 6.780588, 296, 275)
group6

group7 = find_limits_male_female(27.464368, 27.850000, 4.695650, 5.483781, 174, 198)
group7

# 95% confidence intervals for the difference in mean BMI (female minus male), by age band:
# CI for 18 to 28: (0.0, 1.8)
# CI for 28 to 38: (-0.7, 1.1)
# CI for 38 to 48: (0.6, 2.4)
# CI for 48 to 58: (0.4, 2.4)
# CI for 58 to 68: (0.9, 2.7)
# CI for 68 to 78: (0.8, 2.8)
# CI for 78 to 88: (-0.6, 1.4)
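
The per-band intervals can also be recomputed in one loop from the summary statistics passed to the calls above, which avoids rounding by hand. The following is a minimal sketch, not the original notebook's method: the helper name `diff_ci` and the `bands` list are hypothetical, the multiplier comes from the standard normal distribution (about 1.96) rather than the 1.984 used above, and it assumes that the first value of each pair in the calls is the male statistic, as the function name suggests.

```python
import math
from statistics import NormalDist


def diff_ci(mean_f, mean_m, std_f, std_m, n_f, n_m, level=0.95):
    """Confidence interval for (female mean - male mean), unpooled standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal critical value, about 1.96 for 95%
    se = math.sqrt(std_f ** 2 / n_f + std_m ** 2 / n_m)
    diff = mean_f - mean_m
    return diff - z * se, diff + z * se


# Summary statistics copied from the calls above.
# Assumed order: (band, male mean, female mean, male std, female std, male n, female n)
bands = [
    ("18 to 28", 27.058186, 28.01943, 6.679515, 8.048854, 452, 494),
    ("28 to 38", 29.697180, 29.943443, 6.726690, 7.959097, 461, 488),
    ("38 to 48", 29.514646, 31.003733, 6.104950, 8.044642, 396, 509),
    ("48 to 58", 29.385132, 30.787361, 6.151534, 7.647590, 417, 451),
    ("58 to 68", 29.232462, 31.054664, 5.959024, 7.779502, 459, 461),
    ("68 to 78", 28.720270, 30.537818, 5.336652, 6.780588, 296, 275),
    ("78 to 88", 27.464368, 27.850000, 4.695650, 5.483781, 174, 198),
]

for label, mean_m, mean_f, std_m, std_f, n_m, n_f in bands:
    low, high = diff_ci(mean_f, mean_m, std_f, std_m, n_f, n_m)
    print(f"CI for {label}: ({low:.2f}, {high:.2f})")
```

An interval that excludes zero indicates a difference in mean BMI between females and males in that band at the 95% level; an interval that contains zero does not.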