Getting Started with R on MIT’s 14.74x (Foundations of Development Policy)

I noticed that a major grievance of many students enrolled in MIT‘s latest edX course on development policy (Foundations of Development Policy: Advanced Development Economics) was that there wasn’t enough done to get them going with the R assignments. I have posted the R code for the homework (past the deadline, of course) of the first 2 weeks, so that others get a hang of the level of R that might be needed to solve these assignments in the following weeks. I’m willing to help out those needing help getting up to speed with R required for this course. For specific queries, leave your message in the comments section.

A great place to get spend time learning R before taking Foundations of Development Policy (14.74x) would be another edX course that’s been getting great reviews recently: Introduction to R Programming

R Code for Home Work (Week 1)

# set working directory to local directory where the data is kept
setwd("~/IGIDR/Development Economics - MIT/Homework Assignment 01")
# read the data
wb_dev_ind = read.csv("wb_dev_ind.csv")
# summarize data
summary(wb_dev_ind)
# Question 1
# What is the Mean of GDP per capita? What is the standard deviation of GDP per capita?
meanGDPperCapita = mean(wb_dev_ind$gdp_per_capita, na.rm = TRUE)
print(round(meanGDPperCapita))
sdGDPperCapita = sd(wb_dev_ind$gdp_per_capita, na.rm = TRUE)
print(round(sdGDPperCapita))
# Question 2
# What is the mean illiteracy rate across all countries? What is the standard deviation?
illiteracy_all = numeric(nrow(wb_dev_ind))
wb_dev_ind$illiteracy_all = illiteracy_all
wb_dev_ind$illiteracy_all = 100 - wb_dev_ind$literacy_all
meanIlliteracy = mean(wb_dev_ind$illiteracy_all, na.rm = TRUE)
print(round(meanIlliteracy))
sdIlliteracy = sd(wb_dev_ind$illiteracy_all, na.rm = TRUE)
print(round(sdIlliteracy))
# Question 3
# What is the mean infant mortality rate across all countries? What is the standard deviation?
meanInfantMortality = mean(wb_dev_ind$infant_mortality, na.rm = TRUE)
print(round(meanInfantMortality))
sdInfantMortality = sd(wb_dev_ind$infant_mortality, na.rm = TRUE)
print(round(sdInfantMortality))
# Question 4
# What is the mean male illiteracy rate? What is the mean female illiteracy rate?
illiteracy_male = numeric(nrow(wb_dev_ind))
wb_dev_ind$illiteracy_male = illiteracy_male
wb_dev_ind$illiteracy_male = 100 - wb_dev_ind$literacy_male
meanIlliteracyMale = mean(wb_dev_ind$illiteracy_male, na.rm = TRUE)
print(round(meanIlliteracyMale))
sdIlliteracyMale = sd(wb_dev_ind$illiteracy_male, na.rm = TRUE)
print(round(sdIlliteracyMale))
illiteracy_female = numeric(nrow(wb_dev_ind))
wb_dev_ind$illiteracy_female = illiteracy_female
wb_dev_ind$illiteracy_female = 100 - wb_dev_ind$literacy_female
meanIlliteracyFemale = mean(wb_dev_ind$illiteracy_female, na.rm = TRUE)
print(round(meanIlliteracyFemale))
sdIlliteracyFemale = sd(wb_dev_ind$illiteracy_female, na.rm = TRUE)
print(round(sdIlliteracyFemale))
# Question 5
# What are the mean, minimum, and maximum illiteracy rate among the 50 richest countries
richest50 = wb_dev_ind[order(wb_dev_ind$gdp_per_capita, decreasing = TRUE),][1:50,]
summary(richest50)
# Question 6
# What are the mean, minimum, and maximum illiteracy rate among the 50 poorest countries?
poorest50 = wb_dev_ind[order(wb_dev_ind$gdp_per_capita),][1:50,]
summary(poorest50)
# Question 7
# What are the mean, minimum, and maximum infant mortality rate among the 50 richest countries?
summary(richest50)
# Question 8
# What are the mean, minimum, and maximum infant mortality rate among the 50 poorest countries?
summary(poorest50)
# Question 9
# What is the median GDP per capita?
summary(wb_dev_ind)
# Question 10-12
# Regress the infant mortality rate on per capita GDP, and then answer questions 10-12
model1 = lm(infant_mortality ~ gdp_per_capita, data = wb_dev_ind)
summary(model1)
# Question 13
# Regress the illiteracy rate on GDP per capita. Is the coefficient on per capita GDP significantly different from zero at the 5% level?
model2 = lm(illiteracy_all ~ gdp_per_capita, data = wb_dev_ind)
summary(model2)
# Question 14
# Regress the infant mortality rate on the illiteracy rate. Graph a scatter plot of the data as well as the regression line.
model3 = lm(infant_mortality ~ illiteracy_all, data = wb_dev_ind)
summary(model3)
plot(wb_dev_ind$illiteracy_all, wb_dev_ind$infant_mortality)
abline(model3)
view raw HW01.R hosted with ❤ by GitHub

R Code for Home Work (Week 2)

# Set working directory to local directory where the data is kept
setwd("~/IGIDR/Development Economics - MIT/Homework Assignment 02")
# read data
migueldata = read.csv("ted_miguel_worms.csv", header = TRUE)
attach(migueldata)
# Question 6
# How many observations are there per pupil? (Enter a whole number of 0 or higher)?
length(migueldata$pupid)
length(unique(migueldata$pupid))
# Question 7
# What percentage of the pupils are boys? (Answers within 0.50 percentage points of the correct answer will be accepted. For instance, 67 would be accepted if the correct answer is 67.45%)
mean(sex, na.rm = TRUE)
# Question 8
# What percentage of pupils took the deworming pill in 1998? (Answers within 0.50 percentage points of the correct answer will be accepted. For instance, 67 would be accepted if the correct answer is 67.45%)
mean(pill98, na.rm = TRUE)
# Question 9
# Was the percentage of schools assigned to treatment in 1998 greater than or less than the percentage of pupils that actually took the deworming pill in 1998?
mean(treat_sch98, na.rm = TRUE)
mean(treat_sch98, na.rm = TRUE) > mean(pill98, na.rm = TRUE) # Ans = Greater Than
# Question 10
# Which of the following variables from the dataset are dummy variables? (Check all that apply.)
summary(migueldata)
# Question 11
# Using the data, find and enter the difference in outcomes (Y: school participation) between students who took the pill and students who did not in 1998. (Enter your answer as a difference in proportions. For instance, if the proportion in one group is 0.61 and the proportion in the other group is 0.54, enter 0.07. Answers within 0.05 of the correct answer will be accepted. For instance, 0.28 would be accepted if the correct answer is 0.33.)
took_pill_98 = mean(migueldata[migueldata$pill98 == 1,]$totpar98, na.rm = TRUE)
no_pill_98 = mean(migueldata[migueldata$pill98 == 0,]$totpar98, na.rm = TRUE)
diff = took_pill_98 - no_pill_98
diff
# Question 12
# Since schools were randomly assigned to the deworming treatment group, the estimate calculated in the previous answer is an unbiased estimate of taking the pill on school attendance.
# False
# Explanation
# The estimated impact of 13 percentage points calculated in the previous answer might not be a good estimate of the effect of taking the pill. Many students in the randomly assigned treatment schools did not actually take the pills, so those who took the pills would not have been randomly selected at all. For instance, kids who attend school more anyway might have been more likely to be there when the pills were handed out, meaning that omitted variables would be correlated with taking the pill and future school attendance. This would bias the estimate upward i.e. the 13 percentage point difference might overstate the impact of deworming on attendance.
# Question 13
# Using the data, find and enter the difference in outcomes (Y: school participation) between students in treatment schools and students not in treatment schools in 1998, regardless of whether or not they actually took the pill. (Enter your answer as a difference in proportions. For instance, if the proportion in one group is 0.61 and the proportion in the other group is 0.54, enter 0.07. Answers within 0.05 of the correct answer will be accepted. For instance, 0.28 would be accepted if the correct answer is 0.33.)
in_treatment_sch = mean(migueldata[migueldata$treat_sch98 == 1,]$totpar98, na.rm = TRUE)
non_treatment_sch = mean(migueldata[migueldata$treat_sch98 == 0,]$totpar98, na.rm = TRUE)
diff_treatment_sch = in_treatment_sch - non_treatment_sch
diff_treatment_sch
# Question 14
# Using the data, calculate the difference in the probability of taking the pill given that a student was in a treatment school and the probability of taking it if a student was not in a treatment school. (Enter your answer as a difference in proportions. For instance, if the proportion in one group is 0.61 and the proportion in the other group is 0.54, enter 0.07. Answers within 0.05 of the correct answer will be accepted. For instance, 0.28 would be accepted if the correct answer is 0.33.)
pr_pill_treatment_sch = mean(migueldata[migueldata$treat_sch98 == 1,]$pill98, na.rm = TRUE)
pr_pill_no_treatment_sch = mean(migueldata[migueldata$treat_sch98 == 0,]$pill98, na.rm = TRUE)
diff_pr_pill_treatment_sch = pr_pill_treatment_sch - pr_pill_no_treatment_sch
# Question 15
# Using the data, derive the Wald Estimator of taking the pill on school attendance. (Enter your answer as a difference in proportions. For instance, if the proportion in one group is 0.61 and the proportion in the other group is 0.54, enter 0.07. Answers within 0.05 of the correct answer will be accepted. For instance, 0.28 would be accepted if the correct answer is 0.33.)
waldRatio = diff_treatment_sch/diff_pr_pill_treatment_sch
waldRatio
view raw HW02.R hosted with ❤ by GitHub

I hope this helps!

5 thoughts on “Getting Started with R on MIT’s 14.74x (Foundations of Development Policy)

Leave a comment