Please answer all of the questions below in R, using RStudio. The questions are as follows:
#======= Question 1 (1 Point) =======
# Q1-1. Spread the data out into multiple columns, with the shop type (Starbucks vs. Dunkin Donuts) as the key column and the number of shops as the value column.
# Q1-2. Delete observations that contain at least one missing or invalid value (e.g., negative income).
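
A minimal sketch of Q1-1 and Q1-2, assuming the data start in long format. The data frame `shops_long` and the columns `zip`, `house_price`, `shop_type`, and `n_shops` are hypothetical names with made-up values; only `median_income` is named in the assignment. `tidyr::spread()` matches the key/value wording of the question; `tidyr::pivot_wider()` is the newer equivalent.

```r
library(tidyr)

# Toy long-format data; names and values are invented for illustration only.
shops_long <- data.frame(
  zip           = rep(1:3, each = 2),
  median_income = rep(c(50000, -1, 72000), each = 2),
  house_price   = rep(c(300000, 410000, 520000), each = 2),
  shop_type     = rep(c("starbucks", "dunkin_donuts"), times = 3),
  n_shops       = c(2, 5, 1, 0, 4, NA)
)

# Q1-1: one column per shop type, filled with the shop counts.
shops_wide <- spread(shops_long, key = shop_type, value = n_shops)

# Q1-2: keep rows with no missing values and a non-negative income.
keep <- complete.cases(shops_wide) & shops_wide$median_income >= 0
shops_clean <- shops_wide[keep, ]
shops_clean
```

In this toy data, the second row is dropped for its negative income and the third for its missing Dunkin Donuts count. Other validity checks (for example, negative prices or counts) would be added to `keep` in the same way.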
# Q2-1. Create a scatter plot to examine the relationship between house prices and the number of Starbucks shops.
# Q2-2. Repeat Q2-1 for Dunkin Donuts.
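
One possible way to draw the two scatter plots with ggplot2 is sketched below. The data frame `homes` and the columns `starbucks`, `dunkin_donuts`, and `house_price` are assumed names, and the values are simulated, so the plots say nothing about the real data.

```r
library(ggplot2)

# Simulated stand-in for the cleaned wide data (hypothetical column names).
set.seed(1)
homes <- data.frame(
  starbucks     = rpois(50, 3),
  dunkin_donuts = rpois(50, 4)
)
homes$house_price <- 250000 + 20000 * homes$starbucks + rnorm(50, sd = 30000)

# Q2-1: house price against the number of Starbucks shops, with a fitted line.
p_starbucks <- ggplot(homes, aes(x = starbucks, y = house_price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Number of Starbucks shops", y = "House price")

# Q2-2: the same plot for Dunkin Donuts.
p_dunkin <- ggplot(homes, aes(x = dunkin_donuts, y = house_price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Number of Dunkin Donuts shops", y = "House price")

print(p_starbucks)
print(p_dunkin)
```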
# Q3. Build a linear regression model to predict house prices based on the number of Starbucks and Dunkin Donuts.
# Consider both predictors at the same time.
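
A sketch of the Q3 model using `lm()` with both predictors in one formula. The data are simulated and the column names (`starbucks`, `dunkin_donuts`, `house_price`) are assumptions, so the coefficients shown by `summary()` are not results for the real data.

```r
# Simulated data with hypothetical column names.
set.seed(2)
homes <- data.frame(
  starbucks     = rpois(100, 3),
  dunkin_donuts = rpois(100, 4)
)
homes$house_price <- 250000 + 20000 * homes$starbucks -
  5000 * homes$dunkin_donuts + rnorm(100, sd = 30000)

# Both shop counts enter the model at the same time.
fit_q3 <- lm(house_price ~ starbucks + dunkin_donuts, data = homes)
summary(fit_q3)
```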
# One might argue that neighborhoods where Starbucks shops are located are relatively rich.
# We want to examine whether Starbucks still has predictive power for house prices, even after controlling for household income and population.
# Q4-1. Create new variables by taking the logarithm of median_income and population.
# Q4-2. Build a linear regression model to predict house prices based on the number of Starbucks and Dunkin Donuts, as well as the logarithm of household incomes and population.
# Q4-3. Do you think considering median income and population improves the linear regression model?
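
The Q4 steps could be sketched as follows. `median_income` and `population` come from the assignment; `homes`, `log_income`, `log_population`, and the other column names are assumptions, and the data are simulated. Comparing adjusted R-squared and running a nested-model F test with `anova()` is one reasonable way to approach Q4-3, not the only one.

```r
# Simulated data; only median_income and population are named in the assignment.
set.seed(3)
n <- 200
homes <- data.frame(
  starbucks     = rpois(n, 3),
  dunkin_donuts = rpois(n, 4),
  median_income = round(exp(rnorm(n, mean = 11, sd = 0.4))),
  population    = round(exp(rnorm(n, mean = 10, sd = 0.6)))
)
homes$house_price <- 10000 * homes$starbucks +
  150000 * log(homes$median_income) + rnorm(n, sd = 30000)

# Q4-1: log-transformed versions of the two control variables.
homes$log_income     <- log(homes$median_income)
homes$log_population <- log(homes$population)

# Q4-2: shop counts plus the two log-transformed controls.
fit_base <- lm(house_price ~ starbucks + dunkin_donuts, data = homes)
fit_q4   <- lm(house_price ~ starbucks + dunkin_donuts +
                 log_income + log_population, data = homes)

# Q4-3: compare fit with and without the controls.
summary(fit_base)$adj.r.squared
summary(fit_q4)$adj.r.squared
anova(fit_base, fit_q4)  # F test for the two added terms
```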
#======= Question 5 (2 Points) =======
# The dynamics of house prices might vary across counties.
# Q5-1. Split (facet) the plot resulting from Question 2 by county.
# Q5-2. Add the county variable to the previous linear regression model (from Question 4).
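
A sketch of Q5 under the assumption that the county is stored in a column called `county`. The data and all other column names are simulated placeholders. `facet_wrap()` draws one panel per county, and adding `county` to the formula makes `lm()` create one dummy variable per county other than the reference level.

```r
library(ggplot2)

# Simulated data with three hypothetical counties.
set.seed(4)
n <- 90
homes <- data.frame(
  county         = rep(c("A", "B", "C"), each = n / 3),
  starbucks      = rpois(n, 3),
  dunkin_donuts  = rpois(n, 4),
  log_income     = rnorm(n, mean = 11, sd = 0.4),
  log_population = rnorm(n, mean = 10, sd = 0.6)
)
homes$house_price <- 10000 * homes$starbucks +
  30000 * homes$log_income + rnorm(n, sd = 30000)

# Q5-1: the Starbucks scatter plot, split into one panel per county.
p_facet <- ggplot(homes, aes(x = starbucks, y = house_price)) +
  geom_point() +
  facet_wrap(~ county)
print(p_facet)

# Q5-2: the Question 4 model with county added as a categorical predictor.
fit_q5 <- lm(house_price ~ starbucks + dunkin_donuts +
               log_income + log_population + county, data = homes)
summary(fit_q5)
```

With this toy data, county "A" is the reference level, so the `countyB` and `countyC` coefficients are differences from "A" with the other predictors held fixed.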
#======= Question 6 (2 Points) =======
# Note that answers without any explanation will be penalized.
# There is no single correct answer to this question. Any reasonable answer based on your analysis is acceptable.
# Q6-1. Do you think the coefficients of Starbucks and Dunkin Donuts remain significant after considering the county information?
# Write your opinion briefly by commenting (#).
# Q6-2. According to the linear regression model in Question 5, which county has the highest average house prices?
# Write your opinion briefly by commenting (#).
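
For Q6-1 and Q6-2, one way to read the fitted model is sketched below: pull the estimates and p-values for the two shop counts from the coefficient table, then compare the county coefficients, remembering that the reference county has an implicit coefficient of 0. The data are simulated and every name other than the functions is an assumption, so the output is not an answer for the real data.

```r
# Simulated data in which county "B" is built to have the highest prices.
set.seed(5)
homes <- data.frame(
  county        = rep(c("A", "B", "C"), each = 40),
  starbucks     = rpois(120, 3),
  dunkin_donuts = rpois(120, 4)
)
county_shift <- c(A = 0, B = 80000, C = -30000)
homes$house_price <- 250000 + 15000 * homes$starbucks +
  unname(county_shift[homes$county]) + rnorm(120, sd = 20000)

fit <- lm(house_price ~ starbucks + dunkin_donuts + county, data = homes)

# Q6-1: estimates and p-values for the two shop counts.
coefs <- coef(summary(fit))
coefs[c("starbucks", "dunkin_donuts"), c("Estimate", "Pr(>|t|)")]

# Q6-2: county effects relative to the reference county "A" (fixed at 0).
county_effects <- c(0, coef(fit)[c("countyB", "countyC")])
names(county_effects) <- c("A", "B", "C")
top_county_model <- names(which.max(county_effects))
top_county_model
```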
# Q6-3. Calculate average house prices by county. Which county has the highest average house prices?
# If the result seems different from the linear regression, how would you interpret it?
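
A small sketch of the Q6-3 calculation using base R's `aggregate()`. The data frame and values are made up. Raw county means and regression coefficients can rank counties differently because the coefficients compare counties with the other predictors held fixed, while the raw means do not adjust for anything.

```r
# Tiny invented data set (hypothetical names and values).
homes <- data.frame(
  county      = c("A", "A", "B", "B", "C", "C"),
  house_price = c(300000, 320000, 500000, 540000, 410000, 390000)
)

# Average house price per county, sorted from highest to lowest.
avg_price <- aggregate(house_price ~ county, data = homes, FUN = mean)
avg_price <- avg_price[order(-avg_price$house_price), ]
avg_price

# County with the highest raw average.
top_county_raw <- avg_price$county[1]
top_county_raw
```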