Question

Please answer all questions in R, using RStudio. The questions are as follows:

#======= Question 1 (1 Point) =======
# Q1-1. Spread the data out to multiple columns, with the shop type (Starbucks vs. Dunkin Donuts) being the key column and the number of shops being the value column.
# Q1-2. Delete observations that contain at least one missing value or invalid value (e.g., negative income).
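
One possible approach to Question 1 is sketched below. The assignment's data set is not shown, so the toy data and every column name except `median_income` and `population` (`zip`, `house_price`, `shop_type`, `num_shops`) are assumptions for illustration. `pivot_wider()` is used here as the current tidyr equivalent of `spread()`.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: one row per area and shop type.
# All values and most column names are assumed, not taken from the assignment.
shops_long <- tibble(
  zip           = rep(c("A", "B", "C", "D"), each = 2),
  house_price   = rep(c(500, 650, 300, 420), each = 2),
  median_income = rep(c(80000, -1, 55000, 60000), each = 2),  # -1 is invalid
  population    = rep(c(20000, 15000, 9000, 12000), each = 2),
  shop_type     = rep(c("starbucks", "dunkin_donuts"), times = 4),
  num_shops     = c(5, 2, 7, 1, NA, 3, 2, 4)                  # NA is missing
)

# Q1-1: shop_type is the key column, num_shops the value column.
# The older equivalent would be spread(shops_long, key = shop_type, value = num_shops).
shops_wide <- shops_long %>%
  pivot_wider(names_from = shop_type, values_from = num_shops)

# Q1-2: drop rows with any missing value, then rows with invalid values.
# Which values count as "invalid" is a judgment call; these rules are assumed.
shops_clean <- shops_wide %>%
  drop_na() %>%
  filter(median_income > 0, population > 0, house_price > 0,
         starbucks >= 0, dunkin_donuts >= 0)

print(shops_clean)
```

On real data, it is worth checking `summary(shops_wide)` first to see which columns actually contain negative or otherwise implausible values before deciding on the filter rules.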

# Q2-1. Create a scatter plot to examine the relationship between house prices and Starbucks. 
# Q2-2. Repeat Q2-1 for Dunkin Donuts.
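
A minimal sketch of the two scatter plots for Question 2, assuming ggplot2 and a cleaned data frame with the hypothetical columns `starbucks`, `dunkin_donuts`, and `house_price`. The data are simulated only so that the example runs on its own.

```r
library(ggplot2)

# Simulated stand-in for the cleaned data; column names are assumptions.
set.seed(1)
shops <- data.frame(
  starbucks     = rpois(40, 4),
  dunkin_donuts = rpois(40, 3)
)
shops$house_price <- 300 + 25 * shops$starbucks -
  5 * shops$dunkin_donuts + rnorm(40, sd = 30)

# Q2-1: house prices against the number of Starbucks shops.
p_starbucks <- ggplot(shops, aes(x = starbucks, y = house_price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Number of Starbucks", y = "House price")

# Q2-2: the same plot for Dunkin Donuts.
p_dunkin <- ggplot(shops, aes(x = dunkin_donuts, y = house_price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Number of Dunkin Donuts", y = "House price")

print(p_starbucks)
print(p_dunkin)
```

The fitted line from `geom_smooth()` is optional; it is included only to make the direction of the relationship easier to read.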

# Q3. Build a linear regression model to predict house prices based on the number of Starbucks and Dunkin Donuts. 
#     Consider both predictors at the same time.
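
For Question 3, a model with both predictors entered together could look like the sketch below. The column names and the simulated data are assumptions; only the model formula is the point.

```r
# Simulated stand-in for the cleaned data; column names are assumptions.
set.seed(42)
n <- 200
shops <- data.frame(
  starbucks     = rpois(n, 4),
  dunkin_donuts = rpois(n, 3)
)
shops$house_price <- 300 + 25 * shops$starbucks -
  5 * shops$dunkin_donuts + rnorm(n, sd = 30)

# Q3: both shop counts enter the model at the same time.
fit_q3 <- lm(house_price ~ starbucks + dunkin_donuts, data = shops)

# Coefficients, standard errors, p-values, and R-squared.
print(summary(fit_q3))
```

Each coefficient is then read as the expected change in house price for one additional shop of that type, holding the other shop count fixed.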

# One might argue that neighborhoods where Starbucks shops are located are relatively rich.
# We want to examine whether Starbucks still has predictive power for house prices, even after controlling for household income and population.

# Q4-1. Create new variables by taking the logarithm of median_income and population. 
# Q4-2. Build a linear regression model to predict house prices based on the number of Starbucks and Dunkin Donuts, as well as the logarithm of household incomes and population. 
# Q4-3. Do you think considering median income and population improves the linear regression model?
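
The steps in Question 4 can be sketched as follows. `median_income` and `population` come from the question text; the other column names, the new variable names `log_income` and `log_population`, and the simulated data are assumptions. Comparing the two nested models with `anova()`, adjusted R-squared, and AIC is one reasonable way to approach Q4-3, not the only one.

```r
# Simulated stand-in for the cleaned data; values are illustrative only.
set.seed(42)
n <- 200
shops <- data.frame(
  starbucks     = rpois(n, 4),
  dunkin_donuts = rpois(n, 3),
  median_income = exp(rnorm(n, mean = 11, sd = 0.4)),
  population    = exp(rnorm(n, mean = 10, sd = 0.5))
)
shops$house_price <- 100 + 10 * shops$starbucks - 3 * shops$dunkin_donuts +
  80 * (log(shops$median_income) - 11) +
  20 * (log(shops$population) - 10) + rnorm(n, sd = 30)

# Q4-1: log-transformed versions of income and population.
shops$log_income     <- log(shops$median_income)
shops$log_population <- log(shops$population)

# Q4-2: the Question 3 model plus the two log variables.
fit_q3 <- lm(house_price ~ starbucks + dunkin_donuts, data = shops)
fit_q4 <- lm(house_price ~ starbucks + dunkin_donuts +
               log_income + log_population, data = shops)
print(summary(fit_q4))

# Q4-3: compare the nested models.
print(anova(fit_q3, fit_q4))                      # F-test for the added terms
print(c(q3 = summary(fit_q3)$adj.r.squared,
        q4 = summary(fit_q4)$adj.r.squared))      # adjusted R-squared
print(AIC(fit_q3, fit_q4))                        # lower AIC is better
```

Plain R-squared never decreases when predictors are added, so adjusted R-squared, the F-test, or AIC give a fairer answer to whether the extra variables improve the model.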

#======= Question 5 (2 Points) =======
# The dynamics of house prices might vary across counties. 
# Q5-1. Split (facet) the plot resulting from Question 2 by county. 
# Q5-2. Add the county variable to the previous linear regression model (from Question 4).
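
A sketch for Question 5, assuming a `county` column exists in the data (the county labels, other column names, and values below are made up). `facet_wrap()` splits the Question 2 plot into one panel per county, and wrapping `county` in `factor()` makes `lm()` estimate one dummy coefficient per non-baseline county.

```r
library(ggplot2)

# Simulated stand-in for the cleaned data; three hypothetical counties.
set.seed(7)
n <- 150
county_shift <- c("County A" = 0, "County B" = 60, "County C" = -40)
shops <- data.frame(
  county        = rep(names(county_shift), each = 50),
  starbucks     = rpois(n, 4),
  dunkin_donuts = rpois(n, 3),
  median_income = exp(rnorm(n, mean = 11, sd = 0.4)),
  population    = exp(rnorm(n, mean = 10, sd = 0.5))
)
shops$house_price <- 300 + unname(county_shift[as.character(shops$county)]) +
  10 * shops$starbucks - 3 * shops$dunkin_donuts +
  80 * (log(shops$median_income) - 11) +
  20 * (log(shops$population) - 10) + rnorm(n, sd = 30)

# Q5-1: the Starbucks scatter plot, one panel per county.
p_facet <- ggplot(shops, aes(x = starbucks, y = house_price)) +
  geom_point() +
  facet_wrap(~ county)
print(p_facet)

# Q5-2: the Question 4 model plus county as a categorical predictor.
fit_q5 <- lm(house_price ~ starbucks + dunkin_donuts +
               log(median_income) + log(population) + factor(county),
             data = shops)
print(summary(fit_q5))
```

The same `facet_wrap(~ county)` layer can be added to the Dunkin Donuts plot. The first county in alphabetical order serves as the baseline level, so its effect is absorbed into the intercept.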

#======= Question 6 (2 Points) =======
# Note that answers without any explanation will be penalized.
# There is no single correct answer to this question. Any "reasonable" answers based on your analyses are acceptable.
# Q6-1. Do you think the coefficients of Starbucks and Dunkin Donuts remain significant after considering the county information?  
#       Write your opinion briefly by commenting (#).      
# Q6-2. According to the linear regression model in Question 5, which county has the highest average house prices? 
#       Write your opinion briefly by commenting (#).     
# Q6-3. Calculate average house prices by county. Which county has the highest average house prices? 
#       If the result seems different from the linear regression, how would you interpret it?
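
The numbers needed for Question 6 could be pulled out as sketched below. The data are simulated with hypothetical counties and column names, and are built so that Starbucks counts differ by county, which is one situation where raw averages and regression-adjusted county effects can disagree. The written interpretation still has to come from the real data.

```r
# Simulated stand-in; counties differ in how many Starbucks they have.
set.seed(7)
n <- 150
county_shift <- c("County A" = 0, "County B" = 60, "County C" = -40)
shops <- data.frame(
  county        = rep(names(county_shift), each = 50),
  starbucks     = rpois(n, lambda = rep(c(2, 4, 10), each = 50)),
  dunkin_donuts = rpois(n, 3),
  median_income = exp(rnorm(n, mean = 11, sd = 0.4)),
  population    = exp(rnorm(n, mean = 10, sd = 0.5))
)
shops$house_price <- 300 + unname(county_shift[as.character(shops$county)]) +
  20 * shops$starbucks - 3 * shops$dunkin_donuts +
  80 * (log(shops$median_income) - 11) +
  20 * (log(shops$population) - 10) + rnorm(n, sd = 30)

fit_q5 <- lm(house_price ~ starbucks + dunkin_donuts +
               log(median_income) + log(population) + factor(county),
             data = shops)
coefs <- summary(fit_q5)$coefficients

# Q6-1: estimates and p-values for the two shop counts.
print(coefs[c("starbucks", "dunkin_donuts"), ])

# Q6-2: county coefficients are differences from the baseline county,
# holding the other predictors fixed.
print(coefs[grepl("county", rownames(coefs)), ])

# Q6-3: raw average house price per county, highest first.
avg_by_county <- aggregate(house_price ~ county, data = shops, FUN = mean)
print(avg_by_county[order(-avg_by_county$house_price), ])

# Raw means do not adjust for shop counts, income, or population,
# so their ranking may differ from the ranking of county coefficients.
```

If the two rankings differ on the real data, a natural explanation is that the regression compares counties at equal values of the other predictors, while the raw average also reflects how those predictors are distributed across counties.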
        

Added by Brian H.

Transcript

00:01 In the solution to this question, the best way to interpret the coefficient of the regression model is the first option: if rm increases by one unit, the natural log
00:50 of the odds that the house's median value is greater than $30,000 will increase by 2.354.
01:25 So this is the correct solution. The explanation is that for a higher rm, one would expect to observe a higher median value: more rooms imply more space, and more space costs more...