Question

The ultimate goal in any sport (besides having fun) is to win. One measure of how well a team does is the winning percentage. In baseball, a lot of effort goes into figuring out the variable that best predicts a team's winning percentage. The following data represent the winning percentages of teams in the National League along with potential explanatory variables based on the 2010 season.

Team | Winning Percentage | Runs | Home Runs | Team Batting Average | On Base Percentage | Batting Average Against | Team ERA
---|---|---|---|---|---|---|---
Philadelphia | 0.599 | 772 | 166 | 0.260 | 0.332 | 0.254 | 3.67
Atlanta | 0.562 | 738 | 139 | 0.258 | 0.339 | 0.246 | 3.56
San Francisco | 0.568 | 697 | 162 | 0.257 | 0.321 | 0.236 | 3.36
Chicago Cubs | 0.463 | 685 | 149 | 0.257 | 0.320 | 0.255 | 4.18
Florida | 0.494 | 719 | 152 | 0.254 | 0.321 | 0.261 | 4.08
LA Dodgers | 0.494 | 667 | 120 | 0.252 | 0.322 | 0.244 | 4.01
Washington | 0.426 | 655 | 149 | 0.250 | 0.318 | 0.266 | 4.13
Arizona | 0.401 | 713 | 180 | 0.250 | 0.325 | 0.271 | 4.81
NY Mets | 0.488 | 656 | 128 | 0.249 | 0.314 | 0.260 | 3.70
Houston | 0.469 | 611 | 108 | 0.247 | 0.303 | 0.261 | 4.09
San Diego | 0.556 | 665 | 132 | 0.246 | 0.317 | 0.240 | 3.39
Pittsburgh | 0.352 | 587 | 126 | 0.242 | 0.304 | 0.282 | 5.00

Source: espn.com

(a) Use the regression analysis tool built into Excel to estimate a regression equation that predicts winning percentage. Include all available independent variables in the model. Show the summary output table. (2 marks)
(b) Comment on the overall model performance. (4 marks)
(c) At the 10% level of significance, are there any independent variables that appear to be unnecessary? Justify your answer. (6 marks)
(d) Based on your discussion in part (c), re-develop the regression model, and show the summary output table. (2 marks)
(e) Compare the two regressions developed in parts (a) and (d), and decide which model fits the data better. (3 marks)
(f) Using the model developed in part (d), estimate the winning percentage for a team with the following statistics. (3 marks)

Runs | Home Runs | Team Batting Average | On Base Percentage | Batting Average Against | Team ERA
---|---|---|---|---|---
648 | 123 | 0.246 | 0.325 | 0.257 | 3.56
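
The question asks for Excel's regression tool, but the full-model fit in part (a) can be cross-checked outside Excel. The following is a minimal sketch in Python with NumPy, not the required method. The helper `fit_ols` and the names `data`, `names`, and `full` are hypothetical, introduced only for this illustration. It fits ordinary least squares with an intercept and reports coefficients, standard errors, t statistics, R-squared, and adjusted R-squared, which are the same quantities that appear in a regression summary output.

```python
import numpy as np

# Columns: winning pct, runs, home runs, team batting average,
# on base percentage, batting average against, team ERA (table above).
data = np.array([
    [0.599, 772, 166, 0.260, 0.332, 0.254, 3.67],  # Philadelphia
    [0.562, 738, 139, 0.258, 0.339, 0.246, 3.56],  # Atlanta
    [0.568, 697, 162, 0.257, 0.321, 0.236, 3.36],  # San Francisco
    [0.463, 685, 149, 0.257, 0.320, 0.255, 4.18],  # Chicago Cubs
    [0.494, 719, 152, 0.254, 0.321, 0.261, 4.08],  # Florida
    [0.494, 667, 120, 0.252, 0.322, 0.244, 4.01],  # LA Dodgers
    [0.426, 655, 149, 0.250, 0.318, 0.266, 4.13],  # Washington
    [0.401, 713, 180, 0.250, 0.325, 0.271, 4.81],  # Arizona
    [0.488, 656, 128, 0.249, 0.314, 0.260, 3.70],  # NY Mets
    [0.469, 611, 108, 0.247, 0.303, 0.261, 4.09],  # Houston
    [0.556, 665, 132, 0.246, 0.317, 0.240, 3.39],  # San Diego
    [0.352, 587, 126, 0.242, 0.304, 0.282, 5.00],  # Pittsburgh
])
names = ["Intercept", "Runs", "Home Runs", "Team Batting Average",
         "On Base Percentage", "Batting Average Against", "Team ERA"]

y = data[:, 0]
# Design matrix: a column of ones for the intercept, then the six predictors.
X = np.column_stack([np.ones(len(y)), data[:, 1:]])


def fit_ols(X, y):
    """Hypothetical helper: ordinary least squares with summary statistics."""
    n, k = X.shape  # k counts the intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    # (X'X)^-1 computed through the pseudoinverse for numerical stability.
    pinv = np.linalg.pinv(X)
    se = np.sqrt((sse / (n - k)) * np.diag(pinv @ pinv.T))
    return {"beta": beta, "se": se, "t": beta / se, "resid": resid,
            "r2": r2, "adj_r2": adj_r2, "df_resid": n - k}


full = fit_ols(X, y)
for name, b, s, t in zip(names, full["beta"], full["se"], full["t"]):
    print(f"{name:26s} coef={b: .6f}  se={s:.6f}  t={t: .3f}")
print(f"R^2={full['r2']:.4f}  adjusted R^2={full['adj_r2']:.4f}  "
      f"residual df={full['df_resid']}")
```

For parts (c) and (d), each t statistic would be compared with the two-sided critical value at the 10% level for the residual degrees of freedom, and the same helper could be re-run on a subset of columns, for example `fit_ols(X[:, [0, 1, 6]], y)` to keep only the intercept, Runs, and Team ERA. That column choice is purely illustrative and is not a claim about which variables turn out to be significant.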

Added by Lisa B.
