(a) Sketch the tree corresponding to the CART partition given below. The word in each box indicates the class label for that region.

[Figure: CART partition of the (X1, X2) feature space. Region labels as extracted: Cat, Dog, Sheep, Cat, Rabbit. Values near the X2 axis: 1, 0. Values near the X1 axis: 3, 4, 6.]

(b) Create a diagram similar to that given in part (a) using the CART tree below. Indicate the class label in each region of the partitioned feature space.

[Figure: CART tree. Nodes in extraction order: X2 < 1; X2 < 2; X1 < 1; Banana; X1 < 2; X2 < 0; Orange; Grapes; Apple; Apple; Pear.]

(c) The predictive performance of a single tree can be substantially improved by aggregating many decision trees.
i. Briefly explain the random forest method for classification.
ii. Should pruned or un-pruned trees be used in random forests? Explain.
iii. Briefly explain the out-of-bag error rate for random forests.

(d) Briefly explain how K-fold cross-validation can be used to approximate a testing error rate. Describe one advantage of cross-validation over the validation set approach.
Added by Sharon L.
Step 1
(a) The tree corresponding to the CART partition is:

[Tree diagram garbled in extraction. Legible node labels, in order: Cat, Sheep, Cat, Rabbit. The rest of the answer is truncated.] …
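For part (d), the K-fold procedure can be sketched in a few lines of Python. This is only an illustration and not part of the original answer: the helper names (`k_fold_indices`, `cross_val_error`, `fit_mean`, `predict_mean`), the nearest-class-mean toy learner, and the one-dimensional data are all assumptions made for the example. The data are split into K folds, each fold is held out once while the model is fit on the other K-1 folds, and the K held-out error rates are averaged.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(xs, ys, fit, predict, k=5, seed=0):
    """Average misclassification rate over k held-out folds."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the k-1 remaining folds only.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Evaluate on the fold that was held out.
        wrong = sum(predict(model, xs[i]) != ys[i] for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / k

# Toy learner (hypothetical): remember the mean of x for each class.
def fit_mean(xs, ys):
    means = {}
    for label in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(vals) / len(vals)
    return means

def predict_mean(model, x):
    # Predict the class whose stored mean is closest to x.
    return min(model, key=lambda label: abs(x - model[label]))

# Toy data (hypothetical): two well-separated classes on a line.
xs = list(range(0, 10)) + list(range(20, 30))
ys = ["a"] * 10 + ["b"] * 10
cv_error = cross_val_error(xs, ys, fit_mean, predict_mean, k=5)
print(cv_error)
```

Every observation is held out exactly once, so the estimate depends less on which observations happen to land in a single validation split, and each fit uses a larger share of the data than a one-off validation set would leave for training.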
Recommended Videos
3. This problem involves the OJ data set which is part of the ISLR package.
a) Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
b) Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree, and describe the results obtained. What is the training error rate? How many terminal nodes does the tree have?
c) Type in the name of the tree object in order to get a detailed text output. Pick one of the terminal nodes, and interpret the information displayed.
d) Create a plot of the tree, and interpret the results.
e) Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the test error rate?
f) Determine the optimal size of the tree.
g) Produce a pruned tree corresponding to the optimal tree size.
h) Compare the training error rates between the pruned and unpruned trees. Which is higher?
i) Compare the test error rates between the pruned and unpruned trees. Which is higher?
C D.
Let us consider a data set containing 50 positive and 50 negative instances, where the attributes are purely random and contain no information about the class labels. Hence, the generalization error rate of any classification model learned over this data is expected to be 0.5. Let us consider a classifier that assigns the majority class label of training instances (ties resolved by using the positive label as the default class) to any test instance, irrespective of its attribute values. We can call this approach the majority inducer classifier. Determine the error rate of this classifier using the following methods.
(a) Leave-one-out.
(b) 2-fold stratified cross-validation, where the proportion of class labels at every fold is kept the same as that of the overall data.
(c) From the results above, which method provides a more reliable evaluation of the classifier's generalization error rate?
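The arithmetic behind parts (a) and (b) can be checked with a short Python sketch. It is an illustration only: the function names (`majority_label`, `error_rate`) and the particular way the two stratified folds are formed are assumptions, and the attributes are omitted because the majority inducer ignores them.

```python
def majority_label(labels):
    """Majority class of the training labels; ties go to the positive class."""
    pos = sum(1 for y in labels if y == "+")
    neg = len(labels) - pos
    return "+" if pos >= neg else "-"

def error_rate(splits, labels):
    """Pooled misclassification rate over a list of (train, test) index splits."""
    wrong = total = 0
    for train_idx, test_idx in splits:
        pred = majority_label([labels[i] for i in train_idx])
        wrong += sum(labels[i] != pred for i in test_idx)
        total += len(test_idx)
    return wrong / total

# 50 positive and 50 negative instances, as in the question.
labels = ["+"] * 50 + ["-"] * 50
n = len(labels)

# (a) Leave-one-out: each instance is the test set once.
loo_splits = [([j for j in range(n) if j != i], [i]) for i in range(n)]

# (b) 2-fold stratified: each fold holds 25 positives and 25 negatives.
fold_a = list(range(0, 25)) + list(range(50, 75))
fold_b = list(range(25, 50)) + list(range(75, 100))
two_fold_splits = [(fold_a, fold_b), (fold_b, fold_a)]

loo_error = error_rate(loo_splits, labels)
two_fold_error = error_rate(two_fold_splits, labels)
print(loo_error, two_fold_error)  # prints 1.0 0.5
```

Under leave-one-out, removing one instance always leaves the opposite class in the majority (50 versus 49), so every held-out instance is misclassified. With stratified 2-fold splits the training fold is tied at 25 each, the tie rule predicts positive, and half of each test fold is wrong, which matches the expected generalization error of 0.5.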
Nick J.
1. Unsupervised data mining techniques specify a target variable. (T/F)
2. The output of a classification data task is continuous. (T/F)
3. In each iteration of k-fold cross-validation, one of the k subsets is used as the test set and the other k-1 subsets are put together to form a training set. (T/F)
4. Bagging iteratively combines multiple weak learners, usually weighted related to the weak learners' accuracy, to create one strong learner. (T/F)
5. A decision tree building process is greedy because at each step of the tree-building process, the best split is made at that step rather than looking ahead and picking a split that will lead to a better tree in some future step. (T/F)
6. Bonferroni's Principle states that if you look for events of a given type, you can expect to find events to occur even if the data is completely random. (T/F)
7. R-squared is a measure of how much variability of the target variable can be explained by the predictor variables in a multiple linear regression model. (T/F)
8. Binary logistic regression can be used to predict the probability of a categorical dependent variable. (T/F)
Adi S.