Problem 4: Rule-based methods
The goal is to build a rule-based classifier that predicts the class label (Outcome) from the attributes listed below. Here's the dataset:
ID   Age            Income      Education     Temperature    Outcome
     (Continuous)   (Ordinal)   (Nominal)     (Continuous)
1    28             Medium      Bachelor's    75.5           Yes
2    35             High        Master's      68.2           No
3    40             Low         High School   80.0           Yes
4    25             Medium      PhD           72.8           No
5    45             High        Bachelor's    78.1           Yes
6    30             Medium      High School   69.5           No
7    38             Low         Master's      77.3           Yes
8    22             High        PhD           74.9           No
9    33             Medium      High School   70.4           No
10   42             Low         Bachelor's    76.7           Yes
1. Initial Rule Generation: Generate an initial rule that classifies instances based on the single attribute with
the highest information gain or the lowest weighted Gini index.
2. Rule Expansion: Expand the rule from question 1 by incorporating another attribute to improve the
classification accuracy.
3. Handling Continuous Attributes: Propose a method to handle continuous attributes in rule-based
classification, especially for attributes like 'Age' and 'Temperature.'
4. Discrete Attribute Rule: Create a rule based on a discrete attribute with more than two categories, like
'Education.'
5. Rule Evaluation: Evaluate the rules generated so far on the training dataset. Identify any instances that
are misclassified.
6. Pruning Rules: Discuss the concept of rule pruning and why it might be necessary in rule-based methods.
Provide an example of when rule pruning could be beneficial.
7. Handling Missing Values: Discuss how rule-based methods handle instances with missing attribute
values and propose a strategy to address missing values in this dataset.
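
As a possible starting point for questions 1, 3, and 5, the Python sketch below encodes the table above and computes the information gain of each candidate split. It is only one way to approach the problem, not a reference solution. The names ROWS, COLUMNS, categorical_gain, best_threshold, and rule are illustrative, and the choice of candidate thresholds (midpoints between consecutive sorted values) is an assumption. It is a common convention for discretizing continuous attributes such as 'Age' and 'Temperature', but other schemes are equally valid.

```python
from collections import Counter
from math import log2

# Columns: (ID, Age, Income, Education, Temperature, Outcome)
ROWS = [
    (1, 28, "Medium", "Bachelor's", 75.5, "Yes"),
    (2, 35, "High", "Master's", 68.2, "No"),
    (3, 40, "Low", "High School", 80.0, "Yes"),
    (4, 25, "Medium", "PhD", 72.8, "No"),
    (5, 45, "High", "Bachelor's", 78.1, "Yes"),
    (6, 30, "Medium", "High School", 69.5, "No"),
    (7, 38, "Low", "Master's", 77.3, "Yes"),
    (8, 22, "High", "PhD", 74.9, "No"),
    (9, 33, "Medium", "High School", 70.4, "No"),
    (10, 42, "Low", "Bachelor's", 76.7, "Yes"),
]
COLUMNS = {"Age": 1, "Income": 2, "Education": 3, "Temperature": 4}
LABELS = [row[-1] for row in ROWS]


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(groups):
    """Information gain of partitioning LABELS into the given label groups."""
    n = len(LABELS)
    return entropy(LABELS) - sum(len(g) / n * entropy(g) for g in groups)


def categorical_gain(col):
    """Gain of a multiway split on a discrete attribute (one branch per value)."""
    by_value = {}
    for row in ROWS:
        by_value.setdefault(row[col], []).append(row[-1])
    return info_gain(list(by_value.values()))


def best_threshold(col):
    """Best binary split 'value <= t' vs 'value > t' on a continuous attribute.

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumed convention). Returns (gain, threshold).
    """
    values = sorted({row[col] for row in ROWS})
    best = (0.0, None)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [row[-1] for row in ROWS if row[col] <= t]
        right = [row[-1] for row in ROWS if row[col] > t]
        gain = info_gain([left, right])
        if gain > best[0]:
            best = (gain, t)
    return best


def rule(row, threshold):
    """Single-attribute rule: IF Temperature > threshold THEN Yes ELSE No."""
    return "Yes" if row[COLUMNS["Temperature"]] > threshold else "No"


if __name__ == "__main__":
    for name in ("Income", "Education"):
        print(f"{name}: gain = {categorical_gain(COLUMNS[name]):.3f}")
    for name in ("Age", "Temperature"):
        gain, t = best_threshold(COLUMNS[name])
        print(f"{name}: gain = {gain:.3f} at threshold {t}")
    _, t = best_threshold(COLUMNS["Temperature"])
    wrong = [row[0] for row in ROWS if rule(row, t) != row[-1]]
    print("Misclassified IDs:", wrong)
```

On this particular table every 'No' instance has a Temperature of 74.9 or lower and every 'Yes' instance has 75.5 or higher, so the Temperature split found by the sketch leaves no training instance misclassified. With only ten rows, that perfect fit says little about how the rule would generalize, which is relevant to the pruning discussion in question 6.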