1. You are asked to work on the following two problems:
(1) Based on a large dataset of consumer items, predict the number of items that will be sold in the next quarter.
(2) Based on a large dataset of accounts, predict whether each account will be hacked.
Which approach is appropriate?
A. Use a regression algorithm for both.
B. Use a regression algorithm for problem (1) and a classification algorithm for problem (2).
C. Use a classification algorithm for both.
D. Use a classification algorithm for problem (1) and a regression algorithm for problem (2).
2. Which of the following use cases would you solve with an unsupervised ML algorithm? (Select ALL that apply.)
A. You have a dataset of emails labeled as spam or ham. The algorithm should learn to filter out the spam.
B. You have a dataset of customers. The algorithm should discover market segments and assign customers to the segments.
C. You have a dataset of medical patients labeled as diabetic or not diabetic. The algorithm should classify new patients as diabetic or not.
D. You have a dataset of text documents. The algorithm should group them by topic (Science, Sports, etc.).
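As a reminder of what "unsupervised" means in question 2, here is a minimal sketch of a one-dimensional k-means loop in plain Python. The function name `kmeans_1d`, the sample values, and the starting centers are all made up for illustration. The point is only that the input carries no labels and the groups come out of the data itself.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group unlabeled numbers around the nearest of the given starting centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to the index of its closest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical unlabeled measurements; no target column is supplied.
values = [12, 15, 14, 90, 95, 88]
centers, groups = kmeans_1d(values, centers=[0, 100])
print(centers)
print(groups)
```

A supervised algorithm would instead need a known label for every training example.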
3. Consider the following NumPy array:
import numpy as np
a3 = np.array([[[2, 4, 8], [9, 3, 5], [7, 6, 8]], [[8, 1, 9], [6, 7, 7], [5, 9, 8]], [[5, 0, 1], [3, 0, 3], [2, 3, 8]]])
Which slicing expression (of the form a3[...]) produces the following result?
array([[8, 5, 8], [9, 7, 8], [1, 3, 8]])
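One way to approach question 3 is to fix a single axis at a time and see which values it selects. The sketch below does this for index 0 on each axis and adds a small helper, `matches` (a hypothetical name, not part of NumPy), for checking a candidate slice against the target.

```python
import numpy as np

a3 = np.array([[[2, 4, 8], [9, 3, 5], [7, 6, 8]],
               [[8, 1, 9], [6, 7, 7], [5, 9, 8]],
               [[5, 0, 1], [3, 0, 3], [2, 3, 8]]])

# Target result from the question.
target = np.array([[8, 5, 8], [9, 7, 8], [1, 3, 8]])

# Fix one axis at index 0 and keep the other two axes whole.
first_block = a3[0]        # axis 0: the first 3x3 block
first_rows = a3[:, 0]      # axis 1: the first row of every block
first_cols = a3[:, :, 0]   # axis 2: the first column of every block


def matches(candidate):
    """Return True if a candidate slice equals the target array."""
    return np.array_equal(candidate, target)


print(first_rows)
print(first_cols)
print(matches(first_cols))
```

Comparing `first_rows` and `first_cols` with the target shows which axis has to be fixed and at which index.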
4. You want to compare the spread of a quantitative variable Y across the categories of a categorical variable X.
Which plot do you use?
A. Boxplot
B. Pie Chart
C. Scatterplot
D. Histogram
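Comparing spread across categories comes down to per-category summary statistics such as the quartiles and the range. The sketch below computes them with the standard library only. The category names and values are invented for illustration, and `five_number_summary` is a hypothetical helper.

```python
from statistics import quantiles

# Hypothetical data: values of Y recorded for each category of X.
y_by_category = {
    "a": [1, 2, 3, 4, 5, 6, 7],
    "b": [10, 20, 30, 40, 50, 60, 70],
}


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = quantiles(values, n=4)  # default "exclusive" method
    return min(values), q1, median, q3, max(values)


summaries = {cat: five_number_summary(ys) for cat, ys in y_by_category.items()}

for cat, (lo, q1, med, q3, hi) in summaries.items():
    print(f"{cat}: min={lo} Q1={q1} median={med} Q3={q3} max={hi} IQR={q3 - q1}")
```

A plot suited to this question shows these per-category summaries side by side, so differences in spread are visible at a glance.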