Upper management is looking to roll out a new product and wants to see if there are any patterns and insights that can be discovered from customer data. Your team has been tasked to discover these potential patterns and structures within this data.
Which type of machine learning approach would be most appropriate to pick for this problem?
Correct Answer:B
When the goal is to uncover hidden structures or groupings in unlabeled data, unsupervised learning, notably clustering, is the appropriate choice. CPMAI describes clustering as "an unsupervised process that partitions data into groups based on similarity" and calls for applying these methods to discover patterns in unlabeled datasets.
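As a rough illustration of clustering on unlabeled data, the sketch below runs a plain k-means loop in standard-library Python. The customer records, the feature meanings, and the choice of k=2 are assumptions made up for this example; they do not come from CPMAI material.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Hypothetical customers: (annual spend in $1000s, store visits per month).
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 9.0), (8.3, 8.7), (7.9, 9.2)]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

No labels are supplied anywhere: the two groups emerge only from similarity between records. The first three customers end up sharing one cluster id and the last three the other; which id is 0 and which is 1 depends on the random starting centroids.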
- [Machine Learning]
Major factors for the project you are currently working on are around the training time, cost, and complexity of training your models. Which algorithm is not the best choice given these constraints?
Correct Answer:B
Neural Networks, especially deep architectures, typically require extensive computational resources, longer training times, and higher infrastructure costs compared to simpler methods. In contrast, algorithms like Naive Bayes train very quickly on large datasets, and Gaussian Mixture Models or SVMs have more moderate training complexity and infrastructure demands. Therefore, given strict constraints on training time, cost, and complexity, Neural Networks are the least suitable choice.
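To make the training-cost contrast concrete, here is a minimal sketch of why Naive Bayes is cheap to train: fitting a categorical Naive Bayes model amounts to a single counting pass over the data, with no iterative optimization. The function names, the toy rows, and the labels are hypothetical, and add-one (Laplace) smoothing is an assumed design choice.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Single pass over the data: count labels and per-label feature values."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    vocab = defaultdict(set)  # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            vocab[i].add(value)
    return label_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest smoothed log-probability."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(label, i)][value] + 1
            denominator = count + len(vocab[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical rows: (plan, region), labeled with a made-up churn outcome.
rows = [
    ("basic", "north"),
    ("basic", "south"),
    ("premium", "north"),
    ("premium", "south"),
    ("basic", "north"),
]
labels = ["churn", "churn", "stay", "stay", "churn"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("premium", "north")))
print(predict(model, ("basic", "south")))
```

Training cost here grows linearly with the number of rows and features. A neural network, by comparison, typically repeats gradient updates over the data for many epochs, which is where the extra time and infrastructure cost come from.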
- [Trustworthy AI]
You're working on a project and are working with personally identifiable information (PII). What's the best approach to take when it comes to collecting and using this data?
Correct Answer:D
Under CPMAI Phase III: Data Preparation, the Data Format task includes "Data anonymization" as a core activity to remove or mask PII when it is not required for modeling, thereby protecting privacy while retaining data utility.
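One way to picture the remove-or-mask step is the sketch below, which drops direct identifiers and replaces one of them with a keyed hash so records can still be linked. Strictly speaking this is pseudonymization rather than full anonymization, and the field names, the key, and the `customer_id` output column are all assumptions for illustration, not something CPMAI prescribes.

```python
import hashlib
import hmac

# Hypothetical values: a real key would live in a secrets manager, not in code.
SECRET_KEY = b"replace-with-a-managed-secret"
DIRECT_IDENTIFIERS = {"name", "email", "phone"}  # assumed PII field names


def pseudonymize(value, key=SECRET_KEY):
    """Replace a value with a keyed SHA-256 digest (truncated to 16 hex chars)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def anonymize_record(record, join_field="email"):
    """Drop direct identifiers; keep a pseudonymous id derived from join_field."""
    cleaned = {}
    for field, value in record.items():
        if field == join_field:
            cleaned["customer_id"] = pseudonymize(value)
        elif field in DIRECT_IDENTIFIERS:
            continue  # not needed for modeling, so it is removed
        else:
            cleaned[field] = value
    return cleaned


raw = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "phone": "555-0100",
    "age_band": "30-39",
    "monthly_spend": 42.5,
}
safe = anonymize_record(raw)
print(sorted(safe))
```

The modeling features (`age_band`, `monthly_spend`) pass through unchanged, which is the "retaining data utility" half of the trade-off. Quasi-identifiers such as age bands can still allow re-identification in small datasets, so a sketch like this is a starting point rather than a complete privacy control.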
Clean, well-labeled datasets used for machine learning are partitioned into three subsets: Training sets, Validation sets, and Test sets. As your team is doing this, what's the best way to split up this data?
Correct Answer:B
CPMAI's glossary defines data splitting as "dividing a data set into subsets (e.g., training, validation, test) for model development and evaluation," typically achieved via random subsampling to ensure each subset is representative of the underlying distribution and to prevent sampling bias.
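A random split of this kind can be sketched in a few lines of standard-library Python. The 70/15/15 ratios, the fixed seed, and the function name are assumptions chosen for the example; the source does not specify split proportions.

```python
import random


def split_dataset(rows, train=0.70, validation=0.15, seed=42):
    """Shuffle a copy of rows, then slice into train/validation/test subsets."""
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # whatever remains
    return train_set, val_set, test_set


rows = list(range(100))  # stand-in for 100 labeled examples
train_set, val_set, test_set = split_dataset(rows)
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

Shuffling before slicing is what keeps each subset representative: slicing an ordered dataset directly (for example, one sorted by date or by class) would put systematically different records in each subset.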
- [CPMAI Methodology]
You are leading a project to develop a new predictive maintenance solution. Together with your project team you determine your data needs, see if you have access to the data, and then begin working on the project.
Which phase best describes the work you are performing?
Correct Answer:B
Phase II: Data Understanding is dedicated to identifying data requirements, collecting initial data, assessing data quality, and verifying that necessary datasets are accessible and fit for modeling. Determining what data you need and confirming access are the core activities of this phase.
- [AI Fundamentals]
Which of the following best describes the technical definition of Machine Learning?
Correct Answer:B
Tom Mitchell's widely adopted formulation captures ML's essence: improvement on task T, measured by performance measure P, through experience E. This aligns with CPMAI's view that ML enables systems to learn from data and improve over time ("The ability of a machine to learn from data, improve with experience, and apply that learning to make predictions.").
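The T/P/E framing can be made tangible with a toy example. In the sketch below, the task T is predicting y from x, the performance measure P is mean squared error, and the experience E is the number of gradient-descent passes over the data. The dataset (points on y = 2x + 1), the learning rate, and the function names are assumptions for illustration only.

```python
def mse(w, b, data):
    """Performance measure P: mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, epochs, lr=0.05):
    """Fit w and b by full-batch gradient descent, recording P after each pass."""
    w, b = 0.0, 0.0
    n = len(data)
    history = [mse(w, b, data)]  # error before any experience
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
        history.append(mse(w, b, data))
    return w, b, history


# Task T: predict y from x, with toy data generated from y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(5)]
w, b, history = train(data, epochs=200)
for epoch in (0, 10, 100, 200):
    print(epoch, history[epoch])
```

The printed error shrinks as the number of passes grows, which is exactly the sense in which the program "learns": its performance at T, as measured by P, improves with E.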