Business Understanding (5%)
- CRISP-DM process methodology
- Identifying business objectives
- Translating business objectives to data mining goals
Data Understanding (25%)
- Read data from various sources - Source nodes
- Use data visualization - Graph nodes
- Understand distributions and summary statistics
- Identify data quality issues
- Identify and understand outliers
- Identify anomalies - Anomaly node
- Understand relationships among variables
Data Preparation (35%)
- Combine datasets using the Merge and Append nodes
- Derive new fields - Field Ops palette nodes
- Aggregate and restructure datasets
- Use the Select node
- Sampling and balancing datasets
- Methods for reducing the dimensionality of the dataset
- Understand SQL pushback
- Understand use of data caching
- Methods for missing value replacement
Modeling (20%)
- Partition the dataset
- Understand which models to use for categorical (set) or binary outcomes
- Understand which models to use for numeric outcomes
- Understand model types and basic operations
- Combine models using the Ensemble node
- Auto modeling nodes
Evaluation of Results (10%)
- Use the Analysis node
- Produce and interpret Evaluation charts
- Interpret model results using data visualizations (charts) and classification tables
- Interpret generated model nuggets
Deployment of Results (5%)
- Use the Export nodes
- Score new data using generated models
- Understand monitoring of deployed models
