Seven Steps to an ML Workflow
- Overview
- 7 Steps to a Machine Learning (ML) Workflow
An ML workflow defines the phases carried out during an ML project and describes the steps of an ML implementation.
Avoid forcing a project into a workflow that is too rigid. A flexible workflow lets you start small and then scale up to a "production-grade" solution, meaning one that can handle heavy use in a commercial or industrial environment.
While ML workflows vary from project to project, most share a common set of stages. The following are 7 steps to an ML workflow.
- Step 1: Collect Data
Data collection begins with defining the problem. Understanding the problem is critical to determining what data is needed and which solution fits best.
For example, ML projects that use real-time data may rely on IoT systems with a variety of sensors. The initial dataset can be collected from different sources such as databases, archives, or sensors.
- Step 2: Prepare Data (or Data Preprocessing)
Data preprocessing is a data mining technique that involves transforming raw data into an understandable format.
This means cleaning and formatting the raw data, which usually cannot be used to train ML models as-is. Additionally, most ML models operate only on numbers, so ordinal and categorical data must be converted into numerical features.
Real-world data is often incomplete, inconsistent, and lacking in certain behaviors or trends. Data preprocessing applies basic transformations to the data so that a model can consume it.
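As a rough illustration of the transformations described above, the following sketch fills a missing value, maps an ordinal field to a rank, and one-hot encodes a categorical field. The records, the field names (`age`, `size`, `color`), and the `SIZE_ORDER` ranking are made up for this example, and mean imputation is only one of several possible strategies.

```python
from statistics import mean

# Hypothetical raw records: one missing age, an ordinal "size",
# and a categorical "color".
raw_rows = [
    {"age": 34, "size": "small", "color": "red"},
    {"age": None, "size": "large", "color": "blue"},
    {"age": 50, "size": "medium", "color": "red"},
]

# Assumed ranking for the ordinal field.
SIZE_ORDER = {"small": 0, "medium": 1, "large": 2}


def preprocess(rows):
    """Return numeric feature vectors: [age, size_rank, one-hot color...]."""
    # Fill missing ages with the mean of the observed ages.
    known_ages = [r["age"] for r in rows if r["age"] is not None]
    fill_age = mean(known_ages)
    # Fix a stable column order for the one-hot encoding.
    colors = sorted({r["color"] for r in rows})
    features = []
    for r in rows:
        age = r["age"] if r["age"] is not None else fill_age
        one_hot = [1 if r["color"] == c else 0 for c in colors]
        features.append([age, SIZE_ORDER[r["size"]]] + one_hot)
    return features


features = preprocess(raw_rows)
for row in features:
    print(row)
```

Each output row contains only numbers, in the column order age, size rank, then one column per color in alphabetical order.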
- Step 3: Choose an ML Model
Considerations when selecting a model include performance (the quality of the model's results) and interpretability (the ease of interpreting the model's results).
Other considerations include dataset size (which affects how the data is processed and synthesized) and training time and cost (the time and resources needed to train the model).
- Step 4: Train The ML Model
There are 3 main steps to training a machine learning model:
- Start with existing data
- Analyze data to discover patterns
- Make predictions
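The three steps above can be sketched with a deliberately simple model. This example assumes a one-feature linear relationship and fits it with ordinary least squares. The data (hours studied versus score) and the function names `fit_line` and `predict` are hypothetical, and a real project would normally use an ML library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# 1. Start with existing data (hypothetical hours studied and scores).
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

# 2. Analyze the data to discover the pattern (fit the model).
model = fit_line(hours, scores)

# 3. Make a prediction for an unseen input.
predicted = predict(model, 5.0)
print(model, predicted)
```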
- Step 5: Evaluate ML Models
There are three main metrics for evaluating a model:
- Accuracy (the percentage of test data predictions that are correct)
- Precision (of the cases predicted to belong to a category, the fraction that actually belong to it)
- Recall (of the cases that actually belong to a category, the fraction that the model correctly identifies)
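To make the three metrics concrete, here is a minimal sketch that computes them for a binary classifier from lists of true and predicted labels. The labels are invented for illustration, and the helper name `evaluate` is an assumption of this example.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when no case is predicted
        # positive or no case is actually positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
# {'accuracy': 0.625, 'precision': 0.6, 'recall': 0.75}
```

Here the model finds 3 of the 4 true positives (recall 0.75) but also raises 2 false alarms among its 5 positive predictions (precision 0.6), which shows why accuracy alone can hide the kind of errors a model makes.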
- Step 6: Perform Hyperparameter Tuning
Hyperparameters are settings chosen before training, such as those that define the model architecture, rather than values learned from the data. The process of searching for the ideal settings is called hyperparameter tuning.
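One common way to tune is a grid search: train or configure the model with each candidate value and keep the one that scores best on held-out validation data. The sketch below assumes a tiny one-dimensional k-nearest-neighbors classifier, where `k` is the hyperparameter. The data points, the candidate grid, and the helper names are all hypothetical.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, validation, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(1 for x, label in validation
               if knn_predict(train, x, k) == label)
    return hits / len(validation)


# Hypothetical (feature, label) pairs; (3.5, 1) acts as a noisy point.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (3.5, 1),
         (6.0, 1), (7.0, 1), (8.0, 1)]
validation = [(3.4, 0), (2.5, 0), (6.5, 1), (5.5, 1)]

# Grid search: score every candidate k on the validation set.
grid = [1, 3, 5, 7]
scores = {k: accuracy(train, validation, k) for k in grid}

# max() returns the first best key, so ties go to the smaller k.
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

With this data, `k=1` is misled by the noisy point and `k=7` always predicts the majority class, so an intermediate value scores best. In practice the search would run over more data, often with cross-validation rather than a single validation split.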
- Step 7: Deploy ML Models for Predictions
In AI Platform Prediction, a model resource is a container for different versions of an ML model. To deploy a model, you first create a model resource in AI Platform Prediction (which runs your model in the cloud) and then create a version of that model from your trained model.