When applying machine learning models, we usually do data pre-processing, feature engineering, feature extraction, and feature selection. After this, we select the best algorithm and tune its hyperparameters in order to obtain the best results. AutoML is a set of concepts and techniques used to automate these processes.
AutoML is the process of automating the time-consuming, iterative tasks of machine learning model development.
This is our first piece in a series about automated machine learning. If you are already familiar with AutoML and just want to get your hands dirty and see how we use it here at ALMETA, you can jump directly to our next piece on Google’s AutoML.
You are still here? Alright 🙂 Let us start.
What Can We Automate in Machine Learning?
1. Hyperparameter Optimization: You can search for the best combination of hyperparameters with different kinds of search.
2. Model Selection: Run the same data through several algorithms, each with its default hyperparameters, to determine which algorithm learns best on your data.
3. Feature Selection: Given a pre-determined domain of inputs, some tools can select the most relevant features from that domain.
This does not solve the larger problem of identifying the right features (out of all possible inputs in the world) and gathering them.
4. Transfer Learning and Pre-Trained Models: Transfer learning is a technique where one uses pre-trained models to transfer what they have already learned to a new, related task.
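To make items 1 and 2 above more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy dataset, the two candidate models (a mean baseline and a one-feature ridge regression), the helper names, and the choice of random search as the "kind of search". Real AutoML tools do far more than this; the sketch only shows the shape of the idea.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Toy regression data (an assumption): y = 3x + 2 plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

def fit_mean(xs, ys):
    """Baseline model: always predicts the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y

def fit_ridge(xs, ys, alpha):
    """Closed-form ridge regression on one feature; alpha is the hyperparameter."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def select_model(train, valid):
    """Model selection: fit every candidate with default settings, keep the best."""
    candidates = {
        "mean_baseline": fit_mean,
        "ridge_default": lambda xs, ys: fit_ridge(xs, ys, alpha=1.0),
    }
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

def random_search(train, valid, n_trials=20, seed=0):
    """Hyperparameter optimization: sample alpha log-uniformly, keep the best."""
    rng = random.Random(seed)
    best_alpha, best_score = None, float("inf")
    for _ in range(n_trials):
        alpha = 10 ** rng.uniform(-3, 2)
        score = mse(fit_ridge(*train, alpha), *valid)
        if score < best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score

xs, ys = make_data()
train = (xs[:150], ys[:150])   # data used to fit the models
valid = (xs[150:], ys[150:])   # held-out data used to compare them

best_name, scores = select_model(train, valid)
best_alpha, best_score = random_search(train, valid)
print("best model with default settings:", best_name)
print("best alpha found by random search:", best_alpha, "validation MSE:", best_score)
```

Both steps score candidates on a held-out validation split rather than on the training data, so the winner is chosen by how well it generalizes and not by how well it memorizes.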
How Does Automated ML Work?
1. Identify the ML problem to be solved: classification, forecasting, or regression.
2. Specify the source and format of the labeled training data: NumPy arrays or Pandas DataFrames.
3. Configure the compute target for model training.
4. Configure the automated machine learning parameters that determine how many iterations to run over different models, which hyperparameter settings and advanced preprocessing/featurization to try, and which metrics to look at when determining the best model.
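The four steps above can be sketched as a small configuration object plus a search loop. This is a hypothetical illustration, not the API of any particular AutoML product: `AutoMLConfig`, `run_automl`, the `"local"` compute target, the two featurizers, and the threshold classifier are all names and choices we made up for the example, and plain Python lists stand in for NumPy arrays or Pandas data.

```python
import random
from dataclasses import dataclass

@dataclass
class AutoMLConfig:
    task: str             # step 1: "classification", "forecasting", or "regression"
    compute_target: str   # step 3: hypothetical label for where training runs
    iterations: int       # step 4: how many candidate pipelines to try
    primary_metric: str   # step 4: metric used to pick the best candidate
    seed: int = 0

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

METRICS = {"accuracy": accuracy}
FEATURIZERS = {"raw": lambda x: x, "abs": abs}   # toy "featurization" options

def run_automl(config, features, labels):
    """Try random (featurizer, threshold) pipelines and return the best one.

    Step 2 (the data) arrives as the `features` and `labels` lists.
    For brevity, candidates are scored on the same data they are tried on;
    a real run would score them on a held-out validation set.
    """
    if config.task != "classification":
        raise NotImplementedError("this sketch only covers classification")
    if config.compute_target != "local":
        raise NotImplementedError("this sketch only runs locally")
    metric = METRICS[config.primary_metric]
    rng = random.Random(config.seed)
    best = None
    for _ in range(config.iterations):
        name = rng.choice(sorted(FEATURIZERS))
        threshold = rng.uniform(min(features), max(features))
        transform = FEATURIZERS[name]
        preds = [int(transform(x) > threshold) for x in features]
        score = metric(preds, labels)
        if best is None or score > best["score"]:
            best = {"featurizer": name, "threshold": threshold, "score": score}
    return best

config = AutoMLConfig(
    task="classification",
    compute_target="local",
    iterations=200,
    primary_metric="accuracy",
)
features = [-3, -2, -1, 1, 2, 3]
labels = [1, 1, 0, 0, 1, 1]      # label is 1 when the value is far from zero
print(run_automl(config, features, labels))
```

Keeping the problem type, compute target, iteration count, and metric in one config object mirrors how the steps are described: the user declares what they want, and the loop decides which pipelines to try.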
When to Use Automated ML?
Use automated ML when you want to:
- Implement machine learning solutions without extensive programming knowledge.
- Save time and resources.
- Leverage data science best practices.
- Provide agile problem-solving.
The Drawbacks of AutoML
- In the world of automated machine learning, we pretend that data exploration and domain knowledge don’t matter. We can only do that for a few limited use cases. So, automated machine learning has a narrow happy path; that is, it’s easy to step off the path and get into trouble.
- AutoML is a fairly new concept in the machine learning world. It is, therefore, important to exercise caution while applying some of the current AutoML solutions. This is because some of these technologies are still under development.
- Another major challenge is the time it takes to run the AutoML models. This really depends on the computational power of the machine we’re running on. As we shall see soon, some of the AutoML solutions run well on our local machines, but some require an accelerated solution.
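One common way to keep the running time under control is to give the search a time budget and stop when it runs out. The sketch below is our own illustration of that idea, not a feature of any specific tool; `search_with_budget` and its `clock` parameter are hypothetical names.

```python
import time

def search_with_budget(candidates, evaluate, budget_seconds, clock=time.monotonic):
    """Evaluate candidates in order until the time budget is used up.

    Returns (best_candidate, best_score, number_of_candidates_tried).
    Higher scores are better. The budget is only checked between
    candidates, so a single slow evaluation can still overrun it.
    """
    deadline = clock() + budget_seconds
    best, best_score, tried = None, float("-inf"), 0
    for candidate in candidates:
        if clock() >= deadline:
            break
        score = evaluate(candidate)
        tried += 1
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, tried

# Toy usage: the "model quality" peaks at candidate 4.
best, score, tried = search_with_budget(
    candidates=range(10),
    evaluate=lambda c: -(c - 4) ** 2,
    budget_seconds=1.0,
)
print(best, score, tried)
```

The `clock` argument exists only so the function can be exercised with a fake clock; in normal use the default monotonic clock is enough.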
The most well-known AutoML tools:
- H2O AutoML
- Sequential Model-based Algorithm Configuration (SMAC)
- RoBO – Robust Bayesian Optimization framework
In this post, we presented the AutoML concept, discussed where, when, and how to use AutoML along with its pros and cons, and finally provided a list of systems you can use to try AutoML yourself.
So, are you ready to get your hands dirty? In our next piece, we discuss in depth one of these products, namely Google’s AutoML, and how this service can be used to empower Arabic NLP applications.
Did you know that we use all this and other AI technologies in our app? See what you’re reading about applied in action: try our Almeta News app. You can download it from Google Play or Apple’s App Store.