While explaining Optuna to a client in the context of hyperparameter tuning, and doing more research on the topic, I came across AutoGluon, which describes itself as "AutoML for images, text, and tabular data". After a quick scan of the documentation, I decided to give it a try and see how it performs on a simple project.
I have always loved the Kaggle competition NYC Taxi Trip Duration: the data combines spatial, temporal, and other traditional attribute information, which makes it a great dataset for testing various models and feature engineering techniques.
I used a local Apache Spark instance (as it is my go-to ETL engine) to perform some feature engineering before letting AutoGluon do its magic. For a quick proof of concept, the results are quite impressive. Here are the steps to reproduce the project and visualize the results.
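To give an idea of what "spatial and temporal" feature engineering can mean for a taxi trip, here is a minimal sketch in plain Python. It is not the Spark code used in this project: the function names (`haversine_km`, `trip_features`), the argument names, and the choice of features (hour, weekday, weekend flag, great-circle distance) are assumptions made for illustration only.

```python
import math
from datetime import datetime

# Mean Earth radius in kilometres (a common approximation).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_features(pickup_datetime, pickup_lat, pickup_lon,
                  dropoff_lat, dropoff_lon):
    """Derive a few illustrative temporal and spatial features for one trip.

    `pickup_datetime` is assumed to be a string like "2016-03-14 17:24:55".
    """
    ts = datetime.strptime(pickup_datetime, "%Y-%m-%d %H:%M:%S")
    return {
        "pickup_hour": ts.hour,
        "pickup_weekday": ts.weekday(),   # Monday = 0, Sunday = 6
        "is_weekend": ts.weekday() >= 5,
        "distance_km": haversine_km(pickup_lat, pickup_lon,
                                    dropoff_lat, dropoff_lon),
    }


if __name__ == "__main__":
    # Hypothetical trip across Manhattan, for demonstration only.
    print(trip_features("2016-03-14 17:24:55",
                        40.7679, -73.9822, 40.7656, -73.9646))
```

In a Spark job the same logic would typically be expressed with column functions rather than a per-row Python function, but the features themselves would be the same idea.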