Cloud Managed Services & Favorite Machine Learning Tools
As someone who's passionate about data science, I've found that one of the best ways to reduce my team's dependence on others is to use cloud managed services for our machine learning infrastructure. In a previous post, I mentioned the importance of strengthening software engineering skills within the team. Even so, I believe that deploying clusters, monitoring virtual machines, and writing intricate parallelization logic shouldn't be a priority when managed services can do all of this for us. It pays to keep looking for ways to make our work more efficient.
Personally, I find Google Cloud Platform (GCP) to be the best cloud provider for machine learning work. Although I have experience building solutions on other platforms, including AWS, Azure, and on-premises Hive/Spark, GCP remains my top choice. Below is a list of my favorite managed services within GCP. If some of the concepts are unfamiliar, there are plenty of blog posts and documentation available for further exploration.
Vertex AI AutoML. This managed service automates model training at scale, including feature engineering, hyperparameter tuning, and neural architecture search. I have worked with several AutoML tools, such as H2O Driverless AI, Turi (prior to its acquisition by Apple), Azure AutoML, and SageMaker, and in my experience Vertex AI outperforms them in ease of use, scalability, infrastructure reliability, accuracy, and flexibility in how the resulting models can be used.
Vertex AI Pipelines. This managed service lets end-users build complex orchestration pipelines that train and deploy new models on a schedule. Pipeline components are written as simple Python functions, which can then be mixed, matched, and reused to form complex execution graphs. The service runs these pipelines on managed compute infrastructure, with each component of a pipeline running in its own separate Docker container.
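To make the "components are simple Python functions" idea concrete, here is a minimal sketch that assumes the Kubeflow Pipelines SDK v2 (`kfp`), which is one way to author pipelines for Vertex AI. The function names, the pipeline name, the `python:3.10` base image, and the toy logic are all made up for illustration; a real pipeline would do actual data preparation and training.

```python
def prepare_data(raw_rows: int) -> int:
    """Toy preprocessing step: keep 80% of the rows for training."""
    return raw_rows * 4 // 5


def train_model(train_rows: int) -> str:
    """Toy training step: return an identifier for the trained model."""
    return f"model-trained-on-{train_rows}-rows"


def compile_pipeline(package_path: str = "pipeline.yaml") -> str:
    """Wrap the functions above as components and compile a pipeline spec."""
    # Imported lazily so the plain functions stay usable without the SDK.
    from kfp import compiler, dsl

    # Each component runs in its own container built from the base image.
    prepare_op = dsl.component(prepare_data, base_image="python:3.10")
    train_op = dsl.component(train_model, base_image="python:3.10")

    @dsl.pipeline(name="toy-training-pipeline")  # hypothetical name
    def toy_pipeline(raw_rows: int = 1000):
        prepared = prepare_op(raw_rows=raw_rows)
        # Feeding one step's output into the next defines the execution graph.
        train_op(train_rows=prepared.output)

    compiler.Compiler().compile(pipeline_func=toy_pipeline, package_path=package_path)
    return package_path
```

Because the components are ordinary functions, they can be unit-tested locally before they ever run in the cloud. Calling `compile_pipeline()` should write a pipeline spec that can then be submitted to Vertex AI; the submission and scheduling step is left out of this sketch.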
BigQuery. A managed SQL data warehouse that separates storage from compute (similar to Snowflake), so each can scale independently. The service has built-in scalable ML models (such as k-means, matrix factorization, and linear regression) and robust integration with Vertex AI and Dataflow.
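As a rough illustration of the built-in ML models, the sketch below builds a BigQuery ML `CREATE MODEL` statement for k-means and, optionally, runs it with the `google-cloud-bigquery` client. The dataset, table, and column names in the usage note are hypothetical, and the helper assumes its inputs are trusted identifiers since it interpolates them directly into the SQL.

```python
from typing import Sequence


def build_kmeans_sql(
    model_id: str,
    source_table: str,
    feature_columns: Sequence[str],
    num_clusters: int = 4,
) -> str:
    """Return a BigQuery ML statement that trains a k-means model."""
    if not feature_columns:
        raise ValueError("at least one feature column is required")
    columns = ", ".join(feature_columns)
    return (
        f"CREATE OR REPLACE MODEL `{model_id}`\n"
        f"OPTIONS (model_type = 'KMEANS', num_clusters = {num_clusters}) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )


def run_sql(sql: str):
    """Execute the statement; needs GCP credentials and a default project."""
    # Imported lazily so the SQL builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).result()  # blocks until the job finishes
```

For example, `build_kmeans_sql("demo.customer_segments", "demo.customers", ["age", "spend"], 3)` produces a statement that clusters customers on two columns. The training itself runs inside BigQuery, so no data has to leave the warehouse.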
Dataflow. A managed service for Apache Beam, which lets you define complex data processing pipelines in simple Python code. Dataflow then scales and parallelizes these pipelines, so end-users don't have to worry about the mechanics of parallelization or the compute clusters behind their job.
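A small sketch of what such a pipeline can look like with the Apache Beam Python SDK (`apache_beam`). The input format ("user,amount" lines) and the function names are invented for this example. With no extra arguments it runs locally on Beam's default runner; the same code is meant to run on Dataflow when runner options are passed in.

```python
from typing import Iterable, List, Optional, Tuple


def parse_event(line: str) -> Tuple[str, float]:
    """Turn a 'user,amount' line into a (user, amount) pair."""
    user, amount = line.split(",")
    return user.strip(), float(amount)


def run_pipeline(lines: Iterable[str], pipeline_args: Optional[List[str]] = None) -> None:
    """Sum amounts per user and print one (user, total) pair per key."""
    # Imported lazily so parse_event can be tested without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args or [])
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.Create(list(lines))
            | "Parse" >> beam.Map(parse_event)         # applied per element
            | "SumPerUser" >> beam.CombinePerKey(sum)  # grouped by key
            | "Print" >> beam.Map(print)
        )
```

Calling `run_pipeline(["alice,3.5", "bob,1.0", "alice,2.0"])` exercises the pipeline locally. To target Dataflow, one would pass arguments such as `--runner=DataflowRunner` along with project, region, and temp location settings; the code describing the transforms stays the same, which is the main appeal.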
TensorFlow Hub. This repository offers free pretrained models that transform text or image data into powerful embeddings, which can then be used as features in your own models (transfer learning).
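Here is a minimal sketch of the "embeddings as features" idea, assuming the `tensorflow_hub` package and the Universal Sentence Encoder as the pretrained model. The model URL is my choice for illustration, and `build_feature_row` is a hypothetical helper; any text or image embedding model could take its place.

```python
from typing import List, Sequence

# Assumed model for this example: Universal Sentence Encoder v4.
USE_URL = "https://tfhub.dev/google/universal-sentence-encoder/4"


def embed_texts(texts: Sequence[str], model_url: str = USE_URL) -> List[List[float]]:
    """Return one embedding vector per input text (downloads the model)."""
    # Imported lazily because TensorFlow is a heavy dependency.
    import tensorflow_hub as hub

    model = hub.load(model_url)
    return model(list(texts)).numpy().tolist()


def build_feature_row(
    embedding: Sequence[float], tabular_features: Sequence[float]
) -> List[float]:
    """Concatenate an embedding with ordinary numeric features."""
    return list(embedding) + list(tabular_features)
```

The rows produced by `build_feature_row` can be fed to whatever downstream model you prefer, so a free-text column contributes signal without any text model being trained from scratch.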
In conclusion, cloud managed services have changed the way data scientists work with machine learning infrastructure. By taking over cluster deployment, scaling, and orchestration, they simplify and accelerate the development of new data science products.