The Next Frontier for AutoML: Learning Custom Embeddings

Mar 01, 2023

As an AI enthusiast, I'm always excited to explore the latest developments in machine learning. Today, I want to share my thoughts on what I believe is the next frontier for AutoML: custom embeddings.

In a previous blog post, I talked about my preferred AutoML tool, Vertex AI AutoML. While it's a great tool for building ML models, it won't let you learn embeddings from your data. I believe that the future of machine learning lies in tools that let users learn custom embeddings from their own data. These embeddings can represent different business entities and have the potential to transform the way we solve business problems.

So, what exactly are embeddings? In a nutshell, they create a mathematical representation of different business entities such as users, products, and stores, allowing computers to understand information about these entities and how they are related to each other. This is incredibly useful for businesses, as embeddings can be used to build ML models with very sparse training data and also improve the speed of model inference in high-volume real-time systems with very strict latency requirements.
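To make the idea concrete, here is a minimal sketch of embeddings as plain vectors. The entity names and the three-dimensional values are hypothetical and hand-picked for illustration; real embeddings are learned from data and usually have many more dimensions. Cosine similarity is one common way to measure how closely two entities are related.

```python
import math

# Hypothetical, hand-picked embeddings (assumption: real ones are learned from data).
embeddings = {
    "user_alice": [0.9, 0.1, 0.0],
    "product_running_shoes": [0.8, 0.2, 0.1],
    "product_blender": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_products(user_key):
    """Sort all product entities by similarity to the given user, best first."""
    user_vector = embeddings[user_key]
    products = [key for key in embeddings if key.startswith("product_")]
    return sorted(
        products,
        key=lambda key: cosine_similarity(user_vector, embeddings[key]),
        reverse=True,
    )


ranking = rank_products("user_alice")
print(ranking)
```

Because the user and the products live in the same vector space, ranking products for a user comes down to a handful of multiplications, which is part of why embeddings suit low-latency systems.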

In my opinion, the Two Tower model is a viable way to learn these embeddings. It is a powerful neural network architecture that can create high-quality embeddings of various business entities. The model embeds pairs of diverse entities, such as users and products, each represented by structured and unstructured data, into the same embedding space. This allows the embeddings to encode how different entities relate to each other. However, there is currently no robust AutoML-managed tool on the market to train a Two Tower model, which makes it difficult for businesses to adopt this technique without substantial expertise and manual effort.
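The shape of a Two Tower model can be sketched in a few lines. This is a deliberately simplified illustration, not a production recipe: each tower here is a single linear layer with random, untrained weights, whereas real towers are usually deep networks trained on interaction data with a framework such as TensorFlow or PyTorch. The feature vectors, dimensions, and function names are all hypothetical.

```python
import random

EMBEDDING_DIM = 4  # assumption: shared embedding size chosen for illustration


def make_tower(input_dim, output_dim, rng):
    """Create one tower as a weight matrix (output_dim rows, input_dim columns)."""
    return [[rng.uniform(-1, 1) for _ in range(input_dim)] for _ in range(output_dim)]


def embed(tower, features):
    """Project an entity's raw features into the shared embedding space."""
    return [sum(w * x for w, x in zip(row, features)) for row in tower]


def score(user_vector, product_vector):
    """Dot product of the two embeddings; higher means a stronger predicted match."""
    return sum(u * p for u, p in zip(user_vector, product_vector))


rng = random.Random(0)
# The towers accept different input sizes but emit vectors of the same size.
user_tower = make_tower(3, EMBEDDING_DIM, rng)
product_tower = make_tower(5, EMBEDDING_DIM, rng)

user_features = [0.5, 1.0, 0.0]               # hypothetical user features
product_features = [1.0, 0.0, 0.0, 0.3, 0.7]  # hypothetical product features

user_embedding = embed(user_tower, user_features)
product_embedding = embed(product_tower, product_features)
affinity = score(user_embedding, product_embedding)
print(affinity)
```

The key design point is that the two towers never see each other's inputs. They only meet at the final dot product, so product embeddings can be computed ahead of time and only the user side needs to run at request time.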

Fortunately, the field of AutoML is constantly evolving, and promising developments like Graph Neural Networks (GNNs) are also emerging. Kumo, for example, is developing tools that model existing structured business data as graphs and create powerful ML models based on those graphs. Graph networks are ideal for learning potent embeddings for business entities in the data because graphs can easily encode complex relationships and dependencies between these entities. I am eager to see what the team comes up with in the form of new products.
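To give a feel for how a graph can spread information between entities, here is a toy sketch of one round of neighbor averaging, the basic idea behind message passing in GNNs. It is not Kumo's method or any library's API: the graph, the features, and the function names are hypothetical, and a real GNN would add learned weights and nonlinearities and be trained on a prediction task.

```python
# Hypothetical purchase graph: each edge links a user to a product they bought.
edges = [
    ("user_1", "product_a"),
    ("user_1", "product_b"),
    ("user_2", "product_b"),
]

# Hypothetical starting features for every node.
features = {
    "user_1": [1.0, 0.0],
    "user_2": [0.0, 1.0],
    "product_a": [0.5, 0.5],
    "product_b": [0.2, 0.8],
}


def build_neighbors(edge_list):
    """Build an undirected adjacency map from a list of (source, target) pairs."""
    neighbors = {}
    for source, target in edge_list:
        neighbors.setdefault(source, []).append(target)
        neighbors.setdefault(target, []).append(source)
    return neighbors


def message_passing_round(node_features, neighbors):
    """Average each node's vector with the mean of its neighbors' vectors."""
    updated = {}
    for node, own in node_features.items():
        neighbor_vectors = [node_features[n] for n in neighbors.get(node, [])]
        if not neighbor_vectors:
            updated[node] = list(own)  # isolated node keeps its own features
            continue
        mean = [sum(values) / len(neighbor_vectors) for values in zip(*neighbor_vectors)]
        updated[node] = [(a + b) / 2 for a, b in zip(own, mean)]
    return updated


neighbors = build_neighbors(edges)
node_embeddings = message_passing_round(features, neighbors)
print(node_embeddings["user_2"])
```

After one round, each user's vector already reflects the products they bought. Running more rounds lets information travel further across the graph, for example from one user to another through a product they share.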

In conclusion, learning custom embeddings with AutoML has vast potential and could transform the way businesses solve problems and innovate. Although there are currently no robust AutoML-managed tools available for creating custom embeddings, promising approaches like the Two Tower model and Graph Neural Networks are emerging. As AutoML continues to evolve, we can expect to see new and more effective tools that will enable businesses to easily learn custom embeddings and use them to their fullest potential.

Data Science is not Rocket Science
