Scaling Deep Learning Models in Production Using Kubernetes

17m

While many machine learning frameworks and libraries are available, putting models into production at large scale remains a challenge. Sahil talks about how his team took on the challenge of deploying deep learning models in production: how they chose their tools and built their internal deep learning infrastructure on Kubernetes. He covers model training in Docker containers, distributed TensorFlow training across a cluster of containers, automated retraining of models, and finally the deployment of models to serve predictions. At the scale at which they operate, nothing comes easy. He also explains how they optimize their prediction-serving infrastructure for latency or throughput, depending on the use case.

Speaker

Sahil Dua

Git contributor. Tech Speaker. Currently @BookingCom. Open Source Contributor. http://github.com/sahildua2305