09-14, 12:00–12:30 (Europe/Amsterdam), Bar
Having worked on ML serving for a couple of years, we have learned a lot. I would like to share a set of best practices and learnings with the community.
At Adyen we deploy a lot of models for online inference in the payment flow. Working in the MLOps team to streamline this process, I have learned a lot about best practices and things to consider before (and after) putting a model online. These are small things, but they add up to a reliable production setup for online inference. Some examples:
- Adding metadata & creating a self-contained archive (sketched below)
- Separating serving sources from training sources
- Choosing the model's requirements
- Adding an example input & output request
- Adding schemas for input and output (sketched below)
- Common issues when putting models online: memory leaks, concurrency
- Which server is best: process-based or thread-based? (see the config sketch below)
- How different Python versions affect inference (execution) time
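To give a flavour of the first two points, here is a minimal sketch of bundling a model with its metadata and pinned requirements into a single self-contained archive. The library-free approach, file names, and metadata fields are illustrative assumptions, not the exact setup used at Adyen.

```python
import json
import pickle
import tarfile
from pathlib import Path

def package_model(model, out_dir: str = "model_bundle") -> Path:
    """Write the model plus its metadata into one deployable archive."""
    bundle = Path(out_dir)
    bundle.mkdir(exist_ok=True)

    # Serialise the trained model next to its metadata.
    with open(bundle / "model.pkl", "wb") as f:
        pickle.dump(model, f)

    # Metadata travels with the artifact, so serving can verify what it loads.
    metadata = {
        "model_name": "fraud-scorer",  # hypothetical name
        "version": "1.3.0",
        "python_version": "3.11",
        "requirements": ["scikit-learn==1.4.2", "numpy==1.26.4"],  # pinned
    }
    (bundle / "metadata.json").write_text(json.dumps(metadata, indent=2))

    # One archive = one self-contained, deployable unit.
    archive = bundle.parent / f"{bundle.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(bundle, arcname=bundle.name)
    return archive
```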
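For the schema and example-request points, a sketch using pydantic (one possible choice of validation library; the field names are hypothetical):

```python
from pydantic import BaseModel

class ScoringRequest(BaseModel):
    transaction_amount: float
    currency: str
    card_country: str

class ScoringResponse(BaseModel):
    risk_score: float
    model_version: str

# Shipping an example request/response pair alongside the model lets
# consumers (and tests) check the contract before the model goes live.
example_request = ScoringRequest(
    transaction_amount=42.0, currency="EUR", card_country="NL"
)
example_response = ScoringResponse(risk_score=0.07, model_version="1.3.0")
```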
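For the process- vs thread-based question, a sketch of a gunicorn configuration, assuming gunicorn as the server (the abstract does not name one). The worker and thread counts are placeholders:

```python
# gunicorn.conf.py -- placeholder values
bind = "0.0.0.0:8000"

# workers = number of processes; each holds its own copy of the model,
# which sidesteps the GIL for CPU-bound inference but multiplies memory.
workers = 4

# threads per worker (with the threaded worker class); threads share one
# model copy but contend on the GIL for CPU-bound work.
worker_class = "gthread"
threads = 2
```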
No previous knowledge expected
Staff Engineer @ Adyen. I am passionate about high-performance distributed systems. Recently I have been working on scaling Adyen's Data & ML infrastructure.