PyData Amsterdam 2023

Extend your scikit-learn workflow with Hugging Face and skorch
09-14, 10:30–11:00 (Europe/Amsterdam), Bar

Discover how to bridge the gap between traditional machine learning and the rapidly evolving world of AI with skorch. This package integrates with the Hugging Face ecosystem while adhering to the familiar scikit-learn API. We will explore fine-tuning of pre-trained models, creating our own tokenizers, accelerating model training, and leveraging Large Language Models.


The machine learning world is evolving quickly, AI is talked about everywhere, and the Hugging Face ecosystem is in the midst of it. For traditional machine learning users, especially those coming from scikit-learn, keeping up can be quite overwhelming. The skorch package makes it possible to marry the best of both worlds: it lets you integrate with many Hugging Face features while conforming to the sklearn API.
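To give a flavour of what that means, here is a minimal sketch (not taken from the talk) of how a skorch net behaves like a scikit-learn estimator; the module class, toy data, and hyperparameters are placeholders.

import numpy as np
import torch
from torch import nn
from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetClassifier

# A small placeholder PyTorch module; any nn.Module works here.
class ClassifierModule(nn.Module):
    def __init__(self, num_units=10):
        super().__init__()
        self.dense = nn.Linear(20, num_units)
        self.output = nn.Linear(num_units, 2)

    def forward(self, X):
        X = torch.relu(self.dense(X))
        return self.output(X)  # raw logits

# Toy data, just to make the example self-contained.
X = np.random.randn(100, 20).astype(np.float32)
y = np.random.randint(0, 2, size=100).astype(np.int64)

net = NeuralNetClassifier(
    ClassifierModule,
    criterion=nn.CrossEntropyLoss,
    max_epochs=5,
    lr=0.1,
)

# The net exposes fit/predict and slots into sklearn tools such as GridSearchCV;
# module__num_units routes the parameter to the wrapped PyTorch module.
search = GridSearchCV(net, {"module__num_units": [10, 20]}, cv=2, scoring="accuracy")
search.fit(X, y)
print(search.best_params_)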

In this talk, I'll give a brief introduction to skorch. Then we will learn how to use it to tap into the Hugging Face ecosystem: using pre-trained models and fine-tuning them, working with tokenizers as if they were sklearn transformers, accelerating model training, and even using Large Language Models as zero-shot classifiers. I'll also discuss some benefits and drawbacks of this approach.
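As a taste of that integration, here is a hedged sketch along the lines of skorch's Hugging Face helpers (skorch.hf.HuggingfacePretrainedTokenizer); the model name, hyperparameters, and toy data are placeholders, not the exact setup shown in the talk.

import numpy as np
from sklearn.pipeline import Pipeline
from torch import nn
from transformers import AutoModelForSequenceClassification
from skorch import NeuralNetClassifier
from skorch.hf import HuggingfacePretrainedTokenizer

class FinetunedClassifier(nn.Module):
    """Thin wrapper so the pre-trained model returns plain logits."""
    def __init__(self, name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.model = AutoModelForSequenceClassification.from_pretrained(
            name, num_labels=num_labels
        )

    def forward(self, **tokenized):
        # skorch passes the tokenizer output (input_ids, attention_mask, ...)
        # to forward as keyword arguments.
        return self.model(**tokenized).logits

pipe = Pipeline([
    # The tokenizer acts like an sklearn transformer on raw strings.
    ("tokenizer", HuggingfacePretrainedTokenizer("bert-base-uncased", max_length=128)),
    ("net", NeuralNetClassifier(
        FinetunedClassifier,
        criterion=nn.CrossEntropyLoss,
        max_epochs=3,
        lr=2e-5,
        batch_size=16,
        train_split=None,  # skip the internal validation split for this tiny toy set
        device="cpu",      # or "cuda"
    )),
])

X = ["a great movie", "such a waste of time"]   # raw text
y = np.asarray([1, 0], dtype=np.int64)          # integer labels
pipe.fit(X, y)
print(pipe.predict(["an enjoyable film"]))

In a similar spirit, skorch ships classifiers built around generative LLMs (skorch.llm); a hedged sketch, assuming the checkpoint name is just a placeholder:

from skorch.llm import ZeroShotClassifier

clf = ZeroShotClassifier("bigscience/bloomz-1b1")   # placeholder checkpoint
clf.fit(X=None, y=["negative", "positive"])         # only the candidate labels are needed
print(clf.predict(["This movie was a pleasant surprise."]))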

This talk should be of interest to you if you're coming from the scikit-learn world and are interested in the latest deep learning developments. Familiarity with scikit-learn and a little bit of PyTorch knowledge are recommended.


Prior Knowledge Expected

Previous knowledge expected

I worked as a Data Scientist and Head of Data Science for a couple of years; now I'm a Machine Learning Engineer at Hugging Face. I'm also a maintainer of the skorch package (https://github.com/skorch-dev/skorch).