PyData Amsterdam 2023

Transfer Learning in Boosting Models
09-16, 10:00–10:30 (Europe/Amsterdam), Foo (main)

Did you know that you can do transfer learning on boosted forests too? Even today, we face business cases where the modelling sample is very small. This makes the modelling results uncertain and, in some cases, makes modelling impossible. To counter this, we investigated transfer learning approaches for boosting models. In this talk, we show the methods used and the results from a real case in the credit risk domain.


Transfer learning (TL), a form of machine learning, involves leveraging knowledge acquired while addressing one task and applying it to a related task. While TL is mainly associated with deep learning, it is also applicable to boosting algorithms, which are commonly used in advanced credit risk modelling.

During the talk, we present a real use case: building a probability of default (PD) model for a customer segment with a limited data history within the bank. There are several ways to benefit from the data on other customer segments for which the bank already has rich data.

Simple approaches would be:
- Fit a model on the rich data only and apply it to the limited data
- Fit a model on both data sets, but tune it on the limited data

More complex (TL) approaches:
- Fit a model on the rich data with sample weights derived from a resemblance analysis that measures the similarity between the two data sources
- Refit the model trained on the rich data using the limited data
- Start from a pre-trained model when modelling the limited data
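
As an illustration of the last approach, here is a minimal sketch of continuing a boosted model from a pre-trained one. It is not the speakers' implementation: the abstract does not say which library or settings were used, so this is a toy pure-Python gradient booster with decision stumps and log loss, run on synthetic data. The names `boost`, `fit_stump`, `make_segment` and the `init_model` parameter are hypothetical and exist only for this example. In practice, boosting libraries offer comparable continued-training options, for example the `xgb_model` argument of `xgboost.train` and the `init_model` argument of `lightgbm.train`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return the one-split stump (feature, threshold, left mean, right mean)
    that minimises squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]


def raw_score(model, row):
    """Log-odds: base score plus the contribution of every stump."""
    base, stumps = model
    return base + sum(lm if row[j] <= thr else rm for j, thr, lm, rm in stumps)


def boost(X, y, n_rounds, step=1.0, init_model=None):
    """Gradient boosting with log loss.

    If init_model is given (hypothetical parameter), its base score and stumps
    are kept and the new stumps are fitted to its residuals on (X, y).
    """
    if init_model is None:
        p = sum(y) / len(y)
        base, stumps = math.log(p / (1 - p)), []
    else:
        base, stumps = init_model[0], list(init_model[1])  # copy, keep original intact
    for _ in range(n_rounds):
        current = (base, stumps)
        # Negative gradient of the log loss with respect to the raw score.
        residuals = [yi - sigmoid(raw_score(current, row)) for row, yi in zip(X, y)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, step * lm, step * rm))
    return base, stumps


def log_loss(model, X, y):
    eps = 1e-12
    total = 0.0
    for row, yi in zip(X, y):
        p = min(max(sigmoid(raw_score(model, row)), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(y)


def make_segment(n, intercept, rng):
    """Synthetic segment: default probability rises with the first feature."""
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(intercept + 3.0 * row[0]) else 0 for row in X]
    return X, y


rng = random.Random(0)
X_rich, y_rich = make_segment(150, -2.0, rng)    # segment with rich history
X_small, y_small = make_segment(30, -1.0, rng)   # related segment, little data

pretrained = boost(X_rich, y_rich, n_rounds=20)
transferred = boost(X_small, y_small, n_rounds=5, init_model=pretrained)

print("log loss on limited data, pre-trained only:", round(log_loss(pretrained, X_small, y_small), 4))
print("log loss on limited data, after continuing:", round(log_loss(transferred, X_small, y_small), 4))
```

The printed values are training losses on the limited sample, so they only show that the extra stumps adapt the pre-trained model to the new segment. Whether this helps on unseen data would have to be checked on a hold-out set.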

Join us for an engaging session where we will share the outcomes of our experiments and the lessons learned. These approaches are relevant beyond the presented use case and can be applied to similar scenarios in your own domain.


Prior Knowledge Expected

No previous knowledge expected

Busra is an experienced data scientist with a passion for analytics at ING’s Risk & Pricing Advanced Analytics Team in Amsterdam. Over the last 5 years at ING, she has designed and developed end-to-end advanced analytics solutions to business problems in different domains. Currently, she is working on real-time credit risk models using ML. Busra has a background in optimisation and operations research from her B.Sc. studies, and she has an M.Sc. degree in Data Science.