PyData Amsterdam 2023

Distillation Unleashed: Domain Knowledge Transfer with Compact Neural Networks
09-16, 10:00–10:30 (Europe/Amsterdam), Qux

This talk explores distillation learning, a powerful technique for compressing and transferring knowledge from larger neural networks to smaller, more efficient ones. It covers the core components of the approach and applications such as model compression and transfer learning. The speaker aims to make the topic accessible to all audiences and provides a practical implementation, demonstrating how to apply distillation learning in real scenarios. Attendees will gain insights into developing efficient neural networks through examples of distilling a larger, complex model into a compact one. The material will be shared online for convenient access.


As the field of artificial intelligence continues to advance, the need for more efficient and compact neural network models has become increasingly pressing. The ability to compress and transfer knowledge from larger, complex models to smaller, more efficient ones has emerged as a powerful solution. In this talk, we aim to shed light on the significance of distillation learning and its applications across various domains.

In an era of escalating data sizes and computational requirements, distillation learning provides a compelling way to address the challenges these factors pose. Using a teacher-student framework, the approach transfers knowledge from a larger, well-performing teacher model to a smaller student model. The student model is trained to mimic the behaviour and output of the teacher model, thereby inheriting its expertise. This process yields compact models that are not only efficient in terms of memory and inference speed but also capable of performing tasks with comparable proficiency. Distillation learning has thus become a central tool for model compression and transfer learning with deep neural networks.
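To make the mimicry objective concrete, a standard formulation (following Hinton et al.'s knowledge distillation; the symbols below are illustrative and not taken from the talk) combines a hard-label loss with a softened teacher-matching term:

    \mathcal{L}_{student} = \alpha \, \mathrm{CE}\big(y, \sigma(z_s)\big) + (1 - \alpha) \, T^{2} \, \mathrm{KL}\big(\sigma(z_t / T) \,\|\, \sigma(z_s / T)\big)

where z_t and z_s are the teacher and student logits, \sigma is the softmax, T is a temperature that softens both distributions, and \alpha balances the ground-truth cross-entropy against the teacher-matching KL divergence.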

In this talk, we will provide a comprehensive overview of distillation learning, covering its core components. We will explore the definition and motivation behind the approach, highlighting the role of the teacher model in guiding the student model and the objective of the student model to replicate the teacher model's output. Additionally, we will discuss its diverse applications, including model compression, transfer learning, ensemble learning, multi-task learning, and language models. We will also delve into different variants of the approach, such as model distillation, knowledge distillation, multi-task distillation, and transfer distillation.

This talk facilitates knowledge exchange and aims to inspire the development of efficient neural networks. The speaker keeps the topic accessible to all audiences. A simple, practical implementation in TensorFlow will be demonstrated, showing how attendees can apply the technique in real scenarios; a minimal sketch of the idea follows below. No expertise in complex models is required, and the material will be shared online for convenient access.
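As a taste of what such a demonstration might look like, here is a minimal, hypothetical TensorFlow sketch of a teacher-student training step. The model sizes, dataset shape, and hyperparameters are illustrative assumptions, not the talk's actual demo code:

import tensorflow as tf

# Illustrative example: distilling a larger MLP ("teacher") into a smaller
# one ("student") on 28x28 images with 10 classes (MNIST-like data assumed).

def build_model(hidden_units):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(10),  # raw logits for 10 classes
    ])

teacher = build_model(512)   # larger model; pre-trained on the task in practice
student = build_model(64)    # compact model to be distilled

temperature = 4.0            # softens the teacher's output distribution
alpha = 0.1                  # weight of the hard-label loss
optimizer = tf.keras.optimizers.Adam()
ce_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
kl_loss = tf.keras.losses.KLDivergence()

@tf.function
def distillation_step(images, labels):
    teacher_logits = teacher(images, training=False)
    with tf.GradientTape() as tape:
        student_logits = student(images, training=True)
        # Hard loss: student predictions vs. ground-truth labels.
        hard = ce_loss(labels, student_logits)
        # Soft loss: student vs. the teacher's temperature-softened probabilities,
        # scaled by T^2 as in standard knowledge distillation.
        soft = kl_loss(
            tf.nn.softmax(teacher_logits / temperature),
            tf.nn.softmax(student_logits / temperature),
        ) * temperature ** 2
        loss = alpha * hard + (1.0 - alpha) * soft
    grads = tape.gradient(loss, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
    return loss

In practice the teacher would be trained (or loaded) first, and distillation_step would then be called over mini-batches of the training data to fit the student.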

Highlighted References:
- Mirzadeh, S., Farajtabar, M., Liang, D., & Ghasemzadeh, H. (2020). Improved knowledge distillation via teacher assistant: Bridging the gap between student and teacher. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- Radosavovic, I., Kosaraju, R. P., Girshick, R., He, K., & Dollár, P. (2020). Designing network design spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- Malik, S. M., et al. (2021). Teacher-class network: A neural network compression mechanism. In Proceedings of the British Machine Vision Conference.


Prior Knowledge Expected: Previous knowledge expected

Hadi is a senior R&D machine learning engineer at the Deltatre group, where he is an integral member of the innovation lab and a fellow of the Sport Experiences unit, based in Czechia and Italy. With a solid academic background, Hadi is a former lecturer at the Institute for Advanced Studies in Basic Sciences (IASBS) in Iran and a former researcher at the Institute of Formal and Applied Linguistics (ÚFAL) at Charles University in Prague. Throughout his career, he has actively participated in numerous industrial projects, collaborating closely with renowned experts in the fields of CV/NLP/HLT/CL/ML/DL. His research focuses on multimodal learning with neural models that are linguistically motivated and tailored to language and vision, as well as visual reasoning and deep learning. His main research interests are Machine Learning, Deep Learning, Computer Vision, Multimodal Learning and Visual Reasoning, and he has experience with a wide variety of international projects on cutting-edge technologies.