PyData Amsterdam 2023

What the PDEP? An overview of some upcoming pandas changes
09-14, 10:30–11:00 (Europe/Amsterdam), Foo (main)

Last year, the pandas community adopted a new process for making significant changes to the library: the Pandas Enhancement Proposals, aka PDEPs. Since then, several proposals have been submitted and discussed, and some have already been accepted. This talk will give an overview of some of the behavioural changes you can expect as a pandas user.


Last year, the pandas community adopted a new process for making significant changes to the library: the Pandas Enhancement Proposals, aka PDEPs (similar to Python's PEPs and NumPy's NEPs). Since then, several proposals have been submitted and discussed, and some have already been accepted, shaping the pandas roadmap (https://pandas.pydata.org/about/roadmap.html).

The goal of this talk is to introduce you to this new process and to give an overview of a few of the proposed PDEPs. This way, you will learn about some of the behavioural changes you can expect as a pandas user in the near future.

Over its many years of development, pandas has accumulated (or kept since the early days) quite a few corner cases and inconsistencies. Some of the proposed PDEPs are an attempt to tackle those. For example, one accepted proposal is to ban any (up)casting in "setitem-like" operations, avoiding surprising data type changes. There is also a proposal to stop providing the inplace option for many methods, because, even though the name might imply otherwise, those operations are not actually done in-place. Another major change that is under way concerns the copy and view semantics of operations in pandas (related to the well-known, or hated, SettingWithCopyWarning). The new behaviour is already available as an experimental opt-in that you can test and use, and it will probably be a highlight of pandas 3.0.
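To make the copy and view change more concrete, here is a minimal sketch of the experimental opt-in, assuming a pandas 2.x installation where the `mode.copy_on_write` option is available. The DataFrame and the column names are made up for illustration, and the warning and error handling is only there as a precaution, on the assumption that later releases may treat the option as obsolete.

```python
import warnings

import pandas as pd

# Opt in to the new copy/view semantics (Copy-on-Write).
# Assumption: the "mode.copy_on_write" option exists in the installed version;
# later releases may warn about it or drop it once the behaviour is the default.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except pd.errors.OptionError:
        pass  # option no longer exists

# Illustrative data, not taken from the talk.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Select a column, then modify the selection.
subset = df["a"]
subset.iloc[0] = 100

# With Copy-on-Write enabled, only `subset` changes; `df` keeps its
# original values.
print(subset.tolist())
print(df["a"].tolist())
```

Without the opt-in, the same assignment on pandas 2.x can write through to `df`, which is the kind of surprise the new semantics are meant to remove.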


Prior Knowledge Expected

No previous knowledge expected

I am a core contributor to pandas and Apache Arrow, and a maintainer of GeoPandas. I did a PhD at Ghent University and VITO in air quality research, and worked at the Paris-Saclay Center for Data Science. Currently, I work at Voltron Data, contributing to Apache Arrow, and I am a freelance teacher of Python (pandas) at Ghent University.