Reflections on PyCon LT 2025 Data Day: Sessions on Static Code Analysis, Data Warehouses, Pipelines, and Data Science Tools

The author recounts attending PyCon LT 2025 Data Day, summarizing talks on building a simple static code analyzer with AST, challenges of data warehouses versus data lakes, cloud cost‑scraping pipelines, A/B testing libraries, privacy‑enhancing data processing, and tools like Panel and Dagster, while noting the inspiring presence of female speakers.

DevOps Engineer

Today was my second day at PyCon LT 2025: Data Day, which focused on dataframes, databases, and orchestration.

Although the topics are outside my usual expertise, I hoped to learn something valuable since the ticket included this day’s agenda.

The first talk, “Build Your Own (Simple) Static Code Analyzer,” was presented by a core developer of numpydoc and author of *Hands‑On Data Analysis with Pandas*. She explained how to use abstract syntax trees (ASTs) to build a custom static code analyzer, which also gave a good sense of how linters work under the hood.

She also shared her personal blog (https://stefaniemolin.com/), which I found worth following.
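To make the AST idea concrete for myself, here is a minimal sketch of such a checker. It is my own illustration, not code from the talk: the `MissingDocstringVisitor` class, the `find_missing_docstrings` helper, and the sample `SOURCE` string are all hypothetical names, and the only library used is Python's standard `ast` module. It walks the tree and reports every function or class that lacks a docstring.

```python
import ast

# Sample code to analyze (hypothetical; any Python source string would do).
SOURCE = '''
def documented():
    """Say hello."""
    return "hello"

def undocumented(x):
    return x * 2

class Thing:
    def method(self):
        pass
'''


class MissingDocstringVisitor(ast.NodeVisitor):
    """Collect (line number, name) for definitions without a docstring."""

    def __init__(self):
        self.problems = []

    def _check(self, node):
        # ast.get_docstring returns None when the body has no docstring.
        if ast.get_docstring(node) is None:
            self.problems.append((node.lineno, node.name))
        # Keep walking so nested functions and methods are checked too.
        self.generic_visit(node)

    # NodeVisitor dispatches on method names of the form visit_<NodeType>.
    visit_FunctionDef = _check
    visit_AsyncFunctionDef = _check
    visit_ClassDef = _check


def find_missing_docstrings(source):
    """Parse source text and return a sorted list of (lineno, name) findings."""
    visitor = MissingDocstringVisitor()
    visitor.visit(ast.parse(source))
    return sorted(visitor.problems)


if __name__ == "__main__":
    for lineno, name in find_missing_docstrings(SOURCE):
        print(f"line {lineno}: '{name}' has no docstring")
```

A real linter adds configuration, rule codes, and file discovery on top, but the core loop of parsing, visiting nodes, and recording findings looks roughly like this.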

The next session, “Data Warehouses Meet Data Lakes,” delivered by an Italian speaker, discussed the challenges data warehouses face and the technologies used to build data architectures. It emphasized that data collection and analysis need DevOps‑style pipelines.

The talk “Cutting the price of Scraping Cloud Costs” described techniques for building pipelines that calculate cloud pricing.

The final morning session, “cluster‑experiments: A Python library for end‑to‑end A/B testing workflows,” introduced a Python library for A/B testing, which felt somewhat promotional for the speaker’s open‑source project.

After the talks, I took a break for lunch and later returned for the afternoon sessions.

The first afternoon talk, “Accelerating privacy‑enhancing data processing,” covered challenges of processing cancer‑research data in the real world and showcased the speaker’s tech stack, even using LEGO bricks in the presentation.

The second talk, “Working for a Faster World: Accelerating Data Science with Less Resources,” highlighted a tool called Panel for data exploration and building web apps.

The next session, “Organize your data stack using Dagster,” introduced the open‑source data orchestration tool Dagster, which was well received and praised for its potential usefulness.

The final talk, “Top 5 Lessons from a Senior Data Scientist,” delivered by a freelance female data scientist, shared experience‑based advice rather than technical details, emphasizing lessons applicable to professionals.

Insights

I noticed that the female speakers tended to maintain more polished personal websites than their male counterparts.

All speakers were enthusiastic, thoughtful, and willing to share their knowledge, which is commendable.

The female presenters, who were technically strong, confident, and generous with their knowledge, especially resonated with me and gave me inspiration.

This concludes my “diary” of the second day at PyCon.

Looking forward to tomorrow’s AI and ML Day for more insights.

👉 PyCon LT 2025 · Day 1 Notes

References

[1] PyCon LT 2025: https://pretalx.com/pycon-lithuania-2025/schedule/

[2] numpydoc: https://github.com/numpy/numpydoc

[3] Panel: https://github.com/holoviz/panel

[4] Dagster: https://github.com/dagster-io/dagster

Written by

DevOps Engineer

DevOps engineer, Pythonista and FOSS contributor. Created cpp-linter, commit-check, etc.; contributed to PyPA.
