
Hi there 👋

Welcome to my blog! I’m Dan, Head of Data Science at The HEINEKEN Company, recognized as a 30 Under 30 Future Data & Analytics Leader. I’m passionate about working in complex systems to solve real-world challenges.
🧠 TIL

Never Call `spark.stop()` in Databricks Workflows

Never call `spark.stop()` in Databricks workflows. Explicitly call `sys.exit()`.
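
As a rough illustration of that advice, here is a minimal sketch of a job entry point that ends with an explicit `sys.exit()` and never touches `spark.stop()`. The helper names (`run_job`, `example_job`, `main`) and the 0/1 exit-code convention are assumptions made for this example, not taken from the post, and the job body is a stand-in that does not use Spark at all.

```python
import sys
from typing import Callable


def run_job(job: Callable[[], None]) -> int:
    """Run `job` and map the outcome to an exit code (0 = success, 1 = failure)."""
    try:
        job()
    except Exception as exc:
        # Report the failure and signal it through the exit code.
        print(f"job failed: {exc}", file=sys.stderr)
        return 1
    # Note: no spark.stop() here, following the advice above.
    return 0


def example_job() -> None:
    """Hypothetical stand-in for real work done with the Spark session."""
    rows = [1, 2, 3]
    print(f"processed {len(rows)} rows")


def main() -> None:
    # End the task explicitly with sys.exit() rather than spark.stop().
    sys.exit(run_job(example_job))
```

In a real job script, `main()` would be the last statement, so the process exits with the code that `run_job` returns.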

August 27, 2025 · 1 min · 82 words · Dan Wertheimer
🧠 TIL

Building Personal AI Copilots

How treating your LLM Tools like humans can help you build personal AI copilots.

August 4, 2025 · 5 min · 1020 words · Dan Wertheimer
🧠 TIL

Mypy Narrowing of Union[T, None]

Mypy only narrows Union[T, None] when it sees an explicit check for None or an assertion. This behaviour leads to type errors that may seem unintuitive to solve.
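
A small sketch of what this looks like in practice. The function names are made up for illustration, and the helper-function case is my own example of a check that mypy does not follow, not necessarily the one discussed in the post.

```python
from typing import Optional


def shout(name: Optional[str]) -> str:
    # Explicit None check: after this branch, mypy treats `name` as `str`.
    if name is None:
        return "ANONYMOUS"
    return name.upper()


def shout_or_fail(name: Optional[str]) -> str:
    # An assertion narrows the type in the same way.
    assert name is not None, "name must be provided"
    return name.upper()


def has_value(name: Optional[str]) -> bool:
    # A plain bool-returning helper; mypy does not narrow through it.
    return name is not None


def shout_via_helper(name: Optional[str]) -> str:
    if has_value(name):
        # Fine at runtime, but mypy still sees Optional[str] on the next
        # line and reports an error, because the None check is hidden
        # inside has_value().
        return name.upper()
    return "ANONYMOUS"
```

All three functions behave the same at runtime; only the first two pass type checking, which is where the seemingly unintuitive errors come from.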

August 1, 2025 · 2 min · 270 words · Dan Wertheimer
✍️ Post

Thoughts: Domain knowledge isn't just the business domain.

The “domain knowledge” part of data science doesn’t necessarily need to be business domain knowledge. It could be the problem domain. What do I mean? When you’ve encountered a lot of problems, you start to see that, at a fundamental level, these problems look like something you’ve seen before. For example, churn prediction and machine failure are, at a fundamental level, both part of the same problem domain: “make a prediction before something bad happens”. ...

August 16, 2022 · 1 min · 114 words · Daniel Wertheimer
✍️ Post

AI Ethics: Addressing Racial Biases in AI and Machine Learning

As AI practitioners, we must identify and tackle problematic biases in our data and models to ensure equitable technology that makes lives better.

September 28, 2020 · 4 min · 741 words · Dan Wertheimer
✍️ Post

A paper, a post and a paragraph - 2nd August 2019

I had an idea. There’s so much to learn, to read, to do and not enough time. I’m having to become a lot better at filtering the content I consume, and that’s why I’ve started this series of posts: A Paper, a Post and a Paragraph. The idea is that I aim to post about a paper I’ve read (academic or white paper), a post I liked and a paragraph related to some of my thoughts on current happenings in the topics I’m interested in. While my profession as a data scientist will guide a lot of the content, there may be a few odd new things here and there, like my recent interest in bread baking. ...

August 2, 2019 · 5 min · 886 words · Dan Wertheimer