notebooks are the McDonald's of code

You can go to McDonald's and order a salad, but you won't. Same with notebooks: you can write NASA-production-grade software in a notebook, but most likely you won't. Notebooks make you lazy and encourage bad practices.

common arguments

I first watched this talk by Joel Grus and laughed hard. I later saw the extensive use of notebooks everywhere, even in production, and I don't laugh anymore. I'm scared and sad. This section will have arguments mostly from that talk, but I might add a couple more here in the future.

state

State is like Jesus from The Big Lebowski: you don't fuck with Jesus. But state will for sure fuck you up. You execute a cell, later change a variable's value in it, save the colab and forget to re-execute the cell. Twenty minutes later you find a bug, hello there!
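Here is a minimal sketch of that failure mode in plain Python. The "cells" are ordinary functions sharing one dict that stands in for the kernel namespace. The names (`state`, `cell_1`, `cell_2`, `batch_size`) are made up for illustration and have nothing to do with how a real kernel is implemented.

```python
# The kernel namespace, shared by every "cell" (hypothetical stand-in).
state = {}


def cell_1():
    state["batch_size"] = 32


def cell_2():
    # Uses whatever value currently lives in the kernel.
    return state["batch_size"] * 2


# First pass: run the cells top to bottom.
cell_1()
cell_2()

# You edit cell 1 and save the notebook, but never re-execute it.
def cell_1():
    state["batch_size"] = 64


# Cell 2 still sees the old value, although the saved notebook says 64.
stale = cell_2()

# A fresh run (restart the kernel, run everything) gives a different answer.
state.clear()
cell_1()
fresh = cell_2()

print(stale, fresh)
```

The saved notebook and the live kernel disagree, and nothing on the screen tells you so. A script does not have this problem because every run starts from an empty state.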

versioning

We've seen it before. tfinal_final_final_of_final.xls is back. You want to play with a notebook, you copy it, change a couple of cells. Two weeks later, you have 25 different versions, which one do you need? Good luck with that!

bad habits

This is exactly the McDonald's metaphor. If people are given a chance to be lazy, they will be (I will be for sure). Quickly hacking some stuff without properly testing? Sure! Linting? Pfff, living on the edge, you'll throw the colab away in an hour anyways. Writing everything in one file? Of course! Scrolling is so fun!

notebooks slow you down

Here I try to make the case that using notebooks is bad for you personally.

distractions

I don't have an ADHD diagnosis, but I have 90% of the symptoms from the NHS website. I'm very easy to distract. And when I use colabs, I'm just one tab away from everything else. Wikipedia? Sure, let's open five more tabs. Gmail? Let's check the inbox! YouTube Music? Let's change the playlist. You got it, my tmux pane with neovim or any IDE of your choice is far less distracting.

execution environment

Notebooks are often used as a playground that gives you easy access to an accelerator. The execution environment there is often set up differently from your real one, with different dependencies. So you make your code work in a colab, smile widely and then run your experiment on a cluster or whatever. Twenty minutes later you start crying, because your code crashed due to a missing dependency, a version mismatch or something similar.
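One cheap guard is to check the environment at the top of the script, before the job burns twenty minutes. Below is a sketch using the standard library's `importlib.metadata`. The `check_pins` helper and the pinned versions are hypothetical; in a real project the pins would come from your lock file.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (wanted {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, wanted {wanted}")
    return problems


if __name__ == "__main__":
    # Hypothetical pins, for illustration only.
    PINS = {"numpy": "1.26.4"}
    for problem in check_pins(PINS):
        print(problem)
```

Run it on both the colab and the cluster and you see the mismatch before the experiment starts, not after it crashes.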

efficiency

Notebooks keep the state, so you have all the data at your fingertips and nothing encourages you to optimise your code. When you rerun scripts, you want them to be damn fast, so you think more about the efficiency of your code.

reading code

I find notebooks broken for moving around the codebase. In neovim (or PyCharm, or VS Code), you can easily go to the place where a function is defined and change it, it's just one hotkey away. You can easily look at all the places where the function is used. How do you do that in a notebook? Do you go to your IDE and search for it? What do you do if you change a function? Reload the whole thing? Autoreload can help, but now you have to remember to rerun the cells you need and potentially fuck up your state if you skip a cell.

notebooks slow your team down

Enough with personal reasons, let's think about the issues affecting the whole team.

breaking changes

You use some function in a colab. Another developer changes the function signature, and their IDE updates all other calls of this function in the code, but not in the notebooks! If the notebook is not used often (e.g. for leaderboarding), you are in for a treat. Apart from the frustration, this is also bad from the context switching and credit assignment perspective. Who is to fix this? You, who use the notebook? The developer who changed the function signature? Both options suck.

awareness

You don't usually check notebooks into your version control system (if you do, I'm sorry). They usually pile up either locally or on some cloud drive, so people are unaware of what's going on. When you check in your data analysis scripts or any modules, people can glance over PRs and have an idea of what's going on. Notebooks are like the dark matter of development (yes, I have almost zero knowledge of physics, and still think I can use this metaphor here).

fucking around -> production

Some people like notebooks because they make it easy to check an idea and move on. However, when an idea works out, they have a hard time moving the code to modules. Let's think about what you need to do. First, you need to move the code to modules; sometimes it's not just a single file, it can be multiple files across the codebase. Then you need to test it somehow. Personally, after I've moved the code, I'm already exhausted and sometimes bored. I know my code runs in a colab, why do I need to test it again? Often I end up not unit testing my code for this reason. But even if you don't test your code, you have to make sure that it runs and produces results similar to what you had in the notebook. This also takes time and energy.
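The "produces similar results" step can at least be made mechanical. One way is to dump the notebook's output to a file once, then compare the module's output against it with a tolerance. The sketch below uses only the standard library. `normalise` stands in for whatever you moved out of the notebook, and `save_reference` and `matches_reference` are hypothetical helper names.

```python
import json
import math
import tempfile
from pathlib import Path


def normalise(values):
    """Hypothetical function that was moved from the notebook into a module."""
    total = sum(values)
    return [v / total for v in values]


def save_reference(path, outputs):
    """Run once in the notebook to record the outputs you trust."""
    Path(path).write_text(json.dumps(outputs))


def matches_reference(path, outputs, rel_tol=1e-9):
    """Compare module outputs against the recorded ones, element by element."""
    reference = json.loads(Path(path).read_text())
    return len(reference) == len(outputs) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(reference, outputs)
    )


with tempfile.TemporaryDirectory() as tmp:
    ref = Path(tmp) / "reference.json"
    save_reference(ref, normalise([1, 2, 3, 4]))
    same = matches_reference(ref, normalise([1, 2, 3, 4]))
    changed = matches_reference(ref, normalise([1, 2, 3, 5]))

print(same, changed)
```

It is not a unit test, but it is a cheap tripwire that survives the move, and it is less boring to write than a proper test suite.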

sharing is caring

You have hundreds of notebooks with useful utils that are accessible only to you. If they were a library within your team codebase, everyone could use them and the whole team could avoid code duplication. But as discussed earlier, you are not encouraged to move this to a module because you are lazy.

FAQ

I'll try to respond to some common replies I get when I tell people I do not use notebooks. If your question is not here, let's chat on Twitter.

how do you do plotting?

I have small utils that unify how my plots look. When I want to iterate on a plot, I pickle the data for it and rerun the plotting script on each iteration.
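A rough sketch of that loop, assuming matplotlib as the plotting library (the workflow itself does not depend on it). All names here (`dump_plot_data`, `load_plot_data`, `plot`, the dict keys) are made up for illustration. The expensive job pickles the data once, and the plotting part only loads and draws, so rerunning it is cheap.

```python
import pickle
import tempfile
from pathlib import Path


def dump_plot_data(path, data):
    """Called once from the expensive experiment script."""
    with open(path, "wb") as f:
        pickle.dump(data, f)


def load_plot_data(path):
    """Called from the plotting script on every iteration."""
    with open(path, "rb") as f:
        return pickle.load(f)


def plot(data, out_path):
    # Imported lazily so dump/load work even where matplotlib is missing.
    import matplotlib
    matplotlib.use("Agg")  # no display needed, e.g. on a remote machine
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot(data["x"], data["y"])
    ax.set_xlabel(data["xlabel"])
    ax.set_ylabel(data["ylabel"])
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)


original = {"x": [0, 1, 2], "y": [0, 1, 4], "xlabel": "step", "ylabel": "loss"}
with tempfile.TemporaryDirectory() as tmp:
    pkl = Path(tmp) / "plot_data.pkl"
    dump_plot_data(pkl, original)
    restored = load_plot_data(pkl)
    # plot(restored, Path(tmp) / "loss.png")  # tweak plot() and rerun at will

print(restored == original)
```

Change the labels, colours or figure size in `plot`, rerun the script, look at the image, repeat. The data never has to be recomputed.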

how do you work with a remote machine?

sshfs works great for this if you need interactivity.

I am using code autoreload and write code in modules

Nice! I've done this for a while as well. This is a good use-case. However, this approach does not address some of the issues, e.g. data analysis scripts should be checked by another team member.

i considered you to be my friend, how could you do this to me?

Hi Lucas, I'm not judging you. We can still be friends. But we can be better friends if you stop using notebooks.

some of your points are valid, but why stop using notebooks completely?

I don't think I'm losing much. I'm also constantly exploring other options and having fun.