Common Computational Pitfalls and How to Avoid Them
Doing science has never been easy. Thankfully, it has gotten much easier in the last hundred years. Think about Rutherford’s students, who had to sit in a pitch-black cellar counting light flashes! We can just sit in our office (nowadays probably a home office) and run some simulations on a computer. While convenient, this comes with its own set of challenges. Let me share a couple of examples from my own life to illustrate some of them.
One of the first times I realized that “using a computer as a primary tool for work might not be as great as it seems” was when I was working on my BSc. I was running simulations of pedestrians. Each simulation was run with different agent parameters, and the outputs were saved to a file. Another script read the data from all the files and performed some analysis. I ran it all on my MacBook and, unfortunately, it broke down (I guess from overheating). Fortunately, the hard drive survived, but I needed another machine to analyze the data. The only one I had was a Raspberry Pi running Ubuntu. I copied my Python scripts there, installed all the libraries, and ran the analysis. Imagine my surprise when I discovered that the results looked like absolute garbage! Fast forward two days of debugging, and it turned out that a function I was using for sorting filenames worked differently on Ubuntu and OS X.
Lesson learned 1: The OS you’re using for running your code matters.
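One way to guard against this kind of surprise is to never rely on the order in which the operating system hands back files, and to sort them explicitly instead. The snippet below is only a minimal sketch: the `run_<n>.csv` naming scheme and the helper names `natural_key` and `list_result_files` are hypothetical, not the ones from my original scripts.

```python
import re
from pathlib import Path


def natural_key(path):
    """Build a sort key so that 'run_10' comes after 'run_2'."""
    name = Path(path).name
    # re.split with a capturing group alternates text and digit chunks:
    # even positions hold text, odd positions hold runs of digits.
    parts = re.split(r"(\d+)", name)
    chunks = [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]
    # The raw name breaks ties between names that differ only in letter case.
    return (chunks, name)


def list_result_files(directory, pattern="*.csv"):
    """Return matching files in an explicit, platform-independent order."""
    # Path.glob makes no promise about ordering, so we sort ourselves.
    return sorted(Path(directory).glob(pattern), key=natural_key)
```

With files named `run_1.csv`, `run_2.csv` and `run_10.csv` in a directory, `list_result_files` returns them in exactly that order on any OS, because the order comes from the key and not from the file system.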
On another occasion, I was doing research for my MSc thesis. It was a similar setup – many simulations, many different combinations of parameters, a couple of hours for each. It was hard to keep track of all that, but obviously, I had a good system for doing it. I stored each result in a separate file. The name of the file indicated the set of parameters I used. There was no room for confusion!
Unfortunately, it turned out there was. I started a simulation, then discovered a bug, so I terminated the run and fixed the bug. I started it again but quickly realized that I should introduce another parameter. I added it and, in the meantime, also fixed how the optimizer worked. After a week, I had no idea which run was done with which version of the code, and I had to repeat hundreds of hours of simulations.
Lesson learned 2: You have to be meticulous about keeping track of your experiments. (I also became acquainted with git, my new best friend for life, who now helps with this task.)
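As a rough illustration of what "keeping track" can look like in practice, the sketch below stores the parameters of a run next to the git commit that produced it. It assumes the code lives in a git repository and that `git` is on the path; the helper names and the `metadata.json` file name are made up for this example.

```python
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def _git(*args):
    """Run a git command and return its output, or None if git is unavailable."""
    try:
        result = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None


def save_run_metadata(output_dir, parameters):
    """Write the run parameters and the code version to metadata.json."""
    status = _git("status", "--porcelain")
    metadata = {
        "parameters": parameters,
        # Hash of the commit that is currently checked out.
        "commit": _git("rev-parse", "HEAD") or "unknown",
        # True when there are uncommitted changes, None when git is unavailable.
        "uncommitted_changes": None if status is None else bool(status),
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    path = Path(output_dir) / "metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True))
    return path
```

Calling `save_run_metadata("results/run_001", {"alpha": 0.5})` at the start of every simulation would have told me, a week later, which version of the code produced which file.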
One or two years later, when I was working at Estimote, our team faced a problem that we were not able to solve alone. We needed to work together with another team to make use of their expertise. They had previously performed tests similar to ours, so we wanted to see how our methods worked with their data and vice versa. Unfortunately, it turned out that it wasn’t easy. We used different data formats, conventions, and tools to do similar things, and it wasn’t immediately obvious how to translate between these two worlds. Of course, we finally succeeded, learned a lot in the process, and made our code and processes much more compatible, but it took us more time than we had expected in the beginning.
Lesson learned 3: It’s important to operate within a unified framework. It requires some overhead, but if your project is bigger than you or your team, it will pay off.
I could tell you more failure stories like this, and if you work in the computational sciences business, you likely have plenty of your own. Let’s take a moment to summarize some common challenges, which, by the way, are magnified when doing quantum computations:
Here are some good practices I’ve learned over the years to make my life easier – sometimes slightly, sometimes significantly. The list is long; these are just some examples.
It all started on a warm summer day in June 2019. I came to Cambridge for a job interview at Zapata. I had a whole day of conversations, learned a lot about what people do at Zapata, and about the product they were building – a computational platform that we now call Orquestra. At the end of the day, tired and jetlagged, I had a meeting with Peter Johnson, one of the founders, and we decided to take a stroll around Cambridge. He asked me what I thought of the platform and the idea of working to build it. I had fallen in love with the idea of the product, but the opportunity to build a tool I had always dreamt of was pure bliss!
Okay, so what is Orquestra and why am I so enthusiastic about it?
Orquestra is a workflow management system designed for performing computations utilizing quantum computers (though it works equally well with classical and classical-quantum computations).
At the top, there are workflows. A workflow is a representation of an algorithm – it defines what steps need to be performed, in what order, and the relationship between them. (A workflow is not limited to a single algorithm, though; it can also chain multiple algorithms together.)
The next layer is tasks: the specific steps in the workflow. For each task, we define the inputs, how we want to process them, and what the output should be.
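To make the two layers a bit more concrete, here is a toy model of the idea in plain Python. This is not Orquestra's API or workflow format; the `Task` class and `run_workflow` function are invented purely to illustrate "tasks with inputs and outputs, arranged into a workflow".

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    name: str
    function: Callable[..., object]
    # Names of the tasks whose outputs are passed to this task, in order.
    inputs: List[str] = field(default_factory=list)


def run_workflow(tasks):
    """Run every task after the tasks it depends on; return outputs by name."""
    by_name = {task.name: task for task in tasks}
    outputs = {}

    def run(name, in_progress=()):
        if name in outputs:
            return outputs[name]
        if name in in_progress:
            raise ValueError(f"cyclic dependency involving {name!r}")
        task = by_name[name]
        # Resolve the dependencies first, then run the task itself.
        args = [run(dep, in_progress + (name,)) for dep in task.inputs]
        outputs[name] = task.function(*args)
        return outputs[name]

    for task in tasks:
        run(task.name)
    return outputs


workflow = [
    Task("graph", lambda: [(0, 1), (1, 2), (0, 2)]),
    Task("count_edges", len, inputs=["graph"]),
    Task("report", lambda n: f"{n} edges", inputs=["count_edges"]),
]
outputs = run_workflow(workflow)
```

Here `outputs["report"]` is `"3 edges"`. The point is only that once the steps and their relationships are written down explicitly, the order of execution follows from the description rather than from whoever happens to run the scripts.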
I know it doesn’t sound particularly exciting, so I’ll show you where the magic (at least part of it) lies. Here is how we run a workflow for solving MaxCut with QAOA:
(It’s a small thing, but I just love to watch how it unfolds.)
Here is how we run a workflow for solving MaxCut with QAOA for three graphs, each using three different optimization algorithms.
A couple of cool facts about these examples:
I could go on, but you get the idea. We’ve been using prototypes of Orquestra internally for well over a year to do our research and customer work, and it definitely speeds up our work and saves us some major headaches.
Is Orquestra a silver bullet that will seamlessly solve all of your problems? Is it the right tool for any kind of project? Does Orquestra work flawlessly 100% of the time? Of course not. But we work hard – mostly by day and sometimes by night – to make the answers to these questions as close to “yes” as we can, so that one day, scientists will look back at the software tools we’re using today and think of us with the same pity we feel today for Rutherford’s students.
Yours forever in computational struggles,
Michal