The Halting Problem Problem
*Are the pod bay doors open or not, HAL?*
My experiment with Claude Code is going well. After about 20 hours' worth of "coding," I've reached this point (if you're reading this in the far future, there's no guarantee this link will still work, but it should be available for the foreseeable one). Depending on when you access it, you might see more functionality bolted on, but, for now, it's able to successfully pull items from my Square sandbox, display them on the requisite category pages, search for items, and add items to a shopping cart. I've made a few tweaks here and there and simplified the duplicated menu structure that led to one bug, but otherwise, I've just let the machine do its thing.
However, watching it fix an issue with the search results (on the desktop view, the results landed below and to the left of the site header, resulting in a very ugly experience), I found myself ruminating on one of the inviolate constraints of computer science - the Halting Problem.
The Halting Problem is one of those problems that I can understand anew after reviewing the proof for a solid 10 minutes, but can never keep the details straight (like the Monty Hall Problem - maybe all problems containing "Hal" are simply designed to be difficult). The concept, though, is straightforward: it is impossible to write a general procedure that can determine, for every program and input, whether that program will eventually halt.
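The classic proof is a short self-referential trick, and it's easier to keep straight as code than as prose. Here's a minimal sketch in Python (the function names are my own; the oracle is deliberately a stub, since the whole point is that it can't be implemented):

```python
def halts(program, arg):
    """Hypothetical oracle: True if program(arg) eventually halts.
    The proof shows no such general oracle can exist, so this stub
    just raises."""
    raise NotImplementedError("no general halting oracle can exist")

def paradox(program):
    # Ask the oracle about the program run on its own source,
    # then do the opposite of whatever it predicts.
    if halts(program, program):
        while True:   # oracle said "halts", so loop forever
            pass
    # oracle said "loops forever", so halt immediately

# Feeding paradox to itself forces the contradiction: if
# halts(paradox, paradox) answered True, paradox(paradox) would loop
# forever; if it answered False, paradox(paradox) would halt. Either
# answer is wrong, so `halts` cannot be written.
```

The self-reference in `paradox(paradox)` is what makes the details so slippery to remember, but the shape of the argument is just "do the opposite of whatever the oracle predicts."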
What does this mean? Essentially, computers are bad at debugging. And though LLMs are a special class of non-deterministic computer, they're still computers, which means that you can't always count on them to reliably write a feature or debug an issue.
While we can make the same claims about humans, humans have a built-in failsafe mechanism - impatience. If a website doesn't return results in a minute (often far less), we assume it's broken. If we're in charge of the website, we'll spend time troubleshooting the issue, and, somewhere north of 99.999% of the time, we'll find that there is indeed a problem, even if we don't know what it is.
A computer, with no other instructions, will simply sit there until an external factor makes it fail (a hardware failure, a reboot, the end of time...). And this, in another form, was what Claude Code experienced while debugging my search results issue.
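The human impatience failsafe has a crude software analogue: a deadline. Here's a minimal sketch in Python of wrapping a computation in a timeout (the names are my own illustration, not anything from Claude Code; note the limitation in the comments - it sidesteps rather than solves the Halting Problem):

```python
import concurrent.futures
import time

def run_with_patience(task, timeout_seconds):
    """Run `task`, but give up after `timeout_seconds` - a crude
    version of the human impatience failsafe."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(task)
        try:
            return future.result(timeout=timeout_seconds)
        except concurrent.futures.TimeoutError:
            # Deadline passed; assume something is broken. Caveat: a
            # Python thread can't be killed, so a truly infinite loop
            # would still hang at shutdown - a real watchdog needs a
            # separate process it can terminate.
            return "gave up: probably broken"

def slow_task():
    time.sleep(0.3)  # stands in for a hung computation (but finite,
                     # so the example itself halts)
    return "done"
```

Calling `run_with_patience(slow_task, 0.1)` returns the gave-up message rather than waiting forever - which is exactly the judgment call ("this is taking too long, something must be wrong") that the model wasn't making on its own.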
During its "thinking" phase, I could see a veritable word salad of it flitting back and forth between a couple of different strategies without any external stimuli to give it feedback. Humans could simply make one of the changes, confirm whether or not it worked, and move on. [NB: We are subject to analysis paralysis. One of my biggest pet peeves in software engineering is the amount of time spent planning for eventualities before experimenting. The beauty of software - with notable exceptions that deal with life or livelihood - is that the barrier to experimentation is low. It's often sufficient to write some code, test its functionality, and revert if it doesn't work. Yes, I'm simplifying the scope of software engineering - particularly that large, complex systems require more rigorous examination. But, for individual experiments, the stakes are often very low.]
I've read recently from a few different sources that our consciousness may derive from our combination of a brain and our presence in the physical world. A dystopian novel I read a couple of years ago - Fall; or, Dodge in Hell - made a similar argument. If you want to see a very compelling version of how our AI future can unfold, this is a great novel to read (and it came out 4 years before ChatGPT became a household name, so it wasn't piggybacking off the hype). It's not a happy outcome, but it does seem like a realistic possibility.
As I grow more comfortable with Claude Code, I am impressed with its code generation (and, in most cases, its debugging skills), even though it doesn't always look for the most efficient solution. But it still makes the same dumb mistakes that all LLMs do - it invents solutions, it only looks at the most obvious implementations, and it can force itself into a loop. It doesn't really have the presence of mind to determine if its solution is correct or if it should stop, go for a walk, and let the problem lie fallow while its brain ruminates on when we will inevitably reach peak K-Pop.
It's still too early to make a more substantive claim, but I think Claude Code may actually increase my productivity 2 or 3 times vs. the 20% increase I claimed using AI for autocompletion. But, it still requires a lot of babysitting. It's still wrong enough that you can't just let it vibe code through anything non-trivial.
And perhaps the greater danger is that, even though I spend time reviewing the code it produces, I'm writing far fewer lines than before. Maybe this will lead to a lesser understanding of the system over time, and maybe it won't, but it's a real risk as we rely more on these systems to build the software our livelihoods and lives depend on without understanding what they're producing. You wouldn't trust a bridge just made of "stuff," would you?
Until next time, my human and robot friends.