There is a joke somewhere about opinions and assholes, and it seems to me that writing about AI and its impact on everything keeps proving it true. Yet here I am, adding more to the pile. In my defense, I am writing this mostly to settle my own thinking and put down my current opinions for reference, so that when I am wrong I have something to look back on later.
There have been rumblings about the power of LLMs and “AI” for quite a while, but for me it became real after GitHub released Copilot in 2021. This was the point where, as a software engineer, the technology could directly affect my livelihood, and when lots of people started opining on whether LLMs would change the world forever, be a flash in the pan, or maybe end the world.
As a geek, I was pretty curious. I made some early attempts at running AI models locally on what was, at the time, a beefy GPU. I was not impressed. It was very difficult to get anything running, the results were mediocre at best, and judging by the sound of the fans alone, it was hugely wasteful. The biggest use case seemed to be making bad pictures with too many fingers. I dismissed LLMs and “AI” as a crutch that was decades, if not longer, away from being anything useful. I think this was partly my pride as a staff engineer, partly my disdain for silver bullets that promise to make everyone a successful software engineer, and partly my limited experience with LLMs at that point and their propensity for making shit up (aka hallucinating).
Then in 2023 I had an idea that started to change my mind. I still firmly believed that LLMs’ hallucinatory behavior was bad, but there was a crossover with my role as a dungeon master for a few Dungeons and Dragons campaigns, where hallucinating could be a good thing. So I tried something:
Write a damage liability waiver form for a dungeon where fantasy adventurers must fight undead creatures for practice. Include standard liability legalese. The name of the dungeon is The Dungeon of Darkness and the proprietor is Berala Evenfall.
Thus began my tentative use of LLMs. There is a lot of reasonable debate about using LLMs for generating creative content, and I am uncomfortable that companies have shamelessly used content without compensating the original authors to build billion-dollar businesses. But for expanding my players’ enjoyment in a casual setting, I found LLMs very useful. From writing deeper backstories for minor NPCs to fleshing out my worldbuilding, they helped me have more fun playing D&D, and I think that was a good thing. I confined LLMs to making shit up in an arena where making shit up quickly was their strong point.
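If you want to script this kind of flavor-text generation yourself, it only takes a few lines. Here is a minimal sketch using the OpenAI Python SDK; the model name is a placeholder, and it assumes an API key in your environment:

```python
from openai import OpenAI

# Reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

prompt = (
    "Write a damage liability waiver form for a dungeon where fantasy "
    "adventurers must fight undead creatures for practice. Include "
    "standard liability legalese. The name of the dungeon is The Dungeon "
    "of Darkness and the proprietor is Berala Evenfall."
)

# Ask a hosted model for the flavor text. The model name is a placeholder;
# swap in whichever model your account has access to.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```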
Tangent: I do ponder at times that there is a strange parallel between AI training and human learning. If I think of the human brain, what do we do? We are pattern-matching engines. We take new situations, compare them to all of the experiences we have had in life to that point, and try to figure out the most likely response to achieve the outcome we want. That doesn’t feel all that different to me from what is being done with LLMs. Even with humans, there is a fine line between being “inspired” by the content I have freely been exposed to in life and plagiarizing it. Back to the original story.
Sometime later in 2023 I got access to GitHub Copilot. And as much as I hated to say it at the time, I was grudgingly impressed. I did not see it as any kind of existential threat to human developers, but it was definitely a useful tool. It often produced a good first draft of what I was working on; good enough that I could fix what it had done more quickly than I could have typed it from scratch myself. It also helped me document what I had just written, commenting things more completely in a way that would be helpful to future developers. However, I found that it had an interesting bimodal distribution of results. Sometimes it would pre-generate a brilliant implementation of what I had barely started to do myself, while at other times it would stubbornly produce absolute garbage over and over. It made me realize that my mental model for how these tools can fail is nothing like how I think about code written by other humans.
That realization is, I think, what concerns me most as LLM and AI tools advance. For the whole of human history, when interacting with the people around us, we have had to make judgments about the veracity of what we are seeing, what they are doing, and whether we should trust them. And this is even before we take into account potential malice or intentional lying. The user experience of AI technology is to create tools that we interact with much like we interact with other humans. However, the failure modes of LLMs do not mirror those of people. For example, we expect a reasonable human to be able to say “I don’t know” when they are not confident in their answer (and we distrust people who lack that ability). AI tools? Not so much. See using glue to keep the cheese on your pizza. We have been conditioned (incorrectly, I might add) by pre-LLM software to believe that computers do not make mistakes. This creates a huge risk: our brains are just not set up to determine how much we can trust the information we get from these tools.
My current use of AI tools is sporadic. Copilot is part of my development workflow; sometimes it is useful, and other times I go days without touching it. I have also started dabbling with running models locally again. The tooling has gotten far better (I use ollama and Enchanted), and being able to run the models on my MacBook Pro has significantly improved the experience. There is something magical about running a query against what is essentially a distillation of the whole internet, locally and offline, from your laptop. It is a nice backup. And being able to run multiple models side by side to compare them lets me find the right tool for the job, analogous to consulting different human experts. I also feel that I am learning where these tools fall apart and where I need to be wary.
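To make that side-by-side comparison concrete, here is a minimal sketch against ollama’s local REST API. It assumes the ollama server is running on its default port and that the models have already been pulled; the model names are placeholders for whatever `ollama list` shows on your machine:

```python
import requests

# Models to compare; placeholders for whatever you have pulled locally.
MODELS = ["llama3.2", "mistral"]
QUESTION = "Explain the tradeoffs of running LLMs locally versus in the cloud."

for model in MODELS:
    # ollama's generate endpoint listens on localhost:11434 by default.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": QUESTION, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    print(f"=== {model} ===\n{resp.json()['response']}\n")
```

Asking each model the same question and reading the answers next to each other surfaces their different strengths (and failure modes) much faster than switching between them one at a time.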
So where does that leave me in early 2025? My biggest error early on was forgetting how quickly technology can move. I feel like I have been watching the technology improve faster and faster right before my eyes. There is a quote I heard on the Stratechery podcast (not original, I am sure) from Ben Thompson about progress: that we can go “decades with years of progress and then years with decades of progress.” I feel like we are in the second situation. How long will it last? No idea. But I don’t think we are done seeing rapid growth.