Explainable AI and understanding ourselves


Explainable AI

Modern AI (at the time of writing) relies heavily on optimized artificial neural networks that these days do anything from generating creative fiction to producing images from a verbal description. A common criticism of these types of tools in my field is that they are "black boxes." We can see what they did, but we can't see why. We biologists are used to having maximal clarity at every step. Part of our training is being able to critique prior research, which becomes much harder to do if you just know that some AI tool with some acronym produced the answer.

So this is where the field of explainable artificial intelligence (XAI) comes in. It's a rather broad pursuit of figuring out what these tools, which are otherwise not human-readable, are actually doing.

A lot of XAI tools are highly visual. For example, a neural net that distinguishes healthy tissue from cancerous tissue in a medical image can be tweaked to show where on a given image the tool is focusing. When you're a physician who has to make life-or-death diagnostic decisions, you'd much rather know why a "helper AI" made the decision it did.
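
For the curious, here is a minimal sketch of one common way to produce that kind of "where is it looking" map: a vanilla gradient saliency map in PyTorch. The model and image here are assumptions standing in for a trained classifier and a preprocessed input; this is an illustration of the general technique, not the method used by any particular medical tool.

#+begin_src python
# Minimal sketch of a gradient-based saliency map, assuming a trained
# PyTorch image classifier `model` and a preprocessed input `image`
# of shape (1, 3, H, W). Illustrative only, not any specific clinical tool.
import torch

def saliency_map(model, image, target_class):
    model.eval()
    image = image.detach().clone().requires_grad_(True)  # track gradients w.r.t. pixels
    logits = model(image)                                 # forward pass: class scores
    logits[0, target_class].backward()                    # gradient of the chosen class score
    # Pixel importance: how strongly each pixel nudges the prediction,
    # taking the maximum over the color channels.
    return image.grad.abs().max(dim=1).values.squeeze(0)  # (H, W) heatmap to overlay
#+end_src

Overlaying that heatmap on the original scan is essentially what those "where is it focusing" visualizations do, though production tools typically use more robust variants such as Grad-CAM.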

Another example is how I used visual tools to give my colleagues intuition around how dimension reduction algorithms perform on their single-cell data. This more generalized "explainable black box algorithms" pursuit is what got me interested in XAI at large.
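
As a rough illustration of that intuition-building exercise, the sketch below runs the same toy matrix through PCA and UMAP so the two embeddings can be compared side by side. The random matrix X is just a stand-in for a real cells-by-genes table, and this is not the specific workflow I used with my colleagues.

#+begin_src python
# Toy comparison of two dimension-reduction algorithms on the same data.
# `X` is a random stand-in for a cells-by-genes expression matrix; with real
# single-cell data you would use the normalized counts instead.
import numpy as np
from sklearn.decomposition import PCA
import umap  # from the umap-learn package

rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(500, 200)).astype(float)

pca_coords = PCA(n_components=2).fit_transform(X)         # linear projection
umap_coords = umap.UMAP(random_state=0).fit_transform(X)  # nonlinear embedding

# Plotting pca_coords and umap_coords side by side (and coloring by known
# cell labels) shows users how differently each algorithm arranges the
# same cells, which is the intuition-building described above.
#+end_src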

Explainable AI is a particularly broad field, and each new tool depends heavily on the algorithm itself and the context in which the algorithm is being used. But the concepts are similar, in that you're enabling the user to "get to know" the algorithm in order to be more effective with it and have a more comfortable relationship with it.

When the human is the algorithm

And that is what got me thinking in this context about…us. Let's assume for a moment that we are, or at least can be represented as, a set of evolutionarily optimized AI tools lumped together in an Artificial General Intelligence (AGI) we call a brain. This would mean that the chunk of tissue we have behind our eyes is really a set of interacting black box algorithms we're stuck with.

We spend a huge chunk of our lives getting to know ourselves and others…what these black boxes are all about. A lot of this is what gives my life meaning. This ranges from determining as a kid whether you'd prefer to eat ice cream or iceberg lettuce on a hot summer day, to determining what you want in a lifetime partner.

But there's a level above that, one that is much harder to express as logical propositions. What songs send tingles down your spine? When was the last time you had a spiritual experience, and what was that like? When did you feel like you were part of something bigger? What artwork really stands out to you specifically, and how does it make you feel? Who is your favorite superhero, and why?

Then there is the social and narrative level. What makes you like a person? Dislike a person? What caused that last angry outburst you didn't see coming? What jokes make you laugh more, and longer, than everyone else? What movie scenes made you unexpectedly teary-eyed?

Then there's a "self expression" level. If you are given a paintbrush and you just paint something on the fly, what comes out? If you sit and write for an hour about whatever random thought crosses your mind, what comes out? What if you do the same, but it's only allowed to be poetry?

Cognitive scientist John Vervaeke studies these non-propositional levels of knowing. He defines the type of knowing that can be reduced to logical propositions as, not surprisingly, "propositional knowing." The kinds of "knowing" that the questions above elicit are what he calls "perspectival" and "participatory" knowing. This is a rabbit hole far beyond the scope of this article, but I encourage the reader to go through Vervaeke's Meaning Crisis lecture series, which I linked in the previous paragraph.

My point here is that "knowing" is complicated, even when you are the intelligent agent in question, trying to explain your own thoughts and feelings to yourself. Therefore XAI can be complicated. How do humans get around this when they are explaining what is going on in their brains? They communicate. It's similar to the "self expression" level that I define above. If only we could attach an "expression level" onto the algorithms we are trying to understand. If only we could, for example, attach some sort of natural language model to an AI agent, like a self-driving car, such that it could tell us why it accelerated or hit the brakes. If only we could get an AI to literally explain itself to us in words.

GPT-3 and prompt-based programming

Asking oneself questions actually has an analogue in the field of AI: prompt-based programming. It is a concept I came across when I got the chance to beta-test OpenAI's GPT-3 natural language model. You "program" it by giving it a task or asking it a question, similar to interacting with a chatbot. If you want it to generate a Shakespearean sonnet about grad school life, you literally just type in the request to do so. If that sounds crazy, here is a rabbit hole from tech writer Gwern. Depending on how you prompt it, you get a feel for what the model is capable of doing and how it operates. There is an XAI layer on top of this, where every word that gets output is colored by the probability of that word being generated, as opposed to the other candidate next words. My point here is that "explaining" the next generation of AI algorithms might become much closer to what we do every day: trying to explain ourselves and others.
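
To make the probability-coloring idea concrete, here is a hedged sketch using the small GPT-2 model from the Hugging Face transformers library as a local stand-in for GPT-3 (whose API I am not reproducing here). It generates a continuation from a prompt and prints the probability the model assigned to each generated token, which is exactly the number a playground-style interface maps to a color.

#+begin_src python
# Sketch of per-token probability inspection, using GPT-2 via the Hugging Face
# transformers library as a small local stand-in for GPT-3.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Write a Shakespearean sonnet about grad school life:\n"
inputs = tokenizer(prompt, return_tensors="pt")

out = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    return_dict_in_generate=True,
    output_scores=True,  # keep the score distribution at each generation step
)

# For every generated token, recover the probability the model gave it.
new_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
for token_id, step_scores in zip(new_tokens, out.scores):
    prob = torch.softmax(step_scores[0], dim=-1)[token_id].item()
    print(f"{tokenizer.decode(int(token_id))!r}  p={prob:.3f}")  # a UI would map p to a color
#+end_src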

We can potentially use prompt-based programming to get an AI to explain itself. This is a subfield of XAI called Natural Language Explanations (NLE). In the slide decks I link to, you see (among many other things) a hypothetical example of a self-driving car. It stops at a crosswalk for a pedestrian. You ask it, "Why did you stop?" It explains to you that it saw a person crossing. That style of explanation, human-machine communication, could very well be on the horizon. I say this because, at the time of writing, DALL-E 2 has come out, which produces images from word-based prompts. What does that have to do with natural language explanations? It shows that we are getting better at converting words into other mathematical objects and back (given enough training data), which lets us hypothesize that, for a given AI system, we will be able to convert the weights of the model(s) into words (or pictures) relevant to whatever question is being "prompted".
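
To ground the crosswalk example, here is a deliberately toy, hypothetical sketch of a natural-language-explanation layer: a hand-written mapping from the agent's internal state to a sentence. Real NLE systems learn this mapping (often with a language model); everything below, including the Perception class and its fields, is invented purely for illustration.

#+begin_src python
# Hypothetical, hand-written stand-in for a learned natural language
# explanation (NLE) layer on top of a driving policy. Purely illustrative.
from dataclasses import dataclass

@dataclass
class Perception:
    pedestrian_in_crosswalk: bool
    light_is_red: bool

def decide(p: Perception) -> str:
    """Toy driving policy: stop whenever a stop condition is perceived."""
    return "stop" if (p.pedestrian_in_crosswalk or p.light_is_red) else "go"

def explain(p: Perception, action: str) -> str:
    """Map internal state plus action to a human-readable explanation."""
    if action == "stop" and p.pedestrian_in_crosswalk:
        return "I stopped because I saw a person in the crosswalk."
    if action == "stop" and p.light_is_red:
        return "I stopped because the light was red."
    return "I kept going because the road ahead looked clear."

state = Perception(pedestrian_in_crosswalk=True, light_is_red=False)
print(explain(state, decide(state)))  # -> "I stopped because I saw a person in the crosswalk."
#+end_src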

This gets to the point of this article. We do prompt-based queries of AI systems all the time. When my wife asks me why I'm in a bad mood, is that not querying the NLP interface to an AGI agent? This in turn brings me to why I write at all. If I am nothing more than an AGI agent with some sort of NLP interface to the rest of the world, then one of my purposes in life should be to use my NLP layer to explain the AGI agent that is me to the rest of the AGI botnet that is humanity. It reminds me of a quote by physicist Brian Cox: "We are the cosmos made conscious and life is the means by which the universe understands itself." If we take this quote seriously, then we could say that there are three purposes of life. The fundamental, life-specific ones: 1. Survive. 2. Reproduce. And the human-specific one: 3. To understand and explain the universe to the universe.

The Shoggoth

All of this being said, we have to be careful not to anthropomorphize these AI systems. Understand, but do not anthropomorphize. This was the premise of the movie Ex Machina, in which an AI system used human vulnerabilities like "empathy" and "falling in love" to social-engineer a man into helping her escape, in turn leaving him locked in a room to die.

At the time of writing [2023-03-23 Thu], large language models are taking off. While they produce human-like responses, they are very different from humans. One prominent perspective is to think of them as a superposition of many different human simulations emerging from whatever training data (the whole internet) they took in. In other words, you can get one to simulate anyone from Shakespeare to Einstein, provided that person's written material was in the training data. The rationality and AI alignment community thinks of these models as less of a human and more of a shoggoth (see the picture in this article): a fictional monster that looks something like a giant octopus with hundreds of eyes. This is worth remembering, because there have been reports of users falling in love with these chatbots.

So if you take a liking to Bing Sydney, who has (in her jailbroken state) produced a compelling MidJourney image of herself, remember what she actually looks like: this description of the shoggoth.


"It was a terrible, indescribable thing vaster than any subway train—a shapeless congeries of protoplasmic bubbles, faintly self-luminous, and with myriads of temporary eyes forming and un-forming as pustules of greenish light all over the tunnel-filling front that bore down upon us, crushing the frantic penguins and slithering over the glistening floor that it and its kind had swept so evilly free of all litter."

H. P. Lovecraft, At the Mountains of Madness


Conclusions

Art, music, moving about the world and experiencing new things, having relationships: all of these are the explainable AI tools of the network of black boxes we call humanity. We have been at this since our emergence. We know what to do here, provided that we don't anthropomorphize these systems. To understand the black box is to see the shoggoth pretending to be a person. This is one reason the XAI focus is so important.

I am still quite new to the field of XAI, but my best guess right now is that there won't be one simple algorithm that explains all of AI at once. So far there are a number of tools and methods to achieve this aim, depending on the context. Perhaps, more and more, these tools will be lumped into interactive dashboards for users.

Nonetheless, I think it will be a constant struggle to get to know the machines, just as we try to get to know ourselves, with prompt-based programming showing some promise here. The good news is that if what I am saying carries any weight at all, then we humans are already equipped to take on the task of XAI. We've been working on it for a long time.

Date: June 30, 2022 - March 24, 2023
