Fight complexity with complexity
The Sage sees without looking, finds without searching, and arrives without going anywhere.
Lao Tzu, Tao Te Ching
When I joined my thesis lab in 2012, we were working with a new kind of "big data" from a new kind of single-cell technology. The data were not human-readable. Think of an Excel sheet of 50 columns by 100,000+ rows. As such, many methods were developed around subsetting and visualizing the data, so humans could actually read it. One class of these methods is non-linear dimensionality reduction, like t-SNE and UMAP. These are black-box algorithms. Their output can give us high-level intuition of what's there. But it's hard to tell how accurate they actually are, as opposed to how accurate we assume they are, given how nice they look. The way I approached this problem was to use the algorithm itself to explain the algorithm's accuracy, by coloring and labeling these maps by their own performance. I was using a black-box algorithm to explain a black-box algorithm. I was using complexity (t-SNE and UMAP) to interpret complexity (t-SNE and UMAP).
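To make this concrete, here's a minimal sketch of the flavor of that approach (a toy reconstruction, not my original pipeline): embed a matrix of single-cell-style measurements with UMAP, score each cell by how well its high-dimensional neighborhood survives in the 2-D map, and color the map by that score. It assumes the umap-learn, scikit-learn, numpy, and matplotlib packages, and uses random data as a stand-in for a real 50-marker dataset.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors
import umap  # from the umap-learn package

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 50))  # stand-in for a 50-marker, 2000-cell dataset

# The black box: a 2-D embedding of the 50-dimensional data.
embedding = umap.UMAP(random_state=0).fit_transform(X)

# Score the black box with itself: for each cell, what fraction of its
# k nearest neighbors in 50-D space are still neighbors in the 2-D map?
k = 15
_, idx_high = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
_, idx_low = NearestNeighbors(n_neighbors=k + 1).fit(embedding).kneighbors(embedding)

# Column 0 of each index array is the point itself, so we drop it.
score = np.array([
    len(set(idx_high[i, 1:]) & set(idx_low[i, 1:])) / k
    for i in range(X.shape[0])
])

# Color the map by its own local performance.
plt.scatter(embedding[:, 0], embedding[:, 1], c=score, s=3, cmap="viridis")
plt.colorbar(label="fraction of high-dim neighbors preserved")
plt.title("UMAP colored by its own local accuracy")
plt.show()
```

Regions where the score drops are places where the pretty picture is lying to you, which is exactly the kind of label a human reader of these maps needs.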
Fast forward to 2023. ChatGPT has brought forth an interest in (and fear of) AI that we have never seen before. But there is something interesting happening here that is relevant to this article. These language models are black-box algorithms. We have intuition from an input-and-output perspective of what's going on, but we can't explain what's going on at the neuron-to-neuron level. Yet OpenAI is now experimenting with using GPT models to explain the neurons of GPT models. Specifically, it's using GPT-4 to interpret GPT-2. While I can't evaluate the merit of this approach at this time, it points to the same theme: using complexity (GPT-4) to interpret complexity (GPT-2).
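For intuition, here's a toy sketch of the skeleton of that idea as I read it from the outside (this is not OpenAI's actual code): record how strongly a single GPT-2 MLP neuron fires across some text, collect its top-activating tokens in context, and assemble a prompt asking a stronger model to explain what the neuron responds to. It assumes the transformers and torch packages; the layer and neuron indices are arbitrary placeholders.

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

LAYER, NEURON = 5, 131  # arbitrary choices for this sketch
acts = {}

def hook(module, inputs, output):
    # output: (batch, seq_len, 3072) post-activation MLP hidden states
    acts["neuron"] = output[0, :, NEURON].detach()

handle = model.transformer.h[LAYER].mlp.act.register_forward_hook(hook)

texts = ["The cat sat on the mat.", "Stock prices fell sharply on Monday."]
records = []
for text in texts:
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        model(**ids)
    for pos, a in enumerate(acts["neuron"].tolist()):
        token = tok.decode(ids["input_ids"][0, pos].item())
        records.append((a, token, text))
handle.remove()

# The top-activating tokens become the evidence shown to the explainer model.
top = sorted(records, reverse=True)[:10]
prompt = (
    "Here are the tokens that most strongly activate one GPT-2 neuron:\n"
    + "\n".join(f"token {t!r} (activation {a:.2f}) in: {s!r}" for a, t, s in top)
    + "\nIn one sentence, what does this neuron seem to respond to?"
)
print(prompt)  # in the real setup, this prompt would be sent to GPT-4
```

As I understand it, the published approach goes further and scores each explanation by how well the explainer can simulate the neuron's activations from it, but the asymmetry is the point: the bigger black box reads the smaller one's mind.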
Now let's look at the human brain. When I was in undergrad learning basic neuroscience as part of my major, I learned about the grandmother neuron hypothesis. This was the idea that there would be a group of neurons that fired when I saw my grandma, a smaller group of neurons that fired when she was in a rocking chair, and eventually a single neuron associated with grandma in the rocking chair at a particular angle. This hypothesis turned out to be wrong. The cortex stores information in a decentralized manner. Ablating a piece of the cortex doesn't ablate a specific childhood memory, because that memory is stored all over. At this point, the undergrad version of me got the idea that the human brain was irreducibly complex. The idea that we could put on a helmet that would interpret our thoughts was science fiction, and that was that.
Enter GPT-based language models. Again, these are studies whose merit I can't perfectly evaluate just yet, but we now have a GPT-based language model converting fMRI data into interpretable words associated with what the subject was seeing. A similar study out of Cold Spring Harbor used an AI diffusion model (image generator) to reconstruct images the subject was perceiving from fMRI data. Now, I've been doing biology long enough that I will approach this kind of work with a healthy level of skepticism for a few years, but we're dealing with a similar theme as before: using complexity (AI models) to interpret complexity (the human brain).
Now let's shift from computation, my second love, to wet-lab biology, my first love. At the start of graduate school, in the early 2010s, the trendy thing was to use systems biology to finally understand biology well enough that we could develop individualized cocktails of drugs for cancer patients, depending on the exact nature of their tumor. I joined a systems immunology lab as a cancer biology PhD student, thinking that we'd be able to decode biological networks, both in normal and cancerous tissue, well enough to make this a reality. While it turns out that it's more complicated than that, what did happen was a revolution in cancer immunotherapy: CAR T cells.
The idea here is that the immune system leverages its own complexity to fight the complexity of the rapidly evolving world of pathogens in our environment. So why not train the immune system to recognize and fight cancer? It's not a new idea, but unlike AI, ideas in biology take a long time to develop. CAR T cell therapy is one great example of how you might get it done. You take the patient's T cells, engineer them to express a chimeric antigen receptor (CAR) that recognizes an antigen on the cancer cells, and reintroduce these engineered T cells back into the patient.
But notice the philosophical difference between this approach and the targeted-therapy approach. Targeted therapy contained an embedded mental model that we would have a perfect molecular wiring diagram of a cancer patient on a computer, use all kinds of in silico modeling tools to figure out which drugs were needed, and generate the cocktail of drugs, curing the cancer. CAR T cell therapy contains an embedded mental model that biology is complicated, cancer is complicated, and the complicated immune system handles the complicated pathogenic world pretty well, so let's fight a cancer's complexity with the immune system's complexity. In a way, cancer immunotherapy contains an embedded admission of ignorance, which in my opinion is a good thing. Because that is what science is.
I'll conclude by re-introducing a word into the biological lexicon that I hear largely from sustainability research: hyperobjects. A hyperobject is a high-dimensional shape and/or cloud and/or network and/or data structure that we cannot possibly perceive all at once. One prototypical example of this is the ocean. When I think about the ocean, I think about a big vast blue body of water, the setting sun, the sound of the waves, and light surf rock music (in case you wanted to know about my hobbies in grad school). But in reality, the ocean is a huge network of the chemical properties of the seawater, its interactions with the dynamics of single-cell and multicellular life, sunlight, gas exchange with the atmosphere, geothermal vents, coral reefs, ocean currents, and of course, all of this interacting with human pollutants.
How do you understand a hyperobject? You can't. But what you can do is slice it in as many ways as you can until you get some intuition around it, while acknowledging that your intuition will be incomplete. Examples of this range from creating Hegelian dialectics between opposing views rather than taking a side, to studying something like the ocean from as many different angles as you can. To admit that something is a hyperobject is to admit one's ignorance. But it's not just that. The subtlety here is that the admission of a hyperobject is a call to action. It tells you that it's time to start peering into slices of that hyperobject so you can start to get actionable intuition around it (e.g., stop polluting the ocean).
But the advent of the "fight complexity with complexity" paradigm gives us a new approach. You get another hyperobject to grok the hyperobject in question, and then have it tell the human what they need to know. When we have an AI model that can synthesize all possible ways a protein can fold, along with every study and dataset ever published about cell signaling, will we be able to revisit the old dream of targeted therapy? Maybe. If not, then maybe we can get really good at tweaking the immune system to work in our favor. Either way, "fight complexity with complexity" is worth thinking about, because this wave isn't dying down any time soon.