AI in the Library of Babel

“Every block of stone has a statue inside it and it is the task of the sculptor to discover it.” - Michelangelo

Borges inhabits a genre you could describe as ironic realism. His stories are liberally sprinkled with a trope where he takes some abstraction of his choice and makes it parochially literal in a deliberately absurd way. In the spirit of making up unwieldy words, I’ll call it:

borgesification: the act of taking an abstract concept literally.

For instance, I could imagine him writing a story in which the clearly figurative Michelangelo quote above is interpreted literally and sculptors spend their careers searching for certain blocks of stone which they think contain the desired statues. Here, Borges would be borgesifying the idea of creativity as a search, on which more (not very much) later.

But instead of an imaginary short story (though I’m sure Borges would appreciate that concept), I’ll give an example of borgesification in one of his actual ones, The Library of Babel. Since meaning is the intersection of artificial intelligence and linguistics that I personally care about, I can’t help extrapolating some morals from it.

The Library of Babel

In Borges’ short story The Library of Babel, all books exist, present in an enormous library which constitutes the known world. Every possible combination of symbols (up to some length n) seems to be present in some book somewhere.

“Everything: the minutely detailed history of the future…the translation of every book in all languages…the treatise that Bede could have written (and did not) about the mythology of the Saxons…”

Librarians search for truth by physically searching the library for the book that contains it. They spend their lives traveling through the library, occasionally stumbling on interpretable phrases.
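To get a feel for the librarians' predicament, here is a minimal sketch of a toy library: every non-empty string up to some length n over a small alphabet, enumerated shelf by shelf. The function names, the two-letter alphabet and the lengths are my own illustrative choices, not anything taken from the story.

```python
from itertools import product


def library_size(alphabet_size: int, max_length: int) -> int:
    """Count the non-empty strings of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


def enumerate_library(alphabet: str, max_length: int):
    """Yield every string of length 1..max_length, shortest first."""
    for length in range(1, max_length + 1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


def shelf_position(target: str, alphabet: str, max_length: int) -> int:
    """Find a book the way a librarian would: by walking past every other book."""
    for position, book in enumerate(enumerate_library(alphabet, max_length)):
        if book == target:
            return position
    raise ValueError("target is not in this library")


if __name__ == "__main__":
    # A toy library over the alphabet "ab" with books of at most 3 symbols.
    print(library_size(2, 3))
    print(list(enumerate_library("ab", 2)))
    print(shelf_position("ba", "ab", 2))
```

The count grows exponentially with n, which is why a librarian who searches by walking is unlikely to find much in a lifetime.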

This conceit is obviously pretty absurd, but what exactly is being borgesified? To answer that, a brief detour…

One of the really old ideas in AI is that creativity is just search in the appropriate search space.

This applies both to problems that have an obvious search-like element (e.g. planning the route your Roomba should take between two points) and to creative tasks like writing a novel: there’s a space of possible ways your novel could be, and all you have to do is find the right one. Another example, from further afield, comes from the Greek poet Seferis and mirrors the Michelangelo quote above:

“all poems written or unwritten exist…The special ability of the poet is to see them.”

Creativity as search goes hand in hand with the quest in AI (and philosophy, for that matter) to find appropriate representations.

For example, consider the task of image generation from sentences. Machine learning has yielded a modicum of success at this task: you provide a sentence describing a scene, like:

“A sheep by another sheep standing on the grass with sky above and a boat in the ocean by a tree behind the sheep”

and get a picture in return. Here’s a real example generated by a computer from that sentence. Blurry but pretty amazing when you think about what the computer had to understand to do this:

Thinking about this task as a search problem, there’s a space of possible images, and the task is finding the right one for the sentence.

But intuitively, it would seem crazy to consider every combination of however many pixels make up this image. Instead, we want to consider different scenes, which aren’t made of pixels, so much as objects and their relations.

In other words, it seems like we should be searching through scene space, not pixel space. (I haven’t discussed the algorithm that actually produced the above picture - I’ll save that for another time.)
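A rough back-of-the-envelope sketch of how different the two spaces are in size. The scene model here (a short list of object types plus one relation per pair of objects) and all the numbers are assumptions made up for illustration; they are not how any real image generator represents scenes.

```python
import math


def pixel_space_digits(width: int, height: int, channels: int = 3, levels: int = 256) -> int:
    """Decimal digits in the number of possible images, levels ** (width * height * channels).

    Computed with logarithms because the number itself is too large to print comfortably.
    """
    exponent = width * height * channels
    return math.floor(exponent * math.log10(levels)) + 1


def scene_space_size(num_object_types: int, num_relations: int, max_objects: int) -> int:
    """Count toy scenes: k objects (1 <= k <= max_objects), each of some type,
    with one relation chosen for every unordered pair of objects."""
    total = 0
    for k in range(1, max_objects + 1):
        pairs = k * (k - 1) // 2
        total += (num_object_types ** k) * (num_relations ** pairs)
    return total


if __name__ == "__main__":
    # Hypothetical settings: a 64x64 RGB image versus scenes of up to 5 objects
    # drawn from 100 object types and 10 pairwise relations.
    print("digits in pixel-space size:", pixel_space_digits(64, 64))
    print("digits in scene-space size:", len(str(scene_space_size(100, 10, 5))))
```

Even with generous vocabularies, the toy scene space is a number you can write on one line, while the pixel space needs tens of thousands of digits just to state its size.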

Creativity as Search Borgesified

Returning to the story of the library, we now have the tools to describe the borgesification in more detail.

Borges has taken the idea that creativity is a form of search and interpreted it in the most literal way possible, so that the librarians search not the space of possible meanings, but the space of possible strings.

Because the librarians believe that meaning inheres in the library’s books, they try to answer questions about the world by searching over the space of strings, rather than the space of ideas that those strings might, interpreted in some language, represent.

Moreover, they believe that the books mean things when interpreted in their language, even though there’s no reason to reject the null hypothesis that they’re just random.

The locus classicus of smart thinking about form, meaning and intelligence is Douglas Hofstadter’s Gödel, Escher, Bach which makes a point precisely on these lines:

“people often attribute meaning to words in themselves, without being in the slightest aware of the very complex ‘isomorphism’ that imbues them with meanings. This is an easy enough error to make. It attributes all the meaning to the object (the word), rather than to the link between that object and the real world.”

In a way, Douglas Hofstadter’s whole thesis in Gödel, Escher, Bach revolves around these level slips between meaning and form, construed both as a feature and a bug of cognition.

Here’s another example of a similar borgesification, outside of Borges’ writing: at one point in Lemony Snicket’s superb A Series of Unfortunate Events, Klaus has to open a door by entering a passcode. He’s told this code is the sentence describing the central theme of Anna Karenina, and accordingly enters the following words:

“a rural life of moral simplicity, despite its monotony, is the preferable personal narrative to a daring life of impulsive passion, which only leads to tragedy.”

And the door opens. This is absurd because you could never expect someone to enter that exact sequence given the prompt: there might be one moral of the book, but there are countless strings of words which represent that moral. In other words, there might be a unique answer in concept space, but not in string space.

Only in a world where symbols were somehow inseparable from their meanings could this security system be expected to work.
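The gap between string space and concept space can be sketched with two toy locks. Everything here is hypothetical: the shortened passcode, the paraphrases, and above all the INTERPRETATIONS table, which stands in for the genuinely hard part (mapping a sentence to what it means) with a hand-written lookup.

```python
# A shortened, hypothetical version of the passcode, for illustration only.
PASSCODE = "a rural life of moral simplicity is preferable to a daring life of impulsive passion"

# Hand-written stand-in for interpretation: many strings, few concepts.
INTERPRETATIONS = {
    PASSCODE: "simple_life_beats_passion",
    "quiet virtue in the countryside beats reckless passion": "simple_life_beats_passion",
    "better a dull honest life than a thrilling ruinous one": "simple_life_beats_passion",
    "passion conquers all": "passion_wins",
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial typing differences don't matter."""
    return " ".join(text.lower().split())


def string_lock(attempt: str) -> bool:
    """Opens only for the one exact string, as the door in the story does."""
    return normalize(attempt) == normalize(PASSCODE)


def concept_lock(attempt: str) -> bool:
    """Opens for any string that the table maps to the same concept as the passcode."""
    meaning = INTERPRETATIONS.get(normalize(attempt))
    return meaning is not None and meaning == INTERPRETATIONS[PASSCODE]


if __name__ == "__main__":
    paraphrase = "Quiet virtue in the countryside beats reckless passion"
    print(string_lock(paraphrase))   # the exact-string lock rejects the paraphrase
    print(concept_lock(paraphrase))  # the concept-level lock accepts it
```

The string lock is trivial to build and nearly useless as a test of understanding; the concept lock is the one you would want, and the lookup table is exactly the piece nobody knows how to write by hand.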

The superficial moral

The mistake made by Borges’ librarians is one often thought to be made for real in AI systems.

Critics of connectionism, and now of more modern statistical methods for translation, image recognition, dialogue generation, chess playing and so on, object that these systems, which are learnt by fitting models to huge amounts of data, aren’t engaging with meaning. Or to put it another way, they don’t incorporate the right sort of representation. (Criticisms of this nature date back at least to the pushback against behaviorism in both cognitive science and AI; see Marr’s levels and Chomsky’s performance-competence distinction.)

This perceived intensional failing is usually translated into an extensional prediction: that AI needs something more to pass the Turing test.

For instance, this well-known article nicely illustrates a failure mode of statistical image recognition where a leopard skin couch is recognized as a leopard. The problem seems to be that the decision process for leopard-hood used by the statistical classifier has no abstract representation of 3D shape.

Another example is Winograd schemas, on which front I’ll defer to Wikipedia, pointing out just that the pattern is the same: an extensional failing is claimed for systems that don’t “understand” language, inferred from a perceived intensional failing, namely the lack of a semantics.

So you might be tempted to agree with the following version of Klaus from one of the library of Babel’s variations on A Series of Unfortunate Events:

Klaus: the moral of The Library of Babel is that an algorithm which doesn’t take into account the appropriate representations for the task in question is doomed to failure.

That said.

I think you’d be wrong to agree with this version of Klaus. Nor indeed do I think the above is a good moral as regards modern statistical AI.

I can’t really do justice to why I think that without a much longer digression, but the general gist of it is that representations and/or meanings in cognition aren’t at all what they seem. Form and meaning are weirdly inextricable.

This is a sentiment that you can definitely feel in Gödel, Escher, Bach (although Douglas Hofstadter definitely isn’t a proponent of modern machine translation) and in bits and pieces of philosophy from Hume to Quine.

If I had to improve on Klaus’ moral of The Library of Babel, I’d have him utter something a bit more long-winded:

meaning is never a property of objects independent of their contexts, no matter how strongly it might seem to be, and if we assume it is, we will end up trying to build impossible systems which attempt to explicitly represent concepts which in truth should be merely explanations of these systems.

Some caveats and endnotes:

  1. I’ve been using “meaning”, “intension”, “abstraction” and “representation” somewhat interchangeably. I’m of the mind that these are words approximating the same idea, just from different perspectives, but of course, in particular technical contexts that’s not always true.
  2. You might wonder, reading this slightly weird hagiography of Borges, whether I really believe his stories are about intensionality and meaning. Avoiding the obvious answer that The Library of Babel exists to problematize aboutness (i.e. intensionality, i.e. meaning), I think the answer is an emphatic yes. I also think they demonstrate how effective it is to do philosophy by parable, a moral which philosophers like Daniel Dennett have taken up with aplomb.