Psychedelics and Language Model Temperature

The effect of psychedelics is often described as an increase in fluidity – of being, in some non-specific sense, more free to wander out of usual habit patterns. This essay presents an analogy to a feature of LLMs, namely the temperature parameter, that allows a precise characterisation of this increased fluidity. It begins with a semi-technical description of how this parameter affects the operation of an LLM, and then draws a parallel to how psychedelics affect human experience at multiple levels from perception to thought patterns and life reflections, before concluding by placing this analogy inside a larger project of using the internal mechanism of AI to better understand our own minds.

The abundant content written about AI mostly falls into two categories, one about its effects on the world and the other about its internal mechanism. The former includes economists speculating about its market prospects, political writers arguing for more or less regulation, and social commentators decrying what it is doing to the world. The latter includes machine learning research and pedagogical material.

There is much that is valuable in this content, but taking only one side at a time misses the possibility of cross-fertilisation between the two. Of course, it is harder to combine an understanding of AI with an understanding of another subject – it may seem more feasible either to practise AI and remain entirely within the discipline, or to stand entirely outside AI and critique it from the perspective of another discipline. However, when exploring in two directions simultaneously, one does not need to go to great depth to unearth something of interest. In this piece, I will discuss a simple technical concept from large language models called temperature, and how it relates to psychedelic experience.

Temperature in LLMs

When a language model chooses which word to use next, it makes use of a parameter called temperature. As a user of the model, you can set the temperature to different values, and its effect is often phrased as a trade-off between predictability (low temperature) and creativity (high temperature). What is literally happening is that, every time the model is about to generate a word, it predicts a suitability score for every word in its vocabulary and then, somewhat randomly, picks the word to use based on these scores. A higher score means it’s more likely to pick that word, but how much more likely? That’s the function of temperature. Temperature determines the relationship between the suitability score of a word and the probability that the word is picked. A high temperature means that even words with low scores still have some chance of being picked. If you make the temperature lower, it becomes more likely that the model will just pick one of the few very high-scoring words, and when the temperature is zero, it will always pick the single highest-scoring word in the vocabulary.

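Mathematically, the probability p_i that it picks the i-th word is given by the following, where s_i is the score of the i-th word, T is the temperature, and the sum runs over every word j in the vocabulary.

$$p_i = \frac{\exp(s_i / T)}{\sum_j \exp(s_j / T)}$$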

If you want to follow this calculation through, you will see that, when T is high, all the probabilities are close to each other, and when it’s low, almost all the probability is on a few dominant words. Alternatively, you can see the effect graphically. In the images below, we’re imagining the model has nine words it’s choosing between. The three panes show exactly the same scores for these nine words, but with increasing temperature from left to right. Notice that the order doesn’t change: under all temperatures, w3 is the one with the highest probability of being picked, followed by w5, w9, w1, etc. However, the choice becomes much more uncertain with rising temperature. Going from T=0.1 to T=0.5, the model has gone from 90% sure of picking w3 to only 25% sure. Other options, like w2 and w4, which were not even considered at T=0.1, have begun to emerge as possibilities at T=0.5.
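For readers who prefer to follow the calculation in code, here is a minimal sketch. It assumes the usual softmax-with-temperature rule, in which the probability of a word is proportional to exp(s_i / T). The nine scores are hypothetical values chosen only so that w3, w5, w9 and w1 come out on top in that order; they are not the scores behind the figures described above, so the printed percentages will not match them.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities. Requires temperature > 0."""
    if temperature <= 0:
        # T = 0 is the limiting case: always pick the top-scoring word.
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(x - top) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical scores for nine words w1..w9 (illustrative only).
scores = [0.6, 0.2, 1.0, 0.1, 0.8, 0.3, 0.0, 0.4, 0.7]

for t in (0.1, 0.5, 2.0):
    probs = softmax_with_temperature(scores, t)
    print(f"T={t}: " + " ".join(f"{p:.2f}" for p in probs))
```

Running it shows the pattern described in the text: the ranking of the words is the same at every temperature, while the probability of the top word falls as T rises.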

If you were to increase the probability on a select few of the improbable options, that would also increase the overall uncertainty – but raising the temperature does so in a completely general way: every score is divided by the same T, so the additional uncertainty is spread across all possibilities rather than directed at particular ones.
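The contrast between these two ways of adding uncertainty can be checked numerically. The sketch below is only an illustration under assumptions: it uses Shannon entropy as the measure of "overall uncertainty" (the essay does not name a measure), the same softmax-with-temperature rule as before, and hypothetical scores. It compares raising the temperature with hand-boosting a single unlikely option (w7).

```python
import math

def softmax(scores, temperature):
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    weights = [math.exp(x - top) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def entropy_bits(probs):
    """Shannon entropy in bits: higher means a more uncertain choice."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def ranking(probs):
    """Word indices ordered from most to least probable."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

# Hypothetical scores for nine words w1..w9 (illustrative only).
scores = [0.6, 0.2, 1.0, 0.1, 0.8, 0.3, 0.0, 0.4, 0.7]

cold = softmax(scores, 0.1)
warm = softmax(scores, 0.5)   # same scores, higher temperature

boosted_scores = list(scores)
boosted_scores[6] = 0.9       # hand-boost the least likely word, w7
boosted = softmax(boosted_scores, 0.1)

for name, probs in (("cold", cold), ("warm", warm), ("boosted", boosted)):
    print(f"{name:8s} entropy={entropy_bits(probs):.2f} bits  ranking={ranking(probs)}")
```

Both changes raise the entropy, but only the hand-boost reorders the options; raising the temperature leaves the ranking untouched.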

Temperature in Humans

Just as language models predict the next word, our minds are constantly making inferences as to what is going on in the world, and so the concept of temperature can also describe a property of our cognition. Imagine you’re lying in bed and hear a sound at your window – it could be a bird, or a tree branch, or someone throwing stones, or something you misheard that’s actually not coming from the window. Of the many possible causes, there’s going to be one you consider the most likely, but how much weight do you still put on the other interpretations – the long, almost endless tail of possibilities? Or suppose you are deciding what career to pursue: do you have one favourite that is essentially the only one you’re considering, or are there multiple options that each have a reasonable chance of being picked? You can probably think of people you know who tend towards having a single strong favourite, like the left bar chart, who are confident in what they believe and what they’re going to do, and don’t spend much time entertaining alternatives. Then there are other people who are more like the right bar chart: they’re open to lots of possibilities, and tend not to commit fixedly to any one.

This is not just a distinction between different people. The same person can have periods or contexts in which their temperature is higher or lower, and one particular thing that you can do to cause an increase in your temperature is taking psychedelics. This is not the only thing psychedelics do – for one thing, different substances have different effects so they can’t all be acting through an identical mechanism – but it can account for a surprisingly large portion of their effects.

Temperature in Perception

At a low level, psychedelics increase the temperature of perception. In ordinary consciousness, our visual field is mostly stable – we have a fixed idea of what is where, and where the boundaries between objects are. Occasional counterexamples are optical illusions, such as the famous Necker cube, which is drawn to be ambiguous between two interpretations – whether we’re looking from above or below.

As your eye switches between these two interpretations, it will seem like something has flipped or changed somehow in the image, even though you also can tell that none of the lines have moved. This is exactly the sort of movement that one sees in psychedelic visuals. Contrary to trite cinematic depictions, things do not look like they’re moving or changing shape in the same way as when they “really” move – one does not see a straight line as a curved line or a triangle as a square. Instead, there is a sense that things are shifting in lots of little ways, while at the same time they’re in exactly the same position as before.[1]

To take a real example, consider how you structure the pattern of lines you see on this palm. How many main lines do you see there being? Probably 2-4, but there are some at the bottom middle that maybe could be main lines as well, or they could be a secondary-main sort of line along with some of the 2-4 further up. And what about the colour? Do you see it as basically all the same colour with some minor fluctuations, or are some splotches sufficiently more red to be a fundamentally different colour from the rest? There could also be a darker section at the bottom which shifts to a different coloured section around the bottom of the thumb. If you look for long enough at this, or at your own hand, and let your vision roam through the different interpretations, it will start to seem less solid than it first appeared. You will have slightly turned up the temperature on the different ways you can see that palm, so instead of being locked into one way of seeing it, you’ll have lots of options with varying degrees of salience. As features move in and out of prominence, there will be a feeling of some subtle visual change. Imagine turning the temperature up much more again, and doing so across everything you see, and also the other sense doors such as sound and smell. That is what psychedelics do – that is part of the state of perceptual dissolution they bring about.

Temperature in Cognition

It is an iconic feature of psychedelics that they alter perception, but they also increase temperature in higher levels of mental activity. That is, they increase the uncertainty in the plans you make and directions your thoughts follow. Suppose it’s just after lunch time and you have nothing else planned before dinner. How do you decide what to do with your time? The set of all things you could conceivably do is nearly infinite, but even restricting to those that are not complete nonsense, the number of options is still large. If you’re in a moderately sized city, it will easily run to the dozens. Normally, by force of habit, you don’t consider any more than two or three, and this shields you from the full extent of how much freedom you have in that moment. But in a state of increased temperature, that filter starts to peel away, and the normally-ignored options begin to present themselves as worthy of consideration. You’re forced, then, to take on a greater degree of agency than you’re used to.

Raised temperature will produce the same burden of freedom for any decision. Should you change career, or relationship or place you live, or take up or drop a hobby? We all have, to some extent, a set of core beliefs about what kind of person we are and the life we’re leading – a sort of dictum or creed. Increasing your cognitive temperature can lead to you trying on different creeds, and cycling rapidly through many alternatives: “I’ll work harder in my career, and become a rich, successful person”, “My family (or friends) are the most important thing, I’ll start spending more time with them”, “I’ll make being physically healthy a central part of who I am”, “Life is an opportunity to try new things, I want to travel the world”, “I want to feel part of a community, I’m going to make an effort to get to know my neighbours”, “I’ll spend more time alone to learn a new language or craft”… You could choose any of these as the pillar, or a pillar, you build your life around, and, with increased temperature, you can see with vividness what that choice would be like. You can picture what it would be like to look back 5 years from now having really devoted yourself to it, and you see that there’s nothing stopping you doing so. Maybe this even bleeds back into the more immediate decision of what to do with your evening. You could call up a friend, or your family, you could go for a walk or to the gym, you could do some extra work, or start reading a new book or meditate, and that could be the first step in a new way to approach life. Even if you don’t normally notice them, these alternative paths are always available, neither inviting you nor resisting you, just passive and open as a way you could go – and as your temperature is turned up, more and more of them come into sight.

The mathematical formulation of temperature expresses that the uncertainty is turned up equally across all possibilities, and that is also the case with psychedelics. Psychedelics are often referred to as non-specific amplifiers. They do not make you think or feel particular alternatives to the normal, but they increase the availability of many diverse options – pleasant, unpleasant, neutral and almost any other adjective you can think of. This is why temperature is a fitting analogy: it is exactly the parameter that tunes uncertainty itself, rather than which things you think are more or less likely than each other.

Simulated Annealing: the cooling of our thoughts

Ordinary consciousness does not have the high temperature of psychedelic consciousness, and that’s a good thing – it would be paralysing and exhausting to always process many options for day-to-day decisions. Low temperature provides confidence and groundedness. On the other hand, the benefit of high temperature is that it helps you resist becoming close-minded and stuck in your ways. It can reveal alternative ways of thinking and being that you might never encounter if you cleaved always to the default.

There’s an idea from machine learning that tries to combine the best of both high and low temperatures, called simulated annealing. ‘Annealing’ is a term from metallurgy for heating a material and then letting it cool slowly, so this idea is to begin with a high temperature, and then reduce it over time. Early on, a high temperature allows you to be adaptable and better able to back out of any mistakes. As time goes on, there’s less benefit to further exploration because you’ve explored a lot of the options already, so you can afford to turn down the temperature. At the end, the hope is that you have a configuration that’s likely to be a decent one, because of the early exploration, but also stable, because the final temperature is low.[2]
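As a rough illustration of the idea, here is a minimal simulated annealing sketch. Everything specific in it is an assumption made for the example: the bumpy one-dimensional `energy` landscape, the step size, the geometric cooling schedule and its constants are all hypothetical. The acceptance rule is the standard one: improvements are always accepted, and worse moves are accepted with probability exp(-delta / T), which shrinks as the temperature falls.

```python
import math
import random

def energy(x):
    """A hypothetical landscape with several local minima (lower is better)."""
    return 0.1 * x * x + math.sin(3 * x)

def simulated_annealing(start, t_start=2.0, t_end=0.01, cooling=0.99, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    current = best = start
    temperature = t_start
    while temperature > t_end:
        candidate = current + rng.uniform(-1.0, 1.0)
        delta = energy(candidate) - energy(current)
        # Hot: worse moves are often accepted (exploration).
        # Cold: worse moves are almost never accepted (stability).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if energy(current) < energy(best):
                best = current
        temperature *= cooling  # cool down a little on every step
    return best

result = simulated_annealing(start=4.0)
print(f"best x = {result:.3f}, energy = {energy(result):.3f}")
```

Early in the run the search wanders freely between valleys; by the end it only makes small refinements within whichever valley it has settled in.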

This is an effective technical trick, and it seems nature has led us to something similar. For example, when you first move to a new area, what you do each day might keep changing as you try various routes to work, places to shop, places to walk, and so on. Then you begin to establish a routine and your activity becomes more consistent and predictable. That doesn’t mean you do exactly the same thing every day or week, but that most days are variations on the same few patterns.

Throughout life, the annealing process plays out on a slower scale too. At least from adolescence to old age, there’s a consistent trend of lowering temperature. In your teens and twenties, you’re still figuring out how you see yourself and your core beliefs. Uncertainty and changes in such fundamental components of who you are produce an especially high temperature state because many other parts of life depend on them – it’s hard to make a long-term decision like what career to pursue if you haven’t yet decided what values are important to you.

By around 30, these can still change a bit but are normally more settled, and your uncertainty is around external questions like where to live and work or whether or whom to marry – less fundamental than building your identity and core values, but still important as they influence many subsequent smaller decisions. As you get older still, these too gradually solidify, until most of the remaining uncertainty is vicarious, via the younger people in your life who are still going through the earlier periods of change: for example, what career or relationship your children or grandchildren will pursue.

Of course, some people are exceptions, and the timeline and milestones can differ by culture, but the general trend appears to be one of cooling. It’s hardly controversial to say that young people tend to be more open to new things (high temperature), and old people more stable and set in their ways (low temperature).

Psychedelics to Complement Annealing

However, there is an important difference between simulated annealing in machine learning and annealing over our lifetimes. An ML problem is fully defined ab initio. There is a list of options of varying degrees of good or bad, and you search through some to find the best one you can – that’s essentially what machine learning is. But when a person tries to find the best way to set up their life, they’re searching for options in a world that is itself changing. Simulated annealing works because you do a lot of exploration with your high temperature at the start, and then by the end, you have a good sense of the landscape so you can afford to lower your temperature

………………………

[1] At higher intensity, maybe you start to lose touch with the “they’re in the same position as before”, but at that point, the whole concept of object and position is beginning to unravel anyway.

[2] Many machine learning systems use temperature, not just LLMs. LLMs are of course widely familiar today, and so make a suitable reference for the general concept, but those other systems are a more fitting analogy for simulated annealing in particular.
