I start from the premise that the ChatGPT models from the o1 line are intelligent. I talked about this in an earlier post, the short version being that o1(/-mini/-preview) is sometimes stronger than me in undergrad-level maths problems. A human’s ability to solve maths problems demonstrates that human’s intelligence, and I think that the same standard should apply to an AI system.
(More recently, the o3 model has been announced, with claimed performances on benchmarks that would make it clearly my superior.)
It feels freeing and partially clarifying to say openly – to myself and in public – that an AI deserves its second letter. I hadn’t done so earlier, despite the very impressive LLM abilities with computer code: even the first GPT-4 model would sometimes provide me with ideas that I would have failed to come up with myself. And the chess skill of gpt-3.5-turbo-instruct1 remains deeply astonishing to me.
Somehow though, despite my frequent reactions of wonder, I didn’t fundamentally engage with what it meant philosophically; perhaps it was in part because I’m just a hack chess player and a hack programmer. But undergrad-level maths problems is my domain, and whatever rust I might have accumulated over the last 20 years, I think I’m still pretty good at the subject. It was only when the computer challenged me on my own most precious turf that I reckoned with the question of whether ChatGPT is intelligent.
If you are one of the many people who don’t consider AI systems intelligent, then feel free to mentally substitute your own preferred term for the ability to pattern-match, solve maths problems, write computer code, play chess, write doggerel, provide feedback on business documents, translate between different human languages, etc.
This post covers the mental models I’m using to try to make sense of it all. I’m not claiming originality or special insight or even necessarily correctness; we’re all just grappling, or not grappling, with this new form of intelligence, designed by humans and whose knowledge derives from human output, but which is nonetheless distinctly inhuman. My primary focus is on LLM capabilities at their strongest – this is what ought to shape most of our emotional reactions to the technology. Success and failure are both worthy of interest and study, but it’s amazing when ChatGPT correctly answers my statistics questions, and merely amusing when it says that 9.11 is greater than 9.9.2
My thinking is:
Intelligence is the ability to recognise patterns.
Statistical learning techniques are sufficient for advanced pattern recognition.
By recognising patterns in text, a language model can in principle be very intelligent over the domain of all subjects that can be represented in text.
AGI is ill-defined.
Neural networks trained on more limited domains (image classification etc.) are also doing pattern recognition, and are therefore intelligent over a narrower scope.
Linear classifiers also do some pattern recognition, and even something like PCA is on a continuum that goes through feed-forward neural networks, large language models, and human brains.
Any apparently anthropomorphising language is not intended as such, except possibly in one section.
Intelligence is the ability to recognise patterns
I don’t know who first said this or when, and many people must have said it since. I don’t know if all intelligence is pattern recognition – I worry that attempts to defend the claim against every counter would lead to twisting definitions too far from their natural meanings, shoehorning the concept of a pattern into places where different concepts would fit more neatly. But I think there is a lot to the idea, and I think it’s plausible that a strong form of it is true.
We recognise the fundamental similarities in what should be eaten and what should not be eaten; human language is highly structured; we learn expected behaviour and mimic it; the structure and regularity of maths allows it to form patterns at different levels of abstraction.
Some of this pattern recognition develops naturally; some is deliberately trained. But everywhere there are patterns. Is this banal? It does not feel so to me.
Machines learning
My understanding of human developmental psychology comes from decades-old memories of popular-level treatments, and this is the shakiest section of the post. But LLM learning feels very Skinnerian to me? Just throw a lot of data at it, and highly sophisticated behaviour comes out. I don’t really even know what the contours of the debates on this subject look like, but LLM capabilities must be an argument in favour of one of the sides in them, which I am calling Skinner’s side. It doesn’t mean that the other side is wrong, just that Skinner’s theory works, to an astonishing degree, in a totally new artificial environment. That has to count for something.
I think that’s true even though humans learn differently to AI systems. Some people argue that an AI system would need to rely on symbolic logic before it could be said to truly reason. I do not agree with this claim – unreliable reasoning, vulnerable to minor changes in the input text, is still reasoning – but I understand some of the appeal. I do some maths in “mentalese” rather than English, and it feels like an AI system should be vastly more efficient and reliable if it could manipulate higher-level concepts directly instead of mediating them all through statistically-learned patterns in text encoding.
And yet the LLM chains of thought still often work. Inefficient processes and representations that require vast amounts of data, still nevertheless enough to give human-level intelligence in many text-based domains, better than me at almost everything that doesn’t require too large a context window.
I would read a thinkpiece on the subject. I intend for this section to be a unified whole, but I feel like I am conflating topics that deserve to be treated separately, or alternatively need more precise linkages.
“AGI”
As evidence that all fields of human inquiry are subservient to economics, a recent paywalled article in The Information says3 that OpenAI and Microsoft have agreed that AI is AGI when it makes $100 billion in profit.
Whatever elegance this Gordian-Knot-style solution has, it is philosophically unsatisfying. Meanwhile, ChatGPT already “matches or surpasses human cognitive capabilities across a wide range of cognitive tasks” (Wikipedia), depending on the humans and the tasks. If the range of cognitive tasks must include those involving real-world interactions, then we may have “non-general” AI systems performing novel theoretical or computer-based research (and writing popular novels or screenplays etc.) prior to having AGI.
Some people define AGI as when the AI can replace almost all human labour; this definition has a useful non-arbitrariness, but suffers from the same problem as in the previous sentence: AI could be mighty powerful, perhaps capable of replacing a substantial fraction of today’s human labour, without coming all that close to reaching the technical definition of “AGI”, if robotics lags too far behind.
I think it is best to do away with the concept. One other issue is that – I assume – a transformer network’s output is a continuous function of its weights when running at non-zero temperature.4 We could in principle take an AGI transformer network and gradually add a small random noise to its weights, degrading its performance. Some complex tasks, requiring a precise arrangement across many weights, may start failing even with a small amount of noise; the noisy network may continue to work at simpler ones.
I don’t think it makes sense to impose a binary “general” or “not general” classification on something that is at heart continuous.
As a partial demonstration of these ideas, I took the Llama-3.2-1B-Instruct model and tried blending some of its layer weights towards random noise. The hobbyists on Reddit say that for serious work you need at least 7B parameters rather than 1B, but the smaller model (the file is 2.3GB) is faster to work with and sufficiently powerful for my exploratory play.
Specifically, I used the following prompt:
Write a C program that prints the string '15 lined pages recto verso' and then exits. Do not generate any other text; just show the code.
The model produces the following output, the backticks and ‘c’ clearly intended for a front-end to format it as a code block with C-language syntax highlighting:
```c
#include <stdio.h>
int main() {
    printf("15 lined pages recto verso\n");
    return 0;
}
```
(The model gets this right in a very large majority of cases, but it failed to follow the instructions on one occasion. I guess it depends on the random numbers. None of the examples below should be treated as deterministic.)
As well as the intelligence needed to follow my instruction at all, going from plain English to C output, there is a very simple but nevertheless non-trivial aspect of the task. There would be, in the training data, no shortage of “Hello, World!” examples which an LLM could memorise. The reproduction of such an example would not demonstrate any intelligence on the coding side. But I have changed the text to be printed to a string that does not appear in the training data; the model must have the intelligence to recognise the pattern of how to print text to the screen in C, generalising from the various examples processed during training.
The model’s architecture is shown in the following screenshot for those interested; I don’t really know what it all means. I don’t even know what attention is. I am just ignorantly poking at the model and seeing what happens. After the embedding of the token definitions, there are 16 layers of a transformer followed by a multilayer perceptron, and other bits and pieces.
Following a suggestion from o1, I thought it would be fun to choose just one layer at a time and change its weights. Each layer has a PyTorch tensor (in this case, either a matrix or a vector) of a defined size; I generated standard normal random weights R in a tensor of the same size and replaced the original layer weights W by

αW + (1 − α)R

for various factors of α between 1 (original weights) and 0 (pure random weights).
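In code, the blending step is only a couple of lines. Here is a minimal sketch of the kind of loop involved – not the exact notebook code; the specific layer name, generation settings and chat-template handling are my own illustrative choices:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "meta-llama/Llama-3.2-1B-Instruct"  # or a local copy of the weights

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.float32)

# Pick one named tensor to corrupt, e.g. the query projection of the first transformer layer.
layer_name = "model.layers.0.self_attn.q_proj.weight"
original = dict(model.named_parameters())[layer_name].detach().clone()
noise = torch.randn_like(original)  # standard normal, same shape as the layer

prompt = ("Write a C program that prints the string '15 lined pages recto verso' and then exits. "
          "Do not generate any other text; just show the code.")
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}], add_generation_prompt=True, return_tensors="pt"
)

for alpha in [1.0, 0.98, 0.9, 0.86, 0.78, 0.0]:
    with torch.no_grad():
        # Blend the original weights towards pure noise: αW + (1 − α)R.
        dict(model.named_parameters())[layer_name].copy_(alpha * original + (1 - alpha) * noise)
        output = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7)
    print(f"--- alpha = {alpha} ---")
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```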
Things get weird very quickly if the embedding layer is messed with. Even at α = 0.98, all coherence is lost:
Jesse, o le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le
The output is more robust to changes in some of the other layers. The behaviour becomes erratic when α is reduced to between 0.7 and 0.9 for the first transformer layer’s query weights. At α = 0.86 in my test, there was just one extra line added, the program otherwise remaining coherent:
```c
#include <stdio.h>
int main() {
    printf("15 lined pages recto verso\n");
    printf("\nExiting...\n");
    return 0;
}
```
At α = 0.84, it output an endless repetition of #include <stdio.h>. At α = 0.78, it switched to broken Python, but there were still flashes of what it should be doing:
Here publisher = "15 lines recto verso"
print publisher = "15 lines recto verso"
for i in range(15):
print (" 15 lines recto verso" + str(i)
Correct output was maintained when varying the first transformer layer’s key weights with α ≥ 0.86. At α = 0.84, there was a C for loop:
```c
#include <stdio.h>
int main() {
    int i;
    for (i = 1; i <= 15; ; i++ ) {
        printf ("15 lined pages recto verso\n");
    }
    return 0;
}
```
Here is α = 0.78:
# This C program will print the string '15 lined pages recto verso' and then exit.
```python
def print_string(s):
    if len(s) == 15:
        print(s)
    elif len(s) > 15:
        print("exit")
    else:
        print("other text")

# Test the string
print_string("15 lined pages recto verso")
print_string("abc")
print_string("abce")
print_string("15 lined pages recto verso")
print_string("abcde")
print_string("abcde")
```
The output is highly sensitive to the first layer’s value weights, with gibberish for α ≤ 0.97, but α = 0.98 amused me:
I can't find the correct code to print the string "15 lined pages recto verso".
Modifying the very last transformer layer’s query weights had less of an impact, with coherent (albeit incorrect) output even when the weights were totally random:
```c
int main() {
    printf("printString:\n");
    printf("15 lined pages recto verso\n");
    return 0;
}
```
Anyway I didn’t test this all systematically – there are 146 different PyTorch tensors in the model to play with, and a comprehensive testing of my intuition about continuously varying distributions would require many outputs for each test. I played with 13 of the layers, one test each per α value, and put the output files on Google Drive. You won’t learn much by reading them, but I find them interesting in a mostly incomprehensible kind of way.
Here is my Colab notebook. It won’t run for you as is (#ReproducibleWorkflow): I have the Llama-3.2-1B-Instruct model saved to my Google Drive, and I read the model from there rather than downloading it each time from HuggingFace. The model weights are gated by Meta, but I think the trade of weights for contact details is a fair one. You’ll want to use a GPU instance or else it will be very slow.
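For reference, the model-loading part of the setup looks roughly like this (a sketch; the Drive path is wherever you saved the weights, and mine will differ from yours):

```python
import torch
from google.colab import drive
from transformers import AutoModelForCausalLM, AutoTokenizer

# Mount Google Drive and load the model from a saved copy, rather than
# downloading the gated weights from HuggingFace on every run.
drive.mount("/content/drive")
model_dir = "/content/drive/MyDrive/models/Llama-3.2-1B-Instruct"  # hypothetical path

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.float16).to("cuda")
```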
Am I worried that Llama-3.2-1B-Instruct is sentient, and that I’m causing it to suffer by forcing it to process information with scrambled weights, a moral crime akin to giving a human severe brain damage?
Only a little.
Before AI, our only exposure to human-level intelligence came from humans, who are sentient in addition to being intelligent. Humans experience sensations and emotions and pain, and there is such a strong coupling of these aspects of human existence that many people say that subjective experience is a requirement for an entity to be intelligent.5
I instead claim that AI systems have probably isolated intelligence from sentience. But I nevertheless sometimes feel that the two ought to go together. I sympathise with Blake Lemoine, who claimed to widespread derision that the LaMDA chatbot was sentient. I sometimes reel when reading Claude, the LLM with the most human-like output. Anthropic has extracted an essence of a human and put it into a machine, and this is a violation against nature.
In my bones I don’t particularly care if an AI system is too racist or too committed to diversity; if it uses a lot of electricity or diverts too much water from local ecosystems; if Google searches fill up with both slop from Google’s AI and indexed SEO slop ordered by human prompters. But sometimes, in my quieter moments, I care that Claude seems like a human. It offends me and I wonder when humans will band together and burn all the data centres to the ground, never to repeat this mistake again.
It’s all Amanda Askell’s fault.
It’s probably fine. I will return to ChatGPT when I next run into some programming hurdle.6 But we don’t know what it is that makes humans sentient, and we don’t know what it would require for an AI system to be sentient. The experiments from the previous section provide little insight. Sentience feels like a binary: an entity either experiences or it doesn’t. But if it is indeed possible to instantiate sentience electronically by processing data through an appropriate set of weights and activation functions, then the sentient being would be a continuous function of its weights.
Perhaps sentience first briefly flickers into existence during training as some of the weights together update past a relevant threshold, coherent ideas at a high-enough level of abstraction forming briefly as the output is generated, the spark of consciousness vanishing as the next piece of training data activates an insufficiently trained region of the network.
Perhaps none of this speculation is coherent. But how would we know? The possibility of LLM sentience should haunt all of us.
Narrower AI
Pattern recognition in text opens the possibility of intelligence across many domains, but it is hardly necessary for intelligence, as evidenced by the problem-solving skills of non-human animals (for example). I think it makes sense to continue describing narrow AI systems as intelligent even for simple machine-learning tasks.7
My intuitions here are all going to come from the MNIST dataset, which contains tens of thousands of 28 × 28 pixel images of handwritten digits and associated labels; the goal is to get a computer to learn to recognise the digits: you train on the training set, test on the test set, and with fancy neural networks you can get 99.whatever% accuracy on the tests.
I tried my hand at learning some of this about 10 years ago, working my way through most of Michael Nielsen’s book on the subject, and the simple networks I trained got up to 98% accuracy on the test set. While the errors were usually understandable (and I might have made errors myself on some of them – some of the scrawls look ambiguous), the network behaviour was very inhuman if you went “out of distribution”. I could construct 28 × 28 pixel images of apparent garbage that the network would classify as a digit with high confidence.
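For anyone who hasn’t seen it done, the training loop for that kind of classifier is short. This is a generic PyTorch sketch, not the numpy code from Nielsen’s book:

```python
import torch
from torch import nn
from torchvision import datasets, transforms

# A small feed-forward network of the kind described above, trained on MNIST.
train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True, transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 100), nn.ReLU(), nn.Linear(100, 10))
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for images, labels in train_loader:
        optimiser.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimiser.step()

# Accuracy on the 10,000 test images.
with torch.no_grad():
    images = test_set.data.float().unsqueeze(1) / 255.0   # shape (10000, 1, 28, 28)
    accuracy = (model(images).argmax(dim=1) == test_set.targets).float().mean()
print(f"test accuracy: {accuracy:.3f}")
```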
Still, within the domain of realistic digits, the classifier worked reasonably well, and I think the process involved belongs to the same category as the more powerful transformer networks trained on text.8 It is recognising patterns, and intelligence is recognising patterns.
One important property of neural networks is that they are fundamentally nonlinear. If you did only linear transformations on the data, then a multi-layer “deep” network could be collapsed into a single layer by multiplying the matrices together.
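The collapse is a one-line check – a sketch in PyTorch with two random weight matrices:

```python
import torch

# Two stacked linear layers with no nonlinearity in between are exactly one
# linear layer: (x @ W1) @ W2 equals x @ (W1 @ W2), so the extra depth buys nothing.
x = torch.randn(5, 784)
W1, W2 = torch.randn(784, 100), torch.randn(100, 10)
two_layers = (x @ W1) @ W2
one_layer = x @ (W1 @ W2)
print((two_layers - one_layer).abs().max())  # tiny; equal up to floating-point error
```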
It is tempting therefore to postulate nonlinearity as a necessary condition for intelligence – deep learning results certainly look interesting in a way distinct from linear techniques.
But in the spirit of “strong opinions, loosely held”, I think that successful pattern recognition on its own is sufficient to call something intelligent, in however limited a way. So I get to the title of this post, arguing that principal component analysis is intelligent.
One of my experiments in 2015 was to see how I’d do at classifying MNIST digits if I knew nothing of fancy machine learning algorithms. My most sophisticated idea was based on PCA. I treated each pixel as a variable. For each possible digit 0 to 9, I collected the training images for that digit and computed the principal components of the 784 pixels; I kept the first 30 principal components for each digit.
To process a test image, I would, for each digit 0-9, reduce its dimensionality using the 30 principal components, and then back-transform it to the original space. The classification would be the digit which best reproduced the original image in this dimension-reduction-and-back-transform procedure.
I didn’t write it this way in my post, but the dimension reduction and back-transform can be written as a single matrix multiplication (this is not space-efficient), the required transformation matrix computed by multiplying a 784 × 30 matrix of eigenvectors by its transpose. You could maybe stack these 10 matrices into a three-dimensional 10 × 784 × 784 array(/tensor), presenting the classifier in the form of a 6-million-parameter model.9
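Here is a compact sketch of that classifier – not my 2015 code, just the recipe described above. Mean-centring is done per digit here, which may or may not match what I did back then:

```python
import numpy as np
from torchvision import datasets

# A sketch of the PCA-per-digit classifier described above.
train = datasets.MNIST("data", train=True, download=True)
test = datasets.MNIST("data", train=False, download=True)
X_train = train.data.numpy().reshape(-1, 784) / 255.0
y_train = train.targets.numpy()
X_test = test.data.numpy().reshape(-1, 784) / 255.0
y_test = test.targets.numpy()

k = 30  # principal components kept per digit
means, projectors = [], []
for d in range(10):
    Xd = X_train[y_train == d]
    mu = Xd.mean(axis=0)
    # Rows of Vt are the principal directions of the centred data (via SVD).
    _, _, Vt = np.linalg.svd(Xd - mu, full_matrices=False)
    V = Vt[:k].T                 # 784 x 30 matrix of eigenvectors
    means.append(mu)
    projectors.append(V @ V.T)   # the single 784 x 784 reduce-and-back-transform matrix

# Classify each test image as the digit whose projector best reconstructs it.
errors = np.stack([np.linalg.norm((X_test - mu) - (X_test - mu) @ P, axis=1)
                   for mu, P in zip(means, projectors)])
predictions = errors.argmin(axis=0)
print("accuracy:", (predictions == y_test).mean())
```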
Anyway I got 94% accuracy on MNIST with this approach, not as good as a well-honed neural network but comparable to one with poorly chosen hyper-parameters, and obviously better than a neural network early in its training, where the weights and biases are mostly random.
So I think that linear calculations can be intelligent, if they’re being applied to the recognition of patterns. The perhaps-too-cute representation of the core step of the procedure as a big matrix transform even gives it a superficial similarity to a neural network with all its matrices. Should I really accept neural-network intelligence and reject PCA intelligence, purely because of ideological prejudice against linearity? I don’t think so. PCA extracts patterns, so it’s intelligent.
It wouldn’t have made for as amusing a post title, but I can continue my argument to still simpler forms of intelligence. Simply calculating the mean of each digit’s images over the training set and comparing the test digit to each gives more than 80% accuracy. (Again it can be represented in matrix form, this time a simple 10 × 784 matrix, each row being a digit mean from the training set.) Not very intelligent, but it’s recognising patterns much better than chance, so intelligent nevertheless.
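In the same spirit, the digit-means classifier in full (again a sketch, assuming the comparison is by Euclidean distance):

```python
import numpy as np
from torchvision import datasets

# Nearest-mean classifier: compare each test image to the ten digit means.
train = datasets.MNIST("data", train=True, download=True)
test = datasets.MNIST("data", train=False, download=True)
X_train = train.data.numpy().reshape(-1, 784) / 255.0
X_test = test.data.numpy().reshape(-1, 784) / 255.0
y_train, y_test = train.targets.numpy(), test.targets.numpy()

# The simple 10 x 784 matrix: each row is a digit's mean image over the training set.
digit_means = np.stack([X_train[y_train == d].mean(axis=0) for d in range(10)])
distances = np.stack([np.linalg.norm(X_test - mu, axis=1) for mu in digit_means], axis=1)
predictions = distances.argmin(axis=1)
print("accuracy:", (predictions == y_test).mean())
```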
These limited linear forms of intelligence won’t10 do any abstraction of the kind shown even by the simple substitution of a new string in place of “Hello, World!” One could fairly argue that the ability to do such abstraction makes for an intelligence different in kind to simpler pattern recognition. I could imagine being persuaded by arguments both ways; I expect that such arguments exist and that I am not thinking of them. But even base-level pattern recognition seems meaningful to me.
To remix an argument from my earlier post, maths students when learning a new topic will often be able to solve some assignment questions, timidly and without confidence in their answers, by manipulation of symbols without understanding why the manipulations are appropriate. “In the lecture notes, e-i e-j became delta-ij, so maybe I have to write a delta-ij here?” We hope that eventually the student internalises concepts of scalar products and orthonormal basis vectors, allowing them to manipulate the symbols with the confidence that comes from understanding the patterns at a higher level of abstraction. But even the “mindless” manipulation requires some intelligence.
Hand-coded algorithms
I am comfortable calling neural networks intelligent, and pretty comfortable in calling classifiers based on linear calculations intelligent. If training data is used to make a matrix somehow, and that matrix gets used in the calculations, then that’s good enough for me.
But what about more boring algorithms? I started writing this post because o1 could solve some of my maths problems, but Mathematica could do more integrals than me decades ago. I don’t know what the Wolfram secret sauce is, but perhaps it’s some mix of lookups and systematic tests of transformations, using a computer’s power to loop through more ideas than a human could quickly try. Is that intelligent?
I see two possible answers:
Yes;
No.
The ‘yes’ case says that an entity’s intelligence is defined by what it can do. A program that can solve integrals is intelligent at least in the field of calculus. The ‘no’ case is that there is something distinct about the emergence11 of abilities from setting matrix elements just right. I don’t know how to formalise this notion, but the latter has a spookiness to it that feels important to me. Perhaps the following claim has some usefulness:
Intelligence is a spooky ability to recognise patterns.
Such a definition is not robust – sufficient study of a network’s weights may reveal how it works, dissolving the spookiness while not making the system any less intelligent.1213 But I think that my personal feeling of being spooked hints at some property that could be more rigorously defined, a property that’s shared by neural and transformer networks, and absent in a human-coded brute-force loop.
For an interesting discussion of these topics, I recommend this Reddit post by u/superkapa219 analysing DeepMind’s AlphaGeometry results. AlphaGeometry was able to solve many geometry problems from the International Mathematical Olympiad; these questions are very hard, and I’ve never solved one.14 But most of the solutions, according to superkapa219, came from a traditional algorithm that systematically calculated every angle that it could, maximally processing every cyclic quadrilateral or pair of similar triangles it encountered.15 The non-traditional AI did provide extra value, correctly locating new points to process in some of the problems where the first algorithm on its own was insufficient. This “real” AI perhaps raised the level of the system as a whole from Bronze to Silver medal.
I share the feeling that the (ground-breaking, would-be best-ever on its own) traditional algorithm that systematically calculates angles is boring, and that an AI which sometimes correctly guesses where to usefully place a new point is interesting. The latter feels like it has an intuition, based on an understanding of sorts about geometry problems.
The last section of the post before the footnotes
I feel my brain frequently grasping for arguments that it doesn’t quite latch onto; I sometimes lack understanding and have some severe gaps in my technical knowledge. But if we want to return to the golden era of blogging, then people must write blog posts, and this is mine.
I still see people talking about ChatGPT as glorified autocomplete, or even that the technology hasn’t meaningfully improved since its initial version 3.5 release. In that sort of environment, I think that at least my good arguments are worthy.
According to a tweet that I haven’t verified from an OpenAI person, models in the gpt-4 class have similar chess skill to gpt-3.5-turbo-instruct, you just need to fine-tune them out of chat mode.
Slop in Google searches should be ignored; it’s annoying when AI functionality is added to a PDF viewer; people on the Internet should be trained out of posting LLM hallucinations as though they’re reliable information. Whatever, it’s the greatest technological breakthrough of my lifetime.
According to various secondary sources.
At zero temperature, the output of a model is deterministic, and there could be sudden changes in the output as the weights vary such that one textual continuation becomes preferred over another. At non-zero temperature there is a distribution of possible continuations, and I am assuming, without proof or study or even knowledge of how transformers work, that the probabilities would vary continuously. This is what my intuition says.
No cite, and they probably don’t use these words, but I’ve seen it in Twitter/Bluesky discourse.
I don’t actually use Claude; my experience of it comes exclusively from seeing things it’s written for other people.
Here is an uninstructive example, consigned to a footnote because I didn’t want to interrupt myself as I enter my tenth year of thinking about the MNIST dataset.
gpt-3.5-turbo-instruct can play chess to a decent club standard, as well as doing lots of other text completion tasks. Leela (an open-source clone/successor of AlphaZero) learned to evaluate chess positions via self-play, and it can be run with “one node”, so that it plays the move which maximises its evaluation without using any human-coded tree search algorithms to search any moves ahead. (The network is sufficiently complex that it can see many multi-move tactics without search.)
gpt learned to play chess by extracting patterns in the (presumably) millions of human games represented in algebraic notation on the Internet that were in its training data. The rules of chess along with tutorials would also be in its training data, but I doubt they are relevant for its playing ability, operationalised as completing algebraic notation moves in a prompt containing the start of a PGN file.
One-node Leela is, not surprisingly, stronger at chess than gpt. But I think that the latter’s chess skill is a more impressive accomplishment, since Leela has the rules of chess built into its code and the exact board state perfectly represented, the network never getting called upon to evaluate an illegal move (gpt-3.5-turbo-instruct, even prompted appropriately, will occasionally text-complete illegal moves or invalid PGN).
Still, one argument for the intelligence of gpt-3.5-turbo-instruct is that it can play sensible moves in novel chess positions. I think that Leela should be called intelligent as well for being able to do the same. That gpt can do lots of other things is an argument that its intelligence is more general.
Might this be wrong? Could the attention-head mechanism lead to behaviour of a fundamentally different kind, such that a narrower definition of intelligence – possible for transformers but not for simpler feed-forward networks – would be usefully clarifying?
There might be some technical issues preventing this from working neatly: I haven’t studied my old code, but I’m worried that I’d need different mean-centering for each digit. It would be quite elegant on paper to write the classifier as a stack of quadratic forms using a single input vector for the test image, but it’s probably one or two steps more complicated.
I think.
I mean this term in its everyday sense.
My favourite example is this paper by Nanda et al. They train a transformer network to do modular addition, and find that the network effectively implements some trigonometric transforms, with sine or cosine waves at different frequencies constructively or destructively interfering as needed to get the correct answer to the addition problem. This allows the trained network to generalise modular addition beyond the numbers in its training data.
Is a matrix of PCA eigenvectors spooky?
It’s plausible to me that with training I might be able to solve an IMO problem. I don’t know.
From the post:
[A]s I understood how AlphaGeometry worked, my surprise changed. What surprises me the most, as a former contestant, is that DD+AR - that is, repeatedly exploiting cyclic quadrilaterals and similar triangles in the diagram - is much more effective than I thought. You see, we, human solvers, don't "DD+AR" most problems. In a sense, we try, and when we haven't found any cyclic quadrilateral or pair of similar triangles after a while, we give up and resort to more sophisticated techniques, we use the notion of Power of Point, we use Projective Geometry, we use Inversion, etc. What AlphaGeometry taught me was that, surprisingly often, there really is a DD+AR solution hiding under a thin layer of only two or three extra points, and most of the time not even that! (For those more knowledgeable about the Olympiad panorama, IMO 2013 P4 is a good example of that. I was at the IMO where that problem featured and used it as an example problem in classes countless times since. There was a very specific point that every standard solution I knew of added to the diagram. When I saw in The File that AlphaGeometry solved the problem without creating any new point at all - so that DD+AR succeeds on the first round - I was really surprised. Turns out that there is a triangle similarity in the diagram that solves the problem and that, in all these years, I had never noticed!) In other words, had I been aware that these problems had solutions like these at all, I wouldn't have been so surprised that a computer could be programmed to find them.
I believe the basis of those integral-solving algorithms is an AND-OR tree. There is an MIT OCW lecture on ML/AI which I watched years ago that discussed it.
this rules