Hi all,
I don’t know if you’ve noticed, but there’s been some talk about how certain abilities seem to emerge in LLMs like GPT-4 and ChatGPT, despite the models not being explicitly trained for them.
First: a primer. Emergence is an interesting observed phenomenon in complex systems (like economies, natural systems, etc.) where the collective behavior of the system's components leads to new and unexpected properties that cannot be predicted by examining the individual parts alone. In other words, the whole becomes greater than the sum of its parts.
An extension of the Game of Life, as illustrated by Caleb Gucciardi in his Medium post.
An observable example is the flocking behavior of birds, where simple rules followed by individual birds lead to coordinated patterns.
Another example is the emergence of consciousness in the human brain, which arises from the complex interactions of billions of neurons (as far as we know). Emergence is a fundamental concept used to explain many properties of complex systems.
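To make the “simple rules, surprising collective behavior” idea concrete, here’s a minimal Python sketch of the standard Game of Life that the illustration above extends. The code (the `step` function, the glider pattern) is my own toy illustration, not taken from Caleb’s post:

```python
from collections import Counter

def step(live):
    """live is a set of (x, y) cells; return the next generation."""
    # Count how many live neighbours every nearby cell has.
    neighbours = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {cell for cell, n in neighbours.items()
            if n == 3 or (n == 2 and cell in live)}

# A "glider": five cells that keep travelling diagonally forever,
# even though nothing in the rules mentions movement at all.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    glider = step(glider)
print(sorted(glider))  # the same shape, shifted one cell diagonally
```

Nothing in those two rules says “build a spaceship that glides across the grid,” yet the glider does exactly that. That’s the flavor of emergence people have in mind.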
So back to emergent properties in LLMs. Throughout the year, I’ve noticed more and more reports of emergent behavior observed in interactions with these models.
These properties range from reasoning abilities (like solving a math problem not seen during training) to generating near-accurate physics in some videos.
… and many more examples.
I was skeptical about calling LLM abilities “emergent” at first, and I still am. These models are trained to mimic the data we feed them: they learn to predict the next token by picking up statistical patterns in the training data. So when you give a model enough data exhibiting one pattern, it will spit that same pattern back out. This includes mimicking complex physics in generated videos, but only superficially!
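As a toy illustration of that point (and only a toy, nothing like how GPT-4 is actually trained), here’s a tiny character-level bigram model: it just counts which character tends to follow which in its training text, then samples from those counts. Feed it one repeating pattern and it faithfully spits that pattern back out:

```python
import random
from collections import Counter, defaultdict

def train(text):
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1          # how often does b follow a?
    return counts

def generate(counts, start, n=20):
    out = [start]
    for _ in range(n):
        nxt = counts.get(out[-1])
        if not nxt:
            break
        chars, weights = zip(*nxt.items())
        out.append(random.choices(chars, weights=weights)[0])
    return "".join(out)

model = train("abcabcabcabcabc")   # training data: one repeating pattern
print(generate(model, "a"))        # -> "abcabcabcabcabcabcab"
```

Real LLMs replace the counting with a huge neural network, but the “learn the statistics of the training data, then sample from them” intuition is the same, which is why I hesitate to call the results emergent rather than very good mimicry.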
Well, it turns out this is an actual debate among AI researchers. So I dug in and stumbled upon a great read by Anna Rogers. Here’s her blog post summarizing her skeptical position: A Sanity Check on ‘Emergent Properties’ in Large Language Models.
Good night,
Najla