Snapshots and Streams
Closing Chapter 2 — the architecture named
The synthesis that closes Chapter 2. Seven posts approached the same territory from seven different directions. This one names it — and the name will carry through the rest of the series.
The previous section ended with a promise: that the pieces built across this chapter — the lived body, the developing schema, the autopoietic cell, the sensorimotor loop, the shared word — would be collected into a single architectural picture. This is that post.
Let me start where the chapter started. With a bracket.
The bracket we opened
Chapter 2 opened with behaviorism — and with a decision. The behaviorist drew a bracket around the agent. Everything observable stayed outside the bracket, available for rigorous study: the stimulus, the response, the lawful relationship between environmental contingencies and action patterns. Everything else — what the agent was doing on the inside, how the world showed up for it, what it was like to be that agent — went inside the bracket and was set aside. Not denied. Bracketed. Declared, for the sake of a clean science, temporarily off-limits.
The six posts that followed were a sustained effort to open that bracket — and to open it without throwing away the rigor that closing it had bought. Each one approached the bracketed interior from a different discipline, with a different vocabulary, asking a different question. And if you have been reading carefully, you may have felt something building across them — a sense that each post was circling the same territory from a new side.

This post names that territory. And in naming it, I want to show you that what looked like six or seven separate topics was in fact one architecture — described, approached, and inhabited from seven directions.
The architecture has a name. You already know its components, because we have been planting them all chapter. It is the snapshot/stream architecture. It is the deepest framework this series has for saying what a biological agent is and what it does.
One architecture, seven angles
Let me walk back through the chapter — not to recap, but to relocate. To show where each tradition sits inside the architecture we are about to name.
Behaviorism described the observable surface — the outputs, the regularities, the lawful couplings between environment and action pattern. What we have spent the chapter doing is opening the bracket it drew. Everything that follows is an account of what behaviorism agreed, methodologically, not to look at.

Merleau-Ponty opened the bracket from the phenomenological side. The lived body, the body schema, the Vorhabe — these are descriptions of the snapshot from the inside. The Vorhabe is the holistic, simultaneously present structure of meaning the agent holds before any utterance begins — before the stream starts. Merleau-Ponty was mapping what it is like to be a snapshot processor.
Piaget described the trajectory — how the snapshot processor builds itself, over years, through cycles of assimilation and accommodation. Through representational redescription, the agent constructs the capacity for symbol-handling out of bodily, sensorimotor ground. The snapshot processor generates its own stream interface, from within, through development.
Maturana and Varela went deeper still. Autopoiesis grounds the snapshot in biology. The living system is a self-producing, self-maintaining network of processes — organizationally closed, continuously revising itself through structural coupling with its environment. On this account the snapshot is not a static picture. It is the current state of a living system: continuously produced, continuously revised, inseparable from the action through which it is expressed and transformed. The biological snapshot is not stored. It lives.
The reflective reconstruction — Descartes’ divide — told us why this architecture had to be rediscovered rather than assumed. The Cartesian separation of mind from body created a framework in which the snapshot processor was either denied or disembodied. The embodied traditions are the long, empirically grounded attempt to dissolve that framework from within.
Noë and O’Regan gave us the perceptual mechanism. Sensorimotor contingencies are the lawful, learnable dependencies between an agent’s actions and the changes in stimulation those actions produce. The world functions as its own external memory. The snapshot is not a stored picture of the world — it is the agent’s current capacity for active access to it. And Gibson’s affordances gave us the ecological face of the same point: the world as structured by action possibilities relative to what the agent can do.
And Tomasello gave us the social extension. Joint attention, shared intentionality, the ball game, the ratchet. The capacity to share a snapshot with another agent — to converge attention on the same object inside a shared normative frame — is what makes the stream possible at civilizational scale. The memetat is a shared stream: the sequential accumulation of cultural advances that each generation receives, holds through serial memory, extends, and transmits.
Seven angles. Not seven topics. One architecture, approached from seven directions.
Naming the architecture
Here is the architecture, stated plainly.
The snapshot is the holistic, relational, simultaneously present structure of meaning that constitutes the agent’s current engagement with its world. It is not a picture stored in a register. It is a living state — the current organizational configuration of a self-maintaining biological system, continuously revised through sensorimotor engagement with a world that answers back to action in lawful ways. It is holistic, analog, and always already in the process of becoming a different snapshot — because the living agent never stops acting, and every action revises the context in which the next event will be met.
The stream is the sequential, one-thing-at-a-time transmission of information through a one-dimensional channel. Speech is the paradigm case. The stream is the social and evolutionary adaptation that makes meaning transmissible across the gap between agents who do not share a perceptual field. To send meaning across that gap, the agent has to serialize — to convert the holistic, all-at-once snapshot into a sequence of discrete units that can travel through a channel and be reconstructed at the far end.

That is the whole asymmetry in one line: the snapshot is everything-at-once; the stream is one-thing-at-a-time. Meaning is constituted holistically and transmitted serially. Everything else in the architecture follows from the tension between those two facts.
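A toy sketch may make the serialize/reconstruct step concrete. It is an illustration under loud assumptions, not a model of cognition: the "snapshot" here is just a small set of relations, and the names `serialize`, `reconstruct`, and the example relations are all hypothetical. What it shows is only the shape of the asymmetry: the set has no order, the stream must impose one, and the receiver rebuilds the whole from tokens that arrive one at a time.

```python
from typing import FrozenSet, List, Tuple

# Assumption for illustration: a "snapshot" is a set of (node, relation, node) triples.
Relation = Tuple[str, str, str]


def serialize(snapshot: FrozenSet[Relation]) -> List[str]:
    """Flatten an all-at-once set of relations into a one-dimensional token sequence."""
    stream: List[str] = []
    # The set itself has no order; the channel forces us to pick one.
    for subj, rel, obj in sorted(snapshot):
        stream.extend([subj, rel, obj, "."])  # "." marks the end of one unit
    return stream


def reconstruct(stream: List[str]) -> FrozenSet[Relation]:
    """Rebuild the relational whole from tokens that arrive one at a time."""
    relations = set()
    buffer: List[str] = []
    for token in stream:
        if token == ".":
            if len(buffer) != 3:
                raise ValueError(f"malformed unit: {buffer}")
            relations.add(tuple(buffer))
            buffer = []
        else:
            buffer.append(token)
    return frozenset(relations)


# Hypothetical example: everything below is present "at once" in the snapshot.
snapshot = frozenset({
    ("hand", "reaches", "ball"),
    ("ball", "rests_on", "grass"),
    ("child", "watches", "ball"),
})

stream = serialize(snapshot)      # one thing at a time
recovered = reconstruct(stream)   # everything at once, again
print(stream)
print(recovered == snapshot)
```

Note what the sketch leaves out: a real snapshot, on the account given here, is analog and continuously revised, so no discrete encoding of this kind would be lossless. The toy only makes the direction of the two arrows visible.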
One asymmetry, three levels
The snapshot/stream asymmetry is not a single distinction. It operates at three levels at once, and keeping them apart prevents most of the confusions that follow.
At the ontological level: habitat versus memetat — the world of the snapshot versus the world of the stream. The habitat is the world afforded by the body’s coupling with physical regularities. The memetat is the world of words, inscriptions, instruments, and conventions that agents co-construct and transmit.
At the epistemic level: stochastic versus schematic — the natively probabilistic, noise-tolerant snapshot processor versus the natively discrete, rule-governed stream interface. The body deals in gradients and likelihoods; the symbol deals in tokens and rules.
At the semiotic level: semantics versus syntax — the recovered picture versus the protocol of transmission. The snapshot is what is meant; the stream is the orderly sequence of marks that lets a receiver reconstruct it.
Three levels, one architecture. When an argument about cognition starts to feel slippery, it is often because two people are describing different levels of this same asymmetry and assuming they disagree.
The asymmetry that matters most
Let me name one feature of this architecture to carry into Chapter 3 — because it turns out to be the hinge on which the comparison between natural and artificial intelligence swings.

In the natural agent — in you — the snapshot and the stream are continuous aspects of the same living process. The snapshot is the context within which an incoming event acquires meaning. The stream — your ongoing sensorimotor activity — continuously revises the snapshot. The snapshot guides the stream; the stream revises the snapshot. The two are inseparable, because the agent never stops acting, and acting is always already sensing and doing at the same time.
This is the deepest difference between the living agent and today’s artificial intelligence systems. A large language model has a snapshot of a kind: a high-dimensional relational map, built from an enormous archive of the memetat. But that snapshot is static between training runs. It is not revised through living. The artificial agent’s snapshot, as today’s systems are built, is stored. The natural agent’s snapshot lives.
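The contrast between a snapshot that lives and one that is stored can be sketched in a few lines. This is a deliberately crude toy, not a claim about how any organism or any language model is implemented: the classes `LivingAgent` and `StoredModel`, the `rate` parameter, and the one-action "world" are all assumptions made up for the illustration. The only point is structural: in one case every action feeds back into the state that guided it, and in the other the state is read but never revised.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LivingAgent:
    """Hypothetical toy: every act revises the state that guided it."""
    snapshot: Dict[str, float] = field(default_factory=dict)
    rate: float = 0.5  # assumed revision rate, chosen only for the example

    def act(self, action: str, world: Dict[str, float]) -> float:
        expected = self.snapshot.get(action, 0.0)  # the snapshot guides the act
        observed = world[action]                   # the world answers back
        # The outcome of acting revises the snapshot.
        self.snapshot[action] = expected + self.rate * (observed - expected)
        return observed


@dataclass
class StoredModel:
    """Hypothetical toy: the state is consulted but never revised by use."""
    snapshot: Dict[str, float]

    def act(self, action: str, world: Dict[str, float]) -> float:
        # Answers from the stored map; the world's reply is never folded back in.
        return self.snapshot.get(action, 0.0)


world = {"push": 1.0}  # assumed lawful outcome of the action "push"
agent = LivingAgent()
model = StoredModel(snapshot={"push": 0.0})

for _ in range(3):
    agent.act("push", world)
    model.act("push", world)

print(agent.snapshot)  # has moved toward what the world returned
print(model.snapshot)  # unchanged until someone retrains it
```

Nothing in the sketch forbids giving `StoredModel` an update step of its own; the toy describes where the two loops stand as written, not what either could become.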
We will develop this in detail in Chapter 3, where we take up the inversion thesis: natural intelligence builds from habitat toward memetat — from the body’s coupling with the world toward symbol and convention — while the language model runs the same architecture in reverse, from memetat toward habitat, beginning with the archive of symbols and only gesturing back toward the world. For now, hold it as an orientation. The architecture this chapter has built is the architecture of the natural, living, embodied agent. Artificial intelligence starts at the other end of it — and its powers and its limits both follow from where it starts.
Let me be careful about what this comparison claims — and what it does not. It is not a verdict that an artificial system can never be dynamic, never be grounded, never have a snapshot that lives. Nothing in the architecture forbids that. What the snapshot/stream distinction offers is not a ceiling on what AI can become but an analytic lens — a way of characterizing precisely where today’s systems stand, and how to interpret what they have actually achieved. A language model has done something genuinely remarkable with the stream alone. The lens lets us say exactly what that is — and exactly what is still missing.
And it tells us what closing the gap would take. To build a grounded artificial agent — one whose snapshot lives rather than sits stored — we would have to supply the elements the architecture marks as absent. It would have to be multimodal and cross-modal, binding many channels of sensing into a single coupled state rather than running one stream of tokens. It would have to be embodied and enactive, acting in a world that answers back and revising itself through that loop. And — not least — it would have to be economical: subject to the energetic and metabolic constraints that make a living system care about its own persistence in the first place. (Autopoiesis, recall, grounds the snapshot in exactly this kind of self-maintaining, energy-bound organization — a living cell runs its cognition on a metabolic budget that no current model has any analogue of.) The point of naming the architecture is not to settle, once and for all, whether a machine can think. It is to learn to negotiate the space the architecture opens — to see clearly which elements are present, which are absent, and what each addition would cost and buy. The tension between snapshot and stream is not a problem to be resolved. It is the space within which both natural and artificial cognition are built.
The triadic lens, now with flesh
Chapter 1 introduced a triadic lens — three contexts of agent activity: Agent–World, Agent–Agent, Agent–Institution. Chapter 2 has given that lens biological flesh.

The Agent–World vertex is now grounded — in autopoiesis, in sensorimotor contingencies, in affordances, in the body schema. The agent does not meet a neutral physical world. It meets a world already structured by its own organizational imperatives, its own sensorimotor competences, its own landscape of affordances. The snapshot is the agent’s current state of coupling with that world.
The Agent–Agent vertex is now grounded — in joint attention, shared intentionality, and the ball game. The capacity to share a snapshot with another agent is biologically prepared, developmentally elaborated, and extended by exaptation into the full range of linguistic and cultural practices that make science possible.
The Agent–Institution vertex — how reliable practice becomes portable and repeatable — is what the stream makes possible. The memetat is the accumulated, stabilized, institutionally maintained stream of cultural advances: the ratchet implemented at civilizational scale. That vertex is where Chapter 4 will go.
What was inside the bracket
We began Chapter 2 by opening a bracket. We end it by naming what was inside.
The bracket contained a living agent — embodied, self-maintaining, biologically normed, perceptually skilled, socially extended. An agent whose every action is at the same time a form of sensing. Whose snapshot is continuously revised through living. Whose capacity for sharing snapshots with other agents is the biological ground of everything the memetat has since become.
That agent is the subject of this series. And now — with the biological ground established, the perceptual mechanism specified, the social extension grounded — we are ready to ask, with real precision: what does this agent do when it does science?
Chapter 3 begins to answer.
Take-home point. The seven traditions of Chapter 2 are not rivals for the same ground — they are one architecture seen from seven directions. The snapshot is the holistic, simultaneously present, continuously revised living state of an agent’s engagement with its world. The stream is the one-thing-at-a-time serialization that makes meaning transmissible across the gap between agents. Their asymmetry runs at three levels at once — habitat/memetat (ontological), stochastic/schematic (epistemic), semantics/syntax (semiotic). In the living agent the snapshot and the stream are inseparable aspects of one ongoing process: the snapshot guides the stream, the stream revises the snapshot. The natural agent’s snapshot lives; the artificial agent’s snapshot, as today’s systems are built, is stored. That difference is not a permanent verdict but an analytic lens — it tells us what a grounded AI would have to add (multimodal, cross-modal, embodied, enactive, economical), and it is the hinge of Chapter 3.
Next: Chapter 3 opens — “Placing the Schools.” With the architecture named, we return to the schools of cognitive science — behaviorism, constructivism, classical cognitivism, the 4E approaches — and locate each one precisely within it. Not as rival theories of the same thing, but as partial maps of different regions of a single landscape.
Image prompts used for this post. Try them on your own AI model and compare what it produces with our figures.
1. The bracket, opened
Output format: PNG. Landscape, 16cm × 9cm. A single schematic diagram centered on a large pair of square brackets [ ] drawn in heavy ink. LEFT STATE (faded, in the background, cool gray): the brackets are CLOSED tightly around a small black box labeled "the agent (do not look inside)", with only arrows entering and leaving — an arrow labeled "stimulus" going in and an arrow labeled "response" coming out; a small tag reads "behaviorism: study the surface". RIGHT STATE (foreground, warm amber, the main image): the same brackets are now pried OPEN, swung apart like doors, and what spills out is not a mechanism but a small living figure — an embodied human form mid-stride, one hand reaching, eyes engaged, surrounded by faint icons of the things found inside: a body-outline (body schema), a spiral (development), a small cell (autopoiesis), a looping action–perception arrow (sensorimotor), and two figures sharing a gaze over a ball (social). Above the image, large caption: "We opened the bracket." Below, smaller caption: "Behaviorism agreed not to look inside. Chapter 2 looked — and found a living agent." Soft warm tones for the opened state, cool gray for the closed state; sketched, schematic line-art; NOT photographic; no brain icon anywhere.
2. Seven angles, one architecture
Output format: PNG. Landscape, 18cm × 11cm. A single large room (or rotunda) drawn in light line-art, seen from slightly above, with SEVEN doors/windows set around its curved wall, each opening inward toward the same central space. At the center of the room floats the thing they all open onto: a small two-part emblem — a holistic web-like cloud (labeled "snapshot") joined by a horizontal arrow to a row of discrete beads on a line (labeled "stream"). Each of the seven doors is labeled with a tradition and a one-word angle: (1) "Behaviorism — the surface", (2) "Merleau-Ponty — the lived body", (3) "Piaget — development", (4) "Autopoiesis — biology", (5) "Descartes — the divide rediscovered", (6) "Sensorimotor & affordances — perception", (7) "Tomasello — the social". Faint sightlines run from each doorway inward to the central emblem, showing that all seven look upon the same thing from different positions. Above the room, large caption: "Seven angles, one architecture." Below, smaller caption: "Not seven topics — one architecture, described from seven directions." Warm, neutral palette; clean schematic line-art; not photographic; no brain icon.
3. Snapshot and stream — the architecture named
Output format: PNG. Landscape, 18cm × 9cm. A single diagram split into two linked halves by a horizontal serialization arrow. LEFT HALF — labeled "SNAPSHOT (everything at once)": a richly interconnected web/cloud, many nodes joined by many lines all present simultaneously, drawn in warm tones, with a small embodied figure nested inside it coupled to a fragment of world by looping action–perception arrows; small annotation: "holistic · relational · continuously revised · living". RIGHT HALF — labeled "STREAM (one thing at a time)": a single straight one-dimensional channel along which discrete tokens travel left to right in sequence, like beads threaded one after another, drawn in cooler tones; small annotation: "sequential · discrete · transmissible". BETWEEN them, a bold arrow pointing right labeled "serialize", and a fainter return arrow pointing left labeled "reconstruct". BELOW the whole diagram, a slim three-row table aligning the asymmetry at its three levels: row 1 "Ontological — Habitat | Memetat"; row 2 "Epistemic — Stochastic | Schematic"; row 3 "Semiotic — Semantics | Syntax" (the snapshot term on the left column under the snapshot, the stream term on the right column under the stream). Above the diagram, large caption: "Snapshots and streams." Below, smaller caption: "Meaning is constituted all-at-once and transmitted one-at-a-time. The whole architecture lives in that tension." Soft warm tones on the snapshot side, cooler tones on the stream side; clean schematic line-art; not photographic; no brain icon.
4. The snapshot that lives vs. the snapshot that is stored
Output format: PNG. Landscape, 18cm × 9cm. Two side-by-side panels making visible the central difference the post sets up for Chapter 3. PANEL 1 — "The natural agent: the snapshot LIVES": a small embodied human figure in the world, drawn inside a continuously looping cycle — arrows running action → world responds → sensory change → revised snapshot → action, forming an unbroken loop that is clearly in motion (motion-lines, a sense of ongoing flow); the snapshot at the center of the loop is shown as a living, slightly shifting web that is being rewritten by the loop; small caption: "continuously revised through living; snapshot and stream inseparable". PANEL 2 — "The language model: the snapshot is STORED": a large static lattice or frozen high-dimensional grid, drawn in cool tones, labeled "relational map built from the archive of the memetat"; the lattice is sealed inside a frame marked "fixed between training runs", with the action–world loop notably ABSENT or shown as a broken, dashed, disconnected arc that does not close; small caption: "static; not revised through living". Above both panels, large caption: "Two snapshots — one lives, one is stored." Below both panels, smaller caption: "The same architecture, entered from opposite ends. This is the hinge of Chapter 3." Warm tones for the living agent, cool tones for the stored map; sketched schematic line-art; not photographic; NO brain icon — depict the natural agent as a whole embodied figure in an action loop, not a head.
5. The triadic lens, now with flesh
Output format: PNG. Landscape, 16cm × 11cm. A triangle with three labeled vertices, echoing the Chapter 1 triadic lens but now populated with what Chapter 2 contributed. TOP vertex "Agent–World" — clustered around it as small tags: "autopoiesis", "sensorimotor contingencies", "affordances", "body schema"; a tiny icon of an embodied figure coupled to a fragment of terrain. BOTTOM-LEFT vertex "Agent–Agent" — tags: "joint attention", "shared intentionality", "the ball game", "exaptation → language"; a tiny icon of two figures sharing a gaze over a single ball. BOTTOM-RIGHT vertex "Agent–Institution" — tags: "the memetat", "the stream", "the ratchet"; a tiny icon of stacked inscriptions/books with an upward ratchet arrow; this vertex carries a faint forward-pointing label "→ Chapter 4". In the center of the triangle, a small annotation: "Chapter 1 drew the lens. Chapter 2 gave it biological flesh." Clean schematic line-art, neutral palette with one warm accent for the central annotation; not photographic; no brain icon.
The same stream (prompts) activates different snapshots (models) in different receivers (agents). Try the prompts above on your own AI model and compare what it produces with our figures.
This is “The Roots of STEM,” a series exploring the cognitive bases of science, technology, engineering, and mathematics. Subscribe to follow the arc from the body to the laboratory.

