53 Years in the Making
A companion to Agents Are Agents (Part 11 in the series).
The AI agent discourse in 2026 treats “agent” as if it were coined last year. It wasn’t. The concept has a research lineage in computer science stretching back more than half a century, with formal models, programming languages, production systems, and hard-won lessons about what works and what doesn’t in distributed autonomous systems.
This reading list traces that lineage chronologically, from the foundational theoretical work in the 1970s through the systems that put agents into production, to the convergence with large language models. Each entry includes why it matters for understanding where we are now.
Read Wooldridge & Jennings (1995) first. Everything else follows from it.
I. Foundations (1973–1995)
The theoretical and architectural groundwork. These papers defined what an agent is, how agents reason, and how they coordinate.
1. Hewitt, C., Bishop, P., Steiger, R. “A Universal Modular ACTOR Formalism for Artificial Intelligence.” (1973)
IJCAI 1973. (ACM Digital Library)
The Actor model. Everything that follows — Erlang processes, agent systems, message-passing concurrency — traces back to this paper. An actor receives a message, does computation, sends messages, creates new actors. No shared state. The model that refuses to die because it’s correct.
Read for: The realization that message passing and isolated state are not implementation choices — they’re the only way concurrent systems can work reliably.
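Hewitt’s discipline fits in a few lines. A minimal sketch in Python, with a thread standing in for an actor (the `Counter` actor and its message shapes are invented for the example; this is the pattern, not Hewitt’s formalism):

```python
import queue
import threading

class Counter:
    """A toy actor: private state, a mailbox, no shared memory."""
    def __init__(self):
        self._count = 0                # state only this actor touches
        self._mailbox = queue.Queue()  # the sole way to reach the actor
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, message):
        self._mailbox.put(message)

    def _run(self):
        while True:
            msg = self._mailbox.get()    # receive a message...
            if msg[0] == "incr":
                self._count += 1         # ...do local computation...
            elif msg[0] == "get":
                msg[1].put(self._count)  # ...send a message back

c = Counter()
c.send(("incr",))
c.send(("incr",))
reply = queue.Queue()
c.send(("get", reply))
print(reply.get())  # 2
```

Note what’s absent: no lock, no shared counter. Even the read is a message, so the mailbox serializes all access to the state.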
2. Bratman, M. “Intention, Plans, and Practical Reason.” (1987)
Harvard University Press / CSLI Publications.
The philosophical foundation for the BDI (Belief-Desire-Intention) model. Bratman wasn’t a computer scientist — he was a philosopher reasoning about how humans make and execute plans. His insight: intentions are not just strong desires. They’re commitments that constrain future deliberation. An agent that forms an intention doesn’t re-deliberate every step — it follows through unless something changes.
Read for: Understanding why the BDI model works for AI agents. An LLM forming a plan and executing tool calls is doing exactly what Bratman described: committing to an intention and following through, with reconsideration only when beliefs change.
3. Brooks, R. “Elephants Don’t Play Chess.” (1990)
Robotics and Autonomous Systems 6(1–2), 3–15.
The counter-argument to BDI and symbolic AI. Brooks argued that intelligent behavior emerges from simple reactive rules interacting with a complex environment — no internal world model, no explicit reasoning, no beliefs. His Subsumption Architecture layered simple behaviors that collectively produced complex action. He was wrong about representation (LLMs proved that), but right about something deeper: agents don’t need to understand the world completely to act effectively in it.
Read for: The intellectual honesty to include the strongest objection to the BDI thesis. And the recognition that the LLM agent debate — “does it really reason or is it just reactive?” — is a replay of Brooks vs. Bratman from 1990.
4. Rao, A.S., Georgeff, M.P. “Modeling Rational Agents within a BDI-Architecture.” (1991)
Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR’91).
The formalization of BDI as a computational architecture. Rao and Georgeff took Bratman’s philosophy and turned it into a model you could implement: agents maintain beliefs about the world, desires about what they want, and intentions about what they’re committed to doing. The deliberation cycle — perceive, update beliefs, generate options, filter to intentions, execute — became the template for every agent system since.
Read for: The architecture diagram. Then look at Claude Code’s main loop. Same structure.
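The deliberation cycle can be sketched concretely. A toy version in Python (illustrative only, not Rao and Georgeff’s logic; the coffee-making desires are invented for the example):

```python
from collections import namedtuple

# A desire is a goal plus the belief that makes it actionable.
Desire = namedtuple("Desire", ["name", "precondition"])

class BDIAgent:
    def __init__(self, desires):
        self.beliefs = set()      # what the agent takes to be true
        self.desires = desires    # what it would like to bring about
        self.intentions = []      # what it has committed to doing
        self.done = []

    def step(self, percept):
        self.beliefs.add(percept)                  # 1. perceive, revise beliefs
        options = [d for d in self.desires         # 2. generate options
                   if d.precondition in self.beliefs
                   and d.name not in self.done]
        self.intentions = options[:1]              # 3. filter: commit to one
        for intent in self.intentions:             # 4. execute the commitment
            self.done.append(intent.name)

agent = BDIAgent([Desire("brew_coffee", "have_beans"),
                  Desire("buy_beans", "at_store")])
agent.step("have_beans")
print(agent.done)  # ['brew_coffee']
```

The key move is step 3: the agent does not pursue every desire its beliefs support; it commits, and the commitment persists until beliefs change. That is Bratman’s point rendered as a filter.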
5. Shoham, Y. “Agent-Oriented Programming.” (1993)
Artificial Intelligence 60(1), 51–92.
The paper that coined “agent-oriented programming.” Shoham proposed AGENT-0, the first agent-oriented programming language, where agents are programmed in terms of mental categories: beliefs, capabilities, commitments, and commitment rules. García-Montoro et al. classify this as the origin of “commitment agents.”
Read for: The foundational argument that agents should be programmed in terms of their mental state, not their control flow. This is exactly what happened with LLMs — we program them through natural language descriptions of goals and context, not through procedural code.
6. Finin, T. et al. “KQML as an Agent Communication Language.” (1994)
Proceedings of the Third International Conference on Information and Knowledge Management (CIKM’94).
The first formal standard for how agents talk to each other. KQML (Knowledge Query and Manipulation Language) defined performatives — structured message types for requesting, telling, asking, and subscribing. FIPA ACL followed in 1997 with refined semantics. These are the direct historical ancestors of tool-calling protocols like MCP: standardized, typed message formats for agent-to-agent and agent-to-tool communication.
Read for: The realization that standardized agent communication isn’t new. MCP’s structured tool calls are the latest iteration of a problem that was formally addressed thirty years ago.
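The shape of a KQML exchange is easy to show. A sketch in Python, with the message rendered as a dict rather than the original Lisp-style surface syntax (the agent names and the `price(widget, X)` content are invented for the example):

```python
# A KQML message is a performative plus typed parameters.
ask = {
    "performative": "ask-one",    # the speech act: ask, tell, subscribe...
    "sender": "planner",
    "receiver": "price-agent",
    "language": "prolog",         # how the content is encoded
    "ontology": "retail",         # shared vocabulary for interpreting it
    "content": "price(widget, X)",
}

def handle(message):
    """A receiver dispatches on the performative, not on the content."""
    if message["performative"] == "ask-one":
        return {"performative": "tell",
                "sender": message["receiver"],
                "receiver": message["sender"],
                "content": "price(widget, 42)"}
    raise ValueError("unknown performative")

reply = handle(ask)
print(reply["performative"])  # tell
```

Swap the dict for JSON and the performative for a method name, and this is recognizably the shape of a modern tool-calling request.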
7. Russell, S., Norvig, P. “Artificial Intelligence: A Modern Approach.” (1995)
Prentice Hall. (4th edition, 2020)
The textbook that unified AI around the concept of “rational agents.” Russell and Norvig organized the entire field — search, logic, planning, learning, perception — as different capabilities of agents operating in environments. This is the book that made “agent” the default frame for AI, not just a subfield. Every CS student who took an AI course in the last thirty years learned this definition.
Read for: The canonical definition. When the AI agent discourse in 2026 uses “agent” loosely, this is the rigor they’re missing.
8. Wooldridge, M., Jennings, N.R. “Intelligent Agents: Theory and Practice.” (1995)
Knowledge Engineering Review 10(2), 115–152.
The anchor paper. The most cited survey in the agent literature. Wooldridge and Jennings defined the properties that make an agent intelligent: autonomy, social ability, reactivity, and proactivity. They surveyed the theoretical foundations (BDI, reactive architectures, hybrid architectures) and the practical implementations available at the time.
Thirty years later, every property they identified maps directly to what we now call AI agents: autonomy (the model decides what to do), social ability (MCP tool calls, multi-model communication), reactivity (responding to user prompts and tool outputs), proactivity (forming and executing plans).
Read for: The definition. When someone in 2026 says “agent” without knowing what they mean, this paper is what they should have read first.
II. Languages and Architectures (1995–2004)
The period when agent theory became agent engineering. Programming languages, platforms, and standards for building agent systems.
9. Nwana, H.S. “Software Agents: An Overview.” (1996)
Knowledge Engineering Review 11(2), 205–244.
A practical taxonomy of agent types: collaborative, interface, mobile, information, reactive, hybrid. Nwana’s classification was based on what agents actually do in practice, not just their theoretical properties.
Read for: The breadth of what “agent” already meant in 1996. The current AI agent landscape — coding agents, research agents, tool-using agents — is a subset of what this paper surveys.
10. Sutton, R., Barto, A. “Reinforcement Learning: An Introduction.” (1998)
MIT Press. (2nd edition, 2018)
The foundational text for reinforcement learning. Sutton and Barto formalized the agent-environment interface: an agent observes state, takes actions, receives rewards, and updates its policy. The Markov Decision Process (MDP) framework they established is the mathematical substrate beneath every RL agent that followed, including DQN, AlphaGo, and RLHF for language models. If the reading list includes learned agents, this is where they start.
Read for: The mathematical formalization of what it means to be an agent that learns. Every LLM fine-tuned with RLHF is using the framework this book established.
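The interface they formalized fits in a loop: observe state, act, receive reward, update. A sketch using the standard tabular Q-learning update (the two-state environment is invented for the example; real problems differ only in scale):

```python
import random
from collections import defaultdict

random.seed(0)

# Tiny deterministic MDP: in state 0, action "right" earns reward 1.
def env_step(state, action):
    if state == 0 and action == "right":
        return 1, 1.0   # (next state, reward)
    return 0, 0.0

ACTIONS = ["left", "right"]
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = defaultdict(float)  # Q[(state, action)]: estimated long-run value

state = 0
for _ in range(200):
    # epsilon-greedy: mostly exploit the current estimate, sometimes explore
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    next_state, reward = env_step(state, action)
    # The Q-learning update: move toward reward + discounted best next value
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

print(Q[(0, "right")] > Q[(0, "left")])  # True
```

Nothing in the loop mentions beliefs or intentions, yet the agent-environment boundary is identical to the BDI one: perceive, decide, act, revise.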
11. Jennings, N.R., Sycara, K., Wooldridge, M. “A Roadmap of Agent Research and Development.” (1998)
Autonomous Agents and Multi-Agent Systems 1(1), 7–38.
Where the field was heading in 1998. Multi-agent coordination, negotiation protocols, trust, organizational structures. The roadmap predicted that agent technology would become pervasive in distributed systems — it just took longer than expected and arrived through a different door.
Read for: The prediction accuracy. Most of what they described as future work is now being done by LLM-based agents, often without awareness that the problems were already mapped.
12. Armstrong, J. “Making Reliable Distributed Systems in the Presence of Software Errors.” (2003)
PhD Thesis, Royal Institute of Technology, Stockholm.
Joe Armstrong’s doctoral thesis. The theoretical and practical foundation of Erlang/OTP. Armstrong’s core argument: you can’t prevent errors, so you must design for error recovery. Supervision trees, process isolation, hot code reloading, “let it crash” — all formalized here.
Read for: The architecture of fault-tolerant agent systems. Every lesson in this thesis applies directly to AI agent systems. The current frameworks that don’t isolate agents and don’t handle failure structurally are repeating mistakes Armstrong documented twenty years ago.
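The core pattern is small enough to sketch. A loose Python analogy of a restart-for-one supervisor (illustrative only; `supervise` and `flaky_worker` are invented names, and real OTP supervisors manage live processes, not function calls):

```python
def supervise(make_worker, max_restarts=3):
    """'Let it crash': don't defend the worker with defensive code.
    Run it; if it dies, restart a fresh instance from known-good state."""
    restarts = 0
    while restarts <= max_restarts:
        try:
            make_worker()       # fresh state on every (re)start
            return "finished"
        except Exception:
            restarts += 1       # crash observed: restart, don't patch
    return "gave up"            # escalate after repeated failure

attempts = []
def flaky_worker():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient fault")
    # third attempt succeeds

print(supervise(flaky_worker))  # finished
print(len(attempts))            # 3
```

The decision that matters is architectural: error handling lives outside the worker, in a component whose only job is deciding whether to restart or escalate.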
13. Varela, C., Abalde, C., Castro, L., Gulías, J. “On Modelling Agent Systems with Erlang.” (2004)
Erlang’04: Proceedings of the 2004 ACM SIGPLAN Workshop on Erlang.
The paper that mapped BDI agent architecture directly onto Erlang’s process model. Each agent component — event receiver, state server, executor, main process — became an Erlang process. Cooperation between agents became message passing. Fault tolerance came from supervision trees. The implementation demonstrated that the theoretical agent architecture and the practical distributed systems architecture were the same architecture.
Read for: The convergence. This is the paper where agent theory meets systems engineering. The architecture diagram on page 66 is Claude Code’s architecture, arrived at independently.
III. Production Systems (2004–2013)
Agents move from research to production. The lessons learned when agent architectures meet real-world reliability requirements.
14. Armstrong, J. “Programming Erlang: Software for a Concurrent World.” (2007)
Pragmatic Bookshelf.
The practitioner’s guide to building agent-based systems in Erlang. Where the thesis was theoretical, this book is operational. Process spawning, message passing, supervision trees, distributed Erlang, OTP patterns — all explained through working code.
Read for: The patterns. gen_server, gen_fsm, supervisor — these are the patterns that AI agent frameworks should be implementing and mostly aren’t.
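What makes gen_server a pattern worth copying is the split it enforces: generic request/reply plumbing on one side, domain callbacks on the other. A loose Python analogy (no process, no mailbox, no supervision; just the callback split, with a key-value server invented as the example):

```python
class GenServer:
    """Generic plumbing: owns the state, routes requests to callbacks."""
    def __init__(self, init_state):
        self.state = init_state

    def call(self, request):
        # synchronous request/reply, in the spirit of gen_server:call/2
        reply, self.state = self.handle_call(request, self.state)
        return reply

    def handle_call(self, request, state):
        raise NotImplementedError  # supplied by each concrete server

class KVStore(GenServer):
    """Domain code: only the callback, never the plumbing."""
    def handle_call(self, request, state):
        if request[0] == "put":
            _, key, value = request
            return "ok", {**state, key: value}   # reply, new state
        if request[0] == "get":
            return state.get(request[1]), state
        return "unknown", state

kv = KVStore({})
kv.call(("put", "beam", 1986))
print(kv.call(("get", "beam")))  # 1986
```

Note that `handle_call` returns the new state rather than mutating it in place; that is what lets the generic layer own state transitions, which is the property supervision and hot code reload depend on in OTP.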
15. Cantrill, B., Shapiro, M., Leventhal, A. “Dynamic Instrumentation of Production Systems.” (2004)
USENIX Annual Technical Conference 2004.
DTrace gave operating systems real-time observability into everything — kernel, userspace, hardware. For agent-based systems, observability is not optional. You cannot build reliable agent systems if you can’t observe what agents are actually doing. DTrace demonstrated that instrumentation can be zero-cost when not active and comprehensive when needed.
Read for: The observability principle. AI agent systems in 2026 have almost no observability. You can’t inspect what an LLM agent is “thinking,” but you can instrument every tool call, every message, every state change. The systems community solved this decades ago.
16. Joyent. “Manta: Object Storage with Integrated Compute.” (2013)
Architecture documentation and API reference. (Job Patterns)
The distributed Unix pipeline built on agent architecture. Manta ran Unix tools inside isolated compute zones (SmartOS zones + ZFS + DTrace) against a distributed object store. Every job pattern — word count, ETL, video transcode — was a composition of isolated agents communicating through text.
Read for: The proof that Unix philosophy scales to distributed agent systems. cat | grep | sort works across a cluster when the agents are properly isolated and communicate through text. This is the architecture that MCP + Claude Code recapitulates.
IV. The Neural Turn (2014–2022)
Machine learning creates a new kind of agent — one that reasons probabilistically rather than following programmed rules. The architecture problems remain the same.
17. Mnih, V. et al. “Playing Atari with Deep Reinforcement Learning.” (2013)
arXiv:1312.5602.
DeepMind’s DQN paper. An agent that learns to act in an environment from raw pixel input. DQN isn’t BDI in any formal sense — it learns a Q-function, not symbolic beliefs or intentions. But it’s an agent in the fullest sense: an autonomous process that perceives an environment, maintains internal state (learned representations), and takes actions to achieve objectives. The substrate shifted from logic programming to neural networks. The agent abstraction held.
Read for: The moment when “agent” in AI stopped meaning “rule-based system” and started meaning “learned behavior.” The formal architecture changed. The core abstraction — perceive, reason, act — didn’t.
18. Vaswani, A. et al. “Attention Is All You Need.” (2017)
NeurIPS 2017.
The Transformer architecture. Not an agent paper, but the foundation for every LLM-based agent that follows. Attention mechanisms gave neural networks the ability to maintain and query beliefs across long contexts — the prerequisite for LLMs acting as agents.
Read for: Understanding why LLMs can act as agents at all. Attention gives models the ability to selectively retrieve and weight information across long contexts — a functional prerequisite for any agent that needs to maintain and update its understanding of a situation.
19. Silver, D. et al. “Reward Is Enough.” (2021)
Artificial Intelligence 299, 103535.
DeepMind’s argument that intelligence — including perception, language, social intelligence — can emerge from agents maximizing reward in complex environments. Controversial, but relevant: it frames the entire history of agent research as converging toward a single principle.
Read for: The theoretical claim that agent-based architectures are sufficient for general intelligence. Whether or not you agree, it’s the intellectual backdrop for why everyone is building agents now.
V. LLM Agents (2022–2025)
Large language models become agents. The old problems return.
20. Yao, S. et al. “ReAct: Synergizing Reasoning and Acting in Language Models.” (2022)
arXiv:2210.03629.
The paper that formalized the pattern of LLMs alternating between reasoning (thinking about what to do) and acting (calling tools or taking actions). ReAct is the BDI deliberation cycle implemented in natural language: the model maintains beliefs (context), generates desires (sub-goals), forms intentions (action plans), and executes.
Read for: The direct mapping between the 1991 BDI cycle and modern LLM agent behavior. Rao & Georgeff would recognize this immediately.
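The loop itself is simple enough to sketch. An illustrative Python version, with a scripted function standing in for the model (`fake_llm`, the search tool, and the question are all invented for the example):

```python
def react_loop(llm, tools, question, max_steps=5):
    """Alternate reasoning and acting until the model emits an answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought, action, arg = llm(transcript)          # reason: pick next step
        transcript += f"Thought: {thought}\nAction: {action}[{arg}]\n"
        if action == "finish":
            return arg                                  # final answer
        observation = tools[action](arg)                # act: call the tool
        transcript += f"Observation: {observation}\n"   # fold result into beliefs
    return None

# A scripted stand-in for a real model call, just to exercise the loop.
def fake_llm(transcript):
    if "Observation" not in transcript:
        return "I should look this up", "search", "capital of France"
    return "I have the answer", "finish", "Paris"

tools = {"search": lambda q: "Paris is the capital of France."}
print(react_loop(fake_llm, tools, "What is the capital of France?"))  # Paris
```

The transcript is the belief store: every observation is appended and re-read on the next reasoning step, which is exactly belief revision by other means.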
21. Schick, T. et al. “Toolformer: Language Models Can Teach Themselves to Use Tools.” (2023)
arXiv:2302.04761.
LLMs learning to call external tools — APIs, calculators, search engines. This is the moment when LLM agents gained the ability to communicate with external processes through structured interfaces.
Read for: The transition from LLMs as text generators to LLMs as tool-using agents. MCP is the standardized version of what this paper demonstrated.
22. Park, J.S. et al. “Generative Agents: Interactive Simulacra of Human Behavior.” (2023)
arXiv:2304.03442.
Stanford’s generative agents paper. 25 agents with BDI-like architectures — memory (beliefs), plans (intentions), reflection (belief revision) — living in a simulated town. The agents maintained long-term memory, formed relationships, and coordinated activities. Explicitly BDI, implemented on LLMs.
Read for: The full circle. Generative agents are BDI agents where the deliberation cycle is implemented by an LLM instead of a logic program. Thirty years of agent research, same architecture, different substrate.
23. Anthropic. “Model Context Protocol (MCP).” (2024)
Open specification. (GitHub · Announcement)
The standardization of tool calling for LLM agents. MCP defines how models communicate with external tools through structured messages — request and response, with typed parameters and return values. It is, architecturally, a message-passing protocol for agent-to-tool communication.
Read for: The philosophical continuity. The mechanics differ (MCP is synchronous JSON-RPC; Erlang is asynchronous peer-to-peer), but the principle is the same: no shared state, the tool doesn’t know about the model’s internal state, the model doesn’t know about the tool’s implementation. Text in, text out. The boundary between processes is explicit and typed.
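The envelope is worth seeing. A sketch of the shape of an MCP tool invocation as a JSON-RPC 2.0 request (illustrative; the `read_file` tool and its arguments are invented, and the spec defines the full schema):

```python
import json

# An MCP tool call: a JSON-RPC 2.0 request naming the tool and its
# typed arguments. Everything the model tells the tool is in here.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",                        # hypothetical tool
        "arguments": {"path": "/tmp/notes.txt"},
    },
}

# The server replies with a result keyed to the same id. No shared
# state: everything the model learns travels back in this envelope.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "hello"}]},
}

wire = json.dumps(request)
print(json.loads(wire)["method"])  # tools/call
```

Replace `"method": "tools/call"` with a KQML performative and the structural resemblance to the 1994 design is hard to miss.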
24. García-Montoro, C., Vivancos, E., García-Fornes, A., Botti, V.J. “A Software Architecture-Based Taxonomy of Agent-Oriented Programming Languages.” (2008)
Technical University of Valencia.
A taxonomy of agent programming languages classified not by philosophical properties but by software architecture: commitment agents, event-driven agents, goal-directed agents, software-integration agents, hierarchical agents, task-and-communication agents. Six architectures, each with different trade-offs.
Read for: The architectural diversity that existed before LLMs collapsed everything into “prompt → tool call → response.” The taxonomy reveals design choices that the current LLM agent frameworks don’t even know they’re making.
VI. The Convergence (2025)
25. Anthropic. “Claude Code.” (2025)
CLI tool and documentation.
A frontier LLM operating as an agent in a terminal. Receives messages (prompts), maintains beliefs (conversation context, file reads), forms intentions (plans), executes actions (tool calls), and handles failures (adjusts approach). Written in Bun (JavaScriptCore → Safari → Apple), descended from Node.js (V8 → Chrome → Google).
The communication protocol is message passing. The interface is text. The composition model is Unix pipes.
Read for: The end of the reading list is the beginning of practice. Everything above converges here — not by design, but by the gravity of correct architectural decisions reasserting themselves across decades.
How to Read This List
If you read three papers: Wooldridge & Jennings (1995), Armstrong’s thesis (2003), and the ReAct paper (2022). That gives you the definition, the systems engineering, and the modern implementation.
If you read five: Add Rao & Georgeff (1991) for the formal BDI model and Varela et al. (2004) for the Erlang mapping.
If you’re building agent systems: Read Armstrong’s thesis and his Erlang book. Then read everything your framework doesn’t implement and ask why.
The through-line: An agent is an autonomous process that maintains beliefs, forms intentions, and communicates through messages. This was true in 1973 (Hewitt), formalized in 1991 (Rao & Georgeff), defined in 1995 (Wooldridge & Jennings), implemented in production in 1986 (Erlang) and 2011 (SmartOS), and is true now with LLMs in terminals. The substrate changed. The architecture didn’t.
53 years from Hewitt’s Actor model to Claude Code in a terminal. Nothing is new. The patterns are proven. The lineage is real.