About AI Proactivity
Everyone falls somewhere on a spectrum. At one end are people who do exactly what they're told. At the other are people who, on top of that, show some initiative, both within specific tasks and across processes and the project as a whole. The first group is the majority: reliable and predictable. The second is less predictable, but they're the engine of progress; they stir things up. Everyone else feeds off that energy, and together we all move in some right (or not so right) direction. A team always needs both, even when both live in one person, since everyone has to balance between these two poles.
We all already understand perfectly well that neural networks are great and can solve more or less any task given the proper level of decomposition. But right now they have a fundamental problem: they sit at the end of the spectrum that won't lift a finger (or whatever they have) unless you push them. Agents are amazing as executors, but they do absolutely nothing proactively.
I think proactivity is the next stage that will flip the game again, even more than before. Movement in this direction is already visible at the most mainstream level. Take ChatGPT Pulse, which analyzes your chat list and proactively sends you messages like "can't sleep?" and "you were interested in the Roman Empire, here's an interesting fact".
But as I've said, I don't really believe yet that chat interfaces can accumulate enough useful context in their memory, so agents are where the action is. Our next step is to make agents proactive. Let's dream and think.
It seems we need to answer four questions:
- When to launch the agent?
- What should it do?
- How should it do it?
- What to do with the result of its work?
For example:
- When a merge request is created in GitLab, do a code review as described in the documentation and post the result back.
- Or once a day, on a cron job, scan the internet for interesting posts about Android and deliver a summary as a post in a Telegram bot/channel.
- Or launch an agent every two hours and let it decide for itself what to do based on the project's predefined technical backlog: it analyzes the backlog and proposes solutions as a markdown spec/plan (see SDD).
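The four questions map fairly naturally onto a small data structure. Here is a minimal Python sketch of that idea. The names `AgentJob` and `dispatch`, and the trigger labels, are hypothetical and made up for illustration; `run_agent` stands in for whatever actually calls the agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentJob:
    """One proactive job, described by the four questions (hypothetical shape)."""
    trigger: str                     # when to launch: an event or schedule label
    task: str                        # what the agent should do
    instructions: str                # how it should do it
    deliver: Callable[[str], None]   # what to do with the result


def dispatch(event: str, jobs: list[AgentJob], run_agent: Callable[[str], str]) -> int:
    """Run every job whose trigger matches the event; return how many ran."""
    ran = 0
    for job in jobs:
        if job.trigger != event:
            continue
        prompt = f"{job.task}\n\n{job.instructions}"
        job.deliver(run_agent(prompt))  # hand the agent's output to the sink
        ran += 1
    return ran
```

With this shape, the merge request example becomes `AgentJob("merge_request_opened", "Review the code", "Follow the documentation", post_comment)`, where `post_comment` is whatever function posts the result back.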
I ran the last option locally on my computer using launchd + backlog.md + an sh script that launches claude with the appropriate prompt. It works quite well, but it's very clunky to use.
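For a rough idea of what such a launcher does, here is a sketch in Python rather than sh. It is not my actual script. It assumes the backlog is a plain markdown file with `- [ ]` checkboxes, and that the `claude` CLI runs a single prompt non-interactively via `-p`; check both against your own setup. The function names are made up for illustration, and the scheduling itself (launchd, every two hours) lives outside this code.

```python
import subprocess
from typing import Optional

OPEN_MARKER = "- [ ]"  # assumption: unchecked markdown checkbox marks an open item


def next_open_item(backlog_text: str) -> Optional[str]:
    """Return the text of the first unchecked backlog item, or None if none is left."""
    for line in backlog_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(OPEN_MARKER):
            return stripped[len(OPEN_MARKER):].strip()
    return None


def build_command(item: str) -> list[str]:
    """Build the CLI call; the `-p` flag is an assumption about the claude CLI."""
    prompt = (
        "Analyze this backlog item and propose a solution "
        f"as a markdown spec/plan.\n\nItem: {item}"
    )
    return ["claude", "-p", prompt]


def run_once(backlog_path: str = "backlog.md") -> Optional[str]:
    """One scheduled run: pick an item, ask the agent, return its output."""
    with open(backlog_path, encoding="utf-8") as f:
        item = next_open_item(f.read())
    if item is None:
        return None  # nothing to do this time
    result = subprocess.run(
        build_command(item), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Keeping `next_open_item` and `build_command` separate from `run_once` means the selection logic can be checked without launching the agent at all.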
What's really missing so far is a nice constructor-style interface for a system like this. Essentially you just need to learn to spawn agents on certain triggers; everything else we already have. It should also ask questions before diving into a topic that turns out to be completely unnecessary. Maybe I should vibe-code it. 🤔