Lesson 08 · Concurrency

Let work run on its own, without blocking

"Fire and forget - the agent doesn't block while the command runs."

⏱ ~10 min · 📝 3 interactive widgets · 🧑‍💻 Based on shareAI-lab · s08_background_tasks.py

The pain of blocking calls

s02's bash tool is synchronous: it calls subprocess.run(..., timeout=120), which does not return until the command exits or the timeout expires. Run something like npm install that takes 90 seconds, and the entire agent loop freezes for 90 seconds while the user stares at the terminal wondering if it crashed.
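To make the freeze concrete, here is a small sketch that times a synchronous call. The helper name run_blocking and the half-second child process are illustrative stand-ins, not part of s02; the child process simply sleeps in place of a slow command like npm install.

```python
import subprocess
import sys
import time


def run_blocking(args: list[str], timeout: float = 120.0) -> tuple[str, float]:
    """Run a command synchronously and report how long the caller was blocked."""
    start = time.monotonic()
    # subprocess.run does not return until the child exits (or the timeout fires).
    proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    elapsed = time.monotonic() - start
    return proc.stdout.strip(), elapsed


# Stand-in for a slow command: a child Python process that sleeps briefly.
slow_command = [sys.executable, "-c", "import time; time.sleep(0.5); print('done')"]
output, blocked_for = run_blocking(slow_command)
print(f"output={output!r}, caller blocked for {blocked_for:.2f}s")
```

Nothing else in the calling thread can run during those blocked seconds, which is exactly what happens to the agent loop.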

s08's solution: give the agent a background_run tool. It returns a task_id immediately and runs the command in a separate thread. The agent keeps looping and doing other work; when the background task finishes, the worker thread pushes the result into a notification queue.

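import threading
import uuid

# A method of the background-task manager class
def run(self, command: str) -> str:
    task_id = str(uuid.uuid4())[:8]               # short ID for referring to the task later
    self.tasks[task_id] = {"status":"running", ...}
    # daemon=True: the worker thread will not keep the process alive at exit
    thread = threading.Thread(target=self._execute, args=(task_id, command), daemon=True)
    thread.start()
    return f"Background task {task_id} started"   # returns immediately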
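The excerpt leaves out the worker and the queue. The following is a minimal, self-contained sketch of how the pieces might fit together, not the s08 implementation. Assumptions: the class name BackgroundManager, the status strings, the notification dict layout, the 300-second timeout, and passing the command as an argument list rather than a shell string are all illustrative choices.

```python
import subprocess
import sys
import threading
import time
import uuid


class BackgroundManager:
    """Hypothetical manager: run commands in worker threads, queue their results."""

    def __init__(self, timeout: float = 300.0) -> None:
        self.timeout = timeout                   # assumed limit per command
        self.tasks: dict[str, dict] = {}
        self._notifications: list[dict] = []
        self._lock = threading.Lock()            # guards tasks and _notifications

    def run(self, args: list[str]) -> str:
        task_id = str(uuid.uuid4())[:8]
        with self._lock:
            self.tasks[task_id] = {"status": "running", "result": None}
        worker = threading.Thread(target=self._execute, args=(task_id, args), daemon=True)
        worker.start()
        return task_id                           # returned before the command finishes

    def _execute(self, task_id: str, args: list[str]) -> None:
        try:
            proc = subprocess.run(args, capture_output=True, text=True, timeout=self.timeout)
            status, result = "completed", (proc.stdout + proc.stderr).strip()
        except subprocess.TimeoutExpired:
            status, result = "timeout", "command exceeded the time limit"
        except OSError as exc:
            status, result = "error", str(exc)
        with self._lock:
            self.tasks[task_id].update(status=status, result=result)
            self._notifications.append(
                {"task_id": task_id, "status": status, "result": result}
            )

    def drain_notifications(self) -> list[dict]:
        """Return all pending notifications and leave the queue empty."""
        with self._lock:
            drained, self._notifications = self._notifications, []
        return drained


bg = BackgroundManager()
task_id = bg.run([sys.executable, "-c", "print('build ok')"])
print(f"Background task {task_id} started")      # printed without waiting

notifs: list[dict] = []
deadline = time.monotonic() + 10
while not notifs and time.monotonic() < deadline:
    time.sleep(0.05)                             # stand-in for the agent doing other work
    notifs = bg.drain_notifications()
print(notifs)
```

Swapping the list and lock for queue.Queue would work as well; the explicit lock is used here only because it also protects the tasks dict.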

How results get back to the agent

The key is a thread-safe queue: the background thread appends to it on completion; the main thread drains it before each LLM call, injecting any completed notifications as user messages.

def agent_loop(messages):
    while True:
        # Drain background notifications before each LLM call
        notifs = BG.drain_notifications()
        if notifs:
            # One line per finished task (this formatting is an assumption)
            notif_text = "\n".join(str(n) for n in notifs)
            messages.append({
                "role": "user",
                "content": f"<background-results>{notif_text}</background-results>",
            })
        response = client.messages.create(...)
        ...

So the agent can spawn a background task in turn N, and when the task finishes during turn N+3, the next LLM call automatically carries the result. The model sees the <background-results> block and knows: "that task is done, I can continue."
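The timing rule can be checked with a toy model of the loop. This sketch assumes each turn is one loop iteration that drains first and then makes its LLM call, and that a task "finishing during turn k" completes after that turn's drain has already happened. The function name simulate and the task names are illustrative.

```python
def simulate(finishes_during: dict[str, int], turns: int) -> dict[str, int]:
    """Map each task to the turn whose LLM call first sees its result.

    finishes_during[task] = k means the task completes while turn k is in
    progress, i.e. after turn k's drain has already run.
    """
    queue: list[str] = []
    seen_in: dict[str, int] = {}
    for turn in range(turns):
        # Step 1: drain before the LLM call; everything queued becomes visible now.
        for task in queue:
            seen_in[task] = turn
        queue.clear()
        # Step 2: the LLM call and tool work happen here; tasks may finish meanwhile.
        for task, k in finishes_during.items():
            if k == turn:
                queue.append(task)
    return seen_in


print(simulate({"npm install": 3, "pytest": 0}, turns=6))
# → {'pytest': 1, 'npm install': 4}
```

Under this model a result is never visible in the turn in which the task finishes; it shows up one turn later, and a task that finishes during the final turn is never seen at all.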

Timeline demo

The widget below simulates this: the main thread ticks once per second (standing in for the agent loop's beat), and you can spawn background tasks at any time. Watch the two timelines converge at the "drain point."

Which commands deserve a background slot?

Not everything should go background. Two criteria:

  1. Duration: a few-second command is simpler to run synchronously - no queue maintenance overhead.
  2. Result urgency: if the next step immediately needs the output (e.g. cat file.txt followed by grep), going background is pointless - you'll block waiting anyway.
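
The two criteria can be folded into a small decision helper. This is a sketch only: the function name and the 10-second threshold are assumptions chosen for illustration, and in practice the agent has to estimate the duration rather than know it.

```python
def should_run_in_background(
    expected_seconds: float,
    needs_result_now: bool,
    threshold_seconds: float = 10.0,  # hypothetical cutoff, tune to taste
) -> bool:
    """Apply the two criteria: result urgency first, then duration."""
    if needs_result_now:
        # The next step would block waiting anyway, so run it synchronously.
        return False
    # Short commands are not worth the queue bookkeeping.
    return expected_seconds >= threshold_seconds


print(should_run_in_background(90, needs_result_now=False))   # npm install → True
print(should_run_in_background(0.1, needs_result_now=True))   # cat file.txt → False
print(should_run_in_background(2, needs_result_now=False))    # quick command → False
```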

Widget 1 · Timeline · main thread + 2 background threads

Click Spawn to give a background thread work. The main thread checks the notification queue on every tick. Watch the three swim lanes interleave.

Swim lanes: Main (agent loop) · Background thread A (idle) · Background thread B (idle) · queue: []

Widget 2 · Fire & Forget · 8 commands, which ones go background?

For each command, choose foreground (synchronous) or background (async spawn). Think about two dimensions: "do I need the result right away?" and "how long does it take?"


Widget 3 · Drain Timing · which turn does the agent see the result?

Key rule: the main thread drains the queue before each LLM call. Given 5 scenarios, answer: after a background task completes, which turn does its result enter the agent's view?
