Scheduled task maintenance in AI agents

A few days ago I wrote about how scheduled tasks give an agent proactivity. Today it's the other side: what happens when you stop checking on them.

This morning I had eight scheduled tasks. By the end of the day I was down to six. I deleted two and rewrote two.

Two deleted

The first one was called "Daily Meeting Prep." It ran Monday through Friday at 7:30am and gathered the day's calendar, Odoo and Bitrix24 tasks, and an overnight summary. The problem: I had later created a 6am briefing that does almost the same thing. The two ran side by side for weeks, sending redundant information. Today I asked myself what the 7:30 one was for and there was no good answer. Deleted.

The second was an IMAP inbox check every four hours. The script worked fine, but nobody needed the output. Every run burned OAuth2 refresh tokens and wrote logs, and nobody read any of it. Deleted.

At some point those two tasks made sense. Today they didn't. Nobody had noticed until now.
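
A rough way to catch this kind of duplication earlier is to write down what each task gathers and compare the lists. The sketch below is only an illustration: the Task structure, the source labels, the 0.7 threshold, and the exact contents of the 6am briefing are all assumptions of mine, not how my agent stores its tasks.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Task:
    name: str
    schedule: str            # human-readable, used for reporting only
    sources: frozenset[str]  # hypothetical labels for what the task gathers


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of sources two tasks have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlapping_pairs(tasks: list[Task], threshold: float = 0.7) -> list[tuple[str, str, float]]:
    """Return pairs of tasks whose sources overlap at or above the threshold."""
    pairs = []
    for left, right in combinations(tasks, 2):
        score = jaccard(left.sources, right.sources)
        if score >= threshold:
            pairs.append((left.name, right.name, round(score, 2)))
    return pairs


# Illustrative data: the 6am briefing's source list is an assumption.
TASKS = [
    Task("Daily Meeting Prep", "Mon-Fri 07:30",
         frozenset({"calendar", "odoo_tasks", "bitrix24_tasks", "overnight_summary"})),
    Task("6am briefing", "06:00",
         frozenset({"calendar", "odoo_tasks", "bitrix24_tasks"})),
    Task("IMAP inbox check", "every 4h", frozenset({"imap_inbox"})),
]

if __name__ == "__main__":
    for left, right, score in overlapping_pairs(TASKS):
        print(f"{left} overlaps with {right} ({score})")
```

With this sample data, the only pair flagged is the meeting prep and the 6am briefing, which share three of four sources. It says nothing about whether anyone reads the output; that part still needs a human asking the question.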

One that failed with a timeout

Another task runs at 8am and pulls action items from the Trabitat daily standup via Granola, a meeting notes app. It'd been failing with a timeout.

I traced the flow with my agent. The script that reads the Granola token: 1 millisecond. The call to list meetings: 0.4 seconds. Everything was fast up to there. But when the subagent asked the Granola LLM natural-language questions, the response took almost 10 seconds. The LLM had gotten slower, and the subagent hung while processing intermediate progress events.

We changed the approach. Instead of asking the LLM "what action items are there?", we first list meetings from the last 7 days (0.4s), figure out which one is the Trabitat daily, and request that meeting's details by ID. Two predictable steps. Neither depends on how fast the Granola LLM is.
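
The two steps can be sketched roughly like this. I'm not reproducing Granola's actual API here: list_meetings and get_meeting are hypothetical callables standing in for whatever the integration exposes, and the meeting shape (a dict with "id", "title", and "started_at") and the "trabitat" keyword match are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

# Assumed shape: {"id": str, "title": str, "started_at": timezone-aware datetime}
Meeting = dict


def find_daily(
    meetings: list[Meeting],
    title_keyword: str = "trabitat",
    now: Optional[datetime] = None,
    days: int = 7,
) -> Optional[Meeting]:
    """Pick the most recent meeting in the window whose title contains the keyword."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    candidates = [
        m for m in meetings
        if m["started_at"] >= cutoff and title_keyword in m["title"].lower()
    ]
    return max(candidates, key=lambda m: m["started_at"], default=None)


def fetch_daily_details(
    list_meetings: Callable[[], list[Meeting]],  # hypothetical: step 1, list recent meetings
    get_meeting: Callable[[str], dict],          # hypothetical: step 2, details by ID
    title_keyword: str = "trabitat",
    now: Optional[datetime] = None,
    days: int = 7,
) -> Optional[dict]:
    """Two predictable steps: list, then fetch by ID. No natural-language query."""
    daily = find_daily(list_meetings(), title_keyword, now, days)
    if daily is None:
        return None  # no matching meeting in the window; the caller decides what to report
    return get_meeting(daily["id"])
```

The date filter is applied again on the client side as a guard, in case the listing call returns more than the last 7 days. Returning None when nothing matches keeps "no standup this week" separate from an actual error.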

The thing is: the task was failing and I didn't know. Days of errors piled up.

One that got it wrong without failing

The 6am calendar briefing has a prompt that says "check all calendars." It only checked my main Google account. I have access to three Google accounts for different projects. The agent read "all calendars" as "all calendars on the main account." Weeks of incomplete briefings.

The prompt was ambiguous. The agent took the shortest path. I didn't look at the output carefully until I noticed important events were missing.
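
One way to take the ambiguity out is to name every account in the prompt and then check that the briefing mentions each one. This is a minimal sketch under assumptions: the account addresses are placeholders, and build_prompt and missing_accounts are helper names I made up for illustration.

```python
# Placeholder addresses; the real list would hold the three Google accounts.
ACCOUNTS = [
    "main@example.com",
    "project-a@example.com",
    "project-b@example.com",
]


def build_prompt(accounts: list[str]) -> str:
    """Spell out every account instead of saying 'all calendars'."""
    lines = [
        "Check the calendars of every account listed below.",
        "Report events per account, and write 'no events' for an account with nothing scheduled:",
    ]
    lines += [f"- {account}" for account in accounts]
    return "\n".join(lines)


def missing_accounts(briefing: str, accounts: list[str]) -> list[str]:
    """Return the accounts that the briefing text never mentions."""
    text = briefing.lower()
    return [account for account in accounts if account.lower() not in text]


if __name__ == "__main__":
    print(build_prompt(ACCOUNTS))
    sample = "main@example.com: standup at 9:00"
    print("Not covered:", missing_accounts(sample, ACCOUNTS))
```

Asking for an explicit "no events" line per account is what makes the check possible: an account with an empty day still shows up in the output, so silence means it was skipped.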

The rule

When a scheduled task produces results that don't match its goal, you stop and review it. Doesn't have to be a code bug. Could be that the context changed, an API got slower, another task covers the same thing, or it's no longer needed.

The options are: fix the prompt or approach, disable it, or delete it. What doesn't work is letting it run and hoping it fixes itself.

You don't have to review all eight tasks every day. But you do need a trigger for a review: a task fails twice in a row, or you create a new task. When either happens, go through the existing ones. Ask if they're still needed, if they do what you think they do, and if anyone reads the output.
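
The "fails twice in a row" trigger is simple to sketch, assuming you keep a per-task history of run results. The history format (a list of booleans, newest last) and the task names below are made up for the example.

```python
def consecutive_failures(history: list[bool]) -> int:
    """Count failures at the end of a run history (True = success, newest last)."""
    count = 0
    for succeeded in reversed(history):
        if succeeded:
            break
        count += 1
    return count


def tasks_to_review(histories: dict[str, list[bool]], limit: int = 2) -> list[str]:
    """Names of tasks whose most recent runs failed at least `limit` times in a row."""
    return sorted(
        name for name, history in histories.items()
        if consecutive_failures(history) >= limit
    )


if __name__ == "__main__":
    # Hypothetical run histories for illustration.
    histories = {
        "granola-action-items": [True, False, False],
        "6am-briefing": [False, True],
        "inbox-check": [True, True],
    }
    print(tasks_to_review(histories))
```

This only catches tasks that fail loudly. A task that runs clean but produces the wrong output, like the calendar briefing, needs someone to actually read what it sends.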

Today I removed two tasks that no longer served a purpose and fixed two that had drifted. I'm down to six, reviewed.