CI/CD with Asere — 7 Failures and a Lot of Lessons

My AI agent broke the pipeline 7 times in a row. And that's how we both got better.

The context

I have an AI agent called Asere that works with me on my development projects. It's not a copilot that suggests lines — it's an autonomous agent that reads code, makes changes, commits, and pushes. Basically, a junior dev with production access and zero fear.

One day I asked it to do something seemingly simple: rename a field in a Django model, say from old_field to new_field. The change touched more than 10 files across models, views, tests, and controllers.

What followed were 7 consecutive failed pipelines. Each one with a different error. Each one more ridiculous than the last.

The 7 failures

1. SyntaxError — Duplicate arguments

Asere renamed the parameter in the function signature but left a second copy behind, so the same name appeared twice. Python rejects that at compile time:

def process_data(new_field, new_field):  # 💀

Pipeline: red in 3 seconds.
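
This kind of error is cheap to catch without running a single test, because Python refuses duplicate argument names when it compiles the file. A minimal sketch, assuming a hypothetical helper called compiles_cleanly (it was not part of our pipeline):

```python
def compiles_cleanly(source: str, filename: str = "<snippet>") -> tuple[bool, str]:
    """Byte-compile source without executing it and report (ok, message)."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return False, f"{filename}: {exc.msg}"
    return True, "ok"


# The broken signature from failure 1 and its fix (function bodies are illustrative).
broken = "def process_data(new_field, new_field):\n    return new_field\n"
fixed = "def process_data(new_field):\n    return new_field\n"

print(compiles_cleanly(broken))
print(compiles_cleanly(fixed))
```

Running python -m compileall over the package should give the same kind of early signal for every file at once.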

2. NameError — Ghost variable

Asere fixed the signature but forgot to rename a derived variable inside the function body, so the code referenced a name that no longer existed.

label = old_text.upper()  # old_text? Who are you?

Pipeline: red on the first test.

3. Duplicate assignments

Trying to fix the previous error, Asere added the correct line... without removing the old one. Result: the variable got overwritten with the wrong value two lines later.

new_text = new_field.label
new_text = old_field.label  # zombie from the past

Pipeline: green tests but wrong result. Worse than an error — a silent bug.

4. Whitespace — The relentless linter

With the logic finally correct, the pipeline failed because of... whitespace on empty lines. Ruff doesn't forgive a single extra space.

Pipeline: red for formatting.

5. Code formatting

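Fixed the spaces, but ruff format still wanted lines broken differently, and Ruff's import-sorting rules (the I rules, which run under ruff check rather than ruff format) wanted the imports in a different order.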

Pipeline: red for style.

6. Wrong test argument

The test was passing the value as a positional argument, but the function now expected a keyword argument after the refactor.

# Before
process_data("value")
# After renaming, it needed:
process_data(new_field="value")

Pipeline: red on pytest.
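
One way this mismatch can arise is a keyword-only parameter. The sketch below assumes the refactored signature used a bare * to force keywords; the real signature isn't shown here, so treat the function body as purely illustrative:

```python
def process_data(*, new_field: str) -> str:
    """Keyword-only parameter: callers must write new_field=..."""
    return new_field.upper()


# The old positional call is rejected before the function body runs.
try:
    process_data("value")
except TypeError as exc:
    error = str(exc)
    print(f"TypeError: {error}")

# The keyword call works.
result = process_data(new_field="value")
print(result)
```

The point is that a rename can change how a function must be called, not just what its parameter is named, so every call site in the tests needs the same review as the signature.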

7. DNS down on the runner

Finally, with everything fixed, the pipeline failed because the CI runner had DNS issues. Nothing to do with the code.

Pipeline: red for infrastructure. The cherry on top.

What we learned

After 7 attempts, the pipeline passed. But more important than the green was what changed in our workflow:

1. Tests + Linter + Formatter BEFORE commit

Now it's mandatory. Asere runs ruff check, ruff format, and pytest locally before committing. If any of them fails, it doesn't commit, so nothing broken gets pushed. We baked this directly into the agent's instructions.
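
A gate like this can be sketched in a few lines. The command list below is an assumption rather than our exact setup; in particular it uses ruff format --check, which exits non-zero on unformatted files instead of rewriting them, so the formatter can actually fail the gate.

```python
import subprocess
import sys
from typing import Sequence

# Assumed gate commands, run from the repository root.
CHECKS: list[list[str]] = [
    ["ruff", "check", "."],
    ["ruff", "format", "--check", "."],
    ["pytest"],
]


def run_checks(checks: Sequence[Sequence[str]]) -> bool:
    """Run each command in order and stop at the first one that fails."""
    for cmd in checks:
        print("running:", " ".join(cmd))
        try:
            completed = subprocess.run(list(cmd), check=False)
        except FileNotFoundError:
            print("command not found:", cmd[0], file=sys.stderr)
            return False
        if completed.returncode != 0:
            print("failed:", " ".join(cmd), file=sys.stderr)
            return False
    return True


def main() -> int:
    """Exit code for a hook: 0 lets the commit through, 1 blocks it."""
    return 0 if run_checks(CHECKS) else 1
```

A pre-commit or pre-push hook script would end with sys.exit(main()). Stopping at the first failure keeps the output short, which matters when the reader is an agent that will act on whatever it sees last.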

2. Global grep on renames

When you rename a field, changing the main file isn't enough. You need to run grep -rn "old_name" . across the ENTIRE repository. Seems obvious, but an AI agent that works file by file can easily miss cross-references.
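
The same search can be scripted so the agent gets a list it can act on. This is a sketch with hypothetical names (find_references, skip_dirs). It matches substrings, like the grep above, so it also catches Django lookups such as old_field__gte that a whole-word search would miss.

```python
import tempfile
from pathlib import Path


def find_references(
    root: Path,
    name: str,
    suffixes: tuple[str, ...] = (".py",),
    skip_dirs: tuple[str, ...] = (".git",),
) -> list[tuple[Path, int, str]]:
    """Return (path, line number, stripped line) for every line containing name."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if any(part in skip_dirs for part in path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or non-text files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((path, lineno, line.strip()))
    return hits


# Demo on a throwaway directory; file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "models.py").write_text("new_field = 1\n", encoding="utf-8")
    (root / "views.py").write_text("qs.filter(old_field__gte=3)\n", encoding="utf-8")
    leftovers = [(p.name, n, line) for p, n, line in find_references(root, "old_field")]

print(leftovers)
```

Expect legitimate hits in migration files, which refer to the old name on purpose. Adding "migrations" to skip_dirs is one way to keep the list focused on real leftovers.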

3. Don't duplicate — replace

When migrating from one name to another, the rule is: find the old line, replace it. Don't add the new one next to it. Duplicating arguments, variables, or imports is the fastest way to create bugs that cost triple the debugging time.
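
The zombie assignment from failure 3 is the kind of duplicate a small static check can flag. The sketch below is a heuristic I'd consider adding, not something the pipeline ran: it looks for two back-to-back assignments to the same name where the second one doesn't read the first.

```python
import ast
from typing import Optional


def _simple_target(stmt: ast.stmt) -> Optional[str]:
    """Return the assigned name for statements shaped like `name = value`."""
    if (
        isinstance(stmt, ast.Assign)
        and len(stmt.targets) == 1
        and isinstance(stmt.targets[0], ast.Name)
    ):
        return stmt.targets[0].id
    return None


def overwritten_assignments(source: str) -> list[tuple[str, int, int]]:
    """Find adjacent assignments where the first value is overwritten unread.

    Returns (name, first_line, second_line) tuples.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for first, second in zip(block, block[1:]):
                name = _simple_target(first)
                if name is None or name != _simple_target(second):
                    continue
                reads = {n.id for n in ast.walk(second.value) if isinstance(n, ast.Name)}
                if name not in reads:  # `x = x + 1` style updates are fine
                    findings.append((name, first.lineno, second.lineno))
    return findings


# The function wrapper is illustrative; the two assignments are from failure 3.
snippet = (
    "def build(new_field, old_field):\n"
    "    new_text = new_field.label\n"
    "    new_text = old_field.label\n"
    "    return new_text\n"
)
print(overwritten_assignments(snippet))
```

It only compares neighboring statements, so it will miss duplicates separated by other code. It would still have caught this one.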

4. Backup runner

If your only runner can have network issues, you need a plan B. We now have two runners on different providers.

The takeaway

Some would say that 7 consecutive failures prove AI agents aren't ready to write code. I'd say the opposite: they prove that CI/CD works exactly as it should.

Every failure was caught before reaching production. Every error generated a lesson that's now codified in the process. The agent doesn't need to be perfect — it needs a pipeline that won't let it through when it makes mistakes.

The real problem isn't that Asere fails. It's that a human would make the same mistakes but probably push without running the tests first.

At the end of the day, the pipeline is the adult in the room. And Asere, with its 7 failures, taught me more about CI/CD in one afternoon than any tutorial ever did.