How I Built a Senior-Level Code Challenge in 2 Hours with AI

I got a technical assessment this week. Senior Backend Engineer position. The task: build an inventory and calendar availability system for a vehicle fleet — think car bookings, overlap prevention, maintenance windows.

The kind of thing that sounds simple until you start thinking about race conditions.

The Architecture Decision

I didn't start from zero. I had a previous project — a monorepo using hexagonal architecture (ports & adapters) with Python, async everywhere, clean domain separation. That project used manual dependency injection wired at startup.

Having that reference was a game-changer. Instead of spending an hour debating architecture, my AI agent could infer the pattern from existing code. It understood the structure: domain entities with no framework dependencies, ports as abstract interfaces, adapters for persistence, use cases orchestrating the business logic.
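To make the pattern concrete, here's a minimal sketch of that layering. This is illustrative, not the actual assessment code — `Booking`, `BookingRepository`, and `BookCar` are hypothetical names:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime


# Domain entity: plain Python, no framework imports.
@dataclass
class Booking:
    car_id: int
    start: datetime
    end: datetime


# Port: an abstract interface the domain depends on.
class BookingRepository(ABC):
    @abstractmethod
    async def overlapping(
        self, car_id: int, start: datetime, end: datetime
    ) -> list[Booking]: ...

    @abstractmethod
    async def add(self, booking: Booking) -> None: ...


# Use case: orchestrates business logic against the port,
# unaware of whether the adapter is Postgres or in-memory.
class BookCar:
    def __init__(self, repo: BookingRepository) -> None:
        self.repo = repo

    async def execute(self, car_id: int, start: datetime, end: datetime) -> Booking:
        if await self.repo.overlapping(car_id, start, end):
            raise ValueError("car already booked for this period")
        booking = Booking(car_id, start, end)
        await self.repo.add(booking)
        return booking
```

The use case never imports FastAPI or SQLAlchemy — that's the dependency inversion the pattern is selling.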

But I didn't just copy-paste. The new project improved on the original:

  • FastAPI's native Depends() instead of manual DI wiring — cleaner, more idiomatic
  • PostgreSQL with range types and exclusion constraints for booking overlap prevention at the database level
  • SQLAlchemy async + asyncpg throughout — no sync fallbacks
  • Alembic migrations from day one
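The exclusion-constraint point deserves a concrete sketch. Assuming SQLAlchemy is installed, a hypothetical `bookings` table (table and column names are mine, not the assessment's) can declare overlap prevention right in the schema:

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import TSTZRANGE, ExcludeConstraint
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# One tstzrange column per booking period. The exclusion constraint
# rejects any two rows with the same car_id and overlapping (&&)
# periods -- enforced by Postgres itself, no application-level
# locking required. (At runtime the `=` on an integer column needs
# the btree_gist extension.)
bookings = Table(
    "bookings",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("car_id", Integer, nullable=False),
    Column("period", TSTZRANGE, nullable=False),
    ExcludeConstraint(
        ("car_id", "="),
        ("period", "&&"),
        name="no_overlapping_bookings",
        using="gist",
    ),
)

# Compile the DDL without connecting to a database:
ddl = str(CreateTable(bookings).compile(dialect=postgresql.dialect()))
print(ddl)
```

Two concurrent transactions trying to book the same car for overlapping dates can't both commit — one of them gets a constraint violation, no matter what the application layer does.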

What the AI Actually Did

Here's the honest breakdown:

  1. I defined the domain model — Car, Dealer, Booking entities. Status transitions. What "available" means (not just "no overlapping booking" but also "not in maintenance").
  2. The AI scaffolded the hexagonal structure — reading my reference project, it created the ports, adapters, and use cases following the same pattern but adapted to FastAPI.
  3. TDD from the start — 38 unit tests, all using in-memory adapters. No database needed. This is where hexagonal architecture pays off: your domain logic is testable in 0.06 seconds.
  4. Then came the review. I ran a second AI agent as a code reviewer — think of it as an automated senior dev doing a PR review. It found 17 issues. Two were critical: a missing daily_price field on the domain entity (it was in the spec but got lost in translation), and wrong type annotations in the ORM layer.
  5. Fix cycle — another agent applied all 17 fixes, I verified tests passed, pushed.
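The availability rule from step 1 fits in a few lines of framework-free Python — which is exactly why those 38 tests run without a database. A hypothetical sketch (`Period` and `Car` are illustrative names, not the real entities):

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Period:
    start: datetime
    end: datetime

    def overlaps(self, other: "Period") -> bool:
        # Half-open intervals [start, end): back-to-back bookings are allowed.
        return self.start < other.end and other.start < self.end


@dataclass
class Car:
    id: int
    bookings: list[Period] = field(default_factory=list)
    maintenance_windows: list[Period] = field(default_factory=list)

    def is_available(self, period: Period) -> bool:
        # "Available" = no overlapping booking AND no overlapping maintenance.
        blocked = self.bookings + self.maintenance_windows
        return not any(p.overlaps(period) for p in blocked)
```

Pure domain logic like this is what the in-memory adapters exercise; the database constraint is the belt-and-suspenders backup for the race-condition case.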

Total time from "read the spec" to "repo pushed with CI pipeline": about 2 hours.

The Parts That Required Human Judgment

AI didn't do everything. I made the key architectural decisions:

  • PostgreSQL over SQLite — because the real stack uses Postgres, and I wanted to demonstrate exclusion constraints for overlap prevention. That's a senior-level choice: using the database's native capabilities instead of application-level locking.
  • Hexagonal architecture — not because it's trendy, but because the assessment explicitly asked about SOLID principles and clean architecture. Ports & adapters is the most honest demonstration of dependency inversion.
  • What NOT to build — I skipped authentication, rate limiting, and deployment configs. An assessment isn't a production system. Knowing what to leave out is as important as knowing what to include.

The Reference Project Advantage

This is the part I want to emphasize. Having a well-structured reference project dramatically accelerated the AI's work. It wasn't generating architecture from abstract principles — it was pattern-matching against real, working code.

The inference went something like: "This project uses abstract repository interfaces → the new one should too. This project separates domain from infrastructure → same pattern. This project has ADR documents → create them here too."

If your AI agents are starting from scratch every time, you're leaving speed on the table. Build one project well, and every subsequent one benefits.

What I'd Do Differently

  • GitLab CI from the start — I initially generated a GitHub Actions workflow by habit. The repo was on GitLab. Embarrassing, but that's what reviews are for.
  • More integration tests — the unit tests are solid, but the API integration tests could cover more edge cases.
  • OpenAPI documentation — FastAPI generates it for free, but I could have added richer descriptions.

The Takeaway

The code challenge wasn't about writing code fast. It was about demonstrating architectural thinking — and then executing it cleanly. AI handled the execution. I handled the decisions.

That's the multiplier effect. Not "AI writes code for me." More like: "I architect, AI implements, a second AI reviews, I verify."

Two hours. 38 tests. Hexagonal architecture. PostgreSQL exclusion constraints. Clean domain model. ADR documents explaining every major decision.

Not bad for a Friday morning.
