Rescuing Odoo performance: how we found and fixed a silent performance killer

From 5-second page loads to sub-second responses — a real-world debugging story.

The problem

An Odoo instance running on Docker Swarm was painfully slow. Page loads took 5+ seconds. The login page took almost 25 seconds on a cold start. Something was clearly wrong, but the server had 29 GB of RAM, a modern CPU, and only a handful of users. It should have been flying.

We diagnosed and fixed it with no rearchitecting, no code changes — just proper infrastructure tuning.

Step 1: identify the bottleneck

First, we checked the basics:

# Load averages
cat /proc/loadavg
# 3.13  2.78  2.62  ← High for a lightly-used server

# Memory
free -h
# 6.3 GB used / 29 GB total — Where's the RAM going?

# Top CPU consumers
ps aux --sort=-%cpu | head -15

The output showed 23 Playwright/Chromium processes consuming 57.7% CPU and over 4 GB of RAM. These were zombie renderer tabs from an OpenClaw instance that had been running since February — nearly 3 months of accumulated browser processes that were never cleaned up.

One single process (PID 2509720, running since March 1st) had accumulated 63,183 minutes of CPU time. That's 43 days of CPU consumed by a single zombie tab.

Step 2: kill the zombies

pkill -9 -f 'ms-playwright/chromium'

Instantly freed 4 GB of RAM and the load dropped from 3.13 to under 1.0. But Odoo was still slow — login took 5 seconds. The zombies were part of the problem, but not the root cause.

Step 3: PostgreSQL — the real culprit

With Chrome gone, we looked at the database:

-- 47 connections, 41 of them idle
SELECT count(*) as total FROM pg_stat_activity;
-- total: 47

-- Connections idle for over an hour
SELECT pid, now() - query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - query_start) > interval '5 minutes'
  AND state = 'idle';
-- 41 rows returned

Then we checked the PostgreSQL configuration:

SHOW shared_buffers;      -- 128MB  ← On a 29GB server!
SHOW work_mem;            -- 4MB    ← Way too low for Odoo
SHOW effective_cache_size;-- 4GB    ← Could be higher
SHOW random_page_cost;    -- 4.0    ← Default for mechanical disks

shared_buffers = 128MB was the smoking gun. PostgreSQL was caching almost nothing in memory, hitting the mechanical disk for every query. On a server with 29 GB of RAM, this is like riding a bicycle on a highway.

We also found dead tuples accumulating because autovacuum wasn't aggressive enough:

-- Tables with high dead tuple percentages
SELECT relname, n_live_tup, n_dead_tup,
  CASE WHEN n_live_tup > 0
    THEN round(100.0 * n_dead_tup / n_live_tup, 1)
  END as dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000;

-- stock_move_line: 17.1% dead tuples
-- stock_move:       16.3% dead tuples
-- account_move_line: 16.1% dead tuples

Step 4: the fix

We applied the changes via ALTER SYSTEM (no config file editing needed):

-- Memory: use the RAM you have
ALTER SYSTEM SET shared_buffers = '2GB';
ALTER SYSTEM SET effective_cache_size = '6GB';
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';
ALTER SYSTEM SET wal_buffers = '16MB';

-- WAL: reduce checkpoint pressure
ALTER SYSTEM SET max_wal_size = '2GB';
ALTER SYSTEM SET min_wal_size = '512MB';
ALTER SYSTEM SET checkpoint_completion_target = '0.9';

-- Planner: this server has a mechanical disk but lots of RAM
ALTER SYSTEM SET random_page_cost = '1.5';
ALTER SYSTEM SET effective_io_concurrency = '2';

-- Autovacuum: be aggressive for Odoo's write-heavy tables
ALTER SYSTEM SET autovacuum_max_workers = '3';
ALTER SYSTEM SET autovacuum_naptime = '1min';
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = '0.05';
ALTER SYSTEM SET autovacuum_analyze_scale_factor = '0.02';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = '2000';

-- Logging: help future debugging
ALTER SYSTEM SET log_min_duration_statement = '2000';
ALTER SYSTEM SET log_checkpoints = 'on';

-- Kill idle connections after 5 minutes
ALTER SYSTEM SET idle_in_transaction_session_timeout = '300000';

Then we cleaned up the 29 idle connections and restarted PostgreSQL:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND now() - query_start > interval '5 minutes'
  AND pid <> pg_backend_pid();
-- 29 connections terminated

docker service update --force odoo_db
docker service update --force odoo_kw-odoo

The results

Metric	Before	After
Login page TTFB	5.0s	0.5s
Home page TTFB	0.8s	0.5s
Server load	3.13	0.78
RAM used	6.3 GB	2.7 GB
DB connections	47 (41 idle)	29
Chrome processes	23	0

10x improvement on page load time with zero code changes.

Key takeaways

Default PostgreSQL config is for tiny servers. If you're running Odoo on a VPS with more than 4 GB of RAM, you almost certainly need to tune shared_buffers and work_mem. The defaults are designed to run on anything — they're not designed to run well.
Kill zombie processes. Browser automation tools (Playwright, Selenium) leak processes if not properly managed. Set up a cron job or watchdog to clean them up regularly.
Autovacuum needs tuning for Odoo. Odoo's write-heavy tables (stock_move, account_move_line, mail_message) generate dead tuples fast. The default autovacuum_vacuum_scale_factor = 0.2 means vacuuming only triggers after 20% of rows are dead. For a production Odoo, 5% is much better.
idle_in_transaction_session_timeout is essential. Odoo doesn't always clean up its connections. Without a timeout, idle connections accumulate and waste resources.
Diagnose before you scale. This server had plenty of resources — they were just misconfigured. Throwing more hardware at the problem would have been expensive and ineffective.

Appendix: the configuration parameters explained

Parameter	Default	Our Value	Why
`shared_buffers`	128MB	2GB	PostgreSQL's main cache. Should be 25% of RAM on dedicated DB servers
`work_mem`	4MB	64MB	Memory per sort/hash operation. Higher = fewer disk spills
`effective_cache_size`	4GB	6GB	Planner hint for OS cache. Higher = smarter query plans
`maintenance_work_mem`	64MB	512MB	For VACUUM, CREATE INDEX. Faster maintenance
`random_page_cost`	4.0	1.5	Cost of random disk access. Lower = prefers index scans
`wal_buffers`	4MB	16MB	Write-ahead log buffer. Reduces disk writes
`max_wal_size`	1GB	2GB	More WAL space = fewer forced checkpoints
`idle_in_transaction_session_timeout`	0 (disabled)	5min	Kills stuck connections automatically

Diagnosed and fixed on a Contabo VPS running Odoo 17 on Docker Swarm with PostgreSQL 13. Total time from diagnosis to fix: ~30 minutes.