Rescuing Odoo performance: how we found and fixed a silent performance killer

From 5-second page loads to sub-second responses — a real-world debugging story.

The problem

An Odoo instance running on Docker Swarm was painfully slow. Page loads took 5+ seconds. The login page took almost 25 seconds on a cold start. Something was clearly wrong, but the server had 29 GB of RAM, a modern CPU, and only a handful of users. It should have been flying.

We diagnosed and fixed it with no rearchitecting, no code changes — just proper infrastructure tuning.

Step 1: identify the bottleneck

First, we checked the basics:

# Load averages
cat /proc/loadavg
# 3.13  2.78  2.62  ← High for a lightly-used server

# Memory
free -h
# 6.3 GB used / 29 GB total — Where's the RAM going?

# Top CPU consumers
ps aux --sort=-%cpu | head -15

The output showed 23 Playwright/Chromium processes consuming 57.7% CPU and over 4 GB of RAM. These were zombie renderer tabs from an OpenClaw instance that had been running since February — nearly 3 months of accumulated browser processes that were never cleaned up.

One single process (PID 2509720, running since March 1st) had accumulated 63,183 minutes of CPU time. That's 43 days of CPU consumed by a single zombie tab.

Step 2: kill the zombies

pkill -9 -f 'ms-playwright/chromium'

Instantly freed 4 GB of RAM and the load dropped from 3.13 to under 1.0. But Odoo was still slow — login took 5 seconds. The zombies were part of the problem, but not the root cause.

Step 3: PostgreSQL — the real culprit

With Chrome gone, we looked at the database:

-- 47 connections, 41 of them idle
SELECT count(*) as total FROM pg_stat_activity;
-- total: 47

-- Connections idle for over an hour
SELECT pid, now() - query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - query_start) > interval '5 minutes'
  AND state = 'idle';
-- 41 rows returned

Then we checked the PostgreSQL configuration:

SHOW shared_buffers;      -- 128MB  ← On a 29GB server!
SHOW work_mem;            -- 4MB    ← Way too low for Odoo
SHOW effective_cache_size;-- 4GB    ← Could be higher
SHOW random_page_cost;    -- 4.0    ← Default for mechanical disks

shared_buffers = 128MB was the smoking gun. PostgreSQL was caching almost nothing in memory, hitting the mechanical disk for every query. On a server with 29 GB of RAM, this is like riding a bicycle on a highway.

We also found dead tuples accumulating because autovacuum wasn't aggressive enough:

-- Tables with high dead tuple percentages
SELECT relname, n_live_tup, n_dead_tup,
  CASE WHEN n_live_tup > 0
    THEN round(100.0 * n_dead_tup / n_live_tup, 1)
  END as dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000;

-- stock_move_line: 17.1% dead tuples
-- stock_move:       16.3% dead tuples
-- account_move_line: 16.1% dead tuples

Step 4: the fix

We applied the changes via ALTER SYSTEM (no config file editing needed):

-- Memory: use the RAM you have
ALTER SYSTEM SET shared_buffers = '2GB';
ALTER SYSTEM SET effective_cache_size = '6GB';
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';
ALTER SYSTEM SET wal_buffers = '16MB';

-- WAL: reduce checkpoint pressure
ALTER SYSTEM SET max_wal_size = '2GB';
ALTER SYSTEM SET min_wal_size = '512MB';
ALTER SYSTEM SET checkpoint_completion_target = '0.9';

-- Planner: this server has a mechanical disk but lots of RAM
ALTER SYSTEM SET random_page_cost = '1.5';
ALTER SYSTEM SET effective_io_concurrency = '2';

-- Autovacuum: be aggressive for Odoo's write-heavy tables
ALTER SYSTEM SET autovacuum_max_workers = '3';
ALTER SYSTEM SET autovacuum_naptime = '1min';
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = '0.05';
ALTER SYSTEM SET autovacuum_analyze_scale_factor = '0.02';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = '2000';

-- Logging: help future debugging
ALTER SYSTEM SET log_min_duration_statement = '2000';
ALTER SYSTEM SET log_checkpoints = 'on';

-- Kill idle connections after 5 minutes
ALTER SYSTEM SET idle_in_transaction_session_timeout = '300000';

Then we cleaned up the 29 idle connections and restarted PostgreSQL:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND now() - query_start > interval '5 minutes'
  AND pid <> pg_backend_pid();
-- 29 connections terminated
docker service update --force odoo_db
docker service update --force odoo_kw-odoo

The results

Metric Before After
Login page TTFB 5.0s 0.5s
Home page TTFB 0.8s 0.5s
Server load 3.13 0.78
RAM used 6.3 GB 2.7 GB
DB connections 47 (41 idle) 29
Chrome processes 23 0

10x improvement on page load time with zero code changes.

Key takeaways

  1. Default PostgreSQL config is for tiny servers. If you're running Odoo on a VPS with more than 4 GB of RAM, you almost certainly need to tune shared_buffers and work_mem. The defaults are designed to run on anything — they're not designed to run well.
  2. Kill zombie processes. Browser automation tools (Playwright, Selenium) leak processes if not properly managed. Set up a cron job or watchdog to clean them up regularly.
  3. Autovacuum needs tuning for Odoo. Odoo's write-heavy tables (stock_move, account_move_line, mail_message) generate dead tuples fast. The default autovacuum_vacuum_scale_factor = 0.2 means vacuuming only triggers after 20% of rows are dead. For a production Odoo, 5% is much better.
  4. idle_in_transaction_session_timeout is essential. Odoo doesn't always clean up its connections. Without a timeout, idle connections accumulate and waste resources.
  5. Diagnose before you scale. This server had plenty of resources — they were just misconfigured. Throwing more hardware at the problem would have been expensive and ineffective.

Appendix: the configuration parameters explained

Parameter Default Our Value Why
shared_buffers 128MB 2GB PostgreSQL's main cache. Should be 25% of RAM on dedicated DB servers
work_mem 4MB 64MB Memory per sort/hash operation. Higher = fewer disk spills
effective_cache_size 4GB 6GB Planner hint for OS cache. Higher = smarter query plans
maintenance_work_mem 64MB 512MB For VACUUM, CREATE INDEX. Faster maintenance
random_page_cost 4.0 1.5 Cost of random disk access. Lower = prefers index scans
wal_buffers 4MB 16MB Write-ahead log buffer. Reduces disk writes
max_wal_size 1GB 2GB More WAL space = fewer forced checkpoints
idle_in_transaction_session_timeout 0 (disabled) 5min Kills stuck connections automatically

Diagnosed and fixed on a Contabo VPS running Odoo 17 on Docker Swarm with PostgreSQL 13. Total time from diagnosis to fix: ~30 minutes.

← Back to Blog