DB maintenance commands in PostgreSQL: Analyze, Reindex, Vacuum

Andrei Gridnev
3 min readApr 20, 2016

--

Actively updated database tables in PostgreSQL sometimes require maintenance because of table and index bloat, obsolete and suboptimal query plans and sometimes corrupted indexes. Bloated tables and indexes occupy too much of disk space and memory, suboptimal queries and corrupted indexes slow down database queries. As a result, database performance may significantly degrade. PostgreSQL has a few commands to help developers solve those problems: ANALYZE, VACUUM, REINDEX.

ANALYZE

ANALYZE collects statistics about the contents of all or several columns in a table or entire database. These statistics are then used by the query planner to produce efficient execution plans for queries.

Examples:

  • ANALYZE users; collects statistics for users table.
  • ANALYZE VERBOSE users; does exactly the same plus prints progress messages.
  • ANALYZE users (id, display_name); collects statistics for id and display_name columns of users table.
  • ANALYZE; collects statistics for all the database tables.

Please note, that ANALYZE doesn't read or update indexes. It deals only with table/column contents. ANALYZE. Other queries may read from the table while ANALYZE is running.

Use ANALYZE when the contents of a table has changed significantly: a few percents of records in a table have been added, updated or deleted. It won't hurt to run ANALYZE just before or right after adding an index to a table. That sometimes helps query planner to properly use the new index.

VACUUM

VACUUM reclaims storage occupied by dead records. Dead records in a table are the deleted or the previous versions of updated records. Such records are only marked as unavailable but still occupy disk space, making table scans slower. VACUUM also has an option that tells it to ANALYZE the table.

There are two very different VACUUM versions. Just VACUUM command reorganises live records in a table and makes the allocated storage space available for new records. However, this storage space doesn't return to the operating system. This command can be run in parallel with other read and write operations on the table. VACUUM FULL exclusively locks the table, creates a new table file, copies only live records to this file, then deletes the old table file. The reclaimed storage space is returned to the system in this case. This command doesn't allow any other read and write operations on the vacuumed table.

Examples:

  • VACUUM users; vacuums users table.
  • VACUUM VERBOSE users; vacuums users table and prints progress messages.
  • VACUUM FULL users; fully vacuums users table. Other queries cannot access users table while vacuuming is running.
  • VACUUM ANALYZE users; vacuums and analyses users table.
  • VACUUM FULL VERBOSE ANALYZE users; fully vacuums users table and displays progress messages.
  • VACUUM; vacuums all the tables in the database the current user has access to.

Even though PostgreSQL can autovacuum tables itself after a certain number of rows gets marked as deleted, some developers prefer to run VACUUM ANALYZE on tables with a lot of read/write operations on a custom schedule. For example, at the least busy time of the day.

Use VACUUM when the contents of a table has changed significantly: a few percents of records in a table have been added, updated or deleted.

REINDEX

REINDEX rebuilds an existing index or all indexes for a table or entire database. This command is helpful when an index becomes corrupted and is not used anymore, or when an index gets bloated after significant changes in the table contents.

Examples:

  • REINDEX INDEX idx_users_on_display_name; reindexes index idx_users_on_display_name.
  • REINDEX TABLE users; reindexes users table.
  • REINDEX VERBOSE TABLE users; reindexes users table and prints progress messages.
  • REINDEX DATABASE app_production; reindexes all tables in app_production database.

In my opinion, running VACUUM ANALYZE and then REINDEX commands on the biggest tables in the application database really helps to improve its performance.

If you found this post useful, give some claps below or a share on social media. Thanks for reading!

Originally published at my blog at andreigridnev.com on April 20, 2016.

--

--

Andrei Gridnev
Andrei Gridnev

Written by Andrei Gridnev

Software engineer, manager since 2002. Engineering management, leadership, software architecture, high-performing teams, professional growth.

Responses (1)