WAL Internals in PostgreSQL

Share via:

WAL Internals in PostgreSQL

Write-Ahead Log (WAL) is a very important term in transaction processing. In PostgreSQL, it is also known as a transaction log. A log is a record of all the events or changes and WAL data is just a description of changes made to the actual data. So, it is ‘data about data or metadata’.

The WAL mechanism was first implemented in version 7.1 to mitigate the impacts of server crashes. It also made possible the implementation of the Point-in-Time Recovery (PITR) and Streaming Replication (SR). The term ‘Write-Ahead Log’ implies that any change that you make to the database must first be appended to the log file, and then the log file should be flushed to the disk.

Architecture:

A WAL segment is a 16 MB file, by default, and it is internally divided into pages of 8192 bytes (8 KB).

The first page has a header-data defined by the structure XLogLongPageHeaderData, while the headings of all other pages have the page information defined by the structure XLogPageHeaderData. Following the page header, WAL records are written on each page from the beginning in descending order. See Fig above.

The first WAL segment file is 000000010000000000000001. If the first one is filled up with the writing of WAL records, the second one 000000010000000000000002 would be provided. Files of the successor are used in ascending order in succession, after 0000000100000000000000FF has been filled up, the next one 000000010000000100000000 will be provided. In this way, whenever the last 2-digit carries over, the middle 8-digit number increases one. Also, remember every WAL record written also consists of its own header.

WAL files are placed under the data directory of PG. In the recent versions, pg_wal is the folder under which you can find all the WAL files located ($PGData/pg_wal).

WAL Administration in Postgres

Let us take some simple examples to understand how to deal with WAL Logs.

Ex1: If using your DB, you want to find the current WAL which is in use, you can run the following command. Pg_current_wal_lsn will list the current file which is being used.

Ex2: In case, you would like to do a manual switch of the WAL file, you will have to run the pg_switch_wal function. Look at the fig below, I have switched the log file and it has now moved to 02 as my current WAL file.

Switch it again and see the result.

PostgreSQL switches to a new WAL segment file under the following conditions:

  • The WAL segment has been filled
  • The function pg_switch_wal has been
  • archive_mode is enabled and the time set to archive_timeout has been

After they are switched out, WAL files can either be removed or recycled—i.e., renamed and reused for the future. The number of WAL files that the server would retain at any point in time depends on server configuration and server activity.

Ex3: You can see these files created physically under the pg_wal directory. As stated earlier, the pg_wal directory is created under the data directory which resides under your PG home.

Here

  • PG home is /dev/PG10
  • $PGData is /dev/PG10/data

You can see 3 WAL segment file 16MB of size has been created.

You can also check the count of WAL segment files with the below command directly from the database prompt.

Ex4: Contents of the WAL files can be read by using the pg_waldump utility. You need to run the utility along with the WAL segment file name to check what exactly is stored in the WAL files.

See the fig below, I have tried to filter out only INSERT commands from the WAL file. This utility is available from 9.3+ version onwards.

Ex5: The number of  WAL segment files in the pg_wal directory depends on min_wal_size, max_wal_size, and the amount of WAL generated in previous checkpoint cycles. When old log segment files are no longer needed, they are removed or recycled (that is, renamed to become future segments in the numbered sequence).

You can check if the archiving of the WAL is enabled or not using pg_settings.

You can run the following commands to change the archive mode. You would need to set the archive_command parameter carefully. 

These parameters would need the Postgres server bounce to take effect. Alternatively, you can set these parameters directly in the PostgreSQL.conf file.

 

Author    : Ankit Goyal
LinkedIn : https://www.linkedin.com/in/ankit-goyal-0a72999a

Thank you for giving your valuable time to read the above information. Please click here to subscribe for further updates

KTEXPERTS is always active on below social media platforms.

Facebook  : https://www.facebook.com/ktexperts/
LinkedIn    : https://www.linkedin.com/company/ktexperts/
Twitter       : https://twitter.com/ktexpertsadmin
YouTube   :  https://www.youtube.com/c/ktexperts
Instagram  : https://www.instagram.com/knowledgesharingplatform

Share via:
Note: Please test scripts in Non Prod before trying in Production.
1 Star2 Stars3 Stars4 Stars5 Stars (4 votes, average: 4.75 out of 5)
Loading...

Add Comment