Hello hackers,
Currently, database recovery from archive is performed sequentially,
by reading archived WAL files and applying their records to the database.
Archive files are processed one at a time, which can become a performance
bottleneck when archived WAL files are delivered slowly, because the
database server has to wait for the next WAL segment to arrive before
applying its records.
To address this issue, I propose fetching archived WAL files in parallel,
so that the next WAL segment file is already available by the time its
redo log records need to be processed.
The implementation runs several background worker processes (bgworkers),
each of which executes the shell command specified by the parameter
restore_command to deliver one archived WAL file. The number of parallel
processes is limited by the new parameter max_restore_command_workers.
If this parameter is 0, WAL files are delivered using the original
algorithm, that is, one by one. If it is greater than 0, the database
server starts bgworker processes, up to the limit specified by
max_restore_command_workers, and passes to each process the name of a
WAL file to deliver. Active processes prefetch the specified WAL files
and store the received files in the directory pg_wal/pgsql_tmp. After a
bgworker process finishes receiving a file, it marks itself as free and
waits for a request to receive the next WAL file. The main process
performing database recovery still handles WAL files one by one, but
instead of waiting for the next required WAL file to be delivered, it
checks for that file in the prefetch directory. If the file is present
there, the main process starts processing it.
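To illustrate the idea outside the server, here is a minimal Python sketch
of the scheme described above. It is not code from the patch: threads stand
in for bgworkers, the hypothetical function fetch_segment() stands in for
running restore_command, the segment names are made-up 24-character
strings, and waiting on a future replaces the check of the prefetch
directory. All fetches are queued up front, which the patch itself may not
do.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_segment(archive_dir: Path, prefetch_dir: Path, name: str) -> Path:
    """Hypothetical stand-in for restore_command: copy one archived file."""
    target = prefetch_dir / name
    shutil.copyfile(archive_dir / name, target)
    return target


def recover(archive_dir: Path, prefetch_dir: Path,
            segments: list[str], max_workers: int) -> list[str]:
    """Apply segments strictly in order, fetching them with max_workers threads."""
    applied = []
    if max_workers == 0:
        # Original behaviour: fetch one file, apply it, then fetch the next.
        for name in segments:
            path = fetch_segment(archive_dir, prefetch_dir, name)
            applied.append(path.read_text())
            path.unlink()
        return applied
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Queue every fetch; at most max_workers of them run at the same time.
        pending = [pool.submit(fetch_segment, archive_dir, prefetch_dir, name)
                   for name in segments]
        for future in pending:
            # Blocks only if this particular segment has not arrived yet.
            path = future.result()
            applied.append(path.read_text())
            path.unlink()
    return applied


def demo(max_workers: int) -> list[str]:
    """Build a throwaway archive of five fake segments and recover from it."""
    with tempfile.TemporaryDirectory() as root:
        archive = Path(root, "archive")
        prefetch = Path(root, "prefetch")
        archive.mkdir()
        prefetch.mkdir()
        names = [f"{i:024X}" for i in range(1, 6)]
        for name in names:
            (archive / name).write_text(f"records of {name}")
        return recover(archive, prefetch, names, max_workers)
```

Calling demo(0) and demo(3) returns the segment contents in the same
order. Only the delivery is parallel, while the records are still applied
one segment after another.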
The patch implementing the described approach is attached to this email.
The patch contains a test in the file src/test/recovery/t/021_xlogrestore.pl.
Although the test result depends on real execution time and could hardly be
approved for inclusion in the repository, it was added in order to show
the positive effect of the new algorithm. In my environment, restoring
from archive with parallel prefetching is twice as fast as in the original
mode.
Regards,
Dmitry.