
synchronize_seqscans' description is a bit misleading - Mailing list pgsql-hackers

From	Gurjeet Singh
Subject	synchronize_seqscans' description is a bit misleading
Date	April 11, 2013 04:57:33
Msg-id	CABwTF4VwxS+jjT2RZSzHny5LArW+jFjFn5uiGH8cTRCXETGNag@mail.gmail.com
Responses	Re: [DOCS] synchronize_seqscans' description is a bit misleading
List	pgsql-hackers


If I'm reading the code right [1], this GUC does not actually *synchronize* the scans, but instead just makes sure that a new scan starts from a block that was reported by some other backend performing a scan on the same relation.
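As a rough illustration of that reading, here is a toy model of the start-block hint. It is not the code in heapam.c; the `ScanHints` class, `scan_order` helper, and the block numbers are all hypothetical names and values invented for this sketch.

```python
# Toy model of the start-block hint; NOT PostgreSQL's actual implementation.
class ScanHints:
    """Shared table: relation name -> block most recently reported by a scan."""

    def __init__(self):
        self._last_block = {}

    def report(self, relation, block):
        # A running scan periodically reports where it currently is.
        self._last_block[relation] = block

    def start_block(self, relation):
        # A new scan starts where some other scan last reported being,
        # or at block 0 when nobody has reported anything.
        return self._last_block.get(relation, 0)


def scan_order(start, nblocks):
    """Blocks visited by a scan that starts at `start` and wraps around."""
    return [(start + i) % nblocks for i in range(nblocks)]


hints = ScanHints()
hints.report("original_table", 5)            # an in-progress scan is at block 5
start = hints.start_block("original_table")  # the new scan takes the hint
order = scan_order(start, nblocks=8)
print(order)  # [5, 6, 7, 0, 1, 2, 3, 4]
```

Note that the hint only decides where the new scan begins; nothing in this model makes either scan wait for the other afterwards.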

Since the backends scanning the relation may be processing the relation at different speeds, even though each one took the hint when starting the scan, they may end up being out of sync with each other. Even in a single query, there may be different scan nodes scanning different parts of the same relation, and even they don't synchronize with each other (and for good reason).
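A minimal sketch of that drift, assuming two scans that begin at the same hinted block but process a fixed, different number of blocks per tick. The speeds and the `gap_over_time` helper are made up for illustration.

```python
def gap_over_time(start_block, speed_a, speed_b, ticks):
    """Distance in blocks between two scans that start at the same block.

    speed_a and speed_b are blocks processed per tick (hypothetical numbers).
    """
    pos_a = pos_b = start_block
    gaps = []
    for _ in range(ticks):
        pos_a += speed_a
        pos_b += speed_b
        gaps.append(abs(pos_a - pos_b))
    return gaps


gaps = gap_over_time(start_block=100, speed_a=10, speed_b=7, ticks=5)
print(gaps)  # [3, 6, 9, 12, 15]
```

Both scans took the same hint, yet the gap grows on every tick because nothing pulls the faster scan back.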

Imagining that all scans on a table are always synchronized may lead some to wrongly believe that adding more backends scanning the same table will not incur any extra I/O; that is, that only one stream of blocks will be read from disk no matter how many backends you add to the mix. I noticed this when I was creating partition tables: each of those was a CREATE TABLE AS SELECT FROM original_table (to avoid WAL generation), and running more than 3 such transactions caused the disk read throughput to behave unpredictably, sometimes even dipping below 1 MB/s for a few seconds at a stretch.
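To make the extra-I/O point concrete, here is a toy simulation that counts cache misses ("disk reads") for scans sharing a small LRU cache. The cache is a stand-in I am assuming for whatever buffering sits between the backends and the disk; the sizes, speeds, and the `disk_reads` function are hypothetical, and the model ignores readahead and everything else a real system does.

```python
from collections import OrderedDict


def disk_reads(nblocks, cache_size, speeds):
    """Count cache misses when several scans read blocks 0..nblocks-1.

    speeds[i] is how many blocks scan i reads per tick. All scans start at
    block 0 at the same time, i.e. each one "took the hint".
    """
    cache = OrderedDict()            # LRU order: least recently used first
    positions = [0] * len(speeds)
    misses = 0
    while any(p < nblocks for p in positions):
        for i, speed in enumerate(speeds):
            for _ in range(speed):
                block = positions[i]
                if block >= nblocks:
                    break            # this scan has finished
                if block in cache:
                    cache.move_to_end(block)       # hit: refresh recency
                else:
                    misses += 1                    # miss: read from "disk"
                    cache[block] = True
                    if len(cache) > cache_size:
                        cache.popitem(last=False)  # evict least recently used
                positions[i] += 1
    return misses


same_speed = disk_reads(nblocks=1000, cache_size=16, speeds=[2, 2])
drifting = disk_reads(nblocks=1000, cache_size=16, speeds=[3, 1])
print(same_speed, drifting)
```

With equal speeds the second scan always finds its block in the cache, so the table is read from disk once. With unequal speeds the slower scan falls further behind than the cache can cover and starts reading blocks again, so the miss count climbs toward twice the table size.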

Please note that I am not complaining about the implementation, which I think is the best we can do without making backends wait for each other. It's just that the documentation [2] implies that the scans are synchronized through the entire run, which is clearly not the case. So I'd like the docs to be improved to reflect that.

How about something like:

<doc>
synchronize_seqscans (boolean)
This allows sequential scans of large tables to start from a point in the table that is already being read by another backend. This increases the probability that concurrent scans read the same block at about the same time and hence share the I/O workload. Note that, due to the difference in speeds of processing the table, the backends may eventually get out of sync, and hence stop sharing the I/O workload.

When this is enabled, ... The default is on.
</doc>

Best regards,

[1] src/backend/access/heap/heapam.c
[2] http://www.postgresql.org/docs/9.2/static/runtime-config-compatible.html#GUC-SYNCHRONIZE-SEQSCANS

--

Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
