Re: postgres_fdw: using TABLESAMPLE to collect remote sample - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: postgres_fdw: using TABLESAMPLE to collect remote sample
Msg-id 84afe85f-2aa0-5aef-fa4a-59759afc03fb@enterprisedb.com
In response to Re: postgres_fdw: using TABLESAMPLE to collect remote sample  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: postgres_fdw: using TABLESAMPLE to collect remote sample  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Hi,

here's a slightly updated version of the patch series. The 0001 part
adds tracking of server_version_num, so that it's possible to enable
other features depending on it. In this case it's used to decide whether
TABLESAMPLE is supported.
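
As a rough illustration of that version check (a Python sketch, not the patch's actual C code): TABLESAMPLE was introduced in PostgreSQL 9.5, which reports server_version_num 90500. The function and constant names here are hypothetical.

```python
# PostgreSQL 9.5 (server_version_num 90500) is the first release with TABLESAMPLE.
TABLESAMPLE_MIN_VERSION = 90500


def supports_tablesample(server_version_num: int) -> bool:
    """Return True if a remote server with this version number can run TABLESAMPLE.

    Hypothetical helper: the name is an assumption made for illustration.
    """
    return server_version_num >= TABLESAMPLE_MIN_VERSION
```

Tracking the numeric version rather than the version string keeps such feature checks to a single integer comparison.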

The 0002 part modifies the sampling. I realized we can do something
similar even on pre-9.5 releases, which lack TABLESAMPLE, by adding
"WHERE random() < $1" to the remote query. It's not perfect, because the
remote server still has to read the whole table, but that's still better
than also sending all of it over the network.
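
To make the two strategies concrete, here is a minimal Python sketch of how the remote sampling query might be chosen. It is not the patch's code: the function name, its parameters, and the inlined sample fraction (the text above uses a $1 parameter) are assumptions made for illustration, and the relation name is assumed to be already quoted.

```python
# PostgreSQL 9.5 (server_version_num 90500) is the first release with TABLESAMPLE.
TABLESAMPLE_MIN_VERSION = 90500


def build_sample_query(relation: str, fraction: float,
                       server_version_num: int,
                       method: str = "bernoulli") -> str:
    """Build a remote query that returns roughly `fraction` of the rows.

    Hypothetical helper; `relation` is assumed to be a trusted, already
    quoted name. `method` is "system", "bernoulli", or "random".
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if method not in ("system", "bernoulli", "random"):
        raise ValueError(f"unknown sampling method: {method}")

    # Servers older than 9.5 have no TABLESAMPLE, so fall back to random().
    if method != "random" and server_version_num < TABLESAMPLE_MIN_VERSION:
        method = "random"

    if method == "random":
        # The remote side still scans the whole table, but only the
        # sampled rows are sent over the network.
        return f"SELECT * FROM {relation} WHERE random() < {fraction!r}"

    # TABLESAMPLE takes a percentage (0-100), not a fraction (0-1).
    percent = fraction * 100.0
    return f"SELECT * FROM {relation} TABLESAMPLE {method.upper()} ({percent!r})"
```

For example, `build_sample_query("public.t", 0.25, 90400)` falls back to the random() form, while the same call with a 9.5 or newer version number produces a TABLESAMPLE BERNOULLI query.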

There's a "sample" option for foreign server/table, which can be used to
disable the sampling if needed.

A simple measurement of ANALYZE on a foreign table with 10M rows, on localhost:

  old:        6600ms
  random:      450ms
  tablesample:  40ms (system)
  tablesample: 200ms (bernoulli)

Local analyze takes ~190ms, so that's quite close.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
