Thread: Instead of using the bloom index, a parallel sequencial scan is used with this example
Instead of using the bloom index, a parallel sequencial scan is used with this example
From
PG Doc comments form
Date:
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/12/bloom.html Description: postgres=# CREATE TABLE tbloom AS postgres-# SELECT postgres-# (random() * 1000000)::int as i1, postgres-# (random() * 1000000)::int as i2, postgres-# (random() * 1000000)::int as i3, postgres-# (random() * 1000000)::int as i4, postgres-# (random() * 1000000)::int as i5, postgres-# (random() * 1000000)::int as i6 postgres-# FROM postgres-# generate_series(1,10000000); SELECT 10000000 postgres=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6); ERROR: access method "bloom" does not exist postgres=# create extension bloom; CREATE EXTENSION postgres=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6); CREATE INDEX postgres=# CREATE index btreeidx ON tbloom (i1, i2, i3, i4, i5, i6); CREATE INDEX postgres=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Gather (cost=1000.00..127220.00 rows=250 width=24) (actual time=974.467..974.513 rows=0 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on tbloom (cost=0.00..126195.00 rows=104 width=24) (actual time=895.448..895.448 rows=0 loops=3) Filter: ((i2 = 898732) AND (i5 = 123451)) Rows Removed by Filter: 3333333 Planning Time: 16.006 ms Execution Time: 974.635 ms (8 rows) postgres=# analyze tbloom; ANALYZE postgres=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Gather (cost=1000.00..127195.82 rows=1 width=24) (actual time=803.314..803.436 rows=0 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on tbloom (cost=0.00..126195.72 rows=1 width=24) (actual time=775.911..775.911 rows=0 loops=3) Filter: ((i2 = 898732) AND (i5 = 123451)) Rows Removed by Filter: 3333333 Planning Time: 0.416 ms Execution Time: 803.471 ms (8 rows)
Re: Instead of using the bloom index, a parallel sequencial scan isused with this example
From
Bruce Momjian
Date:
On Thu, Oct 24, 2019 at 03:17:06PM +0000, PG Doc comments form wrote: > The following documentation comment has been logged on the website: > > Page: https://www.postgresql.org/docs/12/bloom.html > Description: I cleaned up your script and created an SQL file that can be piped into psql, attached. I see the bloomidx index being used without and with the ANALYZE, output attached. I tested this on git master, and back through PG 10. Would you please run these queries and post the output: SELECT version(); SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); My guess is that you have some non-default setting that is causing bloomidx not to be used. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
Re: Instead of using the bloom index, a parallel sequencial scan isused with this example
From
"Daniel Westermann (DWE)"
Date:
>>On Thu, Oct 24, 2019 at 03:17:06PM +0000, PG Doc comments form wrote:
>> The following documentation comment has been logged on the website:
>>
>>Page: https://www.postgresql.org/docs/12/bloom.html
>> Description:
>I cleaned up your script and created an SQL file that can be piped into
>psql, attached. I see the bloomidx index being used without and with
>the ANALYZE, output attached. I tested this on git master, and back
>through PG 10. Would you please run these queries and post the output:
>>
>>Page: https://www.postgresql.org/docs/12/bloom.html
>> Description:
>I cleaned up your script and created an SQL file that can be piped into
>psql, attached. I see the bloomidx index being used without and with
>the ANALYZE, output attached. I tested this on git master, and back
>through PG 10. Would you please run these queries and post the output:
Thanks for having a look, Bruce. You test case is not exactly the same as in the documentation. For you "temporary table" test case I indeed see the bloom index getting used. Doing the same with a normal table results in a parallel seq scan.
postgres=# SELECT version();
version
----------------------------------------------------------------------------------------------------------------------------
PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)
postgres=# SELECT name, current_setting(name), source
postgres-# FROM pg_settings
postgres-# WHERE source NOT IN ('default', 'override');
name | current_setting |
----------------------------+--------------------+----------------------
application_name | psql | client
client_encoding | UTF8 | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
dynamic_shared_memory_type | posix | configuration file
lc_messages | en_US.utf8 | configuration file
lc_monetary | de_CH.UTF-8 | configuration file
lc_numeric | de_CH.UTF-8 | configuration file
lc_time | en_US.UTF-8 | configuration file
listen_addresses | * | configuration file
log_timezone | Europe/Zurich | configuration file
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
max_wal_size | 1GB | configuration file
min_wal_size | 80MB | configuration file
port | 5432 | environment variable
shared_buffers | 128MB | configuration file
TimeZone | Europe/Zurich | configuration file
Regards
Daniel
Attachment
Re: Instead of using the bloom index, a parallel sequencial scan isused with this example
From
Bruce Momjian
Date:
On Tue, Nov 5, 2019 at 08:28:57PM +0000, Daniel Westermann (DWE) wrote: > > >>On Thu, Oct 24, 2019 at 03:17:06PM +0000, PG Doc comments form wrote: > >> The following documentation comment has been logged on the website: > >> > >>Page: https://www.postgresql.org/docs/12/bloom.html > >> Description: > > >I cleaned up your script and created an SQL file that can be piped into > >psql, attached. I see the bloomidx index being used without and with > >the ANALYZE, output attached. I tested this on git master, and back > >through PG 10. Would you please run these queries and post the output: > > Thanks for having a look, Bruce. You test case is not exactly the same as in > the documentation. For you "temporary table" test case I indeed see the bloom > index getting used. Doing the same with a normal table results in a parallel > seq scan. Ah, I see your point about the temporary tables. I have retested with permanent tables, and see the same output you do. By reducing the generate_series by 100x, I am able to get the desired EXPLAIN plans by turning on/off various optimizer settings. I am attaching my test script and its output. I suggestion is what I reduce generate_series by 100x and update the EXPLAIN plans in the docs. The docs are unclear on how these different plans are generated, though I am not excited about adding the exact optimizer settings to generate each plan. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +