Thread: Combination of partial and full indexes
Hi.
I had a fight with a query planner because it doesn’t listen.
There are two indexes:
- with expression in descending order:
"offers_offer_next_update_idx" btree (offer_next_update(update_ts, update_freq) DESC) WHERE o_archived = false
- unique with two columns:
"offers_source_id_o_key_idx" UNIQUE, btree (source_id, o_key)
Here's the query with filter for offers.source_id columns which
is pretty slow because "offers_source_id_o_key_idx" is not used:
EXPLAIN ANALYZE
SELECT offers.o_url AS offers_o_url
FROM offers
WHERE offers.source_id = 1 AND offers.o_archived = false AND now() > offer_next_update(offers.update_ts, offers.update_freq)
ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC
LIMIT 1000;
Limit (cost=0.68..23403.77 rows=1000 width=116) (actual time=143.544..147.870 rows=1000 loops=1)
-> Index Scan using offers_offer_next_update_idx on offers (cost=0.68..1017824.69 rows=43491 width=116) (actual time=143.542..147.615 rows=1000 loops=1)
Index Cond: (now() > offer_next_update(update_ts, update_freq))
Filter: (source_id = 1)
Rows Removed by Filter: 121376
Total runtime: 148.023 ms
When I remove filter on offers.source_id, query plan looks like this:
EXPLAIN ANALYZE
SELECT offers.o_url AS offers_o_url
FROM offers
WHERE offers.o_archived = false AND now() > offer_next_update(offers.update_ts, offers.update_freq)
ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC
LIMIT 1000;
Limit (cost=0.68..4238.27 rows=1000 width=116) (actual time=0.060..3.877 rows=1000 loops=1)
-> Index Scan using offers_offer_next_update_idx on offers (cost=0.68..1069411.78 rows=252363 width=116) (actual time=0.058..3.577 rows=1000 loops=1)
Index Cond: (now() > offer_next_update(update_ts, update_freq))
Total runtime: 4.031 ms
I even tried to change orders of conditions in second query but it doesn't seem
to make a difference for a planner.
Shouldn't query planner use offers_source_id_o_key_idx to speed up query above?
PostgreSQL version: PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4, 64-bit
Configuration:
name | current_setting | source
------------------------------+----------------------------------------+----------------------
application_name | psql | client
checkpoint_completion_target | 0.9 | configuration file
checkpoint_segments | 3 | configuration file
client_encoding | UTF8 | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 128MB | configuration file
external_pid_file | /var/run/postgresql/9.3-main.pid | configuration file
lc_messages | en_US.UTF-8 | configuration file
lc_monetary | en_US.UTF-8 | configuration file
lc_numeric | en_US.UTF-8 | configuration file
lc_time | en_US.UTF-8 | configuration file
max_connections | 100 | configuration file
max_locks_per_transaction | 168 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 4GB | configuration file
temp_buffers | 12MB | configuration file
unix_socket_directories | /var/run/postgresql | configuration file
work_mem | 16MB | configuration file
Definitions:
CREATE OR REPLACE FUNCTION public.offer_next_update(last timestamp without time zone, minutes smallint)
RETURNS timestamp without time zone
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
RETURN last + (minutes || ' min')::interval;
END
$function$
I dont think offers_source_id_o_key_idx will be used at all. It is a UNIQUE index on (source_id, o_key), but your query doesnot filter for any "o_key", so reading that index does not provide the pointers needed to fetch the actual data in thetable. I will try an index on source_id, offer_next_update(offers.update_ts, offers.update_freq) and see what happens HTH Gerardo ----- Mensaje original ----- > De: "Rafał Gutkowski" <goodkowski@gmail.com> > Para: pgsql-performance@postgresql.org > Enviados: Martes, 7 de Junio 2016 10:39:14 > Asunto: [PERFORM] Combination of partial and full indexes > > > Hi. > > > I had a fight with a query planner because it doesn’t listen. > > > There are two indexes: > > > - with expression in descending order: > "offers_offer_next_update_idx" btree (offer_next_update(update_ts, > update_freq) DESC) WHERE o_archived = false > - unique with two columns: > "offers_source_id_o_key_idx" UNIQUE, btree (source_id, o_key) > > > Here's the query with filter for offers.source_id columns which > is pretty slow because "offers_source_id_o_key_idx" is not used: > > > EXPLAIN ANALYZE > SELECT offers.o_url AS offers_o_url > FROM offers > WHERE offers.source_id = 1 AND offers.o_archived = false AND now() > > offer_next_update(offers.update_ts, offers.update_freq) > ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC > LIMIT 1000; > > > Limit (cost=0.68..23403.77 rows=1000 width=116) (actual > time=143.544..147.870 rows=1000 loops=1) > -> Index Scan using offers_offer_next_update_idx on offers > (cost=0.68..1017824.69 rows=43491 width=116) (actual > time=143.542..147.615 rows=1000 loops=1) > Index Cond: (now() > offer_next_update(update_ts, update_freq)) > Filter: (source_id = 1) > Rows Removed by Filter: 121376 > Total runtime: 148.023 ms > > > > > When I remove filter on offers.source_id, query plan looks like this: > > > EXPLAIN ANALYZE > SELECT offers.o_url AS offers_o_url > FROM offers > WHERE offers.o_archived = false AND now() > > offer_next_update(offers.update_ts, offers.update_freq) > ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC > LIMIT 1000; > > > Limit (cost=0.68..4238.27 rows=1000 width=116) (actual > time=0.060..3.877 rows=1000 loops=1) > -> Index Scan using offers_offer_next_update_idx on offers > (cost=0.68..1069411.78 rows=252363 width=116) (actual > time=0.058..3.577 rows=1000 loops=1) > Index Cond: (now() > offer_next_update(update_ts, update_freq)) > Total runtime: 4.031 ms > > > > > I even tried to change orders of conditions in second query but it > doesn't seem > to make a difference for a planner. > > > Shouldn't query planner use offers_source_id_o_key_idx to speed up > query above? > > > > > PostgreSQL version: PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, > compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4, 64-bit > > > Configuration: > name | current_setting | source > ------------------------------+----------------------------------------+---------------------- > application_name | psql | client > checkpoint_completion_target | 0.9 | configuration file > checkpoint_segments | 3 | configuration file > client_encoding | UTF8 | client > DateStyle | ISO, MDY | configuration file > default_text_search_config | pg_catalog.english | configuration file > effective_cache_size | 128MB | configuration file > external_pid_file | /var/run/postgresql/9.3-main.pid | configuration > file > lc_messages | en_US.UTF-8 | configuration file > lc_monetary | en_US.UTF-8 | configuration file > lc_numeric | en_US.UTF-8 | configuration file > lc_time | en_US.UTF-8 | configuration file > max_connections | 100 | configuration file > max_locks_per_transaction | 168 | configuration file > max_stack_depth | 2MB | environment variable > port | 5432 | configuration file > shared_buffers | 4GB | configuration file > temp_buffers | 12MB | configuration file > unix_socket_directories | /var/run/postgresql | configuration file > work_mem | 16MB | configuration file > > > > > Definitions: > > > > CREATE OR REPLACE FUNCTION public.offer_next_update(last timestamp > without time zone, minutes smallint) > RETURNS timestamp without time zone > LANGUAGE plpgsql > IMMUTABLE > AS $function$ > BEGIN > RETURN last + (minutes || ' min')::interval; > END > $function$ > > > >
I thought that first column from left in multi-column index can and will be used just as it would be a single column index. It doesn’t seem to work with unqiue indexes, which ultimetly makes sense. Thank you Gerardo. > On 07 Jun 2016, at 19:36, Gerardo Herzig <gherzig@fmed.uba.ar> wrote: > > I dont think offers_source_id_o_key_idx will be used at all. It is a UNIQUE index on (source_id, o_key), but your querydoes not filter for any "o_key", so reading that index does not provide the pointers needed to fetch the actual datain the table. > > I will try an index on source_id, offer_next_update(offers.update_ts, offers.update_freq) and see what happens > > HTH > Gerardo > > ----- Mensaje original ----- >> De: "Rafał Gutkowski" <goodkowski@gmail.com> >> Para: pgsql-performance@postgresql.org >> Enviados: Martes, 7 de Junio 2016 10:39:14 >> Asunto: [PERFORM] Combination of partial and full indexes >> >> >> Hi. >> >> >> I had a fight with a query planner because it doesn’t listen. >> >> >> There are two indexes: >> >> >> - with expression in descending order: >> "offers_offer_next_update_idx" btree (offer_next_update(update_ts, >> update_freq) DESC) WHERE o_archived = false >> - unique with two columns: >> "offers_source_id_o_key_idx" UNIQUE, btree (source_id, o_key) >> >> >> Here's the query with filter for offers.source_id columns which >> is pretty slow because "offers_source_id_o_key_idx" is not used: >> >> >> EXPLAIN ANALYZE >> SELECT offers.o_url AS offers_o_url >> FROM offers >> WHERE offers.source_id = 1 AND offers.o_archived = false AND now() > >> offer_next_update(offers.update_ts, offers.update_freq) >> ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC >> LIMIT 1000; >> >> >> Limit (cost=0.68..23403.77 rows=1000 width=116) (actual >> time=143.544..147.870 rows=1000 loops=1) >> -> Index Scan using offers_offer_next_update_idx on offers >> (cost=0.68..1017824.69 rows=43491 width=116) (actual >> time=143.542..147.615 rows=1000 loops=1) >> Index Cond: (now() > offer_next_update(update_ts, update_freq)) >> Filter: (source_id = 1) >> Rows Removed by Filter: 121376 >> Total runtime: 148.023 ms >> >> >> >> >> When I remove filter on offers.source_id, query plan looks like this: >> >> >> EXPLAIN ANALYZE >> SELECT offers.o_url AS offers_o_url >> FROM offers >> WHERE offers.o_archived = false AND now() > >> offer_next_update(offers.update_ts, offers.update_freq) >> ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC >> LIMIT 1000; >> >> >> Limit (cost=0.68..4238.27 rows=1000 width=116) (actual >> time=0.060..3.877 rows=1000 loops=1) >> -> Index Scan using offers_offer_next_update_idx on offers >> (cost=0.68..1069411.78 rows=252363 width=116) (actual >> time=0.058..3.577 rows=1000 loops=1) >> Index Cond: (now() > offer_next_update(update_ts, update_freq)) >> Total runtime: 4.031 ms >> >> >> >> >> I even tried to change orders of conditions in second query but it >> doesn't seem >> to make a difference for a planner. >> >> >> Shouldn't query planner use offers_source_id_o_key_idx to speed up >> query above? >> >> >> >> >> PostgreSQL version: PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, >> compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4, 64-bit >> >> >> Configuration: >> name | current_setting | source >> ------------------------------+----------------------------------------+---------------------- >> application_name | psql | client >> checkpoint_completion_target | 0.9 | configuration file >> checkpoint_segments | 3 | configuration file >> client_encoding | UTF8 | client >> DateStyle | ISO, MDY | configuration file >> default_text_search_config | pg_catalog.english | configuration file >> effective_cache_size | 128MB | configuration file >> external_pid_file | /var/run/postgresql/9.3-main.pid | configuration >> file >> lc_messages | en_US.UTF-8 | configuration file >> lc_monetary | en_US.UTF-8 | configuration file >> lc_numeric | en_US.UTF-8 | configuration file >> lc_time | en_US.UTF-8 | configuration file >> max_connections | 100 | configuration file >> max_locks_per_transaction | 168 | configuration file >> max_stack_depth | 2MB | environment variable >> port | 5432 | configuration file >> shared_buffers | 4GB | configuration file >> temp_buffers | 12MB | configuration file >> unix_socket_directories | /var/run/postgresql | configuration file >> work_mem | 16MB | configuration file >> >> >> >> >> Definitions: >> >> >> >> CREATE OR REPLACE FUNCTION public.offer_next_update(last timestamp >> without time zone, minutes smallint) >> RETURNS timestamp without time zone >> LANGUAGE plpgsql >> IMMUTABLE >> AS $function$ >> BEGIN >> RETURN last + (minutes || ' min')::interval; >> END >> $function$ >> >> >> >>
Altough, creating index `btree (source_id)` still changes nothing. So is `btree (source_id) WHERE o_archived = false`. It looks like partial indexes and full indexes cannot mix togheter even if when they have same condition. > On 08 Jun 2016, at 10:52, Rafał Gutkowski <goodkowski@gmail.com> wrote: > > I thought that first column from left in multi-column index can and will be used just as it would be a single column index. > > It doesn’t seem to work with unqiue indexes, which ultimetly makes sense. > > Thank you Gerardo. > >> On 07 Jun 2016, at 19:36, Gerardo Herzig <gherzig@fmed.uba.ar> wrote: >> >> I dont think offers_source_id_o_key_idx will be used at all. It is a UNIQUE index on (source_id, o_key), but your querydoes not filter for any "o_key", so reading that index does not provide the pointers needed to fetch the actual datain the table. >> >> I will try an index on source_id, offer_next_update(offers.update_ts, offers.update_freq) and see what happens >> >> HTH >> Gerardo >> >> ----- Mensaje original ----- >>> De: "Rafał Gutkowski" <goodkowski@gmail.com> >>> Para: pgsql-performance@postgresql.org >>> Enviados: Martes, 7 de Junio 2016 10:39:14 >>> Asunto: [PERFORM] Combination of partial and full indexes >>> >>> >>> Hi. >>> >>> >>> I had a fight with a query planner because it doesn’t listen. >>> >>> >>> There are two indexes: >>> >>> >>> - with expression in descending order: >>> "offers_offer_next_update_idx" btree (offer_next_update(update_ts, >>> update_freq) DESC) WHERE o_archived = false >>> - unique with two columns: >>> "offers_source_id_o_key_idx" UNIQUE, btree (source_id, o_key) >>> >>> >>> Here's the query with filter for offers.source_id columns which >>> is pretty slow because "offers_source_id_o_key_idx" is not used: >>> >>> >>> EXPLAIN ANALYZE >>> SELECT offers.o_url AS offers_o_url >>> FROM offers >>> WHERE offers.source_id = 1 AND offers.o_archived = false AND now() > >>> offer_next_update(offers.update_ts, offers.update_freq) >>> ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC >>> LIMIT 1000; >>> >>> >>> Limit (cost=0.68..23403.77 rows=1000 width=116) (actual >>> time=143.544..147.870 rows=1000 loops=1) >>> -> Index Scan using offers_offer_next_update_idx on offers >>> (cost=0.68..1017824.69 rows=43491 width=116) (actual >>> time=143.542..147.615 rows=1000 loops=1) >>> Index Cond: (now() > offer_next_update(update_ts, update_freq)) >>> Filter: (source_id = 1) >>> Rows Removed by Filter: 121376 >>> Total runtime: 148.023 ms >>> >>> >>> >>> >>> When I remove filter on offers.source_id, query plan looks like this: >>> >>> >>> EXPLAIN ANALYZE >>> SELECT offers.o_url AS offers_o_url >>> FROM offers >>> WHERE offers.o_archived = false AND now() > >>> offer_next_update(offers.update_ts, offers.update_freq) >>> ORDER BY offer_next_update(offers.update_ts, offers.update_freq) DESC >>> LIMIT 1000; >>> >>> >>> Limit (cost=0.68..4238.27 rows=1000 width=116) (actual >>> time=0.060..3.877 rows=1000 loops=1) >>> -> Index Scan using offers_offer_next_update_idx on offers >>> (cost=0.68..1069411.78 rows=252363 width=116) (actual >>> time=0.058..3.577 rows=1000 loops=1) >>> Index Cond: (now() > offer_next_update(update_ts, update_freq)) >>> Total runtime: 4.031 ms >>> >>> >>> >>> >>> I even tried to change orders of conditions in second query but it >>> doesn't seem >>> to make a difference for a planner. >>> >>> >>> Shouldn't query planner use offers_source_id_o_key_idx to speed up >>> query above? >>> >>> >>> >>> >>> PostgreSQL version: PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, >>> compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4, 64-bit >>> >>> >>> Configuration: >>> name | current_setting | source >>> ------------------------------+----------------------------------------+---------------------- >>> application_name | psql | client >>> checkpoint_completion_target | 0.9 | configuration file >>> checkpoint_segments | 3 | configuration file >>> client_encoding | UTF8 | client >>> DateStyle | ISO, MDY | configuration file >>> default_text_search_config | pg_catalog.english | configuration file >>> effective_cache_size | 128MB | configuration file >>> external_pid_file | /var/run/postgresql/9.3-main.pid | configuration >>> file >>> lc_messages | en_US.UTF-8 | configuration file >>> lc_monetary | en_US.UTF-8 | configuration file >>> lc_numeric | en_US.UTF-8 | configuration file >>> lc_time | en_US.UTF-8 | configuration file >>> max_connections | 100 | configuration file >>> max_locks_per_transaction | 168 | configuration file >>> max_stack_depth | 2MB | environment variable >>> port | 5432 | configuration file >>> shared_buffers | 4GB | configuration file >>> temp_buffers | 12MB | configuration file >>> unix_socket_directories | /var/run/postgresql | configuration file >>> work_mem | 16MB | configuration file >>> >>> >>> >>> >>> Definitions: >>> >>> >>> >>> CREATE OR REPLACE FUNCTION public.offer_next_update(last timestamp >>> without time zone, minutes smallint) >>> RETURNS timestamp without time zone >>> LANGUAGE plpgsql >>> IMMUTABLE >>> AS $function$ >>> BEGIN >>> RETURN last + (minutes || ' min')::interval; >>> END >>> $function$ >>> >>> >>> >>> >