Home > mailing lists

Re: How to speed up product code and subcode match - Mailing list pgsql-general

From	Andrus
Subject	Re: How to speed up product code and subcode match
Date	May 23, 2023 13:38:30
Msg-id	1e26fc09-46e4-6cdc-cec1-98bb5a975294@hot.ee Whole thread Raw
In response to	How to speed up product code and subcode match (Andrus <kobruleht2@hot.ee>)
List	pgsql-general

Tree view

Hi!

I ran

analyze toode;
create index vordlusajuhinnak_toode_pattern_idx on vordlusajuhinnak(toode bpchar_pattern_ops);

create index vordlusajuhinnak_toode_idx on vordlusajuhinnak(toode);
analyze vordlusajuhinnak;

Select runs now more than one hour. Output from explain

explain create table peatoode as
select toode.toode , n2, n3, n4
from toode, vordlusajuhinnak
WHERE toode.toode=vordlusajuhinnak.toode OR toode.toode LIKE vordlusajuhinnak.toode||'/%'

"Gather (cost=1000.55..443361894.55 rows=143828910 width=78)"
" Workers Planned: 2"
" -> Nested Loop (cost=0.55..428978003.55 rows=59928712 width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar) OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        -> Parallel Index Only Scan using toode_pkey on toode (cost=0.55..95017.93 rows=303869 width=60)"
"        -> Seq Scan on vordlusajuhinnak (cost=0.00..721.33 rows=39433 width=32)"

with

Set enable_nestloop to off;

explain output is:

"Gather (cost=10000001000.55..10443361906.55 rows=143828910 width=78)"
" Workers Planned: 2"
" -> Nested Loop (cost=10000000000.55..10428978015.55 rows=59928712 width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar) OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        -> Parallel Index Only Scan using toode_pkey on toode (cost=0.55..95029.93 rows=303869 width=60)"
"        -> Seq Scan on vordlusajuhinnak (cost=0.00..721.33 rows=39433 width=32)"

How to speed it up?

Andrus.

23.05.2023 14:32 Bzm@g kirjutas:

Great,

However I think it is still way to slow.

Next step is to run analyze also for the other table vordlusajuhinnak.

And make sure you have an index on vordlusajuhinnak.toode similar to the index on toode.toode

Boris

Am 23.05.2023 um 12:56 schrieb Andrus <kobruleht2@hot.ee>:

Hi!

I ran analyze firma2.toode and changed where clause to use like:

create table peatoode as
select toode.toode , n2, n3, n4
from toode, vordlusajuhinnak
WHERE toode.toode=vordlusajuhinnak.toode OR toode.toode LIKE vordlusajuhinnak.toode||'/%'

In this case it took 37 minutes, returned 277966 rows.

Thank you for help.

Andrus.

23.05.2023 11:24 Bzm@g kirjutas:

Also your row count is way off I guess. Did you ever run analyze bigtable?

--
Boris

Am 23.05.2023 um 10:22 schrieb bzm@2bz.de:

Hi there,

I guess the main problem is the nested loop. 

As a quick recheck what happened if you run your query Without nested loop?

This is not a solution but a quickt test

In a Session 

Set enable_nestedloop = off;
Explain Select your query ;
--
Boris

Am 23.05.2023 um 08:53 schrieb Andrus <kobruleht2@hot.ee>:

Hi!

Price list of main products vordlusajuhinnak contains 3 prices for product (column toode) and has 39433 products:
   create table vordlusajuhinnak( toode varchar(60), n2 numeric(8,2), n3 numeric(8,2), n4 numeric(8,2) );

toode column in unique, may be primary key in table and contains upper case letters, digits and - characters.

product table (toode) contains 733021 products:
   CREATE TABLE toode (       grupp character(1),       toode character(60) primary key,       ... lot of other columns     );

Both tables have pattern indexes to speed up queries:
   CREATE INDEX toode_toode_pattern_idx       ON toode (toode bpchar_pattern_ops ASC NULLS LAST);   -- This index is probably not used, should removed:   CREATE INDEX vordlusajuhinnak_toode_pattern_idx ON vordlusajuhinnak(toode bpchar_pattern_ops);

Product table as both main products and subproducts with sizes. Size is separated by main product code always by / character:

   SHOE1-BLACK   SHOE1-BLACK/38   SHOE1-BLACK/41   SHOE1-BLACK/42   SHOE1-BLACK/43   SHOE2/XXL   SHOE2/L

Product codes contain upper characers only in this table also.

Trying to get prices for all products using
   create table peatoode as   select toode.toode , n2, n3, n4   from toode, vordlusajuhinnak   where  toode.toode between vordlusajuhinnak.toode and vordlusajuhinnak.toode||'/z'

Takes 4.65 hours. How to speed this query up?
Output from explain:

"Nested Loop  (cost=0.55..272273178.12 rows=3202240012 width=78)" "
->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433 width=32)" "  ->  Index Only Scan using toode_pkey on toode
(cost=0.55..6092.62 rows=81207 width=60)" "        Index Cond: (toode

= (vordlusajuhinnak.toode)::bpchar)" "        Filter: ((toode)::text <= ((vordlusajuhinnak.toode)::text || '/z'::text))"

Using  PostgreSQL 13.2, compiled by Visual C++ build 1900, 64-bit in Windows server and psqlODBC driver.
Upgrading Postgres is possible, if this helps.

Tried also using like:
   WHERE toode.toode=vordlusajuhinnak.toode OR     toode.toode LIKE vordlusajuhinnak.toode||'/%'

Posted also in

https://stackoverflow.com/questions/76311957/how-to-match-character-columns-with-separated-subcolumns

Andrus.

pgsql-general by date:

From: Stephen Frost
Date: 23 May 2023, 12:46:57
Subject: Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

From: Tom Lane
Date: 23 May 2023, 14:00:25
Subject: Re: How to speed up product code and subcode match

Re: How to speed up product code and subcode match - Mailing list pgsql-general

Previous

Next