Re: ogr2ogr slow sql when checking system tables for column info and so on. - Mailing list pgsql-performance
From: Julien Rouhaud
Subject: Re: ogr2ogr slow sql when checking system tables for column info and so on.
Msg-id: 20221021094848.5gbp42wttkjzn3le@jrouhaud
In response to: ogr2ogr slow sql when checking system tables for column info and so on. (Lars Aksel Opsahl <Lars.Opsahl@nibio.no>)
Responses: Re: ogr2ogr slow sql when checking system tables for column info and so on.
List: pgsql-performance
Hi,

On Fri, Oct 21, 2022 at 09:19:58AM +0000, Lars Aksel Opsahl wrote:
>
> The main problem is that for instance ogr2ogr is using more time to get
> system info about tables than doing the actual job.
>
> The time to pick up postgresql meta info takes between 30 and 60 seconds,
> and sometimes hours if we have not done vacuum analyze recently.
> The actual spatial jobs take less than 10 seconds.
>
> Before I run ogr2ogr I do vacuum analyze.
>
>  schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum
> ------------+----------------+------------+------------+-------------------------------
>  pg_catalog | pg_class       |     215296 |       4365 | 2022-10-18 10:24:05.745915+02
>  pg_catalog | pg_attribute   |    1479648 |      18864 | 2022-10-18 12:36:52.820133+02
>  pg_catalog | pg_type        |     200777 |       2318 | 2022-10-18 06:33:58.598257+02
>  pg_catalog | pg_constraint  |      10199 |        104 | 2022-10-20 15:10:57.894674+02
>  pg_catalog | pg_namespace   |        860 |          1 | [NULL]
>  pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
> (6 rows)
>
> VACUUM ANALYZE pg_catalog.pg_class;
> VACUUM ANALYZE pg_catalog.pg_attribute;
> VACUUM ANALYZE pg_catalog.pg_namespace;
> VACUUM ANALYZE pg_catalog.pg_type;
> VACUUM ANALYZE pg_catalog.pg_constraint;
> VACUUM ANALYZE pg_catalog.pg_description;
>
> After running, we have these values:
>
>  schemaname |    relname     | n_live_tup | n_dead_tup |        last_autovacuum
> ------------+----------------+------------+------------+-------------------------------
>  pg_catalog | pg_class       |     221739 |        464 | 2022-10-18 10:24:05.745915+02
>  pg_catalog | pg_namespace   |        860 |          2 |
>  pg_catalog | pg_attribute   |    1464900 |       1672 | 2022-10-18 12:36:52.820133+02
>  pg_catalog | pg_constraint  |      10200 |          8 | 2022-10-20 15:10:57.894674+02
>  pg_catalog | pg_type        |     204936 |         93 | 2022-10-18 06:33:58.598257+02
>  pg_catalog | pg_description |       9119 |          0 | 2022-05-06 01:59:58.664618+02
> (6 rows)
>
> Here https://explain.depesz.com/s/oU19#stats is the sql generated by ogr2ogr
> that takes 33 seconds in this sample.
> [...]
> ->  Seq Scan on pg_attribute a (rows=1464751) (actual time=0.028..17740.663
> [...]
>
> Then we take a copy of the pg_catalog tables involved.
>
> Here https://explain.depesz.com/s/NEwB#source is the trace when using the
> same sql as from ogr2ogr, but using the tables in test_pg_metadata, and
> then it runs in 5 seconds.
> [...]
> ->  Seq Scan on pg_attribute a (rows=1452385) (actual time=0.006..156.392
>
> I do not understand why it's so much slower to use the tables in pg_catalog
> than in test_pg_metadata, because they have the same content.

In both cases you have a sequential scan over the pg_attribute table, but for
pg_catalog it takes 17 seconds to retrieve the 1.4M rows, while in the new
table it takes 156 ms.  It looks like your catalog is heavily bloated, which
is the cause of the slowdown.

You could do a VACUUM FULL of the tables in pg_catalog, but it would only be
a short-term fix, as it's likely that your catalog will get bloated again.
Do you rely a lot on temporary tables?  If yes, that can easily lead to this
kind of side effect, and you should modify your code to perform a manual
vacuum of the catalog tables very often, or add a dedicated high-frequency
task running similar vacuums to keep the bloat under control.
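A minimal sketch of such a maintenance pass follows.  The table list mirrors
the catalogs the poster already vacuums; the dead-tuple check reuses the same
statistics view shown earlier in the thread (the ordering and column choice
here are illustrative, not from the thread):

```sql
-- Inspect dead-tuple counts for the catalog tables that ogr2ogr's
-- introspection queries touch, worst first.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_sys_tables
WHERE schemaname = 'pg_catalog'
  AND relname IN ('pg_class', 'pg_attribute', 'pg_type',
                  'pg_constraint', 'pg_namespace', 'pg_description')
ORDER BY n_dead_tup DESC;

-- Plain VACUUM reclaims dead rows without the exclusive lock that
-- VACUUM FULL takes, so it is safe to run frequently (e.g. from cron).
VACUUM (ANALYZE) pg_catalog.pg_class;
VACUUM (ANALYZE) pg_catalog.pg_attribute;
VACUUM (ANALYZE) pg_catalog.pg_type;
VACUUM (ANALYZE) pg_catalog.pg_constraint;
VACUUM (ANALYZE) pg_catalog.pg_namespace;
VACUUM (ANALYZE) pg_catalog.pg_description;
```

Heavy temporary-table churn bloats pg_class and pg_attribute in particular,
since every temp table's metadata rows become dead on DROP, which is why a
high-frequency vacuum of just these catalogs can be worthwhile.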