Thread: Explain is slow with tables having many columns
Hello, I have found that explain on tables with many (hundreds) columns are slow compare to nominal executions. This can break application performances when using auto_explain or pg_store_plans. Here is my test case (with 500 columns, can be pushed to 1000 or 1600) create table a(); DECLARE i int; BEGIN for i in 1..500 loop execute 'alter table a add column a'||i::text||' int'; end loop; END $$; #\timing #select a500 from a; a500 ------ (0 rows) Time: 0,319 ms #explain analyze select a500 from a; QUERY PLAN -------------------------------------------------------------------------------------------- Seq Scan on a (cost=0.00..10.40 rows=40 width=4) (actual time=0.010..0.010 rows=0 loops=1) Planning time: 0.347 ms Execution time: 0.047 ms (3 rows) Time: 4,290 ms Here is a loop to try to understand where this comes from DO $$ DECLARE i int; j int; BEGIN for j in 1..100 loop for i in 1..500 loop execute 'explain select a'||i::text||' from a'; end loop; end loop; END $$; Using perf top, most of the cpu time seems to come from relutils.c colname_is_unique: 59,54% libc-2.26.so [.] __GI___strcmp_ssse3 26,11% postgres [.] colname_is_unique.isra.2 1,46% postgres [.] AllocSetAlloc 1,43% postgres [.] SearchCatCache3 0,70% postgres [.] set_relation_column_names 0,56% libc-2.26.so [.] __strlen_avx2 select version(); PostgreSQL 11devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0, 64-bit Could this be improved ? Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html
On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote: > Hello, > I have found that explain on tables with many (hundreds) columns > are slow compare to nominal executions. See also this thread from last month: https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com Justin
Justin Pryzby wrote > On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote: >> Hello, >> I have found that explain on tables with many (hundreds) columns >> are slow compare to nominal executions. > > See also this thread from last month: > > https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com > > Justin maybe, I will check that patch ... I thought it would also have been related to https://www.postgresql.org/message-id/CAMkU%3D1xPqHP%3D7YPeChq6n1v_qd4WGf%2BZvtnR-b%2BgyzFqtJqMMQ%40mail.gmail.com Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html
Hi, (CCing -hackers) On 2018-09-24 12:22:28 -0700, legrand legrand wrote: > I have found that explain on tables with many (hundreds) columns > are slow compare to nominal executions. Yea, colname_is_unique() (called via make_colname_unique()) is essentially O(#total_columns) and rougly called once for each column in a select list (or using or ...). IIRC we've hit this once when I was at citus, too. We really should be usign a more appropriate datastructure here - very likely a hashtable. Unfortunately such a change would likely be a bit too much to backpatch... Greetings, Andres Freund
Hi, (CCing -hackers) On 2018-09-24 12:22:28 -0700, legrand legrand wrote: > I have found that explain on tables with many (hundreds) columns > are slow compare to nominal executions. Yea, colname_is_unique() (called via make_colname_unique()) is essentially O(#total_columns) and rougly called once for each column in a select list (or using or ...). IIRC we've hit this once when I was at citus, too. We really should be usign a more appropriate datastructure here - very likely a hashtable. Unfortunately such a change would likely be a bit too much to backpatch... Greetings, Andres Freund
Hi, On 2018-09-24 12:43:44 -0700, legrand legrand wrote: > Justin Pryzby wrote > > On Mon, Sep 24, 2018 at 12:22:28PM -0700, legrand legrand wrote: > >> Hello, > >> I have found that explain on tables with many (hundreds) columns > >> are slow compare to nominal executions. > > > > See also this thread from last month: > > > > https://www.postgresql.org/message-id/flat/CAEe%3DmRnNNL3RDKJDmY%3D_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw%40mail.gmail.com > > > > Justin > > maybe, I will check that patch ... > > I thought it would also have been related to > https://www.postgresql.org/message-id/CAMkU%3D1xPqHP%3D7YPeChq6n1v_qd4WGf%2BZvtnR-b%2BgyzFqtJqMMQ%40mail.gmail.com Neither of these are related to the problem. Greetings, Andres Freund