Re: strange slow query - lost lot of time somewhere - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: strange slow query - lost lot of time somewhere |
Date | |
Msg-id | CAFj8pRAmwD50zZH2CH=P2Os-gHx+1vk2hrROS8N=cziAnnwM5Q@mail.gmail.com Whole thread Raw |
In response to | Re: strange slow query - lost lot of time somewhere (David Rowley <dgrowleyml@gmail.com>) |
List | pgsql-hackers |
st 4. 5. 2022 v 2:15 odesílatel David Rowley <dgrowleyml@gmail.com> napsal:
On Tue, 3 May 2022 at 17:02, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> út 3. 5. 2022 v 6:57 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>> You sure there's not something taking an exclusive lock on one of these
>> tables every so often?
>
> I am almost sure, I can see this issue only every time when I set a higher work mem. I don't see this issue in other cases.
hmm, I don't think the query being blocked on a table lock would cause
this behaviour. As far as I know, all table locks should be obtained
before the timer starts for the "Execution Time" timer in EXPLAIN
ANALYZE. However, locks are obtained on indexes at executor startup,
so if there was some delay in obtaining a lock on the index it would
show up this way. I just don't know of anything that obtains a
conflicting lock on an index without the same conflicting lock on the
table that the index belongs to.
I do agree that the perf report does indicate that the extra time is
taken due to some large amount of memory being allocated. I just can't
quite see how that would happen in Memoize given that
estimate_num_groups() clamps the distinct estimate as the number of
input rows, which is 91 in both cases in your problem query.
Are you able to run the Memoize query in psql with \watch 0.1 for a
few seconds while you do:
perf record --call-graph dwarf --pid <pid> sleep 2
then send along the perf report.
I locally hacked build_hash_table() in nodeMemoize.c to make the
hashtable 100 million elements and I see my perf report for a trivial
Memoize query come up as:
+ 99.98% 0.00% postgres postgres [.] _start
+ 99.98% 0.00% postgres libc.so.6 [.]
__libc_start_main_alias_2 (inlined)
+ 99.98% 0.00% postgres libc.so.6 [.] __libc_start_call_main
+ 99.98% 0.00% postgres postgres [.] main
+ 99.98% 0.00% postgres postgres [.] PostmasterMain
+ 99.98% 0.00% postgres postgres [.] ServerLoop
+ 99.98% 0.00% postgres postgres [.] BackendStartup (inlined)
+ 99.98% 0.00% postgres postgres [.] BackendRun (inlined)
+ 99.98% 0.00% postgres postgres [.] PostgresMain
+ 99.98% 0.00% postgres postgres [.] exec_simple_query
+ 99.98% 0.00% postgres postgres [.] PortalRun
+ 99.98% 0.00% postgres postgres [.] FillPortalStore
+ 99.98% 0.00% postgres postgres [.] PortalRunUtility
+ 99.98% 0.00% postgres postgres [.] standard_ProcessUtility
+ 99.98% 0.00% postgres postgres [.] ExplainQuery
+ 99.98% 0.00% postgres postgres [.] ExplainOneQuery
+ 99.95% 0.00% postgres postgres [.] ExplainOnePlan
+ 87.87% 0.00% postgres postgres [.] standard_ExecutorStart
+ 87.87% 0.00% postgres postgres [.] InitPlan (inlined)
+ 87.87% 0.00% postgres postgres [.] ExecInitNode
+ 87.87% 0.00% postgres postgres [.] ExecInitNestLoop
+ 87.87% 0.00% postgres postgres [.] ExecInitMemoize
+ 87.87% 0.00% postgres postgres [.]
build_hash_table (inlined) <----
+ 87.87% 0.00% postgres postgres [.] memoize_create (inlined)
+ 87.87% 0.00% postgres postgres [.]
memoize_allocate (inlined)
+ 87.87% 0.00% postgres postgres [.]
MemoryContextAllocExtended
+ 87.87% 0.00% postgres postgres [.] memset (inlined)
Failing that, are you able to pg_dump these tables and load them into
a PostgreSQL instance that you can play around with and patch?
Provided you can actually recreate the problem on that instance.
+ 71,98% 14,36% postmaster [kernel.kallsyms] [k] page_fault ▒
+ 70,19% 6,59% postmaster libc-2.28.so [.] __memset_avx2_erms ▒
+ 68,20% 0,00% postmaster postgres [.] ExecInitNode ▒
+ 68,20% 0,00% postmaster postgres [.] ExecInitNestLoop ▒
+ 68,13% 0,00% postmaster postgres [.] ExecInitMemoize ▒
+ 68,13% 0,00% postmaster postgres [.] MemoryContextAllocExtended ▒
+ 63,20% 0,00% postmaster postgres [.] 0x0000000000776b89 ▒
+ 63,20% 0,00% postmaster postgres [.] PostgresMain ◆
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f48ca ▒
+ 63,03% 0,00% postmaster postgres [.] PortalRun ▒
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f83ae ▒
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f7fee ▒
+ 63,03% 0,00% postmaster pg_stat_statements.so [.] 0x00007f5579b599c6 ▒
+ 63,03% 0,00% postmaster postgres [.] standard_ProcessUtility ▒
+ 63,03% 0,00% postmaster postgres [.] ExplainQuery ▒
+ 62,83% 0,00% postmaster postgres [.] 0x000000000062e83c ▒
+ 62,83% 0,00% postmaster postgres [.] ExplainOnePlan ▒
+ 57,47% 0,14% postmaster [kernel.kallsyms] [k] do_page_fault ▒
+ 57,23% 0,51% postmaster [kernel.kallsyms] [k] __do_page_fault ▒
+ 55,61% 0,71% postmaster [kernel.kallsyms] [k] handle_mm_fault ▒
+ 55,19% 0,00% postmaster pg_stat_statements.so [.] 0x00007f5579b5ad2c ▒
+ 55,19% 0,00% postmaster postgres [.] standard_ExecutorStart ▒
+ 54,78% 0,87% postmaster [kernel.kallsyms] [k] __handle_mm_fault ▒
+ 53,54% 0,37% postmaster [kernel.kallsyms] [k] do_anonymous_page ▒
+ 36,36% 0,21% postmaster [kernel.kallsyms] [k] alloc_pages_vma ▒
+ 35,99% 0,31% postmaster [kernel.kallsyms] [k] __alloc_pages_nodemask ▒
+ 35,40% 1,06% postmaster [kernel.kallsyms] [k] get_page_from_freelist ▒
+ 27,71% 0,62% postmaster [kernel.kallsyms] [k] prep_new_page ▒
+ 27,09% 26,99% postmaster [kernel.kallsyms] [k] clear_page_erms ▒
+ 11,24% 2,29% postmaster [kernel.kallsyms] [k] swapgs_restore_regs_and_return_to_usermode ▒
+ 8,95% 6,87% postmaster [kernel.kallsyms] [k] prepare_exit_to_usermode ▒
+ 7,83% 1,01% postmaster [kernel.kallsyms] [k] mem_cgroup_charge
+ 70,19% 6,59% postmaster libc-2.28.so [.] __memset_avx2_erms ▒
+ 68,20% 0,00% postmaster postgres [.] ExecInitNode ▒
+ 68,20% 0,00% postmaster postgres [.] ExecInitNestLoop ▒
+ 68,13% 0,00% postmaster postgres [.] ExecInitMemoize ▒
+ 68,13% 0,00% postmaster postgres [.] MemoryContextAllocExtended ▒
+ 63,20% 0,00% postmaster postgres [.] 0x0000000000776b89 ▒
+ 63,20% 0,00% postmaster postgres [.] PostgresMain ◆
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f48ca ▒
+ 63,03% 0,00% postmaster postgres [.] PortalRun ▒
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f83ae ▒
+ 63,03% 0,00% postmaster postgres [.] 0x00000000007f7fee ▒
+ 63,03% 0,00% postmaster pg_stat_statements.so [.] 0x00007f5579b599c6 ▒
+ 63,03% 0,00% postmaster postgres [.] standard_ProcessUtility ▒
+ 63,03% 0,00% postmaster postgres [.] ExplainQuery ▒
+ 62,83% 0,00% postmaster postgres [.] 0x000000000062e83c ▒
+ 62,83% 0,00% postmaster postgres [.] ExplainOnePlan ▒
+ 57,47% 0,14% postmaster [kernel.kallsyms] [k] do_page_fault ▒
+ 57,23% 0,51% postmaster [kernel.kallsyms] [k] __do_page_fault ▒
+ 55,61% 0,71% postmaster [kernel.kallsyms] [k] handle_mm_fault ▒
+ 55,19% 0,00% postmaster pg_stat_statements.so [.] 0x00007f5579b5ad2c ▒
+ 55,19% 0,00% postmaster postgres [.] standard_ExecutorStart ▒
+ 54,78% 0,87% postmaster [kernel.kallsyms] [k] __handle_mm_fault ▒
+ 53,54% 0,37% postmaster [kernel.kallsyms] [k] do_anonymous_page ▒
+ 36,36% 0,21% postmaster [kernel.kallsyms] [k] alloc_pages_vma ▒
+ 35,99% 0,31% postmaster [kernel.kallsyms] [k] __alloc_pages_nodemask ▒
+ 35,40% 1,06% postmaster [kernel.kallsyms] [k] get_page_from_freelist ▒
+ 27,71% 0,62% postmaster [kernel.kallsyms] [k] prep_new_page ▒
+ 27,09% 26,99% postmaster [kernel.kallsyms] [k] clear_page_erms ▒
+ 11,24% 2,29% postmaster [kernel.kallsyms] [k] swapgs_restore_regs_and_return_to_usermode ▒
+ 8,95% 6,87% postmaster [kernel.kallsyms] [k] prepare_exit_to_usermode ▒
+ 7,83% 1,01% postmaster [kernel.kallsyms] [k] mem_cgroup_charge
David
pgsql-hackers by date: