Thread: JSON constructors and window functions
I got a crash running the below query on the regression database:

"""
select pg_catalog.json_object_agg_unique(10,
        cast(ref_0.level2_no as int4))
    over (partition by ref_0.parent_no
        order by ref_0.level2_no)
from public.transition_table_level2 as ref_0;
"""

Attached the backtrace.

PS: I'm cc'ing Andrew and Nikita because my feeling is that this is
f4fb45d15c59d7add2e1b81a9d477d0119a9691a's responsibility.

--
Jaime Casanova
Director of Professional Services
SystemGuards - Consultores de PostgreSQL
Attachment
On 4/2/22 01:25, Jaime Casanova wrote:
> I got a crash running the below query on the regression database:
>
> """
> select pg_catalog.json_object_agg_unique(10,
>         cast(ref_0.level2_no as int4))
>     over (partition by ref_0.parent_no
>         order by ref_0.level2_no)
> from public.transition_table_level2 as ref_0;
> """
>
> Attached the backtrace.
>
> PS: I'm cc'ing Andrew and Nikita because my feeling is that this is
> f4fb45d15c59d7add2e1b81a9d477d0119a9691a's responsibility.

Hmm. Thanks for the report. The code in json_unique_check_key() looks
sane enough, so the issue is probably elsewhere. I'll keep digging.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 4/2/22 15:40, Andrew Dunstan wrote:
> On 4/2/22 01:25, Jaime Casanova wrote:
>> I got a crash running the below query on the regression database:
>>
>> """
>> select pg_catalog.json_object_agg_unique(10,
>>         cast(ref_0.level2_no as int4))
>>     over (partition by ref_0.parent_no
>>         order by ref_0.level2_no)
>> from public.transition_table_level2 as ref_0;
>> """
>>
>> Attached the backtrace.
>>
>> PS: I'm cc'ing Andrew and Nikita because my feeling is that this is
>> f4fb45d15c59d7add2e1b81a9d477d0119a9691a's responsibility.
>
> Hmm. Thanks for the report. The code in json_unique_check_key() looks
> sane enough, so the issue is probably elsewhere. I'll keep digging.

Haven't found the issue yet :-(  It happens on the second call to
json_unique_check_key() for the partition.

Here's a more idiomatic and self-contained query that triggers the problem.

select json_objectagg('10' : ref_0.level2 with unique keys)
over (partition by ref_0.parent_no order by ref_0.level2)
from (values (1::int,1::int),(1,2),(2,1),(2,2)) as ref_0(parent_no,level2);

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,

On 2022-04-03 18:56:39 -0400, Andrew Dunstan wrote:
> Haven't found the issue yet :-(  It happens on the second call to
> json_unique_check_key() for the partition.
>
> Here's a more idiomatic and self-contained query that triggers the problem.
>
> select json_objectagg('10' : ref_0.level2 with unique keys)
> over (partition by ref_0.parent_no order by ref_0.level2)
> from (values (1::int,1::int),(1,2),(2,1),(2,2)) as ref_0(parent_no,level2);

The hash was created in a context that's already freed.

#1  0x00007febbcc556f9 in wipe_mem (ptr=0x7febbf084f88, size=6392)
    at /home/andres/src/postgresql/src/include/utils/memdebug.h:42
#2  0x00007febbcc5603e in AllocSetReset (context=0x7febbf084e80)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/aset.c:591
#3  0x00007febbcc61ed6 in MemoryContextResetOnly (context=0x7febbf084e80)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/mcxt.c:181
#4  0x00007febbcc561cf in AllocSetDelete (context=0x7febbf084e80)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/aset.c:654
#5  0x00007febbcc62155 in MemoryContextDelete (context=0x7febbf084e80)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/mcxt.c:252
#6  0x00007febbcc31ee2 in hash_destroy (hashp=0x7febbf084fa0)
    at /home/andres/src/postgresql/src/backend/utils/hash/dynahash.c:876
#7  0x00007febbcb01ac5 in json_unique_check_free (cxt=0x7febbf03f548)
    at /home/andres/src/postgresql/src/backend/utils/adt/json.c:985
#8  0x00007febbcb01b7c in json_unique_builder_free (cxt=0x7febbf03f548)
    at /home/andres/src/postgresql/src/backend/utils/adt/json.c:1014
#9  0x00007febbcb0218f in json_object_agg_finalfn (fcinfo=0x7ffeab802e20)
    at /home/andres/src/postgresql/src/backend/utils/adt/json.c:1227
#10 0x00007febbc84e110 in finalize_windowaggregate (winstate=0x7febbf037730, perfuncstate=0x7febbf057560,
    peraggstate=0x7febbf0552f8, result=0x7febbf057520, isnull=0x7febbf057540)
    at /home/andres/src/postgresql/src/backend/executor/nodeWindowAgg.c:626
#11 0x00007febbc84ea9b in eval_windowaggregates (winstate=0x7febbf037730)
    at /home/andres/src/postgresql/src/backend/executor/nodeWindowAgg.c:993
#12 0x00007febbc8514a7 in ExecWindowAgg (pstate=0x7febbf037730)
    at /home/andres/src/postgresql/src/backend/executor/nodeWindowAgg.c:2207
#13 0x00007febbc7fda4d in ExecProcNodeFirst (node=0x7febbf037730)
    at /home/andres/src/postgresql/src/backend/executor/execProcnode.c:463
#14 0x00007febbc7f12fb in ExecProcNode (node=0x7febbf037730)
    at /home/andres/src/postgresql/src/include/executor/executor.h:259
#15 0x00007febbc7f41b7 in ExecutePlan (estate=0x7febbf0374f0, planstate=0x7febbf037730, use_parallel_mode=false,
    operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x7febbf030098,
    execute_once=true) at /home/andres/src/postgresql/src/backend/executor/execMain.c:1636
#16 0x00007febbc7f19ff in standard_ExecutorRun (queryDesc=0x7febbef79030, direction=ForwardScanDirection, count=0,
    execute_once=true) at /home/andres/src/postgresql/src/backend/executor/execMain.c:363
#17 0x00007febbc7f17ee in ExecutorRun (queryDesc=0x7febbef79030, direction=ForwardScanDirection, count=0,
    execute_once=true) at /home/andres/src/postgresql/src/backend/executor/execMain.c:307
#18 0x00007febbca6d2cc in PortalRunSelect (portal=0x7febbefcbc10, forward=true, count=0, dest=0x7febbf030098)
    at /home/andres/src/postgresql/src/backend/tcop/pquery.c:924
#19 0x00007febbca6cf5c in PortalRun (portal=0x7febbefcbc10, count=9223372036854775807, isTopLevel=true,
    run_once=true, dest=0x7febbf030098,

I don't think you're allowed to free stuff in a finalfunc - we might reuse the
transition state for further calls to the aggregate.

Greetings,

Andres Freund
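Andres's point can be illustrated with a small self-contained sketch. This is Python, purely illustrative: the names transfn/finalfn mirror PostgreSQL's aggregate terminology, but none of this is backend code, and the toy "window executor" only models the one property that matters here, namely that the final function is called once per row over the same transition state, which then keeps being fed.

```python
# Toy model of an aggregate whose final function destroys state the
# executor still owns. Illustrative only; not PostgreSQL internals.

class UniqueKeyState:
    def __init__(self):
        self.seen = {}      # stands in for the dynahash table
        self.pairs = []

def transfn(state, key, value):
    # Transition function: accumulate one (key, value) pair,
    # rejecting duplicate keys as "with unique keys" would.
    if state is None:
        state = UniqueKeyState()
    if key in state.seen:
        raise ValueError(f"duplicate JSON key {key!r}")
    state.seen[key] = True
    state.pairs.append((key, value))
    return state

def finalfn_buggy(state):
    result = dict(state.pairs)
    state.seen = None          # like hash_destroy(): frees state still in use
    return result

def finalfn_fixed(state):
    return dict(state.pairs)   # read-only: leaves the transition state intact

def window_agg(rows, finalfn):
    # A window executor calls finalfn for *every* row, then keeps
    # feeding the same transition state for the next row in the frame.
    state, out = None, []
    for k, v in rows:
        state = transfn(state, k, v)
        out.append(finalfn(state))
    return out

window_agg([("a", 1), ("b", 2)], finalfn_fixed)   # works
try:
    window_agg([("a", 1), ("b", 2)], finalfn_buggy)
except TypeError:
    pass  # second transfn call touches the destroyed "hash" and blows up
```

In the real backend the failure mode is a use-after-free of a deleted memory context rather than a Python exception, but the shape is the same: the second row's transition call dereferences state the final function already tore down.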
On 4/3/22 20:11, Andres Freund wrote:
> Hi,
>
> On 2022-04-03 18:56:39 -0400, Andrew Dunstan wrote:
>> Haven't found the issue yet :-(  It happens on the second call to
>> json_unique_check_key() for the partition.
>>
>> Here's a more idiomatic and self-contained query that triggers the problem.
>>
>> select json_objectagg('10' : ref_0.level2 with unique keys)
>> over (partition by ref_0.parent_no order by ref_0.level2)
>> from (values (1::int,1::int),(1,2),(2,1),(2,2)) as ref_0(parent_no,level2);
>
> The hash was created in a context that's already freed.
>
> [...]
>
> I don't think you're allowed to free stuff in a finalfunc - we might reuse the
> transition state for further calls to the aggregate.

Doh! Of course! I'll fix it in the morning. Thanks.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 4/3/22 22:46, Andrew Dunstan wrote:
> On 4/3/22 20:11, Andres Freund wrote:
>> Hi,
>>
>> On 2022-04-03 18:56:39 -0400, Andrew Dunstan wrote:
>>> Haven't found the issue yet :-(  It happens on the second call to
>>> json_unique_check_key() for the partition.
>>>
>>> Here's a more idiomatic and self-contained query that triggers the problem.
>>>
>>> select json_objectagg('10' : ref_0.level2 with unique keys)
>>> over (partition by ref_0.parent_no order by ref_0.level2)
>>> from (values (1::int,1::int),(1,2),(2,1),(2,2)) as ref_0(parent_no,level2);
>>
>> The hash was created in a context that's already freed.
>>
>> [...]
>>
>> I don't think you're allowed to free stuff in a finalfunc - we might reuse the
>> transition state for further calls to the aggregate.
>
> Doh! Of course! I'll fix it in the morning. Thanks.

I've committed a fix for this. I didn't find something to clean out the
hash table, so I just removed the hash_destroy() call and left it at that.
All the tests I did came back with expected results.

Maybe a hash_reset() is something worth having, assuming it's possible? I
note that simplehash has a reset function.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
> On 4/3/22 22:46, Andrew Dunstan wrote:
>> On 4/3/22 20:11, Andres Freund wrote:
>>> I don't think you're allowed to free stuff in a finalfunc - we might reuse the
>>> transition state for further calls to the aggregate.

>> Doh! Of course! I'll fix it in the morning. Thanks.

> I've committed a fix for this. I didn't find something to clean out the
> hash table, so I just removed the hash_destroy() call and left it at that.
> All the tests I did came back with expected results.

> Maybe a hash_reset() is something worth having, assuming it's possible? I
> note that simplehash has a reset function.

But removing the hash entries would be just as much of a problem,
wouldn't it?

			regards, tom lane
Are we missing regression tests using these functions as window functions?

Hm. I suppose it's possible to write a general-purpose regression test that
loops over all aggregate functions and runs them both as window functions
and as plain aggregates over the same data sets, comparing the results, at
least for aggregate functions with a single parameter belonging to a chosen
set of data types.
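The invariant Greg is proposing can be sketched in miniature. This is an illustrative Python model, not a real catalog scan: the aggregate "registry" and data sets are stand-ins, and the window form is modeled as the aggregate applied to each growing prefix, as the default ORDER BY frame does. The check is that the last row of the running window result equals the plain aggregate over the whole set.

```python
# Miniature version of "run every aggregate both ways and compare".
# AGGREGATES stands in for a scan of the system catalog.

AGGREGATES = {
    "sum":   lambda xs: sum(xs),
    "count": lambda xs: len(xs),
    "max":   lambda xs: max(xs),
}

def windowed(f, xs):
    # f over each growing prefix, like OVER (ORDER BY ...) with the
    # default RANGE UNBOUNDED PRECEDING frame.
    return [f(xs[: i + 1]) for i in range(len(xs))]

def check_all(datasets):
    for name, f in AGGREGATES.items():
        for xs in datasets:
            plain = f(xs)
            as_window = windowed(f, xs)
            # Invariant: the final running value must equal the
            # whole-set aggregate.
            assert as_window[-1] == plain, name
    return True

check_all([[1, 2, 3], [5], [2, 2, 4]])
```

A real version would also have to exercise the final function once per row, which is exactly the path that crashed here; merely computing the prefix values row by row is what makes the window form a stress test for state reuse.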
Hi,

On 2022-04-04 11:43:31 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 4/3/22 22:46, Andrew Dunstan wrote:
>>> On 4/3/22 20:11, Andres Freund wrote:
>>>> I don't think you're allowed to free stuff in a finalfunc - we might reuse the
>>>> transition state for further calls to the aggregate.
>
>>> Doh! Of course! I'll fix it in the morning. Thanks.
>
>> I've committed a fix for this. I didn't find something to clean out the
>> hash table, so I just removed the hash_destroy() call and left it at that.
>> All the tests I did came back with expected results.
>> Maybe a hash_reset() is something worth having, assuming it's possible? I
>> note that simplehash has a reset function.
>
> But removing the hash entries would be just as much of a problem,
> wouldn't it?

I think so. I guess we could mark it as FINALFUNC_MODIFY = READ_WRITE. But I
don't see a reason why it'd be needed here. Is it a problem that skipped_keys
is reset in the finalfunc? I don't know how these functions work. So far I
don't understand why JsonUniqueBuilderState->skipped_keys is long-lived...

Greetings,

Andres Freund
Hi,

On 2022-04-04 11:54:23 -0400, Greg Stark wrote:
> Are we missing regression tests using these functions as window functions?

So far, yes.

ISTM that 4eb97988796 should have at least included the crashing statement as
a test... The statement can be simpler too:

SELECT json_objectagg(k : v with unique keys) OVER (ORDER BY k)
FROM (VALUES (1,1), (2,2)) a(k,v);

is sufficient to trigger the crash for me, without even using asan (after
reverting the bugfix, of course).

> Hm. I suppose it's possible to write a general purpose regression test
> that loops over all aggregate functions and runs them as window
> functions and aggregates over the same data sets and compares results.
> At least for the case of aggregate functions with a single parameter
> belonging to a chosen set of data types.

I was wondering about that too. Hardest part would be to come up with values
to pass to the aggregates.

I don't think it'd help in this case though, since it depends on special-case
grammar to even be reached: json_objectagg(k : v with unique keys). "Normal"
use of aggregates can't even reach the problematic path afaics.

Greetings,

Andres Freund
On 4/4/22 12:33, Andres Freund wrote:
> Hi,
>
> On 2022-04-04 11:54:23 -0400, Greg Stark wrote:
>> Are we missing regression tests using these functions as window functions?
>
> So far, yes.
>
> ISTM that 4eb97988796 should have at least included the crashing statement as
> a test... The statement can be simpler too:
>
> SELECT json_objectagg(k : v with unique keys) OVER (ORDER BY k)
> FROM (VALUES (1,1), (2,2)) a(k,v);
>
> is sufficient to trigger the crash for me, without even using asan (after
> reverting the bugfix, of course).

I will add some regression tests.

>> Hm. I suppose it's possible to write a general purpose regression test
>> that loops over all aggregate functions and runs them as window
>> functions and aggregates over the same data sets and compares results.
>> At least for the case of aggregate functions with a single parameter
>> belonging to a chosen set of data types.
>
> I was wondering about that too. Hardest part would be to come up with values
> to pass to the aggregates.
>
> I don't think it'd help in this case though, since it depends on special-case
> grammar to even be reached: json_objectagg(k : v with unique keys). "Normal"
> use of aggregates can't even reach the problematic path afaics.

It can, as Jaime's original post showed.

But on further consideration I'm thinking this area needs some rework. ISTM
that it might be a whole lot simpler and more comprehensible to generate the
json first without bothering about null values or duplicate keys, and then in
the finalizer check for null values to be skipped and for duplicate keys.
That way we would need to keep far less state for the aggregation functions,
although it might be marginally less efficient. Thoughts?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
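The restructuring Andrew floats above can be sketched in a few lines. This is an illustrative Python model, not backend code: the transition function only accumulates raw (key, value) pairs, and all null-skipping and unique-key checking moves into the final function, which builds its bookkeeping from scratch on each call and so is safe to invoke repeatedly on the same state, as a window executor does.

```python
# Two-phase approach: accumulate everything, validate at finalization.
# Names (transfn, finalfn) are illustrative, not the backend's.
import json

def transfn(state, key, value):
    state = state if state is not None else []
    state.append((key, value))
    return state

def finalfn(state, absent_on_null=False, unique_keys=False):
    seen = set()   # built fresh per call, so repeated calls are safe
    out = {}
    for key, value in state:
        if absent_on_null and value is None:
            continue           # ABSENT ON NULL: drop null-valued pairs
        if unique_keys and key in seen:
            raise ValueError(f"duplicate JSON object key {key!r}")
        seen.add(key)
        out[key] = value
    return json.dumps(out)

state = None
for k, v in [("a", 1), ("b", None)]:
    state = transfn(state, k, v)

finalfn(state, absent_on_null=True)   # null-valued "b" is dropped
finalfn(state, unique_keys=True)      # no duplicates here, so it succeeds
```

The trade-off is the one Andrew names: the transition state holds every pair, including ones that will be discarded, so it is marginally less efficient in space, but no hash table or skipped-keys list has to survive between calls to the final function.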
On 4/4/22 11:43, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 4/3/22 22:46, Andrew Dunstan wrote:
>>> On 4/3/22 20:11, Andres Freund wrote:
>>>> I don't think you're allowed to free stuff in a finalfunc - we might reuse the
>>>> transition state for further calls to the aggregate.
>>> Doh! Of course! I'll fix it in the morning. Thanks.
>> I've committed a fix for this. I didn't find something to clean out the
>> hash table, so I just removed the hash_destroy() call and left it at that.
>> All the tests I did came back with expected results.
>> Maybe a hash_reset() is something worth having, assuming it's possible? I
>> note that simplehash has a reset function.
> But removing the hash entries would be just as much of a problem,
> wouldn't it?

Yes, quite possibly. It looks from some experimentation as though, unlike my
naive preconception, it doesn't process each frame again from the beginning,
so losing the hash entries could indeed be an issue here.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,

On 4/4/22 2:19 PM, Andrew Dunstan wrote:
> On 4/4/22 12:33, Andres Freund wrote:
>
> It can, as Jaime's original post showed.
>
> But on further consideration I'm thinking this area needs some rework.
> ISTM that it might be a whole lot simpler and comprehensible to generate
> the json first without bothering about null values or duplicate keys and
> then in the finalizer check for null values to be skipped and duplicate
> keys. That way we would need to keep far less state for the aggregation
> functions, although it might be marginally less efficient. Thoughts?

This is still on the open items list[1]. Given this is a user-triggerable
crash and we are approaching PG15 Beta 1, I wanted to check in and see if
there was any additional work required to eliminate the crash, or if the
work at this point is just optimization.

If the latter, I'd suggest we open up a new open item for it.

Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
On 2022-05-10 Tu 09:51, Jonathan S. Katz wrote:
> Hi,
>
> On 4/4/22 2:19 PM, Andrew Dunstan wrote:
>> On 4/4/22 12:33, Andres Freund wrote:
>>
>> It can, as Jaime's original post showed.
>>
>> But on further consideration I'm thinking this area needs some rework.
>> ISTM that it might be a whole lot simpler and comprehensible to generate
>> the json first without bothering about null values or duplicate keys and
>> then in the finalizer check for null values to be skipped and duplicate
>> keys. That way we would need to keep far less state for the aggregation
>> functions, although it might be marginally less efficient. Thoughts?
>
> This is still on the open items list[1]. Given this is a
> user-triggerable crash and we are approaching PG15 Beta 1, I wanted to
> check in and see if there was any additional work required to
> eliminate the crash, or if the work at this point is just optimization.
>
> If the latter, I'd suggest we open up a new open item for it.
>
> Thanks,
>
> Jonathan
>
> [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items

I believe all the issues here have been fixed. See commit 112fdb3528.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 5/10/22 10:25 AM, Andrew Dunstan wrote:
> I believe all the issues here have been fixed. See commit 112fdb3528.

Thanks! I have updated Open Items.

Jonathan