Re: posgres 12 bug (partitioned table) - Mailing list pgsql-bugs
From | Amit Langote |
---|---|
Subject | Re: posgres 12 bug (partitioned table) |
Date | |
Msg-id | CA+HiwqHU3rRrqEA_+yCzpqb_+ssFj7=df7DthpcmoUmjwDXLWg@mail.gmail.com |
In response to | Re: posgres 12 bug (partitioned table) (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>) |
Responses |
Re: posgres 12 bug (partitioned table)
Re: posgres 12 bug (partitioned table) |
List | pgsql-bugs |
Hi Soumyadeep,

On Fri, Jul 10, 2020 at 2:56 AM Soumyadeep Chakraborty <soumyadeep2007@gmail.com> wrote:
> Hey Amit,
>
> On Thu, Jul 9, 2020 at 12:16 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> > By the way, what happens today if you do INSERT INTO a_zedstore_table
> > ... RETURNING xmin?  Do you get an error "xmin is unrecognized" or
> > some such in slot_getsysattr() when trying to project the RETURNING
> > list?
>
> We get garbage values for xmin and cmin. If we request cmax/xmax, we get
> an ERROR from slot_getsysattr()->tts_zedstore_getsysattr():
> "zedstore tuple table slot does not have system attributes (except xmin
> and cmin)"
>
> A ZedstoreTupleTableSlot only stores xmin and xmax. Also,
> zedstoream_insert(), which is the tuple_insert() implementation, does
> not supply the xmin/cmin, thus making those values garbage.
>
> For context, Zedstore has its own UNDO log implementation to act as
> storage for transaction information (which is intended to be replaced
> with the upstream UNDO log in the future).
>
> The above behavior is not just restricted to INSERT..RETURNING, right
> now. If we do a select <tx_column> from foo in Zedstore, the behavior is
> the same. The transaction information is never returned from Zedstore
> in tableam calls that don't demand transactional information be
> used/returned. If you ask it to do a tuple_satisfies_snapshot(), OTOH,
> it will use the transactional information correctly. It will also
> populate TM_FailureData, which contains xmax and cmax, in the APIs where
> it is demanded.
>
> I really wonder what other AMs are doing about these issues.
>
> I think we should either:
>
> 1. Demand transactional information off of AMs for all APIs that involve
> a projection of transactional information.
>
> 2. Have some other component of Postgres supply the transactional
> information. This is what I think the upstream UNDO log can probably
> provide.

So even if an AM's table_tuple_insert() itself doesn't populate the
transaction info into the slot handed to it, maybe as an optimization,
it does not sound entirely unreasonable to expect that the AM's
slot_getsysattr() callback returns it correctly when projecting a
target list containing system columns.  We shouldn't really need any
new core code to get the transaction-related system columns while
there exists a perfectly reasonable channel for it to arrive through
-- TupleTableSlots.  I suppose there's a reason why we allow AMs to
provide their own slot callbacks.

Whether an AM uses UNDO log or something else to manage the
transaction info is up to the AM, so I don't see why the AMs
themselves shouldn't be in charge of returning that info, because only
they know where it is.

> 3. (Least elegant) Transform tuple table slots into heap tuple table
> slots (since it is the only kind of tuple storage that can supply
> transactional info) and explicitly fill in the transactional values
> depending on the context, whenever transactional information is
> projected.
>
> For this bug report, I am not sure what is right. Perhaps, to stop the
> bleeding temporarily, we could use the pi_PartitionTupleSlot and assume
> that the AM needs to provide the transactional info in the respective
> insert AM API calls,

As long as the AM's slot_getsysattr() callback returns the correct
value, this works.

> as well as demand a heap slot for partition roots
> and interior nodes.

It would be a compromise on the core's part to use "heap" slots for
partitioned tables, because they don't have a valid table AM.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
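
To make the slot-callback argument above concrete, here is a minimal,
self-contained sketch in plain C.  It does not use PostgreSQL headers:
every Demo* type and demo_* function is a hypothetical stand-in, and
the array plays the role of whatever AM-private store (an UNDO log,
for instance) holds the transaction info.  The insert routine leaves
no transaction info in the slot, yet the AM's own getsysattr callback
can still answer for xmin and xmax, because it knows where to look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL definitions. */
typedef uint32_t DemoXid;

enum DemoSysAttr { DEMO_ATTR_XMIN, DEMO_ATTR_XMAX };

struct DemoSlot;

typedef struct DemoSlotOps
{
    /* Returns 1 and sets *value on success, 0 if it cannot answer. */
    int (*getsysattr)(const struct DemoSlot *slot,
                      enum DemoSysAttr attr, DemoXid *value);
} DemoSlotOps;

typedef struct DemoSlot
{
    const DemoSlotOps *ops;  /* callbacks supplied by the AM */
    size_t tuple_id;         /* identifies the tuple held by the slot */
} DemoSlot;

/* AM-private transaction store, indexed by tuple id (assumed layout). */
typedef struct DemoUndoRecord
{
    DemoXid xmin;
    DemoXid xmax;
} DemoUndoRecord;

#define DEMO_MAX_TUPLES 8
static DemoUndoRecord demo_undo[DEMO_MAX_TUPLES];
static size_t demo_ntuples = 0;

/* The AM's callback: fetch the info from where only the AM knows it is. */
static int
demo_am_getsysattr(const DemoSlot *slot, enum DemoSysAttr attr,
                   DemoXid *value)
{
    const DemoUndoRecord *rec;

    if (slot->tuple_id >= demo_ntuples)
        return 0;
    rec = &demo_undo[slot->tuple_id];

    switch (attr)
    {
        case DEMO_ATTR_XMIN:
            *value = rec->xmin;
            return 1;
        case DEMO_ATTR_XMAX:
            *value = rec->xmax;
            return 1;
    }
    return 0;
}

static const DemoSlotOps demo_am_slot_ops = { demo_am_getsysattr };

/*
 * The AM's insert: records the inserting xid in its private store and
 * puts only the tuple id in the slot, not the transaction info itself.
 */
static int
demo_am_insert(DemoSlot *slot, DemoXid inserting_xid)
{
    if (demo_ntuples >= DEMO_MAX_TUPLES)
        return 0;
    demo_undo[demo_ntuples].xmin = inserting_xid;
    demo_undo[demo_ntuples].xmax = 0;   /* not deleted */
    slot->ops = &demo_am_slot_ops;
    slot->tuple_id = demo_ntuples++;
    return 1;
}

/* Caller side: no AM-specific knowledge, just dispatch via the slot. */
static int
demo_slot_getsysattr(const DemoSlot *slot, enum DemoSysAttr attr,
                     DemoXid *value)
{
    assert(slot != NULL && slot->ops != NULL &&
           slot->ops->getsysattr != NULL);
    return slot->ops->getsysattr(slot, attr, value);
}
```

A caller would run demo_am_insert() on a slot and then ask
demo_slot_getsysattr() for DEMO_ATTR_XMIN when projecting something
like a RETURNING list.  The dispatching side needs no change if the AM
later moves its transaction info somewhere else.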