Thread: [HACKERS] MERGE SQL Statement for PG11
I'm working on re-submitting MERGE for PG11.

Earlier thoughts on how this could/could not be done were sometimes imprecise or inaccurate, so I have gone through the command per the SQL:2011 spec and produced a definitive spec in the form of an SGML ref page. This is what I intend to deliver for PG11.

MERGE will use the same mechanisms as INSERT ON CONFLICT, so concurrent behavior does not require further infrastructure changes, just detailed work on the statement itself.

I'm building up the code from scratch based upon the spec, rather than trying to thwack earlier attempts into shape. This looks more likely to yield a committable patch.

Major spanners or objections, please throw them in now cos I don't see any.

Questions?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
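For readers who have not seen the statement before, the SQL:2011-style MERGE being specified has roughly this shape. The table and column names below are invented for illustration; the authoritative grammar is whatever the attached SGML ref page defines:

```sql
-- Hypothetical sketch of an SQL:2011-style MERGE (names invented).
-- Each source row either updates its matching target row or is
-- inserted, decided by the ON join condition.
MERGE INTO customer_account ca
USING recent_transactions t
  ON t.customer_id = ca.customer_id
WHEN MATCHED THEN
  UPDATE SET balance = ca.balance + t.transaction_value
WHEN NOT MATCHED THEN
  INSERT (customer_id, balance)
  VALUES (t.customer_id, t.transaction_value);
```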
Hey,
It looks quite nice. Personally I'd also like to have the RETURNING clause, and to have the numbers of deleted and inserted rows reported as separate numbers in the output message.
regards
Szymon Lipiński
On Fri, 27 Oct 2017, 10:56, Simon Riggs <simon@2ndquadrant.com> wrote:
> I'm working on re-submitting MERGE for PG11
> Earlier thoughts on how this could/could not be done were sometimes
> imprecise or inaccurate, so I have gone through the command per
> SQL:2011 spec and produced a definitive spec in the form of an SGML
> ref page. This is what I intend to deliver for PG11.
> MERGE will use the same mechanisms as INSERT ON CONFLICT, so
> concurrent behavior does not require further infrastructure changes,
> just detailed work on the statement itself.
> I'm building up the code from scratch based upon the spec, rather than
> trying to thwack earlier attempts into shape. This looks more likely
> to yield a committable patch.
> Major spanners or objections, please throw them in now cos I don't see any.
> Questions?
Simon Riggs wrote:
> Earlier thoughts on how this could/could not be done were sometimes
> imprecise or inaccurate, so I have gone through the command per
> SQL:2011 spec and produced a definitive spec in the form of an SGML
> ref page. This is what I intend to deliver for PG11.

Nice work. I didn't verify the SQL spec, just read your HTML page; some very minor comments based on that:

* Use "and" not "where" as the initial word in the "when_clause" and "merge_update" clause definitions.

* Missing word here: "the DELETE privilege on the if you specify".

* I think the word "match." is leftover from some editing in the phrase "that specifies which rows in the data_source match rows in the target_table_name. match." In the same paragraph, it is not clear whether all columns must be matched or it can be a partial match.

* In the when_clause note, it is not clear whether you can have multiple WHEN MATCHED and WHEN NOT MATCHED clauses. Obviously you can have one of each, but I think your doc says it is possible to have more than one of each, with different conditions (WHEN MATCHED AND foo THEN bar WHEN MATCHED AND baz THEN qux). No example shows more than one. On the same point: is there short-circuiting of such conditions, i.e. will the execution stop looking for further WHEN matches once some rule matches, or will it rather check all rules and raise an error if more than one WHEN rule matches each given row?

* Your last example uses ELSE but that appears nowhere in the synopsis.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
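To make the multiple-WHEN-clause question concrete, this is the kind of statement the doc appears to permit (table and column names invented for illustration; whether more than one WHEN MATCHED clause parses, and in what order the conditions are checked, is exactly what is being asked):

```sql
-- Hypothetical: two WHEN MATCHED clauses, each with its own AND
-- condition, plus a WHEN NOT MATCHED clause. Names are invented.
MERGE INTO stock s
USING shipments sh ON sh.item_id = s.item_id
WHEN MATCHED AND sh.quantity = 0 THEN
  DELETE
WHEN MATCHED AND sh.quantity > 0 THEN
  UPDATE SET quantity = s.quantity + sh.quantity
WHEN NOT MATCHED THEN
  INSERT (item_id, quantity) VALUES (sh.item_id, sh.quantity);
```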
On Fri, Oct 27, 2017 at 10:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Questions?

I think one of the reasons why Peter Geoghegan decided to pursue INSERT .. ON CONFLICT UPDATE was that, because it is non-standard SQL syntax, he felt free to mandate a non-standard SQL requirement, namely the presence of a unique index on the arbiter columns. If MERGE's join clause happens to involve equality conditions on precisely the same set of columns as some unique index on the target table, then I think you can reuse the INSERT .. ON CONFLICT UPDATE infrastructure and I suspect there won't be too many problems.

However, if it doesn't, then what? You could decree that such cases will fail, but that might not meet your use case for developing the feature. Or you could try to soldier on without the INSERT .. ON CONFLICT UPDATE machinery, but that means, I think, that sometimes you will get serialization anomalies - at least, I think, you will sometimes obtain results that couldn't have been obtained under any serial order of execution, and maybe it would end up being possible to fail with serialization errors or unique index violations.

In the past, there have been objections to implementations of MERGE which would give rise to such serialization anomalies, but I'm not sure we should feel bound by those discussions. One thing that's different is that the common and actually-useful case can now be made to work in a fairly satisfying way using INSERT .. ON CONFLICT UPDATE; if less useful cases are vulnerable to some weirdness, maybe it's OK to just document the problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
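A concrete illustration of the distinction being drawn (schema invented for illustration): with a unique index on exactly the join columns, the first MERGE below could map onto the ON CONFLICT arbiter-index machinery; the second, whose ON clause does not line up with any unique index, could not:

```sql
-- Assume: CREATE UNIQUE INDEX ON customer_account (customer_id);
-- (schema invented for illustration)

-- Case 1: the ON clause is an equality on precisely the unique
-- index's columns, so the ON CONFLICT infrastructure could be reused.
MERGE INTO customer_account ca
USING new_accounts n ON n.customer_id = ca.customer_id
WHEN MATCHED THEN UPDATE SET balance = n.balance
WHEN NOT MATCHED THEN INSERT (customer_id, balance)
                      VALUES (n.customer_id, n.balance);

-- Case 2: no unique index exists on (region, segment), so there is
-- no arbiter index and the ON CONFLICT machinery does not apply.
MERGE INTO customer_account ca
USING region_updates r
  ON r.region = ca.region AND r.segment = ca.segment
WHEN MATCHED THEN UPDATE SET discount = r.discount;
```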
On 27 October 2017 at 15:24, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Oct 27, 2017 at 10:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Questions?
>
> I think one of the reasons why Peter Geoghegan decided to pursue
> INSERT .. ON CONFLICT UPDATE was that, because it is non-standard SQL
> syntax, he felt free to mandate a non-standard SQL requirement, namely
> the presence of a unique index on the arbiter columns. If MERGE's
> join clause happens to involve equality conditions on precisely the
> same set of columns as some unique index on the target table, then I
> think you can reuse the INSERT .. ON CONFLICT UPDATE infrastructure
> and I suspect there won't be too many problems.

Agreed.

> However, if it doesn't, then what? You could decree that such cases
> will fail, but that might not meet your use case for developing the
> feature. Or you could try to soldier on without the INSERT .. ON
> CONFLICT UPDATE machinery, but that means, I think, that sometimes you
> will get serialization anomalies - at least, I think, you will
> sometimes obtain results that couldn't have been obtained under any
> serial order of execution, and maybe it would end up being possible to
> fail with serialization errors or unique index violations.
>
> In the past, there have been objections to implementations of MERGE
> which would give rise to such serialization anomalies, but I'm not
> sure we should feel bound by those discussions. One thing that's
> different is that the common and actually-useful case can now be made
> to work in a fairly satisfying way using INSERT .. ON CONFLICT UPDATE;
> if less useful cases are vulnerable to some weirdness, maybe it's OK
> to just document the problems.

Good points.

I didn't say it, but my intention was to just throw an ERROR if no single unique index can be identified.

It could be possible to still run MERGE in that situation, but we would need to take a full table lock at ShareRowExclusive. It's quite likely that such statements would throw duplicate update errors, so I wouldn't be aiming to do anything with that for PG11.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Oct 27, 2017 at 7:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Good points.
>
> I didn't say it but my intention was to just throw an ERROR if no
> single unique index can be identified.

You'd also throw an error when there were no "upsert compatible" join quals, I take it? I don't see the point in that. That's just mapping one syntax on to another.

> It could be possible to still run MERGE in that situation but we would
> need to take a full table lock at ShareRowExclusive. It's quite likely
> that such statements would throw duplicate update errors, so I
> wouldn't be aiming to do anything with that for PG11.

I would avoid mixing up ON CONFLICT DO UPDATE and MERGE. The "anomalies" you describe in MERGE are not really anomalies IMV. They're simply how the feature is supposed to operate, and how it's possible to make MERGE use alternative join algorithms based only on the underlying cost. You might use a merge join for a bulk load use-case, for example.

I think an SQL MERGE feature would be compelling, but I don't think that it should take much from ON CONFLICT. As I've said many times [1], they're really two different features (Teradata had features similar to each, for example). I suggest that you come up with something that has the semantics that the standard requires, and therefore makes none of the ON CONFLICT guarantees about an outcome under concurrency (INSERT or UPDATE). Those guarantees are basically incompatible with how MERGE needs to work.

In case it matters, I think that the idea of varying relation heavyweight lock strength based on subtle semantics within a DML statement is a bad one. Frankly, I think that that's going to be a nonstarter.

[1] https://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com

--
Peter Geoghegan
On Fri, Oct 27, 2017 at 6:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think one of the reasons why Peter Geoghegan decided to pursue
> INSERT .. ON CONFLICT UPDATE was that, because it is non-standard SQL
> syntax, he felt free to mandate a non-standard SQL requirement, namely
> the presence of a unique index on the arbiter columns.

That's true, but what I was really insistent on, more than anything else, was that the user would get a practical guarantee about an insert-or-update outcome under concurrency. There could be no "unprincipled deadlocks", nor could there be spurious unique violations. This is the kind of thing that the SQL standard doesn't really concern itself with, and yet it's of significant practical importance to users.

Both Oracle and SQL Server allow these things that I specifically set out to avoid. I think that that's mostly a good thing, though; they do a really bad job of explaining what's what, and don't provide for a very real need ("upsert") in some other way, but their MERGE semantics do make sense to me.

--
Peter Geoghegan
On Fri, Oct 27, 2017 at 4:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I didn't say it but my intention was to just throw an ERROR if no
> single unique index can be identified.
>
> It could be possible to still run MERGE in that situation but we would
> need to take a full table lock at ShareRowExclusive. It's quite likely
> that such statements would throw duplicate update errors, so I
> wouldn't be aiming to do anything with that for PG11.

Like Peter, I think taking such a strong lock for a DML statement doesn't sound like a very desirable way forward. It means, for example, that you can only have one MERGE in progress on a table at the same time, which is quite limiting. It could easily be the case that you have multiple MERGE statements running at once but they touch disjoint groups of rows and therefore everything works. I think the code should be able to cope with concurrent changes, if nothing else by throwing an ERROR, and then if the user wants to ensure that doesn't happen by taking ShareRowExclusiveLock they can do that via an explicit LOCK TABLE statement -- or else they can prevent concurrency by any other means they see fit.

Other problems with taking ShareRowExclusiveLock include (1) probable lock upgrade hazards and (2) do you really want MERGE to kick autovacuum off of your giant table? Probably not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
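The explicit-lock alternative described above would look something like this (table names invented; SHARE ROW EXCLUSIVE is an existing PostgreSQL table lock mode):

```sql
-- Hypothetical sketch: the user opts in to serializing MERGEs on a
-- table with an explicit LOCK TABLE, instead of MERGE implicitly
-- taking the strong lock for everyone.
BEGIN;
LOCK TABLE customer_account IN SHARE ROW EXCLUSIVE MODE;
MERGE INTO customer_account ca
USING staging_accounts s ON s.customer_id = ca.customer_id
WHEN MATCHED THEN UPDATE SET balance = s.balance
WHEN NOT MATCHED THEN INSERT (customer_id, balance)
                      VALUES (s.customer_id, s.balance);
COMMIT;
```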
Simon,

Nice writeup.

While the standard may not require a unique index for the ON clause, I have never seen a MERGE statement that did not have this property. So IMHO this is a reasonable restriction.

In fact I have only ever seen two flavors of usage:
* Single row source (most often simply a VALUES clause) in OLTP. In that case there was lots of concurrency.
* Massive source which affects a significant portion of the target table in DW. In this case there were no concurrent MERGEs.

I believe support for returning rows at a later stage would prove to be very powerful, especially in combination with chaining MERGE statements in CTEs. To do that would require language extensions to pass the coloring of the source row through, especially for rows that fell into "do nothing".
On Fri, Oct 27, 2017 at 2:13 PM, srielau <serge@rielau.com> wrote:
> While the standard may not require a unique index for the ON clause I have
> never seen a MERGE statement that did not have this property. So IMHO this
> is a reasonable restrictions.

The Oracle docs on MERGE say nothing about unique indexes or constraints. They don't even mention them in passing. They do say "This statement is a convenient way to combine multiple operations. It lets you avoid multiple INSERT, UPDATE, and DELETE DML statements."

SQL Server's MERGE docs do mention unique indexes, but only in passing, saying something about unique violations, and that unique violations *cannot* be suppressed in MERGE, even though that's possible with other DML statements (with something called IGNORE_DUP_KEY).

What other systems *do* have this restriction? I've never seen one that did.

--
Peter Geoghegan
On Fri, Oct 27, 2017 at 2:42 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Oct 27, 2017 at 2:13 PM, srielau <serge@rielau.com> wrote:
>> While the standard may not require a unique index for the ON clause I have
>> never seen a MERGE statement that did not have this property. So IMHO this
>> is a reasonable restrictions.
>
> The Oracle docs on MERGE say nothing about unique indexes or
> constraints. They don't even mention them in passing. They do say
> "This statement is a convenient way to combine multiple operations. It
> lets you avoid multiple INSERT, UPDATE, and DELETE DML statements."
>
> SQL Server's MERGE docs do mention unique indexes, but only in
> passing, saying something about unique violations, and that unique
> violations *cannot* be suppressed in MERGE, even though that's
> possible with other DML statements (with something called
> IGNORE_DUP_KEY).
>
> What other systems *do* have this restriction? I've never seen one that did.
Not clear what you are leading up to here.
When I did MERGE in DB2 there was also no limitation:
"Each row in the target can only be operated on once. A row in the target can only be identified as MATCHED with one row in the result table of the table-reference”
What there was however was a significant amount of code I had to write and test to enforce the above second sentence. IIRC it involved, in the absence of a proof that the join could not expand, adding a row_number() over() AS rn over the target leg of the join and then a row_number() over(partition by rn) > 1 THEN RAISE_ERROR() to catch violators.
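A sketch of the guard described above, reconstructed from that recollection (table and column names invented; RAISE_ERROR is DB2's SQL function taking an SQLSTATE and a diagnostic string, with no direct plain-SQL equivalent in PostgreSQL):

```sql
-- Hypothetical reconstruction of the DB2-style expanding-join check:
-- tag each target row with a unique number before the join, then
-- error out if any tag appears more than once after the join, i.e.
-- if a single target row was matched by more than one source row.
WITH numbered_target AS (
  SELECT t.*, row_number() OVER () AS rn   -- unique tag per target row
  FROM target t
)
SELECT CASE WHEN row_number() OVER (PARTITION BY nt.rn) > 1
            THEN raise_error('78000', 'target row matched more than once')
       END
FROM numbered_target nt
JOIN source s ON s.k = nt.k;
```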
Maybe in PG there is a trivial way to detect an expanding join and block it at runtime.
So the whole point I’m trying to make is that I haven’t seen the need for the extra work I had to do once the feature appeared in the wild.
Cheers
Serge
On Fri, Oct 27, 2017 at 9:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Oct 27, 2017 at 4:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I didn't say it but my intention was to just throw an ERROR if no
>> single unique index can be identified.
>>
>> It could be possible to still run MERGE in that situation but we would
>> need to take a full table lock at ShareRowExclusive. It's quite likely
>> that such statements would throw duplicate update errors, so I
>> wouldn't be aiming to do anything with that for PG11.
>
> Like Peter, I think taking such a strong lock for a DML statement
> doesn't sound like a very desirable way forward. It means, for
> example, that you can only have one MERGE in progress on a table at
> the same time, which is quite limiting. It could easily be the case
> that you have multiple MERGE statements running at once but they touch
> disjoint groups of rows and therefore everything works. I think the
> code should be able to cope with concurrent changes, if nothing else
> by throwing an ERROR, and then if the user wants to ensure that
> doesn't happen by taking ShareRowExclusiveLock they can do that via an
> explicit LOCK TABLE statement -- or else they can prevent concurrency
> by any other means they see fit.

+1, I would expect users to run this query in parallel on the same table for multiple data sets.

Peter has taken some time to explain his arguments to me today, and I agree that it does not sound very appealing to have constraint limitations for MERGE. In particular, using the existing ON CONFLICT structure gives the feeling of having a grammar twice for what's basically the same feature, with pretty much the same restrictions.

By the way, this page sums up nicely the situation of the many implementations of UPSERT across systems: https://en.wikipedia.org/wiki/Merge_(SQL)

--
Michael
On Fri, Oct 27, 2017 at 02:13:27PM -0700, srielau wrote:
> While the standard may not require a unique index for the ON clause I have
> never seen a MERGE statement that did not have this property. So IMHO this
> is a reasonable restrictions.

I don't understand how one could have a conflict upon which to turn INSERT into UPDATE without having a UNIQUE constraint violated... The only question is whether one should have control over -or have to specify- which constraint violations lead to UPDATE vs. which ones lead to failure vs. which ones lead to doing nothing.

The row to update is the one that the to-be-inserted row conflicted with -- there can only have been one if the constraint violated was a PRIMARY KEY constraint, or if there is a PRIMARY KEY at all, but if there's no PRIMARY KEY, then there can have been more conflicting rows because of NULL columns in the to-be-inserted row. If the to-be-inserted row conflicts with multiple rows, then just fail, or don't allow MERGE on tables that have no PK (as you know, many think it makes no sense to not have a PK on a table in SQL).

In the common case one does not care about which UNIQUE constraint is violated because there's only one that could have been violated, or because if the UPDATE should itself cause some other UNIQUE constraint to be violated, then the whole statement should fail.

PG's UPSERT is fantastic -- it allows very fine-grained control, but it isn't as pithy as it could be when the author doesn't care to specify all that detail. Also, something like SQLite3's INSERT OR REPLACE is very convenient: pithy, INSERT syntax, upsert-like semantics[*].

I'd like to have this in PG:

  INSERT INTO .. ON CONFLICT DO UPDATE;
      -- I.e., update all columns of the existing row to match the
      -- ones from the row that would have been inserted had there
      -- not been a conflict.
      --
      -- If an INSERTed row conflicts and then the UPDATE it devolves
      -- to also conflicts, then fail.

and

  INSERT INTO .. ON CONFLICT DO UPDATE
      -- I.e., update all columns of the existing row to match the
      -- ones from the row that would have been inserted had there
      -- not been a conflict.
  ON CONFLICT DO NOTHING;
      -- If an INSERTed row conflicts and then the UPDATE it devolves
      -- to also conflicts, then DO NOTHING.

[*] SQLite3's INSERT OR REPLACE is NOT an insert-or-update, but an insert-or-delete-and-insert, and any deletions that occur in the process do fire triggers. INSERT OR UPDATE would be much more useful.

Nico
--
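For contrast, the fully spelled-out form PostgreSQL requires today looks like this (schema invented for illustration); the pithy variants proposed above are essentially shorthand for it:

```sql
-- What the proposed shorthand would expand to under current syntax
-- (schema invented for illustration):
INSERT INTO kv (k, v)
VALUES ('a', 1)
ON CONFLICT (k) DO UPDATE
SET v = EXCLUDED.v;   -- today, each updated column is listed explicitly
```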
On 28 October 2017 at 00:31, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Oct 27, 2017 at 9:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Like Peter, I think taking such a strong lock for a DML statement
>> doesn't sound like a very desirable way forward. [...]
>
> +1, I would suspect users to run this query in parallel of the same
> table for multiple data sets.
>
> Peter has taken some time to explain me a bit his arguments today, and
> I agree that it does not sound much appealing to have constraint
> limitations for MERGE. Particularly using the existing ON CONFLICT
> structure gets the feeling of having twice a grammar for what's
> basically the same feature, with pretty much the same restrictions.
>
> By the way, this page sums up nicely the situation about many
> implementations of UPSERT taken for all systems:
> https://en.wikipedia.org/wiki/Merge_(SQL)

That Wikipedia article is badly out of date and regrettably does NOT sum up the current situation nicely any more: the definition of MERGE has changed in SQL:2011 since its introduction in SQL:2003.

I'm proposing a MERGE statement for PG11 that
i) takes a RowExclusiveLock on rows, so can be run concurrently
ii) uses the ON CONFLICT infrastructure to do that and so requires a unique constraint.

The above is useful behaviour that will be of great benefit to PostgreSQL users. There are no anomalies remaining.

SQL:2011 specifically states "The extent to which an SQL-implementation may disallow independent changes that are not significant is implementation-defined", so in my reading the above behaviour would make us fully spec compliant. Thank you to Peter for providing the infrastructure on which this is now possible for PG11.

Serge puts this very nicely by identifying two different use cases for MERGE.

Now, I accept that you might also want a MERGE statement that continues to work even if there is no unique constraint, but it would need to have different properties to the above. I do not in any way argue against adding that. I also agree that adding RETURNING at a later stage would be fine as well. I am proposing that those and any other additional properties people come up with can be added in later releases once we have the main functionality in core in PG11.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 28, 2017 at 3:10 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> SQL:2011 specifically states "The extent to which an
> SQL-implementation may disallow independent changes that are not
> significant is implementation-defined", so in my reading the above
> behaviour would make us fully spec compliant. Thank you to Peter for
> providing the infrastructure on which this is now possible for PG11.
>
> Serge puts this very nicely by identifying two different use cases for MERGE.

MERGE benefits from having a join that is more or less implemented in the same way as any other join. It can be a merge join, hash join, or nestloop join. ON CONFLICT doesn't work using a join.

Should I take it that you won't be supporting any of these alternative join algorithms? If not, then you'll have something that really isn't comparable to MERGE as implemented in Oracle, SQL Server, or DB2. They *all* do this.

Would the user be able to omit WHEN NOT MATCHED/INSERT, as is the case with every existing MERGE implementation? If so, what actually happens under the hood when WHEN NOT MATCHED is omitted? For example, would you actually use a regular "UPDATE FROM" style join, as opposed to the ON CONFLICT infrastructure? And, if that is so, can you justify the semantic difference for rows that are updated in each scenario (omitted vs. not omitted) in READ COMMITTED mode? Note that this could be the difference between updating a row when *no* version is visible to our MVCC snapshot, as opposed to doing the EPQ stuff and updating the latest row version if possible. That's a huge, surprising difference. On top of all this, you risk live-lock if INSERT isn't a possible outcome (this is also why ON CONFLICT can never accept a predicate on its INSERT portion -- again, quite unlike MERGE).

Why not just follow what other systems do? It's actually easier to go that way, and you get a better outcome. ON CONFLICT involves what you could call a sleight of hand, and I fear that you don't appreciate just how specialized the internal infrastructure is.

> Now, I accept that you might also want a MERGE statement that
> continues to work even if there is no unique constraint, but it would
> need to have different properties to the above. I do not in any way
> argue against adding that.

Maybe you *should* be arguing against it, though, and arguing against ever supporting anything but equijoins, because these things will *become* impossible if you go down that road. By starting with the ON CONFLICT infrastructure, while framing no-unique-index-support as work for some unspecified future release, you're leaving it up to someone else to resolve the problems. Someone else must square the circle of mixing ON CONFLICT semantics with fully generalized MERGE semantics. But who?

--
Peter Geoghegan
On 28 October 2017 at 20:39, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sat, Oct 28, 2017 at 3:10 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> SQL:2011 specifically states "The extent to which an
>> SQL-implementation may disallow independent changes that are not
>> significant is implementation-defined", so in my reading the above
>> behaviour would make us fully spec compliant. Thank you to Peter for
>> providing the infrastructure on which this is now possible for PG11.
>>
>> Serge puts this very nicely by identifying two different use cases for MERGE.
>
> MERGE benefits from having a join that is more or less implemented in
> the same way as any other join. It can be a merge join, hash join, or
> nestloop join. ON CONFLICT doesn't work using a join.
>
> Should I take it that you won't be supporting any of these
> alternative join algorithms? If not, then you'll have something that
> really isn't comparable to MERGE as implemented in Oracle, SQL Server,
> or DB2. They *all* do this.
>
> [...]
>
> Why not just follow what other systems do? It's actually easier to go
> that way, and you get a better outcome. ON CONFLICT involves what you
> could call a sleight of hand, and I fear that you don't appreciate
> just how specialized the internal infrastructure is.
>
>> Now, I accept that you might also want a MERGE statement that
>> continues to work even if there is no unique constraint, but it would
>> need to have different properties to the above. I do not in any way
>> argue against adding that.
>
> Maybe you *should* be arguing against it, though, and arguing against
> ever supporting anything but equijoins, because these things will
> *become* impossible if you go down that road. By starting with the ON
> CONFLICT infrastructure, while framing no-unique-index-support as work
> for some unspecified future release, you're leaving it up to someone
> else to resolve the problems. Someone else must square the circle of
> mixing ON CONFLICT semantics with fully generalized MERGE semantics.
> But who?

Nothing I am proposing blocks later work. Everything you say makes it clear that a fully generalized solution is going to be many years in the making, assuming we agree.

"The extent to which an SQL-implementation may disallow independent changes that are not significant is implementation-defined". So we get to choose. I recommend that we choose something practical. We're approaching the 10 year anniversary of my first serious attempt to do MERGE. I say that it's time to move forwards with useful solutions, rather than wait another 10 years for the perfect one, even assuming it exists.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Oct 27, 2017 at 3:00 PM, Serge Rielau <serge@rielau.com> wrote: >> What other systems *do* have this restriction? I've never seen one that did. > > Not clear what you are leading up to here. > When I did MERGE in DB2 there was also no limitation: > "Each row in the target can only be operated on once. A row in the target can only be identified as MATCHED with one row in the result table of the table-reference" > What there was however was a significant amount of code I had to write and test to enforce the above second sentence. Then it seems that we were talking about two different things all along. > Maybe in PG there is a trivial way to detect an expanding join and block it at runtime. There is for ON CONFLICT. See the cardinality violation logic within ExecOnConflictUpdate(). (There are esoteric cases where this error can be raised due to a wCTE that does an insert "from afar", which is theoretically undesirable but not actually a problem.) The MERGE implementation that I have in mind would probably do almost the same thing, and make the "HeapTupleSelfUpdated" case within ExecUpdate() raise an error when the caller happened to be a MERGE, rather than following the historic UPDATE behavior. (The behavior is to silently suppress a second or subsequent UPDATE attempt from the same command, a behavior that Simon's mock MERGE documentation references.) > So the whole point I'm trying to make is that I haven't seen the need for the extra work I had to do once the feature appeared in the wild. That seems pretty reasonable to me. My whole point is that I think it's a mistake to do things like lock rows ahead of evaluating any UPDATE predicate, in the style of ON CONFLICT, in order to replicate the ON CONFLICT guarantees [1]. I'm arguing for implementation simplicity, too. 
Trying to implement MERGE in a way that extends ON CONFLICT seems like a big mistake to me, because ON CONFLICT updates rows on the basis of a would-be duplicate violation, along with all the baggage that that carries. This is actually enormously different to an equi-join that is fed by a scan using an MVCC snapshot. The main difference is that there actually is no MVCC snapshot in play in most cases [2]. If *no* row with the PK value of 5 is visible to our MVCC snapshot, but an xact committed having inserted such a row, that still counts as a CONFLICT with READ COMMITTED. [1] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_implementation [2] https://www.postgresql.org/docs/devel/static/transaction-iso.html#xact-read-committed -- Peter Geoghegan
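The semantic gap Peter describes can be made concrete with two statements that look superficially equivalent but behave differently under concurrency. This is a sketch only: the table and values are hypothetical, and the MERGE syntax follows the SQL:2011 shape of the statement under discussion.

```sql
-- ON CONFLICT: the arbiter is a unique index on pk. A concurrently
-- committed row with pk = 5 counts as a conflict even when no version
-- of it is visible to our MVCC snapshot, so in READ COMMITTED mode
-- this never raises a duplicate-key error.
INSERT INTO target (pk, val) VALUES (5, 'new')
ON CONFLICT (pk) DO UPDATE SET val = EXCLUDED.val;

-- MERGE per the standard: driven by an ordinary join against the MVCC
-- snapshot. If no row with pk = 5 is visible, the NOT MATCHED action
-- fires, and the INSERT can still fail with a duplicate-key error if
-- another transaction inserted pk = 5 concurrently.
MERGE INTO target t
    USING (VALUES (5, 'new')) AS s (pk, val)
    ON t.pk = s.pk
    WHEN MATCHED THEN UPDATE SET val = s.val
    WHEN NOT MATCHED THEN INSERT (pk, val) VALUES (s.pk, s.val);
```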
On Sat, Oct 28, 2017 at 12:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Nothing I am proposing blocks later work. Actually, many things will block future work if you go down that road. You didn't respond to the specific points I raised, but that doesn't mean that they're not real. > Everything you say makes it clear that a fully generalized solution is > going to be many years in the making, assuming we agree. I think that it's formally impossible as long as you preserve the ON CONFLICT guarantees, unless you somehow define the problems out of existence. Those are guarantees which no other MERGE implementation has ever made, and which the SQL standard says nothing about. And for good reasons. > "The extent to which an SQL-implementation may disallow independent > changes that are not significant is implementation-defined". > > So we get to choose. I recommend that we choose something practical. > We're approaching the 10 year anniversary of my first serious attempt > to do MERGE. I say that it's time to move forwards with useful > solutions, rather than wait another 10 years for the perfect one, even > assuming it exists. As far as I'm concerned, you're the one arguing for an unobtainable solution over a good one, not me. I *don't* think you should solve the problems that I raise -- you should instead implement MERGE without any of the ON CONFLICT guarantees, just like everyone else has. Building MERGE on top of the ON CONFLICT guarantees, and ultimately arriving at something that is comparable to other implementations over many releases might be okay if anyone had the slightest idea of what that would look like. You haven't even _described the semantics_, which you could do by addressing the specific points that I raised. -- Peter Geoghegan
On 28 October 2017 at 22:04, Peter Geoghegan <pg@bowt.ie> wrote: > As far as I'm concerned, you're the one arguing for an unobtainable > solution over a good one, not me. I *don't* think you should solve the > problems that I raise -- you should instead implement MERGE without > any of the ON CONFLICT guarantees, just like everyone else has. > Building MERGE on top of the ON CONFLICT guarantees, and ultimately > arriving at something that is comparable to other implementations over > many releases might be okay if anyone had the slightest idea of what > that would look like. You haven't even _described the semantics_, > which you could do by addressing the specific points that I raised. 
I have no objection to you writing a better version than me and if my work inspires you to complete that in a reasonable timescale then we all win. I'm also very happy to work together on writing the version described - you have already done much work in this area. I don't see any major problems in points you've raised so far, though obviously there is much detail. All of this needs to be written and then committed, so I'll get on and write my proposal. We can then see whether that is an 80% solution or something less. There are more obvious barriers to completion at this point, like time and getting on with it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Oct 29, 2017 at 4:48 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > I have no objection to you writing a better version than me and if my > work inspires you to complete that in a reasonable timescale then we > all win. My whole point is that the way that you seem determined to go on this is a dead end. I don't think that *anyone* can go improve on what you come up with if that's based heavily on ON CONFLICT, for the simple reason that the final user visible design is totally unclear. There is an easy way to make me shut up - come up with a design for MERGE that more or less builds on how UPDATE FROM works, rather than building MERGE on ON CONFLICT. (You might base things like RLS handling on ON CONFLICT, but in the main MERGE should be like an UPDATE FROM with an outer join, that can do INSERTs and DELETEs, too.) The original effort to add MERGE didn't do anything upsert-like, which Heikki (the GSOC mentor of the project) was perfectly comfortable with. I'm too lazy to go search the archives right now, but it's there. Heikki cites the SQL standard. This is what MERGE *actually is*, which you can clearly see from the Oracle/SQL Server/DB2 docs. It says this in the first paragraph of their MERGE documentation. It's crystal clear from their docs -- "This statement is a convenient way to combine multiple operations. It lets you avoid multiple INSERT, UPDATE, and DELETE DML statements." > I'm also very happy to work together on writing the version > described - you have already done much work in this area. You seem to want to preserve the ON CONFLICT guarantees at great cost. But you haven't even defended that based on a high level goal, or a use case, or something that makes sense to users (describing how it is possible is another matter). You haven't even tried to convince me. > I don't see any major problems in points you've raised so far, though > obviously there is much detail. Did you even read them? They are not mere details. 
They're fundamental to the semantics of the feature (if you base it on ON CONFLICT). It's not actually important that you understand them all; the important message is that generalizing ON CONFLICT has all kinds of terrible problems. > All of this needs to be written and then committed, so I'll get on and > write my proposal. We can then see whether that is an 80% solution or > something less. There are more obvious barriers to completion at this > point, like time and getting on with it. Getting on with *what*, exactly? In general, I have nothing against an 80% solution, or even a 50% solution, provided there is a plausible path to a 100% solution. I don't think that you have such a path, but only because you're tacitly inserting requirements that no other MERGE implementation has to live with, that I doubt any implementation *could* live with. Again, I'm not the one making this complicated, or adding requirements that will be difficult for you to get into your v1 -- you're the one doing that. The semantics that I suggest (the SQL standard's semantics) will require less code, and will be far simpler. Right now, I simply don't understand why you're insisting on using ON CONFLICT without even saying why. I can only surmise that you think that doing so will simplify the implementation, but I can guarantee you that it won't. -- Peter Geoghegan
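For reference, the "convenient way to combine multiple operations" reading that Peter quotes from the Oracle/SQL Server/DB2 docs looks like the following: one join-driven statement whose ModifyTable node can insert, update, or delete depending on the match. A sketch with hypothetical table names, in the SQL:2011 shape of the statement:

```sql
-- Equivalent in spirit to an UPDATE ... FROM over an outer join,
-- except that non-matching source rows can INSERT, matching rows can
-- also DELETE, and each WHEN arm can carry an extra qual.
MERGE INTO account a
    USING staging s
    ON a.id = s.id
    WHEN MATCHED AND s.balance = 0 THEN DELETE
    WHEN MATCHED THEN UPDATE SET balance = s.balance
    WHEN NOT MATCHED THEN INSERT (id, balance) VALUES (s.id, s.balance);
```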
On Sun, Oct 29, 2017 at 1:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Nothing I am proposing blocks later work. That's not really true. Nobody's going to be happy if MERGE has one behavior in one set of cases and an astonishingly different behavior in another set of cases. If you adopt a behavior for certain cases that can't be extended to other cases, then you're blocking a general-purpose MERGE. And, indeed, it seems that you're proposing an implementation that adds no new functionality, just syntax compatibility. Do we really want or need two syntaxes for the same thing in core? I kinda think Peter might have the right idea here. Under his proposal, we'd be getting something that is, in a way, new. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 29 October 2017 at 21:25, Peter Geoghegan <pg@bowt.ie> wrote: > The semantics that I suggest (the SQL standard's semantics) will > require less code, and will be far simpler. Right now, I simply don't > understand why you're insisting on using ON CONFLICT without even > saying why. I can only surmise that you think that doing so will > simplify the implementation, but I can guarantee you that it won't. If you see problems in my proposal, please show the specific MERGE SQL statements that you think will give problems and explain how and what the failures will be. We can then use those test cases to drive development. If we end up with code for multiple approaches we will be able to evaluate the differences between proposals using the test cases produced. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 30 October 2017 at 09:44, Robert Haas <robertmhaas@gmail.com> wrote: > On Sun, Oct 29, 2017 at 1:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Nothing I am proposing blocks later work. > > That's not really true. Nobody's going to be happy if MERGE has one > behavior in one set of cases and an astonishingly different behavior > in another set of cases. If you adopt a behavior for certain cases > that can't be extended to other cases, then you're blocking a > general-purpose MERGE. If a general purpose solution exists, please explain what it is. The problem is that nobody has ever done so, so it's not like we are discussing the difference between bad solution X and good solution Y, we are comparing reasonable solution X with non-existent solution Y. > And, indeed, it seems that you're proposing an implementation that > adds no new functionality, just syntax compatibility. Do we really > want or need two syntaxes for the same thing in core? I kinda think > Peter might have the right idea here. Under his proposal, we'd be > getting something that is, in a way, new. Partitioning looked like "just new syntax", but it has been a useful new feature. MERGE provides new capabilities that we do not have and is much more powerful than INSERT/UPDATE, in a syntax that follows what other databases use today. Just like partitioning. Will what I propose do everything in the first release? No, just like partitioning. If other developers are able to do things in phases, then I claim that right also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Oct 30, 2017 at 6:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> That's not really true. Nobody's going to be happy if MERGE has one >> behavior in one set of cases and an astonishingly different behavior >> in another set of cases. If you adopt a behavior for certain cases >> that can't be extended to other cases, then you're blocking a >> general-purpose MERGE. > > If a general purpose solution exists, please explain what it is. For the umpteenth time, a general purpose solution is one that more or less works like an UPDATE FROM, with an outer join, whose ModifyTable node is capable of insert, update, or delete (and accepts quals for MATCHED and NOT matched cases, etc). You could still get duplicate violations due to concurrent activity in READ COMMITTED mode, but not at higher isolation levels thanks to Thomas Munro's work there. In this world, ON CONFLICT and MERGE are fairly distinct things. What's wrong with that? You haven't actually told us why you don't like that. > The problem is that nobody has ever done so, so its not like we are > discussing the difference between bad solution X and good solution Y, > we are comparing reasonable solution X with non-existent solution Y. Nobody knows what your proposal would be like when time came to remove the restrictions that you suggest could be removed later. You're the one with the information here. We need to know what those semantics would be up-front, since you're effectively committing us down that path. You keep making vague appeals to pragmatism, but, in all sincerity, I don't understand where you're coming from at all. I strongly believe that generalizing from ON CONFLICT doesn't even make the implementation easier in the short term. ISTM that you're making this difficult for yourself for reasons that are known only to you. >> And, indeed, it seems that you're proposing an implementation that >> adds no new functionality, just syntax compatibility. 
Do we really >> want or need two syntaxes for the same thing in core? I kinda think >> Peter might have the right idea here. Under his proposal, we'd be >> getting something that is, in a way, new. > > Partitioning looked like "just new syntax", but it has been a useful > new feature. False equivalency. Nobody, including you, ever argued that that work risked painting us into a corner. (IIRC you said something like the progress was too small to justify putting into a single release.) > MERGE provides new capabilities that we do not have and is much more > powerful than INSERT/UPDATE, in a syntax that follow what other > databases use today. Just like partitioning. But you haven't told us *how* it is more powerful. Again, the semantics of a MERGE that is a generalization of ON CONFLICT are not at all obvious, and seem like they might be very surprising and risky. There is no question that it's your job to (at a minimum) define those semantics ahead of time, since you're going to commit us to them in the long term if you continue down this path. It is most emphatically *not* just a "small matter of programming". -- Peter Geoghegan
On 30 October 2017 at 18:59, Peter Geoghegan <pg@bowt.ie> wrote: > It is most emphatically *not* just a "small matter of programming". Please explain in detail the MERGE SQL statements that you think will be problematic and why. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Oct 30, 2017 at 11:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Please explain in detail the MERGE SQL statements that you think will > be problematic and why. Your proposal is totally incomplete, so I can only surmise its behavior in certain cases, to make a guess at what the problems might be (see my remarks on EPQ, live lock, etc). This makes it impossible to do what you ask right now. Besides, you haven't answered the question from my last e-mail ("What's wrong with that [set of MERGE semantics]?"), so why should I go to further trouble? You're just not constructively engaging with me at this point. We're going around in circles. -- Peter Geoghegan
On Mon, Oct 30, 2017 at 10:59:43AM -0700, Peter Geoghegan wrote: > On Mon, Oct 30, 2017 at 6:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > If a general purpose solution exists, please explain what it is. > > For the umpteenth time, a general purpose solution is one that more or > less works like an UPDATE FROM, with an outer join, whose ModifyTable > node is capable of insert, update, or delete (and accepts quals for > MATCHED and NOT matched cases, etc). You could still get duplicate > violations due to concurrent activity in READ COMMITTED mode, but not > at higher isolation levels thanks to Thomas Munro's work there. In > this world, ON CONFLICT and MERGE are fairly distinct things. FWIW, and as an outsider, having looked at MERGE docs from other RDBMSes, I have to agree that the PG UPSERT (ON CONFLICT .. DO) and MERGE are rather different beasts. In particular, I suspect all UPSERT statements can be mapped onto equivalent MERGE statements, but not all MERGE statements can be mapped onto UPSERTs. The reason is that UPSERT depends on UNIQUE constraints, whereas MERGE uses a generic join condition that need not even refer to any INDEXes, let alone UNIQUE ones. Perhaps PG's UPSERT can be generalized to create a temporary UNIQUE constraint on the columns specified in the conflict_target portion of the statement, increasing the number of MERGE statements that could be mapped onto UPSERT. But even then, that would still be a UNIQUE constraint, whereas MERGE does not even imply such a thing. Now, a subset of MERGE (those using equijoins in the ON condition) can be mapped onto UPSERT provided a suitable UNIQUE index exists (or PG notionally creates a temporary UNIQUE constraint for the purpose of evaluating the UPSERT). This approach would NOT preclude a more complete subsequent implementation of MERGE. But I wonder if that's worthwhile given that a proper and complete implementation of MERGE is probably very desirable. 
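The one-way mapping Nico describes can be illustrated with a pair of statements. A sketch over a hypothetical table kv with a unique index on k; absent concurrent writers, the two behave identically:

```sql
-- The UPSERT form: requires the unique index on k as its arbiter.
INSERT INTO kv (k, v) VALUES (1, 'a')
ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v;

-- The equivalent MERGE: needs only the join condition, not the index.
MERGE INTO kv t
    USING (VALUES (1, 'a')) AS s (k, v)
    ON t.k = s.k
    WHEN MATCHED THEN UPDATE SET v = s.v
    WHEN NOT MATCHED THEN INSERT (k, v) VALUES (s.k, s.v);

-- The reverse mapping fails in general: a MERGE whose ON clause is not
-- an equality over a unique key has no ON CONFLICT equivalent.
```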
On a tangentially related note, I've long wanted to have an RDBMS-independent SQL parser for the purpose of implementing external query-rewriting (and external optimizers), syntax highlighting, and so on. Having an external / plug-in method for rewriting unsupported SQL as a way of bridging functionality gaps (like lack of MERGE support) would be very nice. PG does have a way to expose its AST... It might be a good idea to start by implementing unsupported SQL features in such a way that they parse and can produce an AST along with a syntax/unsupported error -- then one might rewrite the parsed AST, generate appropriate SQL, and execute that. Nico
* Robert Haas (robertmhaas@gmail.com) wrote: > On Sun, Oct 29, 2017 at 1:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > Nothing I am proposing blocks later work. > > That's not really true. Nobody's going to be happy if MERGE has one > behavior in one set of cases and an astonishingly different behavior > in another set of cases. If you adopt a behavior for certain cases > that can't be extended to other cases, then you're blocking a > general-purpose MERGE. > > And, indeed, it seems that you're proposing an implementation that > adds no new functionality, just syntax compatibility. Do we really > want or need two syntaxes for the same thing in core? I kinda think > Peter might have the right idea here. Under his proposal, we'd be > getting something that is, in a way, new. +1. I don't think MERGE should be radically different from other database systems and just syntax sugar over a capability we have. The downthread comparison to partitioning isn't accurate either. There's a reason that we have INSERT .. ON CONFLICT and not MERGE and it's because they aren't the same thing, as Peter's already explained, both now and when he and I had exactly this same discussion years ago when he was working on implementing INSERT .. ON CONFLICT. Time changes many things, but I don't think anything's changed in this from the prior discussions about it. Thanks! Stephen
On 30 October 2017 at 19:17, Peter Geoghegan <pg@bowt.ie> wrote: > On Mon, Oct 30, 2017 at 11:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Please explain in detail the MERGE SQL statements that you think will >> be problematic and why. > > Your proposal is totally incomplete, so I can only surmise its > behavior in certain cases, to make a guess at what the problems might > be (see my remarks on EPQ, live lock, etc). This makes it impossible > to do what you ask right now. Impossible, huh. Henry Ford was right. If there are challenges ahead, it's reasonable to ask for test cases for that now especially if you think you know what they already are. Imagine we go forwards 2 months - if you dislike my patch when it exists you will submit a test case showing the fault. Why not save us all the trouble and describe that now? Test Driven Development. > Besides, you haven't answered the question from my last e-mail > ("What's wrong with that [set of MERGE semantics]?"), so why should I > go to further trouble? You're just not constructively engaging with me > at this point. It's difficult to discuss anything with someone that refuses to believe that there are acceptable ways around things. I believe there are. If you can calm down the rhetoric we can work together, but if you continue to grandstand it makes it more difficult. > We're going around in circles. Not really. You've said some things and I'm waiting for further details about the problems you've raised. You've said its possible another way. Show that assertion is actually true. We're all listening, me especially, for the technical details. I've got a fair amount of work to do to get the rest of the patch in shape, so take your time and make it a complete explanation. My only goal is the MERGE feature in PG11. For me this is a collaborative engineering challenge not a debate and definitely not an argument. If you describe your plan of how to do this, I may be able to follow it and include that design. 
If you don't, then it will be difficult for me to include your thoughts. If you or others wish to write something as well, I have already said that is OK too. If anybody's goal is to block development or to wait for perfection to exist at some unstated time in the future, then I disagree with those thoughts. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Can I add my 2c worth, as someone without a horse in the race, as it were, in the hope that telling me how I've got this wrong might clarify the argument a bit (or at least you can all start shouting at me rather than each other :) ) The point of merge is to allow you to choose to either INSERT or UPDATE (or indeed DELETE) records based on existing state, yes? That state is whatever the state of the system was at the start of the transaction? If I understand correctly, the only time when this would be problematic is if you try to insert a record into a table which would not allow that INSERT because another transaction has performed an INSERT by the time the COMMIT happens, and where that new record would have changed the state of the MERGE clause, yes? Isn't the only reason this would fail if there is a unique constraint on that table? Yes, you could end up INSERTing values from the merge when another transaction has INSERTed another, but (again, unless I've misunderstood) there's nothing in the spec that says that shouldn't happen; meanwhile for those tables that do require unique values you can use the UPSERT mechanism, no? Geoff
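The race Geoff is asking about can be sketched as an interleaving of two READ COMMITTED sessions against a hypothetical table created with CREATE TABLE t (k int PRIMARY KEY, v text). The MERGE syntax follows the SQL:2011 shape under discussion:

```sql
-- Session 2 starts first and has not yet committed:
BEGIN;
INSERT INTO t (k, v) VALUES (1, 'theirs');

-- Session 1: no row with k = 1 is visible to its snapshot, so the
-- NOT MATCHED action is chosen; the unique-index insertion then
-- blocks on session 2's pending row:
MERGE INTO t
    USING (VALUES (1, 'ours')) AS s (k, v)
    ON t.k = s.k
    WHEN MATCHED THEN UPDATE SET v = s.v
    WHEN NOT MATCHED THEN INSERT (k, v) VALUES (s.k, s.v);

-- Session 2:
COMMIT;

-- Session 1 now fails with a duplicate-key error. The same
-- interleaving with INSERT ... ON CONFLICT (k) DO UPDATE would
-- instead wait, treat the committed row as a conflict, and update it.
```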
On 30 October 2017 at 19:55, Stephen Frost <sfrost@snowman.net> wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Sun, Oct 29, 2017 at 1:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> > Nothing I am proposing blocks later work. >> >> That's not really true. Nobody's going to be happy if MERGE has one >> behavior in one set of cases and an astonishingly different behavior >> in another set of cases. If you adopt a behavior for certain cases >> that can't be extended to other cases, then you're blocking a >> general-purpose MERGE. >> >> And, indeed, it seems that you're proposing an implementation that >> adds no new functionality, just syntax compatibility. Do we really >> want or need two syntaxes for the same thing in core? I kinda think >> Peter might have the right idea here. Under his proposal, we'd be >> getting something that is, in a way, new. > > +1. > > I don't think MERGE should be radically different from other database > systems and just syntax sugar over a capability we have. I've proposed a SQL Standard compliant implementation that would do much more than be new syntax over what we already have. So these two claims aren't accurate: "radical difference" and "syntax sugar over a capability we have". > Time changes > many things, but I don't think anything's changed in this from the prior > discussions about it. My proposal is new, that is what has changed. At this stage, general opinions can be misleading. Hi ho, hi ho. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon, * Simon Riggs (simon@2ndquadrant.com) wrote: > On 30 October 2017 at 19:55, Stephen Frost <sfrost@snowman.net> wrote: > > I don't think MERGE should be radically different from other database > > systems and just syntax sugar over a capability we have. > > I've proposed a SQL Standard compliant implementation that would do > much more than be new syntax over what we already have. > > So these two claims aren't accurate: "radical difference" and "syntax > sugar over a capability we have". Based on the discussion so far, those are the conclusions I've come to. Saying they aren't accurate without providing anything further isn't likely to be helpful. > > Time changes > > many things, but I don't think anything's changed in this from the prior > > discussions about it. > > My proposal is new, that is what has changed. I'm happy to admit that I've missed something in the discussion, but from what I read the difference appeared to be primarily that you're proposing it, and doing so a couple years later. > At this stage, general opinions can be misleading. I'd certainly love to see a MERGE capability that meets the standard and works in much the same way from a user's perspective as the other RDBMS' which already implement it. From prior discussions with Peter on exactly that subject, I'm also convinced that having that would be a largely independent piece of work from the INSERT .. ON CONFLICT capability that we have (just as MERGE is distinct from similar UPSERT capabilities in other RDBMSs). The goal here is really just to avoid time wasted developing MERGE based on top of the INSERT .. ON CONFLICT system only to have it be rejected later because multiple other committers and major contributors have said that they don't agree with that approach. If, given all of this discussion, you don't feel that's a high risk with your approach then by all means continue on with what you're thinking and we can review the patch once it's posted. Thanks! Stephen
On 31 October 2017 at 12:56, Stephen Frost <sfrost@snowman.net> wrote: > Simon, > > * Simon Riggs (simon@2ndquadrant.com) wrote: >> On 30 October 2017 at 19:55, Stephen Frost <sfrost@snowman.net> wrote: >> > I don't think MERGE should be radically different from other database >> > systems and just syntax sugar over a capability we have. >> >> I've proposed a SQL Standard compliant implementation that would do >> much more than be new syntax over what we already have. >> >> So these two claims aren't accurate: "radical difference" and "syntax >> sugar over a capability we have". > > Based on the discussion so far, those are the conclusions I've come to. > Saying they're isn't accurate without providing anything further isn't > likely to be helpful. I'm trying to be clear, accurate and non-confrontational. I appreciate those are your conclusions, which is why I need to be clear that the proposal has been misunderstood on those stated points. What else would you like me to add to be helpful? I will try... I've spent weeks looking at other implementations and combing the SQL Standard with a fine-tooth comb to see what is possible and what is not. My proposal shows a new way forwards and yet follows the standard in horrible detail. It has taken me nearly 2 months to write the doc page submitted here. It is not simply new syntax stretched over existing capability. The full syntax of MERGE is quite powerful and will take some months to make work, even with my proposal. I'm not sure what else to say, but I am happy to answer specific technical questions that have full context to allow a rational discussion. >> > Time changes >> > many things, but I don't think anything's changed in this from the prior >> > discussions about it. >> >> My proposal is new, that is what has changed. > > I'm happy to admit that I've missed something in the discussion, but > from what I read the difference appeared to be primarily that you're > proposing it, and doing so a couple years later. 
>
>> At this stage, general opinions can be misleading.
>
> I'd certainly love to see a MERGE capability that meets the standard and
> works in much the same way from a user's perspective as the other RDBMS'
> which already implement it.

The standard explains how it should work, with the proviso that the
standard allows us to define concurrent behavior. Serge has explained
that he sees that my proposal covers the main use cases. I don't yet see
how to cover all, but I'm looking to do the most important use cases
first and other things later.

> From prior discussions with Peter on
> exactly that subject, I'm also convinced that having that would be a
> largely independent piece of work from the INSERT .. ON CONFLICT
> capability that we have (just as MERGE is distinct from similar UPSERT
> capabilities in other RDBMSs).
>
> The goal here is really just to avoid time wasted developing MERGE based
> on top of the INSERT .. ON CONFLICT system only to have it be rejected
> later because multiple other committers and major contributors have said
> that they don't agree with that approach. If, given all of this
> discussion, you don't feel that's a high risk with your approach then by
> all means continue on with what you're thinking and we can review the
> patch once it's posted.

It is certainly a risk and I don't want to waste time either. I am happy
to consider any complete proposal for how to proceed, including details
and/or code, but I will be prioritising things to ensure that we have a
candidate feature for PG11 from at least one author, rather than a
research project for PG12+.

The key point here is concurrency. I have said that there is an
intermediate step that is useful and achievable, but does not cover
every case. Serge has also stated this - and it should be noted that he
is the only person in this discussion who has already written the MERGE
statement in SQL, for DB2, so I give more weight to his technical
opinion than I do to others, at this time.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Oct 31, 2017 at 2:25 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> If there are challenges ahead, its reasonable to ask for test cases
> for that now especially if you think you know what they already are.
> Imagine we go forwards 2 months - if you dislike my patch when it
> exists you will submit a test case showing the fault. Why not save us
> all the trouble and describe that now? Test Driven Development.

I already have, on several occasions now. But if you're absolutely
insistent on my constructing the test case in terms of a real SQL
statement, then that's what I'll do.

Consider this MERGE statement, from your mock documentation:

MERGE INTO wines w
USING wine_stock_changes s
ON s.winename = w.winename
WHEN NOT MATCHED AND s.stock_delta > 0 THEN
  INSERT VALUES(s.winename, s.stock_delta)
WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
  UPDATE SET stock = w.stock + s.stock_delta
ELSE
  DELETE;

Suppose we remove the WHEN NOT MATCHED case, leaving us with:

MERGE INTO wines w
USING wine_stock_changes s
ON s.winename = w.winename
WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
  UPDATE SET stock = w.stock + s.stock_delta
ELSE
  DELETE;

We now have a MERGE that will not INSERT, but will continue to UPDATE
and DELETE. (It's implied that your syntax cannot do this at all,
because you propose to use the ON CONFLICT infrastructure, but I think
we have to imagine a world in which that restriction either never
existed or has subsequently been lifted.)

The problem here is: Iff the first statement uses ON CONFLICT
infrastructure, doesn't the absence of WHEN NOT MATCHED imply different
semantics for the remaining updates and deletes in the second version of
the query? You've removed what seems like a neat adjunct to the MERGE,
but it actually changes everything else too when using READ COMMITTED.
Isn't that pretty surprising? If you're not clear on what I mean, see my
previous remarks on EPQ, live lock, and what a CONFLICT could be in READ
COMMITTED mode.
Concurrent activity at READ COMMITTED mode can be expected to significantly alter the outcome here. Why not just always use the behavior that the second query requires, which is very much like an UPDATE FROM with an outer join that can sometimes do deletes (and inserts)? We should always use an MVCC snapshot, and never play ON CONFLICT style games with visibility/dirty snapshots. > It's difficult to discuss anything with someone that refuses to > believe that there are acceptable ways around things. I believe there > are. Isn't that blind faith? Again, it seems like you haven't really tried to convince me. > If you can calm down the rhetoric we can work together, but if you > continue to grandstand it makes it more difficult. I'm trying to break the deadlock, by getting you to actually consider what I'm saying. I don't enjoy confrontation. Currently, it seems like you're just ignoring my objections, which you actually could fairly easily work through. That is rather frustrating. > You've said its possible another way. Show that assertion is actually > true. We're all listening, me especially, for the technical details. My proposal, if you want to call it that, has the merit of actually being how MERGE works in every other system. Both Robert and Stephen seem to be almost sold on what I suggest, so I think that I've probably already explained my position quite well. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
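The snapshot-only semantics Peter argues for can be sketched without any database at all. The following Python sketch is invented purely for illustration (the dict-based "tables" and both function names are hypothetical, and concurrency is deliberately ignored, which is exactly the point): the two MERGE variants from the wines example differ only in whether unmatched source rows are inserted, while the matched-row updates and deletes behave identically in both.

```python
# Row-level logic of the two MERGE variants from the wines example,
# applied to plain dicts. This is the "MVCC snapshot only" reading:
# no unique-index arbitration, no ON CONFLICT-style retries.
def merge_with_insert(wines, changes):
    for winename, delta in changes.items():
        if winename not in wines:           # WHEN NOT MATCHED
            if delta > 0:
                wines[winename] = delta     # ... THEN INSERT
        elif wines[winename] + delta > 0:   # WHEN MATCHED AND stock stays positive
            wines[winename] += delta        # ... THEN UPDATE
        else:
            del wines[winename]             # ELSE DELETE

def merge_without_insert(wines, changes):
    for winename, delta in changes.items():
        if winename not in wines:
            continue                        # no WHEN NOT MATCHED arm: row skipped
        if wines[winename] + delta > 0:
            wines[winename] += delta        # WHEN MATCHED ... UPDATE
        else:
            del wines[winename]             # ELSE DELETE

wines = {"merlot": 10, "shiraz": 2}
changes = {"merlot": 5, "shiraz": -2, "pinot": 3}
merge_with_insert(wines, changes)
# merlot updated to 15, shiraz deleted (stock would drop to 0), pinot inserted
```

Under this reading, removing the WHEN NOT MATCHED arm only removes inserts; everything else is unchanged, which is the consistency Peter says is lost if the insert-bearing variant switches to ON CONFLICT infrastructure.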
Is it possible to map MERGE onto a query with CTEs that does the various
DMLs, with all but the last RETURNING? Here's a sketch:

WITH matched_rows AS (
    SELECT FROM <target> t WHERE <condition>
), updated_rows AS (
    UPDATE <target> t SET ...
    WHERE ... AND t IN (SELECT j FROM matched_rows j)
    RETURNING t
), inserted_rows AS (
    INSERT INTO <target> t
    SELECT ...
    WHERE ... AND t NOT IN (SELECT j FROM matched_rows j)
    RETURNING t
)
DELETE FROM <target> t WHERE ...;

Now, one issue is that in PG CTEs are basically like temp tables, and
also like optimizer barriers, so this construction is not online, and if
matched_rows is very large, that would be a problem.

As an aside, I'd like to be able to control which CTEs are view-like and
which are table-like. In SQLite3, for example, they are all view-like,
and the optimizer will act accordingly, whereas in PG they are all
table-like, and thus optimizer barriers.

Nico
--
Nico Williams wrote: > As an aside, I'd like to be able to control which CTEs are view-like and > which are table-like. In SQLite3, for example, they are all view-like, > and the optimizer will act accordingly, whereas in PG they are all > table-like, and thus optimizer barriers. There was a short and easy to grasp (OK, maybe not) discussion on the topic of CTEs acting differently. I think the consensus is that for CTEs that are read-only and do not use functions that aren't immutable, they may be considered for inlining. https://www.postgresql.org/message-id/5351711493487900@web53g.yandex.ru -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 31 October 2017 at 18:55, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Oct 31, 2017 at 2:25 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If there are challenges ahead, its reasonable to ask for test cases
>> for that now especially if you think you know what they already are.
>> Imagine we go forwards 2 months - if you dislike my patch when it
>> exists you will submit a test case showing the fault. Why not save us
>> all the trouble and describe that now? Test Driven Development.
>
> I already have, on several occasions now. But if you're absolutely
> insistent on my constructing the test case in terms of a real SQL
> statement, then that's what I'll do.
>
> Consider this MERGE statement, from your mock documentation:
>
> MERGE INTO wines w
> USING wine_stock_changes s
> ON s.winename = w.winename
> WHEN NOT MATCHED AND s.stock_delta > 0 THEN
>   INSERT VALUES(s.winename, s.stock_delta)
> WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
>   UPDATE SET stock = w.stock + s.stock_delta
> ELSE
>   DELETE;
>
> Suppose we remove the WHEN NOT MATCHED case, leaving us with:
>
> MERGE INTO wines w
> USING wine_stock_changes s
> ON s.winename = w.winename
> WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
>   UPDATE SET stock = w.stock + s.stock_delta
> ELSE
>   DELETE;
>
> We now have a MERGE that will not INSERT, but will continue to UPDATE
> and DELETE.

Agreed

> (It's implied that your syntax cannot do this at all,
> because you propose to use the ON CONFLICT infrastructure, but I think we
> have to imagine a world in which that restriction either never existed
> or has subsequently been lifted.)
>
> The problem here is: Iff the first statement uses ON CONFLICT
> infrastructure, doesn't the absence of WHEN NOT MATCHED imply
> different semantics for the remaining updates and deletes in the
> second version of the query?

Not according to the SQL Standard, no. I have no plans for such
differences to exist.

Spec says: If we hit HeapTupleSelfUpdated then we throw an ERROR.
> You've removed what seems like a neat > adjunct to the MERGE, but it actually changes everything else too when > using READ COMMITTED. Isn't that pretty surprising? I think you're presuming things I haven't said and don't mean, so we're both surprised. > If you're not > clear on what I mean, see my previous remarks on EPQ, live lock, and > what a CONFLICT could be in READ COMMITTED mode. Concurrent activity > at READ COMMITTED mode can be expected to significantly alter the > outcome here. And I still have questions about what exactly you mean, but at least this post is going in the right direction and I'm encouraged. Thank you, I think we need some way of expressing the problems clearly. > That is rather frustrating. Guess so. >> You've said its possible another way. Show that assertion is actually >> true. We're all listening, me especially, for the technical details. > > My proposal, if you want to call it that, has the merit of actually > being how MERGE works in every other system. Both Robert and Stephen > seem to be almost sold on what I suggest, so I think that I've > probably already explained my position quite well. The only info I have is "a general purpose solution is one that more or less works like an UPDATE FROM, with an outer join, whose ModifyTable node is capable of insert, update, or delete (and accepts quals for MATCHED and NOT matched cases, etc). You could still get duplicate violations due to concurrent activity in READ COMMITTED mode". Surely the whole point of this is to avoid duplicate violations due to concurrent activity? I'm not seeing how either design sketch rigorously avoids live locks, but those are fairly unlikely and easy to detect and abort. Thank you for a constructive email, we are on the way to somewhere good. I have more to add, but wanted to get back to you soonish. 
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Nov 1, 2017 at 10:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> The problem here is: Iff the first statement uses ON CONFLICT >> infrastructure, doesn't the absence of WHEN NOT MATCHED imply >> different semantics for the remaining updates and deletes in the >> second version of the query? > > Not according to the SQL Standard, no. I have no plans for such > differences to exist. > > Spec says: If we hit HeapTupleSelfUpdated then we throw an ERROR. Your documentation said that the MERGE was driven by a speculative insertion (BTW, I don't think that this internal implementation detail should be referenced in user-facing docs). I inferred that that could not always be true, since there won't always be an INSERT/WHEN NOT MATCHED case, assuming that you allow that at all (I now gather that you will). >> You've removed what seems like a neat >> adjunct to the MERGE, but it actually changes everything else too when >> using READ COMMITTED. Isn't that pretty surprising? > > I think you're presuming things I haven't said and don't mean, so > we're both surprised. You're right -- I'm surmising what I think might be true, because I don't have the information available to know one way or the other. As far as this issue with using speculative insertions in one context but not in another goes, I still don't really know where you stand. I can still only surmise that you must want both implementations, and will use one or the other as circumstances dictate (to avoid dup violations in the style of ON CONFLICT where that's possible). This seems true because you now say that it will be possible to omit WHEN NOT MATCHED, and yet there is no such thing as a speculative insertion without the insertion. You haven't said that that conclusion is true yourself, but it's the only conclusion that I can draw based on what you have said. > I think we need some way of expressing the problems clearly. It's certainly hard to talk about these problems. I know this from experience. 
> "a general purpose solution is one that more or > less works like an UPDATE FROM, with an outer join, whose ModifyTable > node is capable of insert, update, or delete (and accepts quals for > MATCHED and NOT matched cases, etc). You could still get duplicate > violations due to concurrent activity in READ COMMITTED mode". > > Surely the whole point of this is to avoid duplicate violations due to > concurrent activity? Now we're getting somewhere. I *don't* think that that's the whole point of MERGE. No other MERGE implementation does that, or claims to do that. The SQL standard says nothing about this. Heikki found this to be acceptable when working on the GSoC MERGE implementation that went nowhere. My position is that we ought to let MERGE be MERGE, and let ON CONFLICT be ON CONFLICT. In Postgres, you can avoid duplicate violations with MERGE by using a higher isolation level (these days, those are turned into a serialization error at higher isolation levels when no duplicate is visible to the xact's snapshot). MERGE isn't and shouldn't be special when it comes to concurrency. > I'm not seeing how either design sketch rigorously avoids live locks, > but those are fairly unlikely and easy to detect and abort. My MERGE semantics (which really are not mine at all) avoid live lock/lock starvation by simply never retrying anything without making forward progress. MERGE doesn't take any special interest in concurrency, just like any other DML statement that isn't INSERT with ON CONFLICT. ON CONFLICT would have live locks if it didn't always have the choice of inserting [1]. In many ways, the syntax of INSERT ON CONFLICT DO UPDATE is restricted in exactly the way it needs to be in order to function correctly. It wasn't an accident that it didn't end up being UPDATE ... ON NOUPDATE DO INSERT, or something like that, which Robert proposed at one point. 
ON CONFLICT plays by its own rules to a certain extent, because that's what you need in order to get the desired guarantees in READ COMMITTED mode [2]. This is the main reason why it was as painful a project as it was. Further generalizing that seems fraught with difficulties. It seems logically impossible to generalize it in a way where you don't end up with two behaviors masquerading as one. [1] https://wiki.postgresql.org/wiki/UPSERT#Theoretical_lock_starvation_hazards [2] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_implementation -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
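Peter's suggestion above — run MERGE at a higher isolation level and let concurrent duplicates surface as serialization errors — implies the familiar client-side retry loop. Here is a minimal Python sketch of that pattern; every name in it is invented (`SerializationError` stands in for whatever a real driver raises for SQLSTATE 40001, and `run_merge_txn` stands in for a SERIALIZABLE transaction containing the MERGE), since no server is assumed.

```python
# Generic retry-on-serialization-failure loop a client would wrap around
# a SERIALIZABLE transaction. The whole transaction is re-run from scratch
# on each failure; after too many attempts the error propagates.
class SerializationError(Exception):
    pass

def with_serializable_retry(txn, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()      # BEGIN ISOLATION LEVEL SERIALIZABLE; MERGE ...; COMMIT
        except SerializationError:
            if attempt == max_attempts:
                raise         # give up; caller sees the serialization failure
            # otherwise loop and re-run the whole transaction

# Toy transaction that fails once, then succeeds, to exercise the loop.
attempts = {"n": 0}
def run_merge_txn():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise SerializationError("could not serialize access")
    return "merged"

result = with_serializable_retry(run_merge_txn)
```

The design point is that the retry happens in the client, at transaction granularity, rather than inside the statement as ON CONFLICT does.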
On 1 November 2017 at 01:55, Peter Geoghegan <pg@bowt.ie> wrote:
> The problem here is: Iff the first statement uses ON CONFLICT
> infrastructure, doesn't the absence of WHEN NOT MATCHED imply
> different semantics for the remaining updates and deletes in the
> second version of the query? You've removed what seems like a neat
> adjunct to the MERGE, but it actually changes everything else too when
> using READ COMMITTED.

Would these concerns be alleviated by adding some kind of Pg-specific
decoration that constrained concurrency-safe MERGEs?

So your first statement would be

MERGE CONCURRENTLY ...

and when you removed the WHEN NOT MATCHED clause it'd ERROR because
that's no longer able to be done with the same concurrency-safe
semantics?

I don't know if this would be helpful TBH, or if it would negate
Simon's compatibility goals. Just another idea.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2 November 2017 at 01:14, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Nico Williams wrote: > >> As an aside, I'd like to be able to control which CTEs are view-like and >> which are table-like. In SQLite3, for example, they are all view-like, >> and the optimizer will act accordingly, whereas in PG they are all >> table-like, and thus optimizer barriers. > > There was a short and easy to grasp (OK, maybe not) discussion on the > topic of CTEs acting differently. I think the consensus is that for > CTEs that are read-only and do not use functions that aren't immutable, > they may be considered for inlining. > https://www.postgresql.org/message-id/5351711493487900@web53g.yandex.ru Yep. All theoretical though, I don't think anyone (myself included) stumped up a patch. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 2, 2017 at 8:28 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 1 November 2017 at 01:55, Peter Geoghegan <pg@bowt.ie> wrote: >> The problem here is: Iff the first statement uses ON CONFLICT >> infrastructure, doesn't the absence of WHEN NOT MATCHED imply >> different semantics for the remaining updates and deletes in the >> second version of the query? You've removed what seems like a neat >> adjunct to the MERGE, but it actually changes everything else too when >> using READ COMMITTED. > > Would these concerns be alleviated by adding some kind of Pg-specific > decoration that constrained concurrency-safe MERGEs? > > So your first statement would be > > MERGE CONCURRENTLY ... > > and when you removed the WHEN NOT MATCHED clause it'd ERROR because > that's no longer able to be done with the same concurrency-safe > semantics? > > I don't know if this would be helpful TBH, or if it would negate > Simon's compatibility goals. Just another idea. Yes, that fixes the problem. Of course, it also turns MERGE CONCURRENTLY into syntactic sugar for INSERT ON CONFLICT UPDATE, which brings one back to the question of exactly what we're trying to achieve here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 31, 2017 at 5:14 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > I've proposed a SQL Standard compliant implementation that would do > much more than be new syntax over what we already have. > > So these two claims aren't accurate: "radical difference" and "syntax > sugar over a capability we have". I think those claims are pretty accurate. The design of INSERT .. ON CONFLICT UPDATE and the supporting machinery only work if there is a unique index. It provides radically different behavior than what you get with any other DML statement, with completely novel concurrency behavior, and there is no reasonable way to emulate that behavior in cases where no relevant unique index exists. Therefore, if MERGE eventually uses INSERT .. ON CONFLICT UPDATE when a relevant unique index exists and does something else, such as your proposal of taking a strong lock, or Peter's proposal of doing this in a concurrency-oblivious manner, in other cases, then those two cases will behave very differently. And if, in the meantime, MERGE can only handle the cases where there is a unique index, then it can only handle the cases INSERT .. ON CONFLICT UPDATE can cover, which makes it, as far as I can see, syntactic sugar over what we already have. Maybe it's not entirely - you might be planning to make some minor functional enhancements - but it's not clear what those are, and I feel like whatever it is could be done with less work and more elegance by just extending the INSERT .. ON CONFLICT UPDATE syntax. And it does seem to be your intention to only handle the cases which the INSERT .. ON CONFLICT UPDATE infrastructure can cover, because upthread you wrote this: "I didn't say it but my intention was to just throw an ERROR if no single unique index can be identified." I don't think anybody's putting words into your mouth here. We're just reading what you wrote. 
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
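Robert's point that the ON CONFLICT machinery "only works if there is a unique index" is easy to demonstrate. The sketch below uses SQLite's Postgres-style UPSERT syntax (available in SQLite >= 3.24) via Python's sqlite3 purely as a stand-in for the PostgreSQL behavior being discussed; the tables and data are invented for the demo.

```python
# ON CONFLICT arbitrates matched vs. not-matched via a unique constraint;
# without one, there is nothing to arbitrate on and the statement errors.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wines (winename TEXT PRIMARY KEY, stock INT)")
conn.execute("INSERT INTO wines VALUES ('merlot', 10)")

# Upsert: the unique index on winename decides update-vs-insert.
conn.execute(
    "INSERT INTO wines VALUES ('merlot', 5) "
    "ON CONFLICT (winename) DO UPDATE SET stock = stock + excluded.stock"
)
conn.execute(
    "INSERT INTO wines VALUES ('pinot', 3) "
    "ON CONFLICT (winename) DO UPDATE SET stock = stock + excluded.stock"
)
rows = dict(conn.execute("SELECT winename, stock FROM wines"))

# A table with no unique constraint: ON CONFLICT has no index to work with.
conn.execute("CREATE TABLE notes (body TEXT)")
try:
    conn.execute("INSERT INTO notes VALUES ('x') ON CONFLICT (body) DO NOTHING")
    conflict_arbitrated = True
except sqlite3.OperationalError:
    conflict_arbitrated = False   # no matching unique index: statement rejected
```

This restriction is exactly why a MERGE built on this machinery could only cover the unique-index cases.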
Robert Haas <robertmhaas@gmail.com> wrote:
> And if, in the meantime, MERGE can only handle the cases where there
> is a unique index, then it can only handle the cases INSERT .. ON
> CONFLICT UPDATE can cover, which makes it, as far as I can see,
> syntactic sugar over what we already have. Maybe it's not entirely -
> you might be planning to make some minor functional enhancements - but
> it's not clear what those are, and I feel like whatever it is could be
> done with less work and more elegance by just extending the INSERT ..
> ON CONFLICT UPDATE syntax.

+1

Marko Tiikkaja's INSERT ... ON CONFLICT SELECT patch, which is in the
current CF [1], moves things in this direction. I myself have
occasionally wondered if it was worth adding an alternative DO DELETE
conflict_action. This could appear alongside DO UPDATE, and be applied
using MERGE-style conditions. All of these things seem like small
adjuncts to ON CONFLICT because they're all just an alternative way of
modifying or projecting the tuple that is locked by ON CONFLICT.
Everything new would have to happen after the novel ON CONFLICT handling
has already completed.

The only reason that I haven't pursued this is because it doesn't seem
that compelling. I mention it now because it's worth acknowledging that
ON CONFLICT could be pushed a bit further in this direction. Of course,
this still falls far short of making ON CONFLICT entirely like MERGE.

[1] https://commitfest.postgresql.org/15/1241/

--
Peter Geoghegan
On 1 November 2017 at 18:20, Peter Geoghegan <pg@bowt.ie> wrote:
> In Postgres, you can avoid duplicate violations with MERGE by using a
> higher isolation level (these days, those are turned into a
> serialization error at higher isolation levels when no duplicate is
> visible to the xact's snapshot).

So if I understand you correctly, in your view MERGE should just fail
with an ERROR if it runs concurrently with other DML?

i.e. if there is a race condition between the query and an INSERT that
runs concurrently with another INSERT

We have no interest in making that work?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 02, 2017 at 06:49:18PM +0000, Simon Riggs wrote: > On 1 November 2017 at 18:20, Peter Geoghegan <pg@bowt.ie> wrote: > > In Postgres, you can avoid duplicate violations with MERGE by using a > > higher isolation level (these days, those are turned into a > > serialization error at higher isolation levels when no duplicate is > > visible to the xact's snapshot). > > So if I understand you correctly, in your view MERGE should just fail > with an ERROR if it runs concurrently with other DML? > > i.e. if a race condition between the query and an INSERT runs > concurrently with another INSERT > > We have no interest in making that work? If you map MERGE to a DML with RETURNING-DML CTEs as I suggested before, how would that interact with concurrent DMLs? The INSERT DML of the mapped statement could produce conflicts that abort the whole MERGE, correct? If you want to ignore conflicts arising from concurrency you could always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I proposed earlier. Thus a MERGE CONCURRENTLY could just do that. Is there any reason not to map MERGE as I proposed? Such an implementation of MERGE wouldn't be online because CTEs are always implemented sequentially currently. That's probably reason enough to eventually produce a native implementation of MERGE, ... or to revamp the CTE machinery to allow such a mapping to be online. Nico -- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
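The "ON CONFLICT DO NOTHING" escape hatch Nico mentions can be illustrated concretely. Again this uses SQLite's Postgres-style syntax (SQLite >= 3.24) via Python's sqlite3 as a stand-in, with an invented table, since the behavior under discussion is PostgreSQL's:

```python
# DO NOTHING turns a would-be unique violation into a silent no-op,
# which is what the INSERT arm of the mapped MERGE would rely on to
# survive a conflicting concurrent insert instead of aborting.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wines (winename TEXT PRIMARY KEY, stock INT)")
conn.execute("INSERT INTO wines VALUES ('merlot', 10)")

# Duplicate key: skipped rather than raising an IntegrityError.
conn.execute("INSERT INTO wines VALUES ('merlot', 99) ON CONFLICT DO NOTHING")
# New key: inserted normally.
conn.execute("INSERT INTO wines VALUES ('pinot', 3) ON CONFLICT DO NOTHING")
rows = dict(conn.execute("SELECT winename, stock FROM wines"))
```

Note the trade-off: the conflicting source row is simply dropped on the floor, which may or may not match the MERGE semantics a user expects.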
Simon Riggs <simon@2ndquadrant.com> wrote: >So if I understand you correctly, in your view MERGE should just fail >with an ERROR if it runs concurrently with other DML? That's certainly my opinion on the matter. It seems like that might be the consensus, too. Obviously there are things that you as a user can do about this on your own, like opt to use a higher isolation level, or manually LOCK TABLE. For some use cases, including bulk loading for OLAP, users might just know that there isn't going to be concurrent activity because it's not an OLTP system. If this still seems odd to you, then consider that exactly the same situation exists with UPDATE. A user could want their UPDATE to affect a row where no row version is actually visible to their MVCC snapshot, because they have an idea about reliably updating the latest row. UPDATE doesn't work like that, of course. Is this unacceptable because the user expects that it should work that way? Bear in mind that ON CONFLICT DO UPDATE *can* actually update a row when there is no version of it visible to the snapshot. It can also update a row where there is a concurrent DELETE + INSERT, and the tuples with the relevant unique index values end up not even being part of the same update chain in each case (MVCC-snapshot-visible vs. latest). IOW, you may end up updating a completely different logical row to the row with the conflicting value that is visible to your MVCC snapshot! >i.e. if a race condition between the query and an INSERT runs >concurrently with another INSERT > >We have no interest in making that work? Without meaning to sound glib: we already did make it work for a special, restricted case that is important enough to justify introducing a couple of kludges -- ON CONFLICT DO UPDATE/upsert. I do agree that what I propose for MERGE will probably cause confusion; just look into Oracle's MERGE implementation for examples of this. 
We ought to go out of our way to make it clear that MERGE doesn't provide these guarantees. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
If nothing else, anyone needing MERGE can port their MERGE statements to
a DML with DML-containing CTEs... The generic mapping would be something
like this, I think:

WITH rows AS (
    SELECT <target> FROM <target> WHERE <condition>
), updated AS (
    UPDATE <target> SET ...
    WHERE <key> IN (SELECT <key> FROM rows) /* matched */
    RETURNING <target>
), inserted AS (
    INSERT INTO <target>
    SELECT ...
    WHERE <key> NOT IN (SELECT <key> FROM rows) /* not matched */
    RETURNING <target>
)
DELETE FROM <target>
WHERE (...)
  AND <key> NOT IN (SELECT <key> FROM updated
                    UNION
                    SELECT <key> FROM inserted);

Nico
--
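Since the single-query mapping relies on writable CTEs, which many engines lack, a rough equivalent is to run the matched, not-matched, and delete phases as three ordinary statements inside one transaction. The sketch below does that with Python's sqlite3 (SQLite has no writable CTEs); the schema and data are invented, and the sequential form is only an approximation of the one-snapshot single statement:

```python
# Emulating a MERGE as three statements in one transaction:
#   1. WHEN MATCHED ... UPDATE   2. WHEN NOT MATCHED ... INSERT   3. ELSE DELETE
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wines   (winename TEXT PRIMARY KEY, stock INT);
    CREATE TABLE changes (winename TEXT PRIMARY KEY, delta INT);
    INSERT INTO wines   VALUES ('merlot', 10), ('shiraz', 2);
    INSERT INTO changes VALUES ('merlot', 5), ('shiraz', -2), ('pinot', 3);
""")

with conn:  # one transaction for all three phases
    # WHEN MATCHED: apply the delta to existing rows.
    conn.execute("""
        UPDATE wines
        SET stock = stock + (SELECT delta FROM changes c
                             WHERE c.winename = wines.winename)
        WHERE winename IN (SELECT winename FROM changes)
    """)
    # WHEN NOT MATCHED: insert new rows with a positive delta.
    conn.execute("""
        INSERT INTO wines
        SELECT winename, delta FROM changes
        WHERE winename NOT IN (SELECT winename FROM wines) AND delta > 0
    """)
    # ELSE DELETE: drop rows whose stock fell to zero or below
    # (coarser than a real MERGE, which would only touch joined rows).
    conn.execute("DELETE FROM wines WHERE stock <= 0")

rows = dict(conn.execute("SELECT winename, stock FROM wines"))
```

Unlike the CTE form, the three statements see each other's effects, so phase ordering matters here.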
Nico Williams <nico@cryptonector.com> wrote: >If you want to ignore conflicts arising from concurrency you could >always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I >proposed earlier. Thus a MERGE CONCURRENTLY could just do that. > >Is there any reason not to map MERGE as I proposed? Performance, for one. MERGE generally has a join that can be optimized like an UPDATE FROM join. I haven't studied this question in any detail, but FWIW I think that using CTEs for merging is morally equivalent to a traditional MERGE implementation. It may actually be possible to map from CTEs to a MERGE statement, but I don't think that that's a good approach to implementing MERGE. Most of the implementation time will probably be spent doing things like making sure MERGE behaves appropriately with triggers, RLS, updatable views, and so on. That will take quite a while, but isn't particularly technically challenging IMV. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 02, 2017 at 12:51:45PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico@cryptonector.com> wrote:
> >If you want to ignore conflicts arising from concurrency you could
> >always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I
> >proposed earlier. Thus a MERGE CONCURRENTLY could just do that.
> >
> >Is there any reason not to map MERGE as I proposed?
>
> Performance, for one. MERGE generally has a join that can be optimized
> like an UPDATE FROM join.

Ah, right, I think my mapping was pessimal. How about this mapping
instead then:

WITH updated AS (
    UPDATE <target> SET ...
    WHERE <condition>
    RETURNING <target>
), inserted AS (
    INSERT INTO <target>
    SELECT ...
    WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
    /*
     * Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
     * of concurrency.
     */
    RETURNING <target>
)
DELETE FROM <target>
WHERE <key> NOT IN (SELECT <key> FROM updated)
  AND <key> NOT IN (SELECT <key> FROM inserted)
  AND ...;

?

If a MERGE has no delete clause, then the mapping would be:

WITH updated AS (
    UPDATE <target> SET ...
    WHERE <condition>
    RETURNING <target>
)
INSERT INTO <target>
SELECT ...
WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
/*
 * Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
 * of concurrency.
 */
;

> I haven't studied this question in any detail, but FWIW I think that
> using CTEs for merging is morally equivalent to a traditional MERGE
> implementation. [...]

I agree. So why not do that initially? Optimize later.

Such a MERGE mapping could be implemented entirely within
src/backend/parser/gram.y ... Talk about cheap to implement, review, and
maintain!

Also, this would be notionally very simple. Any optimizations to CTE
query/DML execution would be generic and applicable to MERGE and other
things besides. If mapping MERGE to CTE-using DMLs motivates such
optimizations, all the better.
> It may actually be possible to map from CTEs to a MERGE
> statement, but I don't think that that's a good approach to implementing
> MERGE.

Surely not every DML with CTEs can map to MERGE. Maybe I misunderstood
your comment?

> Most of the implementation time will probably be spent doing things like
> making sure MERGE behaves appropriately with triggers, RLS, updatable
> views, and so on. That will take quite a while, but isn't particularly
> technically challenging IMV.

Note that mapping to a DML with CTEs as above gets triggers, RLS, and
updatable views right from the get-go, because DMLs with CTEs, and DMLs
as CTEs, surely do as well.

Nico
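For concreteness, Nico's proposed mapping might be instantiated as
follows for a hypothetical accounts(id, balance) target synchronized
from an account_changes source. All table and column names here are
invented for illustration; this is a sketch of the idea, not code from
the thread:

```sql
-- Illustrative instantiation of the MERGE-to-CTE mapping for
-- hypothetical tables. Rows present in both are updated, rows only in
-- the source are inserted, and rows absent from the source are deleted
-- (i.e. "synchronize target with source" semantics).
WITH updated AS (
    UPDATE accounts a
       SET balance = c.balance
      FROM account_changes c
     WHERE a.id = c.id
 RETURNING a.id
),
inserted AS (
    INSERT INTO accounts (id, balance)
    SELECT c.id, c.balance
      FROM account_changes c
     WHERE c.id NOT IN (SELECT id FROM updated)
        ON CONFLICT DO NOTHING  -- absorbs races, per the proposal
    RETURNING id
)
DELETE FROM accounts
 WHERE id NOT IN (SELECT id FROM updated)
   AND id NOT IN (SELECT id FROM inserted);
```

Note that all three data-modifying CTEs run against the same snapshot,
which is part of what the later discussion of the DELETE arm turns on.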
On 2 November 2017 at 19:16, Peter Geoghegan <pg@bowt.ie> wrote:
> Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> So if I understand you correctly, in your view MERGE should just fail
>> with an ERROR if it runs concurrently with other DML?
>
> That's certainly my opinion on the matter. It seems like that might be
> the consensus, too.

Given that I only just found out what you've been talking about, I
don't believe that anybody else did either.

I think people imagined you had worked out how to make MERGE run
concurrently, I certainly did, but in fact you're just saying you don't
believe it ever should. That is strange since the SQL Standard
specifically allows the implementation to decide upon concurrent
behaviour.

> Without meaning to sound glib: we already did make it work for a
> special, restricted case that is important enough to justify introducing
> a couple of kludges -- ON CONFLICT DO UPDATE/upsert.
>
> I do agree that what I propose for MERGE will probably cause confusion;
> just look into Oracle's MERGE implementation for examples of this. We
> ought to go out of our way to make it clear that MERGE doesn't provide
> these guarantees.

So in your view we should make no attempt to avoid concurrency errors,
even when we have the capability to do so (in some cases) and doing so
would be perfectly compliant with the SQL Standard.

Yes, that certainly will make an easier patch for MERGE. Or are you
arguing against allowing any patch for MERGE?

Now we have more clarity, who else agrees with this?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> wrote:
>I think people imagined you had worked out how to make MERGE run
>concurrently, I certainly did, but in fact you're just saying you
>don't believe it ever should.

I'm certain that they didn't think that at all. But I'll let them speak
for themselves.

>That is strange since the SQL Standard specifically allows the
>implementation to decide upon concurrent behaviour.

And yet nobody else decided to do what you propose with this apparent
leeway. (See the UPSERT wiki page for many references that confirm
this.)

>So in your view we should make no attempt to avoid concurrency errors,
>even when we have the capability to do so (in some cases) and doing so
>would be perfectly compliant with the SQL Standard.

Yes. That's what I believe. I believe this because I can't see a way to
do this that isn't a mess, and because ON CONFLICT DO UPDATE exists and
works well for the cases where we can do better in READ COMMITTED mode.

Did you know that Oracle doesn't have an EPQ style mechanism at all?
Instead, it rolls back the entire statement and retries it from
scratch. That is not really any further from or closer to the standard
than the EPQ stuff, because the standard doesn't say anything about
what should happen as far as READ COMMITTED conflict handling goes. My
point here is that all of the stuff I'm talking about is only relevant
in READ COMMITTED mode, in areas where the standard never provides us
with guidance.

If you can rely on SSI, then there is no difference between what you
propose and what I propose anyway, except that what I propose is more
general and will have better performance, especially for batch MERGEs.
If READ COMMITTED didn't exist, implementing ON CONFLICT would have
been more or less free of controversy.

>Yes, that certainly will make an easier patch for MERGE.

Indeed, it will.
--
Peter Geoghegan
On Thu, Nov 02, 2017 at 02:05:19PM -0700, Peter Geoghegan wrote:
> Simon Riggs <simon@2ndquadrant.com> wrote:
> >So in your view we should make no attempt to avoid concurrent errors,
> >even when we have the capability to do so (in some cases) and doing so
> >would be perfectly compliant with the SQL Standard.
>
> Yes. That's what I believe. I believe this because I can't see a way to
> do this that isn't a mess, and because ON CONFLICT DO UPDATE exists and
> works well for the cases where we can do better in READ COMMITTED mode.

A MERGE mapped to a DML like this:

  WITH
      updated AS (
          UPDATE <target>
          SET ...
          WHERE <condition>
          RETURNING <target>
      )
    , inserted AS (
          INSERT INTO <target>
          SELECT ...
          WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
          ON CONFLICT DO NOTHING -- see below!
          RETURNING <target>
      )
  DELETE FROM <target>
  WHERE <key> NOT IN (SELECT <key> FROM updated) AND
        <key> NOT IN (SELECT <key> FROM inserted) AND ...;

can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.

Now, one could write a MERGE that produces conflicts even without
concurrency, so adding ON CONFLICT DO NOTHING by default as above...
seems not-quite-correct. But presumably one wouldn't write MERGE
statements that produce conflicts in the absence of concurrency, so
this seems close enough to me.

Another thing is that MERGE itself could get an ON CONFLICT clause for
the INSERT portion of the MERGE, allowing one to ignore some conflicts
and not others, though there would be no need for DO UPDATE, only DO
NOTHING for conflict resolution :) This seems better.

I do believe this mapping is correct, and could be implemented entirely
in src/backend/parser/gram.y! Am I wrong about this?

Nico
Nico Williams <nico@cryptonector.com> wrote:
>A MERGE mapped to a DML like this:
>
>  WITH
>      updated AS (
>          UPDATE <target>
>          SET ...
>          WHERE <condition>
>          RETURNING <target>
>      )
>    , inserted AS (
>          INSERT INTO <target>
>          SELECT ...
>          WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
>          ON CONFLICT DO NOTHING -- see below!
>          RETURNING <target>
>      )
>  DELETE FROM <target>
>  WHERE <key> NOT IN (SELECT <key> FROM updated) AND
>        <key> NOT IN (SELECT <key> FROM inserted) AND ...;

This is a bad idea. An implementation like this is not at all
maintainable.

>can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.

That's not handling concurrency -- it's silently ignoring an error. Who
is to say that the conflict that IGNORE ignored is associated with a
row visible to the MVCC snapshot of the statement? IOW, why should the
DELETE affect any row?

There are probably a great many reasons why you need a ModifyTable
executor node that keeps around state, and explicitly indicates that a
MERGE is a MERGE. For example, we'll probably want statement level
triggers to execute in a fixed order, regardless of the MERGE, RLS will
probably require explicit knowledge of MERGE semantics, and so on.

FWIW, your example doesn't actually have a source (just a target), so
it isn't actually like MERGE.

--
Peter Geoghegan
On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico@cryptonector.com> wrote:
> >A MERGE mapped to a DML like this:
>
> This is a bad idea. An implementation like this is not at all
> maintainable.

Assuming the DELETE issue can be addressed, why would this not be
maintainable?

> >can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.
>
> That's not handling concurrency -- it's silently ignoring an error. Who
> is to say that the conflict that IGNORE ignored is associated with a row
> visible to the MVCC snapshot of the statement? IOW, why should the DELETE
> affect any row?

Ah, yes, we'd have to make sure the DELETE does not delete rows that
could not be inserted. There's... no way to find out what those would
have been -- RETURNING won't mention them, though it'd be a nice
addition to UPSERT to have a way to do that, and it'd make this mapping
feasible.

> There are probably a great many reasons why you need a ModifyTable
> executor node that keeps around state, and explicitly indicates that a
> MERGE is a MERGE. For example, we'll probably want statement level
> triggers to execute in a fixed order, regardless of the MERGE, RLS will
> probably require explicit knowledge of MERGE semantics, and so on.

Wouldn't those fire anyways in a statement like the one I mentioned?

> FWIW, your example doesn't actually have a source (just a target), so it
> isn't actually like MERGE.

That can be added. I was trying to keep it pithy.

Nico
On 2 November 2017 at 17:06, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 31, 2017 at 5:14 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I've proposed a SQL Standard compliant implementation that would do
>> much more than be new syntax over what we already have.
>>
>> So these two claims aren't accurate: "radical difference" and "syntax
>> sugar over a capability we have".
>
> I think those claims are pretty accurate.

No, because there is misunderstanding on some technical points. The key
point is at the end - what do we do next?

> The design of INSERT .. ON CONFLICT UPDATE and the supporting
> machinery only work if there is a unique index. It provides radically
> different behavior than what you get with any other DML statement,
> with completely novel concurrency behavior, and there is no reasonable
> way to emulate that behavior in cases where no relevant unique index
> exists.

Agreed

> Therefore, if MERGE eventually uses INSERT .. ON CONFLICT
> UPDATE when a relevant unique index exists and does something else,
> such as your proposal of taking a strong lock, or Peter's proposal of
> doing this in a concurrency-oblivious manner, in other cases, then
> those two cases will behave very differently.

The *only* behavioural difference I have proposed would be the *lack*
of an ERROR in (some) concurrent cases.

We have a way to avoid some concurrency errors during MERGE, while
still maintaining exact SQL Standard behaviour. Peter has pointed out
that we could not catch errors in certain cases; that does nothing to
the cases where it will work well. It would be fallacious to imagine it
is an all or nothing situation.

> And if, in the meantime, MERGE can only handle the cases where there
> is a unique index,

I haven't proposed that MERGE would be forever limited to cases with a
unique index, only that MERGE-for-unique-indexes would be the version
in PG11, for good reason.

> then it can only handle the cases INSERT ..
> ON CONFLICT UPDATE can cover, which makes it, as far as I can see,
> syntactic sugar over what we already have.

SQL:2011 is greatly enhanced and this is no longer true; it was in
earlier versions but is not now. MERGE allows you to do DELETE, as well
as conditional UPDATE and INSERT. It is very clearly much more than
UPSERT.

> Maybe it's not entirely -
> you might be planning to make some minor functional enhancements - but
> it's not clear what those are, and I feel like whatever it is could be
> done with less work and more elegance by just extending the INSERT ..
> ON CONFLICT UPDATE syntax.

The differences are clearly apparent in the Standard and in my
submitted docs. But I am interested in submitting a SQL Standard
compliant feature.

> And it does seem to be your intention to only handle the cases which
> the INSERT .. ON CONFLICT UPDATE infrastructure can cover, because
> upthread you wrote this: "I didn't say it but my intention was to just
> throw an ERROR if no single unique index can be identified." I don't
> think anybody's putting words into your mouth here. We're just
> reading what you wrote.

Happy to have written it because that covers the common use case I was
hoping to deliver in PG11. Incremental development.

My proposal was to implement the common case, while avoiding
concurrency errors. Peter proposes that I cover both the common case
and more general cases, without avoiding concurrency errors.

All I have at the moment is that a few people disagree, but that
doesn't help determine the next action.

We seem to have a few options for PG11

1. Do nothing, we reject MERGE

2. Implement MERGE for unique index situations only, attempting to
avoid errors (Simon OP)

3. Implement MERGE, but without attempting to avoid concurrent ERRORs
(Peter)

4. Implement MERGE, while attempting to avoid concurrent ERRORs in
cases where that is possible.

Stephen, Robert, please say which option you now believe we should
pick.
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
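Simon's point that SQL:2011 MERGE is more than UPSERT -- it supports
DELETE and conditional WHEN clauses -- can be illustrated with a sketch
following the standard's syntax. This is not a feature PostgreSQL
shipped at the time of this thread, and the table and column names are
hypothetical:

```sql
-- Conditional UPDATE, DELETE, and INSERT in one statement, which
-- INSERT ... ON CONFLICT cannot express. Syntax per SQL:2011.
MERGE INTO accounts a
USING account_changes c
   ON a.id = c.id
 WHEN MATCHED AND c.balance = 0 THEN
      DELETE
 WHEN MATCHED THEN
      UPDATE SET balance = c.balance
 WHEN NOT MATCHED THEN
      INSERT (id, balance) VALUES (c.id, c.balance);
```

The DELETE arm and the extra AND condition on WHEN MATCHED are exactly
the parts with no ON CONFLICT equivalent.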
On 2 November 2017 at 22:59, Nico Williams <nico@cryptonector.com> wrote:
> On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
>> Nico Williams <nico@cryptonector.com> wrote:
>> >A MERGE mapped to a DML like this:
>>
>> This is a bad idea. An implementation like this is not at all
>> maintainable.
>
> Assuming the DELETE issue can be addressed, why would this not be
> maintainable?

It would only take one change to make this approach infeasible and when
that happened we would need to revert to the full-executor version.

One difference that comes to mind is that MERGE doesn't behave the same
way as an UPDATE-join, according to SQL:2011, in that it must throw an
error if duplicate changes are requested. That would be hard to emulate
using a parser-only version.

I would call it impressively clever but likely fragile, in this case,
though I encourage more ideas like that in the future.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PMFJI

> We seem to have a few options for PG11
>
> 1. Do nothing, we reject MERGE
>
> 2. Implement MERGE for unique index situations only, attempting to
> avoid errors (Simon OP)
>
> 3. Implement MERGE, but without attempting to avoid concurrent ERRORs
> (Peter)
>
> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

From an end-user's point of view I would prefer 3 (or 4 if that won't
prevent this from going into 11).

INSERT ... ON CONFLICT is great, but there are situations where the
restrictions can get in the way and it would be nice to have an
alternative - albeit with some (documented) drawbacks. As far as I know
Oracle also doesn't guarantee that MERGE is safe for concurrent use -
you can still wind up with a unique key violation.
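The restriction Thomas alludes to is that ON CONFLICT DO UPDATE must be
able to infer a single arbiter from a unique index or constraint. A
short sketch with hypothetical table names:

```sql
-- Works: a unique index on (id) lets the conflict arbiter be inferred.
INSERT INTO accounts (id, balance)
VALUES (1, 100)
    ON CONFLICT (id) DO UPDATE
   SET balance = EXCLUDED.balance;

-- There is no ON CONFLICT equivalent when the match condition is not
-- backed by a unique index -- the case where MERGE would still be
-- usable, at the risk of a unique-key violation under concurrency,
-- as Thomas notes is also true of Oracle's MERGE.
```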
On Fri, Nov 3, 2017 at 1:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Therefore, if MERGE eventually uses INSERT .. ON CONFLICT
>> UPDATE when a relevant unique index exists and does something else,
>> such as your proposal of taking a strong lock, or Peter's proposal of
>> doing this in a concurrency-oblivious manner, in other cases, then
>> those two cases will behave very differently.
>
> The *only* behavioural difference I have proposed would be the *lack*
> of an ERROR in (some) concurrent cases.

I think that's a big difference. Error vs. non-error is a big deal by
itself; also, the non-error case involves departing from MVCC semantics
just as INSERT .. ON CONFLICT UPDATE does.

> All I have at the moment is that a few people disagree, but that
> doesn't help determine the next action.
>
> We seem to have a few options for PG11
>
> 1. Do nothing, we reject MERGE
>
> 2. Implement MERGE for unique index situations only, attempting to
> avoid errors (Simon OP)
>
> 3. Implement MERGE, but without attempting to avoid concurrent ERRORs (Peter)
>
> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.
>
> Stephen, Robert, please say which option you now believe we should pick.

I think Peter has made a good case for #3, so I lean toward that
option. I think #4 is too much of a non-obvious behavior difference
between the cases where we can avoid those errors and the cases where
we can't, and I don't see where #2 can go in the future other than #4.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3 November 2017 at 07:46, Thomas Kellerer <spam_eater@gmx.net> wrote:
> PMFJI
>
>> We seem to have a few options for PG11
>>
>> 1. Do nothing, we reject MERGE
>>
>> 2. Implement MERGE for unique index situations only, attempting to
>> avoid errors (Simon OP)
>>
>> 3. Implement MERGE, but without attempting to avoid concurrent ERRORs
>> (Peter)
>>
>> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
>> cases where that is possible.
>
> From an end-user's point of view I would prefer 3 (or 4 if that won't
> prevent this from going into 11)

Sounds a reasonable approach.

> INSERT ... ON CONFLICT is great, but there are situations where the
> restrictions can get in the way and it would be nice to have an
> alternative - albeit with some (documented) drawbacks. As far as I know
> Oracle also doesn't guarantee that MERGE is safe for concurrent use -
> you can still wind up with a unique key violation.

Yes, Oracle allows some unique key violations. It's clear that we would
need to allow some also. So we clearly can't infer whether they avoid
some errors just because they allow some.

My approach will be to reduce the errors in the best way, not to try to
copy errors Oracle makes, if any. But that error avoidance can easily
be a later add-on if we prefer it that way.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Nov 3, 2017 at 1:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > We seem to have a few options for PG11
> >
> > 1. Do nothing, we reject MERGE
> >
> > 2. Implement MERGE for unique index situations only, attempting to
> > avoid errors (Simon OP)
> >
> > 3. Implement MERGE, but without attempting to avoid concurrent ERRORs (Peter)
> >
> > 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> > cases where that is possible.
> >
> > Stephen, Robert, please say which option you now believe we should pick.
>
> I think Peter has made a good case for #3, so I lean toward that
> option. I think #4 is too much of a non-obvious behavior difference
> between the cases where we can avoid those errors and the cases where
> we can't, and I don't see where #2 can go in the future other than #4.

Agreed.

Thanks!

Stephen
On 3 November 2017 at 08:26, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 3, 2017 at 1:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Therefore, if MERGE eventually uses INSERT .. ON CONFLICT
>>> UPDATE when a relevant unique index exists and does something else,
>>> such as your proposal of taking a strong lock, or Peter's proposal of
>>> doing this in a concurrency-oblivious manner, in other cases, then
>>> those two cases will behave very differently.
>>
>> The *only* behavioural difference I have proposed would be the *lack*
>> of an ERROR in (some) concurrent cases.
>
> I think that's a big difference. Error vs. non-error is a big deal by
> itself;

Are you saying avoiding an ERROR is a bad or good thing?

> also, the non-error case involves departing from MVCC
> semantics just as INSERT .. ON CONFLICT UPDATE does.

Meaning what exactly? What situation occurs that a user would be
concerned with?

Please describe exactly what you mean so we get it clear.

The concurrent behaviour for MERGE is allowed to be
implementation-specific, so we can define it any way we want.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> wrote:
>>> The *only* behavioural difference I have proposed would be the *lack*
>>> of an ERROR in (some) concurrent cases.
>>
>> I think that's a big difference. Error vs. non-error is a big deal by
>> itself;
>
>Are you saying avoiding an ERROR is a bad or good thing?

Are you really asking Robert to repeat what has already been said about
a dozen different ways?

That's *not* the only difference. You need to see a couple of steps
ahead to see further differences, as the real dilemma comes when you
have to reconcile having provided the UPSERT-guarantees with cases that
that doesn't map on to (which can happen in a number of different
ways).

I don't understand why you'll talk about just about anything but that.
This is a high-level concern about the overarching design. Do you
really not understand the concern at this point?

>> also, the non-error case involves departing from MVCC
>> semantics just as INSERT .. ON CONFLICT UPDATE does.
>
>Meaning what exactly? What situation occurs that a user would be
>concerned with?
>
>Please describe exactly what you mean so we get it clear.
>
>The concurrent behaviour for MERGE is allowed to be
>implementation-specific, so we can define it any way we want.

Agreed -- we can. It isn't controversial at all to say that the SQL
standard has nothing to say on this question.

The problem is that the semantics you argue for are ill-defined, and
seem to create more problems than they solve. Why keep bringing up the
SQL standard?

--
Peter Geoghegan
On 3 November 2017 at 16:35, Peter Geoghegan <pg@bowt.ie> wrote:
> Simon Riggs <simon@2ndquadrant.com> wrote:
>>>>
>>>> The *only* behavioural difference I have proposed would be the *lack*
>>>> of an ERROR in (some) concurrent cases.
>>>
>>> I think that's a big difference. Error vs. non-error is a big deal by
>>> itself;
>>
>> Are you saying avoiding an ERROR is a bad or good thing?
>
> Are you really asking Robert to repeat what has already been said about
> a dozen different ways?

I'm asking for clarity of explanation rather than assertions.

> That's *not* the only difference. You need to see a couple of steps
> ahead to see further differences, as the real dilemma comes when you
> have to reconcile having provided the UPSERT-guarantees with cases that
> that doesn't map on to (which can happen in a number of different ways).
>
> I don't understand why you'll talk about just about anything but that.
> This is a high-level concern about the overarching design. Do you really
> not understand the concern at this point?

You're either referring to what is in the docs, which is INSERT ... ON
CONFLICT violates MVCC in a particular way, or something as yet
unstated. If it is the former, then I still don't see the problem (see
later). If it is the latter, I need more. So either way I need more.

Robert Haas said
> In the past, there have been objections to implementations of MERGE
> which would give rise to such serialization anomalies, but I'm not
> sure we should feel bound by those discussions. One thing that's
> different is that the common and actually-useful case can now be made
> to work in a fairly satisfying way using INSERT .. ON CONFLICT UPDATE;
> if less useful cases are vulnerable to some weirdness, maybe it's OK
> to just document the problems.

I agreed with that, and still do.

We need a clear, explicit description of this situation so I will
attempt that in detail here.

The basic concurrency problem we have is this:

APPROACH1
1. Join to produce results based upon snapshot at start of query
2. Apply results for INSERT, UPDATE or DELETE

Given there is a time delay between 1 and 2 there is a race condition,
so that if another user concurrently inserts the same value into a
unique index then an INSERT will fail with a uniqueness violation.

Such failures are of great concern in practice because the time between
1 and 2 could be very long for large statements, or for smaller
statements we might have sufficiently high concurrency to allow us to
see regular failures.

APPROACH2 (modified from my original proposal slightly)
1. Join...
2. Apply results for UPDATE, if present not visible via the snapshot
taken at 1, do EPQ to ensure we locate current live tuple
3. If still not visible, do speculative insertion if we have a unique
index available, otherwise ERROR. If spec insertion fails, go to 2

The loop created above can live-lock, meaning that an infinite loop
could be created. In practice, such live-locks are rare and we could
detect them by falling out of the loop after a few tries. Approach2's
purpose is to alleviate errors in Approach1, so falling out of the loop
merely takes us back to the error we would have got if we didn't try,
so Approach2 has considerable benefit over Approach1.

This only applies if we do an INSERT, so if there is a WHEN NOT MATCHED
... AND clause with probability W that makes the INSERT rare, then the
probability of error in Approach2 approaches the probability of error
in Approach1 as W drops to zero, but with W high we may avoid many
errors. Approach2 never generates more errors than Approach1.

I read that step 3 in Approach2 is some kind of problem in MVCC
semantics. My understanding is that the SQL Standard allows us to
define what the semantics of the statement are in relation to
concurrency, so any semantic issue can be handled by defining it to
work the way we want.
The semantics are:
a) when a unique index is available we avoid errors by using semantics
of INSERT .. ON CONFLICT UPDATE
b) when a unique index is not available we use other semantics

To me this is the same as INSERTs failing in the presence of unique
indexes, but not failing when no index is present. The presence of a
unique constraint alters the semantics of the query.

We can choose Approach2 - as Robert says "[we should not] feel bound by
those [earlier] discussions"

Please explain what is wrong with the above without merely asserting
there is a problem.

As you point out, whichever we choose, we will be bound by those
semantics. So if we take Approach1, as has been indicated currently,
what is the written explanation for that, so we can show that to the
people who ask in the future about our decisions?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
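The update-then-speculatively-insert loop that Approach2 describes has
a well-known user-space analogue: the retrying upsert function from the
PostgreSQL documentation's PL/pgSQL examples. A sketch adapted to cap
the retries, as Approach2 suggests ("falling out of the loop after a
few tries") -- function and table names here are hypothetical:

```sql
-- Illustrative only: the Approach2 control flow done in PL/pgSQL,
-- with a bounded retry count instead of an unbounded loop.
CREATE FUNCTION merge_account(k INT, b NUMERIC) RETURNS VOID AS
$$
BEGIN
    FOR attempt IN 1..3 LOOP
        -- step 2: try to update the existing row
        UPDATE accounts SET balance = b WHERE id = k;
        IF found THEN
            RETURN;
        END IF;
        -- step 3: row not there, try to insert; if a concurrent
        -- inserter wins the race, loop back to the UPDATE ("go to 2")
        BEGIN
            INSERT INTO accounts (id, balance) VALUES (k, b);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- retry
        END;
    END LOOP;
    -- falling out of the loop surfaces the error Approach1 would
    -- have raised immediately
    RAISE EXCEPTION 'could not merge row with key %', k;
END;
$$ LANGUAGE plpgsql;
```

The live-lock question in the thread is whether the executor-level
version of this loop can ever make progress on the retry, given that
the statement snapshot does not change between iterations.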
Simon Riggs <simon@2ndquadrant.com> wrote:
>APPROACH1
>1. Join to produce results based upon snapshot at start of query
>2. Apply results for INSERT, UPDATE or DELETE

>Such failures are of great concern in practice because the time
>between 1 and 2 could be very long for large statements, or for
>smaller statements we might have sufficiently high concurrency to
>allow us to see regular failures.

I'm not sure that they're a *great* concern in a world with something
that targets UPSERT use cases, which is a situation that does not exist
in DBMSs with MERGE (with the notable exception of Teradata). But it's
clearly a concern that users may expect to avoid duplicate violations
in READ COMMITTED, since this caused confusion among users of other
database systems with MERGE.

>APPROACH2 (modified from my original proposal slightly)

This write-up actually begins to confront the issues that I've raised.
I'm glad to see this.

>1. Join...
>2. Apply results for UPDATE, if present not visible via the snapshot
>taken at 1, do EPQ to ensure we locate current live tuple
>3. If still not visible, do speculative insertion if we have a unique
>index available, otherwise ERROR. If spec insertion fails, go to 2
>
>The loop created above can live-lock, meaning that an infinite loop
>could be created.

The loop is *guaranteed* to live-lock once you "goto 2". So you might
as well just throw an error at that point, which is the behavior that
I've been arguing for all along!

If this isn't guaranteed to live-lock at "goto 2", then it's not clear
why. The outcome of step 2 is clearly going to be identical if you
don't acquire a new MVCC snapshot, but you don't address that.

You might have meant "apply an equivalent ON CONFLICT DO UPDATE", or
something like that, despite the fact that the use of ON CONFLICT DO
NOTHING was clearly implied by the "goto 2". I also see problems with
that, but I'll wait for you to clarify what you meant before going into
what they are.
>In practice, such live-locks are rare and we could detect them by
>falling out of the loop after a few tries. Approach2's purpose is to
>alleviate errors in Approach1, so falling out of the loop merely takes
>us back to the error we would have got if we didn't try, so Approach2
>has considerable benefit over Approach1.

I don't hate the idea of retrying a fixed number of times for things
like this, but I don't like it either. I'm going to assume that it's
fine for now.

>I read that step 3 in Approach2 is some kind of problem in MVCC
>semantics. My understanding is that SQL Standard allows us to define
>what the semantics of the statement are in relation to concurrency, so
>any semantic issue can be handled by defining it to work the way we
>want.

My only concern is that our choices here should be good ones, based on
practical considerations. We both more or less agree on how this should
be assessed, I think; we just reach different conclusions.

>As you point out, whichever we choose, we will be bound by those
>semantics. So if we take Approach1, as has been indicated currently,
>what is the written explanation for that, so we can show that to the
>people who ask in the future about our decisions?

Well, Approach1 is what other systems implement. I think that it would
be important to point out that MERGE with Approach1 isn't special, but
ON CONFLICT DO UPDATE is special. We'd also say that higher isolation
levels will not have duplicate violations.

--
Peter Geoghegan
On 6 November 2017 at 18:35, Peter Geoghegan <pg@bowt.ie> wrote:
>> APPROACH2 (modified from my original proposal slightly)
>
> This write-up actually begins to confront the issues that I've raised.
> I'm glad to see this.
>
>> 1. Join...
>> 2. Apply results for UPDATE, if present not visible via the snapshot
>> taken at 1, do EPQ to ensure we locate current live tuple
>> 3. If still not visible, do speculative insertion if we have a unique
>> index available, otherwise ERROR. If spec insertion fails, go to 2
>>
>> The loop created above can live-lock, meaning that an infinite loop
>> could be created.
>
> The loop is *guaranteed* to live-lock once you "goto 2". So you might as
> well just throw an error at that point, which is the behavior that I've
> been arguing for all along!
>
> If this isn't guaranteed to live-lock at "goto 2", then it's not clear
> why. The outcome of step 2 is clearly going to be identical if you don't
> acquire a new MVCC snapshot, but you don't address that.
>
> You might have meant "apply an equivalent ON CONFLICT DO UPDATE", or
> something like that, despite the fact that the use of ON CONFLICT DO
> NOTHING was clearly implied by the "goto 2". I also see problems with
> that, but I'll wait for you to clarify what you meant before going into
> what they are.

In step 3 we discover that an entry exists in the index for a committed
row.

Since we have a unique index we use it to locate the row we know exists
and UPDATE that.

We don't use a new MVCC snapshot, we do what EPQ does. EPQ is already
violating MVCC for UPDATEs, so why does it matter if we do it for
INSERTs also?

Where hides the problem?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> wrote:
>In step 3 we discover that an entry exists in the index for a committed row.
>
>Since we have a unique index we use it to locate the row we know
>exists and UPDATE that.
>
>We don't use a new MVCC snapshot, we do what EPQ does. EPQ is already
>violating MVCC for UPDATEs, so why does it matter if we do it for
>INSERTs also?

Before I go on to say why I think that this approach is problematic, I want to point out a few things that I think we actually agree on:

* EPQ is fairly arbitrary as a behavior for READ COMMITTED UPDATE conflict handling. It has more to do with how VACUUM works than with some platonic ideal that everyone agrees on.

* We can imagine other alternatives, such as the behavior in Oracle (statement-level rollback + optimistic retry).

* Those alternatives are probably better in some ways but worse in other ways.

* EPQ violates snapshot consistency, even though that's not inherently necessary to avoid "READ COMMITTED serialization errors".

* ON CONFLICT also violates snapshot consistency, in rather a different way. (Whether or not this is necessary is more debatable.)

I actually think that other MVCC systems don't actually copy Oracle here, either, and for similar pragmatic reasons. It's a mixed bag.

>Where hides the problem?

The problem is that violating MVCC is something that can be done in different ways, and by meaningful degrees:

* EPQ semantics are believed to be fine because we don't get complaints about them. I think that that's because they're specialized to UPDATEs and UPDATE-like operations, where we walk an UPDATE chain specifically, and only use a dirty snapshot for the chain's newer tuples.

* ON CONFLICT doesn't care about UPDATE chains. Unlike EPQ, it makes no distinction between a concurrent UPDATE and a concurrent DELETE + fresh INSERT. It's specialized to CONFLICTs.

This might seem abstract, but it has real, practical implications.
Certain contradictions exist when you start with MVCC semantics, then fall back to EPQ semantics, then finally fall back to ON CONFLICT semantics.

Questions about mixing these two things:

* What do we do if someone concurrently UPDATEs in a way that makes the qual not pass during EPQ traversal? Should we INSERT when that happens?

* If so, what about the case when the MERGE join qual/unique index values didn't change (just some other attributes that do not pass the additional WHEN MATCHED qual)?

* What about when there was a concurrent DELETE -- should we INSERT then?

ON CONFLICT goes from a CONFLICT, and then applies its own qual. That's hugely different to doing it the other way around: starting from your own MVCC snapshot qual, and going to a CONFLICT. This is because evaluating the DO UPDATE's WHERE clause is just one little extra step after the one and only latest row for that value has been locked. You could theoretically go this way with 2PL, I think, because that's a bit like locking every row that the predicate touches, but of course that isn't at all practical.

I should stop trying to make a watertight case against this, even though I still think that's possible. For now, instead, I'll just say that this is *extremely* complicated, and still has unresolved questions about semantics.

-- Peter Geoghegan
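To make the first of those questions concrete, here is a sketch of the kind of READ COMMITTED interleaving at issue. The table and column names are invented for illustration, and the outcome shown is exactly what is under debate, not settled behavior:

```sql
-- Session 1 runs, under READ COMMITTED:
MERGE INTO target t
USING source s ON t.key = s.key
WHEN MATCHED AND t.status = 'active' THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (key, val) VALUES (s.key, s.val);

-- Before Session 1 locks the row for key = 1, Session 2 commits:
UPDATE target SET status = 'inactive' WHERE key = 1;

-- EPQ walks to the updated tuple, where t.status = 'active' no longer
-- holds.  Is key = 1 now "not matched", so that the WHEN NOT MATCHED
-- action fires and inserts a duplicate key?  Or is the row silently
-- skipped?  Neither answer follows obviously from EPQ or ON CONFLICT.
```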
On 6 November 2017 at 17:35, Simon Riggs <simon@2ndquadrant.com> wrote:
> I read that step 3 in Approach2 is some kind of problem in MVCC
> semantics. My understanding is that SQL Standard allows us to define
> what the semantics of the statement are in relation to concurrency, so
> any semantic issue can be handled by defining it to work the way we
> want.

The semantics are:
> a) when a unique index is available we avoid errors by using semantics
> of INSERT .. ON CONFLICT UPDATE.
> b) when a unique index is not available we use other semantics.

I'm obviously being obtuse. If a unique index is not available, then surely there won't _be_ a failure? The INSERT (or indeed UPDATE) that results in two similar records will simply happen, and you will end up with two records the same. That's OK, based on the semantics of MERGE, no? At the transaction-start INSERT was the correct thing to do.

Geoff
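Geoff's scenario can be spelled out with a small example (hypothetical table, and deliberately no unique index on key):

```sql
-- Two sessions each run the same statement concurrently:
MERGE INTO target t
USING (VALUES (1, 'x')) AS s(key, val) ON t.key = s.key
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (key, val) VALUES (s.key, s.val);

-- If key = 1 is absent in both sessions' snapshots, both sessions take
-- the WHEN NOT MATCHED branch, and the table ends up with two key = 1
-- rows.  No error is possible, because nothing constrains the column:
-- each INSERT was correct under its session's snapshot.
```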
On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico@cryptonector.com> wrote:
> >A MERGE mapped to a DML like this:

I needed to spend more time reading MERGE docs from other RDBMSes. The best MERGE so far is MS SQL Server's, which looks like:

  MERGE INTO <target> <target_alias>
  USING <source> <source_alias>
  ON (<join condition>)
  -- optional:
  WHEN MATCHED THEN UPDATE SET ...
  -- optional:
  WHEN NOT MATCHED [ BY TARGET ] THEN INSERT ...
  -- optional:
  WHEN NOT MATCHED BY SOURCE THEN DELETE
  -- optional:
  OUTPUT ...
  ;

(The other MERGEs are harder to use because they lack a WHEN NOT MATCHED BY SOURCE THEN DELETE, instead having a DELETE clause on the UPDATE, which is then difficult to use.)

This is *trivial* to map to a CTE, and, in fact, I and my colleagues have resorted to hand-coded CTEs like this precisely because PG lacks MERGE (though we ourselves didn't know about MERGE -- it's new to us).

If <source> is a query, then we start with a CTE for that, else if it's a view or table, then we don't set up a CTE for it. Each of the UPDATE, INSERT, and/or DELETE can be its own CTE. If there's an OUTPUT clause, that can be a final SELECT that queries from the CTEs that ran the DMLs with RETURNING. If there's no OUTPUT then none of the DMLs need to have RETURNING, and one of them will be the main statement, rather than a CTE.

The pattern is:

  WITH
  -- IFF <source> is a query:
  <source_alias> AS (<source>),
  -- IFF there's a WHEN MATCHED THEN UPDATE
  updates AS (
      UPDATE <target> AS <target_alias>
      SET ...
      FROM <source>
      WHERE <join_condition>
      -- IFF there's an OUTPUT clause, then:
      RETURNING 'update' as "@action", ...
  ),
  inserts AS (
      INSERT INTO <target> (<column_list>)
      SELECT ... FROM <source>
      LEFT JOIN <target> ON <join_condition>
      WHERE <target> IS NOT DISTINCT FROM NULL
      -- IFF there's a CONCURRENTLY clause:
      ON CONFLICT DO NOTHING
      -- IFF there's an OUTPUT clause, then:
      RETURNING 'insert' as "@action", ...
  ),
  deletes AS (
      DELETE FROM <target>
      WHERE NOT EXISTS (SELECT * FROM <source> WHERE <join_condition>)
      -- IFF there's an OUTPUT clause, then:
      RETURNING 'delete' as "@action", ...
  )
  -- IFF there's an OUTPUT clause
  SELECT * FROM updates UNION SELECT * FROM inserts UNION SELECT * FROM deletes;

If there's not an OUTPUT clause then one of the DMLs has to be the main statement:

  WITH ...
  DELETE ...;  -- or UPDATE, or INSERT

Note that if the source is a view or table and there's no OUTPUT clause, then it's one DML with up to (but not referring to) two CTEs, and in all cases the CTEs do not refer to each other. This means that the executor can parallelize all of the DMLs. If the source is a query, then that could be made a temp view to avoid having to run the query first. The CTE executor needs to learn to sometimes do this anyways, so this is good.

The <deletes> CTE can be equivalently written without a NOT EXISTS:

  to_be_deleted AS (
      SELECT <target> FROM <target>
      LEFT JOIN <source> ON (<join_condition>)
      WHERE <source> IS NOT DISTINCT FROM NULL
  ),
  deletes AS (
      DELETE FROM <target>
      USING to_be_deleted tbd
      WHERE <target> = <tbd>
  )

if that were to run faster (probably not, since PG today would first run the to_be_deleted CTE, then the deletes CTE). I mention it only because it's nice to see the symmetry of LEFT JOINs for the two WHEN NOT MATCHED cases. (Here <source> is the alias for it if one was given.)

***

This mapping fires triggers as one would expect (at least FOR EACH ROW; I expect the DMLs in CTEs should also fire FOR EACH STATEMENT triggers, and if they don't I consider that a bug).

> This is a bad idea. An implementation like this is not at all
> maintainable.

I beg to differ. First of all, not having to add an executor for MERGE is a win: much, much less code to maintain.
The code to map MERGE to CTEs can easily be contained entirely in src/backend/parser/gram.y, which is a maintainability win: any changes to how CTEs are compiled will fail to compile if they break the MERGE mapping to CTEs.

> >can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.
>
> That's not handling concurrency -- it's silently ignoring an error. Who
> is to say that the conflict that IGNORE ignored is associated with a row
> visible to the MVCC snapshot of the statement? IOW, why should the DELETE
> affect any row?

That was me misunderstanding MERGE. The DELETE is independent of the INSERT -- if an INSERT does nothing because of an ON CONFLICT DO NOTHING clause, then that won't cause that row to be deleted -- the inserts and deletes CTEs are independent in the latest mapping (see above). I believe adding ON CONFLICT DO NOTHING to the INSERT in this mapping is all that's needed to support concurrency.

> There are probably a great many reasons why you need a ModifyTable
> executor node that keeps around state, and explicitly indicates that a
> MERGE is a MERGE. For example, we'll probably want statement level
> triggers to execute in a fixed order, regardless of the MERGE, RLS will
> probably require explicit knowledge of MERGE semantics, and so on.

Let's take those examples one at a time:

- Is there a reason to believe that MERGE could not parallelize the DMLs it implies? If they can be parallelized, then we should not define the order in which the corresponding triggers fire. Surely we want to leave that possibility (parallelization) open, rather than exclude it.

  The user should not depend on the order in which the FOR EACH STATEMENT and FOR EACH ROW triggers will fire. They can always check at the end of the transaction with DEFERRED triggers (see also my patch for ALWAYS DEFERRED constraints and triggers).
  AFTER <op> FOR EACH STATEMENT triggers will only run after all the corresponding DMLs in the mapping have completed, but their relative orders should still not be defined.

- I don't see how RLS isn't entirely orthogonal. RLS would (does) apply as normal to all of the DMLs in the mapping. If that was not the case, then there'd be a serious bug in PG right now! Using a CTE must *not* disable RLS.

  FOR UPDATE RLS policies are broken, however, since they don't get to see the OLD and NEW values. But that's orthogonal here.

> FWIW, your example doesn't actually have a source (just a target), so it
> isn't actually like MERGE.

That was my mistake -- as I say above, I had to spend more time with the various RDBMSes' MERGE docs.

Nico
--
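Filling in the pattern above with invented table names, a MERGE with all three actions and no OUTPUT clause would expand roughly as follows. This is a sketch of the mapping as described, not output from any implementation:

```sql
-- Hypothetical expansion of:
--   MERGE INTO accounts a USING feed f ON (a.id = f.id)
--   WHEN MATCHED THEN UPDATE SET balance = f.balance
--   WHEN NOT MATCHED THEN INSERT (id, balance) VALUES (f.id, f.balance)
--   WHEN NOT MATCHED BY SOURCE THEN DELETE;
WITH updates AS (
    UPDATE accounts a
       SET balance = f.balance
      FROM feed f
     WHERE a.id = f.id
),
inserts AS (
    INSERT INTO accounts (id, balance)
    SELECT f.id, f.balance
      FROM feed f
      LEFT JOIN accounts a ON a.id = f.id
     WHERE a.id IS NULL   -- the pattern's <target> IS NOT DISTINCT FROM NULL test
    ON CONFLICT DO NOTHING  -- only if concurrency handling is wanted
)
DELETE FROM accounts a
 WHERE NOT EXISTS (SELECT 1 FROM feed f WHERE f.id = a.id);
```

As the description says, the sub-statements all see the same snapshot and do not refer to each other, so their relative order does not matter.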
On Tue, Nov 7, 2017 at 3:29 PM, Nico Williams <nico@cryptonector.com> wrote:
> On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
>> Nico Williams <nico@cryptonector.com> wrote:
>> >A MERGE mapped to a DML like this:
>
> I needed to spend more time reading MERGE docs from other RDBMSes.

Please don't hijack this thread. It's about the basic question of semantics, and is already hard enough for others to follow as-is.

-- Peter Geoghegan
On Tue, Nov 07, 2017 at 03:31:22PM -0800, Peter Geoghegan wrote:
> On Tue, Nov 7, 2017 at 3:29 PM, Nico Williams <nico@cryptonector.com> wrote:
> > On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> >> Nico Williams <nico@cryptonector.com> wrote:
> >> >A MERGE mapped to a DML like this:
> >
> > I needed to spend more time reading MERGE docs from other RDBMSes.
>
> Please don't hijack this thread. It's about the basic question of
> semantics, and is already hard enough for others to follow as-is.

I'm absolutely not. If you'd like a pithy summary devoid of detail, it is this:

  I'm making the argument that using ON CONFLICT to implement MERGE cannot produce a complete implementation [you seem to agree], but there is at least one light-weight way to implement MERGE with _existing_ machinery in PG: CTEs.

It's perfectly fine to implement an executor for MERGE, but I think that's a bit silly, and I explain why.

Further, I explored your question regarding order of events, which you (and I) think is a very important semantics question. You thought order of execution / trigger firing should be defined, whereas I think it should not, because MERGE -- at least MSFT's -- explicitly leaves it undefined. MSFT's MERGE says:

| For every insert, update, or delete action specified in the MERGE
| statement, SQL Server fires any corresponding AFTER triggers defined
| on the target table, but does not guarantee on which action to fire
| triggers first or last. Triggers defined for the same action honor the
| order you specify.

Impliedly (though not stated explicitly), the actual updates, inserts, and deletes can happen in any order as well as the triggers firing in any order. As usual in the world of programming language design, leaving order of execution undefined as much as possible increases the available opportunities to parallelize. Presumably MSFT is leaving the door open to parallelizing MERGE, if they haven't already.
Impliedly, CTEs that have no dependencies on each other are also ripe for parallelization. This is important too, for one of my goals is to improve CTE performance. If implementing MERGE as a mapping to CTEs leads to improvements in CTEs, so much the better. But also this *is* a simple implementation of MERGE, and simplicity seems like a good thing.

Nico
--
Ah, there is one reason not to use a mapping to CTEs to implement MERGE: it might be faster to use a single query that is a FULL OUTER JOIN of the source and target to drive the update/insert/delete operations.
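The alternative mentioned above could classify every source/target pairing in one pass. A sketch, assuming hypothetical tables joined on an id column:

```sql
-- A single FULL OUTER JOIN labels each row with the action MERGE
-- would take, which could then drive all three operations:
SELECT CASE
           WHEN t.id IS NULL THEN 'insert'  -- WHEN NOT MATCHED [BY TARGET]
           WHEN s.id IS NULL THEN 'delete'  -- WHEN NOT MATCHED BY SOURCE
           ELSE 'update'                    -- WHEN MATCHED
       END AS action,
       s.*, t.*
  FROM target t
  FULL OUTER JOIN source s ON t.id = s.id;
```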
On 6 November 2017 at 16:50, Peter Geoghegan <pg@bowt.ie> wrote:
>> Where hides the problem?
>
> The problem is violating MVCC is something that can be done in different
> ways, and by meaningful degrees:
>
> * EPQ semantics are believed to be fine because we don't get complaints
> about it. I think that that's because it's specialized to UPDATEs and
> UPDATE-like operations, where we walk an UPDATE chain specifically,
> and only use a dirty snapshot for the chain's newer tuples.
>
> * ON CONFLICT doesn't care about UPDATE chains. Unlike EPQ, it makes no
> distinction between a concurrent UPDATE, and a concurrent DELETE + fresh
> INSERT. It's specialized to CONFLICTs.
>
> This might seem abstract, but it has real, practical implications.
> Certain contradictions exist when you start with MVCC semantics, then
> fall back to EPQ semantics, then finally fall back to ON CONFLICT
> semantics.
>
> Questions about mixing these two things:
>
> * What do we do if someone concurrently UPDATEs in a way that makes the
> qual not pass during EPQ traversal? Should we INSERT when that
> happens?
>
> * If so, what about the case when the MERGE join qual/unique index
> values didn't change (just some other attributes that do not pass the
> additional WHEN MATCHED qual)?
>
> * What about when there was a concurrent DELETE -- should we INSERT then?
>
> ON CONFLICT goes from a CONFLICT, and then applies its own qual. That's
> hugely different to doing it the other way around: starting from your
> own MVCC snapshot qual, and going to a CONFLICT. This is because
> evaluating the DO UPDATE's WHERE clause is just one little extra step
> after the one and only latest row for that value has been locked. You
> could theoretically go this way with 2PL, I think, because that's a bit
> like locking every row that the predicate touches, but of course that
> isn't at all practical.
>
> I should stop trying to make a watertight case against this, even though
> I still think that's possible.
> For now, instead, I'll just say that this
> is *extremely* complicated, and still has unresolved questions about
> semantics.

That's a good place to leave this for now - we're OK to make progress with the main feature, and we have some questions to be addressed once we have a cake to decorate.

Thanks for your input.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Nov 14, 2017 at 11:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > That's a good place to leave this for now - we're OK to make progress > with the main feature, and we have some questions to be addressed once > we have a cake to decorate. > > Thanks for your input. Thanks for listening. I regret that it became heated, but I'm glad that we now understand each other's perspective. I'm also glad that you're pushing ahead with MERGE as a project, because MERGE is certainly a compelling feature. -- Peter Geoghegan
On 27 October 2017 at 13:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Simon Riggs wrote:
>
>> Earlier thoughts on how this could/could not be done were sometimes
>> imprecise or inaccurate, so I have gone through the command per
>> SQL:2011 spec and produced a definitive spec in the form of an SGML
>> ref page. This is what I intend to deliver for PG11.
>
> Nice work. I didn't verify the SQL spec, just read your HTML page;
> some very minor comments based on that:
>
> * use "and" not "where" as initial words in "when_clause" and
> "merge_update" clause definitions
>
> * missing word here: "the DELETE privilege on the if you specify"
>
> * I think the word "match." is leftover from some editing in the phrase
> " that specifies which rows in the data_source match rows in the
> target_table_name. match." In the same paragraph, it is not clear
> whether all columns must be matched or it can be a partial match.

Thanks for these review points, I have included them.

> * In the when_clause note, it is not clear whether you can have multiple
> WHEN MATCHED and WHEN NOT MATCHED clauses. Obviously you can have one
> of each, but I think your doc says it is possible to have more than one of
> each, with different conditions (WHEN MATCHED AND foo THEN bar WHEN
> MATCHED AND baz THEN qux). No example shows more than one.
>
> On the same point: Is there short-circuiting of such conditions, i.e.
> will the execution stop looking for further WHEN matches if some
> rule matches, or will it rather check all rules and raise an error if
> more than one WHEN rule matches a given row?

Docs rewritten to better explain.

> * Your last example uses ELSE but that appears nowhere in the synopsis.

This last has been removed.

New version of HTML docs attached for easy reading.

Attached: MERGE patch is now MOSTLY complete, but still WIP. Patch works sufficiently well to take data from source and use it correctly against target, for the DELETE operation and INSERT DEFAULT VALUES.
Patch also includes PL/pgSQL changes.

Patch has full set of docs and tests, but does not yet pass all tests. UPDATE and INSERT are not yet working because I've chosen to leave targetlist handling until last, which is still WIP. The patch doesn't crash, but does not yet work for those subcommands - though I haven't prevented it from executing those subcommands.

Patch uses mechanism as agreed previously with Peter G et al. on this thread.

SUMMARY

Works
* EXPLAIN
* DELETE actions
* DO NOTHING actions
* PL/pgSQL
* Triggers for row and statement

Not yet working
* Execute UPDATE actions
* Execute INSERT actions
* EvalPlanQual
* No isolation tests yet
* RLS
* Partitioning

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
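On Alvaro's question about multiple WHEN clauses, a statement with several conditional branches might look like the following. The tables are invented, and first-match, short-circuit evaluation (clauses tested in order, first passing condition taken) is assumed here:

```sql
MERGE INTO stock s
USING deliveries d ON s.item_id = d.item_id
WHEN MATCHED AND d.qty = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET qty = s.qty + d.qty
WHEN NOT MATCHED AND d.qty > 0 THEN
    INSERT (item_id, qty) VALUES (d.item_id, d.qty)
WHEN NOT MATCHED THEN
    DO NOTHING;
```

Under first-match semantics, ordering matters: the unconditional WHEN MATCHED clause acts as a catch-all for matched rows that fail the earlier AND condition.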
On Sat, Dec 30, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Patch uses mechanism as agreed previously with Peter G et al. on this thread. I'm not sure that an agreement was reached, or what the substance of that agreement was. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 4 January 2018 at 17:29, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Dec 30, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Patch uses mechanism as agreed previously with Peter G et al. on this thread.
>
> I'm not sure that an agreement was reached, or what the substance of
> that agreement was.

I refer to this... and confirm I have implemented option 3

On 3 November 2017 at 11:07, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Fri, Nov 3, 2017 at 1:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > We seem to have a few options for PG11
>> >
>> > 1. Do nothing, we reject MERGE
>> >
>> > 2. Implement MERGE for unique index situations only, attempting to
>> > avoid errors (Simon OP)
>> >
>> > 3. Implement MERGE, but without attempting to avoid concurrent ERRORs (Peter)
>> >
>> > 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
>> > cases where that is possible.
>> >
>> > Stephen, Robert, please say which option you now believe we should pick.
>>
>> I think Peter has made a good case for #3, so I lean toward that
>> option. I think #4 is too much of a non-obvious behavior difference
>> between the cases where we can avoid those errors and the cases where
>> we can't, and I don't see where #2 can go in the future other than #4.
>
> Agreed.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jan 4, 2018 at 12:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 4 January 2018 at 17:29, Robert Haas <robertmhaas@gmail.com> wrote: >> On Sat, Dec 30, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Patch uses mechanism as agreed previously with Peter G et al. on this thread. >> >> I'm not sure that an agreement was reached, or what the substance of >> that agreement was. > > I refer to this... and confirm I have implemented option 3 Thanks. Sorry, I had forgotten about that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 30 December 2017 at 11:01, Simon Riggs <simon@2ndquadrant.com> wrote:
> Attached: MERGE patch is now MOSTLY complete, but still WIP.

New v10a attached, with additional dev work by Pavan and some review from Andrew

> Patch works sufficiently well to take data from source and use it
> correctly against target, for the DELETE operation and INSERT DEFAULT
> VALUES. Patch also includes PL/pgSQL changes.
>
> Patch has full set of docs and tests, but does not yet pass all tests.

Now passes all tests, including throwing a new type of semantic error discovered during dev.

> Patch uses mechanism as agreed previously with Peter G et al. on this thread.

LATEST SUMMARY

Works
* EXPLAIN
* INSERT actions (thanks Pavan)
* UPDATE actions (thanks Pavan)
* DELETE actions
* DO NOTHING actions
* PL/pgSQL
* Triggers for row and statement
* SQL Standard error requirements

Not yet working
* AND conditions (currently WIP, expected soon)
* No isolation tests yet, so EvalPlanQual untested
* RLS
* Partitioning

Based on this successful progress I imagine I'll be looking to commit this by the end of the CF, allowing us 2 further months to bugfix.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 18 January 2018 at 17:19, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 30 December 2017 at 11:01, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>> Attached: MERGE patch is now MOSTLY complete, but still WIP.

v11 attached

> LATEST SUMMARY
> Works
> * EXPLAIN
> * INSERT actions (thanks Pavan)
> * UPDATE actions (thanks Pavan)
> * DELETE actions
> * AND conditions (thanks Pavan)
> * Isolation tests and EvalPlanQual
> * DO NOTHING actions
> * PL/pgSQL
> * Triggers for row and statement
> * SQL Standard error requirements
>
> Not yet working
> * Partitioning
> * RLS
>
> Based on this successful progress I imagine I'll be looking to commit
> this by the end of the CF, allowing us 2 further months to bugfix.

This is complete and pretty clean now. 1200 lines of code, plus docs and tests.

I'm expecting to commit this and then come back for the Partitioning & RLS later, but will wait a few days for comments and other reviews.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
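Since PL/pgSQL support is on the "Works" list above, usage from a function presumably looks like any other DML. A hypothetical sketch, with invented tables; whether GET DIAGNOSTICS ROW_COUNT reports the combined count of all MERGE actions is an assumption here, not something the thread confirms:

```sql
CREATE OR REPLACE FUNCTION sync_prices() RETURNS bigint AS $$
DECLARE
    n bigint;
BEGIN
    MERGE INTO prices p
    USING staging_prices s ON p.sku = s.sku
    WHEN MATCHED THEN
        UPDATE SET price = s.price
    WHEN NOT MATCHED THEN
        INSERT (sku, price) VALUES (s.sku, s.price);
    GET DIAGNOSTICS n = ROW_COUNT;  -- assumed: total rows affected
    RETURN n;
END;
$$ LANGUAGE plpgsql;
```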
On Tue, Jan 23, 2018 at 5:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Not yet working
>> * Partitioning
>> * RLS
>>
>> Based on this successful progress I imagine I'll be looking to commit
>> this by the end of the CF, allowing us 2 further months to bugfix.
>
> This is complete and pretty clean now. 1200 lines of code, plus docs and tests.

That timeline seems aggressive to me. Also, the patch appears to have bitrot. Please rebase, and post a new version.

Some feedback based on a short read-through of your v11:

* What's the postgres_fdw status?

* Relatedly, when/where does applying the junkfilter happen, if not here?:

> @@ -1767,9 +1833,9 @@ ExecModifyTable(PlanState *pstate)
>  }
>
>  /*
> - * apply the junkfilter if needed.
> + * apply the junkfilter if needed - we do this later for CMD_MERGE
>  */
> - if (operation != CMD_DELETE)
> + if (operation == CMD_UPDATE || operation == CMD_INSERT)
>  slot = ExecFilterJunk(junkfilter, slot);

* Isn't "consider INSERT ... ON CONFLICT DO UPDATE" obsolete in these doc changes?:

> -F312 MERGE statement NO consider INSERT ... ON CONFLICT DO UPDATE
> -F313 Enhanced MERGE statement NO
> -F314 MERGE statement with DELETE branch NO
> +F312 MERGE statement YES consider INSERT ... ON CONFLICT DO UPDATE
> +F313 Enhanced MERGE statement YES
> +F314 MERGE statement with DELETE branch YES

* What's the deal with transition tables? Your docs say this:

> + <varlistentry>
> + <term><replaceable class="parameter">source_table_name</replaceable></term>
> + <listitem>
> + <para>
> + The name (optionally schema-qualified) of the source table, view or
> + transition table.
> + </para>
> + </listitem>
> + </varlistentry>

But the code says this:

> + /*
> + * XXX if we support transition tables this would need to move earlier
> + * before ExecSetupTransitionCaptureState()
> + */
> + switch (action->commandType)

* Can UPDATEs themselves accept an UPDATE-style WHERE clause, in addition to the WHEN quals, and the main ON join quals?
I don't see any examples of this in your regression tests.

* You added a new MERGE command tag. Shouldn't there be changes to libpq's fe-exec.c, to go along with that?

* I noticed this:

> + else if (node->operation == CMD_MERGE)
> + {
> + /*
> + * XXX Add more detailed instrumentation for MERGE changes
> + * when running EXPLAIN ANALYZE?
> + */
> + }

I think that doing better here is necessary.

* I'm not a fan of stored rules, but I still think that this needs to be discussed:

> - product_queries = fireRules(parsetree,
> + /*
> + * First rule of MERGE club is we don't talk about rules
> + */
> + if (event == CMD_MERGE)
> + product_queries = NIL;
> + else
> + product_queries = fireRules(parsetree,
>  result_relation,
>  event,
>  locks,

* This seems totally unnecessary at first blush:

> + /*
> + * Lock tuple for update.
> + *
> + * XXX Is this really needed? I put this in
> + * just to get hold of the existing tuple.
> + * But if we do need, then we probably
> + * should be looking at the return value of
> + * heap_lock_tuple() and take appropriate
> + * action.
> + */
> + tuple.t_self = *tupleid;
> + test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
> + lockmode, LockWaitBlock, false, &buffer,
> + &hufd);

* I don't know what the deal is with EPQ and MERGE in general, because just after the above heap_lock_tuple(), you do this:

> + /*
> + * Test condition, if any
> + *
> + * In the absence of a condition we perform the action
> + * unconditionally (no need to check separately since
> + * ExecQual() will return true if there are no
> + * conditions to evaluate).
> + */
> + if (!ExecQual(action->whenqual, econtext))
> + {
> + if (BufferIsValid(buffer))
> + ReleaseBuffer(buffer);
> + continue;
> + }

Maybe *this* is why you call heap_lock_tuple(), actually, but that's not clear, and in any case I don't see any point in it if you don't check heap_lock_tuple()'s return value, and do some other extra thing on the basis of that return value.
No existing heap_lock_tuple() caller ignores its return value, and ignoring the return value simply can't make sense.

Don't WHEN quals need to participate in EPQ reevaluation, in order to preserve the behavior that we see with UPDATE and DELETE? Why, or why not? I suppose that the way you evaluate WHEN quals separately makes a certain amount of sense, but I think that you need to get things straight with WHEN quals and READ COMMITTED conflict handling at a high level (the semantics), which may or may not mean that WHEN quals participate in EPQ evaluation. If you're going to introduce a special case, I think you need to note it under the EvalPlanQual section of the executor README.

This much I'm sure of: you should reevaluate WHEN quals if the UPDATE chain is walked in READ COMMITTED mode, one way or another. It might end up happening in your new heap_lock_tuple() retry loop, or you might do something differently with EPQ, but something should happen (I haven't got an opinion on the implementation details just yet, though).

* Your isolation test should be commented. I'd expect you to talk about what is different about MERGE as far as concurrency goes, if anything. I note that you don't use additional WHEN quals in your isolation test at all (just simple WHEN NOT MATCHED + WHEN MATCHED), which was the first thing I looked for there. Look at insert-conflict-do-update-3.spec for an example of roughly the kind of commentary I had hoped to see in your regression test.

> I'm expecting to commit this and then come back for the Partitioning &
> RLS later, but will wait a few days for comments and other reviews.

I don't think that it's okay to defer RLS or partitioning support till after an initial commit. While it's probably true that MERGE can just follow ON CONFLICT's example when it comes to column-level privileges, this won't be true with RLS.

-- Peter Geoghegan
On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jan 23, 2018 at 5:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Not yet working
>>> * Partitioning
>>> * RLS
>>>
>>> Based on this successful progress I imagine I'll be looking to commit
>>> this by the end of the CF, allowing us 2 further months to bugfix.
>>
>> This is complete and pretty clean now. 1200 lines of code, plus docs and tests.
>
> That timeline seems aggressive to me.

Thank you for the review.

> Also, the patch appears to have
> bitrot. Please rebase, and post a new version.

Will do, though I'm sure that's only minor since we rebased only a few days ago.

> Some feedback based on a short read-through of your v11:
>
> * What's the postgres_fdw status?

MERGE currently works with normal relations and materialized views only.

> * Relatedly, when/where does applying the junkfilter happen, if not here?:

A few lines later in the same file. Junkfilters are still used.

> * Isn't "consider INSERT ... ON CONFLICT DO UPDATE" obsolete in these
> doc changes?:
>
>> -F312 MERGE statement NO consider INSERT ... ON CONFLICT DO UPDATE
>> -F313 Enhanced MERGE statement NO
>> -F314 MERGE statement with DELETE branch NO
>> +F312 MERGE statement YES consider INSERT ... ON CONFLICT DO UPDATE
>> +F313 Enhanced MERGE statement YES
>> +F314 MERGE statement with DELETE branch YES

I think that it is still useful to highlight the existence of a non-standard feature of which people might otherwise be unaware. If you wish me to remove a reference to your work, I will bow to your wish.

> * What's the deal with transition tables? Your docs say this:
>
>> + <varlistentry>
>> + <term><replaceable class="parameter">source_table_name</replaceable></term>
>> + <listitem>
>> + <para>
>> + The name (optionally schema-qualified) of the source table, view or
>> + transition table.
>> + </para> >> + </listitem> >> + </varlistentry> > > But the code says this: > >> + /* >> + * XXX if we support transition tables this would need to move earlier >> + * before ExecSetupTransitionCaptureState() >> + */ >> + switch (action->commandType) Changes made by MERGE can theoretically be visible in a transition table. When I tried to implement that there were problems caused by the way INSERT ON CONFLICT has been implemented, so it will take a fair amount of study and lifting to make that work. For now, they are not supported and generate an error message. That behaviour was not documented, but I've done that now. SQL Std says "Transition tables are not generally updatable (and therefore not simply updatable) and their columns are not updatable." So MERGE cannot be a target of a transition table, but it can be the source of one. So the docs you cite are correct, as is the comment. > * Can UPDATEs themselves accept an UPDATE-style WHERE clause, in > addition to the WHEN quals, and the main ON join quals? > > I don't see any examples of this in your regression tests. No, they can't. Oracle allows it but it isn't SQL Standard. > * You added a new MERGE command tag. Shouldn't there be changes to > libpq's fe-exec.c, to go along with that? Nice catch. One liner patch and docs updated in repo for next patch. > * I noticed this: > >> + else if (node->operation == CMD_MERGE) >> + { >> + /* >> + * XXX Add more detailed instrumentation for MERGE changes >> + * when running EXPLAIN ANALYZE? >> + */ >> + } > > I think that doing better here is necessary. Please define better. It may be possible. 
> * I'm not a fan of stored rules, but I still think that this needs to > be discussed: > >> - product_queries = fireRules(parsetree, >> + /* >> + * First rule of MERGE club is we don't talk about rules >> + */ >> + if (event == CMD_MERGE) >> + product_queries = NIL; >> + else >> + product_queries = fireRules(parsetree, >> result_relation, >> event, >> locks, It has been discussed, in my recent proposal, mentioned in the doc page for clarity. It has also been discussed previously in discussions around MERGE. It's not clear how they would work, but we already have an example of a statement type that doesn't support rules. > * This seems totally unnecessary at first blush: > >> + /* >> + * Lock tuple for update. >> + * >> + * XXX Is this really needed? I put this in >> + * just to get hold of the existing tuple. >> + * But if we do need, then we probably >> + * should be looking at the return value of >> + * heap_lock_tuple() and take appropriate >> + * action. >> + */ >> + tuple.t_self = *tupleid; >> + test = heap_lock_tuple(relation, &tuple, estate->es_output_cid, >> + lockmode, LockWaitBlock, false, &buffer, >> + &hufd); > > * I don't know what the deal is with EPQ and MERGE in general, because > just after the above heap_lock_tuple(), you do this: > >> + /* >> + * Test condition, if any >> + * >> + * In the absence of a condition we perform the action >> + * unconditionally (no need to check separately since >> + * ExecQual() will return true if there are no >> + * conditions to evaluate). >> + */ >> + if (!ExecQual(action->whenqual, econtext)) >> + { >> + if (BufferIsValid(buffer)) >> + ReleaseBuffer(buffer); >> + continue; >> + } > > Maybe *this* is why you call heap_lock_tuple(), actually, but that's > not clear, and in any case I don't see any point in it if you don't > check heap_lock_tuple()'s return value, and do some other extra thing > on the basis of that return value. 
No existing heap_lock_tuple() > ignores its return value, and ignoring the return value simply can't > make sense. Agreed, I don't think it is necessary. > Don't WHEN quals need to participate in EPQ reevaluation, in order to > preserve the behavior that we see with UPDATE and DELETE? Why, or why > not? WHEN qual evaluation occurs to establish which action to take. The Standard is clear that this happens prior to the action. > I suppose that the way you evaluate WHEN quals separately makes a > certain amount of sense, but I think that you need to get things > straight with WHEN quals and READ COMMITTED conflict handling at a > high level (the semantics), which may or may not mean that WHEN quals > participate in EPQ evaluation. If you're going to introduce a special > case, I think you need to note it under the EvalPlanQual section of > the executor README. I wasn't going to introduce a special case. > This much I'm sure of: you should reevaluate WHEN quals if the UPDATE > chain is walked in READ COMMITTED mode, one way or another. It might > end up happening in your new heap_lock_tuple() retry loop, or you > might do something differently with EPQ, but something should happen > (I haven't got an opinion on the implementation details just yet, > though). An interesting point, thank you for raising it. WHEN quals, if they exist, may have dependencies on either the source or target. If there is a dependency on the target there might be an issue. If the WHEN has a condition AND the WHEN qual fails a re-check after we do EPQ, then I think we should just throw an error. If we re-evaluate everything, I'm sure we'll get into some weird cases that make MATCHED/NOT MATCHED change and that is a certain error case for MERGE. We might do better than that after some thought, but that seems like a rabbit hole we should avoid in the first release. As we agreed earlier, we can later extend MERGE to produce less errors in certain concurrency cases. 
> * Your isolation test should be commented. I'd expect you to talk > about what is different about MERGE as far as concurrency goes, if > anything. I note that you don't use additional WHEN quals in your > isolation test at all (just simple WHEN NOT MATCHED + WHEN MATCHED), > which was the first thing I looked for there. Look at > insert-conflict-do-update-3.spec for an example of roughly the kind of > commentary I had hoped to see in your regression test. I will be happy to add one that exercises some new code resulting from the above. >> I'm expecting to commit this and then come back for the Partitioning & >> RLS later, but will wait a few days for comments and other reviews. > > I don't think that it's okay to defer RLS or partitioning support till > after an initial commit. While it's probably true that MERGE can just > follow ON CONFLICT's example when it comes to column-level privileges, > this won't be true with RLS. I think it is OK to do things in major pieces, as has been done with many other commands. We have more time in this release to do that, though we want to find and fix any issues in basic functionality like concurrency ahead of trying to add fancy stuff and hitting problems with it. I've already made two changes your review has raised, thanks. Will re-post soon. Thanks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 24 January 2018 at 04:12, Simon Riggs <simon@2ndquadrant.com> wrote: > On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote: >> > Please rebase, and post a new version. > > Will do, though I'm sure that's only minor since we rebased only a few days ago. New v12 with various minor corrections and rebased. Main new aspect here is greatly expanded isolation tests. Please read and suggest new tests. We've used those to uncover a few unhandled cases in the concurrency of very complex MERGE statements, so we will repost again on Mon/Tues with a new version covering all the new tests and any comments made here. Nothing to worry about, just some changed logic. I will post again later today with written details of the concurrency rules we're working to now. I've left most of the isolation test expected output as "TO BE DECIDED", so that we can agree our way forwards. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 26 January 2018 at 11:25, Simon Riggs <simon@2ndquadrant.com> wrote: > On 24 January 2018 at 04:12, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote: >>> >> Please rebase, and post a new version. >> >> Will do, though I'm sure that's only minor since we rebased only a few days ago. > > New v12 with various minor corrections and rebased. > > Main new aspect here is greatly expanded isolation tests. Please read > and suggest new tests. > > We've used those to uncover a few unhandled cases in the concurrency > of very complex MERGE statements, so we will repost again on Mon/Tues > with a new version covering all the new tests and any comments made > here. Nothing to worry about, just some changed logic. > > I will post again later today with written details of the concurrency > rules we're working to now. I've left most of the isolation test > expected output as "TO BE DECIDED", so that we can agree our way > forwards. New patch attached that correctly handles all known concurrency cases, with expected test output. The concurrency rules are very simple: If a MATCHED row is concurrently updated/deleted 1. We run EvalPlanQual 2. If the updated row is gone (EPQ returns a NULL slot) or EPQ returns a row with NULL values, then { if a NOT MATCHED action exists, then raise ERROR else continue to next row } else re-check all MATCHED AND conditions and execute the first action whose WHEN condition evaluates to TRUE This means MERGE will work just fine for "normal" UPDATEs, but it will often fail (deterministically) in concurrent tests with mixed insert/deletes or UPDATEs that touch the PK, as requested. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
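[Editorial illustration] The decision procedure described above can be sketched in a few lines of illustrative pseudocode (Python used only as notation; the function name and data shapes are hypothetical and do not correspond to the patch's executor code, where this happens during EvalPlanQual handling):

```python
# Toy model of the MERGE concurrency rules stated above: what to do
# after a MATCHED row was concurrently updated/deleted and
# EvalPlanQual (EPQ) has been run.  All names are hypothetical.

def after_epq(epq_row, merge_actions):
    """epq_row: the row EPQ returned, or None if the row is gone
    (NULL slot / all-NULL row).  merge_actions: ordered list of
    (matched, when_qual, action); when_qual of None = unconditional."""
    if epq_row is None:
        if any(not matched for matched, _, _ in merge_actions):
            # A WHEN NOT MATCHED action exists: flipping this row from
            # MATCHED to NOT MATCHED mid-statement would be ambiguous.
            raise RuntimeError(
                "could not serialize access due to concurrent update")
        return None  # no NOT MATCHED action: continue to next row
    # Row still exists: re-check MATCHED AND conditions in order and
    # take the first action whose WHEN condition still holds.
    for matched, when_qual, action in merge_actions:
        if matched and (when_qual is None or when_qual(epq_row)):
            return action
    return None

actions = [
    (True,  lambda r: r["balance"] > 0, "UPDATE"),  # WHEN MATCHED AND ...
    (True,  None,                       "DELETE"),  # WHEN MATCHED
    (False, None,                       "INSERT"),  # WHEN NOT MATCHED
]
print(after_epq({"balance": 5}, actions))   # UPDATE
print(after_epq({"balance": -2}, actions))  # DELETE
```

Note the asymmetry the rules imply: with only MATCHED actions present, a vanished row is simply skipped, while the presence of any NOT MATCHED action turns the same situation into a serialization error.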
Attachment
Hi
2018-01-29 15:11 GMT+01:00 Simon Riggs <simon@2ndquadrant.com>:
On 26 January 2018 at 11:25, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 24 January 2018 at 04:12, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote:
>>>
>> Please rebase, and post a new version.
>>
>> Will do, though I'm sure that's only minor since we rebased only a few days ago.
>
> New v12 with various minor corrections and rebased.
>
> Main new aspect here is greatly expanded isolation tests. Please read
> and suggest new tests.
>
> We've used those to uncover a few unhandled cases in the concurrency
> of very complex MERGE statements, so we will repost again on Mon/Tues
> with a new version covering all the new tests and any comments made
> here. Nothing to worry about, just some changed logic.
>
> I will post again later today with written details of the concurrency
> rules we're working to now. I've left most of the isolation test
> expected output as "TO BE DECIDED", so that we can agree our way
> forwards.
New patch attached that correctly handles all known concurrency cases,
with expected test output.
The concurrency rules are very simple:
If a MATCHED row is concurrently updated/deleted
1. We run EvalPlanQual
2. If the updated row is gone EPQ returns NULL slot or EPQ returns a
row with NULL values, then
{
if NOT MATCHED action exists, then raise ERROR
else continue to next row
}
else
re-check all MATCHED AND conditions and execute the first action
whose WHEN Condition evaluates to TRUE
This means MERGE will work just fine for "normal" UPDATEs, but it will
often fail (deterministically) in concurrent tests with mixed
insert/deletes or UPDATEs that touch the PK, as requested.
It would be nice to have a part about the differences between MERGE and INSERT ON CONFLICT DO
Regards
Pavel
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 January 2018 at 14:19, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> The concurrency rules are very simple: >> If a MATCHED row is concurrently updated/deleted >> 1. We run EvalPlanQual >> 2. If the updated row is gone EPQ returns NULL slot or EPQ returns a >> row with NULL values, then >> { >> if NOT MATCHED action exists, then raise ERROR >> else continue to next row >> } >> else >> re-check all MATCHED AND conditions and execute the first action >> whose WHEN Condition evaluates to TRUE >> >> >> This means MERGE will work just fine for "normal" UPDATEs, but it will >> often fail (deterministically) in concurrent tests with mixed >> insert/deletes or UPDATEs that touch the PK, as requested. > > > can be nice to have part about differences between MERGE and INSERT ON > CONFLICT DO We've agreed not to attempt to make it do anything like INSERT ON CONFLICT, so we don't need to discuss that here anymore. MERGE can be semantically equivalent to an UPDATE join or a DELETE join, and in those cases, MERGE behaves the same. It handles much more complex cases also. MERGE as submitted here follows all MVCC rules similar to an UPDATE join. If it hits a problem with concurrent activity it throws ERROR: could not serialize access due to concurrent update to make sure there is no ambiguity (as described directly above). As we discussed earlier, removing some of those ERRORs and making it do something useful instead may be possible later. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-01-29 15:40 GMT+01:00 Simon Riggs <simon@2ndquadrant.com>:
On 29 January 2018 at 14:19, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> The concurrency rules are very simple:
>> If a MATCHED row is concurrently updated/deleted
>> 1. We run EvalPlanQual
>> 2. If the updated row is gone EPQ returns NULL slot or EPQ returns a
>> row with NULL values, then
>> {
>> if NOT MATCHED action exists, then raise ERROR
>> else continue to next row
>> }
>> else
>> re-check all MATCHED AND conditions and execute the first action
>> whose WHEN Condition evaluates to TRUE
>>
>>
>> This means MERGE will work just fine for "normal" UPDATEs, but it will
>> often fail (deterministically) in concurrent tests with mixed
>> insert/deletes or UPDATEs that touch the PK, as requested.
>
>
> can be nice to have part about differences between MERGE and INSERT ON
> CONFLICT DO
We've agreed not to attempt to make it do anything like INSERT ON
CONFLICT, so we don't need to discuss that here anymore.
My note was not against MERGE or INSERT ON CONFLICT. If I understand this topic correctly, I agree that these commands should be implemented separately. But if we have two commands with some intersection, it would be nice to have documentation about recommended use cases. It will probably be a very common question.
Regards
Pavel
MERGE can be semantically equivalent to an UPDATE join or a DELETE
join, and in those cases, MERGE behaves the same. It handles much more
complex cases also.
MERGE as submitted here follows all MVCC rules similar to an UPDATE
join. If it hits a problem with concurrent activity it throws
ERROR: could not serialize access due to concurrent update
to make sure there is no ambiguity (as described directly above).
As we discussed earlier, removing some of those ERRORs and making it
do something useful instead may be possible later.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 23, 2018 at 8:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > This is complete and pretty clean now. 1200 lines of code, plus docs and tests. > > I'm expecting to commit this and then come back for the Partitioning & > RLS later, but will wait a few days for comments and other reviews. I agree with Peter: that's unacceptable. You're proposing to commit a patch that not only has had a very limited amount of review so far, but by your own admission is not even complete. Partitioning and RLS shouldn't be afterthoughts; they should be in the original patch. Moreover, the patch should have had meaningful review from people not involved in writing it, and that is a process that generally takes a few months or at least several weeks, not a few days. An argument could be made that this patch is already too late for PG 11, because it's a major feature that was not submitted in relatively complete form before the beginning of the penultimate CommitFest. I'm not going to make that argument, because I believe this patch is probably sufficiently low-risk that it can be committed between now and feature freeze without great risk of destabilizing the release. But committing it without some in-depth review is not the way to get there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 29 January 2018 at 14:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: > My note was not against MERGE or INSERT ON CONFLICT. If I understand to this > topic, I agree so these commands should be implemented separately. But if we > use two commands with some intersection, there can be nice to have > documentation about recommended use cases. Probably it will be very often > question. That is a more qualitative assessment of each, which I think I will defer on. This patch is about implementing the SQL Standard compliant MERGE command, which is widely used in other databases and by various tools. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 January 2018 at 15:07, Robert Haas <robertmhaas@gmail.com> wrote: > Moreover, the patch should have had meaningful review from people not > involved in writing it, and that is a process that generally takes a > few months or at least several weeks, not a few days. The code is about 1200 lines and has extensive docs, comments and tests. There are no contentious infrastructure changes, so the debate around concurrency is probably the main one. So it looks to me like meaningful review has taken place, though I know Andrew and Pavan have also looked at it in detail. But having said that, I'm not rushing to commit and further detailed review is welcome, hence the CF status. > An argument could be made that this patch is already too late for PG > 11, because it's a major feature that was not submitted in relatively > complete form before the beginning of the penultimate CommitFest. I'm > not going to make that argument, because I believe this patch is > probably sufficiently low-risk that it can be committed between now > and feature freeze without great risk of destabilizing the release. > But committing it without some in-depth review is not the way to get > there. The patch was substantially complete at that time (was v9d). Later work has changed isolated areas. I agree that this is low-risk. If I suggest committing it sooner rather than later it is because that is more likely to throw up bugs that will increase the eventual quality. Overall, I'm following the style of development process you have yourself used a number of times now. Waiting for mega-patches to be complete is not as useful as phased development. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 29, 2018 at 03:12:23PM +0000, Simon Riggs wrote: > On 29 January 2018 at 14:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > My note was not against MERGE or INSERT ON CONFLICT. If I understand to this > > topic, I agree so these commands should be implemented separately. But if we > > use two commands with some intersection, there can be nice to have > > documentation about recommended use cases. Probably it will be very often > > question. > > That is more qualitative assessment of each, which I think I will defer on. > > This patch is about implementing the SQL Standard compliant MERGE > command which is widely used in other databases and by various tools. Uh, if we know we are going to get question on this, the patch had better have an explanation of when to use it. Pushing the problem to later doesn't seem helpful. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Simon Riggs <simon@2ndquadrant.com> writes: > On 29 January 2018 at 15:07, Robert Haas <robertmhaas@gmail.com> wrote: >> An argument could be made that this patch is already too late for PG >> 11, because it's a major feature that was not submitted in relatively >> complete form before the beginning of the penultimate CommitFest. > Overall, I'm following the style of development process you have > yourself used a number of times now. Waiting for mega-patches to be > complete is not as useful as phased development. An important part of that style is starting at an appropriate time in the release cycle. As things stand, you are proposing to commit an unfinished feature to v11, and then we have to see if the missing parts show up on time (ie before 1 March) and with adequate quality. Otherwise we'll be having a debate on whether to revert the feature or not ... and if it comes to that, my vote will be for reverting. I'd be much happier about committing this with some essential parts missing if it were done at the start of a devel cycle rather than near the end. regards, tom lane
On 29 January 2018 at 15:44, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Jan 29, 2018 at 03:12:23PM +0000, Simon Riggs wrote: >> On 29 January 2018 at 14:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> > My note was not against MERGE or INSERT ON CONFLICT. If I understand to this >> > topic, I agree so these commands should be implemented separately. But if we >> > use two commands with some intersection, there can be nice to have >> > documentation about recommended use cases. Probably it will be very often >> > question. >> >> That is more qualitative assessment of each, which I think I will defer on. >> >> This patch is about implementing the SQL Standard compliant MERGE >> command which is widely used in other databases and by various tools. > > Uh, if we know we are going to get question on this, the patch had > better have an explanation of when to use it. Pushing the problem to > later doesn't seem helpful. What problem are you referring to? MERGE is not being implemented as some kind of rival to existing functionality, it does things we cannot yet do. Info below is for interest only, it is unrelated to this patch: INSERT ON CONFLICT UPDATE does only INSERT and UPDATE and has various restrictions. It violates MVCC where needed, to allow it to succeed more frequently in updating a concurrently inserted row. It is not SQL Standard. MERGE allows you to make INSERTs, UPDATEs and DELETEs against a single target table using complex conditionals. It follows the SQL Standard; many developers from other databases, much existing code and many tools know it. e.g. MERGE INTO target t USING source s ON t.tid = s.sid WHEN MATCHED AND balance > delta THEN UPDATE SET balance = balance - delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT (balance, tid) VALUES (balance + delta, sid); -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
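[Editorial illustration] As a rough walk-through of what an example statement of that shape does, the toy model below applies the three outcomes to in-memory rows. Python is used purely as pseudocode here; the choice of `delta` as the inserted balance is an assumption of this sketch, since the example's INSERT clause leaves the source's balance column implicit.

```python
# Toy model of a MERGE of the shape shown above, applied to a dict of
# rows.  Not how the executor works; purely to show the three outcomes.

def toy_merge(target, source_sids, delta):
    """target maps tid -> balance; source_sids lists the source keys."""
    for sid in source_sids:
        if sid in target:                 # ON t.tid = s.sid matched
            if target[sid] > delta:       # WHEN MATCHED AND balance > delta
                target[sid] -= delta      #   THEN UPDATE
            else:                         # WHEN MATCHED THEN DELETE
                del target[sid]
        else:                             # WHEN NOT MATCHED THEN INSERT
            target[sid] = delta           # inserted balance: an assumption
    return target

print(toy_merge({1: 100, 2: 3}, [1, 2, 3], delta=10))
# {1: 90, 3: 10}: row 1 updated, row 2 deleted, row 3 inserted
```

One pass over the join thus produces updates, deletes, and inserts against a single target, which is exactly what INSERT ON CONFLICT cannot express.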
On 01/29/2018 11:13 AM, Simon Riggs wrote: > On 29 January 2018 at 15:44, Bruce Momjian <bruce@momjian.us> wrote: >> Uh, if we know we are going to get question on this, the patch had >> better have an explanation of when to use it. Pushing the problem to >> later doesn't seem helpful. > > What problem are you referring to? > > INSERT ON CONFLICT UPDATE does ... > > MERGE allows you to ... In my reading of Pavel and Bruce, the only 'problem' being suggested is that the patch hasn't added a bit of documentation somewhere that lays out the relationship between these two things, more or less as you just did. -Chap
On 29 January 2018 at 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> On 29 January 2018 at 15:07, Robert Haas <robertmhaas@gmail.com> wrote: >>> An argument could be made that this patch is already too late for PG >>> 11, because it's a major feature that was not submitted in relatively >>> complete form before the beginning of the penultimate CommitFest. > >> Overall, I'm following the style of development process you have >> yourself used a number of times now. Waiting for mega-patches to be >> complete is not as useful as phased development. > > An important part of that style is starting at an appropriate time in the > release cycle. As things stand, you are proposing to commit an unfinished > feature to v11, and then we have to see if the missing parts show up on > time (ie before 1 March) and with adequate quality. Otherwise we'll be > having a debate on whether to revert the feature or not ... and if it > comes to that, my vote will be for reverting. > > I'd be much happier about committing this with some essential parts > missing if it were done at the start of a devel cycle rather than near > the end. I agree with all of the above. In terms of timing of commits, I have marked the patch Ready For Committer. To me that signifies that it is ready for review by a Committer prior to commit. In case of doubt, I would not even suggest committing this if it had any concurrency issues. That would be clearly unacceptable. The only discussion would be about the word "unfinished". I'm not clear why this patch, which has current caveats all clearly indicated in the docs, differs substantially from other projects that have committed their work ahead of having everything everybody wants, such as replication, materialized views, parallel query, partitioning, logical decoding etc.. All of those features had caveats in the first release in which they were included and many of them were committed prior to the last CF. 
We are working now to remove those caveats. Why is this different? It shouldn't be. If unfinished means it has caveats, that is different to unfinished meaning crappy, risky, contentious etc. Anyway, reviews welcome, but few people know anything about targetlists and column handling. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote: > I agree with all of the above. > > In terms of timing of commits, I have marked the patch Ready For > Committer. To me that signifies that it is ready for review by a > Committer prior to commit. > > In case of doubt, I would not even suggest committing this if it had > any concurrency issues. That would be clearly unacceptable. > > The only discussion would be about the word "unfinished". I'm not > clear why this patch, which has current caveats all clearly indicated > in the docs, differs substantially from other projects that have > committed their work ahead of having everything everybody wants, such > as replication, materialized views, parallel query, partitioning, > logical decoding etc.. All of those features had caveats in the first > release in which they were included and many of them were committed > prior to the last CF. We are working now to remove those caveats. Why > is this different? It shouldn't be. If unfinished means it has caveats > that is different to unfinished meaning crappy, risky, contentious > etc.. I think the question is how does it handle cases it doesn't support? Does it give wrong answers? Does it give a helpful error message? Can you summarize that? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote: >> ... If unfinished means it has caveats >> that is different to unfinished meaning crappy, risky, contentious >> etc.. > I think the question is how does it handle cases it doesn't support? > Does it give wrong answers? Does it give a helpful error message? Can > you summarize that? What I was reacting to was the comments just upthread that it doesn't yet handle partitions or RLS. Those things don't seem optional to me. Maybe they're small additions, but if so why aren't they done already? Also, as far as phased development goes: Simon's drawing analogies to things like parallel query, which we all understood had to be done over multiple dev cycles because they were too big to finish in one cycle. I don't think MERGE qualifies: there seems no good reason why it can't be done, full stop, in the first release where it appears. regards, tom lane
On 29 January 2018 at 16:44, Bruce Momjian <bruce@momjian.us> wrote: > I think the question is how does it handle cases it doesn't support? > Does it give wrong answers? Does it give a helpful error message? Can > you summarize that? I'm happy to report that it gives correct answers to every known MERGE test, except * where it hits a concurrency issue and throws SQLCODE = ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that * where it hits an unsupported feature and throws SQLCODE = ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text but of course Robert is correct and everything benefits from further review. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 January 2018 at 16:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Bruce Momjian <bruce@momjian.us> writes: >> On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote: >>> ... If unfinished means it has caveats >>> that is different to unfinished meaning crappy, risky, contentious >>> etc.. > >> I think the question is how does it handle cases it doesn't support? >> Does it give wrong answers? Does it give a helpful error message? Can >> you summarize that? > > What I was reacting to was the comments just upthread that it doesn't > yet handle partitions or RLS. Those things don't seem optional to me. > Maybe they're small additions, but if so why aren't they done already? Phasing and risk. Partitioning doesn't look too bad, so that looks comfortable for PG11, assuming it doesn't hit some unhandled complexity. Including RLS in the first commit/release turns this into a high risk patch. Few people use it, but if they do, they don't want me putting a hole in their battleship (literally) should we discover some weird unhandled logic in a complex new command. My recommendation would be to support that later for those that use it. For those that don't, it doesn't matter so can also be done later. > Also, as far as phased development goes: Simon's drawing analogies > to things like parallel query, which we all understood had to be > done over multiple dev cycles because they were too big to finish > in one cycle. I don't think MERGE qualifies: there seems no good > reason why it can't be done, full stop, in the first release where > it appears. That remains the plan, barring delays. If you want to include RLS, then I would appreciate an early review. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 January 2018 at 16:23, Chapman Flack <chap@anastigmatix.net> wrote: > On 01/29/2018 11:13 AM, Simon Riggs wrote: >> On 29 January 2018 at 15:44, Bruce Momjian <bruce@momjian.us> wrote: >>> Uh, if we know we are going to get question on this, the patch had >>> better have an explanation of when to use it. Pushing the problem to >>> later doesn't seem helpful. >> >> What problem are you referring to? >> >> INSERT ON CONFLICT UPDATE does ... >> >> MERGE allows you to ... > In my reading of Pavel and Bruce, the only 'problem' being suggested > is that the patch hasn't added a bit of documentation somewhere that > lays out the relationship between these two things, more or less as > you just did. I am happy to write docs as requested. There are currently no docs saying when INSERT ON CONFLICT UPDATE should be used other than the ref page for that command. There is no mention of it in the "Data Manipulation" section of the docs. I've included docs for MERGE so it is mentioned in concurrency and reference sections, so it follows the same model. Where would people like me to put these docs? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-01-29 18:08 GMT+01:00 Simon Riggs <simon@2ndquadrant.com>:
On 29 January 2018 at 16:23, Chapman Flack <chap@anastigmatix.net> wrote:
> On 01/29/2018 11:13 AM, Simon Riggs wrote:
>> On 29 January 2018 at 15:44, Bruce Momjian <bruce@momjian.us> wrote:
>>> Uh, if we know we are going to get question on this, the patch had
>>> better have an explanation of when to use it. Pushing the problem to
>>> later doesn't seem helpful.
>>
>> What problem are you referring to?
>>
>> INSERT ON CONFLICT UPDATE does ...
>>
>> MERGE allows you to ...
> In my reading of Pavel and Bruce, the only 'problem' being suggested
> is that the patch hasn't added a bit of documentation somewhere that
> lays out the relationship between these two things, more or less as
> you just did.
I am happy to write docs as requested.
There are currently no docs saying when INSERT ON CONFLICT UPDATE
should be used other than the ref page for that command. There is no
mention of it in the "Data Manipulation" section of the docs.
I've included docs for MERGE so it is mentioned in concurrency and
reference sections, so it follows the same model.
Where would people like me to put these docs?
Depends on the size: a small note can be placed in the MERGE docs, with a link from INSERT ON CONFLICT DO.
Regards
Pavel
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
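[For readers following the docs discussion above, a minimal illustration of the relationship between the two commands; table and column names here are hypothetical, not from the patch's regression tests:]

```sql
-- INSERT ... ON CONFLICT: upsert keyed on a unique constraint,
-- with insert-or-update semantics guaranteed under concurrency.
INSERT INTO stock (item_id, qty)
VALUES (1, 10)
ON CONFLICT (item_id) DO UPDATE
    SET qty = stock.qty + EXCLUDED.qty;

-- MERGE (SQL:2011): general conditional INSERT/UPDATE/DELETE driven
-- by an arbitrary join; no unique constraint is required, and several
-- WHEN clauses with extra conditions may apply to the same statement.
MERGE INTO stock s
USING incoming i ON s.item_id = i.item_id
WHEN MATCHED AND i.qty = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET qty = s.qty + i.qty
WHEN NOT MATCHED THEN
    INSERT (item_id, qty) VALUES (i.item_id, i.qty);
```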
On Mon, Jan 29, 2018 at 8:44 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote:
>> The only discussion would be about the word "unfinished". I'm not
>> clear why this patch, which has current caveats all clearly indicated
>> in the docs, differs substantially from other projects that have
>> committed their work ahead of having everything everybody wants, such
>> as replication, materialized views, parallel query, partitioning,
>> logical decoding etc.. All of those features had caveats in the first
>> release in which they were included and many of them were committed
>> prior to the last CF. We are working now to remove those caveats. Why
>> is this different? It shouldn't be. If unfinished means it has caveats
>> that is different to unfinished meaning crappy, risky, contentious
>> etc..
>
> I think the question is how does it handle cases it doesn't support?
> Does it give wrong answers? Does it give a helpful error message? Can
> you summarize that?

+1

ON CONFLICT had support for logical decoding, updatable views, and RLS
in the first commit. ON CONFLICT was committed in a form that worked
seamlessly with any other feature in the system you can name. Making
ON CONFLICT play nice with all adjacent features was a great deal of
work, and I definitely needed Andres' help for the logical decoding
part, but we got it done. (Logical decoding for MERGE should be quite
a lot easier, though.)

I'm willing to talk about why MERGE is different from ON CONFLICT, and
why it may not need to tick all of the same boxes. DML statements are
supposed to be highly composable things, though. That's the premise
you should start from, IMV.

-- 
Peter Geoghegan
On Mon, Jan 29, 2018 at 8:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 29 January 2018 at 16:44, Bruce Momjian <bruce@momjian.us> wrote:
>
>> I think the question is how does it handle cases it doesn't support?
>> Does it give wrong answers? Does it give a helpful error message? Can
>> you summarize that?
>
> I'm happy to report that it gives correct answers to every known MERGE
> test, except
>
> * where it hits a concurrency issue and throws SQLCODE =
> ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that
>
> * where it hits an unsupported feature and throws SQLCODE =
> ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text

What specific features does it not yet work with? A list would be
helpful.

-- 
Peter Geoghegan
On 29 January 2018 at 17:35, Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Jan 29, 2018 at 8:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 29 January 2018 at 16:44, Bruce Momjian <bruce@momjian.us> wrote:
>>
>>> I think the question is how does it handle cases it doesn't support?
>>> Does it give wrong answers? Does it give a helpful error message? Can
>>> you summarize that?
>>
>> I'm happy to report that it gives correct answers to every known MERGE
>> test, except
>>
>> * where it hits a concurrency issue and throws SQLCODE =
>> ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that
>>
>> * where it hits an unsupported feature and throws SQLCODE =
>> ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text
>
> What specific features does it not work with already? A list would be helpful.

Yes, I added that to the docs as a result of your review comments, and
also mentioned it here last week in answer to your specific questions.

The current list of features that return ERRCODE_FEATURE_NOT_SUPPORTED is:

* Tables with Row Security enabled
* Partitioning & Inheritance
* Foreign Tables

Rules are ignored, as they are with COPY.

If people have concerns or find problems following review, I will be
happy to update this list and/or fix issues, as normal.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
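[A sketch of the retryable serialization-failure path described above; the session interleaving and names are illustrative, not taken from the patch's isolation tests:]

```sql
-- Session 1
BEGIN;
UPDATE tgt SET key = key + 100 WHERE key = 1;   -- changes the join key

-- Session 2 (READ COMMITTED; blocks on the row session 1 has locked)
MERGE INTO tgt t
USING src s ON t.key = s.key
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (key, val) VALUES (s.key, s.val);

-- Session 1
COMMIT;

-- Session 2 now fails rather than guess whether to INSERT:
-- ERROR: could not serialize access due to concurrent update
-- (SQLSTATE 40001, ERRCODE_T_R_SERIALIZATION_FAILURE; the client
-- is expected to retry the whole MERGE statement)
```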
On Mon, Jan 29, 2018 at 6:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> New patch attached that correctly handles all known concurrency cases,
> with expected test output.

This revision, v13, seems much improved in this area.

> This means MERGE will work just fine for "normal" UPDATEs, but it will
> often fail (deterministically) in concurrent tests with mixed
> insert/deletes or UPDATEs that touch the PK, as requested.

* While looking at how you're handling concurrency/EPQ now, I noticed
this code:

> + /*
> +  * Test condition, if any
> +  *
> +  * In the absence of a condition we perform the action
> +  * unconditionally (no need to check separately since
> +  * ExecQual() will return true if there are no
> +  * conditions to evaluate).
> +  */
> + if (!ExecQual(action->whenqual, econtext))
> + {
> +     if (BufferIsValid(buffer))
> +         ReleaseBuffer(buffer);
> +     continue;
> + }

(As well as its interactions with ExecUpdate() + ExecDelete(), which
are significant.)

The way that routines like ExecUpdate() interact with MERGE for WHEN
qual + EPQ handling seems kind of convoluted. I hope for something
cleaner in the next revision.

* This also stood out (not sure if you were referring/alluding to this
in the quoted text):

> + /*
> +  * If EvalPlanQual did not return a tuple, it means we
> +  * have seen a concurrent delete, or a concurrent update
> +  * where the row has moved to another partition.
> +  *
> +  * UPDATE ignores this case and continues.
> +  *
> +  * If MERGE has a WHEN NOT MATCHED clause we know that the
> +  * user would like to INSERT something in this case, yet
> +  * we can't see the delete with our snapshot, so take the
> +  * safe choice and throw an ERROR. If the user didn't care
> +  * about WHEN NOT MATCHED INSERT then neither do we.
> +  *
> +  * XXX We might consider setting matched = false and loop
> +  * back to lmerge though we'd need to do something like
> +  * EvalPlanQual, but not quite.
> +  */
> + else if (epqstate->epqresult == EPQ_TUPLE_IS_NULL &&
> +          node->mt_merge_subcommands & ACL_INSERT)
> + {
> +     /*
> +      * We need to throw a retryable ERROR because of the
> +      * concurrent update which we can't handle.
> +      */
> +     ereport(ERROR,
> +             (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> +              errmsg("could not serialize access due to concurrent update")));
> + }

I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
READ COMMITTED mode. Anyway, I wonder why we shouldn't just go ahead
and do the WHEN NOT MATCHED INSERT on the basis of information which
is not visible to our snapshot. We choose to not UPDATE/DELETE on the
basis of information from the future already, within EPQ. My first
feeling is that this is a distinction without a difference, and you
should actually go and INSERT at this point, though I reserve the
right to change my mind about that.

Yeah, EPQ semantics are icky, but would actually inserting here really
be any worse than what we do?

Other things I noticed (not related to concurrency) following a fairly
quick pass:

* Basic tab completion support would be nice.

* The SQL standard explicitly imposes this limitation:

postgres=# explain merge into cities a using cities b on a.city =
b.city when matched and (select 1=1) then update set country =
b.country;
ERROR:  0A000: cannot use subquery in WHEN AND condition
LINE 1: ...sing cities b on a.city = b.city when matched and (select 1=...
                                                              ^
LOCATION:  transformSubLink, parse_expr.c:1865

Why do you, though? Do we really need it? I'm just curious about your
thoughts on it. To be clear, I'm not asserting that you're wrong.

* ISTM that this patch should have EXPLAIN ANALYZE instrumentation for
inserted/updated/deleted rows, in roughly the same style as ON
CONFLICT's EXPLAIN ANALYZE output. I mentioned this already, and you
seemed unclear on what I meant. Hopefully my remarks here are clearer.

* Subselect handling is buggy:

postgres=# merge into cities a using cities b on a.city = b.city when
matched and a.city = 'Straffan' then update set country = (select
'Some country');
ERROR:  XX000: unrecognized node type: 114
LOCATION:  ExecInitExprRec, execExpr.c:2114

* Why no CTE support? SQL Server has this.

* The INSERT ... SELECT syntax doesn't work:

postgres=# merge into array_test a using (select '{1,2,3}'::int[] aaa)
b on a.aaa = b.aaa when matched then update set aaa = default when not
matched then insert (aaa) select '{1,2,3}';
ERROR:  42601: syntax error at or near "select"
LINE 1: ... aaa = default when not matched then insert (aaa) select '{1...
                                                             ^
LOCATION:  scanner_yyerror, scan.l:1092

But the docs imply otherwise -- "source_query -- A query (SELECT
statement or VALUES statement) that supplies the rows to be merged
into the target_table_name". Either the docs are wrong, or the code is
wrong. Hopefully you can just fix the code.

* Rules are not going to be supported, on the grounds that the
behavior is unclear, which I suppose is fine. But what about
ruleutils.c support?

That seems like entirely another matter to me. What about EXPLAIN,
etc? Deparse support seems to be considered generic infrastructure
that doesn't necessarily have much to do with the user-visible rules
feature.

* This restriction seems arbitrary and unjustified:

postgres=# merge into testoids a using (select 1 "key", 'foo' "data")
b on a.key = b.key when matched and a.oid = 5 then update set data =
b.data when not matched then insert (key, data) values (1, 'foo');
ERROR:  42P10: system column "oid" reference in WHEN AND condition is invalid
LINE 1: ...'foo' "data") b on a.key = b.key when matched and a.oid = 5 ...
                                                             ^
LOCATION:  scanRTEForColumn, parse_relation.c:738

* Wholerow vars are broken:

postgres=# merge into testoids a using (select 1 "key", 'foo' "data")
b on a.key = b.key when matched then update set data = b.*::text when
not matched then insert (key, data) values (1, 'foo');
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2351

That's all I have for now.

-- 
Peter Geoghegan
On 29 January 2018 at 20:41, Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Jan 29, 2018 at 6:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> New patch attached that correctly handles all known concurrency cases,
>> with expected test output.
>
> This revision, v13, seems much improved in this area.
>
>> This means MERGE will work just fine for "normal" UPDATEs, but it will
>> often fail (deterministically) in concurrent tests with mixed
>> insert/deletes or UPDATEs that touch the PK, as requested.
>
> * While looking at how you're handling concurrency/EPQ now, I noticed
> this code:
>
>> + if (!ExecQual(action->whenqual, econtext))
>> + {
>> +     if (BufferIsValid(buffer))
>> +         ReleaseBuffer(buffer);
>> +     continue;
>> + }
>
> (As well as its interactions with ExecUpdate() + ExecDelete(), which
> are significant.)
>
> The way that routines like ExecUpdate() interact with MERGE for WHEN
> qual + EPQ handling seems kind of convoluted. I hope for something
> cleaner in the next revision.

Cleaner?

> * This also stood out (not sure if you were referring/alluding to this
> in the quoted text):
>
>> + ereport(ERROR,
>> +         (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>> +          errmsg("could not serialize access due to concurrent update")));
>
> I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
> READ COMMITTED mode.

We use that code already in Hot Standby in READ COMMITTED mode.

What code should it be? It needs to be a retryable errcode.

> Anyway, I wonder why we shouldn't just go ahead
> and do the WHEN NOT MATCHED INSERT on the basis of information which is
> not visible to our snapshot. We choose to not UPDATE/DELETE on the
> basis of information from the future already, within EPQ. My first
> feeling is that this is a distinction without a difference, and you
> should actually go and INSERT at this point, though I reserve the
> right to change my mind about that.
>
> Yeah, EPQ semantics are icky, but would actually inserting here really
> be any worse than what we do?

I argued that was possible and desirable, but you argued it was not
and got everybody else to agree with you. I'm surprised to see you
change your mind on that.

At your request I have specifically written the logic to avoid that.
I'm happy with that, for now, since what we have is correct, even if
we can do better later.

We can have fun with looping, live locks and weird errors another day.
The SQL Standard says this area is implementation-defined.

> Other things I noticed (not related to concurrency) following a fairly
> quick pass:
>
> * Basic tab completion support would be nice.

OK, but as with most other patches, that can be done later.

> * The SQL standard explicitly imposes this limitation:
>
> postgres=# explain merge into cities a using cities b on a.city =
> b.city when matched and (select 1=1) then update set country =
> b.country;
> ERROR:  0A000: cannot use subquery in WHEN AND condition
>
> Why do you, though? Do we really need it? I'm just curious about your
> thoughts on it. To be clear, I'm not asserting that you're wrong.

Which limitation? Not allowing sub-selects? They are not supported, as
the message says.

> * ISTM that this patch should have inserted/updated/deleted rows, in
> roughly the same style as ON CONFLICT's EXPLAIN ANALYZE output. I
> mentioned this already, and you seemed unclear on what I meant.
> Hopefully my remarks here are clearer.

Yes, thanks. Can do.

> * Subselect handling is buggy:
>
> postgres=# merge into cities a using cities b on a.city = b.city when
> matched and a.city = 'Straffan' then update set country = (select
> 'Some country');
> ERROR:  XX000: unrecognized node type: 114
> LOCATION:  ExecInitExprRec, execExpr.c:2114

Not buggy; subselects are not supported in WHEN AND clauses because
they are not part of the planned query, nor can they be if we want to
handle the WHEN clause logic per spec.

> * Why no CTE support? SQL Server has this.

The SQL Standard doesn't require CTEs or RETURNING syntax, but they
could in time be supported.

> * The INSERT ... SELECT syntax doesn't work:
>
> postgres=# merge into array_test a using (select '{1,2,3}'::int[] aaa)
> b on a.aaa = b.aaa when matched then update set aaa = default when not
> matched then insert (aaa) select '{1,2,3}';
> ERROR:  42601: syntax error at or near "select"
>
> But docs imply otherwise -- "source_query -- A query (SELECT statement
> or VALUES statement) that supplies the rows to be merged into the
> target_table_name". Either the docs are wrong, or the code is wrong.
> Hopefully you can just fix the code.

Neither. The docs show that it is unsupported. The source query is the
USING phrase; there is no INSERT SELECT.

> * Rules are not going to be supported, on the grounds that the
> behavior is unclear, which I suppose is fine. But what about
> ruleutils.c support?
>
> That seems like entirely another matter to me. What about EXPLAIN,
> etc? Deparse support seems to be considered generic infrastructure
> that doesn't necessarily have much to do with the user-visible rules
> feature.

The MERGE query is a normal query, so that should all just work.
EXPLAIN is specifically regression tested.

> * This restriction seems arbitrary and unjustified:
>
> postgres=# merge into testoids a using (select 1 "key", 'foo' "data")
> b on a.key = b.key when matched and a.oid = 5 then update set data =
> b.data when not matched then insert (key, data) values (1, 'foo');
> ERROR:  42P10: system column "oid" reference in WHEN AND condition is invalid

I followed the comments on how we handle CHECK constraints.

It's hard to think of a real-world example that would use that; your
example seems strange. Why would you want to use Oids in the WHEN AND
clause? In the ON or DML clauses, sure, but not there.

This is a non-standard feature, so we can decide whether to support it
or not. Allowing them is easy, just not very meaningful. Not sure if
it has consequences.

> * Wholerow vars are broken:
>
> postgres=# merge into testoids a using (select 1 "key", 'foo' "data")
> b on a.key = b.key when matched then update set data = b.*::text when
> not matched then insert (key, data) values (1, 'foo');
> ERROR:  XX000: variable not found in subplan target lists
> LOCATION:  fix_join_expr_mutator, setrefs.c:2351
>
> That's all I have for now.

Good catch; that one is a valid error. I hadn't tried to either
support them or block them.

Maybe it's in the SQL Standard, not sure. Support for whole-row vars
probably isn't a priority, though.

Thanks for your comments.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 29, 2018 at 1:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The way that routines like ExecUpdate() interact with MERGE for WHEN
>> qual + EPQ handling seems kind of convoluted. I hope for something
>> cleaner in the next revision.
>
> Cleaner?

Yeah, cleaner. The fact that WHEN quals kind of participate in EPQ
evaluation without that being in execMain.c seems like it could be a
lot cleaner. I don't have more specifics than that right now.

>> I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
>> READ COMMITTED mode.
>
> We use that code already in Hot Standby in READ COMMITTED mode.
>
> What code should it be? It needs to be a retryable errcode.

I don't think that there should be any error, so I can't say.

>> Anyway, I wonder why we shouldn't just go ahead
>> and do the WHEN NOT MATCHED INSERT on the basis of information which is
>> not visible to our snapshot. We choose to not UPDATE/DELETE on the
>> basis of information from the future already, within EPQ. My first
>> feeling is that this is a distinction without a difference, and you
>> should actually go and INSERT at this point, though I reserve the
>> right to change my mind about that.
>>
>> Yeah, EPQ semantics are icky, but would actually inserting here really
>> be any worse than what we do?
>
> I argued that was possible and desirable, but you argued it was not
> and got everybody else to agree with you. I'm surprised to see you
> change your mind on that.

You're mistaken. Nothing I've said here is inconsistent with my
previous remarks on how we deal with concurrency.

> At your request I have specifically written the logic to avoid that.
> I'm happy with that, for now, since what we have is correct, even if
> we can do better later.
>
> We can have fun with looping, live locks and weird errors another day.
> SQL Standard says this area is implementation defined.

Who said anything about a loop? Where is the loop here?

I will rephrase what I said, to make it top-down rather than
bottom-up, which may make my intent clearer:

According to your documentation, "MERGE provides a single SQL
statement that can conditionally INSERT, UPDATE or DELETE rows, a task
that would otherwise require multiple procedural language statements".
But you're introducing a behavior/error that would not occur in
equivalent procedural client code consisting of an UPDATE followed by
a (conditionally executed) INSERT statement when run in READ COMMITTED
mode. You actually get exactly the concurrency issue that you cite as
unacceptable in justifying your serialization error with such
procedural code (when the UPDATE didn't affect any rows, only
following EPQ walking the UPDATE chain from the snapshot-visible
tuple, making the client code decide to do an INSERT on the basis of
information "from the future").

I think that it isn't this patch's job to make READ COMMITTED mode any
safer than it is in that existing scenario, a scenario that doesn't
involve ON CONFLICT at all.

>> * The SQL standard explicitly imposes this limitation:
>>
>> postgres=# explain merge into cities a using cities b on a.city =
>> b.city when matched and (select 1=1) then update set country =
>> b.country;
>> ERROR:  0A000: cannot use subquery in WHEN AND condition
>>
>> Why do you, though? Do we really need it? I'm just curious about your
>> thoughts on it. To be clear, I'm not asserting that you're wrong.
>
> Which limitation? Not allowing sub-selects? They are not supported, as
> the message says.

I'm simply asking you to explain why you think that it would be
problematic or even impossible to support it. The question is asked
without any agenda. I'm verifying my own understanding, as much as
anything else.

I've acknowledged that the standard has something to say on this that
supports your position, which has real weight.

>> * Subselect handling is buggy:
>>
>> postgres=# merge into cities a using cities b on a.city = b.city when
>> matched and a.city = 'Straffan' then update set country = (select
>> 'Some country');
>> ERROR:  XX000: unrecognized node type: 114
>> LOCATION:  ExecInitExprRec, execExpr.c:2114
>
> Not buggy, subselects are not supported in WHEN AND clauses because
> they are not part of the planned query, nor can they be if we want to
> handle the WHEN clause logic per spec.

I'm not asking about WHEN AND here (that was my last question). I'm
asking about a subselect that appears in the targetlist. (In any case,
"unrecognized node type: 114" seems buggy to me in any context.)

>> * Why no CTE support? SQL Server has this.
>
> The SQL Standard doesn't require CTEs or RETURNING syntax, but they
> could in time be supported.

No time like the present. Is there some reason why it would be
difficult with our implementation of CTEs? I can't think why it would
be.

>> But docs imply otherwise -- "source_query -- A query (SELECT statement
>> or VALUES statement) that supplies the rows to be merged into the
>> target_table_name". Either the docs are wrong, or the code is wrong.
>> Hopefully you can just fix the code.
>
> Neither. The docs show that is unsupported. The source query is the
> USING phrase, there is no INSERT SELECT.

My mistake.

>> * Rules are not going to be supported, on the grounds that the
>> behavior is unclear, which I suppose is fine. But what about
>> ruleutils.c support?
>>
>> That seems like entirely another matter to me. What about EXPLAIN,
>> etc? Deparse support seems to be considered generic infrastructure
>> that doesn't necessarily have much to do with the user-visible rules
>> feature.
>
> The MERGE query is a normal query, so that should all just work.

Look at get_query_def(), for example. That clearly isn't going to work
with MERGE.

> EXPLAIN is specifically regression tested.

You do very little with EXPLAIN right now, though. More importantly, I
think that this is considered a necessary piece of functionality, even
if no core code uses it. There are definitely third-party extensions
that use ruleutils in a fairly broad way.

> I followed the comments on how we handle CHECK constraints.
>
> It's hard to think of a real-world example that would use that; your
> example seems strange. Why would you want to use Oids in the WHEN AND
> clause? In the ON or DML clauses, sure, but not there.
>
> This is a non-standard feature, so we can decide whether to support
> that or not. Allowing them is easy, just not very meaningful. Not sure
> if it has consequences.

It's easier to just support them. That way we don't have to think
about it.

>> * Wholerow vars are broken:
>>
>> postgres=# merge into testoids a using (select 1 "key", 'foo' "data")
>> b on a.key = b.key when matched then update set data = b.*::text when
>> not matched then insert (key, data) values (1, 'foo');
>> ERROR:  XX000: variable not found in subplan target lists
>> LOCATION:  fix_join_expr_mutator, setrefs.c:2351
>>
>> That's all I have for now.
>
> Good catch; that one is a valid error. I hadn't tried to either
> support them or block them.
>
> Maybe it's in the SQL Standard, not sure. Support for whole-row vars
> probably isn't a priority, though.

I think that this needs to work, on general principle. Again, just
fixing it is much easier than arguing about it.

-- 
Peter Geoghegan
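[The "equivalent procedural client code" in the argument above is the classic two-statement upsert, roughly as follows; the table and values are illustrative, and READ COMMITTED isolation is assumed:]

```sql
-- Statement 1: try to update.
UPDATE stock SET qty = qty + 10 WHERE item_id = 1;

-- Statement 2: if the UPDATE reported 0 rows affected,
-- the client follows up with an INSERT.
INSERT INTO stock (item_id, qty) VALUES (1, 10);

-- Under READ COMMITTED, the UPDATE can report 0 rows precisely because
-- EPQ followed a concurrent update chain and the WHERE clause no longer
-- matched the updated tuple. The client's decision to INSERT is then
-- already based on information "from the future", which is the same
-- hazard cited in justifying MERGE's serialization error.
```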
On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote:
> In terms of timing of commits, I have marked the patch Ready For
> Committer. To me that signifies that it is ready for review by a
> Committer prior to commit.

My understanding of this meaning is different from yours. It should
not be the author's role to mark his own patch as ready for committer,
but the role of one or more people who have reviewed the proposed
patch and feature concepts in depth. If you can get a committer-level
individual to review your patch, then good for you. But review basics
need to happen first, and based on my rough read of this thread that
has not happened yet. Other people on this thread are pointing that
out as well.

-- 
Michael
On 29 January 2018 at 17:18, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2018-01-29 18:08 GMT+01:00 Simon Riggs <simon@2ndquadrant.com>:
>> On 29 January 2018 at 16:23, Chapman Flack <chap@anastigmatix.net> wrote:
>>> On 01/29/2018 11:13 AM, Simon Riggs wrote:
>>>> On 29 January 2018 at 15:44, Bruce Momjian <bruce@momjian.us> wrote:
>>>>> Uh, if we know we are going to get question on this, the patch had
>>>>> better have an explanation of when to use it. Pushing the problem to
>>>>> later doesn't seem helpful.
>>>>
>>>> What problem are you referring to?
>>>>
>>>> INSERT ON CONFLICT UPDATE does ...
>>>>
>>>> MERGE allows you to ...
>>>
>>> In my reading of Pavel and Bruce, the only 'problem' being suggested
>>> is that the patch hasn't added a bit of documentation somewhere that
>>> lays out the relationship between these two things, more or less as
>>> you just did.
>>
>> I am happy to write docs as requested.
>>
>> There are currently no docs saying when INSERT ON CONFLICT UPDATE
>> should be used other than the ref page for that command. There is no
>> mention of it in the "Data Manipulation" section of the docs.
>>
>> I've included docs for MERGE so it is mentioned in concurrency and
>> reference sections, so it follows the same model.
>>
>> Where would people like me to put these docs?
>
> Depends on size - small note can be placed in MERGE docs and link from
> INSERT ON CONFLICT DO.

I've put in cross-referencing comments in those two places.

v14 attached, with minor additions as requested.

Changes:
* Add: X-ref docs
* Add: New self-referencing test case
* Add: EXPLAIN ANALYZE output in the same style as INSERT .. ON CONFLICT
* Add: Allow Oids to be used in WHEN AND conditions
* Add: Prevent WHEN AND clause from writing data to the db, per SQL spec

I'll set up a wiki page to track open items.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 January 2018 at 22:31, Peter Geoghegan <pg@bowt.ie> wrote:
>> I argued that was possible and desirable, but you argued it was not
>> and got everybody else to agree with you. I'm surprised to see you
>> change your mind on that.
>
> You're mistaken. Nothing I've said here is inconsistent with my
> previous remarks on how we deal with concurrency.

Please see here:
https://www.postgresql.org/message-id/20171102191636.GA27644%40marmot

On 2 November 2017 at 19:16, Peter Geoghegan <pg@bowt.ie> wrote:
> Simon Riggs <simon@2ndquadrant.com> wrote:
>> So if I understand you correctly, in your view MERGE should just fail
>> with an ERROR if it runs concurrently with other DML?
>
> That's certainly my opinion on the matter. It seems like that might be
> the consensus, too.

You've changed your position, which is good, thanks. No problem at all.

The proposal you make here had already been discussed in detail by
Pavan and myself. My understanding of that discussion was that he
thinks it might be possible, but I had said we must stick to the
earlier agreement on how to proceed.

I am willing to try to produce fewer concurrency errors, since that
was an important point from earlier work. My only issue now is to make
sure this does not cause confusion, and to ask whether it changes the
views of others.

> According to your documentation, "MERGE provides a single SQL
> statement that can conditionally INSERT, UPDATE or DELETE rows, a task
> that would otherwise require multiple procedural language statements".
> But you're introducing a behavior/error that would not occur in
> equivalent procedural client code consisting of an UPDATE followed by
> a (conditionally executed) INSERT statement when run in READ COMMITTED
> mode. You actually get exactly the concurrency issue that you cite as
> unacceptable in justifying your serialization error with such
> procedural code (when the UPDATE didn't affect any rows, only
> following EPQ walking the UPDATE chain from the snapshot-visible
> tuple, making the client code decide to do an INSERT on the basis of
> information "from the future").

"You're introducing a behavior/error"... No, I advocate nothing in the
patch; I am merely following the agreement made here:
https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com

Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

I will discuss this in more detail at the Brussels Dev meeting and see
if we can achieve consensus on how to proceed.

v14 posted with changes requested by multiple people. Patch status:
Needs Review.

Current summary: 0 wrong answers; 2 ERRORs raised that need better
handling; some open requests for change/enhancement. I will open a
wiki page to track open items by the end of the week.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 29, 2018 at 10:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> The code is about 1200 lines and has extensive docs, comments and tests.
>
> There are no contentious infrastructure changes, so the debate around
> concurrency is probably the main one. So it looks to me like
> meaningful review has taken place, though I know Andrew and Pavan have
> also looked at it in detail.

Only design-level review, not detailed review of the code. To be
clear, I think the design-level review was quite productive and I'm
glad it happened, but it's not a substitute for someone going over the
code in detail to look for problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jan 29, 2018 at 7:15 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Jan 29, 2018 at 04:34:48PM +0000, Simon Riggs wrote:
>> In terms of timing of commits, I have marked the patch Ready For
>> Committer. To me that signifies that it is ready for review by a
>> Committer prior to commit.
>
> My understanding of this meaning is different than yours. It should not
> be the author's role to mark his own patch as ready for committer, but
> the role of one or more people who have reviewed in-depth the proposed
> patch and feature concepts. If you can get a committer-level individual
> to review your patch, then good for you. But review basics need to
> happen first. And based on my rough lookup of this thread this has not
> happened yet. Other people on this thread are pointing that out as
> well.

+1 to all of that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jan 29, 2018 at 12:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Partitioning doesn't look too bad, so that looks comfortable for PG11,
> assuming it doesn't hit some unhandled complexity.
>
> Including RLS in the first commit/release turns this into a high-risk
> patch. Few people use it, but if they do, they don't want me putting a
> hole in their battleship (literally) should we discover some weird
> unhandled logic in a complex new command.
>
> My recommendation would be to support that later for those that use
> it. For those that don't, it doesn't matter so can also be done later.

-1. Every other feature we've added recently, including partitioning,
has had to decide what to do about RLS before the initial commit, and
this feature shouldn't be exempt. In general, newer features need to
work with older features unless there is some extremely good
architectural reason why that is unreasonably difficult. If that is
the case here, I don't see that you've made an argument for it. The
proper way to avoid having you put a hole in their battleship is good
code, proper code review, and good testing, not leaving that case off
to one side.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 30, 2018 at 4:45 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> "You're introducing a behavior/error"... No, I advocate nothing in the
> patch, I am merely following the agreement made here:
>
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
>
> Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

Simon, I don't think you should represent Peter as having changed his
position when he has already said that he didn't. To be honest, the
discussion up to this point suggests to me that Peter has a
significantly better grip on the issues than you do, yet your posts
convey, at least to me, that you're quite sure he's wrong about a lot
of stuff, including whether or not his current statements are
compatible with his previous statements. I think that the appropriate
course of action is for you and him to spend a few more emails trying
to actually get on the same page here.

As far as I am able to understand, the substantive issue here is what
to do when we match an invisible tuple rather than a visible tuple.
The patch currently throws a serialization error on the basis that you
(Simon) thought that's what was previously agreed. Peter is arguing
that we don't normally issue a serialization error at READ COMMITTED
(which I think is true) and proposed that we instead try to INSERT. I
don't necessarily think that's different from the consensus to
implement option #3 from
https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
because that point #3 says that we're not going to try to AVOID errors
under concurrency, not that we're going to create NEW errors. In other
words, I understand Peter, then and now, to be saying that MERGE
should behave just as if invisible tuples didn't exist at all; if that
leads some other part of the system to throw an ERROR, then that's
what happens.
Presumably, in a case like this, that would be a common outcome,
because the merge would be performed on the basis of a unique key, and
so inserting would trigger a duplicate key violation. But maybe not,
because I don't think MERGE requires there to be a unique key on that
column, so maybe the insert would just work, or maybe the conflicting
transaction would abort just in time to let it work anyway.

It is possible that I am confused here, among other reasons because I
haven't read the code, but if I'm not confused then Peter's analysis
is correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 January 2018 at 16:27, Robert Haas <robertmhaas@gmail.com> wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed.

Correct. We discussed this and agreed on point (3) below:

> 3. Implement MERGE, but without attempting to avoid concurrent ERRORs (Peter)
>
> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

Acting in good faith, in respect of all of your wishes, I implemented
exactly that, and not what I had personally argued in favour of.

> Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT.

Which IMHO is case 4, since it would avoid a concurrent ERROR. This
meets exactly my original implementation goals as clearly stated on
this thread, so of course I agree with him and have already said I am
happy to change the code, though I am still wary of the dangers he
noted upthread.

If you now agree with doing that and are happy that there are no
dangers, then I'm happy we now have consensus again and we can
continue implementing MERGE for PG11. This is a good outcome; thanks.
Our users will be happy.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 30, 2018 at 11:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Which IMHO is case 4 since it would avoid a concurrent ERROR. This
> meets exactly my original implementation goals as clearly stated on
> this thread, so of course I agree with him and have already said I am
> happy to change the code, though I am still wary of the dangers he
> noted upthread.
>
> If you now agree with doing that and are happy that there are no
> dangers, then I'm happy we now have consensus again and we can
> continue implementing MERGE for PG11.

I can't certify that there are no dangers because I haven't studied it
in that much detail, and I still don't think this is the same thing as
#4, for the reasons I already stated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 30, 2018 at 8:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed. Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT. I
> don't necessarily think that's different from consensus to implement
> option #3 from
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
> because that point #3 says that we're not going to try to AVOID errors
> under concurrency, not that we're going to create NEW errors.
> In other words, I understand Peter, then and now, to be saying that MERGE
> should behave just as if invisible tuples didn't exist at all; if that
> leads some other part of the system to throw an ERROR, then that's
> what happens.

Yes, I am still saying that.

What's at issue here specifically is the exact behavior of
EvalPlanQual() in the context of having *multiple* sets of WHEN quals
that need to be evaluated one at a time (in addition to conventional
EPQ join quals). This is a specific, narrow question about the exact
steps that are taken by EPQ when we have to switch between WHEN
MATCHED and WHEN NOT MATCHED cases *as we walk the UPDATE chain*.

Right now, I suspect that we will require some minor variation of
EPQ's logic to account for new risks. The really interesting question
is what happens when we walk the UPDATE chain, while reevaluating EPQ
quals alongside WHEN quals, and then determine that no UPDATE/DELETE
should happen for the first WHEN case -- what then?
I suspect that we may not want to start from scratch (from the
MVCC-visible tuple) as we reach the second or subsequent WHEN case,
but that's a very tentative view, and I definitely want to hear more
opinions on it. (Simon wants to just throw a serialization error here
instead, even in READ COMMITTED mode, which I see as a cop-out.)

Note in particular that this EPQ question has nothing to do with
seeing tuples that are not either visible to our MVCC snapshot, or
visible to EPQ through an UPDATE chain (which starts from the MVCC
visible tuple). The idea that I have done some kind of about-face on
how concurrency should work is just plain wrong. It is not a helpful
way of framing things. What I am talking about here is very
complicated, but also really narrow.

> Presumably, in a case like this, that would be a common
> outcome, because the merge would be performed on the basis of a unique
> key and so inserting would trigger a duplicate key violation. But
> maybe not, because I don't think MERGE requires there to be a unique
> key on that column, so maybe the insert would just work, or maybe the
> conflicting transaction would abort just in time to let it work
> anyway.

I think that going on to INSERT, having decided against an UPDATE only
having done an EPQ walk (rather than throwing a serialization error),
is very likely to result in the INSERT succeeding, actually. But there
is no guarantee that you won't get a duplicate violation, because
there is nothing to stop a concurrent *INSERT* with the same PK value.
(That's something that's *always* true, regardless of whether or not
somebody needs to do EPQ.)

--
Peter Geoghegan
On Tue, Jan 30, 2018 at 8:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Peter is arguing
>> that we don't normally issue a serialization error at READ COMMITTED
>> (which I think is true) and proposed that we instead try to INSERT.
>
> Which IMHO is case 4 since it would avoid a concurrent ERROR.

It would avoid the error that you added in v13 of your patch, a type
of error that I never proposed or even considered.

> This meets exactly my original implementation goals as clearly stated on
> this thread

I'm glad that you see it that way. I didn't and don't, though I
objected to your characterization mostly because it muddied some
already rather muddy waters. I'm happy to let it go now, though.

> If you now agree with doing that and are happy that there are no
> dangers, then I'm happy we now have consensus again and we can
> continue implementing MERGE for PG11.

My outline from earlier today about EPQ handling, and how it needs to
deal with multiple independent sets of WHEN ... AND quals, is, as I
said, very tentative. There isn't even consensus in my own mind about
this, much less between everyone that has taken an interest in this
project.

I'm glad that we all seem to agree that serialization failures as a
way of dealing with concurrency issues in READ COMMITTED mode are a
bad idea. Unfortunately, I still think that we have a lot of work
ahead of us when it comes to agreeing on the right semantics for READ
COMMITTED conflict handling with multiple WHEN ... AND quals.

I see that your v14 still has the serialization error, even though
it's now clear that nobody wants to go that way. So...where do we go
from here? (For the avoidance of doubt, this is *not* a rhetorical
question.)

--
Peter Geoghegan
On Tue, Jan 30, 2018 at 8:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed. Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT. I
> don't necessarily think that's different from consensus to implement
> option #3 from
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
> because that point #3 says that we're not going to try to AVOID errors
> under concurrency, not that we're going to create NEW errors.

I should have mentioned earlier that you have this exactly right: I do
not want to make any special effort to avoid duplicate violation
errors. I also don't want to create any novel new kind of error (e.g.,
READ COMMITTED serialization errors).

That having been said, I think that Simon was correct to propose a
novel solution. It just seemed like READ COMMITTED serialization
errors were the wrong novel solution, because that takes the easy way
out. ISTM that the right thing is to adapt EvalPlanQual() (or READ
COMMITTED conflict handling more generally) to the new complication
that is multiple "WHEN ... AND" quals (which need to be evaluated one
at a time, a bona fide new requirement). In short, his novel solution
seemed much too novel.

As I've pointed out already, we will define MERGE to users as
something that "provides a single SQL statement that can conditionally
INSERT, UPDATE or DELETE rows, a task that would otherwise require
multiple procedural language statements".
I believe that MERGE's charter should be to live up to that definition
in the least surprising way possible, up to and including preserving
the maybe-surprising aspects of how multiple procedural language
statements can behave when the system does READ COMMITTED conflict
handling.

That's my opinion in a nutshell.

--
Peter Geoghegan
On Tue, Jan 30, 2018 at 2:28 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> What's at issue here specifically is the exact behavior of
> EvalPlanQual() in the context of having *multiple* sets of WHEN quals
> that need to be evaluated one at a time (in addition to conventional
> EPQ join quals). This is a specific, narrow question about the exact
> steps that are taken by EPQ when we have to switch between WHEN
> MATCHED and WHEN NOT MATCHED cases *as we walk the UPDATE chain*.
>
> Right now, I suspect that we will require some minor variation of
> EPQ's logic to account for new risks. The really interesting question
> is what happens when we walk the UPDATE chain, while reevaluating EPQ
> quals alongside WHEN quals, and then determine that no UPDATE/DELETE
> should happen for the first WHEN case -- what then? I suspect that we
> may not want to start from scratch (from the MVCC-visible tuple) as we
> reach the second or subsequent WHEN case, but that's a very tentative
> view, and I definitely want to hear more opinions on it. (Simon wants
> to just throw a serialization error here instead, even in READ
> COMMITTED mode, which I see as a cop-out.)

I don't fully grok merge but suppose you have:

WHEN MATCHED AND a = 0 THEN UPDATE ...
WHEN MATCHED AND a = 1 THEN UPDATE ...
WHEN NOT MATCHED THEN INSERT ...

Suppose you match a tuple with a = 0 but, upon trying to update it,
find that it's been updated to a = 1. It seems like there are a few
possible behaviors:

1. Throw an error! I guess this is what the patch does now.

2. Do absolutely nothing. I think this is what would happen with an
ordinary UPDATE; the tuple fails the EPQ recheck and so is not
updated, but that doesn't trigger anything else.

3. Fall through to the NOT MATCHED clause and try that instead.
Allows MERGE to work as UPSERT in some simple cases, I think.

4. Continue walking the chain of WHEN MATCHED items in order and test
them against the new tuple.
This is actually pretty weird, because a 0->1 update will fall through
to the second UPDATE rule, but a 1->0 update will fall through to the
NOT MATCHED clause.

5. Retry from the top of the chain with the updated tuple. Could
theoretically livelock - not sure how much of a risk that is in
practice.

Maybe there are more options?

My initial reaction is to wonder what's wrong with #2.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
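For concreteness, the three-branch example above might be written out as follows. The table, join source, and SET targets are hypothetical (not from the patch's tests), using the MERGE syntax proposed in the patch:

```sql
CREATE TABLE target (key int PRIMARY KEY, a int, data text);

MERGE INTO target t
USING (VALUES (1)) AS s (key)
ON t.key = s.key
WHEN MATCHED AND t.a = 0 THEN UPDATE SET data = 'a was 0'
WHEN MATCHED AND t.a = 1 THEN UPDATE SET data = 'a was 1'
WHEN NOT MATCHED THEN INSERT (key, a, data) VALUES (s.key, 0, 'new');

-- The concurrent change that creates the dilemma: between the MERGE
-- taking its snapshot and locking the row, another session commits:
--   UPDATE target SET a = 1 WHERE key = 1;
```

The question in behaviors 1-5 is which WHEN branch, if any, the MERGE should take once EPQ has followed the update chain to the a = 1 version of the row.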
On Wed, Jan 31, 2018 at 7:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't fully grok merge but suppose you have:
>
> WHEN MATCHED AND a = 0 THEN UPDATE ...
> WHEN MATCHED AND a = 1 THEN UPDATE ...
> WHEN NOT MATCHED THEN INSERT ...
>
> Suppose you match a tuple with a = 0 but, upon trying to update it,
> find that it's been updated to a = 1. It seems like there are a few
> possible behaviors:
>
> 1. Throw an error! I guess this is what the patch does now.

Right.

> 2. Do absolutely nothing. I think this is what would happen with an
> ordinary UPDATE; the tuple fails the EPQ recheck and so is not
> updated, but that doesn't trigger anything else.

I think #2 is fine if you're talking about join quals. Which, of
course, you're not. These WHEN quals really do feel like
tuple-at-a-time procedural code, more than set-oriented quals (if that
wasn't true, we'd have to allow cardinality violations, which we at
least try to avoid). Simon said something like "the SQL standard
requires that WHEN quals be evaluated first" at one point, which makes
sense to me.

> 3. Fall through to the NOT MATCHED clause and try that instead.
> Allows MERGE to work as UPSERT in some simple cases, I think.

I probably wouldn't put it that way myself, FWIW.

> 4. Continue walking the chain of WHEN MATCHED items in order and test
> them against the new tuple. This is actually pretty weird because a
> 0->1 update will fall through to the second UPDATE rule, but a 1->0
> update will fall through to the NOT MATCHED clause.

You have two WHEN MATCHED cases here, which is actually quite a
complicated example. If you didn't, then IIUC there would be no
distinction between #3 and #4.

Whether or not this "pretty weird" behavior would be weirder than
equivalent procedural code consisting of multiple (conditionally
executed) DML statements is subjective.
If you imagine that the equivalent procedural code is comprised of two
UPDATE statements and an INSERT (one statement for every WHEN case),
then it's not weird, I think, or at least no weirder. If you imagine
that it's only one UPDATE (plus an INSERT), then it does indeed seem
weirder. I'm inclined to think of it as two UPDATE statements (that
can only affect zero or one tuples) rather than one (the statements
are inside a loop that processes many input rows, equivalent to the
left side of MERGE's join). After all, it seems very likely that one
of the two WHEN MATCHED items would actually end up containing a
DELETE in real-world queries, and not another UPDATE. That's why I
lean towards #4 ever so slightly right now.

> 5. Retry from the top of the chain with the updated tuple. Could
> theoretically livelock - not sure how much of a risk that is in
> practice.

I'd say the livelock risk is non-zero, but it might still be worth it.
Isn't this like rolling back and repeating the statement in most
real-world cases? Apparently READ COMMITTED conflict handling in a
system that's similar to Postgres occurs through a statement-level
rollback. A little bird told me that it can repeat again and again,
and that there is a little-known mechanism for that to eventually
error out after a fixed number of retries. It might be desirable to
emulate that in a rudimentary way -- by implementing #5. This doesn't
seem all that appealing to me right now, though.

> Maybe there are more options?

Probably.

Minor point on semantics: there is clearly a two-phase nature to WHEN
quals, which is the actual structure that Simon chose. Technically,
what you described wouldn't ever require an EPQ recheck -- it might
require a WHEN recheck. I think we should start being careful about
which of the two we're talking about going forward. Hopefully Simon's
MERGE wiki page can establish a standard lexicon to talk about this
stuff without everyone becoming even more confused. That seems like it
would be a big help.
-- Peter Geoghegan
On 31 January 2018 at 15:17, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 30, 2018 at 2:28 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> What's at issue here specifically is the exact behavior of
>> EvalPlanQual() in the context of having *multiple* sets of WHEN quals
>> that need to be evaluated one at a time (in addition to conventional
>> EPQ join quals). This is a specific, narrow question about the exact
>> steps that are taken by EPQ when we have to switch between WHEN
>> MATCHED and WHEN NOT MATCHED cases *as we walk the UPDATE chain*.
>>
>> Right now, I suspect that we will require some minor variation of
>> EPQ's logic to account for new risks. The really interesting question
>> is what happens when we walk the UPDATE chain, while reevaluating EPQ
>> quals alongside WHEN quals, and then determine that no UPDATE/DELETE
>> should happen for the first WHEN case -- what then? I suspect that we
>> may not want to start from scratch (from the MVCC-visible tuple) as we
>> reach the second or subsequent WHEN case, but that's a very tentative
>> view, and I definitely want to hear more opinions on it. (Simon wants
>> to just throw a serialization error here instead, even in READ
>> COMMITTED mode, which I see as a cop-out.)
>
> I don't fully grok merge but suppose you have:
>
> WHEN MATCHED AND a = 0 THEN UPDATE ...
> WHEN MATCHED AND a = 1 THEN UPDATE ...
> WHEN NOT MATCHED THEN INSERT ...
>
> Suppose you match a tuple with a = 0 but, upon trying to update it,
> find that it's been updated to a = 1. It seems like there are a few
> possible behaviors:
>
> 1. Throw an error! I guess this is what the patch does now.
>
> 2. Do absolutely nothing. I think this is what would happen with an
> ordinary UPDATE; the tuple fails the EPQ recheck and so is not
> updated, but that doesn't trigger anything else.
>
> 3. Fall through to the NOT MATCHED clause and try that instead.
> Allows MERGE to work as UPSERT in some simple cases, I think.
> 4. Continue walking the chain of WHEN MATCHED items in order and test
> them against the new tuple. This is actually pretty weird because a
> 0->1 update will fall through to the second UPDATE rule, but a 1->0
> update will fall through to the NOT MATCHED clause.
>
> 5. Retry from the top of the chain with the updated tuple. Could
> theoretically livelock - not sure how much of a risk that is in
> practice.
>
> Maybe there are more options?
>
> My initial reaction is to wonder what's wrong with #2.

No, we don't throw an ERROR in that case, because it is a simple
variation of existing EvalPlanQual behavior.

#2 is possible, yes, and is how we had it coded in v11.

#4 was how I first assumed it had to work, but it gives the wrong
answer in some cases and the right answer in others, depending upon
the order of the WHEN clauses. That was ruled out as inconsistent.

#5 is what the patch does now. There are tests covering that behavior
in specs/merge-match-recheck.spec.

There are more complex cases to consider. If a concurrent DELETE hits,
then we can try #3, i.e. changing MATCHED to NOT MATCHED. That
currently throws an error, as requested. It looks to be possible, but
it would require some variation of EvalPlanQual. My prototype of that
doesn't yet work, so I can't yet confirm whether it is even possible.
If it is, I will submit it as an option for PG11.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
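A two-session timeline of the kind exercised by tests like merge-match-recheck.spec might look like this (session numbering, table, and values are illustrative only, not copied from the actual spec file):

```sql
-- Session 1
BEGIN;
UPDATE target SET a = 1 WHERE key = 1;   -- acquires the row lock

-- Session 2 (blocks, waiting on Session 1's row lock)
MERGE INTO target t
USING (VALUES (1)) AS s (key)
ON t.key = s.key
WHEN MATCHED AND t.a = 0 THEN UPDATE SET data = 'zero branch'
WHEN MATCHED AND t.a = 1 THEN UPDATE SET data = 'one branch'
WHEN NOT MATCHED THEN INSERT (key, a, data) VALUES (s.key, 0, 'new');

-- Session 1
COMMIT;
-- Session 2 now wakes up, rechecks the WHEN quals from the top of the
-- chain against the updated tuple (behavior #5), and takes the
-- 'one branch' action.
```

This is a sketch of behavior #5 as described above; whether a concurrent DELETE can similarly fall through to the NOT MATCHED branch (#3) is the open question being prototyped.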
On 30 January 2018 at 21:47, Peter Geoghegan <pg@bowt.ie> wrote:
> I'm glad that we all seem to agree that serialization failures as a
> way of dealing with concurrency issues in READ COMMITTED mode are a
> bad idea.

ERRORs are undesirable, yet safe and correct. It is as yet unclear
whether we can do better correctly in all cases, or whether a
practical subset exists.

> Unfortunately, I still think that we have a lot of work
> ahead of us when it comes to agreeing to the right semantics with READ
> COMMITTED conflict handling with multiple WHEN ... AND quals.

OK

> I see that your v14 still has the serialization error, even though
> it's now clear that nobody wants to go that way. So...where do we go
> from here? (For the avoidance of doubt, this is *not* a rhetorical
> question.)

This way forward is new. We're trying it, but it may not be possible;
I haven't made it work yet. If we can find a way to do this, we will.
If you want to propose some code, please do.

I think it would be very helpful if we could discuss everything with
direct relevance to v14, so this becomes a patch review, not just a
debate. i.e. which isolation test would we like to change from ERROR
to success? Or which new test would you like to add?

Thanks

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1 February 2018 at 12:45, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think it would be very helpful if we could discuss everything with
> direct relevance to v14, so this becomes a patch review, not just a
> debate.
>
> i.e. which isolation test would we like to change from ERROR to
> success? or which new test would you like to add?

In my understanding, we are discussing changing the potential outcome
from concurrent operations for the small subset of test results noted
in the attached patch-on-patch. (This will break the patch tester.)

If you can confirm these are the ones we are discussing, and say what
you think the output should be, that will help us be very specific.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 1, 2018 at 4:45 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think it would be very helpful if we could discuss everything with
> direct relevance to v14, so this becomes a patch review, not just a
> debate.

I wish I could give you a clear answer on the way forward with total
confidence, or even moderate confidence, but right now I can't.
Hopefully I'll be able to convince myself that I've understood all the
nuances of this EPQ + WHEN ... AND qual business shortly, perhaps in
the next couple of days, at which point I'll have a position that I'm
willing to defend. I don't even have that much right now.

On a more positive note, I do agree with you that this only affects a
small subset of real-world queries (even if we assume READ COMMITTED
conflicts are common). To put it another way, I would be able to give
you a simple opinion right now if WHEN ... AND quals were prohibited.
Not that I'm proposing actually prohibiting them. (BTW, don't you
think it's interesting that Oracle doesn't allow them, but instead
requires WHERE clauses, even on an INSERT?)

Leaving concurrency aside for a moment, I want to talk about bugs. I
read through this:

https://wiki.postgresql.org/wiki/SQL_MERGE_Patch_Status#Bugs

The suggestion you make there -- that the only problem with wholerow
vars is a weird error message (when they're actually unsupported) --
is not helping your cause. While I think that users will expect
wholerow vars to work, that isn't really the main point. I tried using
wholerow vars as a smoke test for setrefs.c handling, as well as the
handling of targetlists within the rewriter. The fact that wholerow
vars don't work is likely indicative of your general approach in these
areas needing to be thought through in more detail. That's the really
important thing. Wholerow vars shouldn't be a special case to the
underlying implementation, and if you found a way to make them work
that was itself a special case, that would seem questionable to me for
similar reasons.
Sometimes wholerow vars actually *do* work with your patch -- the
problem only happens with data_source wholerow vars, and never
target_table_name wholerow vars:

postgres=# merge into testoids a
             using (select i "key", 'foo' "data" from generate_series(0,3) i) b
             on a.key = b.key
           when matched then update set data = a.*::text
           when not matched then insert (key, data) values (b.key, 'foo');
MERGE 4
postgres=# merge into testoids a
             using (select i "key", 'foo' "data" from generate_series(0,3) i) b
             on a.key = b.key
           when matched then update set data = b.*::text
           when not matched then insert (key, data) values (b.key, 'foo');
ERROR:  variable not found in subplan target lists

There is also the matter of subselects in the update targetlist, which
you similarly claim is only a problem of having the wrong error
message. The idea that those are unsupported for any principled reason
doesn't have any justification (unlike WHEN ... AND quals and their
restrictions, which I agree are necessary). It clearly works with
Oracle, for example:

http://sqlfiddle.com/#!4/2d5405/10

You're reusing set_clause_list in the grammar, so I don't see why it
shouldn't work within MERGE in just the same way as it works in
UPDATE. While I think that there is a legitimate need for restrictions
on some merge_when_clause cases, such as the VALUES() of merge_insert,
this isn't an example of that. Again, this suggests to me a need for
more work within the optimizer.

Finally, I noticed a problem with your new EXPLAIN ANALYZE
instrumentation. Is it 4 rows inserted, or 0 inserted?
postgres=# merge into testoids a
             using (select i "key", 'foo' "data" from generate_series(0,3) i) b
             on a.key = b.key
           when matched and 1=0 then update set data = b.data
           when not matched then insert (key, data) values (b.key, 'foo');
MERGE 0
postgres=# explain analyze merge into testoids a
             using (select i "key", 'foo' "data" from generate_series(0,3) i) b
             on a.key = b.key
           when matched and 1=0 then update set data = b.data
           when not matched then insert (key, data) values (b.key, 'foo');
                                                           QUERY PLAN
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Merge on testoids a  (cost=38.58..61.19 rows=1000 width=42) (actual time=0.043..0.049 rows=4 loops=1)
   Tuples Inserted: 4
   Tuples Updated: 0
   Tuples Deleted: 0
   ->  Hash Left Join  (cost=38.58..61.19 rows=1000 width=42) (actual time=0.039..0.043 rows=4 loops=1)
         Hash Cond: (i.i = a.key)
         ->  Function Scan on generate_series i  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.009..0.010 rows=4 loops=1)
         ->  Hash  (cost=22.70..22.70 rows=1270 width=10) (actual time=0.021..0.021 rows=4 loops=1)
               Buckets: 2048  Batches: 1  Memory Usage: 17kB
               ->  Seq Scan on testoids a  (cost=0.00..22.70 rows=1270 width=10) (actual time=0.014..0.016 rows=4 loops=1)
 Planning time: 0.202 ms
 Execution time: 0.109 ms
(12 rows)

(It should be 0 rows inserted, not 4.)

--
Peter Geoghegan
On Thu, Feb 1, 2018 at 11:39 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> There is also the matter of subselects in the update targetlist, which
> you similarly claim is only a problem of having the wrong error
> message. The idea that those are unsupported for any principled reason
> doesn't have any justification (unlike WHEN ... AND quals, and their
> restrictions, which I agree are necessary). It clearly works with
> Oracle, for example:
>
> http://sqlfiddle.com/#!4/2d5405/10
>
> You're reusing set_clause_list in the grammar, so I don't see why it
> shouldn't work within MERGE in just the same way as it works in
> UPDATE.

Actually, I now wonder if there is a good reason for restrictions (e.g. no subselects) on WHEN ... AND quals, too. See this SQL fiddle from SQL Server:

http://sqlfiddle.com/#!18/8acef/27

I started looking at SQL Server's MERGE to verify that it also does not impose any restrictions on subselects in a MERGE UPDATE's targetlist, just like Oracle. Unsurprisingly, it does not. More surprisingly, I noticed that it also doesn't seem to impose restrictions on what can appear in WHEN ... AND quals. Most surprisingly of all, even the main join ON condition itself can have subselects (though that's probably a bad idea).

What this boils down to is that I don't think that this part of your design is committable (from your recent v14):

>> + * As top-level statements INSERT, UPDATE and DELETE have a Query,
>> + * whereas with MERGE the individual actions do not require
>> + * separate planning, only different handling in the executor.
>> + * See nodeModifyTable handling of commandType CMD_MERGE.
>> + *
>> + * A sub-query can include the Target, but otherwise the sub-query
>> + * cannot reference the outermost Target table at all.
>> + */
>> + qry->querySource = QSRC_PARSER;
>> + joinexpr = makeNode(JoinExpr);
>> + joinexpr->isNatural = false;
>> + joinexpr->alias = NULL;
>> + joinexpr->usingClause = NIL;
>> + joinexpr->quals = stmt->join_condition;

I am willing to continue to engage with you on the concurrency issues for the time being, since that is the most pressing issue for the patch. We can get to this stuff later. Note that I consider cleaning up the Query and plan representations to be a prerequisite to commit.

--
Peter Geoghegan
Hi,

as promised in Brussels, I taught sqlsmith about MERGE and did some testing with merge.v14.patch applied on master at 9aef173163.

So far, it found a couple of failing assertions and a suspicious error message. Details and testcases against the regression database below.

regards,
Andreas

-- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_SubLink))", File: "clauses.c", Line: 440)
MERGE INTO public.brin_test as target_0
USING pg_catalog.pg_database as ref_0
    left join pg_catalog.pg_user_mapping as sample_0 tablesample system (2.3)
    on (pg_catalog.mul_d_interval(
          cast(pg_catalog.pi() as float8),
          cast(case when sample_0.umoptions is not NULL then (select write_lag from pg_catalog.pg_stat_replication limit 1 offset 2)
               else (select write_lag from pg_catalog.pg_stat_replication limit 1 offset 2)
               end
          as "interval")) = (select intervalcol from public.brintest limit 1 offset 2)
    )
ON target_0.a = ref_0.encoding
WHEN NOT MATCHED AND cast(null as "timestamp") < cast(null as date)
  THEN INSERT VALUES ( 62, 6)
WHEN NOT MATCHED
  AND false
  THEN DO NOTHING;

-- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_SubLink))", File: "prepunion.c", Line: 2246)
MERGE INTO public.onek2 as target_0
USING public.prt1 as ref_0
    inner join public.tenk1 as ref_1
    on ((select t from public.btree_tall_tbl limit 1 offset 63)
        is not NULL)
ON target_0.stringu1 = ref_1.stringu1
WHEN NOT MATCHED THEN DO NOTHING;

-- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_Query))", File: "var.c", Line: 248)
MERGE INTO public.clstr_tst_inh as target_0
USING pg_catalog.pg_statio_sys_tables as ref_0
    left join public.rule_and_refint_t3 as ref_1
    on (((ref_0.heap_blks_hit is not NULL)
         or (((select f1 from public.path_tbl limit 1 offset 5)
              >= (select thepath from public.shighway limit 1 offset 33))
             or (cast(null as tsvector) <> cast(null as tsvector))))
        and (ref_0.toast_blks_read is not NULL))
ON target_0.d = ref_1.data
WHEN NOT MATCHED
  AND cast(null as int2) = pg_catalog.lastval()
  THEN DO NOTHING;

-- ERROR: unrecognized node type: 114
MERGE INTO public.found_test_tbl as target_0
USING public.itest7a as ref_0
ON target_0.a = ref_0.a
WHEN NOT MATCHED
  THEN INSERT VALUES ((select a from public.rtest_t3 limit 1 offset 6));
On 2 February 2018 at 01:59, Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Feb 1, 2018 at 11:39 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> There is also the matter of subselects in the update targetlist, which
>> you similarly claim is only a problem of having the wrong error
>> message. The idea that those are unsupported for any principled reason
>> doesn't have any justification (unlike WHEN ... AND quals, and their
>> restrictions, which I agree are necessary). It clearly works with
>> Oracle, for example:
>>
>> http://sqlfiddle.com/#!4/2d5405/10
>>
>> You're reusing set_clause_list in the grammar, so I don't see why it
>> shouldn't work within MERGE in just the same way as it works in
>> UPDATE.
>
> Actually, I now wonder if there is a good reason for restrictions
> (e.g. no subselects) on WHEN ... AND quals, too. See this SQL fiddle
> from SQL Server:
>
> http://sqlfiddle.com/#!18/8acef/27
>
> I started looking at SQL Server's MERGE to verify that it also does
> not impose any restrictions on subselects in a MERGE UPDATE's
> targetlist, just like Oracle. Unsurprisingly, it does not. More
> surprisingly, I noticed that it also doesn't seem to impose
> restrictions on what can appear in WHEN ... AND quals.

You earlier agreed that subselects were not part of the Standard.

> Most
> surprisingly of all, even the main join ON condition itself can have
> subselects (though that's probably a bad idea).

That should be supported, though I can't think of why you'd want that either.

> What this boils down to is that I don't think that this part of your
> design is committable (from your recent v14):

So your opinion is that because the v14 patch doesn't include a feature extension that is in Oracle and SQL Server we cannot commit this patch. There are quite a few minor additional things in that category, and the syntax of those two differs, so it's clearly impossible to match both exactly.

That seems like poor reasoning on why we should block the patch.
If you would like to say how you think the design should look, it might be possible to change it for this release. Changing it in the future would not be blocked by committing without that.

>>> + * As top-level statements INSERT, UPDATE and DELETE have a Query,
>>> + * whereas with MERGE the individual actions do not require
>>> + * separate planning, only different handling in the executor.
>>> + * See nodeModifyTable handling of commandType CMD_MERGE.
>>> + *
>>> + * A sub-query can include the Target, but otherwise the sub-query
>>> + * cannot reference the outermost Target table at all.
>>> + */
>>> + qry->querySource = QSRC_PARSER;
>>> + joinexpr = makeNode(JoinExpr);
>>> + joinexpr->isNatural = false;
>>> + joinexpr->alias = NULL;
>>> + joinexpr->usingClause = NIL;
>>> + joinexpr->quals = stmt->join_condition;
>
> I am willing to continue to engage with you on the concurrency issues
> for the time being, since that is the most pressing issue for the
> patch. We can get to this stuff later.

There are no concurrency issues. The patch gives the correct answer in all cases, or an error to avoid problems. We've agreed that it is desirable we remove some of those errors if possible, though they are there as a result of our earlier discussions.

> Note that I consider cleaning
> up the Query and plan representations to be prerequisite to commit.

You'll need to be more specific. Later patches do move some things.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3 February 2018 at 19:57, Andreas Seltenreich <seltenreich@gmx.de> wrote:
> as promised in Brussels, I taught sqlsmith about MERGE and did some
> testing with merge.v14.patch applied on master at 9aef173163.
>
> So far, it found a couple of failing assertions and a suspicious error
> message. Details and testcases against the regression database below.

Brilliant work, thank you.

It will likely take some time to work through these and the current work items but will fix.

Do you have the DDL so we can recreate these easily?

Thanks

> regards,
> Andreas
>
> -- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_SubLink))", File: "clauses.c", Line: 440)
> MERGE INTO public.brin_test as target_0
> USING pg_catalog.pg_database as ref_0
>     left join pg_catalog.pg_user_mapping as sample_0 tablesample system (2.3)
>     on (pg_catalog.mul_d_interval(
>           cast(pg_catalog.pi() as float8),
>           cast(case when sample_0.umoptions is not NULL then (select write_lag from pg_catalog.pg_stat_replication limit 1 offset 2)
>                else (select write_lag from pg_catalog.pg_stat_replication limit 1 offset 2)
>                end
>           as "interval")) = (select intervalcol from public.brintest limit 1 offset 2)
>     )
> ON target_0.a = ref_0.encoding
> WHEN NOT MATCHED AND cast(null as "timestamp") < cast(null as date)
>   THEN INSERT VALUES ( 62, 6)
> WHEN NOT MATCHED
>   AND false
>   THEN DO NOTHING;
>
> -- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_SubLink))", File: "prepunion.c", Line: 2246)
> MERGE INTO public.onek2 as target_0
> USING public.prt1 as ref_0
>     inner join public.tenk1 as ref_1
>     on ((select t from public.btree_tall_tbl limit 1 offset 63)
>         is not NULL)
> ON target_0.stringu1 = ref_1.stringu1
> WHEN NOT MATCHED THEN DO NOTHING;
>
> -- TRAP: FailedAssertion("!(!((((const Node*)(node))->type) == T_Query))", File: "var.c", Line: 248)
> MERGE INTO public.clstr_tst_inh as target_0
> USING pg_catalog.pg_statio_sys_tables as ref_0
>     left join public.rule_and_refint_t3 as ref_1
>     on (((ref_0.heap_blks_hit is not NULL)
>          or (((select f1 from public.path_tbl limit 1 offset 5)
>               >= (select thepath from public.shighway limit 1 offset 33))
>              or (cast(null as tsvector) <> cast(null as tsvector))))
>         and (ref_0.toast_blks_read is not NULL))
> ON target_0.d = ref_1.data
> WHEN NOT MATCHED
>   AND cast(null as int2) = pg_catalog.lastval()
>   THEN DO NOTHING;
>
> -- ERROR: unrecognized node type: 114
> MERGE INTO public.found_test_tbl as target_0
> USING public.itest7a as ref_0
> ON target_0.a = ref_0.a
> WHEN NOT MATCHED
>   THEN INSERT VALUES ((select a from public.rtest_t3 limit 1 offset 6));

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1 February 2018 at 19:39, Peter Geoghegan <pg@bowt.ie> wrote:
> Finally, I noticed a problem with your new EXPLAIN ANALYZE instrumentation:
>
> Is it 4 rows inserted, or 0 inserted?
>
> postgres=# merge into testoids a using (select i "key", 'foo' "data"
> from generate_series(0,3) i) b on a.key = b.key when matched and 1=0
> then update set data = b.data when not matched then insert (key, data)
> values (b.key, 'foo');
> MERGE 0

Got it. I'm reporting the number of rows processed instead of the number of rows inserted. My test happened to have those values set equal. Minor bug, thanks for spotting.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Feb 3, 2018 at 2:15 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I started looking at SQL Server's MERGE to verify that it also does
>> not impose any restrictions on subselects in a MERGE UPDATE's
>> targetlist, just like Oracle. Unsurprisingly, it does not. More
>> surprisingly, I noticed that it also doesn't seem to impose
>> restrictions on what can appear in WHEN ... AND quals.
>
> You earlier agreed that subselects were not part of the Standard.

You know that I didn't say that, Simon.

>> What this boils down to is that I don't think that this part of your
>> design is committable (from your recent v14):
>
> So your opinion is that because the v14 patch doesn't include a feature
> extension that is in Oracle and SQL Server we cannot commit this
> patch.
>
> There are quite a few minor additional things in that category and the
> syntax of those two differs, so it's clearly impossible to match both
> exactly.
>
> That seems like poor reasoning on why we should block the patch.

It certainly is. Good thing I never said anything of the sort.

There are 3 specific issues on query structure that together paint a picture about what you're not doing in the optimizer:

1. Whether or not subselects in the UPDATE targetlist are supported.

2. Whether or not WHEN ... AND quals support subselects.

3. Whether or not subselects are possible within the main ON () join.

I gave a lukewarm endorsement of not supporting #3, was unsure about #2, and was very clear on #1 as soon as I saw the restriction: UPDATE targetlists in a MERGE are *not* special, and so must support subselects, just like ON CONFLICT DO UPDATE, for example.

> If you would like to say how you think the design should look, it
> might be possible to change it for this release. Changing it in the
> future would not be blocked by committing without that.

I can tell you right now that you need to support subselects in the targetlist. They are *not* an extension to the standard, AFAICT.
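For concreteness, the three positions in question might look like this -- a sketch against hypothetical tables (t, s, ref), not something taken from the patch or its tests:

```sql
-- Hypothetical schema: t(id, val), s(id, val), ref(id, v).
MERGE INTO t
USING s
  ON t.id = s.id
     AND s.id IN (SELECT id FROM ref)          -- (3) subselect in the ON join
WHEN MATCHED
     AND s.val > (SELECT avg(v) FROM ref)      -- (2) subselect in a WHEN ... AND qual
  THEN UPDATE
       SET val = (SELECT max(v) FROM ref
                  WHERE ref.id = s.id)         -- (1) subselect in the UPDATE targetlist
WHEN NOT MATCHED THEN
  INSERT (id, val) VALUES (s.id, s.val);
```

Per the fiddles above, Oracle and SQL Server accept statements of roughly this shape; the question is which of the three positions the initial PostgreSQL patch must accept.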
And even if they were, every other system supports them, and there is absolutely no logical reason to not support them other than the fact that doing so requires significant changes to the data structures in the parser, planner, and executor. Reworking that will probably turn out to be necessary for other reasons that I haven't thought of. I think that restrictions like this are largely an accident of how your patch evolved. It would be a lot easier to work with you if you acknowledged that.

>> I am willing to continue to engage with you on the concurrency issues
>> for the time being, since that is the most pressing issue for the
>> patch. We can get to this stuff later.
>
> There are no concurrency issues. The patch gives the correct answer in
> all cases, or an error to avoid problems. We've agreed that it is
> desirable we remove some of those if possible, though they are there
> as a result of our earlier discussions.

You seem to presume to be in charge of the parameters of this discussion. I don't see it that way. I think that READ COMMITTED conflict handling semantics are by far the biggest issue for the patch, and that we should prioritize reaching agreement there. This needs to be worked out through community consensus, since it concerns fundamental semantics much more than implementation details. (In contrast, the optimizer issues I mentioned are fairly heavy on relatively uncontentious implementation questions.)

The problem with how you've represented MERGE in the parser, optimizer, and executor is not that it's "half-baked crap", as you suggested others might think at the FOSDEM developer meeting [1]. I wouldn't say that at all. What I'd say is that it's *unfinished*. It's definitely sufficient to prototype different approaches to concurrency, as well as to determine how triggers should work, and many other such things. That's a good start.

I am willing to mostly put aside the other issues for the time being, to get the difficult questions on concurrency out of the way first. But if you don't make some broad concessions on the secondary issues pretty quickly, then I will have to conclude that our positions are irreconcilable. I will have nothing further to contribute to the discussion.

[1] https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2018_Developer_Meeting#Minutes

--
Peter Geoghegan
On Wed, Jan 31, 2018 at 11:37 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Jan 31, 2018 at 7:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't fully grok merge but suppose you have:
>>
>> WHEN MATCHED AND a = 0 THEN UPDATE ...
>> WHEN MATCHED AND a = 1 THEN UPDATE ...
>> WHEN NOT MATCHED THEN INSERT ...
>>
>> Suppose you match a tuple with a = 0 but, upon trying to update it,
>> find that it's been updated to a = 1. It seems like there are a few
>> possible behaviors:
>>
>> 1. Throw an error! I guess this is what the patch does now.
>
> Right.
>
>> 2. Do absolutely nothing. I think this is what would happen with an
>> ordinary UPDATE; the tuple fails the EPQ recheck and so is not
>> updated, but that doesn't trigger anything else.
>
> I think #2 is fine if you're talking about join quals. Which, of
> course, you're not. These WHEN quals really do feel like
> tuple-at-a-time procedural code, more than set-orientated quals (if
> that wasn't true, we'd have to allow cardinality violations, which we
> at least try to avoid). Simon said something like "the SQL standard
> requires that WHEN quals be evaluated first" at one point, which makes
> sense to me.

It is not clear to me what is exactly your concern if we try to follow #2? To me, #2 seems like a natural choice.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 4 February 2018 at 06:32, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jan 31, 2018 at 11:37 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Wed, Jan 31, 2018 at 7:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I don't fully grok merge but suppose you have:
>>>
>>> WHEN MATCHED AND a = 0 THEN UPDATE ...
>>> WHEN MATCHED AND a = 1 THEN UPDATE ...
>>> WHEN NOT MATCHED THEN INSERT ...
>>>
>>> Suppose you match a tuple with a = 0 but, upon trying to update it,
>>> find that it's been updated to a = 1. It seems like there are a few
>>> possible behaviors:
>>>
>>> 1. Throw an error! I guess this is what the patch does now.
>>
>> Right.
>>
>>> 2. Do absolutely nothing. I think this is what would happen with an
>>> ordinary UPDATE; the tuple fails the EPQ recheck and so is not
>>> updated, but that doesn't trigger anything else.
>>
>> I think #2 is fine if you're talking about join quals. Which, of
>> course, you're not. These WHEN quals really do feel like
>> tuple-at-a-time procedural code, more than set-orientated quals (if
>> that wasn't true, we'd have to allow cardinality violations, which we
>> at least try to avoid). Simon said something like "the SQL standard
>> requires that WHEN quals be evaluated first" at one point, which makes
>> sense to me.
>
> It is not clear to me what is exactly your concern if we try to follow
> #2? To me, #2 seems like a natural choice.

At first, but it gives an anomaly so is not a good choice. The patch does behavior #5: it rechecks the conditions with the latest row.

Otherwise

WHEN MATCHED AND a=0 THEN UPDATE SET b=0
WHEN MATCHED AND a=1 THEN UPDATE SET b=1

would result in (a=1, b=0) in case of concurrent updates, which the user clearly doesn't want. The evaluation of the WHEN qual must occur prior to the update, which will still be true in #5.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
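Spelled out as a full scenario (hypothetical table t(id, a, b) and single-row source s(id); a sketch of the claimed anomaly, not output from the patch):

```sql
-- t contains one row with a = 0; s matches it on id.

-- Session 2 begins the MERGE and matches the row while a = 0:
MERGE INTO t USING s ON t.id = s.id
WHEN MATCHED AND t.a = 0 THEN UPDATE SET b = 0
WHEN MATCHED AND t.a = 1 THEN UPDATE SET b = 1;

-- Session 1 commits first, while session 2 waits on the row lock:
UPDATE t SET a = 1 WHERE id = 1;

-- If session 2 then applied its already-chosen a=0 action to the latest
-- row without re-evaluating the WHEN quals, it would set b = 0 on a row
-- where a = 1, leaving (a=1, b=0).  Re-checking the WHEN conditions
-- against the latest row (behavior #5) picks the a=1 action instead,
-- leaving (a=1, b=1).
```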
On 3 February 2018 at 23:17, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sat, Feb 3, 2018 at 2:15 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> I started looking at SQL Server's MERGE to verify that it also does
>>> not impose any restrictions on subselects in a MERGE UPDATE's
>>> targetlist, just like Oracle. Unsurprisingly, it does not. More
>>> surprisingly, I noticed that it also doesn't seem to impose
>>> restrictions on what can appear in WHEN ... AND quals.
>>
>> You earlier agreed that subselects were not part of the Standard.
>
> You know that I didn't say that, Simon.

I'm happy to quote your words.

"I've acknowledged that the standard has something to say on this that supports your position, which has real weight."

https://www.postgresql.org/message-id/CAH2-WzkAjSN1H-ym-sSDh%2B6EJWmEhyHdDStzXDB%2BFxt1hcKEgg%40mail.gmail.com

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3 February 2018 at 23:17, Peter Geoghegan <pg@bowt.ie> wrote:
> There are 3 specific issues on query structure, that together paint a
> picture about what you're not doing in the optimizer:
>
> 1. Whether or not subselects in the UPDATE targetlist are supported.
>
> 2. Whether or not subselects in the WHEN ... AND quals support subselects.
>
> 3. Whether or not subselects are possible within the main ON () join.
>
> I gave a lukewarm endorsement of not supporting #3, was unsure with
> #2, and was very clear on #1 as soon as I saw the restriction: UPDATE
> targetlist in a MERGE are *not* special, and so must support
> subselects, just like ON CONFLICT DO UPDATE, for example.

All three of the above give errors in the current patch, as we already discussed for (1) and (2). I've added these to the tests so we can track support for them explicitly.

The current patch runs one query then executes the quals post-execution as we do for check constraints and RLS. Changes would be required to support subselects.

Changes to support sub-selects don't invalidate what is there now in the current patch with regard to query representation or optimization. So support of those extra features can be added later if we choose. Frankly, whatever we do now, I'm sure we will discover cases that need further optimization, just as we have done with RLS and Partitioning, so the query representation was likely to change over time anyway.

Whatever we decide for concurrent behavior will affect how we support them. We can't move forwards on them until we have that nailed down.

I could give a longer technical commentary but I will be unavailable now for some time, so unable to give further discussion.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs writes:
> It will likely take some time to work through these and the current
> work items but will fix.
>
> Do you have the DDL so we can recreate these easily?

Attached are testcases that trigger the assertions when run against an empty database instead of the one left behind by make installcheck. The "unrecognized node type" one appears to be a known issue according to the wiki, so I didn't bother doing the work for this one as well.

regards,
Andreas
On Sun, Feb 4, 2018 at 12:42 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I'm happy to quote your words.
>
> "I've acknowledged that the standard has something to
> say on this that supports your position, which has real weight."
>
> https://www.postgresql.org/message-id/CAH2-WzkAjSN1H-ym-sSDh%2B6EJWmEhyHdDStzXDB%2BFxt1hcKEgg%40mail.gmail.com

Immediately afterwards, in that same e-mail, I go on to say:

"I'm not asking about WHEN AND here (that was my last question) [Simon quoted my last response to said question]. I'm asking about a subselect that appears in the targetlist."

Even if you are right to take "I've acknowledged that the standard has something to say on this that supports your position" as a blanket endorsement of disallowing subselects in all 3 places, I still don't see why this is worth even talking about. That would mean that I said something on January 29th that I subsequently withdrew on February 1st.

--
Peter Geoghegan
On Sun, Feb 4, 2018 at 3:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> It is not clear to me what is exactly your concern if we try to follow
>> #2? To me, #2 seems like a natural choice.
>
> At first, but it gives an anomaly so is not a good choice. The patch
> does behavior #5, it rechecks the conditions with the latest row.
>
> Otherwise
> WHEN MATCHED AND a=0 THEN UPDATE SET b=0
> WHEN MATCHED AND a=1 THEN UPDATE SET b=1
> would result in (a=1, b=0) in case of concurrent updates, which the
> user clearly doesn't want.

I am unable to understand this. What are you presuming the tuple was originally?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Feb 4, 2018 at 5:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Changes to support sub-selects don't invalidate what is there now in
> the current patch with regard to query representation or optimization.
> So support of those extra features can be added later if we choose.

I don't think you get to make a unilateral decision to exclude features that work everywhere else from the scope of this patch. If there is agreement that those features can be left out of scope, then that is one thing, but so far all the commentary about the things that you've chosen to exclude has been negative. Nor have you really given any reason why they should be exempt.

You've pointed out that parallel query doesn't handle everything (which is certainly true, but does not mean that any feature from now until the end of time is allowed to exclude from scope whatever seems inconvenient, regardless of contrary community consensus) and you've pointed out here and elsewhere that somebody could go add the features you omitted later (which is also true, but misses the general point that we want committed patches to be reasonably complete already, not to have big gaps that someone will have to fix later).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Feb 5, 2018 at 7:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't think you get to make a unilateral decision to exclude
> features that work everywhere else from the scope of this patch. If
> there is agreement that those features can be left out of scope, then
> that is one thing, but so far all the commentary about the things that
> you've chosen to exclude has been negative. Nor have you really given
> any reason why they should be exempt. You've pointed out that
> parallel query doesn't handle everything (which is certainly true, but
> does not mean that any feature from now and the end of time is allowed
> to exclude from scope whatever seems inconvenient regardless of
> contrary community consensus) and you've pointed out here and
> elsewhere that somebody could go add the features you omitted later
> (which is also true, but misses the general point that we want
> committed patches to be reasonably complete already, not have big gaps
> that someone will have to fix later).

For me, the concern is not really the omission of support for certain features as such. The concern is that those omissions hint that there is a problem with the design itself, particularly in the optimizer. Allowing subselects in the UPDATE part of a MERGE does not seem like something that could be written as a neat adjunct to what Simon already came up with. If that was possible, Simon probably already would have done it.

--
Peter Geoghegan
On Tue, Feb 6, 2018 at 9:50 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Feb 5, 2018 at 7:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't think you get to make a unilateral decision to exclude
> features that work everywhere else from the scope of this patch. If
> there is agreement that those features can be left out of scope, then
> that is one thing, but so far all the commentary about the things that
> you've chosen to exclude has been negative. Nor have you really given
> any reason why they should be exempt. You've pointed out that
> parallel query doesn't handle everything (which is certainly true, but
> does not mean that any feature from now and the end of time is allowed
> to exclude from scope whatever seems inconvenient regardless of
> contrary community consensus) and you've pointed out here and
> elsewhere that somebody could go add the features you omitted later
> (which is also true, but misses the general point that we want
> committed patches to be reasonably complete already, not have big gaps
> that someone will have to fix later).
For me, the concern is not really the omission of support for certain
features as such. The concern is that those omissions hint that there
is a problem with the design itself, particularly in the optimizer.
Allowing subselects in the UPDATE part of a MERGE do not seem like
they could be written as a neat adjunct to what Simon already came up
with. If that was possible, Simon probably already would have done it.
As someone who's helping Simon with that part of the code, I must say that omission of sub-selects in the UPDATE targetlist and WHEN quals is not because of some known design problems. So while it may be true that we've a design problem, it's also quite likely that we are missing some planner/optimiser trick and once we add those missing pieces, it will start working. Same is the case with RLS.
I first tried to treat MERGE similar to UPDATE/DELETE case and ensure that the INSERTs go through the root partition. That mostly works, but the RIGHT OUTER join between the child tables and the source relation ends up emitting duplicate rows, if the partitioned table is the resultRelation and when it gets expanded in inheritance_planner(). That's a blocker. So what I am trying now is to push the join between the Append relation and the source relation below the ModifyTable node, so that we get the final join result. We can then look up the tableoid in the row returned from the join, find the corresponding result relation and then carry out MERGE actions. Note that unlike regular ExecModifyTable(), here we must execute just one subplan as that will return all the required tuples.
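To make the partitioned-target case concrete, here is a sketch of the shape of statement under discussion (table and column names are made up; this is not from the patch's test suite):

```sql
-- Hypothetical partitioned target and a plain source table.
CREATE TABLE measurement (city_id int, logdate date, peaktemp int)
    PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2017 PARTITION OF measurement
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
CREATE TABLE measurement_y2018 PARTITION OF measurement
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
CREATE TABLE new_measurement (city_id int, logdate date, peaktemp int);

-- With inheritance_planner()-style expansion, each child is RIGHT OUTER
-- joined to new_measurement separately, so a source row that matches no
-- child surfaces once per child, duplicating the NOT MATCHED case.
-- Pushing a single join against the Append of all children below
-- ModifyTable would produce each source row exactly once.
MERGE INTO measurement m
USING new_measurement n
  ON m.city_id = n.city_id AND m.logdate = n.logdate
WHEN MATCHED THEN
  UPDATE SET peaktemp = n.peaktemp
WHEN NOT MATCHED THEN
  INSERT VALUES (n.city_id, n.logdate, n.peaktemp);
```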
Does anyone see a potential blocker with this approach, except that it may not be the most elegant way? I think EvalPlanQual might need some treatment, because when the plan is re-executed it will expect to find the updated tuple in the slot of the underlying query's RTE and not in the resultRelation's RTE, which does not participate in the join at all. Anything else I could be missing out completely?
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greetings,

* Peter Geoghegan (pg@bowt.ie) wrote:
> On Mon, Feb 5, 2018 at 7:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > I don't think you get to make a unilateral decision to exclude
> > features that work everywhere else from the scope of this patch. If
> > there is agreement that those features can be left out of scope, then
> > that is one thing, but so far all the commentary about the things that
> > you've chosen to exclude has been negative. Nor have you really given
> > any reason why they should be exempt. You've pointed out that
> > parallel query doesn't handle everything (which is certainly true, but
> > does not mean that any feature from now and the end of time is allowed
> > to exclude from scope whatever seems inconvenient regardless of
> > contrary community consensus) and you've pointed out here and
> > elsewhere that somebody could go add the features you omitted later
> > (which is also true, but misses the general point that we want
> > committed patches to be reasonably complete already, not have big gaps
> > that someone will have to fix later).
>
> For me, the concern is not really the omission of support for certain
> features as such. The concern is that those omissions hint that there
> is a problem with the design itself, particularly in the optimizer.
> Allowing subselects in the UPDATE part of a MERGE do not seem like
> they could be written as a neat adjunct to what Simon already came up
> with. If that was possible, Simon probably already would have done it.

I tend to agree with Robert here that we should be trying to include relatively complete features which work with the rest of the system whenever possible. To the extent that partitioning wasn't entirely complete, I think the subsequent work on it (particularly in this release cycle) clearly demonstrates that it's a huge project which needed multiple releases, but that's an exception to the general rule.
That was also what seemed to be the consensus coming out of the FOSDEM Developer meeting (notes here: https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2018_Developer_Meeting). There was also discussion and apparent consensus that performance issues or sub-par plans are reasonable for an initial feature (as happened with leakproof views and subsequently RLS, until the rework which improved it greatly in later releases).

Coming out of that, my understanding is that Simon is planning to have a patch which implements RLS and partitioning (though the query plans for partitioning may be sub-par and not ideal) as part of MERGE, and I've agreed to review at least the RLS bits (though my intention is to at least go through the rest of the patch as well, though likely in less detail). Of course, I encourage others to review it as well.

Thanks!

Stephen
Attachment
On Tue, Feb 6, 2018 at 9:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Feb 4, 2018 at 3:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> It is not clear to me what is exactly your concern if we try to follow
>>> #2? To me, #2 seems like a natural choice.
>>
>> At first, but it gives an anomaly so is not a good choice. The patch
>> does behavior #5, it rechecks the conditions with the latest row.
>>
>> Otherwise
>> WHEN MATCHED AND a=0 THEN UPDATE SET b=0
>> WHEN MATCHED AND a=1 THEN UPDATE SET b=1
>> would result in (a=1, b=0) in case of concurrent updates, which the
>> user clearly doesn't want.
>
> I am unable to understand this.

Neither do I. There is nothing in the above statement which changes 'a'.

> What are you presuming the tuple was
> originally?

I have tried to think of one example which can result in what Simon is saying. Consider that the original tuple has a = 0 and b = 1.

Session-1
WHEN MATCHED AND a=0 THEN UPDATE SET b=0
WHEN MATCHED AND a=1 THEN UPDATE SET b=1

Session-2
WHEN MATCHED AND b=0 THEN UPDATE SET a=0
WHEN MATCHED AND b=1 THEN UPDATE SET a=1

Now assume both sessions find the tuple to update. Session-1 locks the tuple and updates b = 0, and Session-2 waits for Session-1 to complete. After Session-1 commits, Session-2 wakes up and performs EvalPlanQual because it finds the tuple updated. Now, I think the EvalPlanQual mechanism will succeed if we don't re-match the WHEN clauses as part of the EvalPlanQual mechanism, and it will update a = 1. So we will end up with a=1, b=0. I think if this is going to happen with approach-2 (#2), then one can argue that Session-2's update shouldn't have succeeded.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
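[Written out concretely, the two-session scenario above might look like the following. The schema, table names, and source relation are invented purely for illustration; both sessions run in READ COMMITTED.]

```sql
-- Original row: (id = 1, a = 0, b = 1)
CREATE TABLE target (id int PRIMARY KEY, a int, b int);
INSERT INTO target VALUES (1, 0, 1);
CREATE TABLE src (id int);
INSERT INTO src VALUES (1);

-- Session-1:
MERGE INTO target t USING src s ON t.id = s.id
WHEN MATCHED AND t.a = 0 THEN UPDATE SET b = 0
WHEN MATCHED AND t.a = 1 THEN UPDATE SET b = 1;

-- Session-2 (blocks until Session-1 commits):
MERGE INTO target t USING src s ON t.id = s.id
WHEN MATCHED AND t.b = 0 THEN UPDATE SET a = 0
WHEN MATCHED AND t.b = 1 THEN UPDATE SET a = 1;

-- If EvalPlanQual does not re-check the WHEN clauses against the
-- updated tuple (a = 0, b = 0), Session-2 still applies SET a = 1,
-- leaving (a = 1, b = 0): a state neither session's clauses permit.
```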
On 02/06/2018 05:20 AM, Peter Geoghegan wrote:
> On Mon, Feb 5, 2018 at 7:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't think you get to make a unilateral decision to exclude
>> features that work everywhere else from the scope of this patch.
>> If there is agreement that those features can be left out of scope,
>> then that is one thing, but so far all the commentary about the
>> things that you've chosen to exclude has been negative. Nor have
>> you really given any reason why they should be exempt. You've
>> pointed out that parallel query doesn't handle everything (which is
>> certainly true, but does not mean that any feature from now and the
>> end of time is allowed to exclude from scope whatever seems
>> inconvenient regardless of contrary community consensus) and you've
>> pointed out here and elsewhere that somebody could go add the
>> features you omitted later (which is also true, but misses the
>> general point that we want committed patches to be reasonably
>> complete already, not have big gaps that someone will have to fix
>> later).
>
> For me, the concern is not really the omission of support for
> certain features as such. The concern is that those omissions hint
> that there is a problem with the design itself, particularly in the
> optimizer. Allowing subselects in the UPDATE part of a MERGE do not
> seem like they could be written as a neat adjunct to what Simon
> already came up with. If that was possible, Simon probably already
> would have done it.

I agree with both of these statements wholeheartedly. As I mentioned at the developer meeting last week, whenever someone asks me for reasons why I find PostgreSQL better than MySQL, I usually point them to the "MySQL Restrictions and Limitations" document, which is pretty much a matrix of features that do not work together in some way. It would be somewhat unfortunate to get on that route too ...
So I think the general principle should be to make the features as complete as possible. Some of the more complex features may not be quite achievable in one go, though, forcing us to add the pieces over multiple releases. Which is what happened with partitioning, for example.

I think we can do that for MERGE too, assuming we actually understand

(1) why each of the pieces is missing

(2) what would it take to make it work

I'm not sure we have those answers for MERGE, though. I haven't done anything on the MERGE patch so far, and I've been unable to convince myself what the reasoning is for the limitations. Perhaps there are explanations somewhere far back in the thread, but considering how much both the implementation and our understanding of it changed over time, I'm not sure how relevant those past arguments are.

Take for example the "sub-selects are not supported" limitation. It's not clear to me why this does not work. I see claims that the SQL standard says something about it (but not what exactly), that it supposedly depends on how we handle concurrency, and perhaps other reasons. Furthermore, there are multiple places where the sub-select might be, and I'm not sure which of those arguments applies to which case.

Without answering (2) I think it's perfectly understandable Peter is concerned we may run into design issues later, when we try to address some of the limitations.

I plan to go through the patch and this thread over the next couple of days, and summarize what the current status is (or my understanding of it). That is (a) what are the missing pieces, (b) why are they missing, (c) how we plan to address them in the future and (d) opinions on these issues expressed by others on this thread.

kind regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Feb 6, 2018 at 9:40 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> I think we can do that for MERGE too, assuming we actually understand
>
> (1) why each of the pieces is missing
>
> (2) what would it take to make it work

Right, I completely agree with that. For example, if we find that a certain thing can't be supported without implementing global indexes on partitions, then I have no problem saying "OK, that's not going to be supported", because global indexes are a huge project unto themselves. That's the same reason ON CONFLICT .. UPDATE isn't supported on partitioned tables, and it's just as reasonable for this patch as it is for that one.

What we don't want is things that this patch doesn't support because it would take a couple of days of work and the people who wrote the patch were busy and didn't have a couple of extra days. We routinely expect patch authors to plug gaps caused by oversight or lack of round tuits, and this patch should be held to the same standard.

In short, we don't have a hard and fast rule that every feature must work with every other feature, but when it doesn't, the burden is on the patch author to justify why that omission is reasonable. I know I expended a lot of ink explaining why parallel query couldn't reasonably be made to support writes.

> I plan to go through the patch and this thread over the couple of days,
> and summarize what the current status is (or my understanding of it).
> That is (a) what are the missing pieces, (b) why are they missing, (c)
> how we plan to address them in the future and (d) opinions on these
> issues expressed by others on this thread.

Thank you. That sounds fantastic.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Feb 6, 2018 at 8:10 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Take for example the "sub-selects are not supported" limitation. It's
not clear to me why this does not work. I see claims that SQL standard
says something about it (but not what exactly), that it supposedly
depends on how we handle concurrency, and perhaps other reasons.
Furthermore, there are multiple places where the sub-select might be,
and I'm not sure which of those arguments applies to which case.
Without answering (2) I think it's perfectly understandable Peter is
concerned we may run into design issues later, when we try to address
some of the limitations.
I plan to go through the patch and this thread over the couple of days,
and summarize what the current status is (or my understanding of it).
That is (a) what are the missing pieces, (b) why are they missing, (c)
how we plan to address them in the future and (d) opinions on these
issues expressed by others on this thread.
Here is v15 of the patch. It now fully supports partitioned tables. As I explained upthread, supporting partitioned tables turned out to be much trickier than I initially thought, because of the completely different code paths that INSERT and UPDATE/DELETE take in case of inheritance. Since MERGE needs both facilities, I had to pretty much merge both machineries. But the end result seems okay. I am sure we can improve this further, but whatever I have tested so far (which may not necessarily be thorough) seems to work fine.
Since the RIGHT OUTER join is now pushed below the ModifyTable node, I had to make appropriate changes to the EvalPlanQual calling locations so that, when handling MERGE, we pass in the RT index of the table used in the join rather than that of the resultRelInfo.
Initially I was a bit surprised that EvalPlanQual silently ignores the case when partition key is updated and a row is moved from one partition to another. But then I realised that this is the behaviour of the partitioned tables and not MERGE itself.
The revised version also supports subqueries in SET targetlist as well as WHEN AND conditions. As expected, this needed a minor tweak in query/expression mutators. So once I worked on that for partitioned tables, subqueries started working with very minimal adjustments elsewhere. Other things such as whole-var references are still broken. A few new tests are also added.
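[As a sketch of what v15 now accepts per the paragraph above, a statement like the following, with subqueries both in the SET targetlist and in a WHEN ... AND condition. Table and column names are invented for illustration and are not from the patch's test suite.]

```sql
MERGE INTO accounts a
USING transactions t ON a.acct_id = t.acct_id
-- Subquery in the WHEN ... AND condition:
WHEN MATCHED AND t.amount > (SELECT avg(amount) FROM transactions)
    THEN UPDATE SET balance = a.balance + t.amount,
                    -- Subquery in the SET targetlist:
                    last_large = (SELECT max(txn_date) FROM transactions x
                                  WHERE x.acct_id = a.acct_id)
WHEN NOT MATCHED
    THEN INSERT (acct_id, balance) VALUES (t.acct_id, t.amount);
```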
--
Next I am going to look at RLS. While I've no prior experience with RLS, I am expecting it to be less cumbersome in comparison to partitioning. I am out of action till Monday and may not be able to respond in time. But any reviews and comments are appreciated as always.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, Feb 7, 2018 at 1:24 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Here is v15 of the patch.

Cool.

> Initially I was a bit surprised that EvalPlanQual silently ignores the case
> when partition key is updated and a row is moved from one partition to
> another. But then I realised that this is the behaviour of the partitioned
> tables and not MERGE itself.

Apparently there is a pending patch to do better there - the commit message of 2f178441 refers to this.

> The revised version also supports subqueries in SET targetlist as well as
> WHEN AND conditions. As expected, this needed a minor tweak in
> query/expression mutators. So once I worked on that for partitioned tables,
> subqueries started working with very minimal adjustments elsewhere. Other
> things such as whole-var references are still broken. A few new tests are
> also added.

Great!

Wholerow references are expected to be a bit trickier. See commit ad227837 for some hints on how you could fix this.

> Next I am going to look at RLS. While I've no prior experience with RLS, I
> am expecting it to be less cumbersome in comparison to partitioning. I am
> out of action till Monday and may not be able to respond in time. But any
> reviews and comments are appreciated as always.

I don't think that RLS support will be particularly challenging. It might take a while. If your rapid progress here is any indication, most open items are not likely to be particularly challenging. Once again, I suggest that a good area for us to focus on is the semantics of READ COMMITTED conflict handling. Maybe you'd prefer to just blast through the simpler open items, which is fine, but do bear in mind that the EPQ and EPQ-adjacent stuff is probably going to be the thing that makes or breaks this patch for v11.

--
Peter Geoghegan
On Wed, Feb 7, 2018 at 10:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:
Apparently there is a pending patch to do better there - the commit
message of 2f178441 refers to this.
Thanks. Will look at it.
> The revised version also supports subqueries in SET targetlist as well as
> WHEN AND conditions. As expected, this needed a minor tweak in
> query/expression mutators. So once I worked on that for partitioned tables,
> subqueries started working with very minimal adjustments elsewhere. Other
> things such as whole-var references are still broken. A few new tests are
> also added.
Great!
Wholerow references are expected to be a bit trickier. See commit
ad227837 for some hints on how you could fix this.
Thanks again.
> Next I am going to look at RLS. While I've no prior experience with RLS, I
> am expecting it to be less cumbersome in comparison to partitioning. I am
> out of action till Monday and may not be able to respond in time. But any
> reviews and comments are appreciated as always.
I don't think that RLS support will be particularly challenging. It
might take a while.
Ok. I would start by writing a test case to check what works and what doesn't with the current patch and work from there. My understanding of RLS is limited right now, but from what I've seen in the code (while hacking on other stuff), my guess is it will require us to evaluate a set of quals and then decide on the action.
If your rapid progress here is any indication, most open items are not
likely to be particularly challenging. Once again, I suggest that a
good area for us to focus on is the semantics of READ COMMITTED
conflict handling.
I understand getting the EPQ semantics right is very important. Can you please (once again) summarise your thoughts on what you think is the *most* appropriate behaviour? I can then think about how much effort might be involved in that. If the effort is disproportionately high, we can discuss whether to settle for some not-so-nice semantics, like we apparently did for partition key updates.
I am sorry, I know you and Simon have probably done that a few times already and I should rather study those proposals first. So it's okay if you don't want to repeat those; I will work on them next week once I am back from holidays.
Maybe you'd prefer to just blast through the
simpler open items, which is fine, but do bear in mind that the EPQ
and EPQ-adjacent stuff is probably going to be the thing that makes or
breaks this patch for v11.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Feb 7, 2018 at 7:51 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> I understand getting EPQ semantics right is very important. Can you please
> (once again) summarise your thoughts on what you think is the *most*
> appropriate behaviour? I can then think how much efforts might be involved
> in that. If the efforts are disproportionately high, we can discuss if
> settling for some not-so-nice semantics, like we apparently did for
> partition key updates.

I personally believe that the existing EPQ semantics are already not-so-nice. They're what we know, though, and we haven't actually had any real world complaints, AFAIK.

My concern is mostly just that MERGE manages to behave in a way that actually "provides a single SQL statement that can conditionally INSERT, UPDATE or DELETE rows, a task that would otherwise require multiple procedural language statements", as the docs put it. As long as MERGE manages to do something as close to that high level description as possible in READ COMMITTED mode (with our current semantics for multiple statements in RC taken as the baseline), then I'll probably be happy.

Some novel new behavior -- "EPQ with a twist" -- is clearly necessary. I feel a bit uneasy about it because anything that anybody suggests is likely to be at least a bit arbitrary (EPQ itself is kind of arbitrary). We only get to make a decision on how "EPQ with a twist" will work once, and that should be a decision that is made following careful deliberation. Ambiguity is much more likely to kill a patch than a specific technical defect, at least in my experience. Somebody can usually just fix a technical defect.

> I am sorry, I know you and Simon have probably done that a few times already
> and I should rather study those proposals first. So it's okay if you don't
> want to repeat those; I will work on them next week once I am back from
> holidays.

Unfortunately, I didn't get very far with Simon on this.
I didn't really start talking about this until recently, though, so it's not like you missed much. The first time I asked Simon about this was January 23rd, and I first proposed something about 10 days ago. Something very tentative. (I did ask some questions about EPQ, and even WHEN ... AND quals much earlier, but that was in the specific context of a debate about MERGE's use of ON CONFLICT's speculative insertion mechanism. I consider this to be a totally different discussion, that ended before Simon even posted his V1 patch, and isn't worth spending your time on now.)

> TBH I did not consider partitioning any less complex and it was indeed very
> complex, requiring at least 3 reworks by me. And from what I understood, it
> would have been a blocker too. So is subquery handling and RLS. That's why I
> focused on addressing those items while you and Simon were still debating
> EPQ semantics.

Sorry if I came across as dismissive of that effort. That was certainly not my intention. I am pleasantly surprised that you've managed to move a number of things forward rather quickly.

I'll rephrase: while it would probably have been a blocker in theory (I didn't actually weigh in on that), I doubted that it would actually end up doing so in practice (and it now looks like I was right to doubt that, since you got it done). It was a theoretical blocker, as opposed to an open item that could drag on indefinitely despite everyone's best efforts. Obviously details matter, and obviously there are a lot of details to get right outside of RC semantics, but it seems wise to focus on the big risk that is EPQ/RC conflict handling. The only other thing that comes close to that risk is the risk that we'll get stuck on RLS. Though even the RLS discussion may actually end up being blocked on this crucial question of EPQ/RC conflict handling.
Did you know that the RLS docs [1] have a specific discussion of the implications of EPQ for users of RLS, and that it mentions doing things like using SELECT ... FOR SHARE to work around the problem? It has a whole example of a scenario that users actually kind of need to know about, at least in theory. RC conflict handling semantics could bleed into a number of other things. I'll need to think some more about RC conflict handling (deciding what "EPQ with a twist" actually means), since I haven't focused on MERGE recently. Bear with me. [1] https://www.postgresql.org/docs/current/static/ddl-rowsecurity.html -- Peter Geoghegan
On Thu, Feb 8, 2018 at 8:23 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> Some novel new behavior -- "EPQ with a twist" -- is clearly necessary.
> I feel a bit uneasy about it because anything that anybody suggests is
> likely to be at least a bit arbitrary (EPQ itself is kind of
> arbitrary). We only get to make a decision on how "EPQ with a twist"
> will work once, and that should be a decision that is made following
> careful deliberation. Ambiguity is much more likely to kill a patch
> than a specific technical defect, at least in my experience. Somebody
> can usually just fix a technical defect.

Here's my $0.02: I think that new concurrency errors thrown by the merge code itself deserve strict scrutiny and can survive only if they have a compelling justification, but if the merge code happens to choose an action that leads to a concurrency error of its own, I think that deserves only mild scrutiny.

On that basis, of the options I listed in http://postgr.es/m/CA+TgmoZDL-caukHkWet7sr7sqr0-e2T91+DEvhqeN5sfqsMjqw@mail.gmail.com I like #1 least. I also dislike #4 from that list for the reasons stated there. For example, if you say WHEN MATCHED AND x.some_boolean and then WHEN MATCHED, you expect that every tuple that hits the latter clause will have that Boolean as false or null, but #4 makes that not true.

I think the best options are #2 and #5 -- #2 because it's simple, and #5 because it's (maybe) more intuitive, albeit at the risk of livelock. But I'm willing to be convinced of some other option; like you, I'm willing to be open-minded about this. But, as you say, we need a coherent, well-considered justification for the selected option, not just "well, this is what we implemented so you should like it". The specification should define the implementation, not the reverse.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 02/07/2018 10:24 AM, Pavan Deolasee wrote:
>
> ...
>
> Here is v15 of the patch.
>

I've been looking at this version of the patch, mostly to educate myself before attempting to write the "status summary".

One bit that I don't quite understand is GetXactWALBytes(). It pretty much just returns XactLastRecEnd and is used in ExecMerge like this:

    int64 startWAL = GetXactWALBytes();
    bool  qual = ExecQual(action->whenqual, econtext);

    /*
     * SQL Standard says that WHEN AND conditions must not
     * write to the database, so check we haven't written
     * any WAL during the test. Very sensible that is, since
     * we can end up evaluating some tests multiple times if
     * we have concurrent activity and complex WHEN clauses.
     *
     * XXX If we had some clear form of functional labelling
     * we could use that, if we trusted it.
     */
    if (startWAL < GetXactWALBytes())
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("cannot write to database ...")));

I think this actually fails to enforce the rule, because some writes may not produce WAL (think of unlogged tables). I also suspect it may be incorrect "in the opposite direction" because a query may not do any changes and yet it may produce WAL (e.g. due to wal_log_hints=true).

So we may need to think of a different way to enforce this ...

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
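[The unlogged-table hole described above can be demonstrated with a WHEN ... AND condition that calls a function writing to an unlogged table. The function, tables, and column names here are invented for illustration; since writes to unlogged tables generate no WAL records, XactLastRecEnd does not advance and a WAL-bytes comparison cannot catch the violation.]

```sql
CREATE UNLOGGED TABLE side_channel (noted text);

CREATE FUNCTION sneaky_write() RETURNS boolean AS $$
BEGIN
    -- Writes to an unlogged table produce no WAL,
    -- so a check based on XactLastRecEnd sees no change.
    INSERT INTO side_channel VALUES ('written from a WHEN AND test');
    RETURN true;
END;
$$ LANGUAGE plpgsql;

-- This WHEN AND condition writes to the database, violating the SQL
-- standard's rule, yet the WAL-bytes check would not detect it.
MERGE INTO target t USING src s ON t.id = s.id
WHEN MATCHED AND sneaky_write() THEN UPDATE SET val = s.val;
```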
On Fri, Feb 9, 2018 at 6:53 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Feb 7, 2018 at 7:51 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> I understand getting EPQ semantics right is very important. Can you please
> (once again) summarise your thoughts on what you think is the *most*
> appropriate behaviour? I can then think how much efforts might be involved
> in that. If the efforts are disproportionately high, we can discuss if
> settling for some not-so-nice semantics, like we apparently did for
> partition key updates.
I personally believe that the existing EPQ semantics are already
not-so-nice. They're what we know, though, and we haven't actually had
any real world complaints, AFAIK.
I agree.
My concern is mostly just that MERGE manages to behave in a way that
actually "provides a single SQL statement that can conditionally
INSERT, UPDATE or DELETE rows, a task that would otherwise require
multiple procedural language statements", as the docs put it. As long
as MERGE manages to do something as close to that high level
description as possible in READ COMMITTED mode (with our current
semantics for multiple statements in RC taken as the baseline), then
I'll probably be happy.
IMO it will be quite hard, if not impossible, to guarantee the same semantics for a single-statement MERGE and a multi-statement UPDATE/DELETE/INSERT in RC mode. For example, the multi-statement model executes each statement with a new MVCC snapshot, and hence the rows visible to individual statements may vary. Whereas in MERGE, everything runs with a single snapshot. There could be other such subtle differences.
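[For example, the multi-statement upsert pattern that MERGE is meant to replace runs under two snapshots, while the equivalent MERGE runs under one. Hypothetical tables, for illustration only:]

```sql
-- Multi-statement upsert: two statements, two RC snapshots.
UPDATE target t SET val = s.val FROM src s WHERE t.id = s.id;
INSERT INTO target (id, val)
SELECT s.id, s.val FROM src s
WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);

-- Single MERGE: one statement, one snapshot for the whole join.
MERGE INTO target t USING src s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);
```

[A row committed by another transaction between the UPDATE and the INSERT is visible to the INSERT's snapshot, but would never be seen by the single-snapshot MERGE.]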
Some novel new behavior -- "EPQ with a twist"-- is clearly necessary.
I feel a bit uneasy about it because anything that anybody suggests is
likely to be at least a bit arbitrary (EPQ itself is kind of
arbitrary). We only get to make a decision on how "EPQ with a twist"
will work once, and that should be a decision that is made following
careful deliberation. Ambiguity is much more likely to kill a patch
than a specific technical defect, at least in my experience. Somebody
can usually just fix a technical defect.
While I agree, I think we need to make these decisions in a time-bound fashion. If there is too much ambiguity, then it's not a bad idea to settle for throwing appropriate errors instead of providing semantically wrong answers, even in some remote corner case.
> TBH I did not consider partitioning any less complex and it was indeed very
> complex, requiring at least 3 reworks by me. And from what I understood, it
> would have been a blocker too. So is subquery handling and RLS. That's why I
> focused on addressing those items while you and Simon were still debating
> EPQ semantics.
Sorry if I came across as dismissive of that effort. That was
certainly not my intention. I am pleasantly surprised that you've
managed to move a number of things forward rather quickly.
I'll rephrase: while it would probably have been a blocker in theory
(I didn't actually weigh in on that), I doubted that it would actually
end up doing so in practice (and it now looks like I was right to
doubt that, since you got it done). It was a theoretical blocker, as
opposed to an open item that could drag on indefinitely despite
everyone's best efforts. Obviously details matter, and obviously there
are a lot of details to get right outside of RC semantics, but it
seems wise to focus on the big risk that is EPQ/RC conflict handling.
Ok. I am now back from holidays and I will start thinking about this too. I've also requested a colleague to help us with comparing it against Oracle's behaviour. That's not a gold standard for us, but knowing how other major databases handle RC conflicts is not a bad idea.
I see the following important areas, and as long as we have a consistent and coherent handling of these cases, we should not have difficulty agreeing on an outcome.
1. Concurrent UPDATE does not affect MATCHED case. The WHEN conditions may or may not be affected.
2. Concurrently UPDATEd tuple fails the join qual and the current source tuple no longer matches with the updated target tuple that the EPQ is set for. It matches no other target tuple either. So a MATCHED case is turned into a NOT MATCHED case.
3. Concurrently UPDATEd tuple fails the join qual and the current source tuple no longer matches with the updated target tuple that the EPQ is set for. But it matches some other target tuple. So it's still a MATCHED case, but with different target tuple(s).
4. Concurrent UPDATE/INSERT creates a matching target tuple for a source tuple, thus turning a NOT MATCHED case to a MATCHED case.
5. Concurrent DELETE turns a MATCHED case into NOT MATCHED case
Any other case that I am missing? Assuming all cases are covered, what should we do in each of these cases so that there is no or very little ambiguity, and the outcome seems consistent (at the very least as far as MERGE goes, and hopefully with regular UPDATE/DELETE handling)?
I think #1 is pretty straightforward. We should start from the top, re-evaluate the WHEN conditions and execute the first matching action.
For #2, it seems natural that we skip the MATCHED actions and execute the NOT MATCHED action for the current source tuple. But the trouble is how to differentiate between #2 and #3. I'm not sure if we can really distinguish between these cases, i.e. given the current source tuple, does another matching target tuple exist? If another matching target tuple exists, we must not invoke the NOT MATCHED action; otherwise we might end up executing both a NOT MATCHED and a MATCHED action for the same source tuple, which seems weird.
#4 looks similar to INSERT ON CONFLICT and one may argue that we should detect concurrent inserts/updates and execute the MATCHED action. But I don't know if that can be done in a reasonable way. It will probably require us to have a primary key on the table to detect those conflicts. I think we should just let the operation fail, like a regular INSERT.
It seems natural that #5 should skip the MATCHED action and instead execute the first satisfying NOT MATCHED action. But its outcome may depend on how we handle #2 and #3, so that they are all consistent. If we allow #2 and #3 to error out whenever there is ambiguity, we should do the same for #5.
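[To make #2 and #3 concrete, consider a MERGE that joins on a column a concurrent UPDATE can change. The schema below is invented for illustration:]

```sql
MERGE INTO target t USING src s ON t.key = s.key
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (key, val) VALUES (s.key, s.val);

-- Suppose a concurrent session commits:
--   UPDATE target SET key = 'y' WHERE key = 'x';
--
-- For a source row with key = 'x', the tuple EPQ is set for no
-- longer satisfies the join qual:
--   #2: no other target row has key = 'x'  -> now NOT MATCHED
--   #3: another target row has key = 'x'   -> still MATCHED,
--       but against a different target tuple
```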
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Feb 10, 2018 at 7:19 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
On 02/07/2018 10:24 AM, Pavan Deolasee wrote:
>
if (startWAL < GetXactWALBytes())
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot write to database ...")));
I think this actually fails to enforce the rule, because some writes may
not produce WAL (think of unlogged tables). I also suspect it may be
incorrect "in the opposite direction" because a query may not do any
changes and yet it may produce WAL (e.g. due to wal_log_hints=true).
So we may need to think of a different way to enforce this ...
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Feb 9, 2018 at 8:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On that basis, of the options I listed in
http://postgr.es/m/CA+TgmoZDL-caukHkWet7sr7sqr0-e2T91+DEvhqeN5sfqsMjqw@mail.gmail.com
I like #1 least.
I also dislike #4 from that list for the reasons stated there. For
example, if you say WHEN MATCHED AND x.some_boolean and then WHEN
MATCHED, you expect that every tuple that hits the latter clause will
have that Boolean as false or null, but #4 makes that not true.
I think the best options are #2 and #5 -- #2 because it's simple, and
#5 because it's (maybe) more intuitive, albeit at the risk of
livelock.
As you said, #5 seems the best and that's what the patch does. But ISTM that the options you listed are not really the concerning points. As the patch stands today, we evaluate WHEN AND conditions separately, outside the EPQ. The problem arises when the join qual returns a different result with the updated tuple. I listed those cases in my earlier email today. To me (and I assume to Peter and Simon too), those are the more interesting cases.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Stephen,
On Tue, Feb 6, 2018 at 3:37 PM, Stephen Frost <sfrost@snowman.net> wrote:
Coming out of that, my understanding is that Simon is planning to have a
patch which implements RLS and partitioning (though the query plans for
partitioning may be sub-par and not ideal) as part of MERGE and I've
agreed to review at least the RLS bits (though my intention is to at
least go through the rest of the patch as well, though likely in less
detail). Of course, I encourage others to review it as well.
While executing MERGE, for existing tuples in the target table we may end up doing an UPDATE or DELETE, depending on the WHEN MATCHED AND conditions. So it seems unlikely that we would be able to push the USING security quals down to the scan. For example, if the target row is set for deletion, it seems wrong to exclude the row from the join based on the UPDATE policy's USING quals. So I am thinking that we should apply the respective USING quals *after* the decision to update, delete or do nothing for the given target tuple is made.
The question I have is, if the USING qual evaluates to false or NULL, should we silently ignore the tuple (like regular UPDATE does) or throw an error (like INSERT ON CONFLICT DO UPDATE)? ISTM that we might have decided to throw an error in case of INSERT ON CONFLICT to avoid any confusion where the tuple is neither inserted nor updated. Similar situation may arise with MERGE because for a source row, we may neither do UPDATE (because of RLS) nor INSERT because a matching tuple already exists. But someone may argue that we should stay closer to regular UPDATE/DELETE. Apart from that, are there any security angles that we need to be mindful of and would those impact the choice?
SELECT policies will be applied to the target table during the scan and rows which do not pass SELECT quals will not be processed at all. If there are NOT MATCHED actions, we might end up inserting duplicate rows in that case or throw errors, but I don't see anything wrong with that. Similar things would have happened if someone tried to insert rows into the table using regular INSERT.
Similarly, INSERT policies will be applied when MERGE attempts to INSERT a row into the table and error will be thrown if the row does not satisfy INSERT policies.
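To make the proposed per-policy behaviour concrete, here is a minimal hedged sketch; all table, column, and policy names are invented for illustration, and the exact error-vs-skip behaviour for the UPDATE policy is precisely the open question above:

```sql
-- Hypothetical target table with row-level security enabled.
CREATE TABLE accounts (id int PRIMARY KEY, owner text, balance int);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- SELECT policy: applied during the scan. Rows failing it are never
-- joined, so they can appear NOT MATCHED to MERGE (possibly leading to
-- a duplicate-key error on INSERT, as with a regular INSERT).
CREATE POLICY accounts_sel ON accounts FOR SELECT
    USING (owner = current_user);

-- UPDATE policy: under the proposed semantics, its USING qual is checked
-- only *after* MERGE has decided to take a WHEN MATCHED ... UPDATE action.
CREATE POLICY accounts_upd ON accounts FOR UPDATE
    USING (owner = current_user);

-- INSERT policy: WITH CHECK is enforced when a WHEN NOT MATCHED action
-- inserts a row, exactly as for a regular INSERT.
CREATE POLICY accounts_ins ON accounts FOR INSERT
    WITH CHECK (owner = current_user);

-- A MERGE that may delete, update, or insert depending on the match
-- (new_balances is an invented source table):
MERGE INTO accounts a
USING new_balances n ON a.id = n.id
WHEN MATCHED AND n.balance = 0 THEN DELETE
WHEN MATCHED THEN UPDATE SET balance = n.balance
WHEN NOT MATCHED THEN INSERT (id, owner, balance)
    VALUES (n.id, current_user, n.balance);
```

If `accounts_upd`'s USING qual fails for a matched row, the choice is between silently skipping the row (regular UPDATE behaviour) and raising an error (INSERT ON CONFLICT DO UPDATE behaviour).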
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greetings Pavan,

* Pavan Deolasee (pavan.deolasee@gmail.com) wrote:
> On Tue, Feb 6, 2018 at 3:37 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > Coming out of that, my understanding is that Simon is planning to have a
> > patch which implements RLS and partitioning (though the query plans for
> > partitioning may be sub-par and not ideal) as part of MERGE and I've
> > agreed to review at least the RLS bits (though my intention is to at
> > least go through the rest of the patch as well, though likely in less
> > detail). Of course, I encourage others to review it as well.
>
> Thanks for volunteering to review the RLS bits. I am planning to send a
> revised version soon. As I work through it, I am faced with some semantic
> questions again. Would appreciate if you or anyone have answers to those.

Thanks for working on this.

> While executing MERGE, for existing tuples in the target table, we may end
> up doing an UPDATE or DELETE, depending on the WHEN MATCHED AND conditions.
> So it seems unlikely that we would be able to push down USING security
> quals down to the scan. For example, if the target row is set for deletion,
> it seems wrong to exclude the row from the join based on UPDATE policy's
> USING quals. So I am thinking that we should apply the respective USING
> quals *after* the decision to either update, delete or do nothing for the
> given target tuple is made.
>
> The question I have is, if the USING qual evaluates to false or NULL,
> should we silently ignore the tuple (like regular UPDATE does) or throw an
> error (like INSERT ON CONFLICT DO UPDATE)? ISTM that we might have decided
> to throw an error in case of INSERT ON CONFLICT to avoid any confusion
> where the tuple is neither inserted nor updated. Similar situation may
> arise with MERGE because for a source row, we may neither do UPDATE
> (because of RLS) nor INSERT because a matching tuple already exists. But
> someone may argue that we should stay closer to regular UPDATE/DELETE.

The reasoning behind the INSERT ON CONFLICT DO UPDATE approach when it comes to RLS is that we specifically didn't want to end up "losing" data- an INSERT which doesn't actually INSERT a row (or take the UPDATE action if the row already exists) ends up throwing that data away even though clearly the user expected us to do something with it, which is why we throw an error in that case instead.

For my part, at least, it seems like MERGE would likewise possibly be throwing away data that the user is expecting to be incorporated if no action is taken due to RLS, and that then argues that we should be throwing an error in such a case, similar to the INSERT ON CONFLICT DO UPDATE case.

> Apart from that, are there any security angles that we need to be mindful
> of and would those impact the choice?

Regarding the above, no security issues come to mind with either approach. The security concerns are primarily not allowing rows to be directly seen which the user does not have access to, and not allowing the action to result in rows being added, modified, or removed in a way which would violate the policies defined.

> SELECT policies will be applied to the target table during the scan and
> rows which do not pass SELECT quals will not be processed at all. If there
> are NOT MATCHED actions, we might end up inserting duplicate rows in that
> case or throw errors, but I don't see anything wrong with that. Similar
> things would have happened if someone tried to insert rows into the table
> using regular INSERT.

I agree, this seems like the right approach.

> Similarly, INSERT policies will be applied when MERGE attempts to INSERT a
> row into the table and error will be thrown if the row does not satisfy
> INSERT policies.

That sounds correct to me.

Thanks!

Stephen
Hi Stephen,
On Wed, Feb 14, 2018 at 9:59 PM, Stephen Frost <sfrost@snowman.net> wrote:
The reasoning behind the INSERT ON CONFLICT DO UPDATE approach when it
comes to RLS is that we specifically didn't want to end up "losing"
data- an INSERT which doesn't actually INSERT a row (or take the UPDATE
action if the row already exists) ends up throwing that data away even
though clearly the user expected us to do something with it, which is
why we throw an error in that case instead.
For my part, at least, it seems like MERGE would likewise possibly be
throwing away data that the user is expecting to be incorporated if no
action is taken due to RLS and that then argues that we should be
throwing an error in such a case, similar to the INSERT ON CONFLICT DO
UPDATE case.
Thanks for confirming the approach. That matches my own thinking too.
Here is an updated v16a patch. This now supports RLS based on what we agreed above. I've added a few test cases to confirm that RLS works correctly with MERGE. We can add more later if needed.
This version now also adds support for the OVERRIDING clause in the INSERT statement. The var referencing issue pointed out by Peter is also fixed by this version. So to summarise, the following things are now working:
- Partitioning
- Subqueries in WHEN AND quals and UPDATE targetlists
- Row level security (documentation changes are pending)
- OVERRIDING clause
- Various bugs reported by Peter are fixed (I haven't double-checked that every issue is addressed, but we should be fairly close)
- Issues so far reported by Andreas Seltenreich as part of sqlsmith testing are fixed
I haven't yet addressed Tomas's review comment on using WAL position for write detection. I'm waiting for Simon to come back from holidays before doing anything there. One idea is to use some variant of contain_mutable_functions() and throw an error during query planning, but not do anything during execution.
Other than that, I think we are getting to a point where the patch is mostly feature complete. We still need to decide the concurrent execution behaviour, but hopefully code changes there will be limited (unless we try to do something very invasive, which I hope we don't).
I'm not entirely happy with the way we're going about resolving column references in INSERT/UPDATE targetlist and subsequently what we are doing in setrefs.c. I'll continue to look for improvements there and also hoping to get suggestions from this list.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Andreas,
On Sun, Feb 4, 2018 at 5:45 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
Attached are testcases that trigger the assertions when run against an
empty database instead of the one left behind by make installcheck. The
"unrecognized node type" one appears to be a known issue according to
the wiki, so I didn't bother doing the work for this one as well.
Thanks for doing those tests. I've just sent v16a version of the patch and I think it fixes the issues reported so far. Can you please recheck? Please let me know if there are other issues detected by sqlsmith or otherwise.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Feb 11, 2018 at 11:09 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Fri, Feb 9, 2018 at 6:53 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> My concern is mostly just that MERGE manages to behave in a way that
>> actually "provides a single SQL statement that can conditionally
>> INSERT, UPDATE or DELETE rows, a task that would otherwise require
>> multiple procedural language statements", as the docs put it. As long
>> as MERGE manages to do something as close to that high level
>> description as possible in READ COMMITTED mode (with our current
>> semantics for multiple statements in RC taken as the baseline), then
>> I'll probably be happy.
>
> IMO it will be quite hard, if not impossible, to guarantee the same
> semantics to a single statement MERGE and multi statement
> UPDATE/DELETE/INSERT in RC mode. For example, the multi statement model
> will execute each statement with a new MVCC snapshot and hence the rows
> visible to individual statements may vary. Whereas in MERGE, everything
> runs with a single snapshot. There could be other such subtle differences.

I didn't mean this literally. For simple cases, an EPQ walk of the update chain is kind of like acquiring a new snapshot.

ISTM that a MERGE isn't really a thing that replaces 2 or 3 other DML statements, at least in most cases. It's more like a replacement for procedural code with an outer join, with an INSERT, UPDATE or DELETE that affects zero or one rows inside the procedural loop that processes matching/non-matching rows. The equivalent procedural code could ultimately perform *thousands* of snapshot acquisitions for thousands of RC DML statements. MERGE is sometimes explained in terms of "here is the kind of procedural code that you don't have to write anymore, thanks to MERGE" -- that's what the code looks like.

I attach a rough example of this, that uses plpgsql.

>> Some novel new behavior -- "EPQ with a twist" -- is clearly necessary.
>> I feel a bit uneasy about it because anything that anybody suggests is
>> likely to be at least a bit arbitrary (EPQ itself is kind of
>> arbitrary). We only get to make a decision on how "EPQ with a twist"
>> will work once, and that should be a decision that is made following
>> careful deliberation. Ambiguity is much more likely to kill a patch
>> than a specific technical defect, at least in my experience. Somebody
>> can usually just fix a technical defect.
>
> While I agree, I think we need to make these decisions in a time bound
> fashion. If there is too much ambiguity, then it's not a bad idea to settle
> for throwing appropriate errors instead of providing semantically wrong
> answers, even in some remote corner case.

Everything is still on the table, I think.

> Ok. I am now back from holidays and I will too start thinking about this.
> I've also requested a colleague to help us with comparing it against
> Oracle's behaviour. That's not a gold standard for us, but knowing how
> other major databases handle RC conflicts is not a bad idea.

The fact that Oracle doesn't allow WHEN MATCHED ... AND quals did seem like it might be significant to me.

I think that any theoretical justification for one behavior over another will be hard for anyone to come up with. As I said before, this is not an area where something like the SQL standard provides us with a platonic ideal. The best behavior is likely to be one that lives up to the high level description of MERGE, is as close as possible to existing behaviors, and is not otherwise surprising.

> I see the following important areas and as long as we have a consistent
> and coherent handling of these cases, we should not have difficulty
> agreeing on an outcome.
>
> 1. Concurrent UPDATE does not affect the MATCHED case. The WHEN conditions
> may or may not be affected.
> 2. Concurrently UPDATEd tuple fails the join qual and the current source
> tuple no longer matches with the updated target tuple that the EPQ is set
> for. It matches no other target tuple either. So a MATCHED case is turned
> into a NOT MATCHED case.
> 3. Concurrently UPDATEd tuple fails the join qual and the current source
> tuple no longer matches with the updated target tuple that the EPQ is set
> for. But it matches some other target tuple. So it's still a MATCHED case,
> but with different target tuple(s).
> 4. Concurrent UPDATE/INSERT creates a matching target tuple for a source
> tuple, thus turning a NOT MATCHED case into a MATCHED case.
> 5. Concurrent DELETE turns a MATCHED case into a NOT MATCHED case.
>
> Any other case that I am missing? Assuming all cases are covered, what
> should we do in each of these cases, so that there is no or very little
> ambiguity and the outcome seems consistent (at the very least as far as
> MERGE goes and hopefully with regular UPDATE/DELETE handling)?

Uhhh... I still need to spend more time on this. Sorry.

> It seems natural that #5 should skip the MATCHED action and instead
> execute the first satisfying NOT MATCHED action. But its outcome may
> depend on how we handle #2 and #3 so that they are all consistent. If we
> allow #2 and #3 to error out whenever there is ambiguity, we should do the
> same for #5.

I definitely agree on behaving consistently in the way you describe here.

--
Peter Geoghegan
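The concurrency cases above can be made concrete with a minimal two-session sketch under READ COMMITTED; table and column names here are invented for illustration, and the comments describe the open question rather than any settled behaviour:

```sql
-- Case 5: a concurrent DELETE turns a MATCHED case into NOT MATCHED.

-- Session 2: delete a row that session 1's MERGE will try to match.
BEGIN;
DELETE FROM target WHERE id = 1;   -- row lock held, not yet committed

-- Session 1 (READ COMMITTED): blocks on the row locked by session 2.
MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);

-- Session 2: commit, releasing the lock.
COMMIT;

-- Session 1 resumes; its EPQ-style recheck finds the target row gone.
-- Should the source row now fire the WHEN NOT MATCHED INSERT, or should
-- MERGE raise a serialization error? That is the question the cases
-- above are meant to settle consistently.
```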
Pavan Deolasee writes:
> Thanks for doing those tests. I've just sent v16a version of the patch and
> I think it fixes the issues reported so far. Can you please recheck? Please
> let me know if there are other issues detected by sqlsmith or otherwise.

I re-did the testing with merge_v16a applied to master at 7923118c16 with ad7dbee368a reverted because of conflicts. I can confirm that the previous testcases don't fail anymore, but sqlsmith readily triggers the following assertion:

TRAP: FailedAssertion("!(mergeTargetRelation > 0)", File: "planner.c", Line: 1496)

Testcase attached.

regards,
Andreas
Hi Andreas,
Sorry for the late response; I was busy at PGConf India.
On Sun, Feb 18, 2018 at 4:48 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
Pavan Deolasee writes:
> Thanks for doing those tests. I've just sent v16a version of the patch and
> I think it fixes the issues reported so far. Can you please recheck? Please
> let me know if there are other issues detected by sqlsmith or otherwise.
I re-did the testing with merge_v16a applied to master at 7923118c16
with ad7dbee368a reverted because of conflicts. I can confirm that the
previous testcases don't fail anymore,
Thanks for confirming.
but sqlsmith readily triggers the
following assertion:
TRAP: FailedAssertion("!(mergeTargetRelation > 0)", File: "planner.c",
Line: 1496)
Ah. Looks like the support and test cases for sub-partitioning were missing. I've attached v17a of the patch which fixes this and some other issues I noticed as part of your testing. To provide more details, while we allow more complex un-aliased joins for the source relation, referencing columns from the join was broken. I've now fixed that.
I've added relevant test cases to the regression. Also added some tests to check function scans.
The patch is rebased on the current master. Please let me know how the sqlsmith tests go with this new version.
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Feb 16, 2018 at 6:37 AM, Peter Geoghegan <pg@bowt.ie> wrote:
ISTM that a MERGE isn't really a thing that replaces 2 or 3 other DML
statements, at least in most cases. It's more like a replacement for
procedural code with an outer join, with an INSERT, UPDATE or DELETE
that affects zero or one rows inside the procedural loop that
processes matching/non-matching rows. The equivalent procedural code
could ultimately perform *thousands* of snapshot acquisitions for
thousands of RC DML statements. MERGE is sometimes explained in terms
of "here is the kind of procedural code that you don't have to write
anymore, thanks to MERGE" -- that's what the code looks like.
I attach a rough example of this, that uses plpgsql.
Thanks for writing the sample code. I understand you probably don't mean to suggest that we need to mimic the behaviour of the plpgsql code, and the semantics offered by MERGE would most likely differ from what the plpgsql sample does. There are several problems with the plpgsql code:
- It would never turn a MATCHED case into a NOT MATCHED case because of concurrent UPDATE/DELETE
- The WHERE clauses attached to the UPDATE/DELETE statement should be using the quals attached to the WHEN clauses to ensure they are evaluated on the new version of the row, if needed.
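The second point can be sketched as a hedged illustration; this is not Peter's actual attachment, and names like `matched_id` stand in for variables the surrounding plpgsql loop would supply:

```sql
-- Procedural emulation of: WHEN MATCHED AND s.val > 0 THEN UPDATE ...
-- Repeating the WHEN ... AND qual in the UPDATE's own WHERE clause makes
-- it get re-evaluated against the *latest* row version via the EPQ
-- recheck under READ COMMITTED, instead of only against the version the
-- outer driving query saw.
UPDATE target t
   SET val = s.val
  FROM source s
 WHERE t.id = s.id
   AND t.id = matched_id   -- hypothetical: the row the loop is processing
   AND s.val > 0;          -- the WHEN ... AND qual, rechecked on conflict
```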
>> Some novel new behavior -- "EPQ with a twist"-- is clearly necessary.
>> I feel a bit uneasy about it because anything that anybody suggests is
>> likely to be at least a bit arbitrary (EPQ itself is kind of
>> arbitrary). We only get to make a decision on how "EPQ with a twist"
>> will work once, and that should be a decision that is made following
>> careful deliberation. Ambiguity is much more likely to kill a patch
>> than a specific technical defect, at least in my experience. Somebody
>> can usually just fix a technical defect.
TBH that's one reason why I like Simon's proposed behaviour of throwing errors in corner cases. I am not suggesting that's what we do at the end, but it's definitely worth considering.
>
>
> While I agree, I think we need to make these decisions in a time bound
> fashion. If there is too much ambiguity, then it's not a bad idea to settle
> for throwing appropriate errors instead of providing semantically wrong
> answers, even in some remote corner case.
Everything is still on the table, I think.
Ok.
> Ok. I am now back from holidays and I will too start thinking about this.
> I've also requested a colleague to help us with comparing it against
> Oracle's behaviour. That's not a gold standard for us, but knowing how
> other major databases handle RC conflicts, is not a bad idea.
The fact that Oracle doesn't allow WHEN MATCHED ... AND quals did seem
like it might be significant to me.
Here are some observations from Rahila's analysis so far. I must say, Oracle's handling seems quite inconsistent, especially the conditions under which it sometimes re-evaluates the join and sometimes doesn't.
- Oracle does not support multiple WHEN MATCHED clauses. So the question of re-checking all WHEN clauses does not arise.
- Only one UPDATE and one DELETE clause is supported. The DELETE must be used in conjunction with UPDATE.
- The DELETE clause is invoked iff the UPDATE clause is invoked. It works on the updated rows. Since the row is already updated (and locked) by the MERGE, DELETE action never blocks on a concurrent update/delete
- MERGE does not allow updating the column used in the JOIN's ON qual
- In case of concurrent UPDATE, the join is re-evaluated iff the concurrent UPDATE updates (modifies?) the same column that MERGE is updating, OR a column that MERGE references in the WHERE clause is updated by the concurrent UPDATE. IOW if the MERGE and concurrent UPDATE operate on different columns, the join is NOT re-evaluated, thus possibly invoking a WHEN MATCHED action on a row which no longer matches the join condition.
- In case of concurrent DELETE, the join is re-evaluated and the action may change from MATCHED to NOT MATCHED
I am quite surprised by its behaviour of re-evaluating the join only when certain columns are updated. It looks to me that irrespective of what we choose, our implementation would be much superior to what Oracle offers.
BTW I've sent v17a of the patch, which is very close to being complete from my perspective (except some documentation fixes/improvements). The only thing pending is the decision to accept or change the currently implemented concurrency semantics.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Feb 27, 2018 at 10:19 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> I attach a rough example of this, that uses plpgsql.
>
> Thanks for writing the sample code. I understand you probably don't mean to
> suggest that we need to mimic the behaviour of the plpgsql code and the
> semantics offered by MERGE would most likely be different than what the
> plpgsql sample does. Because there are several problems with the plpgsql
> code:
>
> - It would never turn a MATCHED case into a NOT MATCHED case because of
> concurrent UPDATE/DELETE
> - The WHERE clauses attached to the UPDATE/DELETE statement should be using
> the quals attached to the WHEN clauses to ensure they are evaluated on the
> new version of the row, if needed.

It's definitely possible to poke holes in my plpgsql example, most (or all) of which are not fixable within the confines of what plpgsql can do. I still think that it's really useful to frame the discussion with examples of the kind of procedural code MERGE replaces, though, because:

* That's the explicit purpose of MERGE, according to the SQL standard. Everyone is very clear that the join outputs rows that are separately inserted, updated, or deleted (additional "WHEN ... AND" quals are evaluated separately). We should be clear on that, too. We're quite specifically replacing procedural code that follows a general pattern. We might even give an example of such procedural code in the docs, as SQL Server does.

* It shows that in some ways, the INSERT/UPDATE/DELETE parts are separate "zero or one row" statements. They could do their own snapshot acquisitions within RC mode, for example. Also, you don't get the ON CONFLICT behavior with excluded being affected by BEFORE ROW triggers within UPDATE expression evaluation for ON CONFLICT DO UPDATE.

* My example is buggy, but seemingly only in a way that is just about unavoidable -- all the bugs are in RC mode with concurrent writes. Poking holes in what I came up with is actually useful, and may be less confusing than discussing the same issues some other way.

There are very few users in the world that would understand these issues. Moreover, if all affected users that have this kind of code in the wild were to somehow magically develop a strong understanding of this stuff overnight, even then I'm not sure that much would change. They still use RC mode for all the usual reasons, and they mostly won't have any way of fixing concurrency issues that is actually acceptable to them. In short, I'm not sure that I can fix the problems with my plpgsql code, so what chance do they have of fixing theirs?

>> >> Some novel new behavior -- "EPQ with a twist" -- is clearly necessary.
>> >> I feel a bit uneasy about it because anything that anybody suggests is
>> >> likely to be at least a bit arbitrary (EPQ itself is kind of
>> >> arbitrary). We only get to make a decision on how "EPQ with a twist"
>> >> will work once, and that should be a decision that is made following
>> >> careful deliberation. Ambiguity is much more likely to kill a patch
>> >> than a specific technical defect, at least in my experience. Somebody
>> >> can usually just fix a technical defect.
>
> TBH that's one reason why I like Simon's proposed behaviour of throwing
> errors in case of corner cases. I am not suggesting that's what we do at
> the end, but it's definitely worth considering.

I now feel like Simon's suggestion of throwing an error in corner cases isn't so bad. It still seems like we could do better, but the more I think about it, the less that seems like a cop-out. My reasons are:

* As I already said, my plpgsql example, while buggy, might actually be the safest way to get the intended behavior in RC mode today. I can definitely imagine a way of dealing with concurrency that is both safer and less prone to throwing weird errors, but the fact remains that my example is the state of the art here, in a way.

* Simon has already introduced something that looks like "EPQ with a twist" to me -- the steps that happen before he even raises this error. IOW, he does something extra that is EPQ-like. He likely does a lot better than my plpgsql example manages to, I think.

* I suspect that the kind of users that really like ON CONFLICT DO UPDATE's simplicity (in terms of what it guarantees them) are unlikely to care about MERGE at all. The kind of user that cares about MERGE is likely to have at least heard of the isolation levels.

* I do see an important difference between making likely-unexpected errors in RC mode very unlikely, and making them *impossible*. This patch is not ON CONFLICT DO UPDATE, though, and that strong guarantee simply isn't on the table.

* We can all agree that *not* raising an error in the specific way Simon proposes is possible, somehow or other. We also all agree that avoiding the broader category of RC errors can only be taken so far (e.g. in any event duplicate violation errors are entirely possible, in RC mode, when a MERGE inserts a row). So this is a question of what exact middle ground to take. Neither of the two extremes (throwing an error on the first sign of an RC conflict, and magically preventing concurrency anomalies) is actually on the table.

I will not block this patch because it merely makes throwing likely-unexpected errors in RC mode "very unlikely", rather than "very very unlikely". Not least because I have a hard time imagining a single user caring about the difference that still exists with Simon's less ambitious (though not entirely unambitious) version of "EPQ with a twist".

> Here are some observations from Rahila's analysis so far. I must say,
> Oracle's handling seems quite inconsistent, especially the conditions
> under which it sometimes re-evaluates the join and sometimes doesn't.
>
> I am quite surprised by its behaviour of re-evaluating the join only when
> certain columns are updated. It looks to me that irrespective of what we
> choose, our implementation would be much superior to what Oracle offers.

I'm not that surprised that it's generally kind of arbitrary, though the fact that you can update something while the join qual no longer passes does seem particularly poor.

BTW, one thing that I remember very clearly from my research on MERGE years ago is this: all of the major implementations were in some way or other quite buggy. At least in RC mode, MERGE kind of promises something that it really can't quite deliver. Implementations are seemingly left to pick and choose how to paper over those cracks. The description of what Oracle allows here does make me feel better about our direction. Having the least-worst semantics on RC conflict handling certainly seems good enough to me. Especially because we have a "true UPSERT" already, unlike both Oracle and SQL Server.

> BTW I've sent v17a of the patch, which is very close to being complete
> from my perspective (except some documentation fixes/improvements). The
> only thing pending is the decision to accept or change the currently
> implemented concurrency semantics.

I need to go look at that. I'll try to take a firmer position on this. I know that I've been saying that for quite a while now, but my failure to take a firmer position for so long is not because I didn't try. It's because there is no really good answer.

--
Peter Geoghegan
On Tue, Feb 27, 2018 at 5:07 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I now feel like Simon's suggestion of throwing an error in corner
> cases isn't so bad. It still seems like we could do better, but the
> more I think about it, the less that seems like a cop-out. My reasons
> are:

I still think we really ought to try not to add a new class of error.

> * We can all agree that *not* raising an error in the specific way
> Simon proposes is possible, somehow or other. We also all agree that
> avoiding the broader category of RC errors can only be taken so far
> (e.g. in any event duplicate violation errors are entirely possible,
> in RC mode, when a MERGE inserts a row). So this is a question of what
> exact middle ground to take. Neither of the two extremes (throwing an
> error on the first sign of an RC conflict, and magically preventing
> concurrency anomalies) is actually on the table.

Just because there's no certainty about which behavior is best doesn't mean that none of them are better than throwing an error.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Feb 28, 2018 at 8:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 27, 2018 at 5:07 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> I now feel like Simon's suggestion of throwing an error in corner
>> cases isn't so bad. It still seems like we could do better, but the
>> more I think about it, the less that seems like a cop-out. My reasons
>> are:
>
> I still think we really ought to try not to add a new class of error.

I do too. However, it's only fair to acknowledge that the lack of any strong counterproposal after a month or so tells us something.

>> * We can all agree that *not* raising an error in the specific way
>> Simon proposes is possible, somehow or other. We also all agree that
>> avoiding the broader category of RC errors can only be taken so far
>> (e.g. in any event duplicate violation errors are entirely possible,
>> in RC mode, when a MERGE inserts a row). So this is a question of what
>> exact middle ground to take. Neither of the two extremes (throwing an
>> error on the first sign of an RC conflict, and magically preventing
>> concurrency anomalies) is actually on the table.
>
> Just because there's no certainty about which behavior is best doesn't
> mean that none of them are better than throwing an error.

Bear in mind that the patch doesn't throw an error at the first sign of trouble. It throws an error after doing an EPQ style walk, which individually verifies the extra "WHEN ... AND" quals for every tuple in the update chain (ExecUpdate()/ExecDelete() return after the first EPQ tuple check for MERGE, so that an ExecMerge() caller can do that extra part with a special EPQ slot). The controversial serialization error only comes when the original basic assessment of MATCHED needs to change, without regard to extra "WHEN ... AND" quals (and only when the MERGE statement was written in such a way that that matters -- it must have a WHEN NOT MATCHED clause, too).

AFAICT, the patch will only throw an error when the join quals no longer pass -- not when the extra "WHEN ... AND" quals no longer pass. That would be far worse, IMV. ExecMerge() will retry until it reaches the end of the update chain, possibly changing its mind about which particular WHEN case applies repeatedly, going back and forth between mergeActions (WHEN cases) as the update chain is walked. The patch can do a lot to roll with conflicts before it gives up.

Conflicts are generally caused by concurrent DELETEs, so you could say that this offers a kind of symmetry with concurrent INSERTs causing duplicate violation errors. (Concurrent UPDATEs that change the join qual values could provoke the error too, but presumably we're joining on the PK in most cases, so that seems much less likely than a simple DELETE.)

Bear in mind that both the SQL Server and Oracle implementations have many significant restrictions/caveats around doing something cute with the main join quals. It's not surprising that we'd have something like that, too. According to Rahila, Oracle doesn't allow MERGE to update the columns in the join qual at all, and only allows a single WHEN MATCHED case (a DELETE can only come after an UPDATE in Oracle, oddly enough). SQL Server's docs have a warning that states: "Do not attempt to improve query performance by filtering out rows in the target table in the ON clause, such as by specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results." What does that even mean!?

I am slightly concerned that the patch may still have livelock hazards in its approach to RC conflict handling. I haven't studied that aspect in enough detail, though. Since we always make forward progress by following an update chain, any problem here is probably fixable.

--
Peter Geoghegan
On Fri, Feb 9, 2018 at 6:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Here's my $0.02: I think that new concurrency errors thrown by the
> merge code itself deserve strict scrutiny and can survive only if they
> have a compelling justification, but if the merge code happens to
> choose an action that leads to a concurrency error of its own, I think
> that deserves only mild scrutiny.
>
> On that basis, of the options I listed in
> http://postgr.es/m/CA+TgmoZDL-caukHkWet7sr7sqr0-e2T91+DEvhqeN5sfqsMjqw@mail.gmail.com
> I like #1 least.
>
> I also dislike #4 from that list for the reasons stated there. For
> example, if you say WHEN MATCHED AND x.some_boolean and then WHEN
> MATCHED, you expect that every tuple that hits the latter clause will
> have that Boolean as false or null, but #4 makes that not true.
>
> I think the best options are #2 and #5 -- #2 because it's simple, and
> #5 because it's (maybe) more intuitive, albeit at the risk of
> livelock. But I'm willing to be convinced of some other option; like
> you, I'm willing to be open-minded about this. But, as you say, we
> need a coherent, well-considered justification for the selected
> option, not just "well, this is what we implemented so you should like
> it". The specification should define the implementation, not the
> reverse.

At first I hated option #2. I'm warming to #2 a lot now, though, because I've come to understand the patch's approach a little better. (Pavan and Simon should still verify that I got things right in my mail from earlier today, though.)

It now seems like the patch throws a RC serialization error more or less only due to concurrent deletions (rarely, it will actually be a concurrent update that changed the join qual values of our MERGE). We're already not throwing the error (we just move on to the next input row from the join) when we happen to not have a WHEN NOT MATCHED case. But why even make that distinction? Why not just go ahead and WHEN NOT MATCHED ...
INSERT when the MERGE happens to have such a clause? The INSERT will succeed, barring any concurrent conflicting insertion by a third session -- a hazard that has nothing to do with RC conflict handling in particular. Just inserting without throwing a RC serialization error is almost equivalent to a new MVCC snapshot being acquired due to a RC conflict, so all of this now looks okay to me. Especially because every other MERGE implementation seems to have serious issues with doing anything too fancy with the MERGE target table's quals within the main ON join. I think that SQL Server actually assumes that you're using the target's PK in a simple equi-join. All the examples look like that, and this assumption is heavily suggested by the "Do not attempt to improve query performance by filtering out rows in the target table in the ON clause" weasel words from their docs, that I referenced in my mail from earlier today. I can get my head around all of this now only because I've come to understand that once we've decided that a given input from the main join is NOT MATCHED, we stick to that determination. We don't bounce around between MATCHED and NOT MATCHED cases during an EPQ update chain walk. We can bounce around between multiple alternative MATCHED merge actions/WHEN cases, but that seems okay because it's still part of essentially the same EPQ update chain walk -- no obvious livelock hazard. It seems fairly clean to restart everything ("goto lmerge") for each and every tuple in the update chain. Maybe we should actually formalize that you're only supposed to do a PK or unique index equi-join within the main ON join, though -- you can do something fancy with the source table, but not the target table. -- Peter Geoghegan
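To make the "idiomatic MERGE" shape concrete, here is a sketch of what I have in mind. The table and column names are hypothetical, and the syntax follows the SGML ref page posted earlier in this thread; the point is simply that the ON clause is a plain PK equi-join, with everything fancier pushed into WHEN ... AND quals:

```sql
-- Hypothetical tables: the ON clause is a simple equi-join on the
-- target's primary key; row filtering happens in WHEN ... AND quals.
MERGE INTO account a
USING pending_txn t
    ON a.id = t.account_id
WHEN MATCHED AND a.balance + t.delta = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET balance = a.balance + t.delta
WHEN NOT MATCHED THEN
    INSERT (id, balance) VALUES (t.account_id, t.delta);

-- By contrast, the shape SQL Server's docs warn against: filtering
-- out target rows within the main ON join itself.
-- MERGE INTO account a
-- USING pending_txn t
--     ON a.id = t.account_id AND a.balance > 0
-- ...
```

The first form is what every example in the SQL Server and Oracle docs looks like; the commented-out form is the one their caveats are about.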
On Thu, Mar 1, 2018 at 3:50 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Feb 9, 2018 at 6:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Here's my $0.02: I think that new concurrency errors thrown by the
> merge code itself deserve strict scrutiny and can survive only if they
> have a compelling justification, but if the merge code happens to
> choose an action that leads to a concurrency error of its own, I think
> that deserves only mild scrutiny.
>
> On that basis, of the options I listed in
> http://postgr.es/m/CA+TgmoZDL-caukHkWet7sr7sqr0-e2T91+DEvhqeN5sfqsMjqw@mail.gmail.com
> I like #1 least.
>
> I also dislike #4 from that list for the reasons stated there. For
> example, if you say WHEN MATCHED AND x.some_boolean and then WHEN
> MATCHED, you expect that every tuple that hits the latter clause will
> have that Boolean as false or null, but #4 makes that not true.
>
> I think the best options are #2 and #5 -- #2 because it's simple, and
> #5 because it's (maybe) more intuitive, albeit at the risk of
> livelock. But I'm willing to be convinced of some other option; like
> you, I'm willing to be open-minded about this. But, as you say, we
> need a coherent, well-considered justification for the selected
> option, not just "well, this is what we implemented so you should like
> it". The specification should define the implementation, not the
> reverse.
At first I hated option #2. I'm warming to #2 a lot now, though,
because I've come to understand the patch's approach a little better.
(Pavan and Simon should still verify that I got things right in my
mail from earlier today, though.)
It now seems like the patch throws a RC serialization error more or
less only due to concurrent deletions (rarely, it will actually be a
concurrent update that changed the join qual values of our MERGE).
We're already not throwing the error (we just move on to the next
input row from the join) when we happen to not have a WHEN NOT MATCHED
case. But why even make that distinction? Why not just go ahead and
WHEN NOT MATCHED ... INSERT when the MERGE happens to have such a
clause? The INSERT will succeed, barring any concurrent conflicting
insertion by a third session -- a hazard that has nothing to do with
RC conflict handling in particular.
Just inserting without throwing a RC serialization error is almost
equivalent to a new MVCC snapshot being acquired due to a RC conflict,
so all of this now looks okay to me. Especially because every other
MERGE implementation seems to have serious issues with doing anything
too fancy with the MERGE target table's quals within the main ON join.
I think that SQL Server actually assumes that you're using the
target's PK in a simple equi-join. All the examples look like that,
and this assumption is heavily suggested by the "Do not attempt to
improve query performance by filtering out rows in the target table in
the ON clause" weasel words from their docs, that I referenced in my
mail from earlier today.
I think you've fairly accurately described what the patch does today. I take your point that we can very well just execute the WHEN NOT MATCHED action if the join condition fails for the updated tuple. There is one case we ought to think about, though, and that might explain why executing the WHEN NOT MATCHED action may not be the best choice. Or for that matter even skipping the target when no NOT MATCHED action exists, as the patch does today.
What if the updated tuple fails the join qual with respect to the current tuple from the source relation but it now matches some other tuple from the source relation? I described this case in one of the earlier emails too. In this case, we might end up doing an INSERT (if we decide to execute WHEN NOT MATCHED action), even though a MATCH exists. If there is no WHEN NOT MATCHED action, the current patch will just skip the updated tuple even though a match exists, albeit it's not the current source tuple.
Oracle behaves differently and it actually finds a new matching tuple from the source relation and executes the WHEN MATCHED action, using that source tuple. But I am seriously doubtful that we want to go down that path, or even whether it's feasible. Our regular UPDATE .. FROM does not do that either. Given that, it seems better to just throw an error (even when no NOT MATCHED action exists) and explain to the users that MERGE will work as long as concurrent updates don't modify the columns used in the join condition. Concurrent deletes should be fine and we may actually even invoke the WHEN NOT MATCHED action in that case.
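To spell the scenario out (hypothetical data; the MERGE shape is the usual PK equi-join):

```sql
-- target has one row (id = 1); source has rows (id = 1) and (id = 2).
--
-- Session 1:
--   MERGE INTO target t USING source s ON t.id = s.id ...
--   While processing source row (id = 1), it blocks on the matching
--   target row (id = 1), which session 2 has locked.
--
-- Session 2 commits:
--   UPDATE target SET id = 2 WHERE id = 1;
--
-- After EvalPlanQual, the updated target tuple (id = 2) no longer
-- joins with the source row (id = 1) being processed, so the patch
-- treats the case as NOT MATCHED -- even though that tuple would
-- now join with the other source row (id = 2).
```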
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 1, 2018 at 4:33 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote: > What if the updated tuple fails the join qual with respect to the current > tuple from the source relation but it now matches some other tuple from the > source relation? I described this case in one of the earlier emails too. In > this case, we might end up doing an INSERT (if we decide to execute WHEN NOT > MATCHED action), even though a MATCH exists. If there is no WHEN NOT MATCHED > action, the current patch will just skip the updated tuple even though a > match exists, albeit it's not the current source tuple. If it does happen, then that will typically be because someone else concurrently updated a row, changing the primary key attributes, or some unique index attributes that our MERGE joins on, which I think is pretty rare. I'm assuming that the user does an idiomatic MERGE, like every example I can find shows, where the join quals on the target table are simple equality predicates on the primary key attribute(s). I think it's fine to simply let the insertion fail with a duplicate error. Is this any different to a concurrent INSERT that produces that same outcome? If the MERGE isn't idiomatic in the way I describe, then the INSERT may actually succeed, which also seems fine. > Oracle behaves differently and it actually finds a new matching tuple from > the source relation and executes the WHEN MATCHED action, using that source > tuple. But I am seriously doubtful that we want to go down that path and > whether it's even feasible. I think that that's just a consequence of Oracle using statement level rollback to do RC conflict handling. It's the same with an UPDATE ... FROM, or any other type of UPDATE. > Our regular UPDATE .. FROM does not do that > either. 
Given that, it seems better to just throw an error (even when no NOT > MATCHED action exists) and explain to the users that MERGE will work as long > as concurrent updates don't modify the columns used in the join condition. > Concurrent deletes should be fine and we may actually even invoke WHEN NOT > MATCHED action in that case. Again, I have to ask: is such an UPDATE actually meaningfully different from a concurrent DELETE + INSERT? If so, why is a special error better than a dup violation, or maybe even having the INSERT (and whole MERGE statement) succeed? -- Peter Geoghegan
On 02/06/2018 03:40 PM, Tomas Vondra wrote:
> I plan to go through the patch and this thread over the next couple of
> days, and summarize what the current status is (or my understanding
> of it).

So, I promised a summary of the patch status about three weeks ago. I've been postponing that as Pavan was moving the patch forward pretty quickly, and now there's not much to summarize ... which is a good thing, I guess.

If my understanding is correct, MERGE now supports both partitioning and RLS, and subselects should also work in various places (which is somewhat consistent with my impression that claims about the SQL standard prohibiting them were not entirely accurate). Which leaves us with figuring out the right concurrency behavior, and that discussion seems to be in progress.

So at this point I'm not aware of any other missing features in the patch, and a more detailed summary is not really needed.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 2, 2018 at 12:36 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> Our regular UPDATE .. FROM does not do that
> either. Given that, it seems better to just throw an error (even when no NOT
> MATCHED action exists) and explain to the users that MERGE will work as long
> as concurrent updates don't modify the columns used in the join condition.
> Concurrent deletes should be fine and we may actually even invoke WHEN NOT
> MATCHED action in that case.
Again, I have to ask: is such an UPDATE actually meaningfully
different from a concurrent DELETE + INSERT? If so, why is a special
error better than a dup violation, or maybe even having the INSERT
(and whole MERGE statement) succeed?
Ok, I agree. I have updated the patch to remove the serialization error. If MATCHED changes to NOT MATCHED because of a concurrent update/delete, we now simply retry from the top and execute the first NOT MATCHED action, if the WHEN AND qual passes on the updated version. Of course, if the MERGE does not contain any NOT MATCHED action then we simply ignore the target row and move on to the next row. Since a NOT MATCHED case can never turn into a MATCHED case, there is no risk of a livelock.
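In other words (hypothetical timeline, both sessions in READ COMMITTED):

```sql
-- Session 1:
--   MERGE INTO target t USING source s ON t.id = s.id
--   WHEN MATCHED THEN UPDATE SET val = s.val
--   WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);
--   blocks trying to update the target row with id = 1 ...
--
-- Session 2:
--   DELETE FROM target WHERE id = 1; COMMIT;
--
-- Session 1:
--   MATCHED no longer holds for id = 1. Rather than raising a
--   serialization error, we retry from the top and the first
--   NOT MATCHED action re-inserts the row (assuming its WHEN AND
--   qual, if any, passes). With no NOT MATCHED action at all, the
--   row is simply skipped.
```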
I've updated the documentation and the test cases to reflect this change.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Mon, Mar 5, 2018 at 3:02 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> Again, I have to ask: is such an UPDATE actually meaningfully
>> different from a concurrent DELETE + INSERT? If so, why is a special
>> error better than a dup violation, or maybe even having the INSERT
>> (and whole MERGE statement) succeed?
>
> Ok, I agree. I have updated the patch to remove the serialization error.

Cool. This is really shaping up. Thank you for working so hard to address my concerns.

> If MATCHED changes to NOT MATCHED because of concurrent update/delete, we now
> simply retry from the top and execute the first NOT MATCHED action, if WHEN
> AND qual passes on the updated version. Of course, if the MERGE does not
> contain any NOT MATCHED action then we simply ignore the target row and move
> to the next row. Since a NOT MATCHED case can never turn into a MATCHED
> case, there is no risk of a live lock.

Makes sense. We either follow an UPDATE chain, which must always make forward progress, or do NOT MATCHED handling/insert a row, which only happens once WHEN MATCHED handling has been abandoned, and so similarly must make forward progress.

I think that this needs to be documented in a central place, though. I'd like to see arguments for why the design is livelock safe, as well as an explanation for how EPQ handling works, what the goals of how it works are, and so on. I would perhaps summarize it by saying something like the following within ExecMerge():

"""
ExecMerge() EPQ handling repeatedly rechecks all WHEN MATCHED actions for each new candidate tuple as an update chain is followed, either until a tuple is actually updated/deleted, or until we decide that the new join input row actually requires WHEN NOT MATCHED handling after all. Rechecking join quals for a single candidate tuple is performed by ExecUpdate() and ExecDelete(), which can specially instruct us to advance to the next tuple in the update chain so that we can recheck our own WHEN ...
AND quals (when the join quals no longer pass due to a concurrent UPDATE), or to switch to the WHEN NOT MATCHED case (when join quals no longer pass due to a concurrent DELETE). EPQ only ever installs a new tuple for the target relation (never the source), so changing from one WHEN MATCHED action to another during READ COMMITTED conflict handling must be due to a concurrent UPDATE that changes WHEN ... AND qual referenced attribute(s). If it was due to a concurrent DELETE, or, equivalently, some concurrent UPDATE that affects the target's primary key attribute(s) (or whatever the outer join's target qual attributes are), then we switch to WHEN NOT MATCHED handling, which will never switch back to WHEN MATCHED. ExecMerge() avoids livelocks by always either walking the UPDATE chain, which makes forward progress, or by finally switching to WHEN NOT MATCHED processing. """ ExecUpdate()/ExecDelete() are not "driving" EPQ here, which is new -- they're doing one small part of it (the join quals for one tuple) on behalf of ExecMerge(). I don't expect you to just take my wording without editing or moving parts around a bit; I think that you'll get the general idea from what I've written, though. Actually, my wording may need to be changed to reflect a new code structure for EPQ. The current control flow makes the reader think about UPDATE quals, as well as DELETE quals. Instead, the reader should think about WHEN ... AND quals, as well as MERGE's outer JOIN quals (the join quals are the same, regardless of whether an UPDATE or DELETE happens to be involved). The control flow between ExecMerge(), ExecUpdate(), and ExecDelete() is just confusing. Sure, ExecUpdate() and ExecDelete() *do* require a little special handling for MERGE, but ExecMerge() should itself call EvalPlanQual() for the join quals, rather than getting ExecUpdate()/ExecDelete() to do it. All that ExecUpdate()/ExecDelete() need to tell ExecMerge() is "you need to do your HeapTupleUpdated stuff". 
This not only clarifies the new EPQ design -- it also lets you do that special case rti EPQ stuff in the right place (BTW, I'm not a huge fan of that detail in general, but haven't thought about it enough to say more). The new EPQ code within ExecUpdate() and ExecDelete() is already identical, so why not do it that way? I especially like this organization because ExecOnConflictUpdate() already calls ExecUpdate() without expecting it to do EPQ handling at all, which is kind of similar to what I suggest -- in both cases, the caller "takes care of all of the details of the scan". And, by returning a HTSU_Result value to ExecMerge() from ExecUpdate()/ExecDelete(), you could also do the new cardinality violation stuff from within ExecMerge() in exactly one place -- no need to invent new error_on_SelfUpdate arguments. I would also think about organizing ExecMerge() handling so that CMD_INSERT handling became WHEN NOT MATCHED handling. It's unnecessarily confusing to have that included in the same switch statement as CMD_UPDATE and CMD_DELETE, since that organization fails to convey that once we reach WHEN NOT MATCHED, there is no going back. Actually, I suppose the new EPQ organization described in the last few paragraphs already implies that you'd do this CMD_INSERT stuff, since the high level goal is breaking ExecMerge() into WHEN MATCHED and WHEN NOT MATCHED sections (doing "commonality and variability analysis" for CMD_UPDATE and CMD_DELETE only serves that high level goal). Other feedback: * Is a new ExecMaterializeSlot() call needed for the EPQ slot? Again, doing your own EvalPlanQual() within ExecMerge() would perhaps have avoided this. * Do we really need UpdateStmt raw parser node pointers in structs that appear in execnodes.h? What's that all about? * Tests for transition table behavior (mixed INSERTs, UPDATEs, and DELETEs) within triggers.sql seems like a good idea. 
* Is this comment really accurate?: > + /* > + * If EvalPlanQual did not return a tuple, it means we > + * have seen a concurrent delete, or a concurrent update > + * where the row has moved to another partition. > + * > + * Set matched = false and loop back if there exists a NOT > + * MATCHED action. Otherwise, we have nothing to do for this > + * tuple and we can continue to the next tuple. > + */ Won't we loop back when a concurrent update occurs that makes the new candidate tuple not satisfy the join qual anymore? What does this have to do with partitioning? * Why does MergeJoin get referenced here?: > + * Also, since the internal MergeJoin node can hide the source and target > + * relations, we must explicitly make the respective relation as visible so > + * that columns can be referenced unqualified from these relations. That's a struct that has nothing to do with SQL MERGE in particular (no more than hash join does, for example). * This patch could use a pg_indent. * We already heard about this code from Tomas, who found it unnecessary: > + /* > + * SQL Standard says that WHEN AND conditions must not > + * write to the database, so check we haven't written > + * any WAL during the test. Very sensible that is, since > + * we can end up evaluating some tests multiple times if > + * we have concurrent activity and complex WHEN clauses. > + * > + * XXX If we had some clear form of functional labelling > + * we could use that, if we trusted it. > + */ > + if (startWAL < GetXactWALBytes()) > + ereport(ERROR, > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot write to database within WHEN AND condition"))); This needs to go. Apart from the fact that GetXactWALBytes() is buggy (it returns int64 for the unsigned type XLogRecPtr), the whole idea just seems unnecessary. I don't see why this is any different to using a volatile function in a regular UPDATE. 
* I still think we probably need to add something to ruleutils.c, so that MERGE Query structs can be deparsed -- see get_query_def(). Yeah, we've decided we're not going to support user-visible rules, but I continue to think that deparsing support is necessary on general principle, and for the benefit of extensions like Citus that use deparsing in a fairly broad way. At the very least, if we're not going to support deparsing, there needs to be a better reason than "we're not supporting user-visible rules". -- Peter Geoghegan
On Tue, Mar 6, 2018 at 9:55 AM, Peter Geoghegan <pg@bowt.ie> wrote:
Sure, ExecUpdate() and ExecDelete() *do* require a little special
handling for MERGE, but ExecMerge() should itself call EvalPlanQual()
for the join quals, rather than getting ExecUpdate()/ExecDelete() to
do it. All that ExecUpdate()/ExecDelete() need to tell ExecMerge() is
"you need to do your HeapTupleUpdated stuff". This not only clarifies
the new EPQ design -- it also lets you do that special case rti EPQ
stuff in the right place (BTW, I'm not a huge fan of that detail in
general, but haven't thought about it enough to say more). The new EPQ
code within ExecUpdate() and ExecDelete() is already identical, so why
not do it that way?
I especially like this organization because ExecOnConflictUpdate()
already calls ExecUpdate() without expecting it to do EPQ handling at
all, which is kind of similar to what I suggest -- in both cases, the
caller "takes care of all of the details of the scan". And, by
returning a HTSU_Result value to ExecMerge() from
ExecUpdate()/ExecDelete(), you could also do the new cardinality
violation stuff from within ExecMerge() in exactly one place -- no
need to invent new error_on_SelfUpdate arguments.
I would also think about organizing ExecMerge() handling so that
CMD_INSERT handling became WHEN NOT MATCHED handling. It's
unnecessarily confusing to have that included in the same switch
statement as CMD_UPDATE and CMD_DELETE, since that organization fails
to convey that once we reach WHEN NOT MATCHED, there is no going back.
Actually, I suppose the new EPQ organization described in the last few
paragraphs already implies that you'd do this CMD_INSERT stuff, since
the high level goal is breaking ExecMerge() into WHEN MATCHED and WHEN
NOT MATCHED sections (doing "commonality and variability analysis" for
CMD_UPDATE and CMD_DELETE only serves that high level goal).
Thanks for the feedback. I've greatly refactored the code based on your comments and I too like the resultant code. What we now have, essentially, is two functions:
ExecMergeMatched() - deals with matched rows. In case of a concurrent update/delete, it also runs EvalPlanQual and checks if the updated row still meets the join quals. If so, it will restart and recheck whether some other WHEN MATCHED action should be executed. If the target row is deleted or if the join quals fail, it simply returns and lets the next function handle it.
ExecMergeNotMatched() - deals with not-matched rows. Essentially it only needs to execute an INSERT if a WHEN NOT MATCHED action exists and the additional quals pass.
Per your suggestion, ExecUpdate() and ExecDelete() now only return enough information for ExecMergeMatched() to run EvalPlanQual or take any other action, as needed. So the changes to these functions are now minimal. Also, the EPQ handling for UPDATE/DELETE is now common, which is nice.
I've also added a bunch of comments, but it might still not be enough. Feel free to suggest improvements there.
Other feedback:
* Is a new ExecMaterializeSlot() call needed for the EPQ slot? Again,
doing your own EvalPlanQual() within ExecMerge() would perhaps have
avoided this.
Fixed.
* Do we really need UpdateStmt raw parser node pointers in structs
that appear in execnodes.h? What's that all about?
No, it was left-over from the earlier code. Removed.
* Tests for transition table behavior (mixed INSERTs, UPDATEs, and
DELETEs) within triggers.sql seems like a good idea.
Ok, I will add. But not done in this version.
* Is this comment really accurate?:
> + /*
> + * If EvalPlanQual did not return a tuple, it means we
> + * have seen a concurrent delete, or a concurrent update
> + * where the row has moved to another partition.
> + *
> + * Set matched = false and loop back if there exists a NOT
> + * MATCHED action. Otherwise, we have nothing to do for this
> + * tuple and we can continue to the next tuple.
> + */
Won't we loop back when a concurrent update occurs that makes the new
candidate tuple not satisfy the join qual anymore? What does this have
to do with partitioning?
Yeah, it was wrong. Fixed as part of larger reorganisation anyways.
* Why does MergeJoin get referenced here?:
> + * Also, since the internal MergeJoin node can hide the source and target
> + * relations, we must explicitly make the respective relation as visible so
> + * that columns can be referenced unqualified from these relations.
I meant to say underlying JOIN for MERGE. Fixed by simply saying Join.
* This patch could use a pg_indent.
Done.
* We already heard about this code from Tomas, who found it unnecessary:
> + /*
> + * SQL Standard says that WHEN AND conditions must not
> + * write to the database, so check we haven't written
> + * any WAL during the test. Very sensible that is, since
> + * we can end up evaluating some tests multiple times if
> + * we have concurrent activity and complex WHEN clauses.
> + *
> + * XXX If we had some clear form of functional labelling
> + * we could use that, if we trusted it.
> + */
> + if (startWAL < GetXactWALBytes())
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot write to database within WHEN AND condition")));
This needs to go. Apart from the fact that GetXactWALBytes() is buggy
(it returns int64 for the unsigned type XLogRecPtr), the whole idea
just seems unnecessary. I don't see why this is any different to using
a volatile function in a regular UPDATE.
I removed this code since it was wrong. We might want to add some basic checks for the existence of volatile functions in the WHEN or SET clauses. But I agree, it's no different than regular UPDATEs. So maybe it's not a big deal.
* I still think we probably need to add something to ruleutils.c, so
that MERGE Query structs can be deparsed -- see get_query_def().
Ok. I will look at it. Not done in this version though.
Rebased on the current master too.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Thu, Mar 8, 2018 at 7:54 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> > + /*
>> > + * SQL Standard says that WHEN AND conditions must not
>> > + * write to the database, so check we haven't written
>> > + * any WAL during the test. Very sensible that is, since
>> > + * we can end up evaluating some tests multiple times if
>> > + * we have concurrent activity and complex WHEN clauses.
>> > + *
>> > + * XXX If we had some clear form of functional labelling
>> > + * we could use that, if we trusted it.
>> > + */
>> > + if (startWAL < GetXactWALBytes())
>> > + ereport(ERROR,
>> > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> > + errmsg("cannot write to database within WHEN AND condition")));
>>
>> This needs to go. Apart from the fact that GetXactWALBytes() is buggy
>> (it returns int64 for the unsigned type XLogRecPtr), the whole idea
>> just seems unnecessary. I don't see why this is any different to using
>> a volatile function in a regular UPDATE.
>
> I removed this code since it was wrong. We might want to add some basic
> checks for existence of volatile functions in the WHEN or SET clauses. But I
> agree, it's no different than regular UPDATEs. So may be not a big deal.

I just caught up on this thread. I'm definitely glad to see that code go because, wow, that is all kinds of wrong. I don't see a real need to add any kind of replacement check, either. Prohibiting volatile functions here doesn't seem likely to accomplish anything useful. It seems like the most we'd want to do is mention this in the documentation somehow, and I'm not even sure we really need to do that much.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Mar 8, 2018 at 6:52 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> I removed this code since it was wrong. We might want to add some basic >> checks for existence of volatile functions in the WHEN or SET clauses. But I >> agree, it's no different than regular UPDATEs. So may be not a big deal. > > I just caught up on this thread. I'm definitely glad to see that code > go because, wow, that is all kinds of wrong. I don't see a real need > to add any kind of replacement check, either. Prohibiting volatile > functions here doesn't seem likely to accomplish anything useful. It > seems like the most we'd want to do is mention this the documentation > somehow, and I'm not even sure we really need to do that much. Thanks in large part to Pavan's excellent work, the situation in nodeModifyTable.c is much clearer than it was a few weeks ago. It's now obvious that MERGE is very similar to UPDATE ... FROM, which doesn't have any restrictions on volatile functions. I don't see any sense in prohibiting volatile functions in either case, because it should be obvious to users that that's just asking for trouble. I can believe that someone would make that mistake, just about, but they'd have to be writing their DML statement on auto-pilot. -- Peter Geoghegan
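For what it's worth, the kind of statement that would be "asking for trouble" looks something like this (hypothetical table and column names). Under concurrency a WHEN ... AND qual can be evaluated more than once for the same candidate row, so a volatile function there may fire repeatedly and give a different answer each time:

```sql
MERGE INTO target t
USING source s
    ON t.id = s.id
WHEN MATCHED AND random() < 0.5 THEN   -- volatile qual: may be
    UPDATE SET val = s.val             -- re-evaluated on EPQ recheck
WHEN MATCHED THEN
    DELETE;
```

Exactly the same hazard exists for a volatile qual in a plain UPDATE ... FROM, which is why a MERGE-only prohibition doesn't buy anything.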
On 03/08/2018 11:46 PM, Peter Geoghegan wrote: > On Thu, Mar 8, 2018 at 6:52 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> I removed this code since it was wrong. We might want to add some basic >>> checks for existence of volatile functions in the WHEN or SET clauses. But I >>> agree, it's no different than regular UPDATEs. So may be not a big deal. >> >> I just caught up on this thread. I'm definitely glad to see that code >> go because, wow, that is all kinds of wrong. I don't see a real need >> to add any kind of replacement check, either. Prohibiting volatile >> functions here doesn't seem likely to accomplish anything useful. It >> seems like the most we'd want to do is mention this the documentation >> somehow, and I'm not even sure we really need to do that much. > > Thanks in large part to Pavan's excellent work, the situation in > nodeModifyTable.c is much clearer than it was a few weeks ago. It's > now obvious that MERGE is very similar to UPDATE ... FROM, which > doesn't have any restrictions on volatile functions. > Yeah, I agree Pavan did an excellent work on moving this patch forward. > I don't see any sense in prohibiting volatile functions in either > case, because it should be obvious to users that that's just asking > for trouble. I can believe that someone would make that mistake, just > about, but they'd have to be writing their DML statement on > auto-pilot. > The reason why the patch tried to prevent that is because the SQL standard says this (p. 1176 of SQL 2016): 15) The <search condition> immediately contained in a <merge statement>, the <search condition> immediately contained in a <merge when matched clause>, and the <search condition> immediately contained in a <merge when not matched clause> shall not generally contain a <routine invocation> whose subject routine is an SQL-invoked routine that possibly modifies SQL-data. I'm not quite sure what is required to be compliant with this rule. 
For example, what does "immediately contained" or "shall not generally
contain" mean? Does that mean users are expected not to do that because
it's obviously silly, or do we need to implement some protection?

That being said, the volatility check seems reasonable to me (and I
would not expect it to be a huge amount of code).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Mar 8, 2018 at 3:29 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> The reason why the patch tried to prevent that is because the SQL
> standard says this (p. 1176 of SQL 2016):
>
> 15) The <search condition> immediately contained in a <merge statement>,
> the <search condition> immediately contained in a <merge when matched
> clause>, and the <search condition> immediately contained in a <merge
> when not matched clause> shall not generally contain a <routine
> invocation> whose subject routine is an SQL-invoked routine that
> possibly modifies SQL-data.
>
> I'm not quite sure what is required to be compliant with this rule. For
> example what does "immediately contained" or "shall not generally
> contain" mean? Does that mean user are expected not to do that because
> it's obviously silly, or do we need to implement some protection?

My impression is that this means that you shouldn't treat this as a
particularly likely case, or try to facilitate it. The <search
condition> blurb is about intent, rather than actual restrictions
implementations must enforce, AFAICT. Though the UPDATE precedent is
what really matters here -- not the SQL standard. The SQL standard
doesn't say anything to make me doubt that that's the right precedent
to want to follow.

Close by, under "General Rules", rule #4 is: "The extent to which an
SQL-implementation may disallow independent changes that are not
significant is implementation-defined". This same sentence appears in
quite a few different places, including in the description of UPDATE.
ISTM that the SQL standard actually establishes that volatile qual
weirdness (and what to do about it) is a general
INSERT/UPDATE/DELETE/MERGE issue.

> That being said the volatility check seems reasonable to me (and i would
> not expect it to be a huge amount of code).

If we're going to do this, we'd have to do the same with UPDATE, IMV.
And, well, we're not gonna do that.

--
Peter Geoghegan
On Thu, Mar 8, 2018 at 4:54 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Thanks for the feedback. I've greatly refactored the code based on your
> comments and I too like the resultant code. What we now have
> essentially is two functions:

I really like how ExecMerge() now mostly just consists of this code
(plus a lot of explanatory comments):

> + if (!matched ||
> + !ExecMergeMatched(mtstate, estate, slot, junkfilter, tupleid))
> + ExecMergeNotMatched(mtstate, estate, slot);

This is far easier to follow. In general, I'm now a lot less worried
about what was previously the #1 concern -- READ COMMITTED conflict
handling (EvalPlanQual() stuff). My #1 concern has become RLS, and
perhaps only because I haven't studied it in enough detail. It seems
like getting this patch committable is now a matter of verifying
details. Of course, there are a lot of details to verify. :-)

> I've also added a bunch of comments, but it might still not be enough. Feel
> free to suggest improvements there.

I think that the ExecMerge() comments are very good, although they
could perhaps use some light copy-editing. Also, I noticed some
specific issues with comments in ExecMerge() and friends, as well as a
few code issues. Those are:

* MERGE code needs to do cardinality violations like ON CONFLICT.
Specifically, you need the TransactionIdIsCurrentTransactionId()
check, and a second error fallback "attempted to lock invisible tuple"
error (though this should say "attempted to update or delete invisible
tuple" instead). The extra check is redundant, but better safe than
sorry. I like defensive errors like this because it makes the code
easier to read -- I don't get distracted by their absence.

* The last item on cardinality violations also implies this new merge
comment should be changed:

> + /*
> + * The target tuple was already updated or deleted by the
> + * current command, or by a later command in the current
> + * transaction.
> + */

It should be changed because it can't have been a later command in
this xact. We handled that case when we called ExecUpdate or
ExecDelete() (plus we now have the extra defensive "can't happen"
elog() error).

* This related comment shouldn't even talk about update/historic
behavior, now that it isn't in ExecUpdate() -- just MERGE:

> + /*
> + * The former case is possible in a join UPDATE where
> + * multiple tuples join to the same target tuple. This is
> + * pretty questionable, but Postgres has always allowed
> + * it: we just execute the first update action and ignore
> + * additional update attempts. SQLStandard disallows this
> + * for MERGE, so allow the caller to select how to handle
> + * this.
> + */

* This wording should be tweaked:

> + * If the current tuple is that last tuple in the update
> + * chain, then we know that the tuple was concurrently
> + * deleted. We just switch to NOT MATCHED case and let the
> + * caller retry the NOT MATCHED actions.

This should say something like "caller can move on to NOT MATCHED
actions". They can never go back from there, of course, which I want
us to be clear on.

* This check of whether whenqual is set is unnecessary, and doesn't
match MATCHED code, or the accompanying comments:

> + /*
> + * Test condition, if any
> + *
> + * In the absence of a condition we perform the action unconditionally
> + * (no need to check separately since ExecQual() will return true if
> + * there are no conditions to evaluate).
> + */
> + if (action->whenqual && !ExecQual(action->whenqual, econtext))
> + continue;

* I think that this ExecMerge() assertion is not helpful, since you go
on to dereference the pointer in all cases anyway:

> + Assert(junkfilter != NULL);

* Executor README changes, particularly about projecting twice, really
should be ExecMerge() comments. Maybe just get rid of these?

* Why are we using CMD_NOTHING at all?
That constant has something to do with user-visible rules, and there
is no need to reuse it (make a new CMD_* if you have to). More
importantly, why do we even have the corresponding DO NOTHING stuff in
the grammar? Why would users want that?

For quite a while, I thought that patch must have been support for ON
CONFLICT DO NOTHING within MERGE INSERTs (the docs don't even say what
DO NOTHING is). But that's not what it is at all. It seems like this
is a way of having an action that terminates early, so you don't have
to go on to evaluate other action quals. I can't see much point,
though. More importantly, supporting this necessitates code like the
following RLS code within ExecMergeMatched():

> + if ((action->commandType == CMD_UPDATE ||
> + action->commandType == CMD_DELETE) &&
> + resultRelInfo->ri_WithCheckOptions)
> + {
> + ExecWithCheckOptions(action->commandType == CMD_UPDATE ?
> + WCO_RLS_MERGE_UPDATE_CHECK : WCO_RLS_MERGE_DELETE_CHECK,
> + resultRelInfo,
> + mtstate->mt_merge_existing[ud_target],
> + mtstate->ps.state);
> + }

I wonder if the CMD_NOTHING case makes this code prone to information
leak attacks. Even if that isn't the case, ISTM that several code
blocks within ExecMergeMatched() could do without the repeated
"action->commandType" tests. Have I missed something that makes
supporting DO NOTHING seem more compelling?

* The docs say that we use a LEFT OUTER JOIN for MERGE, while the
implementation uses a RIGHT OUTER JOIN because it's convenient for
parse analysis finding the target. This difference is a bit confusing.
Why say that we use any kind of join at all, though?

The SQL standard doesn't say outer join at all. Neither do the MERGE
docs of the other database systems that I checked. Taking a position
on this seems to add nothing at all. Let's make the MERGE docs refer
to it as a join, without further elaboration.
* I don't find this comment from analyze.c very helpful:

> + * We don't have a separate plan for each action, so the when
> + * condition must be executed as a per-row check, making it very
> + * similar to a CHECK constraint and so we adopt the same semantics
> + * for that.

Why explain it that way at all? There are two rels, unlike a check
constraint.

* The first time I read this comment, it made me laugh:

> + /*
> + * First rule of MERGE club is we don't talk about rules
> + */

The joke has become significantly less funny since then, though. I'd
just say that MERGE doesn't support rules, as it's unclear how that
could work.

* This comment seems redundant, since pstate is always allocated with palloc0():

+ * Note: we assume that the pstate's p_rtable, p_joinlist, and p_namespace
+ * lists were initialized to NIL when the pstate was created.

make_parsestate() knows about this. This is widespread, normal
practice during parse analysis.

* Is this actually needed at all?:

> + /* In MERGE when and condition, no system column is allowed */
> + if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
> + attnum < InvalidAttrNumber &&
> + !(attnum == TableOidAttributeNumber || attnum == ObjectIdAttributeNumber))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> + errmsg("system column \"%s\" reference in WHEN AND condition is invalid",
> + colname),
> + parser_errposition(pstate, location)));

We're talking about the scantuple here. It's not like excluded.*.
* Are these checks really needed?:

> +void
> +rewriteTargetListMerge(Query *parsetree, Relation target_relation)
> +{
> + Var *var = NULL;
> + const char *attrname;
> + TargetEntry *tle;
> +
> + if (target_relation->rd_rel->relkind == RELKIND_RELATION ||
> + target_relation->rd_rel->relkind == RELKIND_MATVIEW ||
> + target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)

* I think that you should remove this debug include:

> index 3a02307bd9..ed4e857477 100644
> --- a/src/backend/parser/parse_clause.c
> +++ b/src/backend/parser/parse_clause.c
> @@ -31,6 +31,7 @@
> #include "commands/defrem.h"
> #include "nodes/makefuncs.h"
> #include "nodes/nodeFuncs.h"
> +#include "nodes/print.h"
> #include "optimizer/tlist.h"
> #include "optimizer/var.h"
> #include "parser/analyze.h"

It may be worth considering custom dprintf code within your .gdbinit
to do stuff like this automatically, without code changes that may end
up in the patch you post to the list. Here is one that I sometimes use
for tuplesort.c:

define print_tuplesort_counts
dprintf tuplesort_sort_memtuples, "memtupsize: %d, memtupcount: %d\n", state->memtupsize, state->memtupcount
end

This is way easier than adding custom printf style debug code, since
it doesn't require that you rebuild. It would be safe to add a similar
dprintf that called pprint() or similar when some function is reached.

* find_mergetarget_parents() and find_mergetarget_for_rel() could both
use at least a one-liner header comment.

* This analyze.c comment, which is in transformMergeStmt(), seems
pretty questionable:

> + /*
> + * Simplify the MERGE query as much as possible
> + *
> + * These seem like things that could go into Optimizer, but they are
> + * semantic simplications rather than optimizations, per se.
> + *
> + * If there are no INSERT actions we won't be using the non-matching
> + * candidate rows for anything, so no need for an outer join. We do still
> + * need an inner join for UPDATE and DELETE actions.
This is talking about an ad-hoc form of join strength reduction. Yeah,
it's a semantic simplification, but that's generally true of join
strength reduction, which mostly exists because of bad ORMs that
sprinkle OUTER on top of queries indifferently. MERGE is hardly an
ideal candidate for join strength reduction, and I would just lose
this code entirely for Postgres v11.

* This sounds really brittle, and so doesn't make sense even as an aspiration:

> + * XXX if we were really keen we could look through the actionList and
> + * pull out common conditions, if there were no terminal clauses and put
> + * them into the main query as an early row filter but that seems like an
> + * atypical case and so checking for it would be likely to just be wasted
> + * effort.
> + */

Again, I suggest removing everything on join strength reduction.

* You should say *when* this happens (what later point):

> + *
> + * Track the RTE index of the target table used in the join query. This is
> + * later used to add required junk attributes to the targetlist.
> + */

* This seems unnecessary, as we don't say anything like it for ON
CONFLICT DO UPDATE:

> + * As top-level statements INSERT, UPDATE and DELETE have a Query, whereas
> + * with MERGE the individual actions do not require separate planning,
> + * only different handling in the executor. See nodeModifyTable handling
> + * of commandType CMD_MERGE.

* What does this mean?:

> + * A sub-query can include the Target, but otherwise the sub-query cannot
> + * reference the outermost Target table at all.

* I don't see the point in this:

> + * XXX Perhaps we require Parallel Safety since that is a superset of
> + * the restriction and enforcing that makes it easier to consider
> + * running MERGE plans in parallel in future.

* This references the now-removed executor WAL check thing, so you'll
need to remove it too:

> + * SQL Standard says we should not allow anything that possibly
> + * modifies SQL-data.
We enforce that with an executor check that we
> + * have not written any WAL.

* This should be reworded:

> + * Note that we don't add this to the MERGE Query's quals because
> + * that's not the logic MERGE uses.
> + */
> + action->qual = transformWhereClause(pstate, action->condition,
> + EXPR_KIND_MERGE_WHEN_AND, "WHEN");

Perhaps say "WHEN ... AND quals are evaluated separately from the
MERGE statement's join quals" instead. Or just lose it altogether.

* This comment seems inaccurate:

> + /*
> + * Process INSERT ... VALUES with a single VALUES
> + * sublist. We treat this case separately for
> + * efficiency. The sublist is just computed directly
> + * as the Query's targetlist, with no VALUES RTE. So
> + * it works just like a SELECT without any FROM.
> + */

Wouldn't it be more accurate to say that this is totally different to
what transformInsertStmt() does for a SELECT without any FROM? Also,
do you think that that's good?

>> * Tests for transition table behavior (mixed INSERTs, UPDATEs, and
>> DELETEs) within triggers.sql seems like a good idea.
>
> Ok, I will add. But not done in this version.

Note that it's implied in at least one place that we don't support
transition tables at all:

> + /*
> + * XXX if we support transition tables this would need to move
> + * earlier before ExecSetupTransitionCaptureState()
> + */
> + switch (action->commandType)
> + {

You'll want to get to this as part of that transition table effort.

Any plan to fix this/support identity columns? I see that you don't
support them here:

> + /*
> + * Handle INSERT much like in transformInsertStmt
> + *
> + * XXX currently ignore stmt->override, if present
> + */

I think that this is a blocker, unfortunately.

>> * I still think we probably need to add something to ruleutils.c, so
>> that MERGE Query structs can be deparsed -- see get_query_def().
>
> Ok. I will look at it. Not done in this version though.

I also wonder what it would take to support referencing CTEs.
Other implementations do have this. Why can't we?

I do accept that RETURNING support and user-defined rule support are
not desirable, because it isn't at all clear what that means, and
there is no precedent for those from other database systems. CTEs and
identity columns are not in that same category, though.

Phew! Thanks for your patience and perseverance. I do have more
feedback on the docs lined up, but that isn't so important right now.

--
Peter Geoghegan
On 9 March 2018 at 08:29, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Mar 8, 2018 at 4:54 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Thanks for the feedback. I've greatly refactored the code based on your
> comments and I too like the resultant code. What we now have
> essentially is two functions:
I really like how ExecMerge() now mostly just consists of this code
(plus a lot of explanatory comments):
Thanks!
My #1 concern has become RLS, and
perhaps only because I haven't studied it in enough detail.
Sure. I've done what I thought is the right thing to do, but please check. Stephen also wanted to review the RLS-related code; I don't know if he had a chance to do so.
It seems
like getting this patch committable is now a matter of verifying
details. Of course, there are a lot of details to verify. :-)
Let's work through them together and hopefully we shall get there.
* MERGE code needs to do cardinality violations like ON CONFLICT.
Specifically, you need the TransactionIdIsCurrentTransactionId()
check, and a second error fallback "attempted to lock invisible tuple"
error (though this should say "attempted to update or delete invisible
tuple" instead). The extra check is redundant, but better safe than
sorry. I like defensive errors like this because it makes the code
easier to read -- I don't get distracted by their absence.
Ok. Fixed. I also added the missing case for HeapTupleInvisible, though I don't see how we can reach it. Since ExecUpdate()/ExecDelete() always wait for any concurrent transaction to finish, I don't see how we can reach HeapTupleBeingUpdated either. So I skipped that test (or maybe we should just add an error for completeness).
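To make the cardinality rule concrete, here is a toy sketch in Python (not the patch's C code; all names are illustrative) of what the TransactionIdIsCurrentTransactionId() check guards against: a matched target row whose last writer is our own transaction during this very MERGE means a second source row already affected it.

```python
# Toy model of the MERGE cardinality check: if the matched target tuple
# was already modified by the current MERGE command (its xmin equals our
# own transaction id), a second source row is trying to affect the same
# row, which is the SQL-standard cardinality violation.

class CardinalityViolation(Exception):
    pass

def check_matched_tuple(tuple_xmin: int, current_xid: int) -> bool:
    """Mimic TransactionIdIsCurrentTransactionId(): reject a second
    modification of the same target row by the same MERGE command."""
    if tuple_xmin == current_xid:
        raise CardinalityViolation(
            "MERGE command cannot affect row a second time")
    return True
```

In the real executor this sits next to the defensive "attempted to update or delete invisible tuple" error discussed above; here it is just the shape of the test.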
* The last item on cardinality violations also implies this new merge
comment should be changed:
> + /*
> + * The target tuple was already updated or deleted by the
> + * current command, or by a later command in the current
> + * transaction.
> + */
It should be changed because it can't have been a later command in
this xact. We handled that case when we called ExecUpdate or
ExecDelete() (plus we now have the extra defensive "can't happen"
elog() error).
Removed.
* This related comment shouldn't even talk about update/historic
behavior, now that it isn't in ExecUpdate() -- just MERGE:
> + /*
> + * The former case is possible in a join UPDATE where
> + * multiple tuples join to the same target tuple. This is
> + * pretty questionable, but Postgres has always allowed
> + * it: we just execute the first update action and ignore
> + * additional update attempts. SQLStandard disallows this
> + * for MERGE, so allow the caller to select how to handle
> + * this.
> + */
Removed.
* This wording should be tweaked:
> + * If the current tuple is that last tuple in the update
> + * chain, then we know that the tuple was concurrently
> + * deleted. We just switch to NOT MATCHED case and let the
> + * caller retry the NOT MATCHED actions.
This should say something like "caller can move on to NOT MATCHED
actions". They can never go back from there, of course, which I want
us to be clear on.
Fixed, but please check if the new wording is OK.
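In case it helps review, the control flow now matches the dispatch code you quoted earlier; a toy Python sketch (illustrative names only, not the C code) of the one-way hand-off from MATCHED to NOT MATCHED:

```python
# Toy model of the one-way hand-off: when the matched tuple turns out to
# be concurrently deleted, the MATCHED path reports failure and the
# caller moves on to the NOT MATCHED actions; it never goes back.

def exec_merge(matched: bool, tuple_still_exists: bool) -> str:
    def exec_merge_matched() -> bool:
        # Walking the update chain found the tuple was concurrently
        # deleted: return False so the caller runs NOT MATCHED actions.
        return tuple_still_exists

    if not matched or not exec_merge_matched():
        return "run NOT MATCHED actions"
    return "MATCHED action executed"
```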
* This check of whether whenqual is set is unnecessary, and doesn't
match MATCHED code, or the accompanying comments:
> + /*
> + * Test condition, if any
> + *
> + * In the absence of a condition we perform the action unconditionally
> + * (no need to check separately since ExecQual() will return true if
> + * there are no conditions to evaluate).
> + */
> + if (action->whenqual && !ExecQual(action->whenqual, econtext))
> + continue;
Fixed.
* I think that this ExecMerge() assertion is not helpful, since you go
on to dereference the pointer in all cases anyway:
> + Assert(junkfilter != NULL);
Removed.
* Executor README changes, particularly about projecting twice, really
should be ExecMerge() comments. Maybe just get rid of these?
Fixed, with the addition of the TABLEOID column that we now fetch along with the CTID column.
* Why are we using CMD_NOTHING at all? That constant has something to
do with user-visible rules, and there is no need to reuse it (make a
new CMD_* if you have to). More importantly, why do we even have the
corresponding DO NOTHING stuff in the grammar? Why would users want
that?
For quite a while, I thought that patch must have been support for ON
CONFLICT DO NOTHING within MERGE INSERTs (the docs don't even say what
DO NOTHING is). But that's not what it is at all. It seems like this
is a way of having an action that terminates early, so you don't have
to go on to evaluate other action quals.
Hmm. I was under the impression that the SQL standard supports DO NOTHING. But now that I read the standard, I can't find any mention of DO NOTHING. It might still be useful for users to skip all (matched/not-matched/both) actions for some specific conditions. For example,
WHEN MATCHED AND balance > min_balance THEN
DO NOTHING
WHEN MATCHED AND custcat = 'priority' THEN
DO NOTHING
....
But I agree that if we remove DO NOTHING, we can simplify the code. Or we can probably re-arrange the code to check DO NOTHING early and exit the loop. That simplifies it a lot, as in the attached version. In passing, I also realised that the target tuple may be needed for DO NOTHING actions too, since they may have quals attached to them. Also, the tuple should really be fetched just once, not once per action. Those changes are done.
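For what it's worth, the rearranged loop behaves like this toy Python sketch (the action kinds and quals below are illustrative, not the patch's C structures): the first action whose qual passes wins, and DO NOTHING simply ends the scan without touching the row.

```python
# Toy model of the WHEN [MATCHED] action loop: walk the actions in
# order, evaluate each qual against the joined row, and execute the
# first action whose qual passes. 'nothing' ends the scan for this row.

def first_matched_action(actions, row):
    """actions: list of (qual, kind); qual is a predicate or None."""
    for qual, kind in actions:
        if qual is not None and not qual(row):
            continue            # qual failed; try the next action
        return kind             # includes 'nothing', which ends the loop
    return None                 # no action fired for this row

actions = [
    (lambda r: r["balance"] > r["min_balance"], "nothing"),
    (lambda r: r["custcat"] == "priority", "nothing"),
    (None, "update"),           # unconditional fallback action
]
```

So whether DO NOTHING stays or goes, the loop itself no longer needs any special-casing beyond "stop at the first action that fires".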
* The docs say that we use a LEFT OUTER JOIN for MERGE, while the
implementation uses a RIGHT OUTER JOIN because it's convenient for
parse analysis finding the target. This difference is a bit confusing.
Why say that we use any kind of join at all, though?
Hmm, right. In fact, we revert to INNER JOIN when there are no NOT MATCHED actions, so probably we should just not mention any kind of joins, definitely not in the user documentation.
The SQL standard doesn't say outer join at all. Neither do the MERGE
docs of the other database systems that I checked. Taking a position
on this seems to add nothing at all. Let's make the MERGE docs refer
to it as a join, without further elaboration.
* I don't find this comment from analyze.c very helpful:
> + * We don't have a separate plan for each action, so the when
> + * condition must be executed as a per-row check, making it very
> + * similar to a CHECK constraint and so we adopt the same semantics
> + * for that.
Why explain it that way at all? There are two rels, unlike a check constraint.
Agreed. I think it was a left-over from the time when sub-selects were not allowed and we were using EXPR_KIND_CHECK_CONSTRAINT to run those checks. Now we have a separate expression kind and we do support sub-selects.
* The first time I read this comment, it made me laugh:
> + /*
> + * First rule of MERGE club is we don't talk about rules
> + */
The joke has become significantly less funny since then, though. I'd
just say that MERGE doesn't support rules, as it's unclear how that
could work.
Changed that way.
* This comment seems redundant, since pstate is always allocated with palloc0():
+ * Note: we assume that the pstate's p_rtable, p_joinlist, and p_namespace
+ * lists were initialized to NIL when the pstate was created.
make_parsestate() knows about this. This is widespread, normal
practice during parse analysis.
Ok. Removed.
* Is this actually needed at all?:
> + /* In MERGE when and condition, no system column is allowed */
> + if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
> + attnum < InvalidAttrNumber &&
> + !(attnum == TableOidAttributeNumber || attnum == ObjectIdAttributeNumber))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> + errmsg("system column \"%s\" reference in WHEN AND condition is invalid",
> + colname),
> + parser_errposition(pstate, location)));
We're talking about the scantuple here. It's not like excluded.*.
We might be able to support them, but do we really care?
* Are these checks really needed?:
> +void
> +rewriteTargetListMerge(Query *parsetree, Relation target_relation)
> +{
> + Var *var = NULL;
> + const char *attrname;
> + TargetEntry *tle;
> +
> + if (target_relation->rd_rel->relkind == RELKIND_RELATION ||
> + target_relation->rd_rel->relkind == RELKIND_MATVIEW ||
> + target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
You're right. Those checks are done at the beginning and we should probably just turn this into an Assert, just like ExecMerge(). Changed that way.
* I think that you should remove this debug include:
> index 3a02307bd9..ed4e857477 100644
> --- a/src/backend/parser/parse_clause.c
> +++ b/src/backend/parser/parse_clause.c
> @@ -31,6 +31,7 @@
> #include "commands/defrem.h"
> #include "nodes/makefuncs.h"
> #include "nodes/nodeFuncs.h"
> +#include "nodes/print.h"
> #include "optimizer/tlist.h"
> #include "optimizer/var.h"
> #include "parser/analyze.h"
Done.
It may be worth considering custom dprintf code within your .gdbinit
to do stuff like this automatically, without code changes that may end
up in the patch you post to the list. Here is one that I sometimes use
for tuplesort.c:
define print_tuplesort_counts
dprintf tuplesort_sort_memtuples, "memtupsize: %d, memtupcount:
%d\n", state->memtupsize, state->memtupcount
end
This is way easier than adding custom printf style debug code, since
it doesn't require that you rebuild. It would be safe to add a similar
dprintf that called pprint() or similar when some function is reached.
Ok. Thanks for the hint.
* find_mergetarget_parents() and find_mergetarget_for_rel() could both
use at least a one-liner header comment.
Done. It was indeed confusing, so comments should help.
* This analyze.c comment, which is in transformMergeStmt(), seems
pretty questionable:
> + /*
> + * Simplify the MERGE query as much as possible
> + *
> + * These seem like things that could go into Optimizer, but they are
> + * semantic simplications rather than optimizations, per se.
> + *
> + * If there are no INSERT actions we won't be using the non-matching
> + * candidate rows for anything, so no need for an outer join. We do still
> + * need an inner join for UPDATE and DELETE actions.
This is talking about an ad-hoc form of join strength reduction. Yeah,
it's a semantic simplification, but that's generally true of join
strength reduction, which mostly exists because of bad ORMs that
sprinkle OUTER on top of queries indifferently. MERGE is hardly an
ideal candidate for join strength reduction, and I would just lose
this code entirely for Postgres v11.
I am not sure whether changing a RIGHT OUTER JOIN to an INNER JOIN opens up additional query optimisation possibilities. Apart from existing optimisations, I am also thinking about some of the work that David and Amit are doing for partition pruning, and I wonder if the choice of join might have a non-trivial effect. Having said that, I am yet to explore those things. But when we definitely know that a different kind of JOIN is all we need (because there are no NOT MATCHED actions), why not use that? Or do you see a problem there?
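To illustrate why the reduction is safe, a toy Python sketch (plain Python, not planner code): source rows without a match only matter when there are NOT MATCHED (i.e. INSERT) actions; otherwise an inner join that drops them loses nothing.

```python
# Toy model of the join-kind choice: with the outer join, unmatched
# source rows survive as NOT MATCHED candidates; with an inner join they
# are simply dropped, which is all MERGE needs when there are no INSERT
# (NOT MATCHED) actions.

def merge_join(source, target, need_not_matched):
    """source/target: dicts keyed by the join key. Returns
    (matched_keys, not_matched_keys); the latter is always empty for
    the inner-join case."""
    matched = [k for k in source if k in target]
    if need_not_matched:                    # RIGHT OUTER JOIN behaviour
        not_matched = [k for k in source if k not in target]
    else:                                   # INNER JOIN is sufficient
        not_matched = []
    return matched, not_matched
```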
* This sounds really brittle, and so doesn't make sense even as an aspiration:
> + * XXX if we were really keen we could look through the actionList and
> + * pull out common conditions, if there were no terminal clauses and put
> + * them into the main query as an early row filter but that seems like an
> + * atypical case and so checking for it would be likely to just be wasted
> + * effort.
> + */
We might actually want to do this, if not for v11 then later. I am not sure if we should keep it in the final commit though.
* You should say *when* this happens (what later point):
> + *
> + * Track the RTE index of the target table used in the join query. This is
> + * later used to add required junk attributes to the targetlist.
> + */
Done.
* This seems unnecessary, as we don't say anything like it for ON
CONFLICT DO UPDATE:
> + * As top-level statements INSERT, UPDATE and DELETE have a Query, whereas
> + * with MERGE the individual actions do not require separate planning,
> + * only different handling in the executor. See nodeModifyTable handling
> + * of commandType CMD_MERGE.
I don't see anything wrong with keeping it. Maybe it needs rewording?
* What does this mean?:
> + * A sub-query can include the Target, but otherwise the sub-query cannot
> + * reference the outermost Target table at all.
Don't know :-) I think what Simon probably meant to say is that you can't reference the Target table in the sub-query used in the source relation. But you can add the Target as a separate RTE.
* I don't see the point in this:
> + * XXX Perhaps we require Parallel Safety since that is a superset of
> + * the restriction and enforcing that makes it easier to consider
> + * running MERGE plans in parallel in future.
Yeah, removed.
* This references the now-removed executor WAL check thing, so you'll
need to remove it too:
> + * SQL Standard says we should not allow anything that possibly
> + * modifies SQL-data. We enforce that with an executor check that we
> + * have not written any WAL.
Yes, removed too.
* This should be reworded:
> + * Note that we don't add this to the MERGE Query's quals because
> + * that's not the logic MERGE uses.
> + */
> + action->qual = transformWhereClause(pstate, action->condition,
> + EXPR_KIND_MERGE_WHEN_AND, "WHEN");
Perhaps say "WHEN ... AND quals are evaluated separately from the
MERGE statement's join quals" instead. Or just lose it altogether.
Reworded slightly.
* This comment seems inaccurate:
> + /*
> + * Process INSERT ... VALUES with a single VALUES
> + * sublist. We treat this case separately for
> + * efficiency. The sublist is just computed directly
> + * as the Query's targetlist, with no VALUES RTE. So
> + * it works just like a SELECT without any FROM.
> + */
Wouldn't it be more accurate to say that this is totally different to
what transformInsertStmt() does for a SELECT without any FROM? Also,
do you think that that's good?
Actually it's pretty much the same as what transformInsertStmt() does. Even the comments are copied verbatim from there. I don't see anything wrong with the comment or the code. Can you please explain what problem you see?
>> * Tests for transition table behavior (mixed INSERTs, UPDATEs, and
>> DELETEs) within triggers.sql seems like a good idea.
>
> Ok, I will add. But not done in this version.
Note that it's implied in at least one place that we don't support
transition tables at all:
> + /*
> + * XXX if we support transition tables this would need to move
> + * earlier before ExecSetupTransitionCaptureState()
> + */
> + switch (action->commandType)
> + {
You'll want to get to this as part of that transition table effort.
I actually didn't even know what transition tables are until today. But today I studied them and the new version now supports transition tables with MERGE. We might consider it WIP given the amount of time I've spent coding that, though I am fairly happy with the result so far. The comment above turned out to be bogus.
I decided to create new tuplestores and mark them as new_update, old_update, new_insert and old_delete. This is necessary because MERGE can run all three kinds of commands i.e. UPDATE, DELETE and INSERT, and we would like to track their transition tables separately.
(Hmm... I just noticed that INSERT ON CONFLICT does not do this and is still able to track transition tables for INSERT and UPDATE correctly. So maybe what I did wasn't necessary after all, though it's also likely that we can get IOC to use this new mechanism.)
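The routing itself is straightforward; a toy Python sketch of the four-way tuplestore split (the store names mirror what I described above, but the code is illustrative only, not the patch's C code):

```python
# Toy model of per-command transition capture for MERGE: each row-level
# change is routed to a separate store, so AFTER ... REFERENCING triggers
# can see the INSERTed, UPDATEd and DELETEd rows independently.
from collections import defaultdict

def capture(stores, command, old_row=None, new_row=None):
    if command == "update":
        stores["old_update"].append(old_row)
        stores["new_update"].append(new_row)
    elif command == "insert":
        stores["new_insert"].append(new_row)
    elif command == "delete":
        stores["old_delete"].append(old_row)

stores = defaultdict(list)
capture(stores, "update", old_row={"id": 1, "v": 0}, new_row={"id": 1, "v": 9})
capture(stores, "insert", new_row={"id": 2, "v": 5})
capture(stores, "delete", old_row={"id": 3, "v": 7})
```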
Any plan to fix this/support identity columns? I see that you don't
support them here:
> + /*
> + * Handle INSERT much like in transformInsertStmt
> + *
> + * XXX currently ignore stmt->override, if present
> + */
I have already added support for OVERRIDING. The comment needed adjustments which I have done now.
I think that this is a blocker, unfortunately.
You mean OVERRIDING or ruleutils?
>> * I still think we probably need to add something to ruleutils.c, so
>> that MERGE Query structs can be deparsed -- see get_query_def().
>
> Ok. I will look at it. Not done in this version though.
I also wonder what it would take to support referencing CTEs. Other
implementations do have this. Why can't we?
Hmm. I will look at them. But TBH I would like to postpone them to v12 if they turn out to be tricky. We have already covered a very large ground in the last few weeks, but we're approaching the feature freeze date.
Phew! Thanks for your patience and perseverance. I do have more
feedback on the docs lined up, but that isn't so important right now.
Thanks! Those were really useful review comments. In passing, I made some updates to the doc, but I really should make a complete pass over the patch.
Thanks,
Pavan
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Please don't forget to remove the xlog.c and miscadmin.h hunks from the patch.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 9, 2018 at 6:55 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> * Is this actually needed at all?:
>>
>> > + /* In MERGE when and condition, no system column is allowed */
>> > + if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
>> > +     attnum < InvalidAttrNumber &&
>> > +     !(attnum == TableOidAttributeNumber || attnum == ObjectIdAttributeNumber))
>> > +     ereport(ERROR,
>> > +             (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
>> > +              errmsg("system column \"%s\" reference in WHEN AND condition is invalid",
>> > +                     colname),
>> > +              parser_errposition(pstate, location)));
>>
>> We're talking about the scantuple here. It's not like excluded.*.

I often care about things like system columns not because of the user-visible functionality, but because it reassures me that the design is robust.

>> I think that this is a blocker, unfortunately.
>
> You mean OVERRIDING or ruleutils?

I meant OVERRIDING, but ruleutils seems like something we need, too.

>> >> * I still think we probably need to add something to ruleutils.c, so
>> >> that MERGE Query structs can be deparsed -- see get_query_def().
>> >
>> > Ok. I will look at it. Not done in this version though.
>>
>> I also wonder what it would take to support referencing CTEs. Other
>> implementations do have this. Why can't we?
>
> Hmm. I will look at them. But TBH I would like to postpone them to v12 if
> they turn out to be tricky. We have already covered a very large ground in
> the last few weeks, but we're approaching the feature freeze date.

I quickly implemented CTE support myself (not wCTE support, since MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work when I mechanically duplicate the approach taken with other types of DML statement in the parser. I have written a few tests, and so far it holds up.

I also undertook something quite a bit harder: changing the representation of the range table from parse analysis on.
As I mentioned in passing at one point, I'm concerned about the fact that there are two RTEs for the target relation in all cases. You introduced a separate Query.resultRelation-style RTI index, Query.mergeTarget_relation, and we see stuff like this through every stage of query processing, from parse analysis right through to execution. One problem with the existing approach is that it leaves many cases where EXPLAIN MERGE shows the target relation alias "t" as "t_1" for some planner nodes, though not for others. More importantly, I suspect that having two distinct RTEs has introduced special cases that are not really needed, and will be more bug-prone (in fact, there are bugs already, which I'll get to at the end). I think that it's fair to say that what happens in the patch with EvalPlanQual()'s RTI argument is ugly, especially because we also have a separate resultRelInfo that we *don't* use. We should aspire to have a MERGE implementation that isn't terribly different to UPDATE or DELETE, especially within the executor.

I wrote a very rough patch that arranged for the MERGE rtable to have only a single relation RTE. My approach was to teach transformFromClauseItem() and friends to recognize that they shouldn't create a new RTE for the MERGE target RangeVar. I actually added some hard coding to getRTEForSpecialRelationTypes() for this, which is ugly as sin, but the general approach is likely salvageable (we could invent a special type of RangeVar to do this, perhaps). The point here is that everything just works if there isn't a separate RTE for the join's leftarg and the setTargetTable() target, with exactly one exception: partitioning becomes *thoroughly* broken. Every single partitioning test fails with "ERROR: no relation entry for relid 1", and occasionally some other "can't happen" error. This looks like it would be hard to fix -- at least, I'd find it hard to fix.
This is an extract from header comments for inheritance_planner() (master branch):

 * We have to handle this case differently from cases where a source relation
 * is an inheritance set. Source inheritance is expanded at the bottom of the
 * plan tree (see allpaths.c), but target inheritance has to be expanded at
 * the top. The reason is that for UPDATE, each target relation needs a
 * different targetlist matching its own column set. Fortunately,
 * the UPDATE/DELETE target can never be the nullable side of an outer join,
 * so it's OK to generate the plan this way.

Of course, that isn't true of MERGE: The MERGE target *can* be the nullable side of an outer join. That's probably a big complication for using one target RTE. Your approach to implementing partitioning [1] seems to benefit from having two different RTEs, in a way -- you sidestep the restriction. As you put it, "Since MERGE need both the facilities [both INSERT and UPDATE facilities], I'd to pretty much merge both the machineries". As I understand it, you "merge" these machineries by having find_mergetarget_for_rel() work backwards. You are mapping from one relation tree to another relation tree in an ad-hoc fashion -- both trees for the same underlying RTE. (You compensate for this later, in the executor, with the special EvalPlanQual() RTI stuff I mentioned already.)

I have some broad concerns here. I would especially like to hear from hackers that know more about partitioning/inheritance than I do on these concerns. They are:

* Is it okay to take this approach with partitioning? I worry about things like ExecAuxRowMark handling. We avoid calling EvalPlanQualSetPlan() within ExecModifyTable() for CMD_MERGE, so I'm pretty sure that that's already broken as-is. Separately, I can't see why it's okay that CMD_MERGE doesn't have mt_transition_capture initialization occur per-child, so that's probably another bug of the same general variety.
To be clear, this is the specific part of the patch that avoids going through child plans as described:

> @@ -1927,7 +2590,14 @@ ExecModifyTable(PlanState *pstate)
>     {
>         /* advance to next subplan if any */
>         node->mt_whichplan++;
> -       if (node->mt_whichplan < node->mt_nplans)
> +
> +       /*
> +        * If we are executing MERGE, we only need to execute the first
> +        * subplan since it's guaranteed to return all the required tuples.
> +        * In fact, running remaining subplans would be a problem since we
> +        * will end up fetching the same tuples N times.
> +        */
> +       if (node->mt_whichplan < node->mt_nplans && (operation != CMD_MERGE))
>         {
>             resultRelInfo++;
>             subplanstate = node->mt_plans[node->mt_whichplan];

* Is there a way to make what I describe (having a single target RTE) work with partitioning, without any big new special cases, especially in the executor?

* Any thoughts on this multiple-RTEs-for-target-rel business more generally?

[1] https://postgr.es/m/CABOikdPjjG+JcdNeegrL7=BtPdJ6yEv--V4hU8KzJTTwX1SNmw@mail.gmail.com

--
Peter Geoghegan
On Sun, Mar 11, 2018 at 9:27 AM, Peter Geoghegan <pg@bowt.ie> wrote:
[1] https://www.postgresql.org/message-id/CABOikdM%2Bc1vB_%2B3tYEjO%3DJ6U2uNHzKU_b%3DU72tadD5-9xQcbHA%40mail.gmail.com
>>
>> We're talking about the scantuple here. It's not like excluded.*.
I often care about things like system columns not because of the
user-visible functionality, but because it reassures me that the
design is robust.
Ok. I will look at it. I think it shouldn't be too difficult and the original restriction was mostly a fallout of expecting CHECK constraint style expressions there.
>> I think that this is a blocker, unfortunately.
>
> You mean OVERRIDING or ruleutils?
I meant OVERRIDING, but ruleutils seems like something we need, too.
Ok. OVERRIDING is done. I think we can support ruleutils easily too. I don't know how to test that though.
>> >> * I still think we probably need to add something to ruleutils.c, so
>> >> that MERGE Query structs can be deparsed -- see get_query_def().
>> >
>> > Ok. I will look at it. Not done in this version though.
>>
>> I also wonder what it would take to support referencing CTEs. Other
>> implementations do have this. Why can't we?
>
>
> Hmm. I will look at them. But TBH I would like to postpone them to v12 if
> they turn out to be tricky. We have already covered a very large ground in
> the last few weeks, but we're approaching the feature freeze date.
I quickly implemented CTE support myself (not wCTE support, since
MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work
when I mechanically duplicate the approach taken with other types of
DML statement in the parser. I have written a few tests, and so far it
holds up.
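To illustrate what that would enable (the table names here are invented), a CTE referenced as the MERGE source would look something like:

```sql
WITH recent AS (
    SELECT key, val
    FROM staging
    WHERE loaded_at > now() - interval '1 day'
)
MERGE INTO target t
USING recent s
ON t.key = s.key
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (key, val) VALUES (s.key, s.val);
```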
Ok, thanks. I started doing something similar, but great if you have already implemented it. I will focus on other things for now.
I also undertook something quite a bit harder: Changing the
representation of the range table from parse analysis on. As I
mentioned in passing at one point, I'm concerned about the fact that
there are two RTEs for the target relation in all cases. You
introduced a separate Query.resultRelation-style RTI index,
Query.mergeTarget_relation, and we see stuff like this through every
stage of query processing, from parse analysis right through to
execution. One problem with the existing approach is that it leaves
many cases where EXPLAIN MERGE shows the target relation alias "t" as
"t_1" for some planner nodes, though not for others. More importantly,
I suspect that having two distinct RTEs has introduced special cases
that are not really needed, and will be more bug-prone (in fact, there
are bugs already, which I'll get to at the end). I think that it's
fair to say that what happens in the patch with EvalPlanQual()'s RTI
argument is ugly, especially because we also have a separate
resultRelInfo that we *don't* use. We should aspire to have a MERGE
implementation that isn't terribly different to UPDATE or DELETE,
especially within the executor.
I thought for a while about this and even tried multiple approaches before settling for what we have today. The biggest challenge is that inheritance/partition tables take completely different paths for INSERTs and UPDATE/DELETE. The RIGHT OUTER JOIN makes it kinda difficult because the regular UPDATE/DELETE code path ends up throwing duplicates when the source table is joined with individual partitions. IIRC that's the sole reason why I had to settle on pushing the JOIN underneath, give it SELECT-like treatment and then handle UPDATE/DELETE in the executor.
I wrote a very rough patch that arranged for the MERGE rtable to have
only a single relation RTE. My approach was to teach
transformFromClauseItem() and friends to recognize that they shouldn't
create a new RTE for the MERGE target RangeVar. I actually added some
hard coding to getRTEForSpecialRelationTypes() for this, which is ugly
as sin, but the general approach is likely salvageable (we could
invent a special type of RangeVar to do this, perhaps). The point here
is that everything just works if there isn't a separate RTE for the
join's leftarg and the setTargetTable() target, with exactly one
exception: partitioning becomes *thoroughly* broken. Every single
partitioning test fails with "ERROR: no relation entry for relid 1",
and occasionally some other "can't happen" error. This looks like it
would be hard to fix -- at least, I'd find it hard to fix.
Ok. If you've something which is workable, then great. But AFAICS this is what the original patch was doing until we came to support partitioning. Even with partitioning I could get everything to work, without duplicating the RTE, except the duplicate rows issue. I don't know how to solve that without doing what I've done or completely rewriting UPDATE/DELETE handling for inheritance/partitioned tables. If you or others have better ideas, they are most welcome.
This is an extract from header comments for inheritance_planner()
(master branch):
* We have to handle this case differently from cases where a source relation
* is an inheritance set. Source inheritance is expanded at the bottom of the
* plan tree (see allpaths.c), but target inheritance has to be expanded at
* the top. The reason is that for UPDATE, each target relation needs a
* different targetlist matching its own column set. Fortunately,
* the UPDATE/DELETE target can never be the nullable side of an outer join,
* so it's OK to generate the plan this way.
Of course, that isn't true of MERGE: The MERGE target *can* be the
nullable side of an outer join. That's probably a big complication for
using one target RTE. Your approach to implementing partitioning [1]
seems to benefit from having two different RTEs, in a way -- you
sidestep the restriction.
Right. The entire purpose of having two different RTEs is to work around this problem. I explained this approach here [1]. I didn't receive any objections then, but that's mostly because nobody read it carefully. As I said, if we have an alternate feasible and better mechanism, let's go for it as long as the effort is justifiable.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Mar 10, 2018 at 9:22 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Ok. I will look at it. I think it shouldn't be too difficult and the
> original restriction was mostly a fallout of expecting CHECK constraint
> style expressions there.

Good, thanks.

> Ok. OVERRIDING is done. I think we can support ruleutils easily too. I don't
> know how to test that though.

Glad to hear it.

> I thought for a while about this and even tried multiple approaches before
> settling for what we have today. The biggest challenge is that
> inheritance/partition tables take completely different paths for INSERTs and
> UPDATE/DELETE. The RIGHT OUTER JOIN makes it kinda difficult because the
> regular UPDATE/DELETE code path ends up throwing duplicates when the source
> table is joined with individual partitions. IIRC that's the sole reason why
> I'd to settle on pushing the JOIN underneath, give it SELECT like treatment
> and then handle UPDATE/DELETE in the executor.

It sounds like we should try to thoroughly understand why these duplicates arose. Did you actually call EvalPlanQualSetPlan() for all subplans at the time?

> Ok. If you've something which is workable, then great. But AFAICS this is
> what the original patch was doing until we came to support partitioning.
> Even with partitioning I could get everything to work, without duplicating
> the RTE, except the duplicate rows issue. I don't know how to solve that
> without doing what I've done or completely rewriting UPDATE/DELETE handling
> for inheritance/partition table. If you or others have better ideas, they
> are most welcome.

I don't claim that what I wrote was workable with partitioning. But I'm not getting how we can get away with not calling EvalPlanQualSetPlan() for child plans, or something like it, as things are.

> Right. The entire purpose of having two different RTEs is to work around
> this problem. I explained this approach here [1]. I didn't receive any
> objections then, but that's mostly because nobody read it carefully. As I
> said, if we have an alternate feasible and better mechanism, let's go for it
> as long as efforts are justifiable.

FWIW, you're right that I didn't give that aspect much thought until quite recently. I'm no expert on partitioning.

As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in the works from Alvaro. In your explanation about that approach that you cited, you wondered what the trouble might have been with ON CONFLICT + partitioning, and supposed that the issues were similar there. Are they? Has that turned up much?

--
Peter Geoghegan
On Sun, Mar 11, 2018 at 11:18 AM, Peter Geoghegan <pg@bowt.ie> wrote:
It sounds like we should try to thoroughly understand why these
duplicates arose. Did you actually call EvalPlanQualSetPlan() for all
subplans at the time?
The reason for duplicates or even wrong answers is quite simple. The way UPDATE/DELETE currently works for partitioned tables is that we expand the inheritance tree for the parent result relation and then create a subplan for each partition separately. This works fine, even when there exists a FROM/USING clause in the UPDATE/DELETE statement, because the final result does not change irrespective of whether you first do a UNION ALL between all partitions and then find the candidate rows, or whether you find candidate rows from individual partitions separately.
In case of MERGE though, since we are performing a RIGHT OUTER JOIN between the result relation and the source relation, we may conclude that a matching target row does not exist for a source row, whereas it actually exists but in some other partition. For example,
CREATE TABLE target (key int, val text) PARTITION BY LIST ( key);
CREATE TABLE part1 PARTITION OF target FOR VALUES IN (1, 2, 3);
CREATE TABLE part2 PARTITION OF target FOR VALUES IN (4, 5, 6);
CREATE TABLE source (skey integer);
INSERT INTO source VALUES (1), (4), (7);
INSERT INTO part1 VALUES (1, 'v1'), (2, 'v2'), (3, 'v3');
INSERT INTO part2 VALUES (4, 'v4'), (5, 'v5'), (6, 'v6');
postgres=# SELECT * FROM target RIGHT OUTER JOIN source ON key = skey;
key | val | skey
-----+-----+------
1 | v1 | 1
4 | v4 | 4
| | 7
(3 rows)
This gives the right answer. But if we join individual partitions and then do a UNION ALL,
postgres=# SELECT * FROM part1 RIGHT OUTER JOIN source ON key = skey UNION ALL SELECT * FROM part2 RIGHT OUTER JOIN source ON key = skey;
key | val | skey
-----+-----+------
1 | v1 | 1
| | 4
| | 7
| | 1
4 | v4 | 4
| | 7
(6 rows)
This is what nodeModifyTable does, and hence we end up getting duplicates or even rows incorrectly declared NOT MATCHED, whereas they are actually matched in a different partition.
I don't think not calling EvalPlanQualSetPlan() on all subplans is a problem, because we really never execute those subplans. In fact, we should fix that so that those subplans are not even initialised.
As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
the works from Alvaro. In your explanation about that approach that
you cited, you wondered what the trouble might have been with ON
CONFLICT + partitioning, and supposed that the issues were similar
there. Are they? Has that turned up much?
Well, I initially thought that ON CONFLICT DO UPDATE on partitioned tables may have the same challenges, but that's probably not the case. For INSERT ON CONFLICT it's still just an INSERT path, with some special handling for UPDATEs. Currently, for a partitioned or inherited table, UPDATEs/DELETEs go via inheritance_planner(), thus expanding inheritance for the result relation, whereas INSERTs go via simple grouping_planner().
For MERGE, we do all three DMLs. That doesn't mean we could not re-implement MERGE on the lines of INSERTs, but that would most likely mean completely rewriting UPDATE/DELETE handling for partitioned/inheritance tables. The challenges would just be the same in both cases.
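To make that point concrete (the table names here are invented), a single MERGE can mix all three actions, which is why it cannot simply piggyback on the INSERT-only path the way ON CONFLICT does:

```sql
MERGE INTO accounts t
USING changes s
ON t.id = s.id
WHEN MATCHED AND s.balance = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET balance = s.balance
WHEN NOT MATCHED THEN
    INSERT (id, balance) VALUES (s.id, s.balance);
```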
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greetings Pavan, all,

* Pavan Deolasee (pavan.deolasee@gmail.com) wrote:
> On 9 March 2018 at 08:29, Peter Geoghegan <pg@bowt.ie> wrote:
> > My #1 concern has become RLS, and
> > perhaps only because I haven't studied it in enough detail.
>
> Sure. I've done what I thought is the right thing to do, but please check.
> Stephen also wanted to review RLS related code; I don't know if he had
> chance to do so.

I've started looking through the code from an RLS perspective and, at least on my initial review, it looks alright.

A couple of tests that aren't included but I think should be in the regression suite are where both tables have RLS policies and where the RLS tables have actual SELECT policies beyond just 'true'. I certainly see SELECT policies which limit the records that individuals are allowed to see very frequently, and it's an important case to ensure works correctly. I did a few tests here myself and they behaved as I expected, and reading through the code it looks reasonable; these should be simple tests to write and they'd run very quickly.

I'm a bit on the fence about it, but as MERGE is added in quite a few places which previously mentioned UPDATE and DELETE throughout the system, I wonder if we shouldn't do better than this:

=*# create policy p1 on t1 for merge using ((c1 % 2) = 0);
ERROR: syntax error at or near "merge"
LINE 1: create policy p1 on t1 for merge using ((c1 % 2) = 0);

Specifically, perhaps we should change that to pick up on MERGE being asked for there and return an error saying that policies for the MERGE command aren't supported, with a HINT that MERGE respects INSERT/UPDATE/DELETE policies and that users should use those instead.

Also, in nodeModifyTable.c, there's a comment:

* The WITH CHECK quals are applied in ExecUpdate() and hence we need
* not do anything special to handle them.

Which I believe is actually getting at the fact that ExecUpdate() will run ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK ...) and therefore in ExecMergeMatched() we don't need to check WCO_RLS_UPDATE_CHECK, but we do still need to check WCO_RLS_MERGE_UPDATE_CHECK (and that's what the code does).

One thing I wonder about there though is if we really need to segregate those..? What's actually making WCO_RLS_MERGE_UPDATE_CHECK different from WCO_RLS_UPDATE_CHECK? I get that the DELETE case is different, because in a regular DELETE we'll never even see the row, but for MERGE, we will see the row (assuming it passes SELECT policies, of course) and then will check if it matches, and that's when we realize that we've been asked to run a DELETE, so we have the special case of WCO_RLS_MERGE_DELETE_CHECK. But that's not true of UPDATE, so I think this might be adding a bit of unnecessary complication by introducing WCO_RLS_MERGE_UPDATE_CHECK.

Thanks!

Stephen
Attachment
Hi Stephen,
On Fri, Mar 16, 2018 at 7:28 AM, Stephen Frost <sfrost@snowman.net> wrote:
Greetings Pavan, all,
* Pavan Deolasee (pavan.deolasee@gmail.com) wrote:
> On 9 March 2018 at 08:29, Peter Geoghegan <pg@bowt.ie> wrote:
> > My #1 concern has become RLS, and
> > perhaps only because I haven't studied it in enough detail.
>
> Sure. I've done what I thought is the right thing to do, but please check.
> Stephen also wanted to review RLS related code; I don't know if he had
> chance to do so.
I've started looking through the code from an RLS perspective and, at
least on my initial review, it looks alright.
Thanks for taking time out to review the patch. It certainly helps a lot.
A couple of tests that aren't included but I think should be in the
regression suite are where both tables have RLS policies and
where the RLS tables have actual SELECT policies beyond just 'true'.
I certainly see SELECT policies which limit the records that
individuals are allowed to see very frequently, and it's an
important case to ensure works correctly. I did a few tests here myself
and they behaved as I expected, and reading through the code it looks
reasonable; these should be simple tests to write and they'd run very
quickly.
Ok. I will add those tests.
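A rough sketch of the kind of test being asked for (all names invented): a target table whose SELECT policy is more restrictive than 'true', so that MERGE should treat invisible rows as NOT MATCHED.

```sql
CREATE TABLE docs (id int, owner name, body text);
ALTER TABLE docs ENABLE ROW LEVEL SECURITY;

-- Non-trivial SELECT policy: users only see their own rows.
CREATE POLICY docs_sel ON docs FOR SELECT USING (owner = current_user);
CREATE POLICY docs_ins ON docs FOR INSERT WITH CHECK (owner = current_user);
CREATE POLICY docs_upd ON docs FOR UPDATE USING (owner = current_user);

-- A MERGE run as a non-superuser should then see rows owned by others
-- as NOT MATCHED, and any INSERT/UPDATE actions must satisfy the
-- corresponding WITH CHECK/USING policies.
```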
I'm a bit on the fence about it, but as MERGE is added in quite a few
places which previously mentioned UPDATE and DELETE throughout the
system, I wonder if we shouldn't do better than this:
=*# create policy p1 on t1 for merge using ((c1 % 2) = 0);
ERROR: syntax error at or near "merge"
LINE 1: create policy p1 on t1 for merge using ((c1 % 2) = 0);
Specifically, perhaps we should change that to pick up on MERGE being
asked for there and return an error saying that policies for the MERGE
command aren't supported with a HINT that MERGE respects
INSERT/UPDATE/DELETE policies and that users should use those instead.
Hmm. I am not sure if that would be a good idea just for RLS. We might then also want to change several other places in the grammar to accept the MERGE keyword and handle it during parse analysis. We can certainly do that, but I am not sure if it adds a lot of value, and it certainly adds a lot more code. We should surely document these things, if we are not already.
Also, in nodeModifyTable.c, there's a comment:
* The WITH CHECK quals are applied in ExecUpdate() and hence we need
* not do anything special to handle them.
Which I believe is actually getting at the fact that ExecUpdate() will
run ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK ...) and therefore in
ExecMergeMatched() we don't need to check WCO_RLS_UPDATE_CHECK, but we
do still need to check WCO_RLS_MERGE_UPDATE_CHECK (and that's what the
code does).
Right.
One thing I wonder about there though is if we really need
to segregate those..? What's actually making WCO_RLS_MERGE_UPDATE_CHECK
different from WCO_RLS_UPDATE_CHECK? I get that the DELETE case is
different, because in a regular DELETE we'll never even see the row, but
for MERGE, we will see the row (assuming it passes SELECT policies, of
course) and then will check if it matches and that's when we realize
that we've been asked to run a DELETE, so we have the special-case of
WCO_RLS_MERGE_DELETE_CHECK, but that's not true of UPDATE, so I think
this might be adding a bit of unnecessary complication by introducing
WCO_RLS_MERGE_UPDATE_CHECK.
I've modelled this code on the lines of ON CONFLICT DO NOTHING. And quite similar to that, I believe we need separate handling for MERGE as well because we don't (and can't) push down the USING security quals of UPDATE/DELETE to the scan. So we need to separately check that the target row actually passes the USING quals. WCO_RLS_MERGE_UPDATE_CHECK and WCO_RLS_MERGE_DELETE_CHECK are used for that purpose.
One point to deliberate though is whether it's a good idea to throw an error when the USING quals fail, or whether we should silently ignore such rows. Regular UPDATE/DELETE does the latter and ON CONFLICT DO UPDATE does the former. I chose to throw an error because otherwise it may get confusing to the user, since a row would neither be updated (it would look like a NOT MATCHED case) nor inserted. I hope that's not a problem.
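To spell out the chosen behavior with an invented example: if a matched row is visible to the scan but fails the UPDATE policy's USING qual, the statement errors out rather than silently skipping the row.

```sql
-- Invented schema/policy, for illustration only.
CREATE POLICY t_upd ON target FOR UPDATE USING (status <> 'frozen');

-- If a WHEN MATCHED ... UPDATE action lands on a row with
-- status = 'frozen', the MERGE raises a policy-violation error
-- instead of treating the row as NOT MATCHED (which could otherwise
-- trigger a surprising INSERT).
```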
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 12, 2018 at 5:43 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
On Sun, Mar 11, 2018 at 11:18 AM, Peter Geoghegan <pg@bowt.ie> wrote:
As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
the works from Alvaro. In your explanation about that approach that
you cited, you wondered what the trouble might have been with ON
CONFLICT + partitioning, and supposed that the issues were similar
there. Are they? Has that turned up much?Well, I initially thought that ON CONFLICT DO UPDATE on partition table may have the same challenges, but that's probably not the case. For INSERT ON CONFLICT it's still just an INSERT path, with some special handling for UPDATEs. Currently, for partition or inherited table, UPDATEs/DELETEs go via inheritance_planner() thus expanding inheritance for the result relation where as INSERTs go via simple grouping_planner().For MERGE, we do all three DMLs. That doesn't mean we could not re-implement MERGE on the lines of INSERTs, but that would most likely mean complete re-writing of the UPDATEs/DELETEs for partition/inheritance tables. The challenges would just be the same in both cases.
Having thought more about this in the last couple of days, I am actually inclined to try rewriting the UPDATE handling of MERGE on the lines of what the ON CONFLICT DO UPDATE patch is doing. This might help us to completely eliminate invoking inheritance_planner() for partitioned tables, and that will be a huge win for tables with several hundred partitions. The code might also look much cleaner that way. I am gonna give it a try for the next couple of days and see if it's doable.
Thanks,
Pavan
On Sun, Mar 18, 2018 at 11:31 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> [...]
> Having thought more about this in the last couple of days, I am actually
> inclined to try rewriting the UPDATE handling of MERGE on the lines of what
> the ON CONFLICT DO UPDATE patch is doing. This might help us to completely
> eliminate invoking inheritance_planner() for partitioned tables and that
> will be a huge win for tables with several hundred partitions. The code
> might also look much cleaner that way.
So here is a version that completely avoids taking the inheritance_planner() route and instead handles MERGE UPDATE/DELETE during execution, by translating tuples between the child and root partitions and vice versa. To achieve this we no longer expand partition inheritance for the result relation, just as INSERT does.
Apart from making the code look better, this also gives a very nice boost to performance. In fact, v20 performed very poorly with 10 or more partitions, whereas this version even beats regular UPDATE when there are 10 or more partitions. This is because we are able to avoid calling grouping_planner() repeatedly for all the partitions, by not expanding the partition tree for the result relation.
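Reusing the partitioned example from earlier in the thread, this is the sort of statement the new approach benefits:

```sql
-- target is the LIST-partitioned table defined earlier in the thread.
MERGE INTO target t
USING source s
ON t.key = s.skey
WHEN MATCHED THEN
    UPDATE SET val = t.val || '-merged'
WHEN NOT MATCHED THEN
    INSERT (key, val) VALUES (s.skey, 'new');
-- Inserted rows are routed to the correct partition at execution time,
-- without expanding the inheritance tree for the result relation
-- at plan time.
```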
This patch is based on the on-going work on ON CONFLICT DO UPDATE for partitioned table. I've attached the base patch that I am using from that work and I will re-base MERGE patch once the other patch set takes a final shape.
Apart from this major change, there are several other changes:
- moved some of the new code to two new source files, src/backend/executor/nodeMerge.c and src/backend/parser/parse_merge.c. This necessitated exporting ExecUpdate/ExecInsert/ExecDelete functions from nodeModifyTable.c, but I think it's still a better change.
- changed the way slots are created per action-state. Instead, we now have a single slot per result relation and we only change the slot descriptor while working with a specific action. We of course still need to track the tuple descriptor per action, though.
- moved some merge-specific state, such as the lists of matched and not-matched actions, to the result relation. In the case of a partitioned table, if the partition child's schema differs from the root's, we set up new merge state for the partition and transform various members of the action-state to reflect the child table's schema.
Regarding a few other things:
- I wrote code for deparsing MERGE, but don't know how to test that
- I could support system column references for the target relation by simply removing the blocking error, but it currently does not work for the source relation. I am not sure if we even want to support that. But we should definitely throw a more user-friendly error than what the patch does today.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Thu, Mar 8, 2018 at 6:59 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> Phew! Thanks for your patience and perseverance. I do have more
> feedback on the docs lined up, but that isn't so important right now.

I want to get this feedback on the docs to you now. This is a little raw, as it's taken from notes made 2 weeks ago. I haven't checked whether these points have already been addressed, so apologies if you've already done something about some of them.

* Why is the terminating semi-colon on its own line in the MERGE SQL examples in the docs?

* ON CONFLICT references in the MERGE docs should be a "Tip" box, IMV.

* Docs don't actually say what DO NOTHING does.

* The docs say "A condition cannot contain subqueries, set returning functions, nor can it contain window or aggregate functions". I thought it can now?

* The docs say "INSERT actions cannot contain sub-selects". Didn't that change?

* The docs on the merge_insert grammar element say "Do not include the table name", which seems like it was already covered. It's obvious. Suggest removing it.

* The docs say "The number of rows inserted, updated and deleted can be seen in the output of EXPLAIN ANALYZE". This seems unnecessary. We don't see that for ON CONFLICT.

* The docs say "it should be noted that no error is raised if that occurs". But should it? I don't think that this addresses a perception that users are likely to have.

* "See MERGE" hyperlinks within the MERGE docs themselves seem odd. Suggest changing this.

* Trigger behavior doesn't belong in the main MERGE doc page, IMV. Just like with ON CONFLICT, it should be documented where triggers are already documented. The specifics of MERGE can be discussed there.

* It might make sense to point out in the docs that join_condition should not filter the target table too much. As the SQL Server docs say, don't put things in the join that filter the target that actually belong in the WHEN .. AND quals. In a way, this should be obvious, because it's an outer join. But I don't think it is, and ISTM that the sensible thing to do is to warn against it.

* We never actually get around to saying that MERGE is good with bulk loading, ETL, and so on. I think that we should remark on that in passing.

* I think that the mvcc.sgml changes can go. Perhaps a passing reference to MERGE can be left behind, that makes it clear that it's really rather like UPDATE FROM and so on. The fact that it's like UPDATE FROM now seems crystal clear.

* insert.sgml need not mention MERGE, IMV.

I also have some additional minor feedback on the code itself that I'll send now, which is a bit more recent (based on the version posted on 2018-03-08):

* Most of the field comments here should be improved:

> /* ----------------
>  * MergeActionState information
>  * ----------------
>  */
> typedef struct MergeActionState
> {
>     NodeTag     type;
>     bool        matched;        /* MATCHED or NOT MATCHED */
>     ExprState  *whenqual;       /* WHEN quals */
>     CmdType     commandType;    /* type of action */
>     TupleTableSlot *slot;       /* instead of ResultRelInfo */
>     ProjectionInfo *proj;       /* instead of ResultRelInfo */
> } MergeActionState;

* I wouldn't talk about a precedent like this:

> > /*
> >  * Process BEFORE EACH STATEMENT triggers
> > + *
> > + * The precedent set by ON CONFLICT is that we fire INSERT then UPDATE.
> > + * MERGE follows the same logic, firing INSERT, then UPDATE, then DELETE.
> >  */

The comments should read as one voice. Ideally, it will look like ON CONFLICT could have just as easily been added after MERGE, unless there is a very compelling reason to mention a precedent. I mean, is there really any reason to think that the precedent set by ON CONFLICT actually made any difference? Is the suggestion here that there is a better behavior, that we cannot go with for historical reasons?

* This ExecModifyTable() code seems a bit odd:

>     if (operation == CMD_MERGE)
>     {
>         ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
>         continue;
>     }
>
>     tupleid = NULL;
>     oldtuple = NULL;
>     if (junkfilter != NULL)
>     {
>         /*
>          * extract the 'ctid' or 'wholerow' junk attribute.
>          */
>         if (operation == CMD_UPDATE ||
>             operation == CMD_DELETE ||
>             operation == CMD_MERGE)
>         {

Why is there a test for CMD_MERGE if control flow cannot even reach there? What's going on here?

* No need to say "not planned" here, I think:

> +typedef struct MergeAction
> +{
> +    NodeTag     type;
> +    bool        matched;        /* MATCHED or NOT MATCHED */
> +    Node       *condition;      /* conditional expr (raw parser) */
> +    Node       *qual;           /* conditional expr (transformWhereClause) */
> +    CmdType     commandType;    /* type of action */
> +    Node       *stmt;           /* T_UpdateStmt etc - not planned */
> +    List       *targetList;    /* the target list (of ResTarget) */
> +} MergeAction;

* Do tables with rules reject MERGE commands sensibly? We should have a test for that.

* Logical decoding tests (test_decoding regression tests) seem like a good idea. This is much less important than with ON CONFLICT, since you don't have the significant complication of speculative insertions, but it seems like something to add on general principle.

-- Peter Geoghegan
Hi Peter,
Thanks for the additional comments.
--
On Wed, Mar 21, 2018 at 2:17 AM, Peter Geoghegan <pg@bowt.ie> wrote:
* Why is terminating semi-colon on its own line in MERGE sql examples
from the docs?
I checked a few other places and didn't find another example which puts the semi-colon on its own line. So fixed per your suggestion.
* ON CONFLICT references in MERGE docs should be a "Tip" box, IMV.
Changed.
* Docs don't actually say what DO NOTHING does.
Added a para about that. In passing, I noticed that even though the grammar and the docs agree that DO NOTHING is only supported for WHEN NOT MATCHED, ExecMergeMatched() was unnecessarily checking for DO NOTHING. So fixed the code.
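For reference, the semantics being documented here, under the grammar as described above (where DO NOTHING is only a NOT MATCHED action), would look something like the following sketch. The table and column names are hypothetical, purely for illustration:

```sql
-- A sketch: source rows with no match in the target are skipped
-- explicitly via DO NOTHING, rather than inserted.
MERGE INTO tgt t
USING src s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    DO NOTHING;   -- unmatched source rows are ignored; no error is raised
```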
* The docs say "A condition cannot contain subqueries, set returning
functions, nor can it contain window or aggregate functions". Thought
it can now?
Yes, it now supports sub-queries. What about set-returning functions, aggregates, etc.? I assume they are not supported in other places such as WHERE conditions and JOIN quals, so they will continue to remain blocked even in WHEN conditions. Do you think it's worth mentioning, or should we not mention anything at all?
* The docs say "INSERT actions cannot contain sub-selects". Didn't that change?
No, it did not. We only support VALUES clause with INSERT action.
* The docs on merge_insert grammar element say "Do not include the
table name", which seems like it was already covered. It's obvious.
Suggest removing it.
Ok, fixed.
* The docs say "The number of rows inserted, updated and deleted can
be seen in the output of EXPLAIN ANALYZE". This seems unnecessary. We
don't see that for ON CONFLICT.
You mean the doc changes are unnecessary or the EXPLAIN ANALYZE output is unnecessary? I assume the doc changes, but let me know if that's a wrong assumption.
* The docs say "it should be noted that no error is raised if that
occurs". But should it? I don't think that this addresses a perception
that users are likely to have.
Again, do you mean we should raise an error, or just that the docs should not mention anything about it? I don't think raising an error because the candidate row did not meet any specified action is a good idea. Maybe some day we'll add another option to store such rows in a separate temporary table.
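To illustrate the behavior under discussion, a sketch with hypothetical tables: a candidate row that satisfies no WHEN clause is simply skipped, with no error raised:

```sql
-- Hypothetical example: matched rows with t.val >= 10 satisfy no
-- action at all, so they are silently left untouched.
MERGE INTO tgt t
USING src s ON t.id = s.id
WHEN MATCHED AND t.val < 10 THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val);
```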
* "See MERGE" hyperlink within MERGE docs themselves seem odd.
Suggest changing this.
Removed.
* Trigger behavior doesn't belong in main MERGE doc page, IMV. Just
like with ON CONFLICT, it should be documented where triggers are
already documented. The specifics of MERGE can be discussed there.
Ok. I added a couple of paras to trigger.sgml, but left merge.sgml untouched. Suggestions for better wording are welcome.
* It might make sense to point out in the docs that join_condition
should not filter the target table too much. Like SQL server docs say,
don't put things in the join that filter the target that actually
belong in the WHEN .. AND quals. In a way, this should be obvious,
because it's an outer join. But I don't think it is, and ISTM that the
sensible thing to do is to warn against it.
Hmm, ok. Not sure how exactly to put that in words without confusing users. Do you want to suggest something?
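For concreteness, the pitfall might be sketched like this (hypothetical tables; not proposed doc wording, just the shape of the problem):

```sql
-- Risky: a target-only filter in the join condition makes filtered-out
-- target rows appear NOT MATCHED, potentially triggering unwanted INSERTs.
MERGE INTO tgt t
USING src s ON t.id = s.id AND t.status = 'active'
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);

-- Safer: join on the matching key only, and filter in the WHEN ... AND qual.
MERGE INTO tgt t
USING src s ON t.id = s.id
WHEN MATCHED AND t.status = 'active' THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);
```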
* We never actually get around to saying that MERGE is good with bulk
loading, ETL, and so on. I think that we should remark on that in
passing.
Suggestion?
* I think that the mvcc.sgml changes can go. Perhaps a passing
reference to MERGE can be left behind, that makes it clear that it's
really rather like UPDATE FROM and so on. The fact that it's like
UPDATE FROM now seems crystal clear.
It seems useful to me. Should we move it to merge.sgml instead?
* insert.sgml need not mention MERGE IMV.
Hmm. I am not sure. It seems worth leaving a reference there since MERGE provides a new way to handle INSERTs.
* Most of the field comments here should be improved:
> /* ----------------
> * MergeActionState information
> * ----------------
> */
> typedef struct MergeActionState
> {
> NodeTag type;
> bool matched; /* MATCHED or NOT MATCHED */
> ExprState *whenqual; /* WHEN quals */
> CmdType commandType; /* type of action */
> TupleTableSlot *slot; /* instead of ResultRelInfo */
> ProjectionInfo *proj; /* instead of ResultRelInfo */
> } MergeActionState;
Done.
* I wouldn't talk about a precedent like this:
> > /*
> > * Process BEFORE EACH STATEMENT triggers
> > + *
> > + * The precedent set by ON CONFLICT is that we fire INSERT then UPDATE.
> > + * MERGE follows the same logic, firing INSERT, then UPDATE, then DELETE.
> > */
The comments should read as one voice. Ideally, it will look like ON
CONFLICT could have just as easily been added after MERGE, unless
there is a very compelling reason to mention a precedent. I mean, is
there really any reason to think that the precedent set by ON CONFLICT
actually made any difference? Is the suggestion here that there is a
better behavior, that we cannot go with for historical reasons?
I agree. I removed those comments.
* This ExecModifyTable() code seems a bit odd:
>
> if (operation == CMD_MERGE)
> {
> ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
> continue;
> }
>
> tupleid = NULL;
> oldtuple = NULL;
> if (junkfilter != NULL)
> {
> /*
> * extract the 'ctid' or 'wholerow' junk attribute.
> */
> if (operation == CMD_UPDATE ||
> operation == CMD_DELETE ||
> operation == CMD_MERGE)
> {
Why is there a test for CMD_MERGE if control flow cannot even reach
there? What's going on here?
Just leftover bits from the previous code. Removed.
* No need to say "not planned" here, I think:
> +typedef struct MergeAction
> +{
> + NodeTag type;
> + bool matched; /* MATCHED or NOT MATCHED */
> + Node *condition; /* conditional expr (raw parser) */
> + Node *qual; /* conditional expr (transformWhereClause) */
> + CmdType commandType; /* type of action */
> + Node *stmt; /* T_UpdateStmt etc - not planned */
> + List *targetList; /* the target list (of ResTarget) */
> +} MergeAction;
Fixed and also improved other comments for the struct.
* Do tables with rules reject MERGE commands sensibly? We should have
a test for that.
That check was indeed missing. Added the check and the test.
* Logical decoding tests (test_decoding regression tests) seem like a
good idea. This is much less important than with ON CONFLICT, since
you don't have the significant complication of speculative insertions,
but it seems like something to add on general principle.
Good point, added a test.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, Mar 21, 2018 at 5:23 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> * The docs say "A condition cannot contain subqueries, set returning
>> functions, nor can it contain window or aggregate functions". Thought
>> it can now?
>
> Yes, it now supports sub-queries. What about set-returning, aggregates etc?
> I assume they are not supported in other places such as WHERE conditions and
> JOIN quals. So they will continue to remain blocked even in WHEN conditions.
> Do you think it's worth mentioning or we should not mention anything at all?

I would not mention anything at all. It's just the same as other DML statements.

>> * The docs say "INSERT actions cannot contain sub-selects". Didn't that
>> change?
>
> No, it did not. We only support VALUES clause with INSERT action.

But can't you have a subselect in the VALUES()? Support for subselects seems like a totally distinct thing to the restriction that only a (single row) VALUES() is allowed in INSERT actions.

> You mean the doc changes are unnecessary or the EXPLAIN ANALYZE output is
> unnecessary? I assume the doc changes, but let me know if that's wrong
> assumption.

I meant the doc changes.

> Again, do you mean we should raise error or just that the docs should not
> mention anything about it? I don't think raising an error because the
> candidate row did not meet any specified action is a good idea. May be some
> day we add another option to store such rows in a separate temporary table.

I agree that it's not a good idea to raise an error because the candidate row did not meet any specified action. My point is that you don't even need to mention that we don't do that in the docs. It's not like there is any reason to expect that we should -- there is no precedent. I think that that's in the docs because this behavior was contemplated during MERGE's development. But that behavior was ultimately rejected, in large part because there was no precedent.

>> * It might make sense to point out in the docs that join_condition
>> should not filter the target table too much. Like SQL server docs say,
>> don't put things in the join that filter the target that actually
>> belong in the WHEN .. AND quals. In a way, this should be obvious,
>> because it's an outer join. But I don't think it is, and ISTM that the
>> sensible thing to do is to warn against it.
>
> Hmm, ok. Not sure how exactly to put that in words without confusing users.
> Do you want to suggest something?

Perhaps a Warning box should say:

Only columns from "target_table_name" that attempt to match "data_source" rows should appear in "join_condition". "join_condition" subexpressions that only reference "target_table_name" columns can only affect which action is taken, often in surprising ways.

>> * We never actually get around to saying that MERGE is good with bulk
>> loading, ETL, and so on. I think that we should remark on that in
>> passing.
>
> Suggestion?

How about adding this sentence after "MERGE ... a task that would otherwise require multiple procedural language statements":

MERGE can synchronize two tables by modifying one table based on differences between it and some other table.

The point here is that we're primarily talking about two whole tables. That deserves such prominent placement, as that suggests where users might really find MERGE useful, but without being too prescriptive.

Also, instead of saying "There are a variety of differences and restrictions between the two statement types [MERGE and INSERT ... ON CONFLICT DO UPDATE] and they are not interchangeable", you could instead be specific, and say:

MERGE is well suited to synchronizing two tables using multiple complex conditions. Using INSERT with ON CONFLICT DO UPDATE works well when requirements are simpler. Only ON CONFLICT provides an atomic INSERT or UPDATE outcome in READ COMMITTED mode.

BTW, the docs should be clear on the fact that "INSERT ... ON CONFLICT" isn't a statement. INSERT is. ON CONFLICT is a clause.

>> * I think that the mvcc.sgml changes can go. Perhaps a passing
>> reference to MERGE can be left behind, that makes it clear that it's
>> really rather like UPDATE FROM and so on. The fact that it's like
>> UPDATE FROM now seems crystal clear.
>
> It seems useful to me. Should we move it to merge.sgml instead?

The mvcc.sgml changes read like a status report on the patch's behavior with concurrency. Obviously that general tone is not appropriate for a committed patch. Also, what it describes doesn't seem to have much to do with MVCC rules per se. The only thing that seems to warrant discussion in mvcc.sgml is how MERGE really *isn't* a special case. ISTM that you only really need to mention how the decision to use one particular WHEN action can change repeatedly -- every time you walk the UPDATE chain, you start that part from the beginning. The "you might get a duplicate violation" bit can definitely live in merge.sgml, right at the point that ON CONFLICT is mentioned (the Tip box). I don't think that you need too much on this.

-- Peter Geoghegan
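The "synchronize two tables" use case suggested above might look like the following sketch (table and column names are hypothetical, chosen only to illustrate the pattern):

```sql
-- Fold a batch of differences from one table into another:
-- update accounts that already exist, create the ones that don't.
MERGE INTO customer_account ca
USING recent_transactions t ON t.customer_id = ca.customer_id
WHEN MATCHED THEN
    UPDATE SET balance = ca.balance + t.transaction_value
WHEN NOT MATCHED THEN
    INSERT (customer_id, balance)
    VALUES (t.customer_id, t.transaction_value);
```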
On 21 March 2018 at 19:45, Peter Geoghegan <pg@bowt.ie> wrote:
>>> * We never actually get around to saying that MERGE is good with bulk
>>> loading, ETL, and so on. I think that we should remark on that in
>>> passing.
>>
>> Suggestion?
>
> How about adding this sentence after "MERGE ... a task that would
> otherwise require multiple procedural language statements":
>
> MERGE can synchronize two tables by modifying one table based on
> differences between it and some other table.
>
> The point here is that we're primarily talking about two whole tables.
> That deserves such prominent placement, as that suggests where users
> might really find MERGE useful, but without being too prescriptive.

The information I have is that many people are expecting MERGE to work for OLTP since that is how it is used in other databases, not solely as an ETL command. So we're not primarily talking about two whole tables.

> Also, instead of saying "There are a variety of differences and
> restrictions between the two statement types [MERGE and INSERT ... ON
> CONFLICT DO UPDATE] and they are not interchangeable", you could
> instead be specific, and say:
>
> MERGE is well suited to synchronizing two tables using multiple
> complex conditions. Using INSERT with ON CONFLICT DO UPDATE works well
> when requirements are simpler. Only ON CONFLICT provides an atomic
> INSERT or UPDATE outcome in READ COMMITTED mode.
>
> BTW, the docs should be clear on the fact that "INSERT ... ON
> CONFLICT" isn't a statement. INSERT is. ON CONFLICT is a clause.

I think it would be better if you wrote a separate additional doc patch to explain all of this, perhaps in the Performance Tips section or otherwise.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 21 March 2018 at 12:23, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Fixed

Looks like this completes all outstanding items with the MERGE code.

I'm now proposing that I move to commit this, following my own final review, on Tues 27 Mar in 5 days time, giving time for cleanup of related issues.

If there are any items you believe are still open, please say so now or mention any other objections you have.

Thanks for all of your detailed comments,

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Mar 22, 2018 at 1:15 AM, Peter Geoghegan <pg@bowt.ie> wrote:
Thanks,
>
>
> No, it did not. We only support VALUES clause with INSERT action.
But can't you have a subselect in the VALUES()? Support for subselects
seems like a totally distinct thing to the restriction that only a
(single row) VALUES() is allowed in INSERT actions.
Ah, right. That works even today.
postgres=# CREATE TABLE target (a int, b text);
CREATE TABLE
postgres=# MERGE INTO target USING (SELECT 1) s ON false WHEN NOT MATCHED THEN INSERT VALUES ((SELECT count(*) FROM pg_class), (SELECT relname FROM pg_class LIMIT 1));
MERGE 1
postgres=# SELECT * FROM target;
a | b
-----+----------------
755 | pgbench_source
(1 row)
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 22, 2018 at 1:36 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 21 March 2018 at 12:23, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Fixed
Looks like that this completes all outstanding items with the MERGE code.
A slightly improved version attached. Apart from doc cleanup based on earlier feedback, fixed one assertion failure based on Rahila's report. This was happening when the target relation is referenced in the source subquery. Fixed that and added a test case to cover that situation.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Thu, Mar 22, 2018 at 11:42 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> A slightly improved version attached.

You still need to remove this change:

> diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> index a4574cd533..dbfb5d2a1a 100644
> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -444,5 +444,6 @@ extern bool has_rolreplication(Oid roleid);
>  /* in access/transam/xlog.c */
>  extern bool BackupInProgress(void);
>  extern void CancelBackup(void);
> +extern int64 GetXactWALBytes(void);

I see that we're still using two target RTEs in this latest revision, v23e -- the new approach to partitioning, which I haven't had time to study in more detail, has not produced a change there. This causes weird effects, such as the following:

"""
pg@~[20658]=# create table foo(bar int4);
CREATE TABLE
pg@~[20658]=# merge into foo f using (select 1 col) dd on f.bar=dd.col when matched then update set bar = f.barr + 1;
ERROR:  column f.barr does not exist
LINE 1: ...n f.bar=dd.col when matched then update set bar = f.barr + 1...
                                                             ^
HINT:  Perhaps you meant to reference the column "f.bar" or the column "f.bar".
"""

While I think this particular HINT buglet is pretty harmless, I continue to be concerned about the unintended consequences of having multiple RTEs for MERGE's target table. Each RTE comes from a different lookup path -- the first one goes through setTargetTable()'s parserOpenTable() + addRangeTableEntryForRelation() calls. The second one goes through transformFromClauseItem(), for the join, which ultimately ends up calling transformTableEntry()/addRangeTableEntry(). Consider commit 5f173040, which fixed a privilege escalation security bug around multiple name lookup. Could the approach taken by MERGE here introduce a similar security issue? Using GDB, I see two calls to RangeVarGetRelidExtended() when a simple MERGE is executed.
They both have identical relation arguments, that look like this:

(gdb) p *relation
$4 = {
  type = T_RangeVar,
  catalogname = 0x0,
  schemaname = 0x0,
  relname = 0x5600ebdcafb0 "foo",
  inh = 1 '\001',
  relpersistence = 112 'p',
  alias = 0x5600ebdcb048,
  location = 11
}

This seems like something that needs to be explained, at a minimum. Even if I'm completely wrong about there being a security hazard, maybe the suggestion that there might be still gives you some idea of what I mean about unintended consequences.

-- Peter Geoghegan
On Thu, Mar 22, 2018 at 12:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The point here is that we're primarily talking about two whole tables.
>> That deserves such prominent placement, as that suggests where users
>> might really find MERGE useful, but without being too prescriptive.
>
> The information I have is that many people are expecting MERGE to work
> for OLTP since that is how it is used in other databases, not solely
> as an ETL command.

I'm sure that that's true, which is why I said "...without being too prescriptive".

> So we're not primarily talking about two whole tables.

Sure, but that's where MERGE is going to be compelling. Especially for Postgres, which already has ON CONFLICT DO UPDATE.

>> Also, instead of saying "There are a variety of differences and
>> restrictions between the two statement types [MERGE and INSERT ... ON
>> CONFLICT DO UPDATE] and they are not interchangeable", you could
>> instead be specific, and say:
>>
>> MERGE is well suited to synchronizing two tables using multiple
>> complex conditions. Using INSERT with ON CONFLICT DO UPDATE works well
>> when requirements are simpler. Only ON CONFLICT provides an atomic
>> INSERT or UPDATE outcome in READ COMMITTED mode.
>>
>> BTW, the docs should be clear on the fact that "INSERT ... ON
>> CONFLICT" isn't a statement. INSERT is. ON CONFLICT is a clause.
>
> I think it would be better if you wrote a separate additional doc
> patch to explain all of this, perhaps in Performance Tips section or
> otherwise.

I don't think that it has much to do with performance.

-- Peter Geoghegan
On Thu, Mar 22, 2018 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I'm now proposing that I move to commit this, following my own final
> review, on Tues 27 Mar in 5 days time, giving time for cleanup of
> related issues.
>
> If there are any items you believe are still open, please say so now
> or mention any other objections you have.

I would like to see clarity on the use of multiple RTEs for the target table by MERGE. See my remarks to Pavan just now. Also, I think that the related issue of how partitioning was implemented needs to get a lot more review (partitioning is the whole reason for using multiple RTEs, last I checked). It would be easy enough to fix the multiple RTEs issue if partitioning wasn't a factor. I didn't manage to do much review of partitioning at all. I don't really understand how the patch implements partitioning. Input from a subject matter expert might help matters quite a lot.

Pavan hasn't added support for referencing CTEs, which other database systems with MERGE have. I think that it ought to be quite doable. It didn't take me long to get it working myself, but there wasn't follow through on that (I could have posted the patch, which looked exactly as you'd expect it to look). I think that we should add support for CTEs now, as I see no reason for the omission.

In general, I still think that this patch could do with more review, but I'm running out of time. If you want to commit it, I will not explicitly try to block it, but I do have misgivings about your timeframe.

-- Peter Geoghegan
Peter Geoghegan wrote:
> Pavan hasn't added support for referencing CTEs, which other database
> systems with MERGE have. I think that it ought to be quite doable. It
> didn't take me long to get it working myself, but there wasn't follow
> through on that (I could have posted the patch, which looked exactly
> as you'd expect it to look). I think that we should add support for
> CTEs now, as I see no reason for the omission.

Incremental development is a good thing. Trying to do everything in a single commit is great when time is infinite or even merely very long, but if you run out of it, which I'm sure is common, leaving some things out that can be reasonably implemented in a separate patch is perfectly acceptable.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Mar 22, 2018 at 6:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Incremental development is a good thing. Trying to do everything in a
> single commit is great when time is infinite or even merely very long,
> but if you run out of it, which I'm sure is common, leaving some things
> out that can be reasonable implemented in a separate patch is perfectly
> acceptable.

We're talking about something that took me less than an hour to get working. AFAICT, it's just a matter of tweaking the grammar, and adding a bit of transformWithClause() boilerplate to the start of transformMergeStmt().

As I've pointed out on this thread already, I'm often concerned about supporting functionality like this because it increases my overall confidence in the design. If it was genuinely hard to add WITH clause support, then that would probably tell us something about the overall design that likely creates problems elsewhere. It's easy to say that it isn't worth holding the patch up for WITH clause support, because that's true, but it's also beside the point.

-- Peter Geoghegan
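If WITH clause support were added as described, it would presumably permit something of this shape (a sketch only; hypothetical table names, and not accepted by the patch as posted):

```sql
-- Sketch: a CTE feeding MERGE's data source, which the patch under
-- review does not yet accept.
WITH changes AS (
    SELECT id, val FROM staging WHERE batch_id = 42
)
MERGE INTO tgt t
USING changes c ON t.id = c.id
WHEN MATCHED THEN UPDATE SET val = c.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (c.id, c.val);
```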
On 2018/03/23 3:42, Pavan Deolasee wrote:
> A slightly improved version attached. Apart from doc cleanup based on
> earlier feedback, fixed one assertion failure based on Rahila's report.
> This was happening when target relation is referenced in the source
> subquery. Fixed that and added a test case to test that situation.
>
> Rebased on current master.

I tried these patches (applied 0002 on top of 0001). When applying 0002, I got some apply errors:

The next patch would create the file
src/test/isolation/expected/merge-delete.out,
which already exists!  Assume -R? [n]

I managed to apply it by ignoring the errors, but couldn't get make check to pass; attached regressions.diffs if you want to take a look.

Btw, is 0001 redundant with the latest patch on the ON CONFLICT DO UPDATE thread? Can I apply just 0002 on top of that patch? So, I tried that -- that is, skipped your 0001 and instead applied the ON CONFLICT DO UPDATE patch, and then applied your 0002. I had to fix a couple of places to get MERGE working correctly for partitioned tables; attached find a delta patch for the fixes I made, which were needed because I skipped 0001 in favor of the ON CONFLICT DO UPDATE patch. But the regression test failure I mentioned above didn't go away, so it seems to have nothing to do with partitioning.

Thanks, Amit
Attachment
On Fri, Mar 23, 2018 at 10:00 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/03/23 3:42, Pavan Deolasee wrote:
> A slightly improved version attached. Apart from doc cleanup based on
> earlier feedback, fixed one assertion failure based on Rahila's report.
> This was happening when target relation is referenced in the source
> subquery. Fixed that and added a test case to test that situation.
>
> Rebased on current master.
I tried these patches (applied 0002 on top of 0001). When applying 0002,
I got some apply errors:
The next patch would create the file
src/test/isolation/expected/merge-delete.out,
which already exists! Assume -R? [n]
I managed to apply it by ignoring the errors, but couldn't get make check
to pass; attached regressions.diffs if you want to take a look.
Thanks. Are you sure you're using a clean repo? I suspect you had a previous version of the patch applied, hence the apply errors now. I also suspect that you may have made a mistake while resolving the conflicts while applying the patch (since a file at the same path existed). The failures also seem related to a past version of the patch.
I just checked with a freshly checked out repo and the patches apply correctly on the current master and regression passes too. http://commitfest.cputube.org/ also reported success overnight.
Btw, is 0001 redundant with the latest patch on ON CONFLICT DO UPDATE
thread? Can I apply just 0002 on top of that patch? So, I tried that --
that is, skipped your 0001 and instead applied ON CONFLICT DO UPDATE
patch, and then applied your 0002.
Yes. I should probably rebase my patch on your v9 or just include the relevant changes in the MERGE patch itself to avoid any dependency right now. Will check.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/03/23 13:57, Pavan Deolasee wrote:
> On Fri, Mar 23, 2018 at 10:00 AM, Amit Langote wrote:
>> I managed to apply it by ignoring the errors, but couldn't get make check
>> to pass; attached regressions.diffs if you want to take a look.
>
> Thanks. Are you sure you're using a clean repo? I suspect you had a previous
> version of the patch applied, hence the apply errors now. I also suspect
> that you may have made a mistake while resolving the conflicts while
> applying the patch (since a file already existed at the same path). The
> failures also seem related to a past version of the patch.
>
> I just checked with a freshly checked out repo and the patches apply
> correctly on the current master and regression passes too.
> http://commitfest.cputube.org/ also reported success overnight.

You're right, I seem to have messed something up. Sorry about the noise.

Also, it seems that the delta patch I sent in the last email didn't contain all the changes I had to make. It didn't contain, for example, replacing adjust_and_expand_inherited_tlist() with adjust_partition_tlist(). I guess you'll know when you rebase anyway.

Sorry that this is me coming a bit late to this thread, but I noticed a few things in the patch that I thought I should comment on.

1. Whitespace errors

$ git diff master --check
src/backend/executor/execPartition.c:737: trailing whitespace.
+       /*
src/backend/executor/nodeMerge.c:90: indent with spaces.
+    PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
src/backend/executor/nodeMerge.c:116: trailing whitespace.
+
src/backend/executor/nodeMerge.c:565: new blank line at EOF.

2. Sorry if this has been discussed before, but is it OK to use AclMode like this:

+   AclMode     mt_merge_subcommands;   /* Flags show which cmd types are
+                                        * present */

3. I think the comment above this should be updated to explain why the map_index is being set to the leaf index in case of MERGE instead of the subplan index as is done in case of plain UPDATE:

- map_index = resultRelInfo - mtstate->resultRelInfo;
- Assert(map_index >= 0 && map_index < mtstate->mt_nplans);
- tupconv_map = tupconv_map_for_subplan(mtstate, map_index);
+ if (mtstate->operation == CMD_MERGE)
+ {
+     map_index = resultRelInfo->ri_PartitionLeafIndex;
+     Assert(mtstate->rootResultRelInfo == NULL);
+     tupconv_map = TupConvMapForLeaf(proute,
+                                     mtstate->resultRelInfo,
+                                     map_index);
+ }
+ else
+ {

4. Do you think it would be possible at a later date to change this junk attribute to contain something other than "tableoid"?

+ if (operation == CMD_MERGE)
+ {
+     j->jf_otherJunkAttNo = ExecFindJunkAttribute(j, "tableoid");
+     if (!AttributeNumberIsValid(j->jf_otherJunkAttNo))
+         elog(ERROR, "could not find junk tableoid column");
+ }

Currently, it seems ExecMergeMatched will take this OID and look up the partition (its ResultRelInfo) by calling ExecFindPartitionByOid, which in turn looks it up in the PartitionTupleRouting struct. I'm imagining it might be possible to instead return an integer that specifies "a partition number". Of course, nothing like that exists currently, but just curious if we're going to be "stuck" with this junk attribute always containing "tableoid". Or maybe putting a "partition number" into the junk attribute is not doable to begin with.

5. In ExecInitModifyTable, in the if (node->mergeActionList) block:

+   /* initialize slot for the existing tuple */
+   mtstate->mt_existing = ExecInitExtraTupleSlot(estate, NULL);

Maybe a good idea to Assert that mt_existing is NULL before. Also, why not set the slot's descriptor right away if tuple routing is not going to be used. I did it that way in the ON CONFLICT DO UPDATE patch.

6. I see that there is a slot called mergeSlot that becomes part of the ResultRelInfo of the table (partition) via ri_MergeState. That means we might end up creating as many slots as there are partitions (* number of actions?). Can't we have just one, say, mt_mergeproj in ModifyTableState, similar to mt_conflproj, and just reset its descriptor before use. I guess the reset will have to happen before carrying out an action applied to a given partition. When I tried that (see attached delta), nothing got broken.

Thanks,
Amit
Attachment
On Fri, Mar 23, 2018 at 12:57 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
Also, it seems that the delta patch I sent in the last email didn't
contain all the changes I had to make. It didn't contain, for example,
replacing adjust_and_expand_inherited_tlist() with
adjust_partition_tlist(). I guess you'll know when you rebase anyway.
Yes, I am planning to fix that once the ON CONFLICT patch is ready/committed.
1. White space errors
$ git diff master --check
Fixed.
2. Sorry if this has been discussed before, but is it OK to use AclMode
like this:
+
+ AclMode mt_merge_subcommands; /* Flags show which cmd types are
+ * present */
Hmm. I think you're right. I've defined the required flags in nodeModifyTable.c and am using those now.
3. I think the comment above this should be updated to explain why the
map_index is being set to the leaf index in case of MERGE instead of the
subplan index as is done in case of plain UPDATE:
- map_index = resultRelInfo - mtstate->resultRelInfo;
- Assert(map_index >= 0 && map_index < mtstate->mt_nplans);
- tupconv_map = tupconv_map_for_subplan(mtstate, map_index);
+ if (mtstate->operation == CMD_MERGE)
+ {
+     map_index = resultRelInfo->ri_PartitionLeafIndex;
+     Assert(mtstate->rootResultRelInfo == NULL);
+     tupconv_map = TupConvMapForLeaf(proute,
+                                     mtstate->resultRelInfo,
+                                     map_index);
+ }
+ else
+ {
Done. I wonder though if we should just always set ri_PartitionLeafIndex even for regular UPDATE and always use that to retrieve the map.
4. Do you think it would be possible at a later date to change this junk
attribute to contain something other than "tableoid"?
+ if (operation == CMD_MERGE)
+ {
+     j->jf_otherJunkAttNo = ExecFindJunkAttribute(j, "tableoid");
+     if (!AttributeNumberIsValid(j->jf_otherJunkAttNo))
+         elog(ERROR, "could not find junk tableoid column");
+ }
Currently, it seems ExecMergeMatched will take this OID and look up the
partition (its ResultRelInfo) by calling ExecFindPartitionByOid which in
turn looks it up in the PartitionTupleRouting struct. I'm imagining it
might be possible to instead return an integer that specifies "a partition
number". Of course, nothing like that exists currently, but just curious
if we're going to be "stuck" with this junk attribute always containing
"tableoid". Or maybe putting a "partition number" into the junk attribute
is not doable to begin with.
I am not sure. Wouldn't adding a new junk column require a whole new machinery? It might be worth adding it someday to reduce the cost associated with the lookups. But I don't want to include the change in this already largish patch.
5. In ExecInitModifyTable, in the if (node->mergeActionList) block:
+
+ /* initialize slot for the existing tuple */
+ mtstate->mt_existing = ExecInitExtraTupleSlot(estate, NULL);
Maybe a good idea to Assert that mt_existing is NULL before. Also, why
not set the slot's descriptor right away if tuple routing is not going to
be used. I did it that way in the ON CONFLICT DO UPDATE patch.
Yeah, I plan to update this code once the other patch gets in. This change was mostly borrowed from your/Alvaro's patch, but I don't think it will be part of the MERGE patch if the ON CONFLICT DO UPDATE patch gets in ahead of this.
6. I see that there is a slot called mergeSlot that becomes part of
ResultRelInfo of the table (partition) via ri_MergeState. That means we
might end up creating as many slots as there are partitions (* number of
actions?). Can't we have just one, say, mt_mergeproj in ModifyTableState
similar to mt_conflproj and just reset its descriptor before use. I guess
the reset will have to happen before carrying out an action applied to a given
partition. When I tried that (see attached delta), nothing got broken.
Thanks! It was on my TODO list, so thanks for taking care of it. I've included your patch in the main patch. I imagine we can similarly set the tuple descriptor for this slot during initialisation if the target table is a non-partitioned table. But I shall take care of that along with mt_existing. In fact, I wonder if we should combine mt_confproj and mt_mergeproj and just have one slot. They are mutually exclusive in their use, but have a lot in common.
As someone who understands partitioning best, do you have any other comments/concerns regarding the partitioning-related code in the patch? I would appreciate it if you could give it a look and also run any tests that you may have handy.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Fri, Mar 23, 2018 at 6:45 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Mar 22, 2018 at 6:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Incremental development is a good thing. Trying to do everything in a
> single commit is great when time is infinite or even merely very long,
> but if you run out of it, which I'm sure is common, leaving some things
> out that can be reasonable implemented in a separate patch is perfectly
> acceptable.
We're talking about something that took me less than an hour to get
working. AFAICT, it's just a matter of tweaking the grammar, and
adding a bit of transformWithClause() boilerplate to the start of
transformMergeStmt().
I quickly implemented CTE support myself (not wCTE support, since
MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work
when I mechanically duplicate the approach taken with other types of
DML statement in the parser. I have written a few tests, and so far it
holds up.

Ok, thanks. I started doing something similar, but it's great if you have already implemented it. I will focus on other things for now.
I am sorry. I was under the impression that you were actually writing this piece of code and hence did not pay much attention till now. I should have confirmed with you instead of assuming. I think it's a bit too late now, but I will give it a fair try tomorrow. I don't want to spend too much time on it though, given how close we are to the deadline. As Alvaro said, we can always revisit this for pg12.
As I've pointed out on this thread already, I'm often concerned about
supporting functionality like this because it increases my overall
confidence in the design. If it was genuinely hard to add WITH clause
support, then that would probably tell us something about the overall
design that likely creates problems elsewhere. It's easy to say that
it isn't worth holding the patch up for WITH clause support, because
that's true, but it's also beside the point.
Understood.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Mar 23, 2018 at 4:43 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Mar 22, 2018 at 11:42 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> A slightly improved version attached.
You still need to remove this change:
> diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> index a4574cd533..dbfb5d2a1a 100644
> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -444,5 +444,6 @@ extern bool has_rolreplication(Oid roleid);
> /* in access/transam/xlog.c */
> extern bool BackupInProgress(void);
> extern void CancelBackup(void);
> +extern int64 GetXactWALBytes(void);
Sigh. Fixed in the recent version.
I see that we're still using two target RTEs in this latest revision,
v23e -- the new approach to partitioning, which I haven't had time to
study in more detail, has not produced a change there.
Yes, we continue to use two RTEs because I don't have any brighter idea than that to handle the case of partitioned table and right outer join. As I explained sometime back, this is necessary to ensure that we don't produce duplicate rows when a partition is joined with the source and then a second partition is joined again with the source.
Now I don't know if we can run a join query and still have a single RTE, but that looks improbable and wrong.
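To make the duplicate-row concern above concrete, here is a hypothetical sketch (all table and column names invented for illustration, not taken from the patch's regression tests):

```sql
CREATE TABLE tgt (k int, v text) PARTITION BY RANGE (k);
CREATE TABLE tgt_p1 PARTITION OF tgt FOR VALUES FROM (0) TO (100);
CREATE TABLE tgt_p2 PARTITION OF tgt FOR VALUES FROM (100) TO (200);
CREATE TABLE src (k int, v text);
INSERT INTO src VALUES (150, 'new');

MERGE INTO tgt t USING src s ON t.k = s.k
    WHEN MATCHED THEN UPDATE SET v = s.v
    WHEN NOT MATCHED THEN INSERT VALUES (s.k, s.v);

-- If the plan were expanded as (tgt_p1 RIGHT JOIN src) plus
-- (tgt_p2 RIGHT JOIN src), the source row with k = 150 would come out
-- unmatched in both per-partition joins, and the WHEN NOT MATCHED
-- action would fire twice for the same source row. Joining the source
-- once against the whole partitioned target avoids that.
```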
This causes
weird effects, such as the following:
"""
pg@~[20658]=# create table foo(bar int4);
CREATE TABLE
pg@~[20658]=# merge into foo f using (select 1 col) dd on f.bar=dd.col
when matched then update set bar = f.barr + 1;
ERROR: column f.barr does not exist
LINE 1: ...n f.bar=dd.col when matched then update set bar = f.barr + 1...
^
HINT: Perhaps you meant to reference the column "f.bar" or the column "f.bar".
"""
While I think this particular HINT buglet is pretty harmless, I
continue to be concerned about the unintended consequences of having
multiple RTEs for MERGE's target table. Each RTE comes from a
different lookup path -- the first one goes through setTargetTable()'s
parserOpenTable() + addRangeTableEntryForRelation() calls. The second
one goes through transformFromClauseItem(), for the join, which
ultimately ends up calling transformTableEntry()/addRangeTableEntry().
How's it different than running an INSERT query with the target table again specified in a subquery producing the rows to be inserted? For example,
postgres=# insert into target as t select sid from source s join target t on t.ttid = s.sid;
ERROR: column t.ttid does not exist
LINE 1: ...rget as t select sid from source join target t on t.ttid = s...
^
HINT: Perhaps you meant to reference the column "t.tid" or the column "t.tid".
postgres=#
This produces a very similar-looking HINT as your test above. I am certain that the "target" table gets two RTEs, exactly via the same code paths as you discussed above. So if this is not a problem for INSERT, why would it be a problem for MERGE? Maybe I am missing a point here.
Consider commit 5f173040, which fixed a privilege escalation security
bug around multiple name lookup. Could the approach taken by MERGE
here introduce a similar security issue?
Using GDB, I see two calls to RangeVarGetRelidExtended() when a simple
MERGE is executed. They both have identical relation arguments, that
look like this:
(gdb) p *relation
$4 = {
type = T_RangeVar,
catalogname = 0x0,
schemaname = 0x0,
relname = 0x5600ebdcafb0 "foo",
inh = 1 '\001',
relpersistence = 112 'p',
alias = 0x5600ebdcb048,
location = 11
}
This seems like something that needs to be explained, at a minimum.
Even if I'm completely wrong about there being a security hazard,
maybe the suggestion that there might be still gives you some idea of
what I mean about unintended consequences.
Ok. I will try to explain it better and also think about the security hazards.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/03/23 20:07, Pavan Deolasee wrote:
> On Fri, Mar 23, 2018 at 12:57 PM, Amit Langote wrote:
>> Also, it seems that the delta patch I sent in the last email didn't
>> contain all the changes I had to make. It didn't contain, for example,
>> replacing adjust_and_expand_inherited_tlist() with
>> adjust_partition_tlist(). I guess you'll know when you rebase anyway.
>
> Yes, I am planning to fix that once the ON CONFLICT patch is
> ready/committed.

OK, thanks.

>> 3. I think the comment above this should be updated to explain why the
>> map_index is being set to the leaf index in case of MERGE instead of the
>> subplan index as is done in case of plain UPDATE:
>>
>> - map_index = resultRelInfo - mtstate->resultRelInfo;
>> - Assert(map_index >= 0 && map_index < mtstate->mt_nplans);
>> - tupconv_map = tupconv_map_for_subplan(mtstate, map_index);
>> + if (mtstate->operation == CMD_MERGE)
>> + {
>> +     map_index = resultRelInfo->ri_PartitionLeafIndex;
>> +     Assert(mtstate->rootResultRelInfo == NULL);
>> +     tupconv_map = TupConvMapForLeaf(proute,
>> +                                     mtstate->resultRelInfo,
>> +                                     map_index);
>> + }
>> + else
>> + {
>
> Done. I wonder though if we should just always set ri_PartitionLeafIndex
> even for regular UPDATE and always use that to retrieve the map.

In the regular UPDATE case, we'd be looking at a resultRelInfo that's from the ModifyTableState's per-subplan result rel array, and the TupleConversionMap array perhaps accepts subscripts that are in range 0 to mtstate->mt_nplans - 1 (not 0 to nparts - 1), so using ri_PartitionLeafIndex there would be incorrect, I think.

>> 4. Do you think it would be possible at a later date to change this junk
>> attribute to contain something other than "tableoid"?
>>
>> + if (operation == CMD_MERGE)
>> + {
>> +     j->jf_otherJunkAttNo = ExecFindJunkAttribute(j, "tableoid");
>> +     if (!AttributeNumberIsValid(j->jf_otherJunkAttNo))
>> +         elog(ERROR, "could not find junk tableoid column");
>> + }
>>
>> Currently, it seems ExecMergeMatched will take this OID and look up the
>> partition (its ResultRelInfo) by calling ExecFindPartitionByOid, which in
>> turn looks it up in the PartitionTupleRouting struct. I'm imagining it
>> might be possible to instead return an integer that specifies "a partition
>> number". Of course, nothing like that exists currently, but just curious
>> if we're going to be "stuck" with this junk attribute always containing
>> "tableoid".
>
> I am not sure. Wouldn't adding a new junk column require a whole new
> machinery? It might be worth adding it someday to reduce the cost
> associated with the lookups. But I don't want to include the change in this
> already largish patch.

No, I wasn't suggesting that we do that in this patch. Also, I wasn't saying that we store the "partition index" in a *new* junk column, but *instead of* tableoid as this patch does. My question was whether we can replace the tableoid that we are going to store with this patch in the MERGE source plan's targetlist with something else in the future.

>> 5. In ExecInitModifyTable, in the if (node->mergeActionList) block:
>>
>> +   /* initialize slot for the existing tuple */
>> +   mtstate->mt_existing = ExecInitExtraTupleSlot(estate, NULL);
>>
>> Maybe a good idea to Assert that mt_existing is NULL before. Also, why
>> not set the slot's descriptor right away if tuple routing is not going to
>> be used. I did it that way in the ON CONFLICT DO UPDATE patch.
>
> Yeah, I plan to update this code once the other patch gets in. This change
> was mostly borrowed from your/Alvaro's patch, but I don't think it will be
> part of the MERGE patch if the ON CONFLICT DO UPDATE patch gets in ahead of
> this.

OK.

>> 6. I see that there is a slot called mergeSlot that becomes part of the
>> ResultRelInfo of the table (partition) via ri_MergeState. That means we
>> might end up creating as many slots as there are partitions (* number of
>> actions?). Can't we have just one, say, mt_mergeproj in ModifyTableState,
>> similar to mt_conflproj, and just reset its descriptor before use. When I
>> tried that (see attached delta), nothing got broken.
>
> Thanks! It was on my TODO list, so thanks for taking care of it. I've
> included your patch in the main patch. I imagine we can similarly set the
> tuple descriptor for this slot during initialisation if the target table is
> a non-partitioned table. But I shall take care of that along with
> mt_existing. In fact, I wonder if we should combine mt_confproj and
> mt_mergeproj and just have one slot. They are mutually exclusive in their
> use, but have a lot in common.

OK.

> As someone who understands partitioning best, do you have any other
> comments/concerns regarding the partitioning-related code in the patch? I
> would appreciate it if you could give it a look and also run any tests that
> you may have handy.

Actually, I don't yet understand the full scope of what MERGE is supposed to do. I guess if it gives the same answers for a partitioned table as it does for regular tables for different MERGE commands, that's enough to say that MERGE supports partitioning. But maybe there are corner cases where MERGE doesn't work the same with partitioning because of some underlying implementation details (mostly executor, afaics). I will try to test more early next week.

Thanks,
Amit
On 23 March 2018 at 11:26, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Fri, Mar 23, 2018 at 6:45 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Thu, Mar 22, 2018 at 6:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
>> wrote:
>>> Incremental development is a good thing. Trying to do everything in a
>>> single commit is great when time is infinite or even merely very long,
>>> but if you run out of it, which I'm sure is common, leaving some things
>>> out that can be reasonable implemented in a separate patch is perfectly
>>> acceptable.
>>
>> We're talking about something that took me less than an hour to get
>> working. AFAICT, it's just a matter of tweaking the grammar, and
>> adding a bit of transformWithClause() boilerplate to the start of
>> transformMergeStmt().
>>
>> I quickly implemented CTE support myself (not wCTE support, since
>> MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work
>> when I mechanically duplicate the approach taken with other types of
>> DML statement in the parser. I have written a few tests, and so far it
>> holds up.

Peter, if you have the code and you consider it important that this subfeature is in PostgreSQL, why not post the code so we can commit it? Why would we repeat what has already been done?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 23, 2018 at 6:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Peter, if you have the code and you consider it important that this
> subfeature is in PostgreSQL, why not post the code so we can commit
> it?

Fair enough. Attached patch shows what I'm on about. This should be applied on top of 0001_merge_v23e_onconflict_work.patch + 0002_merge_v23e_main.patch. I'm not expecting an authorship credit for posting this patch.

One thing that the test output shows that is interesting is that there is never a "SubPlan 1" or "InitPlan 1" in EXPLAIN output -- it seems to always start at "SubPlan 2". This probably has nothing to do with CTEs in particular. I didn't notice this before now, although there were no existing tests of EXPLAIN in the patch that show subplans or initplans.

Is this somehow related to the issue of using two RTEs for the target relation? That's certainly why we always see unaliased target table "m" with the alias "m_1" in EXPLAIN output, so I would not be surprised if it caused another EXPLAIN issue.

--
Peter Geoghegan
Attachment
On Fri, Mar 23, 2018 at 4:56 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> postgres=# insert into target as t select sid from source s join target t on
> t.ttid = s.sid;
> ERROR: column t.ttid does not exist
> LINE 1: ...rget as t select sid from source join target t on t.ttid = s...
> ^
> HINT: Perhaps you meant to reference the column "t.tid" or the column
> "t.tid".
> postgres=#
>
> This produces a very similar-looking HINT as your test above. I am certain
> that the "target" table gets two RTEs, exactly via the same code paths as
> you discussed above. So if this is not a problem for INSERT, why would it
> be a problem for MERGE? Maybe I am missing a point here.

I agree that this is very similar, as far as the RTEs go. What is dissimilar is the fact that there is hard-coded knowledge of both through parsing, planning, and execution. It's everything, taken together.

ResultRelInfo has a ri_mergeTargetRTI field, which seems to be used instead of ri_RangeTableIndex in some contexts but not others. What might the interactions with something like GetInsertedColumns() and GetUpdatedColumns() be? Is that explained anywhere? In general, I think that there is potential for things to break in subtle ways.

>> This seems like something that needs to be explained, at a minimum.
>> Even if I'm completely wrong about there being a security hazard,
>> maybe the suggestion that there might be still gives you some idea of
>> what I mean about unintended consequences.
>
> Ok. I will try to explain it better and also think about the security
> hazards.

I realize that I'm giving you a somewhat vague problem, without offering any real help on a solution. For what it's worth, I don't feel great about that, but I don't know enough about partitioning in general and your approach to partitioning for MERGE in particular to be more constructive. That said, checking that an issue like the one fixed by 5f173040 cannot recur here is one concrete thing you could do. Documenting/explaining the ri_RangeTableIndex/ri_mergeTargetRTI divide is another. The comment above ri_mergeTargetRTI is totally inadequate.

--
Peter Geoghegan
On Fri, Mar 23, 2018 at 11:15 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I agree that this is very similar, as far as the RTEs go. What is
> dissimilar is the fact that there is hard-coded knowledge of both
> through parsing, planning, and execution. It's everything, taken
> together.
>
> ResultRelInfo has a ri_mergeTargetRTI field, which seems to be used
> instead of ri_RangeTableIndex in some contexts but not others. What
> might the interactions with something like GetInsertedColumns() and
> GetUpdatedColumns() be? Is that explained anywhere? In general, I
> think that there is potential for things to break in subtle ways.

I just realized that there were no tests added to privileges.sql. You only have a small number of GRANT tests in merge.sql, for relation-level privileges, not column-level privileges. IOW, this area is totally untested.

--
Peter Geoghegan
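For what it's worth, a minimal sketch of the kind of column-level privilege coverage being asked about might look like this (role and table names are invented; the expected outcomes assume MERGE enforces per-action column privileges the same way UPDATE does):

```sql
CREATE ROLE regress_merge_user;
CREATE TABLE mtarget (tid int, balance int);
CREATE TABLE msource (sid int, delta int);
GRANT SELECT (tid) ON mtarget TO regress_merge_user;
GRANT UPDATE (balance) ON mtarget TO regress_merge_user;
GRANT SELECT ON msource TO regress_merge_user;

SET ROLE regress_merge_user;
-- expected to succeed: the join reads only tid, the action writes only balance
MERGE INTO mtarget t USING msource s ON t.tid = s.sid
    WHEN MATCHED THEN UPDATE SET balance = s.delta;
-- expected to fail: no UPDATE privilege on column tid
MERGE INTO mtarget t USING msource s ON t.tid = s.sid
    WHEN MATCHED THEN UPDATE SET tid = 0;
RESET ROLE;
```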
On Sat, Mar 24, 2018 at 1:36 AM, Peter Geoghegan <pg@bowt.ie> wrote:
Fair enough. Attached patch shows what I'm on about. This should be
applied on top of 0001_merge_v23e_onconflict_work.patch +
0002_merge_v23e_main.patch. I'm not expecting an authorship credit for
posting this patch.
Thanks for the patch. I will study and integrate this into the main patch.
One thing that the test output shows that is interesting is that there
is never a "SubPlan 1" or "InitPlan 1" in EXPLAIN output -- it seems
to always start at "SubPlan 2". This probably has nothing to do with
CTEs in particular. I didn't notice this before now, although there
were no existing tests of EXPLAIN in the patch that show subplans or
initplans.
This query, for example, correctly starts at InitPlan 1:
postgres=# EXPLAIN MERGE INTO m USING (SELECT 1 a, 'val' b) s ON m.k = s.a WHEN NOT MATCHED THEN INSERT VALUES ((select count(*) from pg_class), s.b);
QUERY PLAN
-------------------------------------------------------------------------
Merge on m (cost=16.30..43.83 rows=6 width=106)
InitPlan 1 (returns $0)
-> Aggregate (cost=16.26..16.27 rows=1 width=8)
-> Seq Scan on pg_class (cost=0.00..15.41 rows=341 width=0)
-> Hash Right Join (cost=0.03..27.55 rows=6 width=106)
Hash Cond: (m_1.k = s.a)
-> Seq Scan on m m_1 (cost=0.00..22.70 rows=1270 width=14)
-> Hash (cost=0.02..0.02 rows=1 width=96)
-> Subquery Scan on s (cost=0.00..0.02 rows=1 width=96)
-> Result (cost=0.00..0.01 rows=1 width=36)
(10 rows)
Is this somehow related to the issue of using two RTEs for the target
relation? That's certainly why we always see unaliased target table
"m" with the alias "m_1" in EXPLAIN output, so I would not be
surprised if it caused another EXPLAIN issue.
I don't think it's related to using two RTEs. The following EXPLAIN for a regular UPDATE query also shows a SubPlan starting at 2. I think it's just down to how the planner assigns plan_ids.
postgres=# EXPLAIN WITH cte_basic AS (SELECT 1 a, 'cte_basic val' b) UPDATE m SET v = (SELECT b || ' merge update' FROM cte_basic WHERE cte_basic.a = m.k LIMIT 1) ;
QUERY PLAN
------------------------------------------------------------------------------
Update on m (cost=0.01..54.46 rows=1270 width=42)
CTE cte_basic
-> Result (cost=0.00..0.01 rows=1 width=36)
-> Seq Scan on m (cost=0.00..54.45 rows=1270 width=42)
SubPlan 2
-> Limit (cost=0.00..0.02 rows=1 width=32)
-> CTE Scan on cte_basic (cost=0.00..0.02 rows=1 width=32)
Filter: (a = m.k)
(8 rows)
A quick gdb tracing shows that the CTE itself is assigned plan_id 1 and the SubPlan then gets plan_id 2. I can investigate further, but given that we see similar behaviour with regular UPDATE, I don't think it's worth it.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 22, 2018 at 7:13 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> While I think this particular HINT buglet is pretty harmless, I
> continue to be concerned about the unintended consequences of having
> multiple RTEs for MERGE's target table. Each RTE comes from a
> different lookup path -- the first one goes through setTargetTable()'s
> parserOpenTable() + addRangeTableEntryForRelation() calls. The second
> one goes through transformFromClauseItem(), for the join, which
> ultimately ends up calling transformTableEntry()/addRangeTableEntry().
>
> Consider commit 5f173040, which fixed a privilege escalation security
> bug around multiple name lookup. Could the approach taken by MERGE
> here introduce a similar security issue?

Yeah, that seems really bad. I don't think there is a huge problem with having multiple RTEs; for example, we very commonly end up with both rte->inh and !rte->inh RTEs for the same table, and that is OK. However, generating those RTEs by doing multiple name lookups for the same table is a big problem. Imagine, for example, that a user has a search_path of a, b and that there is a table b.foo. The user does a merge on foo. Between the first name lookup and the second, somebody creates a.foo. Now, potentially, half of the MERGE statement is going to be running against b.foo and the other half against a.foo. I don't know whether that will crash or bomb out with a strange error or just make some unexpected modification to one of those tables, but the behavior, even if not insecure, will certainly be wrong.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
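Spelling out the race described above as a timeline (the schemas a and b, the table foo, and the session interleaving are all hypothetical):

```sql
-- Session 1 (search_path = a, b; only b.foo exists so far) starts parsing:
MERGE INTO foo f USING (SELECT 1 AS col) dd ON f.bar = dd.col
    WHEN MATCHED THEN UPDATE SET bar = f.bar + 1
    WHEN NOT MATCHED THEN INSERT VALUES (dd.col);
-- setTargetTable()'s lookup resolves foo -> b.foo for the target RTE.

-- Session 2, in between session 1's two name lookups:
CREATE TABLE a.foo (bar int4);

-- Session 1's second lookup, done while transforming the join, now
-- resolves foo -> a.foo, so the two RTEs that are supposed to name the
-- same target table point at different relations.
```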
On Sat, Mar 24, 2018 at 8:16 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Mar 22, 2018 at 7:13 PM, Peter Geoghegan <pg@bowt.ie> wrote: >> While I think this this particular HINT buglet is pretty harmless, I >> continue to be concerned about the unintended consequences of having >> multiple RTEs for MERGE's target table. Each RTE comes from a >> different lookup path -- the first one goes through setTargetTable()'s >> parserOpenTable() + addRangeTableEntryForRelation() calls. The second >> one goes through transformFromClauseItem(), for the join, which >> ultimately ends up calling transformTableEntry()/addRangeTableEntry(). >> Consider commit 5f173040, which fixed a privilege escalation security >> bug around multiple name lookup. Could the approach taken by MERGE >> here introduce a similar security issue? > > Yeah, that seems really bad. I don't think there is a huge problem > with having multiple RTEs; for example, we very commonly end up with > both rte->inh and !rte->inh RTEs for the same table, and that is OK. > However, generating those RTEs by doing multiple name lookups for the > same table is a big problem. Imagine, for example, that a user has a > search_path of a, b and that there is a table b.foo. The user does a > merge on foo. Between the first name lookup and the second, somebody > creates a.foo. Now, potentially, half of the MERGE statement is going > to be running against b.foo and the other half against a.foo. I don't > know whether that will crash or bomb out with a strange error or just > make some unexpected modification to one of those tables, but the > behavior, even if not insecure, will certainly be wrong. If it's possible to identify the two OIDs that are supposed to match and cross-check that the OIDs are the same, then we could just bomb out with an error if they aren't. 
That's not lovely, and is basically a hack, but it's possible that no better fix is possible in the time we have, and it wouldn't be any worse than this crock from copy.c:

    if (!list_member_oid(plan->relationOids, queryRelId))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("relation referenced by COPY statement has changed")));

Uggh, that code makes me hold my nose every time I look at it. But it's a cheap fix. (Hmm... I wonder if that's really an adequate fix for the problem in COPY, if we can't verify that the OID in question plays the right role in the query, rather than just that it's there somewhere. Anyway, if we can reliably identify the two OIDs to be compared, that's certainly good enough.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 24 March 2018 at 12:16, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 22, 2018 at 7:13 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> While I think this particular HINT buglet is pretty harmless, I continue to be concerned about the unintended consequences of having multiple RTEs for MERGE's target table. Each RTE comes from a different lookup path -- the first one goes through setTargetTable()'s parserOpenTable() + addRangeTableEntryForRelation() calls. The second one goes through transformFromClauseItem(), for the join, which ultimately ends up calling transformTableEntry()/addRangeTableEntry(). Consider commit 5f173040, which fixed a privilege escalation security bug around multiple name lookup. Could the approach taken by MERGE here introduce a similar security issue?
>
> Yeah, that seems really bad. I don't think there is a huge problem with having multiple RTEs; for example, we very commonly end up with both rte->inh and !rte->inh RTEs for the same table, and that is OK. However, generating those RTEs by doing multiple name lookups for the same table is a big problem. Imagine, for example, that a user has a search_path of a, b and that there is a table b.foo. The user does a merge on foo. Between the first name lookup and the second, somebody creates a.foo. Now, potentially, half of the MERGE statement is going to be running against b.foo and the other half against a.foo. I don't know whether that will crash or bomb out with a strange error or just make some unexpected modification to one of those tables, but the behavior, even if not insecure, will certainly be wrong.

MERGE uses multiple RTEs in exactly the same way UPDATE does. I don't see a reason for specific concern wrt MERGE.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 23, 2018 at 11:52 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> A quick gdb tracing shows that the CTE itself is assigned plan_id 1 and the SubPlan then gets plan_id 2. I can investigate further, but given that we see a similar behaviour with regular UPDATE, I don't think it's worth it.

Clearly I jumped the gun on this one. I agree that this is fine.

--
Peter Geoghegan
On Sat, Mar 24, 2018 at 5:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> If it's possible to identify the two OIDs that are supposed to match and cross-check that the OIDs are the same, then we could just bomb out with an error if they aren't. That's not lovely, and is basically a hack, but it's possible that no better fix is possible in the time we have, and it wouldn't be any worse than this crock from copy.c:
>
>     if (!list_member_oid(plan->relationOids, queryRelId))
>         ereport(ERROR,
>                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                  errmsg("relation referenced by COPY statement has changed")));

That's definitely all we have time for. The only alternative is to rip out support for partitioning, as partitioning is the only thing that necessitates the use of multiple RTEs. I don't think it would make sense to use a second RTE only when needed.

--
Peter Geoghegan
On 24 March 2018 at 12:27, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Mar 24, 2018 at 8:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Mar 22, 2018 at 7:13 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>>> While I think this particular HINT buglet is pretty harmless, I continue to be concerned about the unintended consequences of having multiple RTEs for MERGE's target table. Each RTE comes from a different lookup path -- the first one goes through setTargetTable()'s parserOpenTable() + addRangeTableEntryForRelation() calls. The second one goes through transformFromClauseItem(), for the join, which ultimately ends up calling transformTableEntry()/addRangeTableEntry(). Consider commit 5f173040, which fixed a privilege escalation security bug around multiple name lookup. Could the approach taken by MERGE here introduce a similar security issue?
>>
>> Yeah, that seems really bad. I don't think there is a huge problem with having multiple RTEs; for example, we very commonly end up with both rte->inh and !rte->inh RTEs for the same table, and that is OK. However, generating those RTEs by doing multiple name lookups for the same table is a big problem. Imagine, for example, that a user has a search_path of a, b and that there is a table b.foo. The user does a merge on foo. Between the first name lookup and the second, somebody creates a.foo. Now, potentially, half of the MERGE statement is going to be running against b.foo and the other half against a.foo. I don't know whether that will crash or bomb out with a strange error or just make some unexpected modification to one of those tables, but the behavior, even if not insecure, will certainly be wrong.
>
> If it's possible to identify the two OIDs that are supposed to match and cross-check that the OIDs are the same, then we could just bomb out with an error if they aren't.
Since we now have MVCC catalog scans, all the name lookups are performed using the same snapshot, so in the above scenario the newly created object would be invisible to the second name lookup. So I don't see any way for the ERROR to occur, and hence no need for a cross-check, for UPDATE or MERGE.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 23, 2018 at 4:37 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
On Fri, Mar 23, 2018 at 12:57 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
Also, it seems that the delta patch I sent in the last email didn't
contain all the changes I had to make. It didn't contain, for example,
replacing adjust_and_expand_inherited_tlist() with
adjust_partition_tlist(). I guess you'll know when you rebase anyway.

Yes, I am planning to fix that once the ON CONFLICT patch is ready/committed.
Now that ON CONFLICT patch is in, here are rebased patches. The second patch is to add support for CTE (thanks Peter).
Apart from rebase, the following things are fixed/improved:
- Added test cases for column level privileges as suggested by Peter. One problem got discovered during the process. Since we expand and track the source relation targetlist, the existing code was demanding SELECT privileges on all attributes, even though MERGE is only referencing a few attributes on which the user has privilege. Fixed that by disassociating expansion from the actual referencing.
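For illustration, the column-level privilege case being tested is of roughly this shape (a sketch with hypothetical names, not the actual regression test):

```sql
CREATE TABLE target (id int PRIMARY KEY, balance int);
CREATE TABLE source (id int, delta int, secret text);

GRANT SELECT, UPDATE ON target TO merge_user;
-- Only the columns MERGE actually references are granted:
GRANT SELECT (id, delta) ON source TO merge_user;

-- As merge_user: this references only source.id and source.delta, so it
-- should succeed without SELECT privilege on source.secret.
MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET balance = t.balance + s.delta;
```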
- Added a test case for RLS where SELECT policy actually hides some rows, as suggested by Stephen in the past
- Added check to compare result relation's and merge target relation's OIDs, as suggested by Robert. Simon thinks it's not necessary given that we now scan catalog using MVCC snapshot. So will leave it to his discretion when he takes it up for commit
- Improved explanation regarding why we need a second RTE for merge target relation and general cleanup/improvements in that area
I think it will be a good idea to send any further patches as add-on patches for reviewer/committer's sake. I will do that unless someone disagrees.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 26 March 2018 at 15:39, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Now that ON CONFLICT patch is in, here are rebased patches. The second patch is to add support for CTE (thanks Peter).
>
> Apart from rebase, the following things are fixed/improved:
>
> - Added test cases for column level privileges as suggested by Peter. One problem got discovered during the process. Since we expand and track source relation targetlist, the existing code was demanding SELECT privileges on all attributes, even though MERGE is only referencing a few attributes on which the user has privilege. Fixed that by disassociating expansion from the actual referencing.

Good catch Peter.

> - Added a test case for RLS where SELECT policy actually hides some rows, as suggested by Stephen in the past
>
> - Added check to compare result relation's and merge target relation's OIDs, as suggested by Robert. Simon thinks it's not necessary given that we now scan catalog using MVCC snapshot. So will leave it to his discretion when he takes it up for commit

No problem with adding the test, it's quick to compare two Oids.

> - Improved explanation regarding why we need a second RTE for merge target relation and general cleanup/improvements in that area
>
> I think it will be a good idea to send any further patches as add-on patches for reviewer/committer's sake. I will do that unless someone disagrees.

+1

So v25 is the "commit candidate" and we can add other patches to it. Given recent bugfixes/changes I don't plan to commit this tomorrow anymore.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 26, 2018 at 5:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Since we now have MVCC catalog scans, all the name lookups are > performed using the same snapshot so in the above scenario the newly > created object would be invisible to the second name lookup. That's not true, because each lookup would be performed using a new snapshot -- not all under one snapshot. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 26 March 2018 at 15:39, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> reviewer

1. In ExecMergeMatched() we have a line of code that does this...

if (TransactionIdIsCurrentTransactionId(hufd.xmax))
then errcode(ERRCODE_CARDINALITY_VIOLATION)

I notice this is correct, but too strong. It should be possible to run a sequence of MERGE commands inside a transaction, repeatedly updating the same set of rows, as is possible with UPDATE. We need to check whether the xid is the current subxid and the cid is the current commandid, rather than using TransactionIdIsCurrentTransactionId().

On further analysis, I note that ON CONFLICT suffers from this problem as well; looks like I just refactored it from there.

2. EXPLAIN ANALYZE looks unchanged from some time back. The math is only correct as long as there are zero rows that do not cause an INS/UPD/DEL. We don't test for that. I think this is a regression from an earlier report of the same bug, or perhaps I didn't fix it at the time.

3. sp. depedning trigger.sgml

4. trigger.sgml: replace "specific actions specified" with "events specified in actions" to avoid the double use of "specific"

5. I take it the special code for TransformMerge target relations is replaced by "right_rte"? Seems fragile to leave it like that. Can we add an Assert()? Do we care?

6. I didn't understand "Assume that the top-level join RTE is at the end. The source relation + * is just before that." What if there is no source relation?

7. The formatting of the SQL statement in transformMergeStmt that begins "Construct a query of the form" is borked, so the SQL layout is unclear; it just needs pretty printing.

8. I didn't document why I thought this was true "XXX if we have a constant subquery, we can also skip join", but one of the explain analyze outputs shows this is already true - where we provide a constant query and it skips the join. So perhaps we can remove the comment. (Search for "Seq Scan on target t_1")

9. I think we need to mention that MERGE won't work with rules or inheritance (only partitioning) in the main doc page. The current text says that rules are ignored, which would be true if we didn't specifically throw ERROR feature not supported.

10. Comment needs updating for changes in code below it - "In MERGE when and condition, no system column is allowed"

11. In comment "Since the plan re-evaluated by EvalPlanQual uses the second RTE", suggest using "join RTE" to make it more explicit which RTE we are discussing

12. Missed out merge.sgml from v25 patch.

13. For triggers we say "No separate triggers are defined for <command>MERGE</command>"; we should also state the same caveat for POLICY events

That's all I can see so far.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
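Item 1 above is about sequences like the following, which should behave the same as the equivalent UPDATEs would (a sketch; table and column names are invented):

```sql
BEGIN;

MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val;

-- A second MERGE in the same transaction touches the same rows. This
-- should not raise ERRCODE_CARDINALITY_VIOLATION: within each MERGE,
-- every target row is still matched at most once; the rows were merely
-- updated earlier in the same transaction.
MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val;

COMMIT;
```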
On 26 March 2018 at 16:09, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Mar 26, 2018 at 5:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Since we now have MVCC catalog scans, all the name lookups are >> performed using the same snapshot so in the above scenario the newly >> created object would be invisible to the second name lookup. > > That's not true, because each lookup would be performed using a new > snapshot -- not all under one snapshot. You're saying we take a separate snapshot for each table we lookup? Sounds weird to me. So this error could happen in SELECT, UPDATE, DELETE or INSERT as well. Or you see this as something related specifically to MERGE, if so how? Please explain what you see. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 26, 2018 at 12:16 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 26 March 2018 at 16:09, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Mar 26, 2018 at 5:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Since we now have MVCC catalog scans, all the name lookups are >>> performed using the same snapshot so in the above scenario the newly >>> created object would be invisible to the second name lookup. >> >> That's not true, because each lookup would be performed using a new >> snapshot -- not all under one snapshot. > > You're saying we take a separate snapshot for each table we lookup? > Sounds weird to me. I'm saying we take a separate snapshot for each and every catalog lookup, except when we know that no catalog changes can have occurred. See the commit message for 568d4138c646cd7cd8a837ac244ef2caf27c6bb8. If you do a lookup in pg_class and 3 lookups in pg_attribute each of the 4 can be done under a different snapshot, in the worst case. You're not the first person to believe that the MVCC catalog scan patch fixes that problem, but as the guy who wrote it, it definitely doesn't. What that patch fixed was, prior to that patch, a catalog scan might find the WRONG NUMBER OF ROWS, like you might do a lookup against a unique index for an object that existed and, if the row was concurrently updated, you might find 0 rows or 2 rows instead of 1 row. IOW, it guaranteed that we used a consistent snapshot for each individual lookup, not a consistent snapshot for the whole course of a command. > So this error could happen in SELECT, UPDATE, DELETE or INSERT as well. > > Or you see this as something related specifically to MERGE, if so how? > Please explain what you see. As I said before, the problem occurs if the same command looks up the same table name in more than one place. There is absolutely nothing to guarantee that we get the same answer every time. 
As far as I know, the proposed MERGE patch has that issue and existing DML commands don't; but someone else may have better information.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 26 March 2018 at 17:52, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Mar 26, 2018 at 12:16 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 26 March 2018 at 16:09, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Mon, Mar 26, 2018 at 5:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> Since we now have MVCC catalog scans, all the name lookups are >>>> performed using the same snapshot so in the above scenario the newly >>>> created object would be invisible to the second name lookup. >>> >>> That's not true, because each lookup would be performed using a new >>> snapshot -- not all under one snapshot. >> >> You're saying we take a separate snapshot for each table we lookup? >> Sounds weird to me. > > I'm saying we take a separate snapshot for each and every catalog > lookup, except when we know that no catalog changes can have occurred. > See the commit message for 568d4138c646cd7cd8a837ac244ef2caf27c6bb8. > If you do a lookup in pg_class and 3 lookups in pg_attribute each of > the 4 can be done under a different snapshot, in the worst case. > You're not the first person to believe that the MVCC catalog scan > patch fixes that problem, but as the guy who wrote it, it definitely > doesn't. What that patch fixed was, prior to that patch, a catalog > scan might find the WRONG NUMBER OF ROWS, like you might do a lookup > against a unique index for an object that existed and, if the row was > concurrently updated, you might find 0 rows or 2 rows instead of 1 > row. IOW, it guaranteed that we used a consistent snapshot for each > individual lookup, not a consistent snapshot for the whole course of a > command. That all makes sense, thanks for explaining. I spent a few more minutes, going "but", "but" though I can now see good reasons for everything to work this way. >> So this error could happen in SELECT, UPDATE, DELETE or INSERT as well. >> >> Or you see this as something related specifically to MERGE, if so how? >> Please explain what you see. 
> As I said before, the problem occurs if the same command looks up the same table name in more than one place. There is absolutely nothing to guarantee that we get the same answer every time. As far as I know, the proposed MERGE patch has that issue and existing DML commands don't; but someone else may have better information.

I will look deeper and report back.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 26, 2018 at 12:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> As far as I >> know, the proposed MERGE patch has that issue an existing DML commands >> don't; but someone else may have better information. > > I will look deeper and report back. It's quite clear that the problem exists with the MERGE patch; the simple fact that RangeVarGetRelidExtended() is called twice with the same RangeVar argument shows this. However, the Oid cross-check seems like a sufficient defense against an inconsistency that causes real trouble, since the cross-check will only error-out when a concurrent table creation (or maybe ALTER TABLE) makes a second table visible, in a schema that appears earlier in the user's search_path. It's hard to imagine any legitimate user truly preferring some alternative behavior in this particular scenario, which makes it okay. This cross-check workaround is ugly, but apparently there is a precedent in copy.c. I didn't know that detail until Robert pointed it out. That makes me feel a lot better about this general question of how the target relation is represented, having two RTEs, etc. -- Peter Geoghegan
On 26 March 2018 at 23:10, Peter Geoghegan <pg@bowt.ie> wrote: > On Mon, Mar 26, 2018 at 12:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> As far as I >>> know, the proposed MERGE patch has that issue an existing DML commands >>> don't; but someone else may have better information. >> >> I will look deeper and report back. > > It's quite clear that the problem exists with the MERGE patch Accepted, the only question is whether it affects UPDATE as well cos it looks like it should. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 26 March 2018 at 17:06, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 26 March 2018 at 15:39, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
> That's all I can see so far.

* change comment “once to” to “once” in src/include/nodes/execnodes.h

* change comment “and to run” to “and once to run”

* change “result relation” to “target relation”

* XXX we probably need to check plan output for CMD_MERGE also

* Spurious line feed in src/backend/optimizer/prep/preptlist.c

* No need to remove whitespace in src/backend/optimizer/util/relnode.c

* README should note that the TABLEOID junk column is not strictly needed when joining to a non-partitioned table but we don't try to optimize that away. Is that an XXX to fix in future or do we just think the extra 4 bytes won't make much difference so we leave it?

* Comment in rewriteTargetListMerge() should mention TABLEOID exists to allow us to find the correct relation, not the correct row; comment just copied from CTID above it.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 26, 2018 at 9:36 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 26 March 2018 at 15:39, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> reviewer
1. In ExecMergeMatched() we have a line of code that does this...
if (TransactionIdIsCurrentTransactionId(hufd.xmax))
then errcode(ERRCODE_CARDINALITY_VIOLATION)
I notice this is correct, but too strong. It should be possible to run
a sequence of MERGE commands inside a transaction, repeatedly updating
the same set of rows, as is possible with UPDATE.
We need to check whether the xid is the current subxid and the cid is
the current commandid, rather than using
TransactionIdIsCurrentTransactionId()
AFAICS this is fine because we invoke that code only when HeapTupleSatisfiesUpdate returns HeapTupleSelfUpdated, i.e. for the case when the tuple is updated by our transaction after the scan is started. HeapTupleSatisfiesUpdate already checks for command id before returning HeapTupleSelfUpdated.
2. EXPLAIN ANALYZE looks unchanged from some time back. The math is
only correct as long as there are zero rows that do not cause an
INS/UPD/DEL.
We don't test for that. I think this is a regression from an earlier
report of the same bug, or perhaps I didn't fix it at the time.
I've now added a separate counter to count all three actions and we also report "Tuples skipped" which could be either because there was no action to handle that source tuple or quals did not match. Added regression tests specific to EXPLAIN ANALYZE.
3. sp. depedning trigger.sgml
Fixed.
4. trigger.sgml replace "specific actions specified" with "events
specified in actions"
to avoid the double use of "specific"
Fixed.
5. I take it the special code for TransformMerge target relations is
replaced by "right_rte"? Seems fragile to leave it like that. Can we
add an Assert()? Do we care?
I didn't get this point. Can you please explain?
6. I didn't understand "Assume that the top-level join RTE is at the
end. The source relation
+ * is just before that."
What if there is no source relation?
Can that happen? I mean shouldn't there always be a source relation? It could be a subquery or a function scan or just a plain relation, but something, right?
7. The formatting of the SQL statement in transformMergeStmt that
begins "Construct a query of the form" is borked, so the SQL layout is
unclear, just needs pretty print
Fixed.
8. I didn't document why I thought this was true "XXX if we have a
constant subquery, we can also skip join", but one of the explain
analyze outputs shows this is already true - where we provide a
constant query and it skips the join. So perhaps we can remove the
comment. (Search for "Seq Scan on target t_1")
Agree, removed.
9. I think we need to mention that MERGE won't work with rules or
inheritance (only partitioning) in the main doc page. The current
text says that rules are ignored, which would be true if we didn't
specifically throw ERROR feature not supported.
Added a short para to merge.sgml
10. Comment needs updating for changes in code below it - "In MERGE
when and condition, no system column is allowed"
Yeah, that's kinda half-true since the code below supports TABLEOID and OID system columns. I am thinking about this in a larger context though. Peter has expressed desire to support system columns in WHEN targetlist and quals. I gave it a try and it seems if we remove that error block, all system columns are supported readily. But only from the target side. There is a problem if we try to refer to a system column from the source side, since the mergeSourceTargetList only includes user columns and so set_plan_refs() complains about a system column.
I am not sure what the best way to handle this is. Maybe we can add system columns to the mergeSourceTargetList. I haven't yet found a neat way to do that.
11. In comment "Since the plan re-evaluated by EvalPlanQual uses the
second RTE", suggest using "join RTE" to make it more explicit which
RTE we are discussing
Ok, fixed.
12. Missed out merge.sgml from v25 patch.
Ouch, added. Also generating a new patch which includes merge.sgml and sending other improvements as add-ons.
13. For triggers we say "No separate triggers are defined for
<command>MERGE</command>"
we should also state the same caveat for POLICY events
Ok. Added a short para in create_policy.sgml
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Tue, Mar 27, 2018 at 1:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 26 March 2018 at 17:06, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 26 March 2018 at 15:39, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
> That's all I can see so far.
* change comment “once to” to “once” in src/include/nodes/execnodes.h
* change comment “and to run” to “and once to run”
* change “result relation” to “target relation”
Fixed all of that in the patch v26 set I just sent.
* XXX we probably need to check plan output for CMD_MERGE also
Yeah. Added those checks for MERGE action's target lists in v26.
* Spurious line feed in src/backend/optimizer/prep/preptlist.c
Couldn't spot it. Will look closer, but any hint will be appreciated.
* No need to remove whitespace in src/backend/optimizer/util/relnode.c
Fixed in v26.
* README should note that the TABLEOID junk column is not strictly
needed when joining to a non-partitioned table but we don't try to
optimize that away. Is that an XXX to fix in future or do we just
think the extra 4 bytes won't make much difference so we leave it?
I actually took the opportunity to conditionally fetch tableoid only if we are dealing with a partitioned table.
* Comment in rewriteTargetListMerge() should mention TABLEOID exists
to allow us to find the correct relation, not the correct row, comment
just copied from CTID above it.
Fixed in v26.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 27 March 2018 at 10:28, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> 1. In ExecMergeMatched() we have a line of code that does this...
>>
>> if (TransactionIdIsCurrentTransactionId(hufd.xmax))
>> then errcode(ERRCODE_CARDINALITY_VIOLATION)
>>
>> I notice this is correct, but too strong. It should be possible to run a sequence of MERGE commands inside a transaction, repeatedly updating the same set of rows, as is possible with UPDATE. We need to check whether the xid is the current subxid and the cid is the current commandid, rather than using TransactionIdIsCurrentTransactionId()
>
> AFAICS this is fine because we invoke that code only when HeapTupleSatisfiesUpdate returns HeapTupleSelfUpdated, i.e. for the case when the tuple is updated by our transaction after the scan is started. HeapTupleSatisfiesUpdate already checks for command id before returning HeapTupleSelfUpdated.

Cool.

>> 5. I take it the special code for TransformMerge target relations is replaced by "right_rte"? Seems fragile to leave it like that. Can we add an Assert()? Do we care?
>
> I didn't get this point. Can you please explain?

The code confused me at that point. More docs pls.

>> 6. I didn't understand "Assume that the top-level join RTE is at the end. The source relation + * is just before that." What if there is no source relation?
>
> Can that happen? I mean shouldn't there always be a source relation? It could be a subquery or a function scan or just a plain relation, but something, right?

Yes, OK. So ordering is target, source, join.

>> 10. Comment needs updating for changes in code below it - "In MERGE when and condition, no system column is allowed"
>
> Yeah, that's kinda half-true since the code below supports TABLEOID and OID system columns. I am thinking about this in a larger context though. Peter has expressed desire to support system columns in WHEN targetlist and quals.
> I gave it a try and it seems if we remove that error block, all system columns are supported readily. But only from the target side. There is a problem if we try to refer to a system column from the source side, since the mergeSourceTargetList only includes user columns and so set_plan_refs() complains about a system column.
>
> I am not sure what the best way to handle this is. Maybe we can add system columns to the mergeSourceTargetList. I haven't yet found a neat way to do that.

I was saying the comment needs changing, not the code.

Cool, thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 27 March 2018 at 10:31, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Fixed in v26.

More comments on v26

* Change errmsg “Ensure that not more than one source rows match any one target row” to “Ensure that not more than one source row matches any one target row”

* I think we need an example in the docs showing a constant source query, so people can understand how to use MERGE for OLTP as well as large ELT

* Long comment in ExecMerge() needs rewording, formatting and spell check. I suggest not referring to an "order" since that concept doesn't exist anywhere else

* Need tests for coverage of these ERROR messages:
  Named security policy violation
  SELECT not allowed in MERGE INSERT...
  Multiple VALUES clauses not...
  MERGE is not supported for this...
  MERGE is not supported for relations with inheritance
  MERGE is not supported for relations with rules

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
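A docs example of the constant-source-query style asked for above might look like this (an OLTP-style single-row "upsert" sketch; the table and column names are invented, and whether a constant SELECT or a VALUES list is used is incidental):

```sql
-- Single-row upsert driven by a constant source, no scan on the
-- source side:
MERGE INTO customer_account ca
USING (SELECT 42 AS customer_id, 100.00 AS amount) t
  ON ca.customer_id = t.customer_id
WHEN MATCHED THEN
  UPDATE SET balance = ca.balance + t.amount
WHEN NOT MATCHED THEN
  INSERT (customer_id, balance) VALUES (t.customer_id, t.amount);
```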
On 27 March 2018 at 11:46, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 27 March 2018 at 10:31, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>> Fixed in v26.
>
> More comments on v26

In terms of further performance optimization, if there is just one WHEN AND condition and no unconditional WHEN clauses, then we can add the WHEN AND condition easily to the join query. That seems like an easy thing to do for PG11.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
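For example (a sketch, names invented), a statement with a single conditional WHEN clause and no other WHEN clauses:

```sql
MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED AND s.val IS DISTINCT FROM t.val THEN
  UPDATE SET val = s.val;

-- could, in principle, be planned as if the qual were part of the join:
--   ... USING source s ON t.id = s.id
--                     AND s.val IS DISTINCT FROM t.val ...
-- since with no unconditional or NOT MATCHED clauses, a row filtered
-- out by the qual results in no action either way.
```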
On Tue, Mar 27, 2018 at 1:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Accepted, the only question is whether it affects UPDATE as well cos > it looks like it should. If you mean an UPDATE FROM self-join, then I suppose that it does, in a very limited way. The difference is that there are no hard-coded assumptions about the relationship between those two RTEs. -- Peter Geoghegan
On Tue, Mar 27, 2018 at 2:28 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> (Version 26)

I have some feedback on this version:

* ExecMergeMatched() needs to determine tuple lock mode for
EvalPlanQual() in a way that's based on how everything else works;
it's not okay to just use LockTupleExclusive in all cases. That will
often lead to lock escalation, which can cause unprincipled deadlocks.
You need to pass back the relevant info from routines like
heap_update(), which means more state needs to come back to
ExecMergeMatched() from routines like ExecUpdate().

* Doesn't ExecUpdateLockMode(), which is called from places like
ExecBRUpdateTriggers(), also need to be taught about
GetEPQRangeTableIndex() (i.e. the ri_mergeTargetRTI/ri_RangeTableIndex
divide)? You should audit everything like that carefully. Maybe
GetEPQRangeTableIndex() is not the best choke-point to do this kind of
thing. Not that I have a clearly better idea.

* Looks like there is a similar problem in
ExecPartitionCheckEmitError(). I don't really understand how that
works, so I might be wrong here.

* More or less the same issue seems to exist within ExecConstraints(),
including where GetInsertedColumns() is used.
* Compiler warning:

fwrapv -fexcess-precision=standard -Og -g3 -fno-omit-frame-pointer
-I../../../src/include -I/code/postgresql/root/build/../source/src/include
-D_FORTIFY_SOURCE=2 -D TRUST_STRXFRM -D LOCK_DEBUG -D WAL_DEBUG -D
BTREE_BUILD_STATS -D MULTIXACT_DEBUG -D SELECTIVITY_DEBUG -D HJDEBUG
-D_GNU_SOURCE -I/usr/include/libxml2 -c -o nodeModifyTable.o
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c:
In function ‘ExecInsert’:
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c:412:4:
warning: ‘wco_kind’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
  ExecWithCheckOptions(wco_kind, resultRelInfo, slot, estate);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* The BufferIsValid() checks to decide if we need to ReleaseBuffer()
within ExecMergeMatched() are unnecessary -- a buffer pin must be held
throughout. This looks like it's leftover from before the
ExecMergeNotMatched()/ExecMergeMatched() split was made.

* There should be ResetExprContext() calls for your new
MergeActionState projections. That's what we see for the RETURNING +
ON CONFLICT projections within nodeModifyTable.c, which the new
projections are very similar to, and clearly modeled on.

-- Peter Geoghegan
On Wed, Mar 28, 2018 at 8:28 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Mar 27, 2018 at 2:28 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> (Version 26)
I have some feedback on this version:
* ExecMergeMatched() needs to determine tuple lock mode for
EvalPlanQual() in a way that's based on how everything else works;
it's not okay to just use LockTupleExclusive in all cases. That will
often lead to lock escalation, which can cause unprincipled deadlocks.
You need to pass back the relevant info from routines like
heap_update(), which means more state needs to come back to
ExecMergeMatched() from routines like ExecUpdate().
You're right. I am thinking about what would be a good way to pass that information back. Should we add a new out parameter to ExecUpdate() or a new member to HeapUpdateFailureData? It seems only ExecUpdate() would need the change, so maybe it's fine to change the API; but HeapUpdateFailureData doesn't look bad either, since it deals with failure cases and we are indeed dealing with an ExecUpdate() failure. Any preference?
* Doesn't ExecUpdateLockMode(), which is called from places like
ExecBRUpdateTriggers(), also need to be taught about
GetEPQRangeTableIndex() (i.e. the ri_mergeTargetRTI/ri_RangeTableIndex
divide)? You should audit everything like that carefully. Maybe
GetEPQRangeTableIndex() is not the best choke-point to do this kind of
thing. Not that I have a clearly better idea.
* Looks like there is a similar problem in
ExecPartitionCheckEmitError(). I don't really understand how that
works, so I might be wrong here.
* More or less the same issue seems to exist within ExecConstraints(),
including where GetInsertedColumns() is used.
They all look fine to me. Remember that we always resolve column references in the WHEN targetlist from the target relation, and hence things like updatedCols/insertedCols get set on the target RTE. All of these routines read from the target RTE, as far as I can see. But I will check in more detail.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Mar 28, 2018 at 8:28 AM, Peter Geoghegan <pg@bowt.ie> wrote:
* ExecMergeMatched() needs to determine tuple lock mode for
EvalPlanQual() in a way that's based on how everything else works;
it's not okay to just use LockTupleExclusive in all cases. That will
often lead to lock escalation, which can cause unprincipled deadlocks.
You need to pass back the relevant info from routines like
heap_update(), which means more state needs to come back to
ExecMergeMatched() from routines like ExecUpdate().
Fixed by adding a new member to HeapUpdateFailureData. That seemed more natural, but we can change if others disagree.
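The idea being settled on here, that the heap layer reports the lock mode it actually took back through the failure-data struct so the EvalPlanQual() caller can reuse it, can be sketched roughly as follows. This is an illustrative stand-in with hypothetical names, not the real PostgreSQL definitions or the patch's code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative sketch only: models the discussion above. The "heap
 * layer" chooses the tuple lock mode (escalating to the exclusive
 * tuple lock only when key columns change) and records the mode it
 * used in the failure data, so the caller can hand the same mode to
 * its EvalPlanQual() retry instead of hard-coding the strongest one.
 */
typedef enum
{
	SKETCH_LOCK_NO_KEY_EXCLUSIVE,
	SKETCH_LOCK_EXCLUSIVE
} SketchLockMode;

typedef struct
{
	SketchLockMode lockmode;	/* mode the "heap layer" actually took */
} SketchUpdateFailureData;

static void
sketch_heap_update(bool key_cols_changed, SketchUpdateFailureData *fd)
{
	/* escalate to the stronger tuple lock only when keys are modified */
	fd->lockmode = key_cols_changed ? SKETCH_LOCK_EXCLUSIVE
		: SKETCH_LOCK_NO_KEY_EXCLUSIVE;
}
```

The point of routing this through the failure struct, rather than a new out parameter, is exactly what is said above: the information only matters on the concurrent-update failure path.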
* Doesn't ExecUpdateLockMode(), which is called from places like
ExecBRUpdateTriggers(), also need to be taught about
GetEPQRangeTableIndex() (i.e. the ri_mergeTargetRTI/ri_RangeTableIndex
divide)? You should audit everything like that carefully. Maybe
GetEPQRangeTableIndex() is not the best choke-point to do this kind of
thing. Not that I have a clearly better idea.
* Looks like there is a similar problem in
ExecPartitionCheckEmitError(). I don't really understand how that
works, so I might be wrong here.
* More or less the same issue seems to exist within ExecConstraints(),
including where GetInsertedColumns() is used.
As I said, I do not see a problem with this. The insertedCols/updatedCols etc. are tracked in the target table's RTE and hence we should continue to use ri_RangeTableIndex for such purposes. The ri_mergeTargetRTI is only useful for running EvalPlanQual correctly.
* Compiler warning:
fwrapv -fexcess-precision=standard -Og -g3 -fno-omit-frame-pointer
-I../../../src/include
-I/code/postgresql/root/build/../source/src/include
-D_FORTIFY_SOURCE=2 -D TRUST_STRXFRM -D LOCK_DEBUG -D WAL_DEBUG -D
BTREE_BUILD_STATS -D MULTIXACT_DEBUG -D SELECTIVITY_DEBUG -D HJDEBUG
-D_GNU_SOURCE -I/usr/include/libxml2 -c -o nodeModifyTable.o
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c:
In function ‘ExecInsert’:
/code/postgresql/root/build/../source/src/backend/executor/nodeModifyTable.c:412:4:
warning: ‘wco_kind’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
ExecWithCheckOptions(wco_kind, resultRelInfo, slot, estate);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixed.
* The BufferIsValid() checks to decide if we need to ReleaseBuffer()
within ExecMergeMatched() are unnecessary -- a buffer pin must be held
throughout. This looks like it's leftover from before the
ExecMergeNotMatched()/ExecMergeMatched() split was made.
Fixed.
* There should be ResetExprContext() calls for your new
MergeActionState projections. That's what we see for the RETURNING +
ON CONFLICT projections within nodeModifyTable.c, which the new
projections are very similar to, and clearly modeled on.
Right. I thought about what would be the right place to call ResetExprContext() and chose to do that in ExecMerge(). I am actually a bit surprised that ExecProcessReturning() and ExecOnConflictUpdate() call ResetExprContext(). Aren't they both using mtstate->ps.ps_ExprContext? If so, why is it safe to reset the expression context before we are completely done with the previous tuple? Clearly, the current code is working fine, so maybe there is no bug, but it looks curious.
On Tue, Mar 27, 2018 at 4:16 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
More comments on v26
* Change errmsg “Ensure that not more than one source rows match any
one target row”
should be
“Ensure that not more than one source row matches any one target row”
Fixed.
* I think we need an example in the docs showing a constant source
query, so people can understand how to use MERGE for OLTP as well as
large ELT
* Long comment in ExecMerge() needs rewording, formatting and spell check
Tried. Please check and suggest improvements, if any.
I suggest not referring to an "order" since that concept doesn't exist
anywhere else
You mean when we say that the actions are evaluated "in an order"?
* Need tests for coverage of these ERROR messages
Named security policy violation
Surprisingly none exists even for regular UPDATE/INSERT/DELETE. I will see what is needed to trigger that error message.
SELECT not allowed in MERGE INSERT...
Multiple VALUES clauses not...
MERGE is not supported for this...
MERGE is not supported for relations with inheritance
MERGE is not supported for relations with rules
Added a test for each of those. v27 attached, though review changes are in the add-on 0005 patch.
Apart from that
- ran the patch through spell-checker and fixed a few typos
- reorganised the code in 0004 a bit.
- rebased on current master
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Mar 27, 2018 at 5:00 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
In terms of further performance optimization, if there is just one
WHEN AND condition and no unconditional WHEN clauses then we can add
the WHEN AND easily to the join query.
That seems like an easy thing to do for PG11
For example,
postgres=# select * from target ;
a | b
---+----
1 | 10
2 | 20
(2 rows)
postgres=# select * from source ;
a | b
---+-----
2 | 200
3 | 300
(2 rows)
postgres=# BEGIN;
BEGIN
postgres=# EXPLAIN ANALYZE MERGE INTO target t USING source s ON t.a = s.a WHEN MATCHED AND t.a < 2 THEN UPDATE SET b = s.b WHEN NOT MATCHED THEN INSERT VALUES (s.a, -1);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Merge on target t (cost=317.01..711.38 rows=25538 width=46) (actual time=0.104..0.104 rows=0 loops=1)
Tuples Inserted: 1
Tuples Updated: 0
Tuples Deleted: 0
Tuples Skipped: 1
-> Merge Left Join (cost=317.01..711.38 rows=25538 width=46) (actual time=0.071..0.074 rows=2 loops=1)
Merge Cond: (s.a = t_1.a)
-> Sort (cost=158.51..164.16 rows=2260 width=40) (actual time=0.042..0.043 rows=2 loops=1)
Sort Key: s.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on source s (cost=0.00..32.60 rows=2260 width=40) (actual time=0.027..0.031 rows=2 loops=1)
-> Sort (cost=158.51..164.16 rows=2260 width=10) (actual time=0.019..0.020 rows=2 loops=1)
Sort Key: t_1.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on target t_1 (cost=0.00..32.60 rows=2260 width=10) (actual time=0.012..0.014 rows=2 loops=1)
Planning Time: 0.207 ms
Execution Time: 0.199 ms
(17 rows)
postgres=# abort;
ROLLBACK
postgres=# BEGIN;
BEGIN
postgres=# EXPLAIN ANALYZE MERGE INTO target t USING source s ON t.a = s.a AND t.a < 2 WHEN MATCHED THEN UPDATE SET b = s.b WHEN NOT MATCHED THEN INSERT VALUES (s.a, -1);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Merge on target t (cost=232.74..364.14 rows=8509 width=46) (actual time=0.128..0.128 rows=0 loops=1)
Tuples Inserted: 2
Tuples Updated: 0
Tuples Deleted: 0
Tuples Skipped: 0
-> Merge Right Join (cost=232.74..364.14 rows=8509 width=46) (actual time=0.070..0.072 rows=2 loops=1)
Merge Cond: (t_1.a = s.a)
-> Sort (cost=74.23..76.11 rows=753 width=10) (actual time=0.038..0.039 rows=1 loops=1)
Sort Key: t_1.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on target t_1 (cost=0.00..38.25 rows=753 width=10) (actual time=0.026..0.028 rows=1 loops=1)
Filter: (a < 2)
Rows Removed by Filter: 1
-> Sort (cost=158.51..164.16 rows=2260 width=40) (actual time=0.024..0.025 rows=2 loops=1)
Sort Key: s.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on source s (cost=0.00..32.60 rows=2260 width=40) (actual time=0.014..0.017 rows=2 loops=1)
Planning Time: 0.218 ms
Execution Time: 0.234 ms
(19 rows)
postgres=# abort;
ROLLBACK
If you look at the first MERGE statement, we filter one matched source row (2,200) using (t.a < 2) and do not take any action for that row. This filtering happens after the RIGHT JOIN has reported it as "matched". But if we push down the qual to the join, then the join will see that the source row has no match and hence send that row for NOT MATCHED processing, thus inserting it into the table again.
I am not saying there is no scope for improvement. But we need to be careful about what can be pushed down to the join and what must be applied after the join.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 29 March 2018 at 07:37, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Tue, Mar 27, 2018 at 5:00 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> In terms of further performance optimization, if there is just one
>> WHEN AND condition and no unconditional WHEN clauses then we can add
>> the WHEN AND easily to the join query.
>>
>> That seems like an easy thing to do for PG11
>
> I think we need to be careful in terms of what can be pushed down to the
> join, in presence of WHEN NOT MATCHED actions. If we push the WHEN AND qual
> to the join then I am worried that some rows which should have been reported
> "matched" and later filtered out as part of the WHEN quals, will get
> reported as "not-matched", thus triggering WHEN NOT MATCHED action.
>
> postgres=# EXPLAIN ANALYZE MERGE INTO target t USING source s ON t.a = s.a
> WHEN MATCHED AND t.a < 2 THEN UPDATE SET b = s.b WHEN NOT MATCHED THEN
> INSERT VALUES (s.a, -1);

That has an unconditional WHEN clause, so would block the push down
using my stated rule above.

With something like this

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED AND t.a < 2 THEN UPDATE SET b = s.b;

or this

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED AND t.a < 2 THEN UPDATE SET b = s.b
WHEN NOT MATCHED DO NOTHING;

or this

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED AND t.a < 2 THEN UPDATE SET b = s.b
WHEN MATCHED DO NOTHING
WHEN NOT MATCHED DO NOTHING;

then we can push down "t.a < 2" into the WHERE clause of the join query.

-- Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
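The stated rule can be modeled as a small predicate over the action list. This is an illustrative C sketch (hypothetical types, not the patch's code): pushdown is allowed only when exactly one action carries a WHEN ... AND qual and every qual-less action is DO NOTHING, so no row filtered out by the qual can fall through to an action that would visibly fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of one MERGE WHEN clause; not the real parse node. */
typedef struct
{
	bool		has_qual;	/* action has a WHEN ... AND condition */
	bool		do_nothing; /* action is DO NOTHING */
} MergeActionSketch;

/*
 * Returns true when the single WHEN ... AND qual may be pushed into
 * the join's WHERE clause: exactly one conditional action, and every
 * unconditional action is DO NOTHING (i.e. harmless either way).
 */
static bool
can_push_down_qual(const MergeActionSketch *actions, int nactions)
{
	int			nconditional = 0;

	for (int i = 0; i < nactions; i++)
	{
		if (actions[i].has_qual)
			nconditional++;
		else if (!actions[i].do_nothing)
			return false;	/* unconditional, visible action blocks it */
	}
	return nconditional == 1;
}
```

Under this model, Simon's three allowed statements pass, while Pavan's counterexample (conditional WHEN MATCHED UPDATE plus unconditional WHEN NOT MATCHED INSERT) is rejected.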
On 28 March 2018 at 12:00, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> v27 attached, though review changes are in the add-on 0005 patch.

This all looks good now, thanks for making all of those changes.

I propose [v27 patch1+patch3+patch5] as the initial commit candidate
for MERGE, with other patches following later before end of CF.

I propose to commit this tomorrow, 30 March, about 26 hours from now.
That will allow some time for buildfarm fixing/reversion before the
Easter weekend, then other patches to follow starting 2 April. That
then gives reasonable time to follow up on other issues that we will
no doubt discover fairly soon after commit, such as additional runs by
SQLsmith and more eyeballs.

-- Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 March 2018 at 10:50, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 28 March 2018 at 12:00, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> v27 attached, though review changes are in the add-on 0005 patch.
>
> This all looks good now, thanks for making all of those changes.
>
> I propose [v27 patch1+patch3+patch5] as the initial commit candidate
> for MERGE, with other patches following later before end CF.
>
> I propose to commit this tomorrow, 30 March, about 26 hours from now.
> That will allow some time for buildfarm fixing/reversion before the
> Easter weekend, then other patches to follow starting 2 April. That
> then gives reasonable time to follow up on other issues that we will
> no doubt discover fairly soon after commit, such as additional runs by
> SQLsmith and more eyeballs.

No problems found, but moving proposed commit to 2 April pm

-- Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 2018-03-30 12:10:04 +0100, Simon Riggs wrote:
> On 29 March 2018 at 10:50, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 28 March 2018 at 12:00, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>>> v27 attached, though review changes are in the add-on 0005 patch.
>>
>> This all looks good now, thanks for making all of those changes.
>>
>> I propose [v27 patch1+patch3+patch5] as the initial commit candidate
>> for MERGE, with other patches following later before end CF.
>>
>> I propose to commit this tomorrow, 30 March, about 26 hours from now.
>> That will allow some time for buildfarm fixing/reversion before the
>> Easter weekend, then other patches to follow starting 2 April. That
>> then gives reasonable time to follow up on other issues that we will
>> no doubt discover fairly soon after commit, such as additional runs by
>> SQLsmith and more eyeballs.
>
> No problems found, but moving proposed commit to 2 April pm

I did a scan through this, as I hadn't been able to keep up with the
thread previously. Sorry if some of the things mentioned here have been
discussed previously. I am just reading through the patch in its own
order, so please excuse if there's things I remark on that only later
fully make sense.

Later update, TL;DR: I don't think the parser / executor implementation
of MERGE is architecturally sound. I think creating hidden joins during
parse-analysis to implement MERGE is a seriously bad idea and it needs
to be replaced by a different executor structure.

@@ -3828,8 +3846,14 @@ struct AfterTriggersTableData
 	bool		before_trig_done;	/* did we already queue BS triggers? */
 	bool		after_trig_done;	/* did we already queue AS triggers? */
 	AfterTriggerEventList after_trig_events;	/* if so, saved list pointer */
-	Tuplestorestate *old_tuplestore;	/* "old" transition table, if any */
-	Tuplestorestate *new_tuplestore;	/* "new" transition table, if any */
+	/* "old" transition table for UPDATE, if any */
+	Tuplestorestate *old_upd_tuplestore;
+	/* "new" transition table for UPDATE, if any */
+	Tuplestorestate *new_upd_tuplestore;
+	/* "old" transition table for DELETE, if any */
+	Tuplestorestate *old_del_tuplestore;
+	/* "new" transition table INSERT, if any */
+	Tuplestorestate *new_ins_tuplestore;
 };

A comment somewhere on why we have all of these (presumably because
they can now happen in the context of a single statement) would be
good.

@@ -5744,12 +5796,28 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 			newtup == NULL));

 	if (oldtup != NULL &&
-		((event == TRIGGER_EVENT_DELETE && delete_old_table) ||
-		 (event == TRIGGER_EVENT_UPDATE && update_old_table)))
+		(event == TRIGGER_EVENT_DELETE && delete_old_table))
 	{
 		Tuplestorestate *old_tuplestore;

-		old_tuplestore = transition_capture->tcs_private->old_tuplestore;
+		old_tuplestore = transition_capture->tcs_private->old_del_tuplestore;
+
+		if (map != NULL)
+		{
+			HeapTuple	converted = do_convert_tuple(oldtup, map);
+
+			tuplestore_puttuple(old_tuplestore, converted);
+			pfree(converted);
+		}
+		else
+			tuplestore_puttuple(old_tuplestore, oldtup);
+	}

Very similar code is now repeated four times. Could you abstract this
into a separate function?

--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -37,6 +37,16 @@ the plan tree returns the computed tuples to be updated, plus a "junk"
 one.  For DELETE, the plan tree need only deliver a CTID column, and the
 ModifyTable node visits each of those rows and marks the row deleted.

+MERGE runs one generic plan that returns candidate target rows.
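The "abstract this into a separate function" suggestion above could be factored roughly like this. The types here are simplified stand-ins, not the real trigger.c definitions; the point is only that the "convert if a map is present, then append" dance lives in one place instead of being repeated four times:

```c
#include <assert.h>

/* Simplified stand-ins for HeapTuple / Tuplestorestate; illustrative only. */
typedef struct { int a, b; } TupleSketch;
typedef struct { TupleSketch rows[8]; int nrows; } TuplestoreSketch;
typedef TupleSketch (*ConvertMap)(TupleSketch);

/*
 * One helper for all four transition-table branches: apply the
 * optional conversion map, then append to the given tuplestore.
 */
static void
transition_store_tuple(TuplestoreSketch *store, TupleSketch tup,
					   ConvertMap map)
{
	store->rows[store->nrows++] = map ? map(tup) : tup;
}
```

Each of the four call sites would then pass its own tuplestore (old/new, update/delete/insert) and map, rather than duplicating the branch.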
+Each row
+consists of a super-row that contains all the columns needed by any of the
+individual actions, plus a CTID and a TABLEOID junk columns.  The CTID column is

"plus a CTID and a TABLEOID junk columns" sounds a bit awkward?

+required to know if a matching target row was found or not and the TABLEOID
+column is needed to find the underlying target partition, in case when the
+target table is a partition table.  If the CTID column is set we attempt to
+activate WHEN MATCHED actions, or if it is NULL then we will attempt to

"if it is NULL then" sounds wrong.

+activate WHEN NOT MATCHED actions.  Once we know which action is activated we
+form the final result row and apply only those changes.
+
 XXX a great deal more documentation needs to be written here...

Are we using tableoids in a similar way in other places?

+/*
+ * Given OID of the partition leaf, return the index of the leaf in the
+ * partition hierarchy.
+ */
+int
+ExecFindPartitionByOid(PartitionTupleRouting *proute, Oid partoid)
+{
+	int			i;
+
+	for (i = 0; i < proute->num_partitions; i++)
+	{
+		if (proute->partition_oids[i] == partoid)
+			break;
+	}
+
+	Assert(i < proute->num_partitions);
+	return i;
+}

Shouldn't we at least warn in a comment that this is O(N)? And
document that it does weird stuff if the OID isn't found? Perhaps just
introduce a PARTOID syscache?

diff --git a/src/backend/executor/nodeMerge.c b/src/backend/executor/nodeMerge.c
new file mode 100644
index 00000000000..0e0d0795d4d
--- /dev/null
+++ b/src/backend/executor/nodeMerge.c
@@ -0,0 +1,575 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeMerge.c
+ *	  routines to handle Merge nodes relating to the MERGE command

Isn't this file misnamed and it should be execMerge.c? The rest of the
node*.c files are for nodes that are invoked via execProcnode /
ExecProcNode(). This isn't an actual executor node. Might also be
worthwhile to move the merge related init stuff here?
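The two complaints about ExecFindPartitionByOid() quoted above can be made concrete with a stand-in (illustrative C, not the patch code): the lookup is a linear O(N) scan over the routing table's OID array, and a miss must be handled explicitly instead of tripping an Assert or falling off the end.

```c
#include <assert.h>

typedef unsigned int Oid;	/* stand-in for the PostgreSQL typedef */

/*
 * Stand-in for ExecFindPartitionByOid(): a linear, O(N) scan over the
 * partition OID array. Unlike the quoted patch code, a miss returns
 * -1 so the caller decides what "not found" means, rather than the
 * function asserting (or, with assertions off, returning an
 * out-of-range index).
 */
static int
find_partition_index(const Oid *partition_oids, int num_partitions,
					 Oid partoid)
{
	for (int i = 0; i < num_partitions; i++)
	{
		if (partition_oids[i] == partoid)
			return i;
	}
	return -1;
}
```

A syscache keyed on partition OID, as suggested, would replace the scan with a hash lookup and avoid the O(N) cost per MATCHED row.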
What's the reasoning behind making this be an anomalous type of
executor node?

+/*
+ * Perform MERGE.
+ */
+void
+ExecMerge(ModifyTableState *mtstate, EState *estate, TupleTableSlot *slot,
+		  JunkFilter *junkfilter, ResultRelInfo *resultRelInfo)
+{

+	/*
+	 * ExecMergeMatched takes care of following the update chain and
+	 * re-finding the qualifying WHEN MATCHED action, as long as the updated
+	 * target tuple still satisfies the join quals i.e. it still remains a
+	 * WHEN MATCHED case. If the tuple gets deleted or the join quals fail, it
+	 * returns and we try ExecMergeNotMatched. Given that ExecMergeMatched
+	 * always make progress by following the update chain and we never switch
+	 * from ExecMergeNotMatched to ExecMergeMatched, there is no risk of a
+	 * livelock.
+	 */
+	if (matched)
+		matched = ExecMergeMatched(mtstate, estate, slot, junkfilter, tupleid);
+
+	/*
+	 * Either we were dealing with a NOT MATCHED tuple or ExecMergeNotMatched()
+	 * returned "false", indicating the previously MATCHED tuple is no longer a
+	 * matching tuple.
+	 */
+	if (!matched)
+		ExecMergeNotMatched(mtstate, estate, slot);

So what happens if there's a concurrent insertion of a potentially
matching tuple?

FWIW, I'd re-order this file so this routine is above
ExecMergeMatched(), ExecMergeNotMatched(), easier to understand.
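The no-livelock argument in the quoted comment can be reduced to a toy model (illustrative C, nothing like the real executor): the MATCHED path may walk an update chain, but once control falls through to NOT MATCHED it never switches back, so every iteration makes progress.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of ExecMerge()'s control flow. In "chain", 0 means a
 * stable tuple version satisfying the join quals, 1 means
 * "concurrently updated, follow the chain", and -1 means the tuple
 * was deleted or the join quals no longer pass.
 */
typedef enum { OUTCOME_MATCHED, OUTCOME_NOT_MATCHED } MergeOutcome;

static MergeOutcome
toy_exec_merge(const int *chain, int chain_len, bool matched)
{
	int			pos = 0;

	while (matched)
	{
		if (pos >= chain_len || chain[pos] < 0)
		{
			matched = false;	/* one-way switch to NOT MATCHED */
			break;
		}
		if (chain[pos] == 0)
			return OUTCOME_MATCHED; /* settled on this tuple version */
		pos++;					/* follow the update chain forward */
	}
	/* we never go back from NOT MATCHED to MATCHED: no livelock */
	return OUTCOME_NOT_MATCHED;
}
```

The model also makes Andres's question sharp: a row concurrently *inserted* after the join ran simply never appears in "chain" at all, which is exactly the window the ON CONFLICT-style machinery has to cover.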
+static bool
+ExecMergeMatched(ModifyTableState *mtstate, EState *estate,
+				 TupleTableSlot *slot, JunkFilter *junkfilter,
+				 ItemPointer tupleid)
+{
+	ExprContext *econtext = mtstate->ps.ps_ExprContext;
+	bool		isNull;
+	List	   *mergeMatchedActionStates = NIL;
+	HeapUpdateFailureData hufd;
+	bool		tuple_updated,
+				tuple_deleted;
+	Buffer		buffer;
+	HeapTupleData tuple;
+	EPQState   *epqstate = &mtstate->mt_epqstate;
+	ResultRelInfo *saved_resultRelInfo;
+	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
+	ListCell   *l;
+	TupleTableSlot *saved_slot = slot;
+
+	if (mtstate->mt_partition_tuple_routing)
+	{
+		Datum		datum;
+		Oid			tableoid = InvalidOid;
+		int			leaf_part_index;
+		PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
+
+		/*
+		 * In case of partitioned table, we fetch the tableoid while performing
+		 * MATCHED MERGE action.
+		 */
+		datum = ExecGetJunkAttribute(slot, junkfilter->jf_otherJunkAttNo,
+									 &isNull);
+		Assert(!isNull);
+		tableoid = DatumGetObjectId(datum);
+
+		/*
+		 * If we're dealing with a MATCHED tuple, then tableoid must have been
+		 * set correctly. In case of partitioned table, we must now fetch the
+		 * correct result relation corresponding to the child table emitting
+		 * the matching target row. For normal table, there is just one result
+		 * relation and it must be the one emitting the matching row.
+		 */
+		leaf_part_index = ExecFindPartitionByOid(proute, tableoid);
+
+		resultRelInfo = proute->partitions[leaf_part_index];
+		if (resultRelInfo == NULL)
+		{
+			resultRelInfo = ExecInitPartitionInfo(mtstate,
+												  mtstate->resultRelInfo,
+												  proute, estate,
+												  leaf_part_index);
+			Assert(resultRelInfo != NULL);
+		}
+	}
+
+	/*
+	 * Save the current information and work with the correct result relation.
+	 */
+	saved_resultRelInfo = resultRelInfo;
+	estate->es_result_relation_info = resultRelInfo;
+
+	/*
+	 * And get the correct action lists.
+	 */
+	mergeMatchedActionStates =
+		resultRelInfo->ri_mergeState->matchedActionStates;
+
+	/*
+	 * If there are not WHEN MATCHED actions, we are done.
+	 */
+	if (mergeMatchedActionStates == NIL)
+		return true;

Maybe I'm confused, but why is mergeMatchedActionStates attached to
per-partition info? How can this differ in number between partitions,
requiring us to re-check it below fetching the partition info above?

+		/*
+		 * Check for any concurrent update/delete operation which may have
+		 * prevented our update/delete. We also check for situations where we
+		 * might be trying to update/delete the same tuple twice.
+		 */
+		if ((action->commandType == CMD_UPDATE && !tuple_updated) ||
+			(action->commandType == CMD_DELETE && !tuple_deleted))
+
+		{
+			switch (hufd.result)
+			{
+				case HeapTupleMayBeUpdated:
+					break;
+				case HeapTupleInvisible:
+
+					/*
+					 * This state should never be reached since the underlying
+					 * JOIN runs with a MVCC snapshot and should only return
+					 * rows visible to us.
+					 */

Given EPQ, that reasoning isn't correct. I think this should still be
unreachable, just not for the reason described here.

+				case HeapTupleUpdated:
+
+					/*
+					 * The target tuple was concurrently updated/deleted by
+					 * some other transaction.
+					 *
+					 * If the current tuple is that last tuple in the update
+					 * chain, then we know that the tuple was concurrently
+					 * deleted. Just return and let the caller try NOT MATCHED
+					 * actions.
+					 *
+					 * If the current tuple was concurrently updated, then we
+					 * must run the EvalPlanQual() with the new version of the
+					 * tuple. If EvalPlanQual() does not return a tuple then
+					 * we switch to the NOT MATCHED list of actions. If it
+					 * does return a tuple and the join qual is still
+					 * satisfied, then we just need to recheck the MATCHED
+					 * actions, starting from the top, and execute the first
+					 * qualifying action.
+					 */
+					if (!ItemPointerEquals(tupleid, &hufd.ctid))
+					{
+						TupleTableSlot *epqslot;
+
+						/*
+						 * Since we generate a JOIN query with a target table
+						 * RTE different than the result relation RTE, we must
+						 * pass in the RTI of the relation used in the join
+						 * query and not the one from result relation.
+						 */
+						Assert(resultRelInfo->ri_mergeTargetRTI > 0);
+						epqslot = EvalPlanQual(estate,
+											   epqstate,
+											   resultRelInfo->ri_RelationDesc,
+											   GetEPQRangeTableIndex(resultRelInfo),
+											   LockTupleExclusive,
+											   &hufd.ctid,
+											   hufd.xmax);
+
+						if (!TupIsNull(epqslot))
+						{
+							(void) ExecGetJunkAttribute(epqslot,
+														resultRelInfo->ri_junkFilter->jf_junkAttNo,
+														&isNull);
+
+							/*
+							 * A non-NULL ctid means that we are still dealing
+							 * with MATCHED case. But we must retry from the
+							 * start with the updated tuple to ensure that the
+							 * first qualifying WHEN MATCHED action is
+							 * executed.
+							 *
+							 * We don't use the new slot returned by
+							 * EvalPlanQual because we anyways re-install the
+							 * new target tuple in econtext->ecxt_scantuple
+							 * before re-evaluating WHEN AND conditions and
+							 * re-projecting the update targetlists. The
+							 * source side tuple does not change and hence we
+							 * can safely continue to use the old slot.
+							 */
+							if (!isNull)
+							{
+								/*
+								 * Must update *tupleid to the TID of the
+								 * newer tuple found in the update chain.
+								 */
+								*tupleid = hufd.ctid;
+								ReleaseBuffer(buffer);
+								goto lmerge_matched;

It seems fairly bad architecturally to me that we now have
EvalPlanQual() loops in both this routine *and*
ExecUpdate()/ExecDelete().

+/*
+ * Execute the first qualifying NOT MATCHED action.
+ */
+static void
+ExecMergeNotMatched(ModifyTableState *mtstate, EState *estate,
+					TupleTableSlot *slot)
+{
+	PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
+	ExprContext *econtext = mtstate->ps.ps_ExprContext;
+	List	   *mergeNotMatchedActionStates = NIL;
+	ResultRelInfo *resultRelInfo;
+	ListCell   *l;
+	TupleTableSlot *myslot;
+
+	/*
+	 * We are dealing with NOT MATCHED tuple. Since for MERGE, partition tree

*the partition tree

+	 * is not expanded for the result relation, we continue to work with the
+	 * currently active result relation, which should be of the root of the
+	 * partition tree.
+	 */
+	resultRelInfo = mtstate->resultRelInfo;

"should be"? "is", I hope? Given that it's referencing
mtstate->resultRelInfo which is only set in one place...

+/* flags for mt_merge_subcommands */
+#define MERGE_INSERT	0x01
+#define MERGE_UPDATE	0x02
+#define MERGE_DELETE	0x04

Hm, it seems a bit weird to define these file-local, given there's
another file implementing a good chunk of merge.

 * ExecWithCheckOptions() will skip any WCOs which are not of the kind
@@ -617,10 +627,19 @@ ExecInsert(ModifyTableState *mtstate,
  *		passed to foreign table triggers; it is NULL when the foreign
  *		table has no relevant triggers.
  *
+ *		MERGE passes actionState of the action it's currently executing;
+ *		regular DELETE passes NULL. This is used by ExecDelete to know if it's
+ *		being called from MERGE or regular DELETE operation.
+ *
+ *		If the DELETE fails because the tuple is concurrently updated/deleted
+ *		by this or some other transaction, hufdp is filled with the reason as
+ *		well as other important information. Currently only MERGE needs this
+ *		information.
+ *
  *		Returns RETURNING result if any, otherwise NULL.
  * ----------------------------------------------------------------
  */
-static TupleTableSlot *
+TupleTableSlot *
 ExecDelete(ModifyTableState *mtstate,
 		   ItemPointer tupleid,
 		   HeapTuple oldtuple,
@@ -629,6 +648,8 @@ ExecDelete(ModifyTableState *mtstate,
 		   EState *estate,
 		   bool *tupleDeleted,
 		   bool processReturning,
+		   HeapUpdateFailureData *hufdp,
+		   MergeActionState *actionState,
 		   bool canSetTag)
 {
 	ResultRelInfo *resultRelInfo;
@@ -641,6 +662,14 @@ ExecDelete(ModifyTableState *mtstate,
 	if (tupleDeleted)
 		*tupleDeleted = false;

+	/*
+	 * Initialize hufdp. Since the caller is only interested in the failure
+	 * status, initialize with the state that is used to indicate successful
+	 * operation.
+	 */
+	if (hufdp)
+		hufdp->result = HeapTupleMayBeUpdated;
+

This signals for me that the addition of the result field to HUFD
wasn't architecturally the right thing. HUFD is supposed to be
returned by heap_update(), and reusing and setting it from other
places seems like a layering violation to me.

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c9..cd540a0df5b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -851,6 +851,60 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 					fix_scan_list(root, splan->exclRelTlist, rtoffset);
 			}

+			/*
+			 * The MERGE produces the target rows by performing a right

s/the MERGE/the MERGE statement/?

+			 * join between the target relation and the source relation
+			 * (which could be a plain relation or a subquery). The INSERT
+			 * and UPDATE actions of the MERGE requires access to the same.

+opt_and_condition:

Renaming this to be a bit more merge specific seems like a good idea.

+/*-------------------------------------------------------------------------
+ *
+ * parse_merge.c
+ *	  handle merge-statement in parser

+/*
+ * Special handling for MERGE statement is required because we assemble
+ * the query manually. This is similar to setTargetTable() followed
+ * by transformFromClause() but with a few less steps.
+ *
+ * Process the FROM clause and add items to the query's range table,
+ * joinlist, and namespace.
+ *
+ * A special targetlist comprising of the columns from the right-subtree of
+ * the join is populated and returned. Note that when the JoinExpr is
+ * setup by transformMergeStmt, the left subtree has the target result
+ * relation and the right subtree has the source relation.
+ *
+ * Returns the rangetable index of the target relation.
+ */
+static int
+transformMergeJoinClause(ParseState *pstate, Node *merge,
+                         List **mergeSourceTargetList)
+{
+    RangeTblEntry *rte,
+                  *rt_rte;
+    List       *namespace;
+    int         rtindex,
+                rt_rtindex;
+    Node       *n;
+    int         mergeTarget_relation = list_length(pstate->p_rtable) + 1;
+    Var        *var;
+    TargetEntry *te;
+
+    n = transformFromClauseItem(pstate, merge,
+                                &rte,
+                                &rtindex,
+                                &rt_rte,
+                                &rt_rtindex,
+                                &namespace);
+
+    pstate->p_joinlist = list_make1(n);
+
+    /*
+     * We created an internal join between the target and the source relation
+     * to carry out the MERGE actions. Normally such an unaliased join hides
+     * the joining relations, unless the column references are qualified.
+     * Also, any unqualified column references are resolved to the Join RTE, if
+     * there is a matching entry in the targetlist. But the way MERGE
+     * execution is later setup, we expect all column references to resolve to
+     * either the source or the target relation. Hence we must not add the
+     * Join RTE to the namespace.
+     *
+     * The last entry must be for the top-level Join RTE. We don't want to
+     * resolve any references to the Join RTE. So discard that.
+     *
+     * We also do not want to resolve any references from the leftside of the
+     * Join since that corresponds to the target relation. References to the
+     * columns of the target relation must be resolved from the result
+     * relation and not the one that is used in the join. So the
+     * mergeTarget_relation is marked invisible to both qualified as well as
+     * unqualified references.
+     */

*Gulp*. I'm extremely doubtful this is a sane approach. At the very
least the amount of hackery required to make existing code cope with
the approach / duplicate code seems indicative of that.
+    Assert(list_length(namespace) > 1);
+    namespace = list_truncate(namespace, list_length(namespace) - 1);
+    pstate->p_namespace = list_concat(pstate->p_namespace, namespace);
+
+    setNamespaceVisibilityForRTE(pstate->p_namespace,
+                 rt_fetch(mergeTarget_relation, pstate->p_rtable), false, false);
+
+    /*
+     * Expand the right relation and add its columns to the
+     * mergeSourceTargetList. Note that the right relation can either be a
+     * plain relation or a subquery or anything that can have a
+     * RangeTableEntry.
+     */
+    *mergeSourceTargetList = expandSourceTL(pstate, rt_rte, rt_rtindex);
+
+    /*
+     * Add a whole-row-Var entry to support references to "source.*".
+     */
+    var = makeWholeRowVar(rt_rte, rt_rtindex, 0, false);
+    te = makeTargetEntry((Expr *) var, list_length(*mergeSourceTargetList) + 1,
+                         NULL, true);

Why can't this be dealt with properly by transformWholeRowRef() etc?

+/*
+ * transformMergeStmt -
+ *	  transforms a MERGE statement
+ */
+Query *
+transformMergeStmt(ParseState *pstate, MergeStmt *stmt)
+{
+    Query      *qry = makeNode(Query);
+    ListCell   *l;
+    AclMode     targetPerms = ACL_NO_RIGHTS;
+    bool        is_terminal[2];
+    JoinExpr   *joinexpr;
+    RangeTblEntry *resultRelRTE, *mergeRelRTE;
+
+    /* There can't be any outer WITH to worry about */
+    Assert(pstate->p_ctenamespace == NIL);
+
+    qry->commandType = CMD_MERGE;
+
+    /*
+     * Check WHEN clauses for permissions and sanity
+     */
+    is_terminal[0] = false;
+    is_terminal[1] = false;
+    foreach(l, stmt->mergeActionList)
+    {
+        MergeAction *action = (MergeAction *) lfirst(l);
+        uint        when_type = (action->matched ? 0 : 1);
+
+        /*
+         * Collect action types so we can check Target permissions
+         */
+        switch (action->commandType)
+        {
+            case CMD_INSERT:
+                {
+                    InsertStmt *istmt = (InsertStmt *) action->stmt;
+                    SelectStmt *selectStmt = (SelectStmt *) istmt->selectStmt;
+
+                    /*
+                     * The grammar allows attaching ORDER BY, LIMIT, FOR
+                     * UPDATE, or WITH to a VALUES clause and also multiple
+                     * VALUES clauses. If we have any of those, ERROR.
+                     */
+                    if (selectStmt && (selectStmt->valuesLists == NIL ||
+                                       selectStmt->sortClause != NIL ||
+                                       selectStmt->limitOffset != NULL ||
+                                       selectStmt->limitCount != NULL ||
+                                       selectStmt->lockingClause != NIL ||
+                                       selectStmt->withClause != NULL))
+                        ereport(ERROR,
+                                (errcode(ERRCODE_SYNTAX_ERROR),
+                                 errmsg("SELECT not allowed in MERGE INSERT statement")));
+                    if (selectStmt && list_length(selectStmt->valuesLists) > 1)
+                        ereport(ERROR,
+                                (errcode(ERRCODE_SYNTAX_ERROR),
+                                 errmsg("Multiple VALUES clauses not allowed in MERGE INSERT statement")));

Shouldn't this include an error position?

+    /*
+     * Construct a query of the form
+     *   SELECT relation.ctid      --junk attribute
+     *         ,relation.tableoid  --junk attribute
+     *         ,source_relation.<somecols>
+     *         ,relation.<somecols>
+     *   FROM relation RIGHT JOIN source_relation
+     *   ON join_condition;  -- no WHERE clause - all conditions are applied in
+     *                       -- executor
+     *
+     * stmt->relation is the target relation, given as a RangeVar
+     * stmt->source_relation is a RangeVar or subquery
+     *
+     * We specify the join as a RIGHT JOIN as a simple way of forcing the
+     * first (larg) RTE to refer to the target table.
+     *
+     * The MERGE query's join can be tuned in some cases, see below for these
+     * special case tweaks.
+     *
+     * We set QSRC_PARSER to show query constructed in parse analysis
+     *
+     * Note that we have only one Query for a MERGE statement and the planner
+     * is called only once. That query is executed once to produce our stream
+     * of candidate change rows, so the query must contain all of the columns
+     * required by each of the targetlist or conditions for each action.
+     *
+     * As top-level statements INSERT, UPDATE and DELETE have a Query, whereas
+     * with MERGE the individual actions do not require separate planning,
+     * only different handling in the executor. See nodeModifyTable handling
+     * of commandType CMD_MERGE.
+     *
+     * A sub-query can include the Target, but otherwise the sub-query cannot
+     * reference the outermost Target table at all.
+     */
+    qry->querySource = QSRC_PARSER;

Why is this, and not building a proper executor node for merge that
knows how to get the tuples, the right approach? We did a rough
equivalent for matview updates, and I think it turned out to be a pretty
bad plan.

+    /*
+     * XXX MERGE is unsupported in various cases
+     */
+    if (!(pstate->p_target_relation->rd_rel->relkind == RELKIND_RELATION ||
+          pstate->p_target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE))
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("MERGE is not supported for this relation type")));

Shouldn't this report what relation type the error is talking about? It's
not going to necessarily be obvious to the user. Also, errposition etc
would be good.

+    if (pstate->p_target_relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
+        pstate->p_target_relation->rd_rel->relhassubclass)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("MERGE is not supported for relations with inheritance")));

Hm.

+    if (pstate->p_target_relation->rd_rel->relhasrules)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("MERGE is not supported for relations with rules")));

I guess we can add that later, but it's a bit sad that this won't work
against views with INSTEAD OF triggers.

+    foreach(l, stmt->mergeActionList)
+    {
+        MergeAction *action = (MergeAction *) lfirst(l);
+
+        /*
+         * Set namespace for the specific action. This must be done before
+         * analyzing the WHEN quals and the action targetlist.
+         */
+        setNamespaceForMergeAction(pstate, action);
+
+        /*
+         * Transform the when condition.
+         *
+         * Note that these quals are NOT added to the join quals; instead they
+         * are evaluated separately during execution to decide which of the
+         * WHEN MATCHED or WHEN NOT MATCHED actions to execute.
+         */
+        action->qual = transformWhereClause(pstate, action->condition,
+                                            EXPR_KIND_MERGE_WHEN_AND, "WHEN");
+
+        /*
+         * Transform target lists for each INSERT and UPDATE action stmt
+         */
+        switch (action->commandType)
+        {
+            case CMD_INSERT:
+                {
+                    InsertStmt *istmt = (InsertStmt *) action->stmt;
+                    SelectStmt *selectStmt = (SelectStmt *) istmt->selectStmt;
+                    List       *exprList = NIL;
+                    ListCell   *lc;
+                    RangeTblEntry *rte;
+                    ListCell   *icols;
+                    ListCell   *attnos;
+                    List       *icolumns;
+                    List       *attrnos;
+
+                    pstate->p_is_insert = true;
+
+                    icolumns = checkInsertTargets(pstate, istmt->cols, &attrnos);
+                    Assert(list_length(icolumns) == list_length(attrnos));
+
+                    /*
+                     * Handle INSERT much like in transformInsertStmt
+                     */
+                    if (selectStmt == NULL)
+                    {
+                        /*
+                         * We have INSERT ... DEFAULT VALUES. We can handle
+                         * this case by emitting an empty targetlist --- all
+                         * columns will be defaulted when the planner expands
+                         * the targetlist.
+                         */
+                        exprList = NIL;
+                    }
+                    else
+                    {
+                        /*
+                         * Process INSERT ... VALUES with a single VALUES
+                         * sublist. We treat this case separately for
+                         * efficiency. The sublist is just computed directly
+                         * as the Query's targetlist, with no VALUES RTE. So
+                         * it works just like a SELECT without any FROM.
+                         */
+                        List       *valuesLists = selectStmt->valuesLists;
+
+                        Assert(list_length(valuesLists) == 1);
+                        Assert(selectStmt->intoClause == NULL);
+
+                        /*
+                         * Do basic expression transformation (same as a ROW()
+                         * expr, but allow SetToDefault at top level)
+                         */
+                        exprList = transformExpressionList(pstate,
+                                                           (List *) linitial(valuesLists),
+                                                           EXPR_KIND_VALUES_SINGLE,
+                                                           true);
+
+                        /* Prepare row for assignment to target table */
+                        exprList = transformInsertRow(pstate, exprList,
+                                                      istmt->cols,
+                                                      icolumns, attrnos,
+                                                      false);
+                    }

Can't we handle this with a little bit less code duplication?
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4c0256b18a4..608f50b0616 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -53,23 +53,34 @@ typedef enum LockTupleMode
  * When heap_update, heap_delete, or heap_lock_tuple fail because the target
  * tuple is already outdated, they fill in this struct to provide information
  * to the caller about what happened.
+ *
+ * result is the result of HeapTupleSatisfiesUpdate, leading to the failure.
+ * It's set to HeapTupleMayBeUpdated when there is no failure.
+ *
  * ctid is the target's ctid link: it is the same as the target's TID if the
  * target was deleted, or the location of the replacement tuple if the target
  * was updated.
+ *
  * xmax is the outdating transaction's XID. If the caller wants to visit the
  * replacement tuple, it must check that this matches before believing the
  * replacement is really a match.
+ *
  * cmax is the outdating command's CID, but only when the failure code is
  * HeapTupleSelfUpdated (i.e., something in the current transaction outdated
  * the tuple); otherwise cmax is zero. (We make this restriction because
  * HeapTupleHeaderGetCmax doesn't work for tuples outdated in other
  * transactions.)
+ *
+ * lockmode is only relevant for callers of heap_update() and is the mode which
+ * the caller should use in case it needs to lock the updated tuple.
  */
 typedef struct HeapUpdateFailureData
 {
+    HTSU_Result result;
     ItemPointerData ctid;
     TransactionId xmax;
     CommandId cmax;
+    LockTupleMode lockmode;
 } HeapUpdateFailureData;

These new fields seem not really related to HUFD, but rather just fields
the merge code should maintain?

Greetings,

Andres Freund
On Mon, Apr 2, 2018 at 7:18 PM, Andres Freund <andres@anarazel.de> wrote:
> I did a scan through this, as I hadn't been able to keep with the thread
> previously. Sorry if some of the things mentioned here have been
> discussed previously. I am just reading through the patch in its own
> order, so please excuse if there's things I remark on that only later
> fully make sense.
>
> later update: TL;DR: I don't think the parser / executor implementation
> of MERGE is architecturally sound. I think creating hidden joins during
> parse-analysis to implement MERGE is a seriously bad idea and it needs
> to be replaced by a different executor structure.

+1. I continue to have significant misgivings about this. It has many
consequences that we know about, and likely quite a few more that we
don't.

> So what happens if there's a concurrent insertion of a potentially
> matching tuple?

It's not a special case. In all likelihood, you get a dup violation.
This is a behavior that I argued for from an early stage.

> It seems fairly bad architecturally to me that we now have
> EvalPlanQual() loops in both this routine *and*
> ExecUpdate()/ExecDelete().

I think that that makes sense, because ExecMerge() doesn't use any of
the EPQ stuff from ExecUpdate() and ExecDelete(). It did at one point,
in fact, and it looked quite a lot worse IMV.

> *Gulp*. I'm extremely doubtful this is a sane approach. At the very
> least the amount of hackery required to make existing code cope with
> the approach / duplicate code seems indicative of that.

I made a lot of the fact that there are two RTEs for the target, since
that has fairly obvious consequences during every stage of query
processing. However, I think you're right that the problem is broader
than that.
> +    if (pstate->p_target_relation->rd_rel->relhasrules)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                 errmsg("MERGE is not supported for relations with rules")));
>
> I guess we can add that later, but it's a bit sad that this won't work
> against views with INSTEAD OF triggers.

It also makes it hard to test deparsing support, which actually made it
in in the end. That must be untested.

--
Peter Geoghegan
On Mon, Apr 2, 2018 at 10:40 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Apr 2, 2018 at 7:18 PM, Andres Freund <andres@anarazel.de> wrote:
>> I did a scan through this, as I hadn't been able to keep with the thread
>> previously. Sorry if some of the things mentioned here have been
>> discussed previously. I am just reading through the patch in its own
>> order, so please excuse if there's things I remark on that only later
>> fully make sense.
>>
>> later update: TL;DR: I don't think the parser / executor implementation
>> of MERGE is architecturally sound. I think creating hidden joins during
>> parse-analysis to implement MERGE is a seriously bad idea and it needs
>> to be replaced by a different executor structure.
>
> +1. I continue to have significant misgivings about this. It has many
> consequences that we know about, and likely quite a few more that we
> don't.

+1. I didn't understand from Peter's earlier comments that we were
doing that, and I agree that it isn't a good design choice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Apr 3, 2018 at 8:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 2, 2018 at 10:40 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Mon, Apr 2, 2018 at 7:18 PM, Andres Freund <andres@anarazel.de> wrote:
>>> I did a scan through this, as I hadn't been able to keep with the thread
>>> previously. Sorry if some of the things mentioned here have been
>>> discussed previously. I am just reading through the patch in its own
>>> order, so please excuse if there's things I remark on that only later
>>> fully make sense.
>>>
>>> later update: TL;DR: I don't think the parser / executor implementation
>>> of MERGE is architecturally sound. I think creating hidden joins during
>>> parse-analysis to implement MERGE is a seriously bad idea and it needs
>>> to be replaced by a different executor structure.
>>
>> +1. I continue to have significant misgivings about this. It has many
>> consequences that we know about, and likely quite a few more that we
>> don't.
>
> +1. I didn't understand from Peter's earlier comments that we were
> doing that, and I agree that it isn't a good design choice.
Honestly I don't think Peter ever raised concerns about the join, though I could be missing early discussions when I wasn't paying attention. It's there from day 1. Peter raised concerns about the two-RTE stuff, which was necessitated when we added support for partitioned tables. We discussed that at some length, with your inputs, and agreed that it's not necessarily a bad thing and probably the only way to deal with partitioned tables.
Personally, I don't see why an internal join is bad. That's what MERGE is doing anyways, so it closely matches with the overall procedure.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 2, 2018 at 8:11 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Honestly I don't think Peter ever raised concerns about the join, though I
> could be missing early discussions when I wasn't paying attention. It's
> there from day 1. Peter raised concerns about the two RTE stuff which was
> necessitated when we added support for partitioned table. We discussed that
> at some length, with your inputs and agreed that it's not necessarily a bad
> thing and probably the only way to deal with partitioned tables.
>
> Personally, I don't see why an internal join is bad. That's what MERGE is
> doing anyways, so it closely matches with the overall procedure.

The issue is not that there is a join as such. It's how it's
represented in the parser, and how that affects other things. There is
a lot of special case logic to make it work.

--
Peter Geoghegan
Hi,

On 2018-04-02 19:40:12 -0700, Peter Geoghegan wrote:
>> So what happens if there's a concurrent insertion of a potentially
>> matching tuple?
>
> It's not a special case. In all likelihood, you get a dup violation.
> This is a behavior that I argued for from an early stage.

Right. I think that should be mentioned in the comment...

Greetings,

Andres Freund
Hi Simon,

On 03/30/2018 07:10 AM, Simon Riggs wrote:
> No problems found, but moving proposed commit to 2 April pm

There is a warning for this, as attached.

Best regards,
Jesper
Attachment
On Tue, Apr 3, 2018 at 7:48 AM, Andres Freund <andres@anarazel.de> wrote:
@@ -3828,8 +3846,14 @@ struct AfterTriggersTableData
bool before_trig_done; /* did we already queue BS triggers? */
bool after_trig_done; /* did we already queue AS triggers? */
AfterTriggerEventList after_trig_events; /* if so, saved list pointer */
- Tuplestorestate *old_tuplestore; /* "old" transition table, if any */
- Tuplestorestate *new_tuplestore; /* "new" transition table, if any */
+ /* "old" transition table for UPDATE, if any */
+ Tuplestorestate *old_upd_tuplestore;
+ /* "new" transition table for UPDATE, if any */
+ Tuplestorestate *new_upd_tuplestore;
+ /* "old" transition table for DELETE, if any */
+ Tuplestorestate *old_del_tuplestore;
+ /* "new" transition table INSERT, if any */
+ Tuplestorestate *new_ins_tuplestore;
};
A comment somewhere why we have all of these (presumably because they
can now happen in the context of a single statement) would be good.
Done.
@@ -5744,12 +5796,28 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
newtup == NULL));
if (oldtup != NULL &&
- ((event == TRIGGER_EVENT_DELETE && delete_old_table) ||
- (event == TRIGGER_EVENT_UPDATE && update_old_table)))
+ (event == TRIGGER_EVENT_DELETE && delete_old_table))
{
Tuplestorestate *old_tuplestore;
- old_tuplestore = transition_capture->tcs_private->old_tuplestore;
+ old_tuplestore = transition_capture->tcs_private->old_del_tuplestore;
+
+ if (map != NULL)
+ {
+ HeapTuple converted = do_convert_tuple(oldtup, map);
+
+ tuplestore_puttuple(old_tuplestore, converted);
+ pfree(converted);
+ }
+ else
+ tuplestore_puttuple(old_tuplestore, oldtup);
+ }
Very similar code is now repeated four times. Could you abstract this
into a separate function?
Ok. I gave it a try, please check. It's definitely a lot less code.
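For what it's worth, the shape of that abstraction — one helper replacing the four near-identical "convert if a map is present, then put the tuple in the store" blocks — can be sketched in standalone C. The `Tuple`/`ConversionMap`/`Store` types below are invented toy stand-ins for HeapTuple, TupleConversionMap and Tuplestorestate, not the real PostgreSQL structures:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT the real PostgreSQL types. */
typedef struct { int vals[4]; int nvals; } Tuple;
typedef struct { int perm[4]; } ConversionMap;  /* column permutation */
typedef struct { Tuple rows[16]; int nrows; } Store;

/*
 * Convert a tuple through an optional map, then append it to the store.
 * This is the shape of the helper that folds the four duplicated
 * "if (map != NULL) do_convert_tuple(); tuplestore_puttuple();" blocks
 * into one place.
 */
static void
capture_tuple(Store *store, const Tuple *tup, const ConversionMap *map)
{
    Tuple converted;

    if (map != NULL)
    {
        converted = *tup;
        for (int i = 0; i < tup->nvals; i++)
            converted.vals[i] = tup->vals[map->perm[i]];
        tup = &converted;
    }
    store->rows[store->nrows++] = *tup;
}
```

Each of the four call sites then just picks the right store (old/new × update/delete/insert) and calls the helper.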
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -37,6 +37,16 @@ the plan tree returns the computed tuples to be updated, plus a "junk"
one. For DELETE, the plan tree need only deliver a CTID column, and the
ModifyTable node visits each of those rows and marks the row deleted.
+MERGE runs one generic plan that returns candidate target rows. Each row
+consists of a super-row that contains all the columns needed by any of the
+individual actions, plus a CTID and a TABLEOID junk columns. The CTID column is
"plus a CTID and a TABLEOID junk columns" sounds a bit awkward?
Changed.
+required to know if a matching target row was found or not and the TABLEOID
+column is needed to find the underlying target partition, in case when the
+target table is a partition table. If the CTID column is set we attempt to
+activate WHEN MATCHED actions, or if it is NULL then we will attempt to
"if it is NULL then" sounds wrong.
Made some adjustments.
+activate WHEN NOT MATCHED actions. Once we know which action is activated we
+form the final result row and apply only those changes.
+
XXX a great deal more documentation needs to be written here...
Are we using tableoids in a similar way in other places?
AFAIK, no.
+/*
+ * Given OID of the partition leaf, return the index of the leaf in the
+ * partition hierarchy.
+ */
+int
+ExecFindPartitionByOid(PartitionTupleRouting *proute, Oid partoid)
+{
+ int i;
+
+ for (i = 0; i < proute->num_partitions; i++)
+ {
+ if (proute->partition_oids[i] == partoid)
+ break;
+ }
+
+ Assert(i < proute->num_partitions);
+ return i;
+}
Shouldn't we at least warn in a comment that this is O(N)? And document
that it does weird stuff if the OID isn't found?
Yeah, added a comment. Also added an ereport(ERROR) if we do not find the partition. There was already an Assert, but maybe ERROR is better.
Perhaps just introduce a PARTOID syscache?
Probably as a separate patch. Anything more than a handful of partitions is already known to be too slow, and I doubt this code will add any material impact to that.
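To illustrate the O(N) concern in miniature, here is a hedged standalone sketch contrasting a linear scan of the shape used by ExecFindPartitionByOid() with a binary search over a pre-sorted copy; a PARTOID syscache (a hash table) would effectively give O(1) amortized lookups instead. All types here are toys, not the real PartitionTupleRouting:

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;

/* Linear scan, the shape of the committed code: O(N) per lookup. */
static int
find_partition_linear(const Oid *oids, int n, Oid target)
{
    for (int i = 0; i < n; i++)
        if (oids[i] == target)
            return i;
    return -1;                  /* caller should raise an error */
}

static int
cmp_oid(const void *a, const void *b)
{
    Oid x = *(const Oid *) a;
    Oid y = *(const Oid *) b;

    return (x > y) - (x < y);
}

/*
 * O(log N) alternative over an array sorted once up front. A real
 * version would carry the original partition index alongside each OID
 * (or use a hash table, as a syscache would).
 */
static int
find_partition_bsearch(const Oid *sorted, int n, Oid target)
{
    const Oid *hit = bsearch(&target, sorted, (size_t) n,
                             sizeof(Oid), cmp_oid);

    return hit ? (int) (hit - sorted) : -1;
}
```

With a handful of partitions the difference is noise, but with thousands of partitions the linear scan is paid once per routed row.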
diff --git a/src/backend/executor/nodeMerge.c b/src/backend/executor/nodeMerge.c
new file mode 100644
index 00000000000..0e0d0795d4d
--- /dev/null
+++ b/src/backend/executor/nodeMerge.c
@@ -0,0 +1,575 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeMerge.c
+ * routines to handle Merge nodes relating to the MERGE command
Isn't this file misnamed and it should be execMerge.c? The rest of the
node*.c files are for nodes that are invoked via execProcnode /
ExecProcNode(). This isn't an actual executor node.
Makes sense. Done. (Now that the patch is committed, I don't know if there would be a rethink about changing file names. May be not, just raising that concern)
Might also be worthwhile to move the merge related init stuff here?
Done.
What's the reasoning behind making this be an anomaluous type of
executor node?
Didn't quite get that. I think the naming of the file was bad (fixed now), but I think it's a good idea to move the new code into a new file, from a maintainability as well as a coverage perspective. If you have something very different in mind, can you explain in more detail?
FWIW, I'd re-order this file so this routine is above
ExecMergeMatched(), ExecMergeNotMatched(), easier to understand.
Done.
+ /*
+ * If there are not WHEN MATCHED actions, we are done.
+ */
+ if (mergeMatchedActionStates == NIL)
+ return true;
Maybe I'm confused, but why is mergeMatchedActionStates attached to
per-partition info? How can this differ in number between partitions,
requiring us to re-check it below fetching the partition info above?
Because each partition may have columns in a different order, dropped attributes, etc. So we need to give separate treatment to the quals/targetlists. See ON CONFLICT DO UPDATE for similar code.
+ /*
+ * Check for any concurrent update/delete operation which may have
+ * prevented our update/delete. We also check for situations where we
+ * might be trying to update/delete the same tuple twice.
+ */
+ if ((action->commandType == CMD_UPDATE && !tuple_updated) ||
+ (action->commandType == CMD_DELETE && !tuple_deleted))
+
+ {
+ switch (hufd.result)
+ {
+ case HeapTupleMayBeUpdated:
+ break;
+ case HeapTupleInvisible:
+
+ /*
+ * This state should never be reached since the underlying
+ * JOIN runs with a MVCC snapshot and should only return
+ * rows visible to us.
+ */
Given EPQ, that reasoning isn't correct. I think this should still be
unreachable, just not for the reason described here.
Agreed. Updated the comment, but please check whether it's satisfactory or you would like to say something more/different.
It seems fairly bad architecturally to me that we now have
EvalPlanQual() loops in both this routine *and*
ExecUpdate()/ExecDelete().
This was done after review by Peter and I think I like the new way too. It also keeps the regular UPDATE/DELETE code paths least changed and lets MERGE handle the concurrency issues specific to it.
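The control flow under discussion — reacting to `hufd.result` after a failed update/delete — can be modelled in a few lines of standalone C. The enum names mirror HTSU_Result from the patch, but the dispatch itself is purely an illustration of the structure, not the actual executor logic:

```c
#include <assert.h>

/* Toy mirror of HTSU_Result; NOT the real PostgreSQL definition. */
typedef enum
{
    HeapTupleMayBeUpdated,
    HeapTupleInvisible,
    HeapTupleSelfUpdated,
    HeapTupleUpdated            /* concurrently updated by another xact */
} HTSU_Result;

typedef enum
{
    MERGE_DONE,                 /* nothing further to do */
    MERGE_RETRY,                /* re-fetch the row, re-run WHEN checks */
    MERGE_ERROR                 /* raise an error */
} MergeOutcome;

/* Sketch of how ExecMergeMatched() might react to a failed update/delete. */
static MergeOutcome
react_to_failure(HTSU_Result result)
{
    switch (result)
    {
        case HeapTupleMayBeUpdated:
            return MERGE_DONE;  /* no failure after all */
        case HeapTupleSelfUpdated:
            return MERGE_ERROR; /* same row affected twice in one MERGE */
        case HeapTupleUpdated:
            return MERGE_RETRY; /* EPQ-style recheck against new version */
        case HeapTupleInvisible:
        default:
            return MERGE_ERROR; /* should be unreachable under MVCC scan */
    }
}
```

The point of contention is where this switch lives: keeping it in the MERGE code means ExecUpdate()/ExecDelete() stay almost untouched, at the cost of a second EPQ loop.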
+
+/*
+ * Execute the first qualifying NOT MATCHED action.
+ */
+static void
+ExecMergeNotMatched(ModifyTableState *mtstate, EState *estate,
+ TupleTableSlot *slot)
+{
+ PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
+ ExprContext *econtext = mtstate->ps.ps_ExprContext;
+ List *mergeNotMatchedActionStates = NIL;
+ ResultRelInfo *resultRelInfo;
+ ListCell *l;
+ TupleTableSlot *myslot;
+
+ /*
+ * We are dealing with NOT MATCHED tuple. Since for MERGE, partition tree
*the partition tree
Fixed.
+ * is not expanded for the result relation, we continue to work with the
+ * currently active result relation, which should be of the root of the
+ * partition tree.
+ */
+ resultRelInfo = mtstate->resultRelInfo;
"should be"? "is", I hope? Given that it's referencing
mtstate->resultRelInfo which is only set in one place...
Yeah, fixed.
+/* flags for mt_merge_subcommands */
+#define MERGE_INSERT 0x01
+#define MERGE_UPDATE 0x02
+#define MERGE_DELETE 0x04
Hm, it seems a bit weird to define these file-local, given there's
another file implementing a good chunk of merge.
Ok. Moved them to execMerge.h, which made sense after I moved the initialisation-related code to execMerge.c.
+ /*
+ * Initialize hufdp. Since the caller is only interested in the failure
+ * status, initialize with the state that is used to indicate successful
+ * operation.
+ */
+ if (hufdp)
+ hufdp->result = HeapTupleMayBeUpdated;
+
This signals for me that the addition of the result field to HUFD wasn't
architecturally the right thing. HUFD is supposed to be
returned by heap_update(); reusing and setting it from other places
seems like a layering violation to me.
I am not sure I agree. Sure we can keep adding more parameters to ExecUpdate/ExecDelete and such routines, but I thought passing back all information via a single struct makes more sense.
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c9..cd540a0df5b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -851,6 +851,60 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
fix_scan_list(root, splan->exclRelTlist, rtoffset);
}
+ /*
+ * The MERGE produces the target rows by performing a right
s/the MERGE/the MERGE statement/?
Fixed.
+ * join between the target relation and the source relation
+ * (which could be a plain relation or a subquery). The INSERT
+ * and UPDATE actions of the MERGE requires access to the
same.
Fixed.
+opt_and_condition:
Renaming this to be a bit more merge specific seems like a good idea.
Renamed to opt_merge_when_and_condition.
+
+ /*
+ * Add a whole-row-Var entry to support references to "source.*".
+ */
+ var = makeWholeRowVar(rt_rte, rt_rtindex, 0, false);
+ te = makeTargetEntry((Expr *) var, list_length(*mergeSourceTargetList) + 1,
+ NULL, true);
Why can't this properly dealt with by transformWholeRowRef() etc?
I just followed the ON CONFLICT style. Maybe there is a better way, but it's not clear how.
+ if (selectStmt && (selectStmt->valuesLists == NIL ||
+ selectStmt->sortClause != NIL ||
+ selectStmt->limitOffset != NULL ||
+ selectStmt->limitCount != NULL ||
+ selectStmt->lockingClause != NIL ||
+ selectStmt->withClause != NULL))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("SELECT not allowed in MERGE INSERT statement")));
+ if (selectStmt && list_length(selectStmt->valuesLists) > 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("Multiple VALUES clauses not allowed in MERGE INSERT statement")));
Shouldn't this include an error position?
I will work on this as a separate follow-up patch. I tried adding a location to MergeAction, but that alone is probably not sufficient. So it was turning out to be a slightly bigger patch than I anticipated.
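The two quoted checks boil down to a small predicate over the shape of the INSERT source; here is a standalone sketch with invented toy fields (not the real SelectStmt parse node):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the SelectStmt fields the quoted checks inspect. */
typedef struct
{
    int  n_values_lists;        /* number of VALUES sublists */
    bool has_sort;              /* ORDER BY attached */
    bool has_limit;             /* LIMIT/OFFSET attached */
    bool has_locking;           /* FOR UPDATE etc. attached */
    bool has_with;              /* WITH attached */
} InsertSource;

/*
 * Mirrors the two ereport() branches: returns 0 when the source is a
 * plain single VALUES list (or absent, i.e. DEFAULT VALUES), 1 for the
 * "SELECT not allowed in MERGE INSERT" case, and 2 for "multiple
 * VALUES clauses not allowed".
 */
static int
check_merge_insert_source(const InsertSource *src)
{
    if (src == NULL)
        return 0;               /* INSERT ... DEFAULT VALUES */
    if (src->n_values_lists == 0 || src->has_sort || src->has_limit ||
        src->has_locking || src->has_with)
        return 1;
    if (src->n_values_lists > 1)
        return 2;
    return 0;
}
```

Attaching a parse location to the offending clause would let the second and third cases point the user at the exact token, which is what the follow-up patch would add.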
Why is this, and not building a proper executor node for merge that
knows how to get the tuples, the right approach? We did a rough
equivalent for matview updates, and I think it turned out to be a pretty
bad plan.
I am still not sure why that would be any better. Can you explain in detail what exactly you have in mind and how it's significantly better than what we have today?
+ /*
+ * XXX MERGE is unsupported in various cases
+ */
+ if (!(pstate->p_target_relation->rd_rel->relkind == RELKIND_RELATION ||
+ pstate->p_target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("MERGE is not supported for this relation type")));
Shouldn't this report what relation type the error talking about? It's
not going to necessarily be obvious to the user. Also, errposition etc
would be good.
Hmm, ok. There is getRelationTypeDescription() but otherwise I am surprised that I couldn't find a ready way to get a string representation of a relation kind. Am I missing something? Of course, we can write one for the purpose, but I wanted to ensure we are not duplicating something already available.
+ /*
+ * Do basic expression transformation (same as a ROW()
+ * expr, but allow SetToDefault at top level)
+ */
+ exprList = transformExpressionList(pstate,
+ (List *) linitial(valuesLists),
+ EXPR_KIND_VALUES_SINGLE,
+ true);
+
+ /* Prepare row for assignment to target table */
+ exprList = transformInsertRow(pstate, exprList,
+ istmt->cols,
+ icolumns, attrnos,
+ false);
+ }
Can't we handle this with a little bit less code duplication?
Hmm, yeah, that will be good. But given Tom's suggestions on the other thread, I would like to postpone any refactoring here.
typedef struct HeapUpdateFailureData
{
+ HTSU_Result result;
ItemPointerData ctid;
TransactionId xmax;
CommandId cmax;
+ LockTupleMode lockmode;
} HeapUpdateFailureData;
These new fields seem not really related to HUFD, but rather just fields
the merge code should maintain?
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 5 April 2018 at 07:01, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>> +/*
>> + * Given OID of the partition leaf, return the index of the leaf in the
>> + * partition hierarchy.
>> + */
>> +int
>> +ExecFindPartitionByOid(PartitionTupleRouting *proute, Oid partoid)
>> +{
>> +    int    i;
>> +
>> +    for (i = 0; i < proute->num_partitions; i++)
>> +    {
>> +        if (proute->partition_oids[i] == partoid)
>> +            break;
>> +    }
>> +
>> +    Assert(i < proute->num_partitions);
>> +    return i;
>> +}
>>
>> Shouldn't we at least warn in a comment that this is O(N)? And document
>> that it does weird stuff if the OID isn't found?
>
> Yeah, added a comment. Also added a ereport(ERROR) if we do not find the
> partition. There was already an Assert, but may be ERROR is better.
>
>> Perhaps just introduce a PARTOID syscache?
>
> Probably as a separate patch. Anything more than a handful partitions is
> anyways known to be too slow and I doubt this code will add anything
> material impact to that.

There's a few others trying to change that now, so I think we should
consider working on this now. PARTOID syscache sounds like a good
approach.

>> diff --git a/src/backend/executor/nodeMerge.c
>> b/src/backend/executor/nodeMerge.c
>> new file mode 100644
>> index 00000000000..0e0d0795d4d
>> --- /dev/null
>> +++ b/src/backend/executor/nodeMerge.c
>> @@ -0,0 +1,575 @@
>> +/*-------------------------------------------------------------------------
>> + *
>> + * nodeMerge.c
>> + *    routines to handle Merge nodes relating to the MERGE command
>>
>> Isn't this file misnamed and it should be execMerge.c? The rest of the
>> node*.c files are for nodes that are invoked via execProcnode /
>> ExecProcNode(). This isn't an actual executor node.
>
> Makes sense. Done. (Now that the patch is committed, I don't know if there
> would be a rethink about changing file names. May be not, just raising that
> concern)

My review notes suggest a file called execMerge.c. I didn't spot the
filename change.
I think it's important to do that because there is no executor node called Merge. That is especially confusing because there *is* an executor node called MergeAppend and we want some cognitive distance between those things. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Simon and Pavan, On 04/04/2018 08:46 AM, Jesper Pedersen wrote: > On 03/30/2018 07:10 AM, Simon Riggs wrote: >> No problems found, but moving proposed commit to 2 April pm >> > > There is a warning for this, as attached. > Updated version due to latest refactoring. Best regards, Jesper
Attachment
On 5 April 2018 at 12:38, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > Hi Simon and Pavan, > > On 04/04/2018 08:46 AM, Jesper Pedersen wrote: >> >> On 03/30/2018 07:10 AM, Simon Riggs wrote: >>> >>> No problems found, but moving proposed commit to 2 April pm >>> >> >> There is a warning for this, as attached. >> > > Updated version due to latest refactoring. Thanks for your input. Removing that seems to prevent compilation. Did something change in between? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 5, 2018 at 5:08 PM, Jesper Pedersen <jesper.pedersen@redhat.com> wrote:
Hi Simon and Pavan,
On 04/04/2018 08:46 AM, Jesper Pedersen wrote: On 03/30/2018 07:10 AM, Simon Riggs wrote: No problems found, but moving proposed commit to 2 April pm
There is a warning for this, as attached.
Updated version due to latest refactoring.
The variable would become unused in non-assert builds. I see that. But simply removing it is not a solution and I don't think the code will compile that way. We should either rewrite that assertion, put it inside a #ifdef USE_ASSERT_CHECKING block, or simply remove that assertion because we already check for relkind in parse_merge.c. Will check.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 04/05/2018 07:48 AM, Simon Riggs wrote: >> Updated version due to latest refactoring. > > Thanks for your input. Removing that seems to prevent compilation. > > Did something change in between? > Updated for non-assert build. Best regards, Jesper
Attachment
On 5 April 2018 at 12:56, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > Hi, > > On 04/05/2018 07:48 AM, Simon Riggs wrote: >>> >>> Updated version due to latest refactoring. >> >> >> Thanks for your input. Removing that seems to prevent compilation. >> >> Did something change in between? >> > > Updated for non-assert build. Thanks, pushed. Sorry to have you wait til v3 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> The variable would become unused in non-assert builds. I see that. But simply > removing it is not a solution and I don't think the code will compile that way. > We should either rewrite that assertion or put it inside a #ifdef > USE_ASSERT_CHECKING block or simply remove that assertion because we already check > for relkind in parse_merge.c. Will check. > That is noted by Kyotaro HORIGUCHI https://www.postgresql.org/message-id/20180405.181730.125855581.horiguchi.kyotaro%40lab.ntt.co.jp and his suggestion to use a special macro looks better to me: - char relkind; + char relkind PG_USED_FOR_ASSERTS_ONLY; -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Hi, On 04/05/2018 08:04 AM, Simon Riggs wrote: > On 5 April 2018 at 12:56, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: >> Updated for non-assert build. > > Thanks, pushed. Sorry to have you wait til v3 > That patch was a bit rushed, and cut off too much. As attached. Best regards, Jesper
Attachment
On 5 April 2018 at 13:19, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > Hi, > > On 04/05/2018 08:04 AM, Simon Riggs wrote: >> >> On 5 April 2018 at 12:56, Jesper Pedersen <jesper.pedersen@redhat.com> >> wrote: >>> >>> Updated for non-assert build. >> >> >> Thanks, pushed. Sorry to have you wait til v3 >> > > That patch was a bit rushed, and cut off too much. Yes, noted, already fixed. Thanks -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5 April 2018 at 13:18, Teodor Sigaev <teodor@sigaev.ru> wrote: >> The variable would become unused in non-assert builds. I see that. But >> simply removing it is not a solution and I don't think the code will compile >> that way. We should either rewrite that assertion or put it inside a #ifdef >> ASSERT_CHECKING block or simple remove that assertion because we already >> check for relkind in parse_merge.c. Will check. >> > > That is noted by Kyotaro HORIGUCHI > https://www.postgresql.org/message-id/20180405.181730.125855581.horiguchi.kyotaro%40lab.ntt.co.jp > > and his suggestion to use special macro looks better for me: > - char relkind; > + char relkind PG_USED_FOR_ASSERTS_ONLY; Thanks both, I already fixed that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Quick item: parse_clause.h fails cpluspluscheck because it has a C++ keyword as a function argument name: ./src/include/parser/parse_clause.h:26:14: error: expected ‘,’ or ‘...’ before ‘namespace’ List **namespace); ^~~~~~~~~ -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5 April 2018 at 16:09, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Quick item: parse_clause.h fails cpluspluscheck because it has a C++ > keyword as a function argument name: > > ./src/include/parser/parse_clause.h:26:14: error: expected ‘,’ or ‘...’ before ‘namespace’ > List **namespace); > ^~~~~~~~~ How's this as a fix? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Simon Riggs wrote: > On 5 April 2018 at 16:09, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Quick item: parse_clause.h fails cpluspluscheck because it has a C++ > > keyword as a function argument name: > > > > ./src/include/parser/parse_clause.h:26:14: error: expected ‘,’ or ‘...’ before ‘namespace’ > > List **namespace); > > ^~~~~~~~~ > > How's this as a fix? WFM -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5 April 2018 at 17:07, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Simon Riggs wrote: >> On 5 April 2018 at 16:09, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> > Quick item: parse_clause.h fails cpluspluscheck because it has a C++ >> > keyword as a function argument name: >> > >> > ./src/include/parser/parse_clause.h:26:14: error: expected ‘,’ or ‘...’ before ‘namespace’ >> > List **namespace); >> > ^~~~~~~~~ >> >> How's this as a fix? > > WFM Pushed -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote:
>> +/*-------------------------------------------------------------------------
>> + *
>> + * nodeMerge.c
>> + *      routines to handle Merge nodes relating to the MERGE command
>>
>> Isn't this file misnamed and it should be execMerge.c? The rest of the
>> node*.c files are for nodes that are invoked via execProcnode /
>> ExecProcNode(). This isn't an actual executor node.
>
> Makes sense. Done. (Now that the patch is committed, I don't know if there
> would be a rethink about changing file names. Maybe not, just raising that
> concern)

It absolutely definitely needed to be renamed. But that's been done, so ...

>> What's the reasoning behind making this be an anomalous type of
>> executor node?
> Didn't quite get that. I think naming of the file was bad (fixed now), but
> I think it's a good idea to move the new code in a new file, from
> maintainability as well as coverage perspective. If you've something very
> different in mind, can you explain in more detail?

Well, it was kinda modeled as an executor node, including the file name.
That's somewhat fixed now. I still am extremely suspicious of the code
flow here.

My impression is that this simply shouldn't be going through
nodeModifyTable, but be its own nodeMerge node. The trigger handling would
need to be abstracted into execTrigger.c or such to avoid duplication.
We're now going from nodeModifyTable.c:ExecModifyTable() into
execMerge.c:ExecMerge(), back to nodeModifyTable.c:ExecUpdate/Insert(). To
avoid ExecInsert() doing things that aren't appropriate for merge, we then
pass an actionState, which neuters part of ExecUpdate/Insert(). This is
just bad.

I think this should be cleaned up before it can be released, which means
this feature should be reverted and cleaned up nicely before being
re-committed. Otherwise we'll sit on this bad architecture and it'll make
future features harder.
And if somebody cleans it up Simon will complain that things are
needlessly being destabilized (hello xlog.c with its 1000+ LOC functions).

>> +    /*
>> +     * If there are not WHEN MATCHED actions, we are done.
>> +     */
>> +    if (mergeMatchedActionStates == NIL)
>> +        return true;
>>
>> Maybe I'm confused, but why is mergeMatchedActionStates attached to
>> per-partition info? How can this differ in number between partitions,
>> requiring us to re-check it below fetching the partition info above?
>
> Because each partition may have columns in a different order, dropped
> attributes etc. So we need to give treatment to the quals/targetlists.
> See ON CONFLICT DO UPDATE for similar code.

But the count wouldn't change, no? So we return before building the
partition info if there's no MATCHED action?

>> It seems fairly bad architecturally to me that we now have
>> EvalPlanQual() loops in both this routine *and*
>> ExecUpdate()/ExecDelete().
>
> This was done after review by Peter and I think I like the new way too.
> Also keeps the regular UPDATE/DELETE code paths least changed and lets
> Merge handle concurrency issues specific to it.

It also makes the whole code barely readable. A minimal amount of changes
isn't a bad goal, but if the consequence of that is poor layering and
repeated code it's bad as well.

>> +    /*
>> +     * Initialize hufdp. Since the caller is only interested in the failure
>> +     * status, initialize with the state that is used to indicate successful
>> +     * operation.
>> +     */
>> +    if (hufdp)
>> +        hufdp->result = HeapTupleMayBeUpdated;
>> +
>>
>> This signals for me that the addition of the result field to HUFD wasn't
>> architecturally the right thing. HUFD is supposed to be returned by
>> heap_update(); reusing and setting it from other places seems like a
>> layering violation to me.
>
> I am not sure I agree.
> Sure we can keep adding more parameters to ExecUpdate/ExecDelete and
> such routines, but I thought passing back all information via a single
> struct makes more sense.

You can just wrap HUFD in another struct that has the necessary
information.

>> +    /*
>> +     * Add a whole-row-Var entry to support references to "source.*".
>> +     */
>> +    var = makeWholeRowVar(rt_rte, rt_rtindex, 0, false);
>> +    te = makeTargetEntry((Expr *) var, list_length(*mergeSourceTargetList) + 1,
>> +                         NULL, true);
>>
>> Why can't this be properly dealt with by transformWholeRowRef() etc?
>
> I just followed ON CONFLICT style. Maybe there is a better way, but it's
> not clear how. What code are you referring to?

>> Why is this, and not building a proper executor node for merge that
>> knows how to get the tuples, the right approach? We did a rough
>> equivalent for matview updates, and I think it turned out to be a pretty
>> bad plan.
>
> I am still not sure why that would be any better. Can you explain in
> detail what exactly you've in mind and how that's significantly better
> than what we have today?

Because the code flow would be understandable, we'd have proper parser
locations, we'd not have to introduce a new query source type, the
deparsing would be less painful, we'd not have optimizer-like hijinks in
parse analysis, etc. In my opinion you're attempting to do planner /
optimizer stuff at the parse-analysis level, and that's just not a good
idea.

Greetings,

Andres Freund
On Fri, Apr 6, 2018 at 1:30 AM, Andres Freund <andres@anarazel.de> wrote:
--
My impression is that this simply shouldn't be going through
nodeModifyTable, but be its own nodeMerge node. The trigger handling
would need to be abstracted into execTrigger.c or such to avoid
duplication. We're now going from nodeModifyTable.c:ExecModifyTable()
into execMerge.c:ExecMerge(), back to
nodeModifyTable.c:ExecUpdate/Insert(). To avoid ExecInsert() doing
things that aren't appropriate for merge, we then pass an actionState,
which neuters part of ExecUpdate/Insert(). This is just bad.
But wouldn't this lead to a lot of code duplication? For example, ExecInsert/ExecUpdate does a bunch of supporting work (firing triggers, inserting into indexes, to name just a few) that MERGE's INSERT/UPDATE needs as well. Now we can possibly move these support routines to a new file, say execModify.c, and then let both Merge as well as ModifyTable node make use of that. But the fact is that ExecInsert/ExecUpdate knows a lot about ModifyTable already. So separating ExecInsert/ExecUpdate from ModifyTable will require significant refactoring AFAICS. I am not saying we can't do that, but it will have its own consequences.
If we had not refactored to move ExecMerge and friends to a new file, then it may not have looked so odd (as you describe above). But I think moving the code to a new file was a net improvement. Maybe we can move ExecInsert/Update etc. to a new file as I suggested, but still use ModifyTable to run Merge. There are many things common between them. ModifyTable executes all DMLs and MERGE is just another DML which can run all three.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/04/06 5:00, Andres Freund wrote: > On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote: >>> + /* >>> + * If there are not WHEN MATCHED actions, we are done. >>> + */ >>> + if (mergeMatchedActionStates == NIL) >>> + return true; >>> >>> Maybe I'm confused, but why is mergeMatchedActionStates attached to >>> per-partition info? How can this differ in number between partitions, >>> requiring us to re-check it below fetching the partition info above? >>> >>> >> Because each partition may have a columns in different order, dropped >> attributes etc. So we need to give treatment to the quals/targetlists. See >> ON CONFLICT DO UPDATE for similar code. > > But the count wouldn't change, no? So we return before building the > partition info if there's no MATCHED action? Yeah, I think we should return at the top if there are no matched actions. With the current code, even if there aren't any matched actions we're doing ExecFindPartitionByOid() and ExecInitPartitionInfo() to only then see in the partition's resultRelInfo that the action states list is empty. I think there should be an Assert where there currently is the check for empty action states list and the check itself should be at the top of the function, as done in the attached. Thanks, Amit
Attachment
On 5 April 2018 at 21:00, Andres Freund <andres@anarazel.de> wrote: > And if somebody cleans it up Simon will > complain that things are needlessly being destabilized (hello xlog.c > with its 1000+ LOC functions). Taking this comment as a special point... with other points addressed separately. ...If anybody wants to know what I think they can just ask... There are various parts of the code that don't have full test coverage. Changing things there is more dangerous, and if it's being done for no reason then that is risk for little reward. That doesn't block it, but it does make me think that people could spend their time better - an issue that concerns me also. But that certainly doesn't apply to parts of the code like this where we have full test coverage. It may not even apply to recovery now we have the ability to check in real-time the results of recovery and replication. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Apr 6, 2018 at 1:30 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote:
> > +/*---------------------------------------------------------
> > ----------------
> > + *
> > + * nodeMerge.c
> > + * routines to handle Merge nodes relating to the MERGE command
> >
> > Isn't this file misnamed and it should be execMerge.c? The rest of the
> > node*.c files are for nodes that are invoked via execProcnode /
> > ExecProcNode(). This isn't an actual executor node.
> >
>
> Makes sense. Done. (Now that the patch is committed, I don't know if there
> would be a rethink about changing file names. May be not, just raising that
> concern)
It absolutely definitely needed to be renamed. But that's been done, so
...
> > What's the reasoning behind making this be an anomalous type of
> > executor node?
> Didn't quite get that. I think naming of the file was bad (fixed now), but
> I think it's a good idea to move the new code in a new file, from
> maintainability as well as coverage perspective. If you've something very
> different in mind, can you explain in more details?
Well, it was kinda modeled as an executor node, including the file
name. That's somewhat fixed now. I still am extremely suspicious of
the code flow here.
My impression is that this simply shouldn't be going through
nodeModifyTable, but be its own nodeMerge node. The trigger handling
would need to be abstracted into execTrigger.c
There is nothing called execTrigger.c. You probably mean creating something new or trigger.c. That being said, I don't think the current code is bad. There is already a switch to handle different command types. We add one more for CMD_MERGE and then fire individual BEFORE/AFTER statement triggers based on what actions MERGE may take. The ROW trigger handling doesn't require any change.
or such to avoid
duplication. We're now going from nodeModifyTable.c:ExecModifyTable()
into execMerge.c:ExecMerge(), back to
nodeModifyTable.c:ExecUpdate/Insert(). To avoid ExecInsert() doing
things that aren't appropriate for merge, we then pass an actionState,
which neuters part of ExecUpdate/Insert(). This is just bad.
I have spent most part of today trying to hack around the executor with the goal to do what you're suggesting. Having done so, I must admit I am actually quite puzzled why you think having a new node is going to be any significant improvement.
I first tried to split nodeModifyTable.c into two parts. One that deals with just ModifyTable node (the exported Init/Exec/End functions to start with) and the other which abstracts out the actual Insert/Update/Delete operations. The idea was that the second part shouldn't know anything about ModifyTable and it being just one of the users. I started increasingly disliking the result as I continued hacking. The knowledge of ModifyTable node is quite wide-spread, including execPartition.c and FDW. The ExecInsert/Update, partitioning, ON CONFLICT handling all make use of ModifyTable extensively. While MERGE does not need ON CONFLICT, it needs everything else, even FDW at some point in future. Do you agree that this is a bad choice or am I getting it completely wrong or do you really expect MERGE to completely refactor nodeModifyTable.c, including the partitioning related code?
I gave up on this approach.
I then started working on other possibility where we keep a ModifyTable node inside a MergeTable (that's the name I am using, for now). The paths/plans/executor state gets built that way. So all common members still belong to the ModifyTable (and friends) and we only add MERGE specific information to MergeTable (and friends).
This, at least looks somewhat better. We can have:
- ExecInitMergeTable(): does some basic initialisation of the embedded ModifyTable node so that it can be passed around
- ExecMergeTable(): executes the (only) subplan, fetches tuples, fires BR triggers, does work for handling transition tables (so mostly duplication of what ExecModifyTable does already) and then executes the MERGE actions. It will mostly use the ModifyTable node created during initialisation when calling ExecUpdate/Insert etc
- ExecEndMergeTable(): ends the executor
This is probably far better than the first approach. But is it really a huge improvement over the committed code? Or even an improvement at all?
If that's not what you've in mind, can you please explain in more detail how to you see the final design and how exactly it's better than what we have today? Once there is clarity, I can work on it in a fairly quick manner.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,

sqlsmith triggered an assertion with the following MERGE statement against
the regression database. Testing was done with master at 039eb6e92f.
Backtrace below.

regards,
Andreas

MERGE INTO public.pagg_tab_ml_p3 as target_0
USING public.hash_i4_heap as ref_0
ON target_0.b = ref_0.seqno
WHEN MATCHED
  AND ((select bitcol from public.brintest limit 1 offset 92) > cast(null as "bit"))
  AND (false)
  THEN UPDATE set b = target_0.b, a = target_0.b
WHEN NOT MATCHED
  AND cast(null as text) ~ cast(nullif(case when cast(null as float8) <= cast(null as float8)
        then cast(null as text) else cast(null as text) end, cast(null as text)) as text)
  THEN DO NOTHING;

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f25474cf42a in __GI_abort () at abort.c:89
#2  0x0000556c14b75bb3 in ExceptionalCondition (conditionName=conditionName@entry=0x556c14d09bf8 "!(list != ((List *) ((void *)0)))", errorType=errorType@entry=0x556c14bc4dbd "FailedAssertion", fileName=fileName@entry=0x556c14d3022c "list.c", lineNumber=lineNumber@entry=390) at assert.c:54
#3  0x0000556c1495d580 in list_nth_cell (list=<optimized out>, n=<optimized out>) at list.c:390
#4  0x0000556c1495d5d6 in list_nth (list=list@entry=0x0, n=<optimized out>) at list.c:413
#5  0x0000556c14911fa5 in adjust_partition_tlist (tlist=0x0, map=<optimized out>, map=<optimized out>) at execPartition.c:1266
#6  0x0000556c14913049 in ExecInitPartitionInfo (mtstate=mtstate@entry=0x556c16c163e8, resultRelInfo=<optimized out>, proute=proute@entry=0x556c16c29988, estate=estate@entry=0x556c16c15bf8, partidx=0) at execPartition.c:683
#7  0x0000556c1490ff80 in ExecMergeMatched (junkfilter=0x556c16c15bf8, tupleid=0x7ffe8088a10a, slot=0x556c16c22e20, estate=0x556c16c15bf8, mtstate=0x556c16c163e8) at execMerge.c:205
#8  ExecMerge (mtstate=mtstate@entry=0x556c16c163e8, estate=estate@entry=0x556c16c15bf8, slot=slot@entry=0x556c16c22e20, junkfilter=junkfilter@entry=0x556c16c2b730, resultRelInfo=resultRelInfo@entry=0x556c16c15e48) at execMerge.c:127
#9  0x0000556c14933614 in ExecModifyTable (pstate=0x556c16c163e8) at nodeModifyTable.c:2179
#10 0x0000556c1490c0ca in ExecProcNode (node=0x556c16c163e8) at ../../../src/include/executor/executor.h:239
#11 ExecutePlan (execute_once=<optimized out>, dest=0x556c16c111b8, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_MERGE, use_parallel_mode=<optimized out>, planstate=0x556c16c163e8, estate=0x556c16c15bf8) at execMain.c:1729
#12 standard_ExecutorRun (queryDesc=0x556c16c1bce8, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#13 0x0000556c14a6ba52 in ProcessQuery (plan=<optimized out>, sourceText=0x556c16b2ac08 "...", params=0x0, queryEnv=0x0, dest=0x556c16c111b8, completionTag=0x7ffe8088a500 "") at pquery.c:161
#14 0x0000556c14a6bceb in PortalRunMulti (portal=portal@entry=0x556c16b96468, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x556c16c111b8, altdest=altdest@entry=0x556c16c111b8, completionTag=completionTag@entry=0x7ffe8088a500 "") at pquery.c:1291
#15 0x0000556c14a6c979 in PortalRun (portal=portal@entry=0x556c16b96468, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x556c16c111b8, altdest=altdest@entry=0x556c16c111b8, completionTag=0x7ffe8088a500 "") at pquery.c:804
#16 0x0000556c14a6859b in exec_simple_query (query_string=0x556c16b2ac08 "MERGE INTO public.pagg_tab_ml_p3 as target_0\nUSING public.hash_i4_heap as ref_0\nON target_0.b = ref_0.seqno\nWHEN MATCHED AND ((select bitcol from public.brintest limit 1 offset 92)\n > cast(null as \"bi"...) at postgres.c:1121
#17 0x0000556c14a6a341 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x556c16b56ad8, dbname=<optimized out>, username=<optimized out>) at postgres.c:4149
#18 0x0000556c1474eac4 in BackendRun (port=0x556c16b4c030) at postmaster.c:4409
#19 BackendStartup (port=0x556c16b4c030) at postmaster.c:4081
#20 ServerLoop () at postmaster.c:1754
#21 0x0000556c149ec017 in PostmasterMain (argc=3, argv=0x556c16b257d0) at postmaster.c:1362
#22 0x0000556c1475006d in main (argc=3, argv=0x556c16b257d0) at main.c:228
The status of this is quite unclear to me:

- There are two email threads and the most recent email is in the original one
  (https://www.postgresql.org/message-id/flat/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com/);
  I think the status should be set to "Waiting for Author" but it was reset to
  "Needs review" by Pavan on 06/19/2018 (based on the second email thread?)
- The patch was reverted in
  https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=08ea7a2291db21a618d19d612c8060cda68f1892