Thread: Segmentation fault with PG-12
In our production-environment we get sig11 every now and then after upgrading to PG-12:
2019-10-08 15:45:29.654 CEST [8829-76] LOG: server process (PID 20631) was terminated by signal 11: Segmentation fault
2019-10-08 15:45:29.654 CEST [8829-77] DETAIL: Failed process was running: COMMIT
2019-10-08 15:45:29.654 CEST [8829-78] LOG: terminating any other active server processes
2019-10-08 15:45:29.654 CEST [8829-77] DETAIL: Failed process was running: COMMIT
2019-10-08 15:45:29.654 CEST [8829-78] LOG: terminating any other active server processes
Will running a debug-enabled build slow things noticably down?
Is there a way to make it dump a stack-trace (or back-trace in C-land?) on sig11?
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
Andreas Joseph Krogh <andreas@visena.com> writes: > Will running a debug-enabled build slow things noticably down? gcc promises that the generated code is the same with or without debug. I think clang does too. With other compilers you may pay some penalty. > Is there a way > to make it dump a stack-trace (or back-trace in C-land?) on sig11? You should be able to get a core file from which you can extract a stack trace (and other info) after the fact. https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend regards, tom lane
På tirsdag 08. oktober 2019 kl. 17:24:21, skrev Tom Lane <tgl@sss.pgh.pa.us>:
Andreas Joseph Krogh <andreas@visena.com> writes:
> Will running a debug-enabled build slow things noticably down?
gcc promises that the generated code is the same with or without debug.
I think clang does too. With other compilers you may pay some penalty.
Nice, I'm using the ubuntu-packages, so I'll go ahead and install postgresql-12-dbgsym
> Is there a way
> to make it dump a stack-trace (or back-trace in C-land?) on sig11?
You should be able to get a core file from which you can extract a
stack trace (and other info) after the fact.
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
I'll look into that, thanks.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
På tirsdag 08. oktober 2019 kl. 17:24:21, skrev Tom Lane <tgl@sss.pgh.pa.us>:
Andreas Joseph Krogh <andreas@visena.com> writes:
> Will running a debug-enabled build slow things noticably down?
gcc promises that the generated code is the same with or without debug.
I think clang does too. With other compilers you may pay some penalty.
> Is there a way
> to make it dump a stack-trace (or back-trace in C-land?) on sig11?
You should be able to get a core file from which you can extract a
stack trace (and other info) after the fact.
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
regards, tom lane
Attached is output from "bt full".
Is this helpful?
Anything else I can do to help narrowing down the problem?
Thanks.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
Andreas Joseph Krogh <andreas@visena.com> writes: > Attached is output from "bt full". Is this helpful? Well, it shows that the failure is occurring while trying to evaluate a variable in a trigger's WHEN clause during "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id" And I'd bet that the root cause is something to do with Andres' tuple slot work. But (at least to my eye) it's not apparent exactly what's wrong. Can you show us the table definition and associated trigger definitions for origo_email_delivery? This doesn't seem to correlate with your original report, btw, as that claimed the crash was during COMMIT. regards, tom lane
På onsdag 09. oktober 2019 kl. 16:16:37, skrev Tom Lane <tgl@sss.pgh.pa.us>:
Andreas Joseph Krogh <andreas@visena.com> writes:
> Attached is output from "bt full". Is this helpful?
Well, it shows that the failure is occurring while trying to evaluate
a variable in a trigger's WHEN clause during
"UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id"
And I'd bet that the root cause is something to do with Andres' tuple slot
work. But (at least to my eye) it's not apparent exactly what's wrong.
Can you show us the table definition and associated trigger definitions
for origo_email_delivery?
This doesn't seem to correlate with your original report, btw,
as that claimed the crash was during COMMIT.
regards, tom lane
FWIW: It doesn't always happen when that UPDATE-statement is issued, so it's not reproducable. We'll see what the next core-dump gives us.
Is it OK if I send you the table/trigger-definitions off-list?
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
Andreas Joseph Krogh <andreas@visena.com> writes: > Is it OK if I send you the table/trigger-definitions off-list? Sure, but please share with Andres [cc'ed] as well. regards, tom lane
Hi all, I recently started a Masters in Computer Science (and not at the institution in my email address). One of my courses is "Advanced Databases" - yummy I thought - it's not even compulsory for me but I just *_had_* to take this module. The lecturer is a bit of an Oracle fan-boy (ACE director no less... hmmm...) and I want(ed) - becoming less enthusiasic by the minute - to do my dissertation with him. So, we're having a chat and I make plain my love of good 'ol PostgreSQL as my RDBMS of choice and he tells me that there are problems with random block corruption with PostgreSQL. I said "really" and before that conversation could go any further, another student came over and asked a question. So, I toddled off and did some research - I had heard something about this before (vague fuzzy memories) of a problem with the Linux kernel so I searched for a bit and duly dug up a couple of pages https://lwn.net/Articles/752063/ : PostgreSQL's fsync() surprise - and https://news.ycombinator.com/item?id=19238121 : Linux Fsync Issue for Buffered IO and Its Preliminary Fix for PostgreSQL So, this week I go back to my lecturer and say, yep, there was some issue but it was a Linux kernel problem and not PostgreSQL's fault and has been resolved. He tells me that he knew about that but that there was another issue (he had "spoken to people" at meetings!). I said "well, why isn't it fixed?" and he replied "where's the impetus?" to which I responded (quite shocked at this stage) something like "well, I know that the core team values correctness very highly" to which he came back with "yes, but they have no commercial imperative to fix anything - they have to wait until somebody is capable enough and interested enough to do the work". He then muttered something about this mysterious flaw having been fixed in EnterpriseDB. At this point, I lost interest. Having lurked on lists and going by my general "gut feeling" - if there was a serious issue causing irrecoverable block corruption, I'm pretty sure that it would be "all hands on deck" until this problem had been solved and "nice-to-haves" (GENERATED AS... for example) would have been parked till then. Now, I have four questions: 1) Is my lecturer full of it or does he really have a point? 2) The actual concrete acknowledged problem with fsync that affected PostgreSQL - why didn't it affect Oracle? Or MySQL? Or did it but it was so rare that it never became apparent - it wasn't that obvious with PostgreSQL either - one of those rare and intermittent problems? 3) Were there ever any problems with BSD? 4) What is the OS of choice for *_serious_* PostgreSQL installations? I hope that I have been clear, but should anyone require any clarification, please don't hesitate to ask me. Tia and rgs, Pól...
Greetings, First off- please try to craft a new email in the future rather than respond to an existing one. You may not realize this but there's some headers that get copied when you do a reply that cause the email to show up as being a reply, even if you remove all the "obvious" bits from it. * Pól Ua Laoínecháin (linehanp@tcd.ie) wrote: > 1) Is my lecturer full of it or does he really have a point? He's full of it, as far as I can tell anyway, based on what you've shared with us. Just look at the committers and the commit history to PostgreSQL, and look at who the largest contributors are and who they work for. That alone might be enough to surprise your lecturer with. > 2) The actual concrete acknowledged problem with fsync that affected > PostgreSQL - why didn't it affect Oracle? Or MySQL? Or did it but it > was so rare that it never became apparent - it wasn't that obvious > with PostgreSQL either - one of those rare and intermittent problems? Databases that do direct I/O don't depend on fsync. That said, I do think this could have been an issue for Oracle if you ran it without direct i/o. > 3) Were there ever any problems with BSD? As I understand it, no. > 4) What is the OS of choice for *_serious_* PostgreSQL installations? BSD and Linux are both quite popular platforms for running PG, and people run very serious workloads on both. Thanks, Stephen
Attachment
Assuming you're not a troll ...
On Wed, 2019-10-09 at 20:06 +0100, Pól Ua Laoínecháin wrote:
1) Is my lecturer full of it or does he really have a point?
He's more than full of it. PostgreSQL has had a few bugs over the year that could have resulted in data corruption, but they're pretty rare and fixed as soon as they're found. PostgreSQL is the most reliable software I run, and virtually the only major piece I don't hesitate to upgrade without waiting to see what bugs other people find first.
4) What is the OS of choice for *_serious_* PostgreSQL installations?
That's a religious question, not a technical question. I think even Microsoft makes a decent server OS nowadays. But I expect a large majority of PostgreSQL installations are running on Linux, as are the vast majority of all server apps nowadays. Having said that, I don't run a "serious" PostgreSQL installation; some of the people here run databases that do tens of thousands of TPS and hold many TiB of data. You'd have to ask them I guess.
On 09/10/2019 20:45, Alan Hodgson wrote: > Assuming you're not a troll ... > > On Wed, 2019-10-09 at 20:06 +0100, Pól Ua Laoínecháin wrote: >> 1) Is my lecturer full of it or does he really have a point? > He's more than full of it. PostgreSQL has had a few bugs over the year > that could have resulted in data corruption, but they're pretty rare > and fixed as soon as they're found. PostgreSQL is the most reliable > software I run, and virtually the only major piece I don't hesitate to > upgrade without waiting to see what bugs other people find first. >> 4) What is the OS of choice for *_serious_* PostgreSQL installations? > That's a religious question, not a technical question. I think even > Microsoft makes a decent server OS nowadays. But I expect a large > majority of PostgreSQL installations are running on Linux, as are the > vast majority of all server apps nowadays. Having said that, I don't > run a "serious" PostgreSQL installation; some of the people here run > databases that do tens of thousands of TPS and hold many TiB of data. > You'd have to ask them I guess. Pretty sure if you run through Oracle's back catalogue you'll find a similar data corruption for an entirely unrelated reason. Just guessing but chances are.... Tim Clarke Telephone: Witham: +44(0)1376 503500 | London: +44 (0)20 3009 0853 | Frankfurt: +49 (0)69 7191 6000 | Hong Kong: +852 58031687 | Toronto: +1 647 503 2848 Web: https://www.manifest.co.uk/ Minerva Analytics Ltd - A Solactive Company 9 Freebournes Court | Newland Street | Witham | Essex | CM8 2BL | United Kingdom ________________________________ Copyright: This e-mail may contain confidential or legally privileged information. If you are not the named addressee youmust not use or disclose such information, instead please report it to admin@minerva.info<mailto:admin@minerva.info> Legal: Minerva Analytics is the trading name of: Minerva Analytics Ltd: Registered in England Number 11260966 & The ManifestVoting Agency Ltd: Registered in England Number 2920820 Registered Office at above address. Please Click Here https://www.manifest.co.uk/legal/for further information.
Hi, and thanks for responding, > First off- please try to craft a new email in the future... My apologies to you and the group - I'll do that in future. > > 1) Is my lecturer full of it or does he really have a point? > He's full of it, as far as I can tell anyway, based on what you've > shared with us. Just look at the committers and the commit history to > PostgreSQL, and look at who the largest contributors are and who they > work for. That alone might be enough to surprise your lecturer with. The only non-PostgreSQL company that I could find was Fujitisu - where can I find a (list of) the others? > Databases that do direct I/O don't depend on fsync. That said, I do > think this could have been an issue for Oracle if you ran it without > direct i/o. I think that Oracle are big into asyncio? I know that you have to sudo dnf install some_library with a name like asio/asyncio or something like that? Anyway, why doesn't PostgreSQL use Direct I/O? Thanks again and rgs, Pól... > Stephen
On 2019-10-09 10:16:37 -0400, Tom Lane wrote: > Andreas Joseph Krogh <andreas@visena.com> writes: > > Attached is output from "bt full". Is this helpful? > > Well, it shows that the failure is occurring while trying to evaluate > a variable in a trigger's WHEN clause during > "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id" > And I'd bet that the root cause is something to do with Andres' tuple slot > work. But (at least to my eye) it's not apparent exactly what's wrong. It looks like this could "just" be another report of #16036, which was already fixed in: commit d986d4e87f61c68f52c68ebc274960dc664b7b4e Author: Andres Freund <andres@anarazel.de> Date: 2019-10-04 11:59:34 -0700 Fix crash caused by EPQ happening with a before update trigger present. > This doesn't seem to correlate with your original report, btw, > as that claimed the crash was during COMMIT. That however, would be confusing, unless there's some deferred trigger that causes another update, which then fires a before update trigger causing the problem. Greetings, Andres Freund
På torsdag 10. oktober 2019 kl. 07:25:26, skrev Andres Freund <andres@anarazel.de>:
On 2019-10-09 10:16:37 -0400, Tom Lane wrote:
> Andreas Joseph Krogh <andreas@visena.com> writes:
> > Attached is output from "bt full". Is this helpful?
>
> Well, it shows that the failure is occurring while trying to evaluate
> a variable in a trigger's WHEN clause during
> "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id"
> And I'd bet that the root cause is something to do with Andres' tuple slot
> work. But (at least to my eye) it's not apparent exactly what's wrong.
It looks like this could "just" be another report of #16036, which was
already fixed in:
commit d986d4e87f61c68f52c68ebc274960dc664b7b4e
Author: Andres Freund <andres@anarazel.de>
Date: 2019-10-04 11:59:34 -0700
Fix crash caused by EPQ happening with a before update trigger present.
(Tom: This mail is only viewable as text/html, to if you're reading the text/plain version it will seem "hashed")
Aha, that whould be 60e97d63e5d19098e11fa32431a20eea820e2ae9 in REL_12_STABLE
We'll build and run HEAD of REL_12_STABLE, and report back.
> This doesn't seem to correlate with your original report, btw,
> as that claimed the crash was during COMMIT.
That however, would be confusing, unless there's some deferred trigger
that causes another update, which then fires a before update trigger
causing the problem.
Greetings,
Andres Freund
We have a deferred trigger which updates origo_email_delivery:
CREATE OR REPLACE FUNCTION origo_index_email_props_tf() RETURNS TRIGGER AS $$ declare v_prop origo_email_message_property; BEGIN v_prop := NEW; UPDATE origo_email_delivery SET is_seen = v_prop.is_seen, followup_id = v_prop.followup_id, is_replied = v_prop.is_replied, is_forwarded = v_prop.is_forwarded, is_draft = v_prop.is_draft, is_done = v_prop.is_done, is_flagged = v_prop.is_flagged, modseq = greatest(modseq, v_prop.modseq) WHERE message_id = v_prop.message_id AND owner_id = v_prop.owner_id; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE CONSTRAINT TRIGGER origo_index_email_props_t AFTER INSERT OR UPDATE ON origo_email_message_property DEFERRABLE INITIALLY DEFERRED FOR EACH ROW EXECUTE PROCEDURE origo_index_email_props_tf();
.. and then trigger the following UPDATE-trigger:
CREATE TRIGGER origo_email_delivery_update_t BEFORE UPDATE ON origo_email_delivery FOR EACH ROW WHEN (OLD.folder_id <> NEW.folder_id OR NEW.is_deleted <> OLD.is_deleted) EXECUTE PROCEDURE origo_email_delivery_update_tf();
Maybe that will trigger the bug.
Thanks.
--
Attachment
On 2019-10-10 09:05:06 +0200, Andreas Joseph Krogh wrote: > (Tom: This mail is only viewable as text/html, to if you're reading the > text/plain version it will seem "hashed") I'm totally not ok with that btw.
> > Now, I have four questions: > > 1) Is my lecturer full of it or does he really have a point? Hallo Pol, i don't know, a also teaching a databases, personally i never experienced this sometimes people are really a fan of certain product, sometimes in combination with the thought that all other products are bad; i don't know if this is the case, you could compare it with soccer, a barcalona fan will never become a real madrid fan and vice versa; so "rational" decisions (at first, based on some reasoning) tend to get loaded emotional feelings. in these kind of discussions both parties should have there say, not just one (in this case you, we haven't heard the teacher) there is no such thing as a perfect dbms, nor postgres, nor oracle, nor .., > > 2) The actual concrete acknowledged problem with fsync that affected > PostgreSQL - why didn't it affect Oracle? Or MySQL? Or did it but it > was so rare that it never became apparent - it wasn't that obvious > with PostgreSQL either - one of those rare and intermittent problems? you can set fsync to off (not default), for more performance, but it comes with the cost of D in ACID, you no longer have it > > 3) Were there ever any problems with BSD? as far as i understand BSD and variants are very solid, so good for server use, not for desktop > > 4) What is the OS of choice for *_serious_* PostgreSQL installations? it depends, if transparancy is important to you, choose an opensource os if time is important to you, choose what you already know if you are willing to spent time on it, i would personally choose a linux, bsd or solaris based os if it helps: i my case, i had to make a decision about the dbms for the classes as well, the reasons i have choosen postgres are in a nutshell: * free * open * runs good on servers that are comparable with an average desktop pc or better * close to ISO sql standard (the reason why i didn't choose mysql/now mariadb) * seems to have a future -* within all these, postgres seems to have implemented most features * after using it for a while (18 years now), i should now add: a great community some links: https://www.top500.org/ https://en.wikipedia.org/wiki/DB-Engines_ranking https://db-engines.com/en/ranking > > I hope that I have been clear, but should anyone require any > clarification, please don't hesitate to ask me. > > Tia and rgs, > > Pól... > > -- mvg, Wim Bertels -- Lector UC Leuven-Limburg -- The human race has one really effective weapon, and that is laughter. -- Mark Twain
On Thu, 10 Oct 2019 at 09:31, Wim Bertels <wim.bertels@ucll.be> wrote: > sometimes people are really a fan of certain product, > sometimes in combination with the thought that all other products are > bad; i don't know if this is the case, you could compare it with > soccer, a barcalona fan will never become a real madrid fan and vice > versa; so "rational" decisions (at first, based on some reasoning) tend > to get loaded emotional feelings. Yeah, this. Bear in mind it's possible that having made a sweeping statement that he cannot back up and that he secretly knows was unfounded, your lecturer will be defensive and uncomfortable. Chances are after your conversation he will have gone away and done the same research you did and may well have modified his opinion but will be too embarrassed to admit that to you. Geoff
We had another crash today, and it appears to be the same:
#0 slot_deform_heap_tuple (natts=26, offp=0x5598eba0b968,
tuple=<optimized out>, slot=0x5598eba0b920)
at ./build/../src/backend/executor/execTuples.c:895
tuple=<optimized out>, slot=0x5598eba0b920)
at ./build/../src/backend/executor/execTuples.c:895
--
Andreas Joseph Krogh
On Wed, 9 Oct 2019 at 20:06, Pól Ua Laoínecháin <linehanp@tcd.ie> wrote:
One of my courses is "Advanced Databases" - yummy I thought - it's not
even compulsory for me but I just *_had_* to take this module. The
lecturer is a bit of an Oracle fan-boy (ACE director no less...
hmmm...) and I want(ed) - becoming less enthusiasic by the minute - to
do my dissertation with him.
It's a good thing that you have the opportunity to do that course and to have an lecturer with strong real-world experience.
So, we're having a chat and I make plain
my love of good 'ol PostgreSQL as my RDBMS of choice and he tells me
that there are problems with random block corruption with PostgreSQL.
I said "really" and before that conversation could go any further,
another student came over and asked a question.
I think its true that there have been reported problems with block corruption with both Oracle and PostgreSQL. The main difference is that the PostgreSQL project is open enough for people to see much of that on public record.
Given the efforts made on resilience and recovery, such as PITR, block checksums, those problems are pretty much solved, based upon a statistically sufficient sample of the real world: 2ndQuadrant customers. Some block-level problems do still recur - the recent fsync() problems were reported by us and have been handled (not resolved), but they were and are rare. There are still many issues of other kinds.
Many Oracle customers I have worked with years ago experienced block corruptions and it was very common to hear the reply "talk to your disk vendor". Those happened probably 20-30 years earlier, so in many cases have now been forgotten. There isn't an easy way to go back and check for trends on that.
Having said all of that, its easy to point at some of these things and use them as FUD - fear, uncertainty and doubt. No doubt unintentionally.
I'd go and learn more from your lecturer. Your disagreement has made you think, so he helped you. Learning from your own mistakes takes longer.
Greetings, * Pól Ua Laoínecháin (linehanp@tcd.ie) wrote: > > > 1) Is my lecturer full of it or does he really have a point? > > > He's full of it, as far as I can tell anyway, based on what you've > > shared with us. Just look at the committers and the commit history to > > PostgreSQL, and look at who the largest contributors are and who they > > work for. That alone might be enough to surprise your lecturer with. > > The only non-PostgreSQL company that I could find was Fujitisu - where > can I find a (list of) the others? Not sure where you were looking... The contributors list is here: https://www.postgresql.org/community/contributors/ The committers list is here: https://wiki.postgresql.org/wiki/Committers The git tree is here: https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary Perhaps not the best stat, but you can view the contributions by committer pretty easily, for 2018, here: https://github.com/postgres/postgres/graphs/contributors?from=2018-01-01&to=2018-12-31&type=c Note that this isn't very representative of the actual authors though- we don't track those in the way git would prefer, instead we note who the author of a given patch was in the commit message itself. > > Databases that do direct I/O don't depend on fsync. That said, I do > > think this could have been an issue for Oracle if you ran it without > > direct i/o. > > I think that Oracle are big into asyncio? I know that you have to sudo > dnf install some_library with a name like asio/asyncio or something > like that? Oracle supports both, but running with direct i/o is pretty popular, yes. > Anyway, why doesn't PostgreSQL use Direct I/O? There's an awful lot that the kernel provides when it comes to things like good read-ahead and dealing with disks and SSDs and such that we (currently, at least) prefer to leverage instead of writing lots of new code to deal with that ourselves, which would be required to use Direct I/O (and not have it be completely terrible performance wise, anyway). The whole issue behind fsync was because our expectation (and POSIX's, if you ask me anyway) was different from what the Linux kernel was providing (specifically, you could end up in a situation where an fsync() call "worked" and didn't return an error, even though there remained pages that were dirty and not written out). Now, this is under other error conditions typically and you'll get messages in the kernel log about such failures usually, so if you're properly monitoring and managing your systems there's a good chance you would have realized there was a problem even though the Linux kernel was telling PG that everything was fine (have backups!!). Thanks, Stephen
Attachment
Andres Freund <andres@anarazel.de> writes: > On 2019-10-09 10:16:37 -0400, Tom Lane wrote: >> Well, it shows that the failure is occurring while trying to evaluate >> a variable in a trigger's WHEN clause during >> "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id" >> And I'd bet that the root cause is something to do with Andres' tuple slot >> work. But (at least to my eye) it's not apparent exactly what's wrong. > It looks like this could "just" be another report of #16036, which was > already fixed in: > commit d986d4e87f61c68f52c68ebc274960dc664b7b4e > Author: Andres Freund <andres@anarazel.de> > Date: 2019-10-04 11:59:34 -0700 > Fix crash caused by EPQ happening with a before update trigger present. Bingo. I can reproduce the crash (using concurrent updates of the same table row, in the schema Andreas sent off-list) on the predecessor of that commit, but on that commit it's fine. Andreas, that's a pretty simple patch if you're in a position to build from source ... regards, tom lane
På torsdag 10. oktober 2019 kl. 21:32:38, skrev Tom Lane <tgl@sss.pgh.pa.us>:
Andres Freund <andres@anarazel.de> writes:
> On 2019-10-09 10:16:37 -0400, Tom Lane wrote:
>> Well, it shows that the failure is occurring while trying to evaluate
>> a variable in a trigger's WHEN clause during
>> "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id"
>> And I'd bet that the root cause is something to do with Andres' tuple slot
>> work. But (at least to my eye) it's not apparent exactly what's wrong.
> It looks like this could "just" be another report of #16036, which was
> already fixed in:
> commit d986d4e87f61c68f52c68ebc274960dc664b7b4e
> Author: Andres Freund <andres@anarazel.de>
> Date: 2019-10-04 11:59:34 -0700
> Fix crash caused by EPQ happening with a before update trigger present.
Bingo. I can reproduce the crash (using concurrent updates of the same
table row, in the schema Andreas sent off-list) on the predecessor of
that commit, but on that commit it's fine.
That's great!
Andreas, that's a pretty simple patch if you're in a position to
build from source ...
regards, tom lane
Yes, we've built a new .deb-package from f224c7c11ea7be2751e3342e11317070ffb5622d in REL_12_STABLE which we'll deploy tonight.
Thanks!
--
Andreas Joseph Krogh
On 2019-10-10 15:32:38 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2019-10-09 10:16:37 -0400, Tom Lane wrote: > >> Well, it shows that the failure is occurring while trying to evaluate > >> a variable in a trigger's WHEN clause during > >> "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id" > >> And I'd bet that the root cause is something to do with Andres' tuple slot > >> work. But (at least to my eye) it's not apparent exactly what's wrong. > > > It looks like this could "just" be another report of #16036, which was > > already fixed in: > > commit d986d4e87f61c68f52c68ebc274960dc664b7b4e > > Author: Andres Freund <andres@anarazel.de> > > Date: 2019-10-04 11:59:34 -0700 > > Fix crash caused by EPQ happening with a before update trigger present. > > Bingo. I can reproduce the crash (using concurrent updates of the same > table row, in the schema Andreas sent off-list) on the predecessor of > that commit, but on that commit it's fine. Cool, and thanks for checking.
On 10/10/19 11:31 π.μ., Wim Bertels wrote: > >> 3) Were there ever any problems with BSD? > as far as i understand BSD and variants are very solid, > so good for server use, not for desktop Desktop software might be a little old, but that does not mean that this is not usable. I run FreeBSD desktop for ages. > >>
På torsdag 10. oktober 2019 kl. 22:21:13, skrev Andres Freund <andres@anarazel.de>:
On 2019-10-10 15:32:38 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-10-09 10:16:37 -0400, Tom Lane wrote:
> >> Well, it shows that the failure is occurring while trying to evaluate
> >> a variable in a trigger's WHEN clause during
> >> "UPDATE origo_email_delivery SET folder_id=$1, created=$2\nWHERE entity_id IN ($3)\nRETURNING entity_id"
> >> And I'd bet that the root cause is something to do with Andres' tuple slot
> >> work. But (at least to my eye) it's not apparent exactly what's wrong.
>
> > It looks like this could "just" be another report of #16036, which was
> > already fixed in:
> > commit d986d4e87f61c68f52c68ebc274960dc664b7b4e
> > Author: Andres Freund <andres@anarazel.de>
> > Date: 2019-10-04 11:59:34 -0700
> > Fix crash caused by EPQ happening with a before update trigger present.
>
> Bingo. I can reproduce the crash (using concurrent updates of the same
> table row, in the schema Andreas sent off-list) on the predecessor of
> that commit, but on that commit it's fine.
Cool, and thanks for checking.
No crashes in production after deploying the fix.
--
Andreas Joseph Krogh