Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection - Mailing list pgsql-hackers
From | Catalin Iacob |
---|---|
Subject | Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection |
Date | |
Msg-id | CAHg_5gqJACKP9O3xHWJpzdWiJeB0vo8MMdRQ5mG+uddSPCmoiQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection
|
List | pgsql-hackers |
On Mon, Sep 4, 2017 at 4:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Also, the main thing that we need xact.c's involvement for in the first > place is the fact that implicit transaction blocks, unlike regular ones, > auto-cancel on an error, leaving you outside a block not inside a failed > one. So I don't exactly see how savepoints would fit into that. I think this hits the nail on the head and should have a place in the official docs as I now realize I didn't grasp this distinction before I read this. My mental model was always "sending a bunch of semicolon separated queries without BEGIN/COMMIT/ROLLBACK; in one PQexec is like sending them one by one preceeded by a BEGIN; and followed by a COMMIT; except you only get the response from the last one". Also, explain what happens when there are BEGIN/ROLLBACK/COMMIT inside that multiquery string, that's still not completely clear to me and I don't want to reverse engineer it from your patch. > Now admittedly, the same set of issues pops up if one uses an > explicit transaction block in a multi-query string: > > begin\; insert ...\; savepoint\; insert ...\; release savepoint\; insert ...\; commit; According to my mental model described above, this would be exactly the same as without the begin; and commit; which is not the case so I think the distinction is worth explaining. I think the lack of a more detailed explanation about the stuff above confuses *a lot* of people, especially newcomers, and the confusion is only increased by what client drivers do on top (like issuing implicit BEGIN if configured in various modes specified by language-specific-DB-independent specs like Python's DBAPI or Java's JDBC) and one's background from other DBs that do it differently. Speaking of the above, psql also doesn't explicitly document how it groups lines of the file it's executing into PQexec calls. See below for a personal example of the confusions all this generates. I also encountered this FATAL a month ago in the context of "we have some (migration schema) queries in some files and want to orchestrate running them for testing". Initially we started with calling psql but then we needed some client side logic for some other stuff and switched to Python and Psycopg2. We did "read the whole file in a Python string" and then call Psycopg2's execute() on that string. Note that Psycopg2 only uses PQexec to issue queries. We had some SAVEPOINT statements in the file which lead to the backend stopping and the next Psycopg2 execute() on that connection saying Connection closed. It was already confusing why Psycopg2 behaves differently than psql (because we were issuing the whole file in one PQexec vs. psql splitting on ; and issuing multiple PQexecs and SAVEPOINTs working there) and the backend stopping only added to that confusion. Add on top of that "Should we put BEGIN; and COMMIT; in the file itself? Or is a single Psycopg2 execute() enough to have this schema migration be applied transactionally? Is there a difference between the two?". I searched the docs for existing explanations of multiquery strings and found these references but all of them are a bit hand wavy: - psql's reference explaining -c - libpq's PQexec explanation - the message flow document in the FE/BE protocol description
pgsql-hackers by date: