Thread: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

Architecture: Intel Core 2 Duo
OS: linux-2.6.20-gentoo-r8
Filesystem: ext3
Postgres v8.2.3 compiled with gcc 4.1.1-r3
RAM - 2GB
Shared buffers - 24MB
[All other Postgres configuration parameters are default values]

Problem description:
COPY into a temp table fails given a specific combination of
create/insert on temp tables and prepare/commit in a preceding
transaction. The "could not open relation" error occurs reliably.

Steps to reproduce:

Existing schema (scripts to create and populate these tables are
uploaded to http://upload2.net/page/download/rfsLfnuVlYjEcCJ/basetables.tgz.html
):
In the scenario, the following 4 tables exist already in the database:

postgres=# \d order_detail
             Table "public.order_detail"
    Column    |            Type             | Modifiers
--------------+-----------------------------+-----------
 order_id     | integer                     | not null
 item_id      | integer                     | not null
 order_date   | timestamp without time zone |
 emp_id       | integer                     |
 promotion_id | integer                     |
 qty_sold     | integer                     |
 unit_price   | bigint                      |
 unit_cost    | bigint                      |
 discount     | integer                     |
 customer_id  | integer                     |
Indexes:
    "order_detail_pkey" PRIMARY KEY, btree (order_id, item_id)

postgres=# select count(*) from order_detail;
 count
-------
 34352
(1 row)

postgres=# \d lu_call_ctr
        Table "public.lu_call_ctr"
   Column    |     Type      | Modifiers
-------------+---------------+-----------
 call_ctr_id | integer       | not null
 center_name | character(50) |
 region_id   | integer       |
 manager_id  | integer       |
 country_id  | integer       |
 dist_ctr_id | bigint        |
Indexes:
    "lu_call_ctr_pkey" PRIMARY KEY, btree (call_ctr_id)

postgres=# select count(*) from lu_call_ctr;
 count
-------
     1
(1 row)

postgres=# \d lu_employee
              Table "public.lu_employee"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 emp_id         | integer                     | not null
 emp_last_name  | character(50)               |
 emp_first_name | character(50)               |
 emp_ssn        | character(50)               |
 birth_date     | timestamp without time zone |
 hire_date      | timestamp without time zone |
 salary         | integer                     |
 country_id     | integer                     |
 dist_ctr_id    | integer                     |
 manager_id     | integer                     |
 call_ctr_id    | integer                     |
 fte_flag       | character(50)               |
Indexes:
    "lu_employee_pkey" PRIMARY KEY, btree (emp_id)

postgres=# select count(*) from lu_employee;
 count
-------
     2
(1 row)

postgres=# \d city_ctr_sls
         Table "public.city_ctr_sls"
      Column      |  Type   | Modifiers
------------------+---------+-----------
 cust_city_id     | integer | not null
 call_ctr_id      | integer | not null
 tot_dollar_sales | integer |
 tot_unit_sales   | integer |
 tot_cost         | integer |
Indexes:
    "city_ctr_sls_pkey" PRIMARY KEY, btree (cust_city_id, call_ctr_id)

postgres=# select count(*) from city_ctr_sls;
 count
-------
   548
(1 row)


Given the data in these base tables, the following set of SQLs
always generates the "could not open relation" error.
The insert*.log files that we try to COPY into Postgres in the SQLs
below are uploaded as:
http://upload2.net/page/download/gADZqQvOIntLRpI/insert.tgz.html
---------------------------------------------------------------------
-- Note: If the amount of data being inserted is decreased significantly,
-- the error disappears.

-- First transaction
begin transaction;

-- Temp table 1 and insert 1582 records
create temp table temp1
as
select customer_id, emp_id
from order_detail
limit 0;

copy temp1 from '/tmp/relationError/insert_1.log';


-- Create temp table 2 and populate with select.
-- Note: Even changing the order of these columns causes the error to
-- disappear.
create temp table temp2
as
select temp1.customer_id, temp1.emp_id as temp1__emp_id, le.emp_id as le__emp_id, le.emp_last_name, le.emp_first_name,
le.emp_ssn,le.birth_date, le.hire_date, le.salary, le.country_id, le.dist_ctr_id, le.manager_id, le.call_ctr_id,
le.fte_flag
from temp1, lu_employee le
where temp1.emp_id = le.emp_id;


-- Create temp table 3 and insert 13832 records.
create temp table temp3
as
select temp2.call_ctr_id, temp2.temp1__emp_id, temp2.customer_id, temp2.le__emp_id
from temp2
where temp2.temp1__emp_id = temp2.le__emp_id
limit 0 ;

copy temp3 from '/tmp/relationError/insert_2.log';

-- Create temp table 4 and insert 6160 records.
create temp table temp4
as
select lcc.region_id, temp3.customer_id
from lu_call_ctr lcc, temp3
where temp3.call_ctr_id = lcc.call_ctr_id
group by lcc.region_id, temp3.customer_id
limit 0;

copy temp4 from '/tmp/relationError/insert_3.log';

-- Drop the temp tables
drop table temp1 ;
drop table temp2 ;
drop table temp3 ;
drop table temp4 ;


-- 2PC
-- Note: Replace the prepare/commit pair with just a simple commit; and
-- the error goes away.
prepare transaction '260';
commit prepared '260' ;


--Next transaction
begin transaction;

-- Create temp table 5 and try to insert 1582 rows into it.
create temp table temp5
as
select customer_id, emp_id
from order_detail
limit 0;

copy temp5 from '/tmp/relationError/insert_4_crash.log';

-- Should get an error of the form:
-- psql:crash.sql:87: ERROR:  could not open relation 1663/16384/16774: No such file or directory
-- CONTEXT:  COPY temp5, line 926: "2929   33"
-- This context is always the same.

rollback;


Observations:
1. The size of the data seems to matter. If the amount of data being
inserted is dropped to just one or two records per table, the error
doesn't happen.
2. The order of columns for the select into temp2 matters. Changing
the order can cause the error to go away.
3. If the prepare/commit is replaced with a "commit;" the error goes away.
4. Removing "temp3" or "temp4" from the transaction causes one run of
the above statements to succeed, but if the sequence is issued in the
same PSQL session, the second one will fail.
5. Given the current dataset, the error always occurs on line 926 of
the COPY (even if the values at line 926 are changed).
6. The <tablespace>/<database>/<oid> in the error message consistently
corresponds to that of temp2 on my system.

Thanks.
- John
[Resending since I didn't see this posted on pgsql-hackers]


John Smith wrote:
> Architecture: Intel Core 2 Duo
> OS: linux-2.6.20-gentoo-r8
> Filesystem: ext3
> Postgres v8.2.3 compiled with gcc 4.1.1-r3
> RAM - 2GB
> Shared buffers - 24MB
> [All other Postgres configuration parameters are default values]
>
> Problem description:
> COPY into a temp table fails given a specific combination of
> create/insert on temp tables and prepare/commit in a preceding
> transaction. The "could not open relation" error occurs reliably.
>
> Steps to reproduce:
>
> Existing schema (scripts to create and populate these tables are
> uploaded to http://upload2.net/page/download/rfsLfnuVlYjEcCJ/basetables.tgz.html
> ):

I can't get that link to work. Can you please email me the files
offlist? Or upload somewhere else if they're too big for email.

> Observations:
> 1. The size of the data seems to matter. If the amount of data being
> inserted is dropped to just one or two records per table, the error
> doesn't happen.
> 2. The order of columns for the select into temp2 matters. Changing
> the order can cause the error to go away.
> 3. If the prepare/commit is replaced with a "commit;" the error goes away.
> 4. Removing "temp3" or "temp4" from the transaction causes one run of
> the above statements to succeed, but if the sequence is issued in the
> same PSQL session, the second one will fail.
> 5. Given the current dataset, the error always occurs on line 926 of
> the COPY (even if the values at line 926 are changed).
> 6. The <tablespace>/<database>/<oid> in the error message consistently
> corresponds to that of temp2 on my system.

I think I see what's happening here. We have restricted two-phase commit
so that you're not supposed to be able to PREPARE TRANSACTION if the
transaction has touched any temporary tables. That's because the 2nd
phase commit can be performed from another backend, and another backend
can't mess with another backend's temporary tables.

However in this case, where you CREATE and DROP the temporary table in
the same transaction, we don't detect that, and let the PREPARE
TRANSACTION finish. The detection relies on the lock manager, but
we're not holding any locks on the dropped relation.

I think we could in fact allow CREATE+DROP in the same transaction, and
remove the table immediately at PREPARE TRANSACTION, but what happens
right now is that we store the relfilenode of the temp table in the
two-phase state file in pg_twophase, for deletion at COMMIT/ROLLBACK
PREPARED. But we don't store the fact that it's a temporary table, so
we try to unlink it like a normal table and fail to purge the temp
buffers of that table, which causes problems later.

Attached is a simple patch to fix that by disallowing
CREATE+DROP+PREPARE TRANSACTION more reliably. It'd still be nice to
debug the full test case of yours to verify that that's what's
happening, though.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.39
diff -c -r1.39 twophase.c
*** src/backend/access/transam/twophase.c    1 Jan 2008 19:45:48 -0000    1.39
--- src/backend/access/transam/twophase.c    29 Feb 2008 09:58:19 -0000
***************
*** 793,798 ****
--- 793,799 ----
      TransactionId *children;
      RelFileNode *commitrels;
      RelFileNode *abortrels;
+     bool haveTempCommit, haveTempAbort;

      /* Initialize linked list */
      records.head = palloc0(sizeof(XLogRecData));
***************
*** 815,824 ****
      hdr.prepared_at = gxact->prepared_at;
      hdr.owner = gxact->owner;
      hdr.nsubxacts = xactGetCommittedChildren(&children);
!     hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL);
!     hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL);
      StrNCpy(hdr.gid, gxact->gid, GIDSIZE);

      save_state_data(&hdr, sizeof(TwoPhaseFileHeader));

      /* Add the additional info about subxacts and deletable files */
--- 816,830 ----
      hdr.prepared_at = gxact->prepared_at;
      hdr.owner = gxact->owner;
      hdr.nsubxacts = xactGetCommittedChildren(&children);
!     hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL, &haveTempCommit);
!     hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL, &haveTempAbort);
      StrNCpy(hdr.gid, gxact->gid, GIDSIZE);

+     if (haveTempCommit || haveTempAbort)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
+
      save_state_data(&hdr, sizeof(TwoPhaseFileHeader));

      /* Add the additional info about subxacts and deletable files */
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.257
diff -c -r1.257 xact.c
*** src/backend/access/transam/xact.c    15 Jan 2008 18:56:59 -0000    1.257
--- src/backend/access/transam/xact.c    29 Feb 2008 09:48:52 -0000
***************
*** 802,808 ****
      TransactionId *children;

      /* Get data needed for commit record */
!     nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp);
      nchildren = xactGetCommittedChildren(&children);

      /*
--- 802,808 ----
      TransactionId *children;

      /* Get data needed for commit record */
!     nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /*
***************
*** 1174,1180 ****
               xid);

      /* Fetch the data we need for the abort record */
!     nrels = smgrGetPendingDeletes(false, &rels, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /* XXX do we really need a critical section here? */
--- 1174,1180 ----
               xid);

      /* Fetch the data we need for the abort record */
!     nrels = smgrGetPendingDeletes(false, &rels, NULL, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /* XXX do we really need a critical section here? */
Index: src/backend/storage/smgr/smgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/smgr.c,v
retrieving revision 1.109
diff -c -r1.109 smgr.c
*** src/backend/storage/smgr/smgr.c    1 Jan 2008 19:45:52 -0000    1.109
--- src/backend/storage/smgr/smgr.c    29 Feb 2008 09:56:37 -0000
***************
*** 678,689 ****
   *
   * If haveNonTemp isn't NULL, the bool it points to gets set to true if
   * there is any non-temp table pending to be deleted; false if not.
   *
   * Note that the list does not include anything scheduled for termination
   * by upper-level transactions.
   */
  int
! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr, bool *haveNonTemp)
  {
      int            nestLevel = GetCurrentTransactionNestLevel();
      int            nrels;
--- 678,692 ----
   *
   * If haveNonTemp isn't NULL, the bool it points to gets set to true if
   * there is any non-temp table pending to be deleted; false if not.
+  * haveTemp is similar, but gets set if there are any temp table
+  * deletions pending.
   *
   * Note that the list does not include anything scheduled for termination
   * by upper-level transactions.
   */
  int
! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr,
!                       bool *haveNonTemp, bool *haveTemp)
  {
      int            nestLevel = GetCurrentTransactionNestLevel();
      int            nrels;
***************
*** 693,698 ****
--- 696,703 ----
      nrels = 0;
      if (haveNonTemp)
          *haveNonTemp = false;
+     if (haveTemp)
+         *haveTemp = false;
      for (pending = pendingDeletes; pending != NULL; pending = pending->next)
      {
          if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit)
***************
*** 711,716 ****
--- 716,723 ----
              *rptr++ = pending->relnode;
          if (haveNonTemp && !pending->isTemp)
              *haveNonTemp = true;
+         if (haveTemp && pending->isTemp)
+             *haveTemp = true;
      }
      return nrels;
  }

Heikki Linnakangas wrote:
> Attached is a simple patch to fix that by disallowing
> CREATE+DROP+PREPARE TRANSACTION more reliably.

That patch was missing changes to header files. New patch attached.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.39
diff -c -r1.39 twophase.c
*** src/backend/access/transam/twophase.c    1 Jan 2008 19:45:48 -0000    1.39
--- src/backend/access/transam/twophase.c    29 Feb 2008 13:05:24 -0000
***************
*** 793,798 ****
--- 793,799 ----
      TransactionId *children;
      RelFileNode *commitrels;
      RelFileNode *abortrels;
+     bool haveTempCommit, haveTempAbort;

      /* Initialize linked list */
      records.head = palloc0(sizeof(XLogRecData));
***************
*** 815,824 ****
      hdr.prepared_at = gxact->prepared_at;
      hdr.owner = gxact->owner;
      hdr.nsubxacts = xactGetCommittedChildren(&children);
!     hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL);
!     hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL);
      StrNCpy(hdr.gid, gxact->gid, GIDSIZE);

      save_state_data(&hdr, sizeof(TwoPhaseFileHeader));

      /* Add the additional info about subxacts and deletable files */
--- 816,830 ----
      hdr.prepared_at = gxact->prepared_at;
      hdr.owner = gxact->owner;
      hdr.nsubxacts = xactGetCommittedChildren(&children);
!     hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL, &haveTempCommit);
!     hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL, &haveTempAbort);
      StrNCpy(hdr.gid, gxact->gid, GIDSIZE);

+     if (haveTempCommit || haveTempAbort)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
+
      save_state_data(&hdr, sizeof(TwoPhaseFileHeader));

      /* Add the additional info about subxacts and deletable files */
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.257
diff -c -r1.257 xact.c
*** src/backend/access/transam/xact.c    15 Jan 2008 18:56:59 -0000    1.257
--- src/backend/access/transam/xact.c    29 Feb 2008 13:05:24 -0000
***************
*** 802,808 ****
      TransactionId *children;

      /* Get data needed for commit record */
!     nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp);
      nchildren = xactGetCommittedChildren(&children);

      /*
--- 802,808 ----
      TransactionId *children;

      /* Get data needed for commit record */
!     nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /*
***************
*** 1174,1180 ****
               xid);

      /* Fetch the data we need for the abort record */
!     nrels = smgrGetPendingDeletes(false, &rels, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /* XXX do we really need a critical section here? */
--- 1174,1180 ----
               xid);

      /* Fetch the data we need for the abort record */
!     nrels = smgrGetPendingDeletes(false, &rels, NULL, NULL);
      nchildren = xactGetCommittedChildren(&children);

      /* XXX do we really need a critical section here? */
Index: src/backend/storage/smgr/smgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/smgr.c,v
retrieving revision 1.109
diff -c -r1.109 smgr.c
*** src/backend/storage/smgr/smgr.c    1 Jan 2008 19:45:52 -0000    1.109
--- src/backend/storage/smgr/smgr.c    29 Feb 2008 13:05:24 -0000
***************
*** 678,689 ****
   *
   * If haveNonTemp isn't NULL, the bool it points to gets set to true if
   * there is any non-temp table pending to be deleted; false if not.
   *
   * Note that the list does not include anything scheduled for termination
   * by upper-level transactions.
   */
  int
! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr, bool *haveNonTemp)
  {
      int            nestLevel = GetCurrentTransactionNestLevel();
      int            nrels;
--- 678,692 ----
   *
   * If haveNonTemp isn't NULL, the bool it points to gets set to true if
   * there is any non-temp table pending to be deleted; false if not.
+  * haveTemp is similar, but gets set if there are any temp table
+  * deletions pending.
   *
   * Note that the list does not include anything scheduled for termination
   * by upper-level transactions.
   */
  int
! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr,
!                       bool *haveNonTemp, bool *haveTemp)
  {
      int            nestLevel = GetCurrentTransactionNestLevel();
      int            nrels;
***************
*** 693,698 ****
--- 696,703 ----
      nrels = 0;
      if (haveNonTemp)
          *haveNonTemp = false;
+     if (haveTemp)
+         *haveTemp = false;
      for (pending = pendingDeletes; pending != NULL; pending = pending->next)
      {
          if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit)
***************
*** 711,716 ****
--- 716,723 ----
              *rptr++ = pending->relnode;
          if (haveNonTemp && !pending->isTemp)
              *haveNonTemp = true;
+         if (haveTemp && pending->isTemp)
+             *haveTemp = true;
      }
      return nrels;
  }
Index: src/include/storage/smgr.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/storage/smgr.h,v
retrieving revision 1.62
diff -c -r1.62 smgr.h
*** src/include/storage/smgr.h    1 Jan 2008 19:45:59 -0000    1.62
--- src/include/storage/smgr.h    29 Feb 2008 09:47:24 -0000
***************
*** 77,83 ****
  extern void smgrimmedsync(SMgrRelation reln);
  extern void smgrDoPendingDeletes(bool isCommit);
  extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr,
!                       bool *haveNonTemp);
  extern void AtSubCommit_smgr(void);
  extern void AtSubAbort_smgr(void);
  extern void PostPrepare_smgr(void);
--- 77,83 ----
  extern void smgrimmedsync(SMgrRelation reln);
  extern void smgrDoPendingDeletes(bool isCommit);
  extern int smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr,
!                       bool *haveNonTemp, bool *haveTemp);
  extern void AtSubCommit_smgr(void);
  extern void AtSubAbort_smgr(void);
  extern void PostPrepare_smgr(void);

Plausible theory, and nice explanation....

Try the following link (I had to wait for 50 sec for the link to appear,
but I guess the trade-off of getting knowledge in return is worth it :) )

http://www5.upload2.net/download/77fa86e16a02e52fd5439c76e148d231/47c7fdce/rfsLfnuVlYjEcCJ/basetables.tgz

Not sending attachment in this mail; that may cause the mail not to be
accepted by the list. I will try to send the attachment in the next mail,
to retain it in the mailing list for historical purposes.

Thanks and best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB      http://www.enterprisedb.com

17° 29' 34.37"N,  78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N,  73° 56' 25.42"E - Pune *
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device
Gurjeet Singh wrote:
> Plausible theory, and nice explanation....
> 
> Try the following link (I had to wait for 50 sec for the link to appear, but
> I guess the trade-off of getting knowledge in return is worth it :) )
> 
> http://www5.upload2.net/download/77fa86e16a02e52fd5439c76e148d231/47c7fdce/rfsLfnuVlYjEcCJ/basetables.tgz
> 
> Not sending attachment in this mail; that may cause the mail not to be
> accepted by the list. I will try to send the attachment in the next mail,
> to retain it in the mailing list for historical purposes.

Thanks!

As I suspected, what's happening is that buffers for the dropped temp 
table are not cleaned up from the temporary buffer cache as they should 
be. The next time buffers are needed, on the next COPY, we try to 
write out those buffers to make room for new pages, but that fails because 
the file the buffers belong to doesn't exist anymore.

The patch I sent earlier fixes that by tightening the check so that you 
can't operate on temporary tables in 2PC transactions.

If you had a real-world use case for that, sorry :-(. Perhaps we could 
enhance that for 8.4 if there's demand, so that you could CREATE+DROP or 
use ON COMMIT TRUNCATE temp tables in a transaction, though I haven't 
personally planned to work on it.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> I think I see what's happening here. We have restricted two-phase commit 
> so that you're not supposed to be able to PREPARE TRANSACTION if the 
> transaction has touched any temporary tables. That's because the 2nd 
> phase commit can be performed from another backend, and another backend 
> can't mess with another backend's temporary tables.

> However in this case, where you CREATE and DROP the temporary table in 
> the same transaction, we don't detect that, and let the PREPARE 
> TRANSACTION to finish. The detection relies on the lock manager, but 
> we're not holding any locks on the dropped relation.

This explanation is nonsense; we certainly *are* holding a lock on any
relation that's about to be dropped.  Experimentation shows that
AccessExclusiveLock is indeed held (you can see it in pg_locks), but
nonetheless the PREPARE doesn't complain.  Did you trace through
exactly why?

I'm dissatisfied with the proposed patch because I'm afraid it's
patching a symptom rather than whatever the real problem is.
        regards, tom lane


I wrote:
> This explanation is nonsense; we certainly *are* holding a lock on any
> relation that's about to be dropped.  Experimentation shows that
> AccessExclusiveLock is indeed held (you can see it in pg_locks), but
> nonetheless the PREPARE doesn't complain.  Did you trace through
> exactly why?

I looked into this, and the problem is in LockTagIsTemp: it checks
whether a lock is on a temp table this way:
           if (isTempOrToastNamespace(get_rel_namespace((Oid) tag->locktag_field2)))
               return true;

What is happening is that get_rel_namespace fails to find the syscache
entry for the table's pg_class row (which is expected since it's already
been deleted as far as this transaction is concerned).  It silently
returns InvalidOid, and then isTempOrToastNamespace returns false,
and so we mistakenly conclude the lock is not on a temp object.

This coding had bothered me all along because I didn't care for doing a
lot of syscache lookups during transaction commit, but I hadn't realized
that it was outright wrong.

I think we need some better means of recording whether a lock is on a
temp object.  We could certainly add a flag to the LOCALLOCK struct,
but it's not clear where a clean place to set it would be.  As a rule
we don't yet know when locking a relation whether it's temp or not.
        regards, tom lane


For the list this time.

---------- Forwarded message ----------
From: Gurjeet Singh <singh.gurjeet@gmail.com>
Date: Fri, Feb 29, 2008 at 8:30 PM
Subject: Re: [HACKERS] "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables
To: Heikki Linnakangas <heikki@enterprisedb.com>



On Fri, Feb 29, 2008 at 6:54 PM, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
Gurjeet Singh wrote:
> Try the following link (I had to wait for 50 sec for the link to appear, but
> I guess the trade-off of getting knowledge in return is worth it :) )
>
> http://www5.upload2.net/download/77fa86e16a02e52fd5439c76e148d231/47c7fdce/rfsLfnuVlYjEcCJ/basetables.tgz

Thanks, that works. Can you get a working link / send over the other
file as well, please?

Here you go...

http://www5.upload2.net/download/afc87cfc978f2d68542e0229307829a2/47c818bd/gADZqQvOIntLRpI/insert.tgz

The actual attachment follows in the next mail.


Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB      http://www.enterprisedb.com

17° 29' 34.37"N,  78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N,  73° 56' 25.42"E - Pune *
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device
 



I wrote:
> I think we need some better means of recording whether a lock is on a
> temp object.  We could certainly add a flag to the LOCALLOCK struct,
> but it's not clear where a clean place to set it would be.  As a rule
> we don't yet know when locking a relation whether it's temp or not.

Actually ... why are we using the lock manager to drive this at all?
Temp-ness of relations is not really something that it has any interest
in.  What if we get rid of LockTagIsTemp altogether, and instead protect
2PC transactions by having a global flag "transactionUsedTempTable"?
We could clear that at transaction start, and conditionally set it in
relation_open, for very little cost.

I think the idea behind the lock-manager approach was to avoid expending
any cycles at all on this consideration if you weren't using 2PC.  But
if we have to take special action to mark locks as temp when they are
taken, we certainly aren't going to beat a simple flag for performance.
        regards, tom lane


Tom Lane wrote:
> I wrote:
>> I think we need some better means of recording whether a lock is on a
>> temp object.  We could certainly add a flag to the LOCALLOCK struct,
>> but it's not clear where a clean place to set it would be.  As a rule
>> we don't yet know when locking a relation whether it's temp or not.
> 
> Actually ... why are we using the lock manager to drive this at all?

Good question. It has always seemed a bit strange to me. The assumption 
that we always hold the lock on temp table until end of transaction, 
while true today, seems weak to me.

> Temp-ness of relations is not really something that it has any interest
> in.  What if we get rid of LockTagIsTemp altogether, and instead protect
> 2PC transactions by having a global flag "transactionUsedTempTable"?
> We could clear that at transaction start, and conditionally set it in
> relation_open, for very little cost.

That certainly seems like the simplest and most robust solution.

There's this corner case where that would behave differently than the 
lock manager approach:

BEGIN;
SAVEPOINT sp;
CREATE TEMP TABLE foo(bar int4);
ROLLBACK TO sp;
PREPARE TRANSACTION 'foo';

The flag would have to be per-subxact to avoid that, though I doubt 
anyone is relying on that behavior.

In the future, it would be nice to relax the restriction on using temp 
rels, though. A flag doesn't lend itself to that easily, but I'm sure 
we'll figure out something if we ever get around to implementing that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Heikki Linnakangas escribió:

> In the future, it would be nice to relax the restriction on using temp  
> rels, though. A flag doesn't lend itself to that easily, but I'm sure  
> we'll figure out something if we ever get around to implement that.

I can't recall the rationale for this limitation.  Do we need anything
beyond flushing the table's buffers to disk?  That sounds an easy thing
to implement.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Alvaro Herrera wrote:
> Heikki Linnakangas escribió:
> 
>> In the future, it would be nice to relax the restriction on using temp  
>> rels, though. A flag doesn't lend itself to that easily, but I'm sure  
>> we'll figure out something if we ever get around to implement that.
> 
> I can't recall the rationale for this limitation.  Do we need anything
> beyond flushing the table's buffers to disk?  That sounds an easy thing
> to implement.

See http://archives.postgresql.org/pgsql-hackers/2005-05/msg01223.php

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Tom Lane wrote:
>> Actually ... why are we using the lock manager to drive this at all?

> Good question. It has always seemed a bit strange to me. The assumption 
> that we always hold the lock on temp table until end of transaction, 
> while true today, seems weak to me.

Looking back, I think it was driven by the desire to tie the behavior
directly to things that are going to get persisted, such as locks.
From that standpoint your initial patch to attach a temp-check to
relation-drop 2PC entries would be the right kind of design.  However,
given what we now know about the lock situation, I'd feel uncomfortable
with applying that without also fixing LockTagIsTemp, and right now
that's looking like much more complexity and possible performance
penalty than it's worth.

> In the future, it would be nice to relax the restriction on using temp 
> rels, though. A flag doesn't lend itself to that easily, but I'm sure 
> we'll figure out something if we ever get around to implement that.

Yeah.  As you already noted, there are several other problems that would
have to be dealt with to support that, so we can just leave this as
another one.

Do you want to write up a flag-based patch, or shall I?
        regards, tom lane


Tom Lane wrote:
> Do you want to write up a flag-based patch, or shall I?

I can do that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


On Mon, Mar 3, 2008 at 8:46 AM, Heikki Linnakangas
<heikki@enterprisedb.com> wrote:
> Tom Lane wrote:
>  > Do you want to write up a flag-based patch, or shall I?
>
>  I can do that.

BTW, I found an easier way of reproducing this (see attached 2pc.sql).
It might help with debugging or verifying a fix/regression.

Thanks for the attention so far, but I have some questions about the issue:

[1] The data file is reported missing in the second transaction only
if the first transaction was ended using PREPARE TRANSACTION. The
error does not show up if a direct COMMIT is performed (commit.sql)
instead of PREPARE TRANSACTION + COMMIT PREPARED. Why is that so?

[2] From all of the discussion here since my first post, I understand
that there are complications for session-level TEMP tables. But is it
easier to support PREPARE TRANSACTION for transactions that create and
drop their TEMP tables, i.e., so that the tables are not session-level
but just transaction-level?

[3] I am not certain how widespread they might be, but I think there
may be some backward compatibility concerns with the patch you are
proposing. On the one hand, the documentation says, "It is not
currently allowed to PREPARE a transaction that has executed any
operations involving temporary tables or created any cursors WITH
HOLD." But temporary tables that are created ON COMMIT DROP are more
like cursors that do not have WITH HOLD specified. So it does not seem
clear from the documentation that PREPARE TRANSACTION is not
supported, and indeed due to the lack of a check in Postgres today, it
seems as though it is supported. Do you think there is a risk in
breaking applications?

Thanks.
- John

John Smith wrote:
> BTW, I found an easier way of reproducing this (see attached 2pc.sql).
> It might help with debugging or verifying a fix/regression.

Thanks.

> [1] The data file is reported missing in the second transaction only
> if the first transaction was ended using PREPARE TRANSACTION. The
> error does not show up if a direct COMMIT is performed (commit.sql)
> instead of PREPARE TRANSACTION + COMMIT PREPARED. Why is that so?

On normal COMMIT, all buffers related to the table are flushed from the 
buffer cache, and the file is deleted. On PREPARE TRANSACTION, the 
buffers and the file are not immediately dropped, but the relfilenode (= 
filename) of the file is stored in the two-phase state file. On COMMIT 
PREPARED, the state file is read, the buffers are dropped and the file 
is deleted.

Temporary tables don't use the shared buffer cache, but a backend-local 
buffer cache. In PREPARE TRANSACTION, we don't make any note of which 
tables are temporary, because there shouldn't be any, because we 
should've aborted if you have operated on temporary tables. But as we 
found out, that check in the lock manager isn't working. Therefore when 
COMMIT PREPARED is run, we delete the file, but don't flush the buffers 
from the backend-local temporary buffer cache. The leftover buffers 
cause the "relation not found" error later on, when we try to flush them 
to disk to make room in the cache for other pages.

> [2] From all of the discussion here since my first post, I understand
> that there are complications for session-level TEMP tables. But is it
> easier to support PREPARE TRANSACTION for transactions that create and
> drop their TEMP tables, i.e., so that the tables are not session-level
> but just transaction-level?

Yes, if the table is created and dropped in the same transaction, that 
avoids many of the problems. I think we could get away with dropping the 
buffers, deleting the file, and releasing locks on it immediately at 
PREPARE TRANSACTION in that case. It wouldn't behave exactly the same as 
a normal transaction, though. The lock would be released early, which 
would allow another transaction to create a table with the same name 
before the transaction has been committed/rolled back.

ON COMMIT DELETE ROWS could be handled like that as well.

> [3] I am not certain how widespread they might be, but I think there
> may be some backward compatibility concerns with the patch you are
> proposing. On the one hand, the documentation says, "It is not
> currently allowed to PREPARE a transaction that has executed any
> operations involving temporary tables or created any cursors WITH
> HOLD." But temporary tables that are created ON COMMIT DROP are more
> like cursors that do not have WITH HOLD specified. So it does not seem
> clear from the documentation that PREPARE TRANSACTION is not
> supported, and indeed due to the lack of a check in Postgres today, it
> seems as though it is supported. Do you think there is a risk in
> breaking applications?

Well, the current behavior is certainly broken, so an application 
relying on it is in trouble anyway :-(. Even if we came up with a patch 
for 8.4 to relax the limitation, I doubt it would be safe enough to 
backport to stable branches.

Is your application relying on this? As a workaround, you could use 
non-temporary tables instead.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> Tom Lane wrote:
>>> Actually ... why are we using the lock manager to drive this at all?
>
>> Good question. It has always seemed a bit strange to me. The assumption
>> that we always hold the lock on temp table until end of transaction,
>> while true today, seems weak to me.
>
> Looking back, I think it was driven by the desire to tie the behavior
> directly to things that are going to get persisted, such as locks.
> From that standpoint your initial patch to attach a temp-check to
> relation-drop 2PC entries would be the right kind of design.  However,
> given what we now know about the lock situation, I'd feel uncomfortable
> with applying that without also fixing LockTagIsTemp, and right now
> that's looking like much more complexity and possible performance
> penalty than it's worth.

Looking closer, this actually worked in 8.1, and was broken in 8.2 by
this change:

> date: 2006-07-31 21:09:05 +0100;  author: tgl;  state: Exp;  lines: +167 -48;
> Change the relation_open protocol so that we obtain lock on a relation
> (table or index) before trying to open its relcache entry.  This fixes
> race conditions in which someone else commits a change to the relation's
> catalog entries while we are in process of doing relcache load.  Problems
> of that ilk have been reported sporadically for years, but it was not
> really practical to fix until recently --- for instance, the recent
> addition of WAL-log support for in-place updates helped.
>
> Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
> concurrent update.

Before that, we had an isTempObject flag in LOCALLOCK, which worked even
when the relation was dropped later on, unlike LockTagIsTemp.

Anyway, patches attached, using the global flag approach, for 8.2 and
8.3. As discussed earlier, since the flag is global, we won't allow
PREPARE TRANSACTION if you have operated on a temp table in an aborted
subxact, but I think that's acceptable.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.249
diff -c -r1.249 heapam.c
*** src/backend/access/heap/heapam.c    30 Jan 2008 18:35:55 -0000    1.249
--- src/backend/access/heap/heapam.c    4 Mar 2008 12:43:25 -0000
***************
*** 868,873 ****
--- 868,877 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      pgstat_initstats(r);

      return r;
***************
*** 912,917 ****
--- 916,925 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      pgstat_initstats(r);

      return r;
***************
*** 958,963 ****
--- 966,975 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      pgstat_initstats(r);

      return r;
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.257
diff -c -r1.257 xact.c
*** src/backend/access/transam/xact.c    15 Jan 2008 18:56:59 -0000    1.257
--- src/backend/access/transam/xact.c    4 Mar 2008 12:29:21 -0000
***************
*** 189,194 ****
--- 189,200 ----
  static bool forceSyncCommit = false;

  /*
+  * MyXactAccessedTempRel is set when a temporary relation is accessed. We
+  * don't allow PREPARE TRANSACTION in that case.
+  */
+ bool MyXactAccessedTempRel = false;
+
+ /*
   * Private context for transaction-abort work --- we reserve space for this
   * at startup to ensure that AbortTransaction and AbortSubTransaction can work
   * when we've run out of memory.
***************
*** 1445,1450 ****
--- 1451,1457 ----
      XactIsoLevel = DefaultXactIsoLevel;
      XactReadOnly = DefaultXactReadOnly;
      forceSyncCommit = false;
+     MyXactAccessedTempRel = false;

      /*
       * reinitialize within-transaction counters
***************
*** 1770,1775 ****
--- 1777,1808 ----

      /* NOTIFY and flatfiles will be handled below */

+     /*
+      * Don't allow PREPARE TRANSACTION if we've accessed a temporary table
+      * in this transaction. It's not clear what should happen if a prepared
+      * transaction holds a lock on a temp table, and the original backend
+      * exits and deletes the file, for example. Also, if a temp table is
+      * dropped, we have no way of flushing temp buffers from the backend-
+      * private temp buffer cache of the original backend at COMMIT PREPARED,
+      * since it can be executed in a different backend.
+      *
+      * We need to check this after executing any ON COMMIT actions, because
+      * they might still access a temp relation.
+      *
+      * It would be nice to relax this restriction so that you could operate on
+      * ON COMMIT DELETE ROWS temp tables, or on one created and dropped in the
+      * same transaction, by releasing the locks on it at PREPARE TRANSACTION,
+      * instead of COMMIT PREPARED as usual. Another case we could allow is if
+      * the access to the temp relation happened in a subtransaction that later
+      * rolled back. MyXactAccessedTempRel would need to be per-subxact to track
+      * that, but it doesn't seem worth the effort given how narrow the use
+      * case for that is.
+      */
+     if (MyXactAccessedTempRel)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
+
      /* Prevent cancel/die interrupt while cleaning up */
      HOLD_INTERRUPTS();

Index: src/backend/storage/lmgr/lmgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/lmgr/lmgr.c,v
retrieving revision 1.96
diff -c -r1.96 lmgr.c
*** src/backend/storage/lmgr/lmgr.c    8 Jan 2008 23:18:50 -0000    1.96
--- src/backend/storage/lmgr/lmgr.c    4 Mar 2008 12:13:28 -0000
***************
*** 662,706 ****
      LockRelease(&tag, lockmode, false);
  }

-
- /*
-  * LockTagIsTemp
-  *        Determine whether a locktag is for a lock on a temporary object
-  *
-  * We need this because 2PC cannot deal with temp objects
-  */
- bool
- LockTagIsTemp(const LOCKTAG *tag)
- {
-     switch ((LockTagType) tag->locktag_type)
-     {
-         case LOCKTAG_RELATION:
-         case LOCKTAG_RELATION_EXTEND:
-         case LOCKTAG_PAGE:
-         case LOCKTAG_TUPLE:
-             /* check for lock on a temp relation */
-             /* field1 is dboid, field2 is reloid for all of these */
-             if ((Oid) tag->locktag_field1 == InvalidOid)
-                 return false;    /* shared, so not temp */
-             if (isTempOrToastNamespace(get_rel_namespace((Oid) tag->locktag_field2)))
-                 return true;
-             break;
-         case LOCKTAG_TRANSACTION:
-         case LOCKTAG_VIRTUALTRANSACTION:
-             /* there are no temp transactions */
-             break;
-         case LOCKTAG_OBJECT:
-             /* there are currently no non-table temp objects */
-             break;
-         case LOCKTAG_USERLOCK:
-         case LOCKTAG_ADVISORY:
-             /* assume these aren't temp */
-             break;
-     }
-     return false;                /* default case */
- }
-
-
  /*
   * Append a description of a lockable object to buf.
   *
--- 662,667 ----
Index: src/backend/storage/lmgr/lock.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/lmgr/lock.c,v
retrieving revision 1.181
diff -c -r1.181 lock.c
*** src/backend/storage/lmgr/lock.c    2 Feb 2008 22:26:17 -0000    1.181
--- src/backend/storage/lmgr/lock.c    4 Mar 2008 12:13:09 -0000
***************
*** 1871,1882 ****
                  elog(ERROR, "cannot PREPARE when session locks exist");
          }

-         /* Can't handle it if the lock is on a temporary object */
-         if (LockTagIsTemp(&locallock->tag.lock))
-             ereport(ERROR,
-                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                      errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
-
          /*
           * Create a 2PC record.
           */
--- 1871,1876 ----
Index: src/include/access/xact.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/xact.h,v
retrieving revision 1.93
diff -c -r1.93 xact.h
*** src/include/access/xact.h    1 Jan 2008 19:45:56 -0000    1.93
--- src/include/access/xact.h    4 Mar 2008 12:27:12 -0000
***************
*** 44,49 ****
--- 44,51 ----
  /* Asynchronous commits */
  extern bool XactSyncCommit;

+ extern bool MyXactAccessedTempRel;
+
  /*
   *    start- and end-of-transaction callbacks for dynamically loaded modules
   */
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.222.2.1
diff -c -r1.222.2.1 heapam.c
*** src/backend/access/heap/heapam.c    4 Feb 2007 20:00:49 -0000    1.222.2.1
--- src/backend/access/heap/heapam.c    4 Mar 2008 12:42:05 -0000
***************
*** 699,704 ****
--- 699,708 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      return r;
  }

***************
*** 741,746 ****
--- 745,755 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+
+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      return r;
  }

***************
*** 785,790 ****
--- 794,803 ----
      if (!RelationIsValid(r))
          elog(ERROR, "could not open relation with OID %u", relationId);

+     /* Make note that we've accessed a temporary relation. */
+     if (r->rd_istemp)
+         MyXactAccessedTempRel = true;
+
      return r;
  }

Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.229.2.3
diff -c -r1.229.2.3 xact.c
*** src/backend/access/transam/xact.c    3 Jan 2008 21:23:45 -0000    1.229.2.3
--- src/backend/access/transam/xact.c    4 Mar 2008 12:34:43 -0000
***************
*** 174,179 ****
--- 174,185 ----
  static char *prepareGID;

  /*
+  * MyXactAccessedTempRel is set when a temporary relation is accessed. We
+  * don't allow PREPARE TRANSACTION in that case.
+  */
+ bool MyXactAccessedTempRel = false;
+
+ /*
   * Private context for transaction-abort work --- we reserve space for this
   * at startup to ensure that AbortTransaction and AbortSubTransaction can work
   * when we've run out of memory.
***************
*** 1389,1394 ****
--- 1395,1401 ----
      FreeXactSnapshot();
      XactIsoLevel = DefaultXactIsoLevel;
      XactReadOnly = DefaultXactReadOnly;
+     MyXactAccessedTempRel = false;

      /*
       * reinitialize within-transaction counters
***************
*** 1715,1720 ****
--- 1722,1753 ----

      /* NOTIFY and flatfiles will be handled below */

+     /*
+      * Don't allow PREPARE TRANSACTION if we've accessed a temporary table
+      * in this transaction. It's not clear what should happen if a prepared
+      * transaction holds a lock on a temp table, and the original backend
+      * exits and deletes the file, for example. Also, if a temp table is
+      * dropped, we have no way of flushing temp buffers from the backend-
+      * private temp buffer cache of the original backend at COMMIT PREPARED,
+      * since it can be executed in a different backend.
+      *
+      * We need to check this after executing any ON COMMIT actions, because
+      * they might still access a temp relation.
+      *
+      * It would be nice to relax this restriction so that you could operate on
+      * ON COMMIT DELETE ROWS temp tables, or on one created and dropped in the
+      * same transaction, by releasing the locks on it at PREPARE TRANSACTION,
+      * instead of COMMIT PREPARED as usual. Another case we could allow is if
+      * the access to the temp relation happened in a subtransaction that later
+      * rolled back. MyXactAccessedTempRel would need to be per-subxact to track
+      * that, but it doesn't seem worth the effort given how narrow the use
+      * case for that is.
+      */
+     if (MyXactAccessedTempRel)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
+
      /* Prevent cancel/die interrupt while cleaning up */
      HOLD_INTERRUPTS();

Index: src/backend/storage/lmgr/lmgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/lmgr/lmgr.c,v
retrieving revision 1.89
diff -c -r1.89 lmgr.c
*** src/backend/storage/lmgr/lmgr.c    4 Oct 2006 00:29:57 -0000    1.89
--- src/backend/storage/lmgr/lmgr.c    4 Mar 2008 12:35:00 -0000
***************
*** 598,637 ****

      LockRelease(&tag, lockmode, false);
  }
-
-
- /*
-  * LockTagIsTemp
-  *        Determine whether a locktag is for a lock on a temporary object
-  *
-  * We need this because 2PC cannot deal with temp objects
-  */
- bool
- LockTagIsTemp(const LOCKTAG *tag)
- {
-     switch (tag->locktag_type)
-     {
-         case LOCKTAG_RELATION:
-         case LOCKTAG_RELATION_EXTEND:
-         case LOCKTAG_PAGE:
-         case LOCKTAG_TUPLE:
-             /* check for lock on a temp relation */
-             /* field1 is dboid, field2 is reloid for all of these */
-             if ((Oid) tag->locktag_field1 == InvalidOid)
-                 return false;    /* shared, so not temp */
-             if (isTempNamespace(get_rel_namespace((Oid) tag->locktag_field2)))
-                 return true;
-             break;
-         case LOCKTAG_TRANSACTION:
-             /* there are no temp transactions */
-             break;
-         case LOCKTAG_OBJECT:
-             /* there are currently no non-table temp objects */
-             break;
-         case LOCKTAG_USERLOCK:
-         case LOCKTAG_ADVISORY:
-             /* assume these aren't temp */
-             break;
-     }
-     return false;                /* default case */
- }
--- 598,600 ----
Index: src/backend/storage/lmgr/lock.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/lmgr/lock.c,v
retrieving revision 1.174.2.1
diff -c -r1.174.2.1 lock.c
*** src/backend/storage/lmgr/lock.c    2 Feb 2008 22:26:23 -0000    1.174.2.1
--- src/backend/storage/lmgr/lock.c    4 Mar 2008 12:32:30 -0000
***************
*** 1849,1860 ****
                  elog(ERROR, "cannot PREPARE when session locks exist");
          }

-         /* Can't handle it if the lock is on a temporary object */
-         if (LockTagIsTemp(&locallock->tag.lock))
-             ereport(ERROR,
-                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                      errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
-
          /*
           * Create a 2PC record.
           */
--- 1849,1854 ----
Index: src/include/access/xact.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/xact.h,v
retrieving revision 1.83
diff -c -r1.83 xact.h
*** src/include/access/xact.h    11 Jul 2006 18:26:11 -0000    1.83
--- src/include/access/xact.h    4 Mar 2008 12:32:30 -0000
***************
*** 41,46 ****
--- 41,48 ----
  extern bool DefaultXactReadOnly;
  extern bool XactReadOnly;

+ extern bool MyXactAccessedTempRel;
+
  /*
   *    start- and end-of-transaction callbacks for dynamically loaded modules
   */

"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Tom Lane wrote:
>> Looking back, I think it was driven by the desire to tie the behavior
>> directly to things that are going to get persisted, such as locks.
>> From that standpoint your initial patch to attach a temp-check to
>> relation-drop 2PC entries would be the right kind of design.  However,
>> given what we now know about the lock situation, I'd feel uncomfortable
>> with applying that without also fixing LockTagIsTemp, and right now
>> that's looking like much more complexity and possible performance
>> penalty than it's worth.

> Looking closer, this actually worked in 8.1, and was broken in 8.2 by 
> this change:

Argh, so it's actually my bug :-(

> Before that, we had an isTempObject flag in LOCALLOCK, which worked even 
> when the relation was dropped later on, unlike LockTagIsTemp.

Yeah.  I guess my hindbrain was remembering that arrangement, because
adding such a field to LOCALLOCK was what I was first thinking about.
The problem though is that we need to take a lock on a table before
reading its pg_class row, in order to avoid race conditions when the
row is being deleted or updated.  So we can't easily know at the time
the lock is taken whether it's a temp table or not.  (In some contexts
such as the parser we might know which schema the table is in, but
most places are expected to be able to open and lock the table knowing
only its OID.)  I think the only feasible solution like that would
involve calling the lock manager a second time to mark the lock temp
after we'd looked at the pg_class row.  Which is not impossible, but
it would cost an extra hashtable search for each open of a temp table,
and it isn't really buying us anything compared to setting a global
flag in the same places.

> Anyway, patches attached, using the global flag approach, for 8.2 and 
> 8.3. As discussed earlier, since the flag is global, we won't allow 
> PREPARE TRANSACTION if you have operated on a temp table in an aborted 
> subxact, but I think that's acceptable.

Patch looks good in a fast once-over, I'll check it more carefully
and apply today.
        regards, tom lane


"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> John Smith wrote:
>> [3] I am not certain how widespread they might be, but I think there
>> may be some backward compatibility concerns with the patch you are
>> proposing.

> Well, the current behavior is certainly broken, so an application 
> relying on it is in trouble anyway :-(. Even if we came up with a patch 
> for 8.4 to relax the limitation, I doubt it would be safe enough to 
> backport to stable branches.

As Heikki pointed out later, PG 8.1 correctly enforces the restriction
against preparing a transaction that has dropped a temp table.  It's
only 8.2.x and 8.3.0 that (appear to) allow this.  So I'm not persuaded
by backwards-compatibility arguments.

I've applied Heikki's new patch, and I think that's as much as we can do
for 8.2 and 8.3.  Any improvement in the functionality would be new
development (and not trivial development, either) for 8.4 or later.
        regards, tom lane


Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > John Smith wrote:
> >> [3] I am not certain how widespread they might be, but I think there
> >> may be some backward compatibility concerns with the patch you are
> >> proposing.
> 
> > Well, the current behavior is certainly broken, so an application 
> > relying on it is in trouble anyway :-(. Even if we came up with a patch 
> > for 8.4 to relax the limitation, I doubt it would be safe enough to 
> > backport to stable branches.
> 
> As Heikki pointed out later, PG 8.1 correctly enforces the restriction
> against preparing a transaction that has dropped a temp table.  It's
> only 8.2.x and 8.3.0 that (appear to) allow this.  So I'm not persuaded
> by backwards-compatibility arguments.
> 
> I've applied Heikki's new patch, and I think that's as much as we can do
> for 8.2 and 8.3.  Any improvement in the functionality would be new
> development (and not trivial development, either) for 8.4 or later.

Is there a TODO here?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Bruce Momjian wrote:
> Tom Lane wrote:
>> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>>> John Smith wrote:
>>>> [3] I am not certain how widespread they might be, but I think there
>>>> may be some backward compatibility concerns with the patch you are
>>>> proposing.
>>> Well, the current behavior is certainly broken, so an application 
>>> relying on it is in trouble anyway :-(. Even if we came up with a patch 
>>> for 8.4 to relax the limitation, I doubt it would be safe enough to 
>>> backport to stable branches.
>> As Heikki pointed out later, PG 8.1 correctly enforces the restriction
>> against preparing a transaction that has dropped a temp table.  It's
>> only 8.2.x and 8.3.0 that (appear to) allow this.  So I'm not persuaded
>> by backwards-compatibility arguments.
>>
>> I've applied Heikki's new patch, and I think that's as much as we can do
>> for 8.2 and 8.3.  Any improvement in the functionality would be new
>> development (and not trivial development, either) for 8.4 or later.
> 
> Is there a TODO here?

Yes, please:

"Allow two-phase commit when a temporary table is created and dropped in 
the same transaction, or when an ON COMMIT DELETE ROWS temporary table 
is accessed"

Hmm. If we can do that, I guess we could allow read-only queries on temp 
tables as well.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

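The second case in the proposed TODO wording — an ON COMMIT DELETE ROWS temp table that is merely accessed — can be sketched the same way. Again a hypothetical session (names invented, `max_prepared_transactions` > 0 assumed); under the current restriction the PREPARE is rejected, and the TODO item is about relaxing exactly this:

```sql
CREATE TEMP TABLE session_cache (k integer) ON COMMIT DELETE ROWS;
BEGIN;
-- Merely touching the temp table marks the transaction as having
-- operated on temporary tables, so the PREPARE below is refused.
INSERT INTO session_cache VALUES (1);
PREPARE TRANSACTION 'demo2';
```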

Added to TODO:

* Allow prepared transactions with temporary tables created and dropped
  in the same transaction, and when an ON COMMIT DELETE ROWS temporary
  table is accessed
 
 http://archives.postgresql.org/pgsql-hackers/2008-03/msg00047.php


---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Bruce Momjian wrote:
> > Tom Lane wrote:
> >> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> >>> John Smith wrote:
> >>>> [3] I am not certain how widespread they might be, but I think there
> >>>> may be some backward compatibility concerns with the patch you are
> >>>> proposing.
> >>> Well, the current behavior is certainly broken, so an application 
> >>> relying on it is in trouble anyway :-(. Even if we came up with a patch 
> >>> for 8.4 to relax the limitation, I doubt it would be safe enough to 
> >>> backport to stable branches.
> >> As Heikki pointed out later, PG 8.1 correctly enforces the restriction
> >> against preparing a transaction that has dropped a temp table.  It's
> >> only 8.2.x and 8.3.0 that (appear to) allow this.  So I'm not persuaded
> >> by backwards-compatibility arguments.
> >>
> >> I've applied Heikki's new patch, and I think that's as much as we can do
> >> for 8.2 and 8.3.  Any improvement in the functionality would be new
> >> development (and not trivial development, either) for 8.4 or later.
> > 
> > Is there a TODO here?
> 
> Yes, please:
> 
> "Allow two-phase commit when a temporary table is created and dropped in 
> the same transaction, or when an ON COMMIT DELETE ROWS temporary table 
> is accessed"
> 
> Hmm. If we can do that, I guess we could allow read-only queries on temp 
> tables as well.
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +