Thread: hot update doesn't work?

hot update doesn't work?

From
Pavel Stehule
Date:
Hello

I would like to repeatedly update a non-indexed column of a temp
table. I expected a cheap operation, but it isn't.

postgres=# create temp table x(a int primary key, b int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"x_pkey" for table "x"
CREATE TABLE
Time: 363,339 ms
postgres=# insert into x values(1,0);
INSERT 0 1
Time: 11,676 ms
postgres=# create or replace function s1() returns void as $$ begin
for i in 1..100000 loop update x set b = i where a = 1; end loop;
return; end; $$ language plpgsql;
CREATE FUNCTION
Time: 113,052 ms
postgres=# select s1();
 s1
----

(1 row)

Time: 255223,249 ms
postgres=# select pg_total_relation_size('x');
 pg_total_relation_size
------------------------
                3686400
(1 row)

The size of the table increased significantly :(.

What am I doing wrong?

Regards
Pavel Stehule


Re: hot update doesn't work?

From
"Kevin Grittner"
Date:
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I would like to repeatedly update a non-indexed column of a temp
> table. I expected a cheap operation, but it isn't.
You're updating the row 100000 times within a single transaction.  I
don't *think* HOT will reclaim a version of a row until the
transaction which completed it is done and no other transactions can
see that version any longer.  It does raise the question, though --
couldn't a HOT update of a tuple *which was written by the same
transaction* do an "update in place"?  I mean, the updating
transaction doesn't need to see the old row after this, and other
transactions shouldn't see it either.
I suspect that somewhere in the subtransaction or referential
integrity areas there may be some issues with that, but it would be
a clever optimization if it could be pulled off.
-Kevin


Re: hot update doesn't work?

From
Merlin Moncure
Date:
On Wed, May 12, 2010 at 11:34 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>> I would like to repeatedly update a non-indexed column of a temp
>> table. I expected a cheap operation, but it isn't.
>
> You're updating the row 100000 times within a single transaction.  I
> don't *think* HOT will reclaim a version of a row until the
> transaction which completed it is done and no other transactions can
> see that version any longer.  It does raise the question, though --
> couldn't a HOT update of a tuple *which was written by the same
> transaction* do an "update in place"?  I mean, the updating
> transaction doesn't need to see the old row after this, and other
> transactions shouldn't see it either.
>
> I suspect that somewhere in the subtransaction or referential
> integrity areas there may be some issues with that, but it would be
> a clever optimization if it could be pulled off.

Scripting this outside of a transaction does not exhibit the behavior
-- even with autovacuum off, the relation size tops out around 57k.
Vacuuming as it goes, it seems to top out around 200 row versions
before HOT catches them.  I guess a good way to think of HOT is as a
page-level vacuum.
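A minimal sketch of what "scripting this outside of a transaction" might look like (assuming Pavel's table x; not the exact script used): generate the statements so psql runs each UPDATE as its own autocommit transaction. The previous row version is then dead by the time the next UPDATE hits the page, and HOT pruning can reclaim it.

```sql
-- Each statement is its own transaction under psql autocommit, so the
-- superseded tuple version becomes prunable before the next update.
UPDATE x SET b = 1 WHERE a = 1;
UPDATE x SET b = 2 WHERE a = 1;
-- ... repeat up to 100000, e.g. from a generated script ...
UPDATE x SET b = 100000 WHERE a = 1;

-- The table stays small, unlike the single-transaction plpgsql loop.
SELECT pg_total_relation_size('x');
```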

merlin


Re: hot update doesn't work?

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> You're updating the row 100000 times within a single transaction.  I
> don't *think* HOT will reclaim a version of a row until the
> transaction which completed it is done and no other transactions can
> see that version any longer.  It does raise the question, though --
> couldn't a HOT update of a tuple *which was written by the same
> transaction* do an "update in place"?

Well ... in the first place there is not, ever, any such thing as
"update in place".  The correct question to ask is whether we could
vacuum away the older elements of the HOT chain on the grounds that they
are no longer of interest.  What we would see is tuples with xmin equal
to xmax and cmin different from cmax.  The problem then is to determine
whether there are any live snapshots with curcid between cmin and cmax.
There is 0 hope of doing that from outside the originating backend.
Now if heap_page_prune() is being run by the same backend that generated
the in-doubt tuples, which I will agree is likely in a case like this,
in principle we could do it.  Not sure if it's really worth the trouble
and nonorthogonal behavior.
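The tuple-header state Tom describes can be observed via the system columns (a sketch against Pavel's table x; the SELECT only ever sees the newest version):

```sql
BEGIN;
UPDATE x SET b = b + 1 WHERE a = 1;
UPDATE x SET b = b + 1 WHERE a = 1;
-- The visible (latest) version has xmin = our xid and xmax = 0.  The
-- superseded versions left on the page are the ones with xmin equal
-- to xmax and cmin different from cmax -- exactly the tuples only the
-- originating backend could safely deem prunable.
SELECT xmin, xmax, cmin, cmax, ctid FROM x WHERE a = 1;
COMMIT;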
        regards, tom lane


Re: hot update doesn't work?

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> The correct question to ask is whether we could
> vacuum away the older elements of the HOT chain on the grounds that they
> are no longer of interest.  What we would see is tuples with xmin equal
> to xmax and cmin different from cmax.  The problem then is to determine
> whether there are any live snapshots with curcid between cmin and cmax.
> There is 0 hope of doing that from outside the originating backend.
> Now if heap_page_prune() is being run by the same backend that generated
> the in-doubt tuples, which I will agree is likely in a case like this,
> in principle we could do it.

There's an extra hurdle in the way: If you remove tuples in the middle
of an update chain (not necessarily a HOT update chain), the latest
tuple becomes inaccessible to other transactions running in read
committed mode that might need to find the latest version of the row by
following the ctid pointers.

That's not an issue if the row was created in the same transaction too,
but we don't know that in HeapTupleSatisfiesVacuum.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: hot update doesn't work?

From
Merlin Moncure
Date:
On Wed, May 12, 2010 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> You're updating the row 100000 times within a single transaction.  I
>> don't *think* HOT will reclaim a version of a row until the
>> transaction which completed it is done and no other transactions can
>> see that version any longer.  It does raise the question, though --
>> couldn't a HOT update of a tuple *which was written by the same
>> transaction* do an "update in place"?
>
> Well ... in the first place there is not, ever, any such thing as
> "update in place".  The correct question to ask is whether we could
> vacuum away the older elements of the HOT chain on the grounds that they
> are no longer of interest.  What we would see is tuples with xmin equal
> to xmax and cmin different from cmax.  The problem then is to determine
> whether there are any live snapshots with curcid between cmin and cmax.
> There is 0 hope of doing that from outside the originating backend.
> Now if heap_page_prune() is being run by the same backend that generated
> the in-doubt tuples, which I will agree is likely in a case like this,
> in principle we could do it.  Not sure if it's really worth the trouble
> and nonorthogonal behavior.

Updating the same row repeatedly in a single transaction is not going
to come up that much, and there are a number of simple workarounds to
get better performance.
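One such workaround (a sketch, not one Merlin names): when the intermediate values are never read by other statements, accumulate in a plpgsql variable and write the row once, creating a single new tuple version instead of 100000.

```sql
create or replace function s2() returns void as $$
declare
  v int := 0;
begin
  -- Same loop as s1(), but the per-iteration work stays in a local
  -- variable; only one UPDATE (one new row version) is issued.
  for i in 1..100000 loop
    v := i;
  end loop;
  update x set b = v where a = 1;
end; $$ language plpgsql;
```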

isn't it possible to skip the snapshot check for temp tables though?

merlin


Re: hot update doesn't work?

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> The correct question to ask is whether we could
>> vacuum away the older elements of the HOT chain on the grounds that they
>> are no longer of interest.  What we would see is tuples with xmin equal
>> to xmax and cmin different from cmax.  The problem then is to determine
>> whether there are any live snapshots with curcid between cmin and cmax.
>> There is 0 hope of doing that from outside the originating backend.
>> Now if heap_page_prune() is being run by the same backend that generated
>> the in-doubt tuples, which I will agree is likely in a case like this,
>> in principle we could do it.

> There's an extra hurdle in the way: If you remove tuples in the middle
> of an update chain (not necessarily a HOT update chain), the latest
> tuple becomes inaccessible to other transactions running in read
> committed mode that might need to find the latest version of the row by
> following the ctid pointers.

Hm, right.  In the normal vacuum case this isn't an issue because all
older versions of the row must be dead too, and so there can't be anyone
trying to chase up to the chain end from those versions.  But a rule
such as I suggested above could try to remove tuples whose ancestor is
still live (to somebody).
        regards, tom lane


Re: hot update doesn't work?

From
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> isn't it possible to skip the snapshot check for temp tables though?

No, it's no different from the regular-table case.  There could be
snapshots that could see the older tuple versions --- consider
functions inside queries, etc.
        regards, tom lane