Thread: Update using primary key slow


From: Denis
The following update was captured in the database log and the elapsed time
was 1058.956 ms.  A later explain analyze shows total run time of 730 ms.
But isn't the actual time to update the row 183 ms? Where is the
other 547 ms coming from? Updating the two secondary indexes?


Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-1] LOG:  duration:
1058.956 ms  statement: UPDATE CONTRACT SET CUSTOMER_KEY = 143567
,SOURCE_CODE_KEY = null ,PRIOR_CONTRACT =
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-2]
'265985' ,PRIOR_CONTRACT_ELITE = null ,CONTRACT_NEW = 'N' ,RENEWAL_CONTRACT
= '1373990'  ,RENEWAL_CONTRACT_ELITE = null
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-3]  ,CONTRACT_DROPPED =
'N' ,TOTAL_SALE_DOLLARS = 3492.00 ,TARGET_DOLLARS = 3576
,ASSIGN_DOLLARS_OVERRIDE = null ,BOOK_KEY = 160
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-4]  ,PUB_SEQUENCE = 25
,DROP_DATE = null ,OUT_OF_BUSINESS = 'N' ,RENEWAL_SALESREP_KEY = 3639
,SALESREP_KEY = 3639 ,NEW_CATEGORY =
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-5]
'NEW_INSERT' ,PENDING_DELETE_DATE = null ,CLIENT_NAME = null ,DATE_SOLD =
'2004-08-30'  ,DATE_RECEIVED = '2004-09-03'
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-6]  ,DATE_ENTERED =
'2004-09-07'  ,DATE_SHELLED = null  ,DATE_APPROVED = '2004-09-09'  WHERE
REGION_KEY = 14  AND CONTRACT_KEY =
Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-7]  1070135


The explain for this update is as follows.


dw=# begin;
BEGIN
dw=#
dw=# explain analyze UPDATE CONTRACT SET CUSTOMER_KEY = 143567,
SOURCE_CODE_KEY = null ,
dw-# PRIOR_CONTRACT = '265985' ,PRIOR_CONTRACT_ELITE = null ,CONTRACT_NEW =
'N' ,RENEWAL_CONTRACT = '1373990'  ,RENEWAL_CONTRACT_ELITE = null
dw-#  ,CONTRACT_DROPPED = 'N' ,TOTAL_SALE_DOLLARS = 3492.00 ,TARGET_DOLLARS
= 3576 ,ASSIGN_DOLLARS_OVERRIDE = null ,BOOK_KEY = 160
dw-#  ,PUB_SEQUENCE = 25 ,DROP_DATE = null ,OUT_OF_BUSINESS =
'N' ,RENEWAL_SALESREP_KEY = 3639 ,SALESREP_KEY = 3639
dw-#  ,NEW_CATEGORY = 'NEW_INSERT' ,PENDING_DELETE_DATE = null ,CLIENT_NAME
= null ,DATE_SOLD = '2004-08-30'  ,DATE_RECEIVED = '2004-09-03'
dw-#   ,DATE_ENTERED = '2004-09-07'  ,DATE_SHELLED = null  ,DATE_APPROVED =
'2004-09-09'
dw-# WHERE REGION_KEY = 14  AND CONTRACT_KEY = 1070135;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using contract_pkey on contract  (cost=0.00..10.61 rows=3
width=115) (actual time=0.181..0.183 rows=1 loops=1)
   Index Cond: ((contract_key = 1070135) AND (region_key = 14))
 Total runtime: 0.730 ms
(3 rows)

dw=# rollback;
ROLLBACK



Here are the table and index definitions:

dw=# \d contract
                         Table "ods.contract"
         Column          |            Type             |   Modifiers
-------------------------+-----------------------------+---------------
 contract_key            | integer                     | not null
 customer_key            | integer                     | not null
 source_code_key         | integer                     |
 prior_contract          | character varying(20)       |
 prior_contract_elite    | character varying(20)       |
 renewal_contract        | character varying(20)       |
 contract_dropped        | character varying(1)        | not null
 renewal_contract_elite  | character varying(20)       |
 total_sale_dollars      | numeric(9,2)                | not null
 assign_dollars_override | numeric(9,2)                |
 target_dollars          | numeric(9,2)                |
 book_key                | integer                     | not null
 pub_sequence            | integer                     |
 drop_date               | timestamp without time zone |
 out_of_business         | character varying(1)        | not null
 salesrep_key            | integer                     |
 renewal_salesrep_key    | integer                     |
 new_category            | character varying(20)       |
 region_key              | integer                     | not null
 contract_new            | character varying(1)        | not null
 pending_delete_date     | timestamp without time zone |
 client_name             | character varying(150)      |
 fuzzy_client_name       | character varying(150)      |
 last_update_date        | timestamp without time zone | default now()
 date_sold               | date                        |
 date_received           | date                        |
 date_entered            | date                        |
 date_shelled            | date                        |
 date_approved           | date                        |
Indexes:
    "contract_pkey" primary key, btree (contract_key, region_key)
    "XIE1_Contract" btree (region_key, book_key, pub_sequence)
    "XIE2_Contract" btree (customer_key, region_key)


The table contains approximately 5 million rows.

Postgres version is PostgreSQL 7.4.3 on i686-pc-linux-gnu, compiled by GCC
gcc (GCC) 3.2.2


The machine has 4 Intel Xeon 3.0GHz processors and 3GB of memory

shared_buffers = 16384
sort_mem = 8192
vacuum_mem = 8192
effective_cache_size = 262144



Re: Update using primary key slow

From: Tom Lane
Denis <denis.sailer@yellowbook.com> writes:
> The following update was captured in the database log and the elapsed time
> was 1058.956 ms.  A later explain analyze shows total run time of 730 ms.
> Although isn't the actual time to update the row 183 ms.  Where is the
> other 547 ms coming from?  Updating the two secondary indexes??

The 183 msec is the time needed to *fetch* the row, not the time to
update it.  So it could well be that the other time is just the time
needed to update the table and indexes.  If this seems slower than
your hardware ought to be able to handle, I'd wonder about how recently
the table has been vacuumed.

            regards, tom lane
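
One way to test the vacuuming theory (a sketch only, reusing the table and key values quoted above) is to vacuum the table and then re-time an equivalent update inside a transaction that is rolled back, so the test leaves no trace:

```sql
-- Reclaim dead row versions and refresh planner statistics; VERBOSE
-- prints per-index and per-heap figures like those quoted later in
-- this thread.
VACUUM VERBOSE ANALYZE contract;

-- Re-time the fetch and update without persisting the change.
BEGIN;
EXPLAIN ANALYZE UPDATE contract SET customer_key = customer_key
  WHERE region_key = 14 AND contract_key = 1070135;
ROLLBACK;
```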

Re: Update using primary key slow

From: Denis
tgl@sss.pgh.pa.us (Tom Lane) wrote in
news:19722.1130429883@sss.pgh.pa.us:

> Denis <denis.sailer@yellowbook.com> writes:
>> The following update was captured in the database log and the elapsed
>> time was 1058.956 ms.  A later explain analyze shows total run time
>> of 730 ms.  Although isn't the actual time to update the row 183 ms.
>> Where is the other 547 ms coming from?  Updating the two secondary
>> indexes??
>
> The 183 msec is the time needed to *fetch* the row, not the time to
> update it.  So it could well be that the other time is just the time
> needed to update the table and indexes.  If this seems slower than
> your hardware ought to be able to handle, I'd wonder about how
> recently the table has been vacuumed.
>
>                regards, tom lane

A vacuumdb runs at 6 AM and 5 PM.

In addition, this table is vacuumed at 2 AM, 8 AM, 10 AM, 12 PM, 2 PM, and 4 PM.

This is the vacuum output from last night's 5 PM run:


INFO:  vacuuming "ods.contract"
INFO:  index "XIE1_Contract" now contains 5105322 row versions in 27710
pages
DETAIL:  2174 index row versions were removed.
893 index pages have been deleted, 893 are currently reusable.
CPU 1.91s/1.58u sec elapsed 34.14 sec.
INFO:  index "XIE2_Contract" now contains 5105331 row versions in 21701
pages
DETAIL:  2174 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.40s/1.42u sec elapsed 22.73 sec.
INFO:  index "contract_pkey" now contains 5105337 row versions in 21480
pages
DETAIL:  2174 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.80s/1.52u sec elapsed 18.59 sec.
INFO:  "contract": removed 2174 row versions in 893 pages
DETAIL:  CPU 0.42s/0.08u sec elapsed 1.22 sec.
INFO:  "contract": found 2174 removable, 5105321 nonremovable row
versions in 129154 pages
DETAIL:  1357 dead row versions cannot be removed yet.
There were 1967941 unused item pointers.
0 pages are entirely empty.
CPU 11.38s/5.09u sec elapsed 85.48 sec.
INFO:  analyzing "ods.contract"
INFO:  "contract": 129154 pages, 3000 rows sampled, 5277622 estimated
total rows


Here is the latest vacuum output from today:

INFO:  vacuuming "ods.contract"
INFO:  index "XIE1_Contract" now contains 5106346 row versions in 28233
pages
DETAIL:  64146 index row versions were removed.
706 index pages have been deleted, 669 are currently reusable.
CPU 2.03s/2.33u sec elapsed 20.08 sec.
INFO:  index "XIE2_Contract" now contains 5106347 row versions in 21951
pages
DETAIL:  64146 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 2.16s/3.39u sec elapsed 12.23 sec.
INFO:  index "contract_pkey" now contains 5106347 row versions in 21516
pages
DETAIL:  64146 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.76s/2.47u sec elapsed 11.80 sec.
INFO:  "contract": removed 64146 row versions in 26115 pages
DETAIL:  CPU 1.94s/2.55u sec elapsed 7.78 sec.
INFO:  "contract": found 64146 removable, 5106307 nonremovable row
versions in 129154 pages
DETAIL:  890 dead row versions cannot be removed yet.
There were 1905028 unused item pointers.
0 pages are entirely empty.
CPU 14.83s/11.48u sec elapsed 60.96 sec.
INFO:  analyzing "ods.contract"
INFO:  "contract": 129154 pages, 3000 rows sampled, 5236929 estimated
total rows


I would think this should be very fast.  I already described the CPU and
memory.  The disk is backed by an EMC DMX2000.  This particular server
has one physical volume group of 500 GB, which is split over two logical
volumes: one for $PGDATA and the other for PG_XLOG.  This split was not
really done for performance, since both come from the same physical volume
group, but more for space manageability.  The physical volume group
consists of 11 GB stripes from across the EMC SAN, so that would be
about 50 stripes, which really come from dozens of backend disk
drives.  Typical I/O response times for these are 3-5 ms.

Re: Update using primary key slow

From: Tom Lane
Denis <denis.sailer@yellowbook.com> writes:
> tgl@sss.pgh.pa.us (Tom Lane) wrote in
> news:19722.1130429883@sss.pgh.pa.us:
>> The 183 msec is the time needed to *fetch* the row, not the time to
>> update it.  So it could well be that the other time is just the time
>> needed to update the table and indexes.  If this seems slower than
>> your hardware ought to be able to handle, I'd wonder about how
>> recently the table has been vacuumed.

> Here is the latest vacuum today.

> INFO:  "contract": removed 64146 row versions in 26115 pages
> DETAIL:  CPU 1.94s/2.55u sec elapsed 7.78 sec.
> INFO:  "contract": found 64146 removable, 5106307 nonremovable row
> versions in 129154 pages
> DETAIL:  890 dead row versions cannot be removed yet.
> There were 1905028 unused item pointers.

The "unused item pointers" number seems a bit high, but otherwise that
looks pretty reasonable.

Is it possible that the particular row you were updating has been
updated quite a lot of times since the last vacuum?  Or even quite
a few times within a single transaction?  The only thing I can think
of that would explain such a slow fetch is if the code has to reject
a bunch of recently-dead versions of the row.

            regards, tom lane
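
The scenario described above can be sketched as follows (illustration only): each UPDATE writes a new row version and leaves the old one behind as a dead tuple, so a later index scan for the same key must visit and reject every recently-dead version before reaching the live row:

```sql
-- Illustration only: even a no-op update creates a new row version.
BEGIN;
UPDATE contract SET customer_key = customer_key
  WHERE region_key = 14 AND contract_key = 1070135;  -- dead version 1
UPDATE contract SET customer_key = customer_key
  WHERE region_key = 14 AND contract_key = 1070135;  -- dead version 2
-- ...repeated N times leaves N dead versions behind...
COMMIT;
-- A later VACUUM removes the dead versions, leaving unused item
-- pointers on the pages they occupied.
```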

Reasons and drawbacks for unused item pointers

From: Martin Lesser

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Denis <denis.sailer@yellowbook.com> writes:
>> There were 1905028 unused item pointers.
> The "unused item pointers" number seems a bit high, but otherwise that
> looks pretty reasonable.
>
> Is it possible that the particular row you were updating has been
> updated quite a lot of times since the last vacuum?  Or even quite
> a few times within a single transaction?

What causes these "unused item pointers", and what impact do they have
on performance?

If I understood your last posting correctly, more than one update of a
single row between two vacuums would result in one or more "unused item
pointers".  Does this slow down the vacuum process and/or other
processes?  So far I have not found an answer as to what this number
implies.

Regards

Martin

Re: Reasons and drawbacks for unused item pointers

From: Tom Lane

Martin Lesser <ml-pgsql@bettercom.de> writes:
> What causes these "unused item pointers", and what impact do they have
> on performance?

Those are item pointer slots that were once used but aren't used at the
moment.  VACUUM leaves an empty slot behind when it removes a dead
tuple, and the slot is then available for re-use next time a tuple is
created on that page.  See
http://developer.postgresql.org/docs/postgres/storage-page-layout.html

The direct performance impact is really pretty minimal (and none at all
on indexscans, AFAIR).  The reason Denis' number drew my attention was
that it implied that the table had gone un-vacuumed for awhile at some
time in the past.  His stats were showing about 64000 tuples deleted
per vacuum pass, which would have created 64000 unused item pointers
--- but in a steady-state situation those would be eaten up again by
the time of the next vacuum.  To have 1905028 unused pointers in a
table with only 5106307 live entries suggests that at some point there
were 1.9 million (or so) dead but not-yet-vacuumed tuples, which
suggests insufficient vacuuming.

            regards, tom lane
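
If the contrib/pgstattuple module is installed (it ships in PostgreSQL's contrib directory; this assumes it has been built and loaded into the database), the live and dead tuple counts and free space can be inspected directly rather than inferred from VACUUM VERBOSE output:

```sql
-- Reports table length, live and dead tuple counts and lengths, and
-- free-space percentages for the named relation.
SELECT * FROM pgstattuple('ods.contract');
```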

Re: Reasons and drawbacks for unused item pointers

From: Martin Lesser
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Martin Lesser <ml-pgsql@bettercom.de> writes:
>> What causes these "unused item pointers", and what impact do they have
>> on performance?
> The direct performance impact is really pretty minimal (and none at
> all on indexscans, AFAIR).  The reason Denis' number drew my attention
> was that it implied that the table had gone un-vacuumed for awhile at
> some time in the past. [...]  To have 1905028 unused pointers in a
> table with only 5106307 live entries suggests that at some point there
> were 1.9 million (or so) dead but not-yet-vacuumed tuples, which
> suggests insufficient vacuuming.

Does each update of a single row result in an "unused item pointer"?
I.e., if I update one row 10 times between VACUUMs of the table, do I
end up with 10 unused pointers?

Some rows in some of my tables are updated much more frequently than
others, so I am not sure whether the number of unused pointers implies
that I should VACUUM more often than every 24 hours.

Martin
