insert/update performance - Mailing list pgsql-hackers

From Jinhua Luo
Subject insert/update performance
Date
Msg-id CAAc9rOxUSnWhDo1GV+i+kktxo6J0kKnZGxCEigCuSku0Mr+xqg@mail.gmail.com
Whole thread Raw
Responses Re: insert/update performance  (Amit Kapila <amit.kapila16@gmail.com>)
Re: insert/update performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi All,

Here is my test environment:

postgresql 9.4.5, Ubuntu 14.04, 8 CPU core, 8GB ram, SCSI hard disk

I have a table with 70 columns, and 6 indexes. The data flow is a
special OLTP model: frequent inserts (2000 tps), and each inserted row
would be updated very soon (i.e. the number of inserts is equal to the
number of updates).

From the start, the table (initially empty) would be extended bigger
and bigger (via mdextend(), which allocates a new page and zero out
the page to the disk). During this period, the query latency is ok
(Because most fsync of writes are queued to the next checkpoint, so
the writes only copy zero-filled data to the page cache of linux
kernel).

After a long time, the mdextend() disappears, instead, the mdread()
occupy almost all I/O requests, because the fsm indicates enough free
space within the old table segments. At this stage, the performance is
bad. Because most page cache of linux kernel is occupied by the latest
table segments and the index files (I check it via mincore() syscall).
The read from disk (in sync way, of course) slow down all subsequent
queries.

Why fsm is updated much slower than the query speed? If the fsm is
updated in time, then the free space would still possibly cached by
the linux kernel. I thought it's due to the autovacuum is not so
aggressive. So I try to reconfigure it, e.g. set cost_delay to 0,
lower down the vacuum threshold. But it doesn't help.

At last, I found it's not the problem of autovacuum.
I do a simple test: I truncate the table, disable the autovacuum, and
run the application for a few minutes, then I invokes vacuum manually,
it gives a strange output:
found 598 removable, 25662 nonremovable row versions in 3476 pages
DETAIL:  0 dead row versions cannot be removed yet

As said before, the number of inserts is equal to the number of
updates. So the bloat of the table should be 100%, and the number of
removable rows should be equal to the number of nonremovable rows,
which is the real number of inserts issued by the application.

But the result shows that the number of nonremovable rows is just a
small fraction. If it's true, then no wonder that the table would keep
extending for a long time, because the free space is almost small.

Why the free space after vacuum is not 50% of the original size in my case?

Please help. Thanks!


Regards,
Jinhua Luo



pgsql-hackers by date:

Previous
From: Thom Brown
Date:
Subject: Re: Patch: Implement failover on libpq connect level.
Next
From: Amit Kapila
Date:
Subject: Re: Relation extension scalability