Home > mailing lists

Re: DELETE vs TRUNCATE explanation - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: DELETE vs TRUNCATE explanation
Date	July 12, 2012 22:01:07
Msg-id	CAMkU=1zJK6WiS9-6eO8rNmRpKSSB4=uw2mTsD=k64mNh__=Dcg@mail.gmail.com Whole thread
In response to	Re: DELETE vs TRUNCATE explanation ("Harold A. Giménez" <harold.gimenez@gmail.com>)
List	pgsql-performance

Tree view

On Thu, Jul 12, 2012 at 4:21 PM, Harold A. Giménez
<harold.gimenez@gmail.com> wrote:
>
> > What is shared_buffers?
>
>
> 1600kB

That is really small, so the buffer flushing should not be a problem.
Unless you mean 1600MB.

> > > This is a rather small schema -- probably a half a dozen tables, and
> > > probably about a dozen indexes. This application is entirely
> > > unremarkable in its test-database workload: it wants to load a few
> > > records, do a few things, and then clear those handful of records.
> >
> > How many rounds of truncation does one rake do? I.e. how many
> > truncations are occurring over the course of that 1 minute or 15
> > minutes?
>
> All tables are cleared out after every test. On this particular project, I'm
> running 200+ tests in 1.5 minutes (or 30 seconds with DELETE instead of
> TRUNCATE). For another, bigger project it's running 1700+ tests in about a
> minute. You can do the math from there.

so 1700 rounds * 18 relations = truncates 30,600 per minute.

That is actually faster than I get truncates to go when I am purely
limited by CPU.

I think the problem is in the Fsync Absorption queue.  Every truncate
adds a FORGET_RELATION_FSYNC to the queue, and processing each one of
those leads to sequential scanning the checkpointer's pending ops hash
table, which is quite large.  It is almost entirely full of other
requests which have already been canceled, but it still has to dig
through them all.   So this is essentially an N^2 operation.

I'm not sure why we don't just delete the entry instead of marking it
as cancelled.  It looks like the only problem is that you can't delete
an entry other than the one just returned by hash_seq_search.  Which
would be fine, as that is the entry that we would want to delete;
except that mdsync might have a different hash_seq_search open, and so
it wouldn't be safe to delete.

If the segno was taken out of the hash key and handled some other way,
then the forgetting could be done with a simple hash look up rather
than a full scan.

Maybe we could just turn off the pending ops table altogether when
fsync=off, but since fsync is PGC_SIGHUP it is not clear how you could
safely turn it back on.

Cheers,

Jeff

pgsql-performance by date:

From: "Harold A. Giménez"
Date: 12 July 2012, 20:21:32
Subject: Re: DELETE vs TRUNCATE explanation

From: Yan Chunlu
Date: 13 July 2012, 01:02:42
Subject: Re: how could select id=xx so slow?

Re: DELETE vs TRUNCATE explanation - Mailing list pgsql-performance

Previous

Next