Re: I/O on select count(*) - Mailing list pgsql-performance

From Gregory Stark
Subject Re: I/O on select count(*)
Date
Msg-id 87hcczqjxq.fsf@oxford.xeocode.com
In response to Re: I/O on select count(*)  (Luke Lonergan <llonergan@greenplum.com>)
List pgsql-performance
"Luke Lonergan" <llonergan@greenplum.com> writes:

> BTW - we've removed HINT bit checking in Greenplum DB and improved the
> visibility caching which was enough to provide performance at the same level
> as with the HINT bit optimization, but avoids this whole "write the data,
> write it to the log also, then write it again just for good measure"
> behavior.
>
> For people doing data warehousing work like the poster, this Postgres
> behavior is miserable.  It should be fixed for 8.4 for sure (volunteers?)

For people doing data warehousing I would think the trick would be to do
something like what we do to avoid WAL logging for tables created in the same
transaction.

That is, if you're loading a lot of data in one transaction then all of that
data is going to be committed or aborted together, at the same time. Ideally
we would find a way to insert the data with the hint bits already set to
committed and mark that section of the table as being only provisionally
extended, so other transactions wouldn't even look at those pages until the
transaction commits.
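
To make the idea concrete, here is a toy model in Python, not PostgreSQL
code. Everything in it (the Table class, visible_pages, preset_hints, and the
set standing in for the commit log) is a hypothetical name invented for this
sketch. It only counts how many pages the first scan after a bulk load has to
dirty, comparing the current behavior with a load that presets the hint bits
and keeps the new pages hidden until commit.

```python
from dataclasses import dataclass, field


@dataclass
class Tup:
    xmin: int                     # inserting transaction id
    hint_committed: bool = False  # stand-in for the "xmin committed" hint bit


@dataclass
class Page:
    tuples: list = field(default_factory=list)
    dirty: bool = False           # set when a scan has to rewrite the page


class Table:
    """Toy heap: a list of pages plus a set standing in for the commit log."""

    def __init__(self):
        self.pages = []
        self.visible_pages = 0    # pages that other transactions will scan
        self.committed = set()

    def bulk_load(self, xid, n_pages, rows_per_page, preset_hints):
        for _ in range(n_pages):
            rows = [Tup(xid, preset_hints) for _ in range(rows_per_page)]
            self.pages.append(Page(rows))
        if not preset_hints:
            # Current behavior: pages are scannable at once and each tuple
            # is filtered by checking its xmin against the commit log.
            self.visible_pages = len(self.pages)
        # Otherwise the extension stays provisional until commit().

    def commit(self, xid):
        self.committed.add(xid)
        self.visible_pages = len(self.pages)

    def abort(self):
        # Assumption of this sketch: a provisional extension is discarded.
        del self.pages[self.visible_pages:]

    def count(self):
        """Scan like select count(*); return (rows, pages dirtied by the scan)."""
        rows = dirtied = 0
        for page in self.pages[:self.visible_pages]:
            for tup in page.tuples:
                if not tup.hint_committed:
                    if tup.xmin not in self.committed:
                        continue          # not visible yet
                    tup.hint_committed = True
                    if not page.dirty:    # first hint set on this page
                        page.dirty = True
                        dirtied += 1
                rows += 1
        return rows, dirtied


classic = Table()
classic.bulk_load(xid=1, n_pages=4, rows_per_page=10, preset_hints=False)
classic.commit(1)
first_scan = classic.count()   # sets hint bits, dirtying every page
second_scan = classic.count()  # hint bits already set, nothing dirtied

preset = Table()
preset.bulk_load(xid=2, n_pages=4, rows_per_page=10, preset_hints=True)
before_commit = preset.count() # provisional pages are not scanned at all
preset.commit(2)
after_commit = preset.count()  # no page needs rewriting
```

In this model the classic load makes the first count(*) dirty all four pages,
while the preset load dirties none. The cost is that an abort has to throw
the provisional pages away rather than leave them for later cleanup.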

This is similar to the abortive attempt to have the abovementioned WAL logging
trick insert the records pre-frozen. I recall there were problems with that
idea, though I don't recall whether they were insurmountable or just required
more work.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!
