Re: Slow count(*) again... - Mailing list pgsql-performance

From Craig Ringer
Subject Re: Slow count(*) again...
Date
Msg-id 4CB2410C.6010603@postnewspapers.com.au
In response to Re: Slow count(*) again...  (Mladen Gogala <mladen.gogala@vmsinfo.com>)
Responses Re: Slow count(*) again...  (Joshua Tolley <eggyknap@gmail.com>)
Re: Slow count(*) again...  (Neil Whelchel <neil.whelchel@gmail.com>)
List pgsql-performance
On 10/11/2010 01:14 AM, Mladen Gogala wrote:

> I can provide measurements, but from Oracle RDBMS. Postgres doesn't
> allow tuning of that aspect, so no measurement can be done. Would the
> numbers from Oracle RDBMS be acceptable?

Well, they'd tell me a lot about Oracle's performance as I/O chunk size
scales, but almost nothing about the cost of small I/O operations vs
larger ones in general.

Typically dedicated test programs that simulate the database read
patterns would be used for this sort of thing. I'd be surprised if
nobody on -hackers has already done suitable testing; I was mostly
asking because I was interested in how you were backing your assertions.
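
For illustration only - a rough, untested sketch of the kind of test
program I mean, timing sequential reads of the same file with 8 kB vs
1 MB requests (you'd want a file bigger than RAM, and to drop the OS
page cache between runs, for the numbers to mean anything):

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

/* Time how long it takes to read the whole file in 'chunk'-sized requests. */
static double read_all(const char *path, size_t chunk)
{
    char *buf = malloc(chunk);
    int fd = open(path, O_RDONLY);
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    while (read(fd, buf, chunk) > 0)
        ;                           /* just pull the data through */
    gettimeofday(&t1, NULL);

    close(fd);
    free(buf);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s bigfile\n", argv[0]);
        return 1;
    }
    printf("8 kB reads: %.2f s\n", read_all(argv[1], 8192));
    printf("1 MB reads: %.2f s\n", read_all(argv[1], 1 << 20));
    return 0;
}

That isolates the raw cost of request size from anything the database
does, which is exactly what Oracle-side numbers can't do.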

PostgreSQL isn't Oracle; their designs are in many ways very different.
Most importantly, Oracle keeps old row versions in a separate undo area,
whereas PostgreSQL stores old rows, with their visibility information,
directly in the tables. It is possible
that a larger proportion of Oracle's I/O costs are fixed per-block
overheads rather than per-byte costs, so it seeks to batch requests into
larger chunks. Of course, it's also possible that 8k chunk I/O is just
universally expensive and is something Pg should avoid, too, but we
can't know that without
dedicated testing, which I at least haven't done. I don't follow
-hackers closely, and wouldn't have seen discussion about testing done
there. The archives are likely to contain useful discussions.

Then again, IIRC Pg's page size is also its I/O size, so you could
actually get larger I/O chunking by rebuilding Pg with larger pages.
Having never had the need, I haven't examined the effect of page size
changes on I/O performance.
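
(For what it's worth, IIRC recent Pg releases expose this at build time
- something like:

./configure --with-blocksize=32
make

for 32 kB pages instead of the default 8 kB, though you'd have to initdb
a fresh cluster, since the on-disk format depends on the block size.
Check ./configure --help rather than trusting my memory on the exact
option name and accepted values.)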

>> The Linux kernel, at least, does request merging (and splitting, and
>> merging, and more splitting) along the request path, and I'd
>> personally expect that most of the cost of 8k requests would be in the
>> increased number of system calls, buffer copies, etc required.
>> Measurements demonstrating or contradicting this would be good to see.
>
> Even the cost of hundreds of thousands of context switches is far from
> negligible. What kind of measurements do you expect me to do with the
> database which doesn't support tweaking of that aspect of its operation?

Test programs, or references to testing done by others that demonstrate
these costs in isolation. Of course, they still wouldn't show what gain
Pg might obtain (nothing except hacking on Pg's sources really will) but
they'd help measure the costs of doing I/O that way.

I suspect you're right that large I/O chunks would be desirable and a
potential performance improvement. What I'd like to know is *how much*,
or at least how much the current approach costs in pure overheads like
context switches and scheduler delays.
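
The context switch side doesn't need anything Pg-specific either. A
rough, untested sketch of the sort of measurement I mean, using
getrusage() to count switches across an 8 kB read loop (compare cached
vs uncached runs, and different chunk sizes):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    char buf[8192];
    struct rusage r0, r1;
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s bigfile\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDONLY);

    getrusage(RUSAGE_SELF, &r0);
    while (read(fd, buf, sizeof buf) > 0)
        ;                           /* 8 kB request loop */
    getrusage(RUSAGE_SELF, &r1);

    printf("voluntary context switches:   %ld\n", r1.ru_nvcsw - r0.ru_nvcsw);
    printf("involuntary context switches: %ld\n", r1.ru_nivcsw - r0.ru_nivcsw);
    close(fd);
    return 0;
}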

> Does that provide enough of an evidence and, if not, why not?

It shows that it helps Oracle a lot ;-)

Without isolating how much of that is raw costs of the block I/O and how
much is costs internal to Oracle, it's still hard to have much idea how
much it'd benefit Pg to take a similar approach.

I'm sure the folks on -hackers have been over this and know a whole lot
more about it than I do, though.

> Oracle counts 400 million records in 2 minutes and Postgres 9.01 takes
> 12.8 seconds to count 1.2 million records? Do you see the disparity?

Sure. What I don't know is how much of that is due to block sizes. There
are all sorts of areas where Oracle could be gaining.

> It maybe so, but slow sequential scan is still the largest single
> performance problem of PostgreSQL. The frequency with which that topic
> appears on the mailing lists should serve as a good evidence for that.

I'm certainly not arguing that it couldn't use improvement; it's clearly
hurting some users. I just don't know if I/O chunking is the answer - I
suspect that if it were, it would've become a priority for one or more
people working on Pg much sooner.

It's quite likely that it's one of those things where it makes a huge
difference for Oracle because Oracle has managed to optimize out most of
the other bigger costs. If Pg still has other areas that make I/O more
expensive per-byte (say, visibility checks) and low fixed per-block
costs, then there'd be little point in chunking I/O. My understanding is
that that's pretty much how things stand at the moment, but I'd love
verification from someone who's done the testing.

> If you still claim that it wouldn't make the difference,
> the onus to prove it is on you.

I didn't mean to claim that it would make no difference. If I sounded
like it, sorry.

I just want to know how _much_, or more accurately, how great the
overheads of Pg's current approach are compared to doing much larger
reads.

--
Craig Ringer

