Re: vacuum locking - Mailing list pgsql-performance

From Greg Stark
Subject Re: vacuum locking
Date
Msg-id 878ynaupgu.fsf@stark.dyndns.tv
In response to Re: vacuum locking  (Rob Nagler <nagler@bivio.biz>)
Responses Re: vacuum locking
List pgsql-performance
Rob Nagler <nagler@bivio.biz> writes:

> Greg Stark writes:
> > Note that pctfree/pctused are a big performance drain on the usual case. Try
> > setting them to 0/100 on a table that doesn't get updates (like a many-many
> > relation table) and see how much faster it is to insert and scan.
>
> Right.  You can optimize each table independently.  The "usual" case
> doesn't exist in most databases, I've found, which is why Oracle does
> better.

Sorry I was unclear. By "usual case" I meant reading, as opposed to updates.
The size of the on-disk representation turns out to be a major determinant in
a lot of database applications, since the dominant resource is I/O bandwidth.
Try doing a fresh import of a large table with pctfree 0 pctused 100 and
compare how long a select on it takes versus the original table.
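To make that concrete, here's the sort of Oracle-style sketch I have in mind
(table and column names are made up, and if I remember the limits right
PCTUSED can't actually reach 100, so 0/99 is about as dense as the storage
clause lets you go):

    CREATE TABLE t1_t2_link (
        t1_id  NUMBER NOT NULL,
        t2_id  NUMBER NOT NULL
    )
    PCTFREE 0 PCTUSED 99;

With no free space reserved in each block, a freshly loaded read-mostly table
packs more rows per block, so a full scan has correspondingly fewer blocks to
read.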



> In practice, I'm still stuck.  As a simple example, this query
>         select avg(f1) from t1 group by f2
>
> Takes 33 seconds (see explain analyze in another note in this thread)
> to run on idle hardware with about 1GB available in the cache.  It's
> clearly hitting disk to do the sort.  Being a dumb programmer, I
> changed the query to:

I didn't see the rest of the thread so forgive me if you've already seen these
suggestions.

Firstly, that type of query will be faster in 7.4, which implements a new
method for doing groups called hash aggregates.
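If you want to check that you're getting the new plan, EXPLAIN on 7.4 should
show a HashAggregate node sitting directly on the seq scan instead of a Group
over a Sort. Roughly (costs elided, and the row counts are just taken from
your numbers):

    EXPLAIN SELECT avg(f1) FROM t1 GROUP BY f2;

     HashAggregate  (cost=... rows=18 width=...)
       ->  Seq Scan on t1  (cost=... rows=963660 width=...)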

Secondly, you could try raising sort_mem. Postgres can't know how much memory
it can really use before the machine starts swapping, so there's a parameter
to tell it; swapping would be much worse than doing disk sorts.

You can raise sort_mem to tell it how much memory it's allowed to use before
it falls back to disk sorts. You can even use SET to raise it in a few DSS
sessions but leave it low for the many OLTP sessions. If it's high in the OLTP
sessions then you could quickly hit swap when they all happen to decide to use
the maximum amount at the same time. But then you don't want to be doing big
sorts in OLTP sessions anyway.
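Something along these lines, say (the 64MB figure is only an illustration;
sort_mem is measured in kilobytes):

    -- keep the global default modest in postgresql.conf for the OLTP load:
    --   sort_mem = 1024

    -- then, in the DSS session only, before the big query:
    SET sort_mem = 65536;
    SELECT avg(f1) FROM t1 GROUP BY f2;
    RESET sort_mem;    -- or just let the session end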

Unfortunately there's no way to tell how much memory it thinks it's going to
use. I used to use a script that monitored the pgsql_tmp directory under the
database directory to watch for on-disk sort usage.

>     select f1 from t1;
>
> And wrote the rest in Perl. It takes 4 seconds to run. Why? The Perl doesn't
> sort to disk, it aggregates in memory. There are 18 rows returned. What I
> didn't mention is that I originally had:

Oof. I expect that if you convinced 7.3 to do the sort in memory by setting a
suitable value of sort_mem it would be close, but still slower than Perl. 7.4
should be very close, since hash aggregates are more or less equivalent to the
Perl method.


>     select avg(f1), t2.name from t1, t2 where t2.f2 = t1.f2 group by t2.name;
>
> Which is much worse:
>
>  Aggregate  (cost=161046.30..162130.42 rows=8673 width=222) (actual time=72069.10..87455.69 rows=18 loops=1)
>    ->  Group  (cost=161046.30..161479.95 rows=86729 width=222) (actual time=71066.38..78108.17 rows=963660 loops=1)
>          ->  Sort  (cost=161046.30..161263.13 rows=86729 width=222) (actual time=71066.36..72445.74 rows=963660 loops=1)
>                Sort Key: t2.name
>                ->  Merge Join  (cost=148030.15..153932.66 rows=86729 width=222) (actual time=19850.52..27266.40 rows=963660 loops=1)
>                      Merge Cond: ("outer".f2 = "inner".f2)
>                      ->  Sort  (cost=148028.59..150437.74 rows=963660 width=58) (actual time=19850.18..21750.12 rows=963660 loops=1)
>                            Sort Key: t1.f2
>                            ->  Seq Scan on t1  (cost=0.00..32479.60 rows=963660 width=58) (actual time=0.06..3333.39 rows=963660 loops=1)
>                      ->  Sort  (cost=1.56..1.60 rows=18 width=164) (actual time=0.30..737.59 rows=931007 loops=1)
>                            Sort Key: t2.f2
>                            ->  Seq Scan on t2  (cost=0.00..1.18 rows=18 width=164) (actual time=0.05..0.08 rows=18 loops=1)
>  Total runtime: 87550.31 msec
>
> Again, there are about 18 values of f2.  The optimizer even knows this
> (it's a foreign key to t2.f2), but instead it does the query plan in
> exactly the wrong order.  It hits disk probably 3 times as much as the
> simpler query judging by the amount of time this query takes (33 vs 88
> secs).  BTW, adding an index to t1.f2 has seriously negative effects
> on many other DSS queries.

Well, first of all it doesn't really know that, because you said to group by
t2.name, not f2. You might expect it to at least optimize something like this:

select avg(f1),t2.name from t1 join t2 using (f2) group by f2

but even then I don't think it's actually capable of using foreign keys as a
hint like that. I don't think Oracle does either, actually, but I'm not sure.
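For reference, the constraint I assume you have is something like this
(names guessed from your description):

    ALTER TABLE t1
      ADD CONSTRAINT t1_f2_fkey FOREIGN KEY (f2) REFERENCES t2 (f2);

Postgres enforces it, but as far as I know the planner doesn't feed it back
into join or aggregate planning as a cardinality hint.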

To convince it to do the right thing you would have to do either:

SELECT a, t2.name
  FROM (SELECT avg(f1) AS a, f2 FROM t1 GROUP BY f2) AS t1
  JOIN t2 USING (f2)

Or use a subquery:

SELECT avg(f1) AS a, (SELECT name FROM t2 WHERE t2.f2 = t1.f2)
  FROM t1
 GROUP BY f2


Oh, incidentally, my use of the "JOIN" syntax is a personal preference.
Ideally explicit JOINs and plain comma joins would produce identical plans,
but unfortunately that's not always true yet, though 7.4 is closer. I think in
the suggestion above it actually would be.

--
greg
