Thread: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

From
Eoghan Murray
Date:
I'm upgrading from 8.4 to 9.1 and experiencing a performance degradation on a key query with 2 views and 2 tables.

Old server "PostgreSQL 8.4.10 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 32-bit"
New server "PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit"

The query is as follows:
SELECT *
FROM edge_geom
WHERE (edge_geom.start_id, edge_geom.end_id) IN (('congo', 'donal'), ('golow', 'tundo'), ('golow', 'arthur'), ('golow', 'porto'), ('tundo', 'donal'), ('golow', 'newbo'), ('porto', 'donal'), ('decal', 'donal'), ('arthur', 'donal'), ('leandro', 'donal'), ('golow', 'decal'), ('golow', 'salad'), ('newbo', 'donal'), ('golow', 'congo'), ('salad', 'donal'), ('golow', 'leandro'));

Schema definitions:
I've tried to simplify the table and view definitions wherever possible.

And the query plans:
9.1 with setting `enable_material = off`: 1,635ms  http://explain.depesz.com/s/gIu

So it looks like the Materialize in the query plan is causing the 30x slowdown.
With the materialize strategy switched off , it's still 5x slower on 9.1 vs. 8.4.

Any help appreciated, I acknowledge that the tables and views aren't the simplest.

Thanks!

Eoghan





Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

From
Claudio Freire
Date:
On Fri, Jul 13, 2012 at 11:11 AM, Eoghan Murray <eoghan@qatano.com> wrote:
> 8.4: 314ms:  http://explain.depesz.com/s/GkX
> 9.1: 10,059ms :http://explain.depesz.com/s/txn
> 9.1 with setting `enable_material = off`: 1,635ms
> http://explain.depesz.com/s/gIu

I think the problem is it's using a merge join, with a sort inside
that's producing 600x more rows than expected, while 8.4 does a hash
join with no intermediate big tables instead.

What's your configuration like in both servers? (that could explain
planning differences)

Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

From
Claudio Freire
Date:
On Fri, Jul 13, 2012 at 1:28 PM, Eoghan Murray <eoghan@qatano.com> wrote:
> Thank you Claudio,
>
> I haven't touched the 9.1 configuration (with the exception of toggling the
> enable_material setting). http://pastebin.com/nDjcYrUd
> As far as I can remember I haven't changed the 8.4 configuration:
> http://pastebin.com/w4XhDRX4

Maybe that's your problem. Postgres default configuration is not only
suboptimal, but also a poor reflection of your hardware (what's your
hardware, btw?). Which means postgres' expected costs won't hold. I'm
thinking especially about your effective_cache_size, which may
influence postgres' decision to use one join method vs another, but
many other settings would probable influence.

Spend a bit of time to configure both servers such that the
configuration reflects the hardware, and try your queries again.

Eoghan Murray <eoghan@qatano.com> writes:
> I'm upgrading from 8.4 to 9.1 and experiencing a performance degradation on
> a key query with 2 views and 2 tables.

I think the core of the problem is the lousy rowcount estimate for the
result of the edited_stop_2 view: when you've got 1 row estimated and
almost 10000 rows actual, it's almost guaranteed that the rest of the
plan is going to be bad.  It's pure luck that 8.4 chooses a plan that
fails to suck, because it's optimizing for the wrong case.  9.1 isn't
so lucky, but that doesn't make 9.1 broken, just less lucky.

I'm not terribly disappointed that that rowcount estimate is bad,
because this seems like a rather weird and inefficient way to do "get
the rows with the maximal "updated" values".  I'd suggest experimenting
with some other definitions for edited_stop_2, such as using a subquery:

 SELECT ...
   FROM stop o
   WHERE updated = (select max(updated) from stop i
                    where o.node_id = i.node_id and ...);

This might be reasonably efficient given your pkey index for "stop".

Or if you don't mind using a Postgres-ism, you could try DISTINCT ON:

 SELECT DISTINCT ON (node_id, org_id, edge_id, stop_pos) ...
   FROM stop
   ORDER BY node_id DESC, org_id DESC, edge_id DESC, stop_pos DESC, updated DESC;

See the "weather reports" example in our SELECT reference page for
some explanation of how that works.  Again, the ORDER BY is chosen
to match your pkey index; I'm not sure that the planner will think
a full-index scan beats a seqscan-and-sort, but you may as well
give it the option.

Of these, I would bet that the first will work better if your typical
usage is such that only a few rows need to be fetched from the view.
I believe the DISTINCT ON is likely to act as an optimization fence
forcing the whole view to be evaluated when using the second definition.

            regards, tom lane

Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

From
Claudio Freire
Date:
On Fri, Jul 13, 2012 at 3:22 PM, Eoghan Murray <eoghan@qatano.com> wrote:
> This is with `enable_material=off`, with `enable_material=on` it also
> doesn't go for the Merge Join, but the Materialize step pushes it up to over
> 7,000ms.

I think this one could stem from what Tom observed, that the rowcount
estimate is way off.

It usually helps a great deal if you can get PG to estimate correctly,
be it by doing analyze, incrementing statistics or simply formulating
your query in a way that's friendlier for PG's estimator.