Thread: Sorting a query on a view ignores an index

Sorting a query on a view ignores an index

From
Mathieu De Zutter
Date:
Hi,

I'm having a performance problem in postgresql.

I have a rather complex view (attached) which, on itself, executes very
fast, as it should. Normally this view is unordered. When I order the
view itself (see comments in attachment), the view executes with about
the same speed since the field i'm sorting on has an index.
However, i dont want the view to be presorted, but sort it in the
queries that use the view. When I do that, the index I have on that
field seems to be ignored. It stretches so far that, when I sort the
view on A and sort the query on A too, the query will try to sort
_again_ _without_ index and thus lose all performance.

I've attached the query-plans for the different cases.
preordered:           means the view itself is sorted
with/without sorting: tells whether the query is sorted

Note: the "NOT NULL" in the queries doesn't affect performance

With kind regards,

    Mathieu

Re: Sorting a query on a view ignores an index

From
Tom Lane
Date:
Mathieu De Zutter <mathieu@dezutter.org> writes:
> However, i dont want the view to be presorted, but sort it in the
> queries that use the view. When I do that, the index I have on that
> field seems to be ignored. It stretches so far that, when I sort the
> view on A and sort the query on A too, the query will try to sort
> _again_ _without_ index and thus lose all performance.

This is a limitation of the 7.3 query planner.  7.4 should do better.

            regards, tom lane

Re: Sorting a query on a view ignores an index

From
Mathieu De Zutter
Date:
On Thu, Aug 21, 2003 at 10:59:11AM -0400, Tom Lane wrote:
> Mathieu De Zutter <mathieu@dezutter.org> writes:
> > However, i dont want the view to be presorted, but sort it in the
> > queries that use the view. When I do that, the index I have on that
> > field seems to be ignored. It stretches so far that, when I sort the
> > view on A and sort the query on A too, the query will try to sort
> > _again_ _without_ index and thus lose all performance.
>
> This is a limitation of the 7.3 query planner.  7.4 should do better.

Ok I'll have to live with that I guess.
Apart from avoiding views or subselects when sorting afterwards and
putting the whole bunch in a huge SQL statement (which i'll have to
produce on-the-fly), do you have an other alternative?
The 2 seconds is way to much, as the database will eventually run on a
machine that is 10 times slower.

With kind regards,

    Mathieu

Re: Sorting a query on a view ignores an index

From
Tom Lane
Date:
Mathieu De Zutter <mathieu@dezutter.org> writes:
> Apart from avoiding views or subselects when sorting afterwards and
> putting the whole bunch in a huge SQL statement (which i'll have to
> produce on-the-fly), do you have an other alternative?

See if you can avoid the subselects in the view's SELECT list.  That's
what's preventing 7.3 from doing a good job.

            regards, tom lane

Re: Sorting a query on a view ignores an index

From
"Your Name"
Date:
> Apart from avoiding views or subselects when sorting afterwards and
> putting the whole bunch in a huge SQL statement (which i'll have to
> produce on-the-fly), do you have an other alternative?
> The 2 seconds is way to much, as the database will eventually run on a
> machine that is 10 times slower.

Something that isn't _totally_ clear is whether or not it is going to
be possible to make use of indices in the selection.  If the postmaster
must assemble, out of disparate sources, a large collection of data,
the best trade-off may very well be to build the collection the best
way the system knows how (perhaps NOT ordering this using the index you
expect), and sort it afterwards.

Sorting doesn't tend to be grieviously expensive except when finding
the query results is also grieviously expensive.

I think you are assuming that the query would be quicker if it used the
sorted index; that is an assumption that should be checked at the door,
or at least checked somewhere.
--
"The main difference between an amateur crypto designer and a used car
salesman is that  the used car salesman can  probably drive and  knows
when he's lying." -- An Metet <anmetet@freedom.gmsociety.org>

Re: Sorting a query on a view ignores an index

From
Mathieu De Zutter
Date:
On Thu, Aug 21, 2003 at 01:02:08PM -0400, Tom Lane wrote:
> Mathieu De Zutter <mathieu@dezutter.org> writes:
> > Apart from avoiding views or subselects when sorting afterwards and
> > putting the whole bunch in a huge SQL statement (which i'll have to
> > produce on-the-fly), do you have an other alternative?
>
> See if you can avoid the subselects in the view's SELECT list.  That's
> what's preventing 7.3 from doing a good job.

First of all, I've installed 7.4beta1 and the performance is as should be
expected. Good, I'll try to convince the admin to upgrade to 7.4 when it
gets released (though that machine I will eventually have to use, runs
debian/stable atm, with pgsql 7.2).

Second, I have tried to eliminate the subselects by (left) joining some
tables (I only kept a part of the original view to do this(1)). That made
me also add a GROUP BY clause to remove duplicates. Comparing to the
result-equivalent subselect-version, it has an enormous performance hit

(1) I only kept one subselect and rearranged it to left join's

Limit  (cost=2169.31..2169.36 rows=20 width=140) (actual
     time=526.12..526.14 rows=20 loops=1)
->  Sort  (cost=2169.31..2185.79 rows=6591 width=140) (actual
        time=526.11..526.12 rows=20 loops=1)
     Sort Key: search_title
 ->  Subquery Scan search_song
        (cost=1254.25..1484.93 rows=6591 width=140)
       (actual time=149.29..423.16 rows=6380 loops=1)
  ->  GroupAggregate  (cost=1254.25..1419.02 rows=6591
       width=49) (actual time=149.28..407.04 rows=6380 loops=1)
   ->  Sort (cost=1254.25..1270.73 rows=6591 width=49) (actual
                 time=147.92..150.75 rows=6380 loops=1)
        Sort Key: s.id, s.title, s."year"
etc...

Running half a second. It looks like the grouping is messing it up.
No index seems to be used either?
(s.id is primary key, s.title = search_title, with index
 search_title_idx)
While the subselect-version shows no sort in its plan and runs in 2msec
and shows "Index Scan using song_title_idx on song s"
(both queries are sorted on search_title)

I'm fairly sure the two queries were written as they should.
If you want more info on how I got this result, just tell me. I think my
mail is yet long enough.

Oh, important detail, these results are obtained with 7.4beta1, I
haven't checked it in 7.3

With kind regards,

    Mathieu

Re: Sorting a query on a view ignores an index

From
Date:
Hello,

Here is my exp. regarding VIEW (create VIEW view_name AS SELECT ....)
optimization.
I have found strange (or at least not very intelligent) behavior of query
planner (v.7.3.4).
Given task is to take data from 14 tables, join, group, sort, etc. I
currently use about ten views referencing each other.

I have done a lot of job to adjust 'postgresql.conf' configuration.
Except common increase of shared_mem&buffer stuff I felt I was unable to do
anything else.
So, initialy, time to generate the Monthly report was about 10 sec.
I started to play with GEQO....
Screaming success -- I downplayed query execution time about 3x JUST by
setting geqo_threshold=3.
Now my Monthly report takes about 3 sec.
(Users are running that sequence of select queries (I call it invocation of
the final view ;-) quite often  for any given period of time --
 usually from one week to half a year (rarely) so it is essential part of my
dbvs.

Maybe my exp. will encourage somebody to play with scary genetic stuff. ;-)

Good luck & Best Regards,
Alvis