Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions - Mailing list pgsql-hackers

From Sameer Kumar
Subject Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Date
Msg-id CADp-Sm5EzGp5qryZMf=eKMtF3tG4rmN1SGf3uYib9WjgqduOSA@mail.gmail.com
Whole thread Raw
In response to Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
List pgsql-hackers

Hello,

> > With this index, you will get a different plan like this,
> >
> Exactly my point, can we look at making windowing functions
> smart and make use of available indexes?

I might have guessed..


> > Does this satisfies your needs?
> >
> Not exactly. If I have missed to mention, this is not a
> production issue for me. I am trying to see if PostgreSQL
> planner produces best plans for Data Warehouse and mining
> oriented queries.

I agree to the point.

> I think Hashes can be efficiently used for sorting (and I
> believe they are used for joins too when a pre-sorted data set
> is not available via indexes). This again could my
> misinterpretation.

It is true if 'Sorting' means 'key classification without
orderings'. Hashes should always appear at inner side of a join,
I'm convinced. The "ordered' nature is not required for the case
if outer side is already ordered. If not, separate sorting will
needed.

> I lost you somewhere here. My be this is above my pay-grade :-)

Sorry for my crumsy english :-<

 
No, it was not your English. :-)
When I read it again and try to relate, I get your point. Actually true, hashes must always be performed as last option (if that is what you too meant) and if there are few other operations they must be the last one to be performed especially after sorting/grouping. Hashes must somehow make use of already sorted data (I think this something even you indicated)


> Well, at least with Oracle and DB2 planners I have seen that
> the plan produced with dense_rank performs better than a series
> of nested SELECT MAX().

I see your point. Although I don't know what plans they
generates, and I don't see how to ordering and ranking without
sorting.  Could you let me see what they look like?

# Nevertheless, I don't have the confidence that I can be of some
# help..

I will do that if I get a DB2 system or Oracle system running. I will try to replicate the same 2 test cases and share the plan. One thing which I am sure is, the below part of the plan 

QUERY PLAN | Subquery Scan on __unnamed_subquery_0  (cost=12971.39..16964.99 rows=614 width=43) (actual time=2606.075..2953.937 rows=558 loops=1)

would be generated as RID scan in DB2 (which I have seen to perform better than normal subquery scans in DB2).



Regards
Sameer
Ashnik Pte Ltd

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Fast insertion indexes: why no developments
Next
From: Leonardo Francalanci
Date:
Subject: Re: Fast insertion indexes: why no developments