Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions - Mailing list pgsql-hackers

From Sameer Kumar
Subject Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Date
Msg-id CADp-Sm4704TL3jv3+hq5k4+xza544y3NsdEtrdTMt9NSFCb7rw@mail.gmail.com
Whole thread Raw
In response to Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers


On Thu, Nov 14, 2013 at 6:51 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,

> When I read it again and try to relate, I get your point. Actually true,
> hashes must always be performed as last option (if that is what you too
> meant) and if there are few other operations they must be the last one to
> be performed especially after sorting/grouping. Hashes must somehow make
> use of already sorted data (I think this something even you indicated)

Yes, some 'hash'es could preserve order selecting such a function
for hash function. But at least PostgreSQL's 'HashAggregation'
uses not-order-preserving function as hash function. So output
cannot preserve input ordering.

> I will do that if I get a DB2 system or Oracle system running. I will try
> to replicate the same 2 test cases and share the plan. One thing which I am
> sure is, the below part of the plan
>
> QUERY PLAN | Subquery Scan on __unnamed_subquery_0
>  (cost=12971.39..16964.99 rows=614 width=43) (actual
> time=2606.075..2953.937 rows=558 loops=1)
>
> would be generated as RID scan in DB2 (which I have seen to perform better
> than normal subquery scans in DB2).

DB2's document says it is used for 'index ORing' corresponds OR
or IN ops, which seems to be a relative to BitmapOr of
PostgreSQL, perhaps not to HashAggregates/SemiJoin.

I tried to imagin the plan for the group_by case with repeated
index scan and merging..

> select student_name
> from student_score
> where (course,score) in (select course,max(score)
>                          from student_score group by course);

Taking the advantage that the cardinarity of course is 8, this
query could be transformed into 8 times of index scan and
bitmaping.

With hypothetical plan node LOOP, and BitmapScanAdd the plan
could be,

| BitmapHeapScan (rows = 155, loops = 1)
|  -> LOOP
|     ON Subquery (select distinct course from student_course) as c0
|     -> BitmapScanAdd (loops = 8)
|        BitmapCond: (student_score.score = x)
|       -> Limit (rows = 1, loops = 8) AS x
|         -> Unique (rows = 1, loops = 8)
|           -> IndexScan using idx_score on student_course (rows = 1, loops = 8)
|              Filter (student_course.course = c0)

I suppose this is one possibility of what DB2 is doing. If DB2
does the same optimization for ranking > 1 with the dense_rank()
case, this plan might be like this,

I can not be sure but this one seems logically correct from cost and cardinality perspective(am not sure the operations that the DB2 planner would generate ). Need to test it.

 
| BitmapHeapScan (rows = 133, loops = 1)
|  -> LOOP
|     ON Subquery (select distinct course from student_course) as c0
|     -> BitmapScanAdd (loops = 8)
|        BitmapCond: (student_score.score = x)
|       -> Limit (rows = 1, loops = 8) AS x
|         -> Unique (rows = 2, loops = 8)
|           -> IndexScan using idx_score on student_course (rows = 18,loops= 8)
|              Filter (student_course.course = c0)

Both plans surely seem to be done shortly for relatively small
n's and number of courses.

I guess they would do well even when the cardinality of courses is fairly high (unless we hit a scenario where courses offered are more than/in the same decimal range as students opting for them).

On the other hand, using semijoin as PostgreSQL does, creating
HashAggregate storing nth place score for every course requires
some memory to work on for each course.

| Hash Semi Join
|   Hash Cond: (a.course = b.course and a.score = b.score)
|  -> SeqScan on student_score as a
|  -> Hash
|   -> HashAggregatesFunc (rows = 8)
|      Key = course, func = rankn(dense_rank(), n, key, val)
|    -> SeqScan on student_score (rows = 122880)

Where, rankn() must keep socres down to nth rank and emits nth
score as final value. I don't get more generic form for this
mechanism right now, and the value to do in this specific manner
seems not so much..


I feel the advantage could be more when dealing with a DW environment which has more complex aggregate and windowing queries. Extending this to other windowing function, it could be a great gain for DW and OLAP queries.


Regards
Sameer

pgsql-hackers by date:

Previous
From: David Johnston
Date:
Subject: Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Next
From: Daniel Wood
Date:
Subject: lock on object is already held