Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Date
Msg-id 20131114.195120.28746056.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions  (Sameer Kumar <sameer.kumar@ashnik.com>)
Responses Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
List pgsql-hackers
Hello,

> When I read it again and try to relate, I get your point. Actually true,
> hashes must always be performed as last option (if that is what you too
> meant) and if there are few other operations they must be the last one to
> be performed especially after sorting/grouping. Hashes must somehow make
> use of already sorted data (I think this something even you indicated)

Yes, some 'hash'es could preserve order selecting such a function
for hash function. But at least PostgreSQL's 'HashAggregation'
uses not-order-preserving function as hash function. So output
cannot preserve input ordering.

> I will do that if I get a DB2 system or Oracle system running. I will try
> to replicate the same 2 test cases and share the plan. One thing which I am
> sure is, the below part of the plan
> 
> QUERY PLAN | Subquery Scan on __unnamed_subquery_0
>  (cost=12971.39..16964.99 rows=614 width=43) (actual
> time=2606.075..2953.937 rows=558 loops=1)
> 
> would be generated as RID scan in DB2 (which I have seen to perform better
> than normal subquery scans in DB2).

DB2's document says it is used for 'index ORing' corresponds OR
or IN ops, which seems to be a relative to BitmapOr of
PostgreSQL, perhaps not to HashAggregates/SemiJoin.

I tried to imagin the plan for the group_by case with repeated
index scan and merging..

> select student_name
> from student_score
> where (course,score) in (select course,max(score)
>                          from student_score group by course);

Taking the advantage that the cardinarity of course is 8, this
query could be transformed into 8 times of index scan and
bitmaping.

With hypothetical plan node LOOP, and BitmapScanAdd the plan
could be,

| BitmapHeapScan (rows = 155, loops = 1)
|  -> LOOP 
|     ON Subquery (select distinct course from student_course) as c0
|     -> BitmapScanAdd (loops = 8)
|        BitmapCond: (student_score.score = x)
|       -> Limit (rows = 1, loops = 8) AS x
|         -> Unique (rows = 1, loops = 8)
|           -> IndexScan using idx_score on student_course (rows = 1, loops = 8)
|              Filter (student_course.course = c0)

I suppose this is one possibility of what DB2 is doing. If DB2
does the same optimization for ranking > 1 with the dense_rank()
case, this plan might be like this,

| BitmapHeapScan (rows = 133, loops = 1)
|  -> LOOP 
|     ON Subquery (select distinct course from student_course) as c0
|     -> BitmapScanAdd (loops = 8)
|        BitmapCond: (student_score.score = x)
|       -> Limit (rows = 1, loops = 8) AS x
|         -> Unique (rows = 2, loops = 8)
|           -> IndexScan using idx_score on student_course (rows = 18,loops= 8)
|              Filter (student_course.course = c0)

Both plans surely seem to be done shortly for relatively small
n's and number of courses.

On the other hand, using semijoin as PostgreSQL does, creating
HashAggregate storing nth place score for every course requires
some memory to work on for each course.

| Hash Semi Join
|   Hash Cond: (a.course = b.course and a.score = b.score)
|  -> SeqScan on student_score as a
|  -> Hash
|   -> HashAggregatesFunc (rows = 8)
|      Key = course, func = rankn(dense_rank(), n, key, val)
|    -> SeqScan on student_score (rows = 122880)

Where, rankn() must keep socres down to nth rank and emits nth
score as final value. I don't get more generic form for this
mechanism right now, and the value to do in this specific manner
seems not so much..


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Florian Weimer
Date:
Subject: Re: Logging WAL when updating hintbit
Next
From: Sameer Thakur
Date:
Subject: Re: Extra functionality to createuser