Re: Parallel Sequence Scan doubts - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Parallel Sequence Scan doubts
Date
Msg-id 53FE17B2.4070700@nasby.net
In response to Re: Parallel Sequence Scan doubts  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On 8/24/14, 6:22 AM, Haribabu Kommi wrote:
> Yes, we are mainly targeting CPU-limited sequential scans, Because of
> this reason only I want the worker to handle the predicates also not
> just reading the tuples from disk.

In that case, I would suggest focusing on parallel execution of conditions regardless of where they show up in the
query plan. In my experience, they often have nothing to do with a seqscan.
 

Here's a real-world example. We have a view that pivots our application's accounting journal into a ledger. The
expensive part of the view is this:
 

sum(
    CASE
        WHEN b.tag::text = 'installment_principal'::text THEN b.type_cd -- type_cd is either 1, 0, or -1
        ELSE 0::numeric
    END
) * transaction_amount AS installment_principal
 

The view with this pivot has about 100 of these CASE expressions. Frequently we only reference a few of them, but
any time we need to refer to 20+ of them, evaluating those expressions gets VERY CPU-expensive compared to the rest
of the query.
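
To make the shape of that cost concrete, here is a minimal sketch in Python using the stdlib sqlite3 module rather than PostgreSQL. The table name journal, the transaction_id column, and every tag other than 'installment_principal' are hypothetical, and the PostgreSQL casts are dropped because SQLite does not support them. It only illustrates that each pivoted column adds one more CASE evaluation per input row; it is not the real view.

```python
import sqlite3

# Hypothetical tag names; only 'installment_principal' comes from the real view.
TAGS = ["installment_principal", "installment_interest", "late_fee"]


def pivot_sql(tags):
    """Build a pivot query with one sum(CASE ...) column per tag.

    Tags are interpolated directly, so they must be trusted identifiers.
    """
    cols = ",\n  ".join(
        f"sum(CASE WHEN tag = '{t}' THEN type_cd ELSE 0 END)"
        f" * transaction_amount AS {t}"
        for t in tags
    )
    return (
        "SELECT transaction_id,\n  " + cols
        + "\nFROM journal"
        + "\nGROUP BY transaction_id, transaction_amount"
        + "\nORDER BY transaction_id"
    )


def run_pivot(rows, tags=TAGS):
    """Load rows into an in-memory journal table and return the pivoted ledger."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        "transaction_id INTEGER, transaction_amount NUMERIC, "
        "tag TEXT, type_cd INTEGER)"
    )
    conn.executemany("INSERT INTO journal VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(pivot_sql(tags)).fetchall()
    conn.close()
    return result


SAMPLE_ROWS = [
    (1, 100, "installment_principal", 1),
    (1, 100, "late_fee", -1),
    (2, 50, "installment_interest", 1),
]
```

Calling run_pivot(SAMPLE_ROWS) evaluates three CASE expressions for each of the three journal rows. With roughly 100 pivoted columns the same per-row work is multiplied accordingly, which is where the CPU time goes.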

The other thing I would look at before seqscan filters is join processing and bitmap index combining (i.e., ANDing
together the results of several bitmap index scans). Those are things that can be very CPU intensive even when doing
simple equality comparisons.
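
As a rough illustration of what ANDing bitmap scan results involves, here is a toy Python model. It assumes each index scan produces a set of matching row numbers and represents each result as a plain integer bitmap; PostgreSQL's real bitmaps are keyed by heap page and tuple offset, so the function names and representation here are purely illustrative.

```python
def build_bitmap(matching_rows):
    """Return an int whose bit i is set when row i matched one index condition."""
    bitmap = 0
    for row in matching_rows:
        bitmap |= 1 << row
    return bitmap


def and_bitmaps(bitmaps):
    """AND several bitmaps together; a row survives only if every scan matched it."""
    if not bitmaps:
        raise ValueError("need at least one bitmap")
    combined = bitmaps[0]
    for bm in bitmaps[1:]:
        combined &= bm
    return combined


def rows_in(bitmap):
    """List the row numbers whose bits are set, in ascending order."""
    rows = []
    i = 0
    while bitmap:
        if bitmap & 1:
            rows.append(i)
        bitmap >>= 1
        i += 1
    return rows


# Three hypothetical index scans, each matching a different set of rows.
scan_a = build_bitmap([1, 3, 5, 7])
scan_b = build_bitmap([3, 4, 5])
scan_c = build_bitmap([5, 3, 9])
surviving = rows_in(and_bitmaps([scan_a, scan_b, scan_c]))
```

Even though each individual comparison is trivial, both building and combining the bitmaps touch every candidate row, so the work grows with the number of scans and the size of their results.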
 

BTW, it's also possible that these cases would be good fits for GPU parallel execution.
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


