Re: PG-Strom - A GPU optimized asynchronous executor module - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: PG-Strom - A GPU optimized asynchronous executor module
Msg-id CADyhKSWA3nSbokM2TFGzNB1rKYubfws1ZJzFpNjP5Kue1EdU-g@mail.gmail.com
In response to Re: PG-Strom - A GPU optimized asynchronous executor module  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: PG-Strom - A GPU optimized asynchronous executor module  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
2012/1/23 Robert Haas <robertmhaas@gmail.com>:
> On Sun, Jan 22, 2012 at 10:48 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> I tried to implement a fdw module that is designed to utilize GPU
>> devices to execute
>> qualifiers of sequential-scan on foreign tables managed by this module.
>>
>> It was named PG-Strom, and the following wikipage gives a brief
>> overview of this module.
>>    http://wiki.postgresql.org/wiki/PGStrom
>>
>> In our measurement, it achieves about x10 times faster on
>> sequential-scan with complex-
>> qualifiers, of course, it quite depends on type of workloads.
>
> That's pretty neat.  In terms of tuning the non-GPU based
> implementation, have you done any profiling?  Sometimes that leads to
> an "oh, woops" moment.
>
Not yet, except for \timing.

What options are available to see how the execution time of a particular
query is distributed among its components?
I tried googling some keywords, but did not find anything useful.


As an aside, I also tried modifying is_device_executable_qual() to always
return false, which disables qualifier push-down.
In this case, about 2100 ms of the 7679 ms total was consumed within this
module, so I guess the remaining 5580 ms or so was mostly consumed by
ExecQual(), although that is just an estimate...

postgres=# SET pg_strom.exec_profile = on;
SET
Time: 1.075 ms
postgres=# SELECT count(*) FROM ftbl WHERE sqrt((x-25.6)^2 + (y-12.8)^2) < 10;
INFO:  PG-Strom Exec Profile on "ftbl"
INFO:  Total PG-Strom consumed time: 2100.898 ms
INFO:  Time to JIT Compile GPU code: 0.000 ms
INFO:  Time to initialize devices:   0.000 ms
INFO:  Time to Load column-stores:   7.013 ms
INFO:  Time to Scan column-stores:   1219.746 ms
INFO:  Time to Fetch virtual tuples: 874.095 ms
INFO:  Time of GPU Synchronization:  0.000 ms
INFO:  Time of Async memcpy:         0.000 ms
INFO:  Time of Async kernel exec:    0.000 ms
 count
-------
  3159
(1 row)
(1 row)

Time: 7679.342 ms
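
For reference, the estimate above is just a subtraction over the figures in
this profile output. A minimal sketch of the arithmetic follows; the function
and variable names are hypothetical, chosen only for illustration, and are not
part of PG-Strom. Attributing the remainder to ExecQual() is still a guess:
the remainder covers everything that ran outside the module.

```python
# Figures copied from the profile output in this message (milliseconds).
TOTAL_QUERY_MS = 7679.342   # psql \timing for the whole query
MODULE_MS = 2100.898        # "Total PG-Strom consumed time"
COMPONENTS_MS = {           # non-zero items reported by the module
    "load column-stores": 7.013,
    "scan column-stores": 1219.746,
    "fetch virtual tuples": 874.095,
}


def breakdown(total_ms, module_ms, components):
    """Split the query time into module time and everything else.

    Returns (outside_ms, unitemized_ms, shares), where shares maps each
    label to its fraction of the total query time.
    """
    outside_ms = total_ms - module_ms                      # not spent in the module
    unitemized_ms = module_ms - sum(components.values())   # module time not itemized
    shares = {name: ms / total_ms for name, ms in components.items()}
    shares["outside module"] = outside_ms / total_ms
    return outside_ms, unitemized_ms, shares


outside_ms, unitemized_ms, shares = breakdown(
    TOTAL_QUERY_MS, MODULE_MS, COMPONENTS_MS
)

print(f"outside module: {outside_ms:.3f} ms")
for name, share in shares.items():
    print(f"{name:>22}: {share:6.1%} of total")
```

The remainder comes to roughly 5578 ms, which is where the "about 5580 ms"
figure comes from.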


Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>

