Thread: GPU Accelerated Sorting

GPU Accelerated Sorting

From
Eliot Gable
Date:
Not sure if anyone else saw this, but it struck me as an interesting
idea if it could be added to PostgreSQL. GPU accelerated database
operations could be very... interesting. Of course, this could be
difficult to do in a way that usefully increases performance of
PostgreSQL, but I'll leave that up to you guys to figure out.

http://code.google.com/p/back40computing/wiki/RadixSorting

--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing
from our children, we're stealing from them--and it's not even
considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to
live; not live to eat.) ~Marcus Tullius Cicero

Re: GPU Accelerated Sorting

From
Greg Smith
Date:
Eliot Gable wrote:
> Not sure if anyone else saw this, but it struck me as an interesting
> idea if it could be added to PostgreSQL. GPU accelerated database
> operations could be very... interesting. Of course, this could be
> difficult to do in a way that usefully increases performance of
> PostgreSQL, but I'll leave that up to you guys to figure out.
>

This comes up every year or so.  The ability of GPU offloading to help
with sorting has to overcome the additional latency that comes from
copying everything over to it and then getting all the results back.  If
you look at the typical types of sorting people see in PostgreSQL, it's
hard to find ones that are a) big enough to benefit from being offloaded
to the GPU like that, while also being b) not so bottlenecked on disk
I/O that speeding up the CPU part matters.  And if you need to sort
something in that category, you probably just put an index on it instead
and call it a day.

If you made me make a list of things I'd think would be worthwhile to
spend effort improving in PostgreSQL, this would be on the research
list, but unlikely to even make my personal top 100 things that are work
fiddling with.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: GPU Accelerated Sorting

From
gael@pilotsystems.net (Gaël Le Mignot)
Date:
Hello,

In my humble opinion, while  it can sound interesting from a theorical
point of view to outloads some  operations to the GPU, there is a huge
pratical problem in current world  : databases which are big enough to
require such heavy optimization are usually runned on server hardware,
which very rarely have powerful GPU.

There may  be a small  target of computers  having both GPU  and heavy
database, but that sounds very  exceptional to me, so investing effort
into it sounds a bit unjustified to me.

Except  as a  matter of  scientific research,  which is  it  of course
important for the future.

Regards,

--
Gaël Le Mignot - gael@pilotsystems.net
Pilot Systems - 9, rue Desargues - 75011 Paris
Tel : +33 1 44 53 05 55 - www.pilotsystems.net
Gérez vos contacts et vos newsletters : www.cockpit-mailing.com

Re: GPU Accelerated Sorting

From
Eliot Gable
Date:
Well, from that perspective, it becomes a "chicken and egg" problem.
Without the software support to use a GPU in a server for
acceleration, nobody's going to build a server with a GPU.

However, as previously stated, I can understand the challenges with
determining whether the offloading would even be worthwhile (given
disk performance constraints and GPU memory loading constraints vs
payoff for acceleration), much less finding actual real-world queries
that would benefit from it. I am sure there are probably hundreds of
other performance improvements that could be made that would have much
more wide-spread appeal, but it's an interesting thing to consider.
This is not the first thing that's come up that a database could use a
GPU for in order to improve performance, and it won't be the last.
It's something to keep in mind, and if enough additional items come up
where GPUs can do better than CPUs, it might be worthwhile to start
implementing some of those things. Maybe if enough of them get
implemented, there will be enough overall performance increase to make
it worthwhile to put a GPU in a database server. Besides, if you can
sort data on a GPU faster than a CPU, you can probably also search for
data faster on the GPU than on the CPU under similar conditions. With
memory increasing like crazy in servers, it might be worthwhile to
keep indexes entirely in memory (but keep them sync'd to the disk,
obviously) and for extremely large tables, dump them to the GPU for a
massively parallel search. In fact, if you can get enough GPU memory
that you could keep them entirely in the GPU memory and keep them
updated there any time they change, you could see some real
performance pay-offs.

I'm not saying someone should just go out and do this right now, but
it might be worthwhile to keep it in mind as code is re-written or
updated in the future how it might be structured to more easily
implement something like this in the future.

On Mon, Aug 30, 2010 at 10:56 AM, Gaël Le Mignot <gael@pilotsystems.net> wrote:
> Hello,
>
> In my humble opinion, while  it can sound interesting from a theorical
> point of view to outloads some  operations to the GPU, there is a huge
> pratical problem in current world  : databases which are big enough to
> require such heavy optimization are usually runned on server hardware,
> which very rarely have powerful GPU.
>
> There may  be a small  target of computers  having both GPU  and heavy
> database, but that sounds very  exceptional to me, so investing effort
> into it sounds a bit unjustified to me.
>
> Except  as a  matter of  scientific research,  which is  it  of course
> important for the future.
>
> Regards,
>
> --
> Gaël Le Mignot - gael@pilotsystems.net
> Pilot Systems - 9, rue Desargues - 75011 Paris
> Tel : +33 1 44 53 05 55 - www.pilotsystems.net
> Gérez vos contacts et vos newsletters : www.cockpit-mailing.com
>



--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing
from our children, we're stealing from them--and it's not even
considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to
live; not live to eat.) ~Marcus Tullius Cicero

Re: GPU Accelerated Sorting

From
Yeb Havinga
Date:
Greg Smith wrote:
> This comes up every year or so.  The ability of GPU offloading to help
> with sorting has to overcome the additional latency that comes from
> copying everything over to it and then getting all the results back.
> If you look at the typical types of sorting people see in PostgreSQL,
> it's hard to find ones that are a) big enough to benefit from being
> offloaded to the GPU like that, while also being b) not so
> bottlenecked on disk I/O that speeding up the CPU part matters.  And
> if you need to sort something in that category, you probably just put
> an index on it instead and call it a day.
>
> If you made me make a list of things I'd think would be worthwhile to
> spend effort improving in PostgreSQL, this would be on the research
> list, but unlikely to even make my personal top 100 things that are
> work fiddling with.
Related is 'Parallelizing query optimization'
(http://www.vldb.org/pvldb/1/1453882.pdf) in which they actually
experiment with PostgreSQL. Note that their target platform is general
purpose CPU, not a SIMD GPU processor.

-- Yeb


Re: GPU Accelerated Sorting

From
Scott Marlowe
Date:
On Mon, Aug 30, 2010 at 8:56 AM, Gaël Le Mignot <gael@pilotsystems.net> wrote:
> Hello,
>
> In my humble opinion, while  it can sound interesting from a theorical
> point of view to outloads some  operations to the GPU, there is a huge
> pratical problem in current world  : databases which are big enough to
> require such heavy optimization are usually runned on server hardware,
> which very rarely have powerful GPU.

That's changed recently:
http://www.aberdeeninc.com/abcatg/GPUservers.htm

> There may  be a small  target of computers  having both GPU  and heavy
> database, but that sounds very  exceptional to me, so investing effort
> into it sounds a bit unjustified to me.

I tend to agree.  OTOH, imagine using a 400 core GPU for offloading
stuff that isn't just a sort, like travelling salesman type problems.
The beauty of if is, that with pgsql support dozens of scripting
languages, you wouldn't have to build anything into pg's backends, you
could just write it in a pl langauge.

--
To understand recursion, one must first understand recursion.

Re: GPU Accelerated Sorting

From
David Boreham
Date:
  Feels like I fell through a worm hole in space/time, back to inmos in
1987, and a guy from marketing has just
walked in the office going on about there's a customer who wants to use
our massively parallel hardware to speed up databases...



Re: GPU Accelerated Sorting

From
Jeff Davis
Date:
On Mon, 2010-08-30 at 09:51 -0400, Eliot Gable wrote:
> Not sure if anyone else saw this, but it struck me as an interesting
> idea if it could be added to PostgreSQL. GPU accelerated database
> operations could be very... interesting. Of course, this could be
> difficult to do in a way that usefully increases performance of
> PostgreSQL, but I'll leave that up to you guys to figure out.
>
> http://code.google.com/p/back40computing/wiki/RadixSorting
>

Radix sort is not a comparison sort. Comparison sorts work for any data
type for which you define a total order; and any total order is allowed.
Radix sort only works for some data types and some total orders.

However, it would be very nice to use radix sorting where it does work.
That would require some extensions to the type system, but it could be
done.

The GPU issue is orthogonal.

Regards,
    Jeff Davis


Re: GPU Accelerated Sorting

From
Chris Browne
Date:
david_list@boreham.org (David Boreham) writes:
>  Feels like I fell through a worm hole in space/time, back to inmos in
> 1987, and a guy from marketing has just
> walked in the office going on about there's a customer who wants to
> use our massively parallel hardware to speed up databases...

... As long as you're willing to rewrite PostgreSQL in Occam 2...
--
http://projects.cs.kent.ac.uk/projects/tock/trac/
The statistics on  sanity are that one out of  every four Americans is
suffering from some  form of mental illness. Think  of your three best
friends. If they're okay, then it's you. -- Rita Mae Brown

Re: GPU Accelerated Sorting

From
David Boreham
Date:
  On 8/30/2010 3:18 PM, Chris Browne wrote:
> ... As long as you're willing to rewrite PostgreSQL in Occam 2...

Just re-write it in Google's new language 'Go' : it's close enough to
Occam and they'd probably fund the project..

;)



Re: GPU Accelerated Sorting

From
Scott Carey
Date:
On a similar note, is Postgres' Quicksort a dual-pivot quicksort?  This can be up to 2x as fast as a normal quicksort
(25%fewer swap operations, and swap operations are more expensive than compares for most sorts). 

Just google 'dual pivot quicksort' for more info.


And before anyone asks -- two pivots (3 partitions) is optimal.  See
http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-September/002676.html


On Aug 30, 2010, at 12:25 PM, Yeb Havinga wrote:

> Greg Smith wrote:
>> This comes up every year or so.  The ability of GPU offloading to help
>> with sorting has to overcome the additional latency that comes from
>> copying everything over to it and then getting all the results back.
>> If you look at the typical types of sorting people see in PostgreSQL,
>> it's hard to find ones that are a) big enough to benefit from being
>> offloaded to the GPU like that, while also being b) not so
>> bottlenecked on disk I/O that speeding up the CPU part matters.  And
>> if you need to sort something in that category, you probably just put
>> an index on it instead and call it a day.
>>
>> If you made me make a list of things I'd think would be worthwhile to
>> spend effort improving in PostgreSQL, this would be on the research
>> list, but unlikely to even make my personal top 100 things that are
>> work fiddling with.
> Related is 'Parallelizing query optimization'
> (http://www.vldb.org/pvldb/1/1453882.pdf) in which they actually
> experiment with PostgreSQL. Note that their target platform is general
> purpose CPU, not a SIMD GPU processor.
>
> -- Yeb
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Re: GPU Accelerated Sorting

From
Tom Lane
Date:
Scott Carey <scott@richrelevance.com> writes:
> On a similar note, is Postgres' Quicksort a dual-pivot quicksort?  This can be up to 2x as fast as a normal quicksort
(25%fewer swap operations, and swap operations are more expensive than compares for most sorts). 

In Postgres, the swaps are pretty much free compared to the
comparisons.  Sorry, but the above doesn't especially tempt me...

            regards, tom lane