Thread: Using the GPU

Using the GPU

From
"Billings, John"
Date:
Does anyone think that PostgreSQL could benefit from using the video card as a parallel computing device?  I'm working on a project using Nvidia's CUDA with an 8800 series video card to handle non-graphical algorithms.  I'm curious whether anyone thinks that this technology could be used to speed up a database.  If so, which part of the database, and what kind of parallel algorithms would be used?
Thanks,
-- John Billings
 
 
John L. Billings
Principal Applications Developer
585.413.2219  Office
585.339.8580  Mobile
John.Billings@PAETEC.com
Visit PAETEC.COM
 

Re: Using the GPU

From
"Alexander Staubo"
Date:
On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:
> Does anyone think  that PostgreSQL could benefit from using the video card
> as a parallel computing  device?  I'm working on a project using Nvidia's
> CUDA with an 8800 series  video card to handle non-graphical algorithms.
> I'm curious if anyone  thinks that this technology could be used to speed up
> a database?

Absolutely.

> If so  which part of the database, and what kind of parallel algorithms would be  used?

GPUs are parallel vector processing pipelines, which as far as I can
tell do not lend themselves right away to the data structures that
PostgreSQL uses; they're optimized for processing high volumes of
homogeneously typed values in sequence.

From what I know about its internals, PostgreSQL, like most relational
databases, stores each tuple as a sequence of values (v1, v2, ...,
vN). Each tuple has a table of offsets into the tuple so that you can
quickly find a value based on an attribute; in other words, because
data is neither fixed-length nor at fixed positions, table scans need
to process one tuple at a time.

GPUs would be a lot easier to integrate with databases such as Monet,
KDB and C-Store, which partition tables vertically -- each column in a
table is stored separately as a vector of values.
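
As a rough sketch of the difference (hypothetical layouts, not
PostgreSQL's actual structs), compare summing one attribute over a
dense column vector with extracting it tuple by tuple through an
offset table:

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical row-store tuple: values are packed back to back
     * and located through a per-tuple offset table -- nothing here
     * matches PostgreSQL's real on-disk format. */
    struct row_tuple {
        int    natts;
        size_t offset[4];   /* byte offset of each value in data[] */
        char   data[32];    /* packed values, not all the same size */
    };

    int main(void)
    {
        double price[4] = { 9.99, 3.50, 12.00, 0.99 };

        /* Column layout: one attribute for every row sits in a dense,
         * homogeneously typed vector -- what a GPU pipeline wants. */
        double sum = 0.0;
        for (int i = 0; i < 4; i++)
            sum += price[i];
        printf("column-wise sum: %.2f\n", sum);

        /* Row layout: summing the same attribute means hopping from
         * tuple to tuple and pulling each value out via its offset. */
        struct row_tuple rows[4];
        for (int i = 0; i < 4; i++) {
            rows[i].natts = 1;
            rows[i].offset[0] = 0;
            memcpy(rows[i].data, &price[i], sizeof(double));
        }
        sum = 0.0;
        for (int i = 0; i < 4; i++) {
            double v;
            memcpy(&v, rows[i].data + rows[i].offset[0], sizeof(double));
            sum += v;
        }
        printf("row-wise sum:    %.2f\n", sum);
        return 0;
    }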

Alexander.

Re: Using the GPU

From
"Dawid Kuroczko"
Date:
On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:
>
> Does anyone think that PostgreSQL could benefit from using the video
> card as a parallel computing device?  I'm working on a project using
> Nvidia's CUDA with an 8800 series video card to handle non-graphical
> algorithms.  I'm curious if anyone thinks that this technology could
> be used to speed up a database?  If so which part of the database,
> and what kind of parallel algorithms would be used?

You might want to look at:

http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf

...haven't used it though...

   Regards,
       Dawid

Re: Using the GPU

From
Alban Hertroys
Date:
Alexander Staubo wrote:
> On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:
>> If so  which part of the database, and what kind of parallel
>> algorithms would be  used?
>
> GPUs are parallel vector processing pipelines, which as far as I can
> tell do not lend themselves right away to the data structures that
> PostgreSQL uses; they're optimized for processing high volumes of
> homogeneously typed values in sequence.

But wouldn't vector calculations on database data be sped up? I'm
thinking of GIS data, joins across ranges like matching one (start, end)
range with another, etc.
I realize these are rather specific calculations, but if they're
important to your application...
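
To make the range-matching idea concrete, a brute-force CUDA sketch
might look like the following (all names invented here; nothing to do
with PostgreSQL's executor). Each thread tests one (start, end) range
from one table against one range from another:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void range_overlap(const float *a_start, const float *a_end,
                                  const float *b_start, const float *b_end,
                                  int na, int nb, int *match)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   /* row of A */
        int j = blockIdx.y * blockDim.y + threadIdx.y;   /* row of B */
        if (i < na && j < nb)
            /* Two ranges overlap iff each starts before the other ends. */
            match[i * nb + j] = (a_start[i] <= b_end[j] &&
                                 b_start[j] <= a_end[i]);
    }

    int main()
    {
        const int na = 2, nb = 2;
        float ha_s[na] = { 0.f,  5.f }, ha_e[na] = { 3.f,  9.f };
        float hb_s[nb] = { 2.f, 10.f }, hb_e[nb] = { 4.f, 12.f };
        int hm[na * nb];

        float *a_s, *a_e, *b_s, *b_e;
        int *m;
        cudaMalloc(&a_s, sizeof ha_s);  cudaMalloc(&a_e, sizeof ha_e);
        cudaMalloc(&b_s, sizeof hb_s);  cudaMalloc(&b_e, sizeof hb_e);
        cudaMalloc(&m, sizeof hm);
        cudaMemcpy(a_s, ha_s, sizeof ha_s, cudaMemcpyHostToDevice);
        cudaMemcpy(a_e, ha_e, sizeof ha_e, cudaMemcpyHostToDevice);
        cudaMemcpy(b_s, hb_s, sizeof hb_s, cudaMemcpyHostToDevice);
        cudaMemcpy(b_e, hb_e, sizeof hb_e, cudaMemcpyHostToDevice);

        dim3 block(16, 16), grid(1, 1);   /* plenty for this toy size */
        range_overlap<<<grid, block>>>(a_s, a_e, b_s, b_e, na, nb, m);
        cudaMemcpy(hm, m, sizeof hm, cudaMemcpyDeviceToHost);

        for (int i = 0; i < na; i++)
            for (int j = 0; j < nb; j++)
                printf("A%d overlaps B%d: %d\n", i, j, hm[i * nb + j]);
        return 0;
    }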

OTOH modern PC GPUs are optimized for pushing textures; basically
transferring a lot of data in as short a time as possible. Maybe it'd
be possible to move result sets around that way? Maybe even do joins?

And then there are the vertex and pixel shaders...

It'd be kind of odd, though, to order a big-time database server with a
high-end gaming card in it :P

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

Re: Using the GPU

From
Alejandro Torras
Date:
Billings, John wrote:
> Does anyone think that PostgreSQL could benefit from using the video
> card as a parallel computing device?  I'm working on a project using
> Nvidia's CUDA with an 8800 series video card to handle non-graphical
> algorithms.  I'm curious if anyone thinks that this technology could
> be used to speed up a database?  If so which part of the database, and
> what kind of parallel algorithms would be used?
>

Looking at Nvidia's CUDA homepage
(http://developer.nvidia.com/object/cuda.html), I see that parallel
bitonic sorting could be used instead of qsort/heapsort/mergesort (I
don't know which one is used).
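
For illustration, the bitonic sort from the CUDA samples boils down to
roughly this (a simplified sketch for power-of-two arrays on a single
block, not the actual SDK code):

    #include <cstdio>
    #include <cuda_runtime.h>

    /* One compare-exchange pass: element i is paired with element
     * i^j, and the pair is ordered ascending or descending depending
     * on which k-sized block it falls in. */
    __global__ void bitonic_step(float *v, int j, int k)
    {
        unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
        unsigned int p = i ^ j;                 /* partner index */
        if (p > i) {
            bool ascending = ((i & k) == 0);
            if ((v[i] > v[p]) == ascending) {
                float t = v[i]; v[i] = v[p]; v[p] = t;
            }
        }
    }

    int main()
    {
        const int n = 8;                 /* must be a power of two */
        float h[n] = { 5.f, 1.f, 7.f, 3.f, 8.f, 2.f, 6.f, 4.f };
        float *d;
        cudaMalloc(&d, n * sizeof(float));
        cudaMemcpy(d, h, sizeof h, cudaMemcpyHostToDevice);

        /* O(log^2 n) passes; launches on one stream serialize, so
         * each pass sees the previous pass's results. */
        for (int k = 2; k <= n; k <<= 1)
            for (int j = k >> 1; j > 0; j >>= 1)
                bitonic_step<<<1, n>>>(d, j, k);

        cudaMemcpy(h, d, sizeof h, cudaMemcpyDeviceToHost);
        for (int i = 0; i < n; i++) printf("%.0f ", h[i]);
        printf("\n");
        cudaFree(d);
        return 0;
    }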

--
Alejandro Torras


Re: Using the GPU

From
Alejandro Torras
Date:
Alejandro Torras wrote:
> Billings, John wrote:
>> Does anyone think that PostgreSQL could benefit from using the video
>> card as a parallel computing device?  I'm working on a project using
>> Nvidia's CUDA with an 8800 series video card to handle non-graphical
>> algorithms.  I'm curious if anyone thinks that this technology could
>> be used to speed up a database?  If so which part of the database,
>> and what kind of parallel algorithms would be used?
>>
>
> Looking at Nvidia's CUDA homepage
> (http://developer.nvidia.com/object/cuda.html), I see that parallel
> bitonic sorting could be used instead of qsort/heapsort/mergesort (I
> don't know which one is used).
>
I think that the function cublasIsamax() explained at
http://developer.download.nvidia.com/compute/cuda/0_8/NVIDIA_CUBLAS_Library_0.8.pdf
can be used to find the maximum of a single-precision vector (strictly,
the element with the largest absolute value), but according to a
previous post by Alexander Staubo, such a function is best suited to
fixed-length values.
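
For what it's worth, a minimal sketch of calling it through the
0.8-era CUBLAS C API might look like this (note again that Isamax
returns the 1-based index of the element of largest absolute value):

    #include <stdio.h>
    #include "cublas.h"

    int main(void)
    {
        float x[] = { 1.5f, -7.25f, 3.0f, 0.5f };
        int n = 4;
        float *dx;

        cublasInit();
        cublasAlloc(n, sizeof(float), (void **)&dx);
        cublasSetVector(n, sizeof(float), x, 1, dx, 1);

        /* 1-based index of the max-magnitude element: here -7.25,
         * so idx == 2, not the plain maximum 3.0. */
        int idx = cublasIsamax(n, dx, 1);
        printf("max |x| at index %d: value %f\n", idx, x[idx - 1]);

        cublasFree(dx);
        cublasShutdown();
        return 0;
    }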

But could the data be separated into two zones, one for varying-length
data and another for fixed-length data?
With this approach, the fixed-length data might be amenable to more and
deeper optimizations, such as parallel processing.

--
Alejandro Torras


Re: Using the GPU

From
Tom Allison
Date:
On Jun 11, 2007, at 4:31 AM, Alban Hertroys wrote:

>
> Alexander Staubo wrote:
>> On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:
>>> If so  which part of the database, and what kind of parallel
>>> algorithms would be  used?
>>
>> GPUs are parallel vector processing pipelines, which as far as I can
>> tell do not lend themselves right away to the data structures that
>> PostgreSQL uses; they're optimized for processing high volumes of
>> homogeneously typed values in sequence.
>
> But wouldn't vector calculations on database data be sped up? I'm
> thinking of GIS data, joins across ranges like matching one (start,
> end)
> range with another, etc.
> I realize these are rather specific calculations, but if they're
> important to your application...
>
> OTOH modern PC GPUs are optimized for pushing textures; basically
> transferring a lot of data in as short a time as possible. Maybe
> it'd be possible to move result sets around that way? Maybe even do
> joins?

OTOH, databases might not be running on modern desktop PCs with that
kind of GPU investment.
Rather, they might be running on a "headless" machine where the GPU
gets little consideration.

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance!  <g>


Re: Using the GPU

From
"Alexander Staubo"
Date:
On 6/16/07, Tom Allison <tom@tacocat.net> wrote:
> It might make an interesting project, but I would be really depressed
> if I had to go buy an NVidia card instead of investing in more RAM to
> optimize my performance!  <g>

Why does it matter what kind of hardware you can (not "have to") buy
to give your database a performance boost? With a GPU, you would have
one more component that you could upgrade to improve performance;
that's more possibilities, not less. I only see a problem with a
database that would *require* a GPU to achieve adequate performance,
or to function at all, but that's not what this thread is about.

Alexander.

Re: Using the GPU

From
Tom Lane
Date:
"Alexander Staubo" <alex@purefiction.net> writes:
> On 6/16/07, Tom Allison <tom@tacocat.net> wrote:
>> It might make an interesting project, but I would be really depressed
>> if I had to go buy an NVidia card instead of investing in more RAM to
>> optimize my performance!  <g>

> Why does it matter what kind of hardware you can (not "have to") buy
> to give your database a performance boost? With a GPU, you would have
> one more component that you could upgrade to improve performance;
> that's more possibilities, not less. I only see a problem with a
> database that would *require* a GPU to achieve adequate performance,
> or to function at all, but that's not what this thread is about.

Too often, arguments of this sort disregard the opportunity costs of
development going in one direction vs another.  If we make any
significant effort to make Postgres use a GPU, that's development effort
spent on that rather than some other optimization; and more effort,
ongoing indefinitely, to maintain that code; and perhaps the code
will preclude other possible optimizations or features because of
assumptions wired into it.  So you can't just claim that using a GPU
might be interesting; you have to persuade people that it's more
interesting than other places where we could spend our
performance-improvement efforts.

            regards, tom lane

Re: Using the GPU

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> So you can't just claim that using a GPU might be interesting; you have to
> persuade people that it's more interesting than other places where we could
> spend our performance-improvement efforts.

I have a feeling something as sexy as that could attract new developers
though.

I think the hard part here is coming up with an abstract enough interface that
it doesn't tie Postgres to a particular implementation. I would want to see a
library that provided primitives that Postgres could use. Then that library
could have drivers for GPUs, or perhaps also for various other kinds of
coprocessors available in high end hardware.
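
Roughly, I imagine an interface along these lines (every name below is
invented for illustration; a CPU fallback driver is shown, and a GPU
driver would slot in behind the same function table):

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Abstract primitives the database would call; each backend
     * supplies its own implementations. */
    typedef struct accel_driver {
        const char *name;
        void  (*sort_floats)(float *v, size_t n);
        float (*sum_floats)(const float *v, size_t n);
    } accel_driver;

    /* Plain CPU fallback driver. */
    static int cmp_float(const void *a, const void *b)
    {
        float fa = *(const float *)a, fb = *(const float *)b;
        return (fa > fb) - (fa < fb);
    }
    static void cpu_sort(float *v, size_t n)
    {
        qsort(v, n, sizeof *v, cmp_float);
    }
    static float cpu_sum(const float *v, size_t n)
    {
        float s = 0.f;
        for (size_t i = 0; i < n; i++) s += v[i];
        return s;
    }
    static const accel_driver cpu_driver = { "cpu", cpu_sort, cpu_sum };

    int main(void)
    {
        /* A GPU build would install a different driver here. */
        const accel_driver *drv = &cpu_driver;
        float v[] = { 3.f, 1.f, 2.f };
        drv->sort_floats(v, 3);
        printf("%s: smallest %.0f, sum %.0f\n",
               drv->name, v[0], drv->sum_floats(v, 3));
        return 0;
    }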

I wonder if it exists already though.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Using the GPU

From
Tom Allison
Date:
Tom Lane wrote:
> "Alexander Staubo" <alex@purefiction.net> writes:
>> On 6/16/07, Tom Allison <tom@tacocat.net> wrote:
>>> It might make an interesting project, but I would be really depressed
>>> if I had to go buy an NVidia card instead of investing in more RAM to
>>> optimize my performance!  <g>
>
>> Why does it matter what kind of hardware you can (not "have to") buy
>> to give your database a performance boost? With a GPU, you would have
>> one more component that you could upgrade to improve performance;
>> that's more possibilities, not less. I only see a problem with a
>> database that would *require* a GPU to achieve adequate performance,
>> or to function at all, but that's not what this thread is about.
>
> Too often, arguments of this sort disregard the opportunity costs of
> development going in one direction vs another.  If we make any
> significant effort to make Postgres use a GPU, that's development effort
> spent on that rather than some other optimization; and more effort,
> ongoing indefinitely, to maintain that code; and perhaps the code
> will preclude other possible optimizations or features because of
> assumptions wired into it.  So you can't just claim that using a GPU
> might be interesting; you have to persuade people that it's more
> interesting than other places where we could spend our
> performance-improvement efforts.

You have a good point.

I don't know enough about how and what people use databases for in
general to know what would be a good thing to work on.  I'm still
trying to learn the particulars of PostgreSQL, which are always sexy.

I'm also trying to fill in the gaps between what I already know in
Oracle and how to implement something similar in PostgreSQL.  But I
probably don't know enough about Oracle to do much there either.

I'm a believer in strong fundamentals over glamour.

Re: Using the GPU

From
"Alexander Staubo"
Date:
On 6/16/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Alexander Staubo" <alex@purefiction.net> writes:
> > On 6/16/07, Tom Allison <tom@tacocat.net> wrote:
> >> It might make an interesting project, but I would be really depressed
> >> if I had to go buy an NVidia card instead of investing in more RAM to
> >> optimize my performance!  <g>
>
> > Why does it matter what kind of hardware you can (not "have to") buy
> > to give your database a performance boost? With a GPU, you would have
> > one more component that you could upgrade to improve performance;
> > that's more possibilities, not less. I only see a problem with a
> > database that would *require* a GPU to achieve adequate performance,
> > or to function at all, but that's not what this thread is about.
>
> Too often, arguments of this sort disregard the opportunity costs of
> development going in one direction vs another.  If we make any
> significant effort to make Postgres use a GPU, that's development effort
> spent on that rather than some other optimization [...]

I don't see how this goes against what I wrote. I was merely
addressing Tom Allison's comment, which seems to express an
unnecessary fear. By analogy, not everyone uses hardware RAID, but
PostgreSQL can benefit greatly from it, so it does not make sense to
worry about "having to buy" it. Then again, Tom's comment may have
been in jest.

Alexander.