Thread: would hw acceleration help postgres (databases in general) ?

would hw acceleration help postgres (databases in general) ?

From

Hamza Bin Sohail

Date:

10 December 2010, 19:09:48

Hello hackers,

I think I'm at the right place to ask this question.

Based on your experience and the fact that you have written the Postgres code,
can you give a rough breakdown, in your opinion, of the time the database
spends just "fetching and writing" stuff to memory versus the actual
computation? The reason I ask is that of late there has been a push to put
reconfigurable hardware on processor cores. What this means is that database
writers can possibly identify the compute-intensive portions of the code,
write hardware accelerators and/or custom instructions, and offload
computation to these accelerators, which they would have programmed onto the
FPGA.

There is not much utility in doing this if there aren't considerable
compute-intensive operations in the database (which I would be surprised if
true). I would suspect joins, complex queries, etc. may be very
compute-intensive. Please correct me if I'm wrong. Moreover, if you were told
that you had reconfigurable hardware which can perform pretty complex
computations 10x faster than the base, would you think about synthesizing it
directly on an FPGA and using it?

I'd be more than glad to hear your guesstimates.

Thanks a lot!


Hamza

Re: would hw acceleration help postgres (databases in general) ?

From

Dann Corbit

Date:

10 December 2010, 19:15:04

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Hamza Bin Sohail
> Sent: Friday, December 10, 2010 3:10 PM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] would hw acceleration help postgres (databases in
> general) ?
>
>
> Hello hackers,
>
> I think i'm at the right place to ask this question.
>
> Based on your experience and the fact that you have written the
> Postgres code,
> can you tell what a rough break-down - in your opinion - is for the
> time the
> database spends time just "fetching and writing " stuff to memory and
> the
> actual computation. The reason i ask this is because off-late there has
> been a
> push to put reconfigurable hardware on processor cores. What this means
> is that
> database writers can possibly identify the compute-intensive portions
> of the
> code and write hardware accelerators and/or custom instructions and
> offload
> computation to these hardware accelerators which they would have
> programmed
> onto the FPGA.
>
> There is not much utility  in doing this if there aren't considerable
> compute-
> intensive operations in the database (which i would be surprise if true
> ). I
> would suspect joins, complex queries etc may be very compute-intensive.
> Please
> correct me if i'm wrong. Moreover, if you were told that you have a
> reconfigurable hardware which can perform pretty complex computations
> 10x
> faster than the base, would you think about synthesizing it directly on
> an fpga
> and use it ?
>
> I'd be more than glad to hear your guesstimates.

Here is a sample project:
http://www.cs.virginia.edu/~skadron/Papers/bakkum_sqlite_gpgpu10.pdf
And another:
http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/ngm/15-823/project/Final.pdf

Re: would hw acceleration help postgres (databases in general) ?

From

Josh Berkus

Date:

10 December 2010, 19:48:56

On 12/10/10 3:09 PM, Hamza Bin Sohail wrote:
> There is not much utility  in doing this if there aren't considerable compute-
> intensive operations in the database (which i would be surprise if true ). I 
> would suspect joins, complex queries etc may be very compute-intensive. Please 
> correct me if i'm wrong. Moreover, if you were told that you have a 
> reconfigurable hardware which can perform pretty complex computations 10x 
> faster than the base, would you think about synthesizing it directly on an fpga 
> and use it ?  

Databases are, in general, CPU-bound.  Most activities are
compute-intensive.  Even things you might think would be I/O-bound ...
like COPY ... end up being dominated by parsing and building data
structures.

So, take your pick.  COPY might be a good place to start, actually,
since the code is pretty isolated and it would be easy to do tests.

Or am I using a different definition of "compute-intensive" than you are?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

Re: would hw acceleration help postgres (databases in general) ?

From

Jeff Janes

Date:

10 December 2010, 20:18:35

On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail <hsohail@purdue.edu> wrote:
>
> Hello hackers,
>
> I think i'm at the right place to ask this question.
>
> Based on your experience and the fact that you have written the Postgres code,
> can you tell what a rough break-down - in your opinion - is for the time the
> database spends time just "fetching and writing " stuff to memory and the
> actual computation.

The database is a general purpose tool.  Pick a bottleneck you wish to have,
and probably someone uses it in a way that causes that bottleneck to occur.

> The reason i ask this is because off-late there has been a
> push to put reconfigurable hardware on processor cores. What this means is that
> database writers can possibly identify the compute-intensive portions of the
> code and write hardware accelerators and/or custom instructions and offload
> computation to these hardware accelerators which they would have programmed
> onto the FPGA.

When people don't use prepared statements, parsing can become a bottleneck.

If Bison's yyparse could be put on an FPGA in a transparent way, then
anyone using Bison, including PG, might benefit.

That's just one example, of course.

Cheers,

Jeff

Re: would hw acceleration help postgres (databases in general) ?

From

"Hamza Bin Sohail"

Date:

10 December 2010, 20:39:25

Thanks a lot for all the replies. Very helpful; I really appreciate it.

----- Original Message ----- 
From: "Jeff Janes" <jeff.janes@gmail.com>
To: "Hamza Bin Sohail" <hsohail@purdue.edu>
Cc: <pgsql-hackers@postgresql.org>
Sent: Friday, December 10, 2010 7:18 PM
Subject: Re: [HACKERS] would hw acceleration help postgres (databases in 
general) ?


> On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail <hsohail@purdue.edu> 
> wrote:
>>
>> Hello hackers,
>>
>> I think i'm at the right place to ask this question.
>>
>> Based on your experience and the fact that you have written the Postgres 
>> code,
>> can you tell what a rough break-down - in your opinion - is for the time 
>> the
>> database spends time just "fetching and writing " stuff to memory and the
>> actual computation.
>
> The database is a general purpose tool.  Pick a bottleneck you wish to 
> have,
> and probably someone uses it in a way that causes that bottleneck to 
> occur.
>
>> The reason i ask this is because off-late there has been a
>> push to put reconfigurable hardware on processor cores. What this means 
>> is that
>> database writers can possibly identify the compute-intensive portions of 
>> the
>> code and write hardware accelerators and/or custom instructions and 
>> offload
>> computation to these hardware accelerators which they would have 
>> programmed
>> onto the FPGA.
>
> When people don't use prepared statements, parsing can become a 
> bottleneck.
>
> If Bison's yyparse could be put on a FPGA in a transparent way, than
> anyone using
> Bison, including PG, might benefit.
>
> That's just one example, of course.
>
> Cheers,
>
> Jeff
>

Re: would hw acceleration help postgres (databases in general) ?

From

Jim Nasby

Date:

11 December 2010, 17:37:12

On Dec 10, 2010, at 6:18 PM, Jeff Janes wrote:
> On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail <hsohail@purdue.edu> wrote:
>>
>> Hello hackers,
>>
>> I think i'm at the right place to ask this question.
>>
>> Based on your experience and the fact that you have written the Postgres code,
>> can you tell what a rough break-down - in your opinion - is for the time the
>> database spends time just "fetching and writing " stuff to memory and the
>> actual computation.
>
> The database is a general purpose tool.  Pick a bottleneck you wish to have,
> and probably someone uses it in a way that causes that bottleneck to occur.

A common bottleneck we run into is sorting of text data. Unfortunately, I doubt
that a GPU would be able to help with that.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

Re: would hw acceleration help postgres (databases in general) ?

From

Chris Browne

Date:

13 December 2010, 18:24:18

jim@nasby.net (Jim Nasby) writes:
> On Dec 10, 2010, at 6:18 PM, Jeff Janes wrote:
>> On Fri, Dec 10, 2010 at 3:09 PM, Hamza Bin Sohail <hsohail@purdue.edu> wrote:
>>> 
>>> Hello hackers,
>>> 
>>> I think i'm at the right place to ask this question.
>>> 
>>> Based on your experience and the fact that you have written the Postgres code,
>>> can you tell what a rough break-down - in your opinion - is for the time the
>>> database spends time just "fetching and writing " stuff to memory and the
>>> actual computation.
>> 
>> The database is a general purpose tool.  Pick a bottleneck you wish
>> to have, and probably someone uses it in a way that causes that
>> bottleneck to occur.
>
> A common bottleneck we run into is sorting of text
> data. Unfortunately, I doubt that a GPU would be able to help with
> that.

Actually, that is a case where some successful experimentation has been
done.

http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/ngm/15-823/project/Final.pdf

Making it reliable to the point of being generally usable when someone
installs Postgres via a generic packaging tool in default fashion may be
somewhat more challenging!

But it appears that sorting is a plausible application for GPUs.
-- 
output = ("cbbrowne" "@" "linuxdatabases.info")
"The right honorable  gentleman is reminiscent of  a  poker.  The only
difference is that a poker gives  off the occasional signs of warmth."
-- Benjamin Disraeli on Robert Peel

Re: would hw acceleration help postgres (databases in general) ?

From

3dmashup

Date:

27 March 2011, 03:08:37

Yes!

Probably very much so. There is good evidence that using multiple CPUs and
GPUs will speed up sorting and many other database operations too.

See
http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/ngm/15-823/project/Final.pdf

The question becomes: how practical is it? There are numerous issues. Sorts
often use multiple columns and character data as keys. Little research has been
done on sorting multi-column, variable-length character data types on GPUs.
Most research papers have used a single numeric (int or real) key. Fixed-length
character encodings such as UCS-2 or UCS-4, rather than UTF-8 or UTF-16, will
work faster for character data sorting, at the expense of storage.
For PostgreSQL you also need to support many platforms. Most database GPGPU
research has been done using Nvidia's CUDA programming environment. Several
papers have been published on using CUDA with PostgreSQL. But do we want to be
tied to a single vendor's hardware?

No, never!

OpenCL is a C-based language for programming GPUs and many-core CPUs. It
addresses the platform problem: it runs on Windows, Mac, and Linux. It supports
Nvidia and AMD GPUs. It supports Intel, AMD, and ARM CPUs, and IBM/Toshiba Cell
processors too.

Most GPU programming models are data-array centric. They work great if your
data type is naturally array centric, like vectors, matrices, or images. They
also work better if the data is fixed length.
SQL is relational: row and column centric. SQL data types are not usually
fixed-length vectors, arrays, or images.
But most SQL data types can be handled with OpenCL on GPUs with some work.
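
Part of that "some work" is getting row-shaped data into array shape. A rough
Python sketch of the idea follows; the function name rows_to_columns and the
sample table are hypothetical, and a real implementation would pack each
column into a typed, fixed-width buffer before handing it to the device.

```python
from typing import Dict, List, Sequence, Tuple


def rows_to_columns(
    column_names: Sequence[str], rows: Sequence[Tuple]
) -> Dict[str, List]:
    """Transpose row tuples into one contiguous list per column."""
    columns: Dict[str, List] = {name: [] for name in column_names}
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row width does not match column list")
        for name, value in zip(column_names, row):
            columns[name].append(value)
    return columns


if __name__ == "__main__":
    # Hypothetical table contents, for illustration only.
    rows = [(1, 9.5), (2, 3.25), (3, 7.0)]
    print(rows_to_columns(["id", "price"], rows))
```

Each resulting column is a plain array of one type, which is the layout the
array-centric programming models described above expect.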

To leverage GPUs well, you need good parallel algorithms that effectively
utilize the GPU's memory model. By Amdahl's Law, these parallel algorithms
give good speed-ups only if they are very parallel, e.g. 99% or better. So
there is lots of work to do in algorithms and their practical implementation.
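
The Amdahl's Law point can be made concrete with a few lines of Python. This
is just the textbook formula, speedup = 1 / ((1 - p) + p / n); the function
name amdahl_speedup and the choice of 1400 units are for illustration only.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Upper bound on speed-up from Amdahl's Law.

    parallel_fraction: share of run time that can be parallelized (0..1).
    units: number of parallel execution units (illustrative, e.g. GPU ALUs).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


if __name__ == "__main__":
    for p in (0.50, 0.90, 0.99):
        print(f"{p:.0%} parallel on 1400 units: {amdahl_speedup(p, 1400):.1f}x")
```

However many units are available, the speed-up can never exceed 1 / (1 - p):
a 90% parallel workload stays below 10x and a 99% parallel one below 100x,
which is why the serial remainder matters so much.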

There is great potential in hw-accelerated databases. This is why I've
started the PgOpenCL language bindings project.
See
http://www.scribd.com/doc/51484335/PostgreSQL-OpenCL-Procedural-Language-pgEast-March-2011

There are still more issues to resolve, like multi-tasking, fixed memory size,
and data transfer speed.
Using GPUs won't solve data I/O parallelism problems. But once the data is
in RAM they can kick some serious ...
We need to research and conquer the issues that hold us back from using hw
acceleration like GPUs.
Running OpenCL inside PostgreSQL is just a first step.

And I can't believe I've made a post on this subject without bringing up the
topic of multi-threading...
It's the price you have to pay to utilize the 1400+ Arithmetic Logic Units on
a high-end GPU.
Each ALU can evaluate a predicate, or sum 2 numbers, and do lots more in
parallel.
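
As a rough picture of how "sum 2 numbers" per ALU adds up to a full aggregate,
here is a sequential Python simulation of a pairwise (tree) reduction. It is
not OpenCL code and not taken from PgOpenCL; the name tree_reduce_sum is made
up. Each pass of the loop stands for one parallel step in which every pair
could be summed by a different ALU at the same time.

```python
from typing import List, Tuple


def tree_reduce_sum(values: List[float]) -> Tuple[float, int]:
    """Sum `values` by repeated pairwise addition.

    Returns (total, steps), where `steps` is the number of passes, i.e. the
    number of parallel steps needed if every pair were summed at once.
    """
    if not values:
        return 0, 0
    current = list(values)
    steps = 0
    while len(current) > 1:
        # All of these pair sums are independent of each other.
        paired = [current[i] + current[i + 1] for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            paired.append(current[-1])  # odd element carries over to the next pass
        current = paired
        steps += 1
    return current[0], steps


if __name__ == "__main__":
    print(tree_reduce_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

With enough ALUs the number of steps grows with log2 of the input size rather
than with the input size itself, which is where the parallel win comes from.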

With OpenCL, MT is built into the language; you don't have to mess with
P-threads and such evils.
OpenCL has some simple built-in synchronization techniques; that's
fundamental for parallel programming too.

-Tim
