Using the GPU

Started by Billings, John · almost 19 years ago · 12 messages · general
#1Billings, John
John.Billings@PAETEC.com

Does anyone think that PostgreSQL could benefit from using the video
card as a parallel computing device? I'm working on a project using
Nvidia's CUDA with an 8800 series video card to handle non-graphical
algorithms. I'm curious if anyone thinks that this technology could be
used to speed up a database? If so, which part of the database, and what
kind of parallel algorithms would be used?
Thanks,
-- John Billings

John L. Billings
Principal Applications Developer
585.413.2219 Office
585.339.8580 Mobile
John.Billings@PAETEC.com
<http://www.paetec.com/>

#2Alexander Staubo
alex@purefiction.net
In reply to: Billings, John (#1)
Re: Using the GPU

On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:

Does anyone think that PostgreSQL could benefit from using the video card
as a parallel computing device? I'm working on a project using Nvidia's
CUDA with an 8800 series video card to handle non-graphical algorithms.
I'm curious if anyone thinks that this technology could be used to speed up
a database?

Absolutely.

If so, which part of the database, and what kind of parallel algorithms would be used?

GPUs are parallel vector processing pipelines which, as far as I can
tell, do not readily lend themselves to the data structures that
PostgreSQL uses; they're optimized for processing high volumes of
homogeneously typed values in sequence.

From what I know about its internals, PostgreSQL, like most relational
databases, stores each tuple as a sequence of values (v1, v2, ...,
vN). Each tuple has a table of offsets into the tuple so that you can
quickly find a value based on an attribute; in other words, data is
neither fixed-length nor stored at fixed positions, so table scans
need to process one tuple at a time.

GPUs would be a lot easier to integrate with databases such as Monet,
KDB and C-Store, which partition tables vertically -- each column in a
table is stored separately as a vector of values.
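
To make the contrast concrete, here is a minimal CUDA sketch of a
predicate scan over a single column vector. The column name, the
predicate, and the layout are all hypothetical; this is not PostgreSQL
code, just an illustration of the access pattern GPUs are built for:

#include <cstdio>
#include <cuda_runtime.h>

/* Scan one homogeneous column vector and flag the rows that satisfy a
 * predicate. One thread per row; accesses are coalesced because the
 * column is a flat array of a single type. */
__global__ void scanPrices(const float *prices, int *match,
                           float threshold, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        match[i] = prices[i] > threshold;
}

int main(void)
{
    const int n = 1024;
    float h_prices[n];
    int h_match[n];
    for (int i = 0; i < n; i++)
        h_prices[i] = (float)(i % 100);

    float *d_prices;
    int *d_match;
    cudaMalloc((void **)&d_prices, n * sizeof(float));
    cudaMalloc((void **)&d_match, n * sizeof(int));
    cudaMemcpy(d_prices, h_prices, n * sizeof(float),
               cudaMemcpyHostToDevice);

    scanPrices<<<(n + 255) / 256, 256>>>(d_prices, d_match, 50.0f, n);

    cudaMemcpy(h_match, d_match, n * sizeof(int),
               cudaMemcpyDeviceToHost);
    printf("row 60 matches: %d\n", h_match[60]);   /* 60 > 50, so 1 */

    cudaFree(d_prices);
    cudaFree(d_match);
    return 0;
}

A row-store tuple, by contrast, has to be walked attribute by
attribute before the interesting value is even located, which defeats
this kind of regular striding.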

Alexander.

#3Dawid Kuroczko
qnex42@gmail.com
In reply to: Billings, John (#1)
Re: Using the GPU

On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:

Does anyone think that PostgreSQL could benefit from using the video card as a parallel computing device? I'm working on a project using Nvidia's CUDA with an 8800 series video card to handle non-graphical algorithms. I'm curious if anyone thinks that this technology could be used to speed up a database? If so, which part of the database, and what kind of parallel algorithms would be used?

You might want to look at:

http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf

...haven't used it though...

Regards,
Dawid

#4Alban Hertroys
alban@magproductions.nl
In reply to: Alexander Staubo (#2)
Re: Using the GPU

Alexander Staubo wrote:

On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:

If so, which part of the database, and what kind of parallel
algorithms would be used?

GPUs are parallel vector processing pipelines which, as far as I can
tell, do not readily lend themselves to the data structures that
PostgreSQL uses; they're optimized for processing high volumes of
homogeneously typed values in sequence.

But wouldn't vector calculations on database data be sped up? I'm
thinking of GIS data, joins across ranges like matching one (start, end)
range with another, etc.
I realize these are rather specific calculations, but if they're
important to your application...
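
For example, a range-overlap test is embarrassingly parallel. A
minimal CUDA sketch, with a hypothetical layout of one probe interval
tested against columns of start/end values:

#include <cstdio>
#include <cuda_runtime.h>

/* Flag every stored range that overlaps the probe range. Two closed
 * intervals [s1,e1] and [s2,e2] overlap iff s1 <= e2 && s2 <= e1. */
__global__ void rangeOverlap(const float *start, const float *end,
                             int *overlaps, float qStart, float qEnd,
                             int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        overlaps[i] = (start[i] <= qEnd) && (qStart <= end[i]);
}

int main(void)
{
    const int n = 4;
    float h_start[n] = {0.0f, 5.0f, 12.0f, 20.0f};
    float h_end[n]   = {3.0f, 9.0f, 15.0f, 25.0f};
    int h_overlaps[n];

    float *d_start, *d_end;
    int *d_overlaps;
    cudaMalloc((void **)&d_start, n * sizeof(float));
    cudaMalloc((void **)&d_end, n * sizeof(float));
    cudaMalloc((void **)&d_overlaps, n * sizeof(int));
    cudaMemcpy(d_start, h_start, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_end, h_end, n * sizeof(float), cudaMemcpyHostToDevice);

    /* Which stored ranges overlap the probe range [8, 13]? */
    rangeOverlap<<<1, n>>>(d_start, d_end, d_overlaps, 8.0f, 13.0f, n);

    cudaMemcpy(h_overlaps, d_overlaps, n * sizeof(int),
               cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; i++)
        printf("range %d overlaps: %d\n", i, h_overlaps[i]);  /* 0 1 1 0 */

    cudaFree(d_start);
    cudaFree(d_end);
    cudaFree(d_overlaps);
    return 0;
}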

OTOH modern PC GPUs are optimized for pushing textures; basically
transferring a lot of data in as short a time as possible. Maybe it'd
be possible to move result sets around that way? Maybe even do joins?

And then there are the vertex and pixel shaders...

It'd be kind of odd though, to order a big time database server with a
high-end gaming card in it :P

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

#5Alejandro Torras
atec_post@yahoo.es
In reply to: Billings, John (#1)
Re: Using the GPU

Billings, John wrote:

Does anyone think that PostgreSQL could benefit from using the video
card as a parallel computing device? I'm working on a project using
Nvidia's CUDA with an 8800 series video card to handle non-graphical
algorithms. I'm curious if anyone thinks that this technology could
be used to speed up a database? If so, which part of the database, and
what kind of parallel algorithms would be used?

Looking at Nvidia's CUDA homepage
(http://developer.nvidia.com/object/cuda.html), I see that parallel
bitonic sorting could be used instead of qsort/heapsort/mergesort (I
don't know which is used).
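
The core of such a sort, modeled loosely on the CUDA SDK's bitonic
sort sample, might look like this single-block sketch; it assumes a
power-of-two input that fits in shared memory, and a real sort would
need a multi-block merge stage on top:

#include <cstdio>
#include <cuda_runtime.h>

/* Single-block bitonic sort. Assumes n == blockDim.x and n is a power
 * of two, with the whole array resident in shared memory. */
__global__ void bitonicSort(int *values, int n)
{
    extern __shared__ int shared[];
    const unsigned int tid = threadIdx.x;

    shared[tid] = values[tid];
    __syncthreads();

    for (unsigned int k = 2; k <= (unsigned int)n; k *= 2) {   /* stage */
        for (unsigned int j = k / 2; j > 0; j /= 2) {          /* sub-stage */
            unsigned int ixj = tid ^ j;                        /* partner */
            if (ixj > tid) {
                bool ascending = (tid & k) == 0;
                if ((shared[tid] > shared[ixj]) == ascending) {
                    int tmp = shared[tid];                     /* swap */
                    shared[tid] = shared[ixj];
                    shared[ixj] = tmp;
                }
            }
            __syncthreads();
        }
    }
    values[tid] = shared[tid];
}

int main(void)
{
    const int n = 8;
    int h[n] = {9, 1, 8, 2, 7, 3, 6, 4};
    int *d;
    cudaMalloc((void **)&d, n * sizeof(int));
    cudaMemcpy(d, h, n * sizeof(int), cudaMemcpyHostToDevice);
    bitonicSort<<<1, n, n * sizeof(int)>>>(d, n);
    cudaMemcpy(h, d, n * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; i++)
        printf("%d ", h[i]);   /* 1 2 3 4 6 7 8 9 */
    printf("\n");
    cudaFree(d);
    return 0;
}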

--
Alejandro Torras

#6Alejandro Torras
atec_post@yahoo.es
In reply to: Alejandro Torras (#5)
Re: Using the GPU

Alejandro Torras wrote:

Billings, John wrote:

Does anyone think that PostgreSQL could benefit from using the video
card as a parallel computing device? I'm working on a project using
Nvidia's CUDA with an 8800 series video card to handle non-graphical
algorithms. I'm curious if anyone thinks that this technology could
be used to speed up a database? If so, which part of the database,
and what kind of parallel algorithms would be used?

Looking at Nvidia's CUDA homepage
(http://developer.nvidia.com/object/cuda.html), I see that parallel
bitonic sorting could be used instead of qsort/heapsort/mergesort (I
don't know which is used).

I think that the function cublasIsamax() explained at
http://developer.download.nvidia.com/compute/cuda/0_8/NVIDIA_CUBLAS_Library_0.8.pdf
can be used to find the element of a single-precision vector with the
largest absolute value, but, according to a previous post by Alexander
Staubo, this kind of function is best suited to fixed-length values.
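
For what it's worth, here is a minimal sketch of calling it through
the legacy CUBLAS interface described in that PDF; error handling is
omitted, and note that Isamax returns the 1-based index of the largest
absolute value rather than the maximum itself:

#include <cstdio>
#include <cublas.h>   /* the legacy CUBLAS interface from the 0.8 docs */

int main(void)
{
    const int n = 5;
    float h_x[n] = {1.0f, -7.5f, 3.0f, 7.0f, -2.0f};
    float *d_x;

    cublasInit();                                   /* start up CUBLAS */
    cublasAlloc(n, sizeof(float), (void **)&d_x);   /* device vector */
    cublasSetVector(n, sizeof(float), h_x, 1, d_x, 1);

    /* 1-based index of the element with the largest absolute value */
    int idx = cublasIsamax(n, d_x, 1);
    printf("largest |x| is element %d: %f\n", idx, h_x[idx - 1]); /* 2: -7.5 */

    cublasFree(d_x);
    cublasShutdown();
    return 0;
}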

But could the data be separated into two zones, one for varying-length
data and another for fixed-length data? With this approach, the
fixed-length data might be amenable to more and deeper optimizations,
such as parallel processing.

--
Alejandro Torras

#7Tom Allison
tom@tacocat.net
In reply to: Alban Hertroys (#4)
Re: Using the GPU

On Jun 11, 2007, at 4:31 AM, Alban Hertroys wrote:

Alexander Staubo wrote:

On 6/8/07, Billings, John <John.Billings@paetec.com> wrote:

If so, which part of the database, and what kind of parallel
algorithms would be used?

GPUs are parallel vector processing pipelines which, as far as I can
tell, do not readily lend themselves to the data structures that
PostgreSQL uses; they're optimized for processing high volumes of
homogeneously typed values in sequence.

But wouldn't vector calculations on database data be sped up? I'm
thinking of GIS data, joins across ranges like matching one (start,
end) range with another, etc.
I realize these are rather specific calculations, but if they're
important to your application...

OTOH modern PC GPUs are optimized for pushing textures; basically
transferring a lot of data in as short a time as possible. Maybe it'd
be possible to move result sets around that way? Maybe even do joins?

OTOH databases might not be running on modern desktop PCs with a big
GPU investment. Rather, they might be running on a "headless" machine
where little consideration is given to the GPU.

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance! <g>

#8Alexander Staubo
alex@purefiction.net
In reply to: Tom Allison (#7)
Re: Using the GPU

On 6/16/07, Tom Allison <tom@tacocat.net> wrote:

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance! <g>

Why does it matter what kind of hardware you can (not "have to") buy
to give your database a performance boost? With a GPU, you would have
one more component that you could upgrade to improve performance;
that's more possibilities, not less. I only see a problem with a
database that would *require* a GPU to achieve adequate performance,
or to function at all, but that's not what this thread is about.

Alexander.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Staubo (#8)
Re: Using the GPU

"Alexander Staubo" <alex@purefiction.net> writes:

On 6/16/07, Tom Allison <tom@tacocat.net> wrote:

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance! <g>

Why does it matter what kind of hardware you can (not "have to") buy
to give your database a performance boost? With a GPU, you would have
one more component that you could upgrade to improve performance;
that's more possibilities, not less. I only see a problem with a
database that would *require* a GPU to achieve adequate performance,
or to function at all, but that's not what this thread is about.

Too often, arguments of this sort disregard the opportunity costs of
development going in one direction vs another. If we make any
significant effort to make Postgres use a GPU, that's development effort
spent on that rather than some other optimization; and more effort,
ongoing indefinitely, to maintain that code; and perhaps the code
will preclude other possible optimizations or features because of
assumptions wired into it. So you can't just claim that using a GPU
might be interesting; you have to persuade people that it's more
interesting than other places where we could spend our
performance-improvement efforts.

regards, tom lane

#10Gregory Stark
stark@enterprisedb.com
In reply to: Tom Lane (#9)
Re: Using the GPU

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

So you can't just claim that using a GPU might be interesting; you have to
persuade people that it's more interesting than other places where we could
spend our performance-improvement efforts.

I have a feeling something as sexy as that could attract new developers
though.

I think the hard part here is coming up with an abstract enough interface that
it doesn't tie Postgres to a particular implementation. I would want to see a
library that provided primitives that Postgres could use. Then that library
could have drivers for GPUs, or perhaps also for various other kinds of
coprocessors available in high-end hardware.

I wonder if it exists already though.
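
A rough sketch of the shape I have in mind follows; every name in it
is invented purely for illustration:

#include <cstdio>

/* Hypothetical primitives library: the database codes against a table
 * of function pointers, and each backend (CPU fallback, GPU, other
 * coprocessor) supplies a driver that fills it in. */
typedef struct AccelDriver {
    const char *name;
    /* Scan a column vector, writing a 0/1 match flag per row. */
    void (*scan_gt_f32)(const float *col, int *match,
                        float threshold, int n);
} AccelDriver;

/* Portable CPU implementation used when no coprocessor is present. */
static void cpu_scan_gt_f32(const float *col, int *match,
                            float threshold, int n)
{
    for (int i = 0; i < n; i++)
        match[i] = col[i] > threshold;
}

static const AccelDriver cpu_driver = { "cpu", cpu_scan_gt_f32 };

int main(void)
{
    /* A GPU build would register a driver whose scan_gt_f32 launches a
     * kernel instead; the calling code would stay exactly the same. */
    const AccelDriver *drv = &cpu_driver;

    float col[4] = {1.0f, 5.0f, 2.0f, 9.0f};
    int match[4];
    drv->scan_gt_f32(col, match, 3.0f, 4);
    printf("driver=%s matches: %d %d %d %d\n", drv->name,
           match[0], match[1], match[2], match[3]);   /* 0 1 0 1 */
    return 0;
}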

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#11Tom Allison
tom@tacocat.net
In reply to: Tom Lane (#9)
Re: Using the GPU

Tom Lane wrote:

"Alexander Staubo" <alex@purefiction.net> writes:

On 6/16/07, Tom Allison <tom@tacocat.net> wrote:

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance! <g>

Why does it matter what kind of hardware you can (not "have to") buy
to give your database a performance boost? With a GPU, you would have
one more component that you could upgrade to improve performance;
that's more possibilities, not less. I only see a problem with a
database that would *require* a GPU to achieve adequate performance,
or to function at all, but that's not what this thread is about.

Too often, arguments of this sort disregard the opportunity costs of
development going in one direction vs another. If we make any
significant effort to make Postgres use a GPU, that's development effort
spent on that rather than some other optimization; and more effort,
ongoing indefinitely, to maintain that code; and perhaps the code
will preclude other possible optimizations or features because of
assumptions wired into it. So you can't just claim that using a GPU
might be interesting; you have to persuade people that it's more
interesting than other places where we could spend our
performance-improvement efforts.

You have a good point.

I don't know enough about how/what people use databases for in general to know
what would be a good thing to work on. I'm still trying to find out the
particulars of postgresql, which are always sexy.

I'm also trying to fill in the gaps between what I already know in Oracle and
how to implement something similar in postgresql. But I probably don't know
enough about Oracle to do much there either.

I'm a believer in strong fundamentals over glamour.

#12Alexander Staubo
alex@purefiction.net
In reply to: Tom Lane (#9)
Re: Using the GPU

On 6/16/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Alexander Staubo" <alex@purefiction.net> writes:

On 6/16/07, Tom Allison <tom@tacocat.net> wrote:

It might make an interesting project, but I would be really depressed
if I had to go buy an NVidia card instead of investing in more RAM to
optimize my performance! <g>

Why does it matter what kind of hardware you can (not "have to") buy
to give your database a performance boost? With a GPU, you would have
one more component that you could upgrade to improve performance;
that's more possibilities, not less. I only see a problem with a
database that would *require* a GPU to achieve adequate performance,
or to function at all, but that's not what this thread is about.

Too often, arguments of this sort disregard the opportunity costs of
development going in one direction vs another. If we make any
significant effort to make Postgres use a GPU, that's development effort
spent on that rather than some other optimization [...]

I don't see how this goes against what I wrote. I was merely
addressing Tom Allison's comment, which seems to express an
unnecessary fear. By analogy, not everyone uses hardware RAID, for
example, but
PostgreSQL can benefit greatly from it, so it does not make sense to
worry about "having to buy" it. Then again, Tom's comment may have
been in jest.

Alexander.