Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables? - Mailing list pgsql-hackers

From dg@illustra.com (David Gould)
Subject Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?
Msg-id 9803130153.AA28898@hawk.illustra.com
In response to Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?  (Michal Mosiewicz <mimo@interdata.com.pl>)
Responses Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
Michal Mosiewicz writes:
> As I was always biased to threading I would note that in many cases it
> is a big win. First of all, today it's the IO which is usually the
> slowest part of the database. ...
...
> However, if you do some IO, then some processing, then some IO.... you
> lose the capability of optimising your requests. ...
> ... But if your system is not loaded too heavily, it's
> good to parallelize IO tasks. And the easiest way to accomplish it is to
> use threads for parallel execution of tasks.

Agreed, but what you are talking about here is decomposing a query into
its parallel components and executing them in parallel. This is a win
of course, but the optimizer and executor have to support it. Also, you
start to want things like table fragmentation across devices to make this
work. A big job. As a shortcut, you can just do some lookahead on index scans
and do prefetch. Doesn't buy as much, but could probably be done very
quickly.
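
As a rough illustration of the prefetch shortcut, here is a minimal sketch
that hints upcoming blocks to the kernel while reading the current one. It
assumes a POSIX system that provides posix_fadvise() and is not Postgres
code: BLOCK_SIZE, LOOKAHEAD, prefetch_block() and scan_with_prefetch() are
hypothetical names, and the list of block numbers stands in for whatever an
index scan would return.

```c
#define _XOPEN_SOURCE 600    /* expose posix_fadvise() and pread() */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192      /* assumed page size */
#define LOOKAHEAD  4         /* hypothetical prefetch distance, in blocks */

/* Tell the kernel that block `blkno` will be wanted soon. Returns 0 on success. */
static int prefetch_block(int fd, long blkno)
{
    return posix_fadvise(fd, (off_t) blkno * BLOCK_SIZE, BLOCK_SIZE,
                         POSIX_FADV_WILLNEED);
}

/*
 * Read the blocks listed in `blocks` in the given order, keeping hints
 * issued for up to LOOKAHEAD blocks beyond the one being read.
 * Returns the number of blocks that were read in full.
 */
static size_t scan_with_prefetch(int fd, const long *blocks, size_t nblocks,
                                 char *buf)
{
    size_t next_hint = 1;    /* first block not yet hinted */
    size_t done = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        /* Top up the hint window before blocking on the current read. */
        while (next_hint < nblocks && next_hint <= i + LOOKAHEAD) {
            (void) prefetch_block(fd, blocks[next_hint]); /* advisory only */
            next_hint++;
        }
        if (pread(fd, buf, BLOCK_SIZE, (off_t) blocks[i] * BLOCK_SIZE)
                == BLOCK_SIZE)
            done++;
    }
    return done;
}
```

The hints are advisory, so a failed posix_fadvise() is ignored and the scan
still works, only without the overlap. This needs no changes to the
optimizer or executor, which is why it is the cheap option.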

> But I notice that many people still think of threads as a replacement
> for fork. Of course, in such case it's pretty useless since fork is fast
> enough. But the key to the success is to parallelize single queries not
> only to leverage processor usage, but also to push IO to its maximum.

This is indeed what I was thinking about. The process-per-connection scheme
of Postgres is often criticised in comparison with a thread-per-connection
scheme, as in Sybase for example. I was responding to that criticism.

> > That is a very easy win for us.  I hadn't considered the synchronization
> > problems with palloc/pfree, and that could be a real problem.
>
> A few months ago I was thinking about it. Actually I don't see many
> problems with things like palloc/pfree. I don't see any problems with

If you have multiple threads each allocating memory at the same time, the
allocator data structures have to be protected.
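
To make the point concrete, here is a minimal sketch of a shared allocator
whose bookkeeping is guarded by a mutex. It is a simple bump allocator
standing in for palloc, not the real thing: SharedArena, shared_alloc(),
WorkerArgs and alloc_worker() are hypothetical names, and the arena size and
alignment are arbitrary assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ARENA_SIZE 65536     /* hypothetical arena size, in bytes */
#define ALIGNMENT  16        /* assumed alignment for returned chunks */

typedef struct {
    pthread_mutex_t lock;    /* protects `used` */
    size_t used;             /* bytes handed out so far */
    _Alignas(ALIGNMENT) char data[ARENA_SIZE];
} SharedArena;

static SharedArena arena = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Hand out `size` bytes from the shared arena, or NULL if it is exhausted. */
static void *shared_alloc(size_t size)
{
    void *p = NULL;
    size_t rounded = (size + ALIGNMENT - 1) & ~(size_t) (ALIGNMENT - 1);

    if (rounded < size)
        return NULL;         /* rounding overflowed */

    /* Without the lock, two threads could read the same `used` value
     * and be handed overlapping memory. */
    pthread_mutex_lock(&arena.lock);
    if (rounded <= ARENA_SIZE - arena.used) {
        p = arena.data + arena.used;
        arena.used += rounded;
    }
    pthread_mutex_unlock(&arena.lock);
    return p;
}

typedef struct {
    size_t count;            /* allocations to attempt */
    size_t size;             /* bytes per allocation */
    size_t succeeded;        /* allocations that returned non-NULL */
} WorkerArgs;

/* Thread body with the signature pthread_create() expects. */
static void *alloc_worker(void *arg)
{
    WorkerArgs *w = arg;

    assert(w != NULL);
    for (size_t i = 0; i < w->count; i++)
        if (shared_alloc(w->size) != NULL)
            w->succeeded++;
    return NULL;
}
```

Each thread would get its own WorkerArgs and be started with
pthread_create(&tid, NULL, alloc_worker, &args). The cost is that every
allocation now takes a lock, which is the synchronization overhead in
question; per-thread arenas would avoid it at the price of more memory.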

> any heap data that is used locally. But it is a big problem when you
> take a look at global variables and global data that is accessed and
> modified in many places. This is a potential source of trouble.

Too right.

-dg

David Gould            dg@illustra.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
 - I realize now that irony has no place in business communications.

