Physical sites handling large data

From: "scott.marlowe"
I moved this over to general, where it's more on topic...

On Fri, 13 Sep 2002, Shridhar Daithankar wrote:

> Hi all,
>
> One of my friends is evaluating postgres for large databases. This is a
> select-intensive application, something similar to data warehousing as far
> as I can see.
>
> The data is 150GB in flat files, so it would swell to 200GB+ with indexes.
>
> Is anybody running that kind of site? Any URL? Any performance numbers or
> tuning tips for random selects?
>
> I would hate to put mysql there, but we are evaluating that too. I would
> hate it if postgres loses this to mysql because I didn't know a few things
> about postgres.
>
> Secondly, would it make a difference if I host that database on, say, an
> HP-UX box? From some tests I have done for my job, a single-CPU HP-UX box
> trounces a 4-way Xeon box. Any suggestions in this direction?

Often, the real limiter for database performance is the I/O subsystem and
its bandwidth, not the CPUs.  After that, memory access speed and bandwidth
are very important too, so I can see a big HP-UX box beating the pants off
of a Xeon.

Honestly, I'd put a dual 1GHz PIII with 1GB of RAM up against a quad Xeon
with 2GB of RAM if I got to spend the difference in cost on a very fast
RAID array for the PIII.  Since a quad Xeon with 2GB of RAM and a pair of
18GB SCSI drives goes for ~$27,500 from Dell, and a dual 1GHz PIII with
five 15kRPM 18GB drives goes for ~$6,700, that leaves me about $20,000 to
spend on an external RAID array on top of the five 15kRPM drives I've
already got configured.  An external RAID array with 144GB of 15kRPM 18GB
drives runs ~$7,700, so you could get three if you got the dual PIII
without all those drives built into it.  That makes for 24 15kRPM drives
and about 430GB of storage, all in a four-unit rack-mounted setup.
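
(For what it's worth, the arithmetic behind those drive figures, as a
quick sketch:)

  echo $(( 144 / 18 ))    # 8 drives per 144GB array, at 18GB per drive
  echo $(( 3 * 8 ))       # 24 drives across three arrays
  echo $(( 24 * 18 ))     # 432GB raw, i.e. "about 430GB"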

My point being: spend more money on the drive subsystem than on anything
else and you'll probably be fine, but PostgreSQL may or may not be your
best answer.  It may be better to use something like Berkeley DB to handle
this job than a SQL database.


Re: Physical sites handling large data

From: Justin Clift
Hi Scott,

Good move.  :)

Shridhar, any idea of the kind of demands they'll be placing on the
database?

For example, does your friend have an idea of:

a) how many clients will be simultaneously connecting to the database(s),
normally and at peak times?

b) how sensitive is the application to performance lag and downtime?

c) what are the data integrity requirements?  Large arrays of data that
are mission critical need to be treated differently than small arrays,
especially when taking b) into consideration.  Higher-end non-Intel
servers generally have better features in their OS and hardware for
dealing with large amounts of important data.

d) what kind of stuff is your friend familiar with?  For example, is he
OK with Unix in general, etc.?

The more info you can get to us, the better we can help you out.

:-)

Regards and best wishes,

Justin Clift


"scott.marlowe" wrote:
>
> I moved this over to general, where it's more on topic...
>
> On Fri, 13 Sep 2002, Shridhar Daithankar wrote:
>
> > Hi all,
> >
> > One of my friends is evaluating postgres for large databases. This is a select
> > intensive application which is something similar to data-warehousing as far as
> > I can see.
> >
> > The data is 150GB in flat files so would swell to 200GB+ with indexes.
> >
> > Is anybody running that kind of site? Any url? Any performance numbers/tuning
> > tips for random selects?
> >
> > I would hate to put mysql there but we are evaluating that too. I would hate if
> > postgres loses this to mysql because I didn't know few things about postgres.
> >
> > Secondly would it make a difference if I host that database on say, an HP-UX
> > box? From some tests I have done for my job, single CPU HP-UX box trounces 4
> > way xeon box. Any suggestions in this directions?
>
> Often times the real limiter for database performance is IO bandwidth and
> subsystem, not the CPUs.  After that memory access speed and bandwidth are
> very important too, so I can see a big HP UX box beating the pants off of
> a Xeon.
>
> Honestly, I'd put a dual 1G PIII 1G ram up against a quad xeon with 2
> Gig ram if I got to spend the difference in cost on a very fast RAID
> array for the PIII.  Since a quad Xeon with 2 Gigs ram and a pair of 18
> gig SCSI drives goes for ~ $27,500 on Dell, and a Dual PIII 1Ghz with 5
> 15KRPM 18 gig drives goes for ~ $6,700, that leaves me with about $20,000
> to spend on an external RAID array on top of the 5 15kRPM drives I've
> already got configured.  An external RAID array with 144GB of 15krpm 18gig
> drives runs ~$7700, so you could get three if you got the dual PIII
> without all those drives built into it.  That makes for 24 15kRPM drives
> and about 430 Gigs of storage, all in a four unit Rack mounted setup.
>
> My point being, spend more money on the drive subsystem than anything else
> and you'll probably be fine, but postgresql may or may not be your best
> answer.  It may be better to use something like berkeley db to handle this
> job than a SQL database.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: Physical sites handling large data

From: "Shridhar Daithankar"
On 14 Sep 2002 at 9:39, Justin Clift wrote:

> Hi Scott,
>
> Good move.  :)
>
> Shridhar, any idea of the kind of demands they'll be placing on the
> database?

First of all, thank you all for your quick and helpful responses.  Robert
E. Bruccoleri sent me his site's description, which gave me an idea of
what postgresql can do at that scale.

I spent more than 3 hours looking for URLs of such large installations and
didn't find a single one via Google (maybe I am bad at searching..).  From
what Robert E. Bruccoleri tells me, there is a bigger installation of
postgres out there than is mentioned anywhere else.

I would request people to put up some information if they have such an
installation.  I would do the same, subject to permission from the client.
Postgres deserves this sort of publicity.  Such information is very
crucial when it comes to convincing government bodies to consider open
source alternatives to commercial ones, through LUGs etc..

I understand that designing a database for such a site would require
detailed knowledge of the application.  I am pushing my friend/colleague
to get as much information out of it as possible..

Now to answer your queries..

> For example, does your friend have an idea of:
>
> a) how many clients will be simultaneously connecting to the database(s),
> normally and at peak times?

AFAIK, this job is like analysis of log data (telecom domain).  So the
number of clients would not be big, if there is little or no parallelism
to be extracted at the application level.  Most importantly, the number of
clients will not fluctuate much.. so it gives a somewhat deterministic
level of prediction, so to speak..

> b) how sensitive is the application to performance lag and downtime?

Hardly.. This is a replica of a production system, so data loss is not
going to be an issue.  Of course, if the database pulls wrong data and
hence produces wrong results, that is obviously unacceptable.. but I guess
things won't come down to that level..

There are no set performance goals as of now.  Obviously, faster is
better..

> c) what are the data integrity requirements?  Large arrays of data that
> are mission critical need to be treated differently than small arrays,
> especially when taking b) into consideration.  Higher-end non-Intel
> servers generally have better features in their OS and hardware for
> dealing with large amounts of important data.

IMO if I have to write an evaluation and proposal for this task, Intel
hardware won't feature in it, simply because Intel does not have a proven
64-bit CPU.  Personally, I would recommend HP hardware, knowing the
client's profile..

> d) what kind of stuff is your friend familiar with?  For example, is he
> OK with Unix in general, etc.?

That's OK.  These guys are into HP-UX heavily, so that's not a problem.
Postgres has figured on their radar as a way to save the licensing costs
of an Oracle installation.

> "scott.marlowe" wrote:
> > Often times the real limiter for database performance is IO bandwidth and
> > subsystem, not the CPUs.  After that memory access speed and bandwidth are
> > very important too, so I can see a big HP UX box beating the pants off of
> > a Xeon.

Perfectly agreed.  Slightly OT here: in the test I referred to in my OP,
the Xeon machine had a mirrored SCSI RAID and a 533MHz FSB.  The HP-UX box
was a single PA-8700/750MHz CPU with a single SCSI disk.

IMO even the bandwidth considerations were in the Xeon's favour.  The only
things in favour of the HP box were the 3MB on-chip cache and the 64-bit
RISC CPU.  If that makes so much of a difference, Intel machines won't
stand a chance for a long time, IMO..

> > My point being: spend more money on the drive subsystem than on
> > anything else and you'll probably be fine, but PostgreSQL may or may
> > not be your best answer.  It may be better to use something like
> > Berkeley DB to handle this job than a SQL database.

Agreed, but with a few differences...

I am going to divide my pitch as follows.

1) Can PostgreSQL do the job?

Yes.  Refer to the FAQ for the limits of postgresql, and to real-world
installations like the one Robert E. Bruccoleri described to me in person,
as examples.

2) Why postgresql?

a) It can do the job, as illustrated by 1.

b) The architecture of postgres

Some strong points:

  i) It's the most complete and/or useful open source SQL implementation.
If they want, they can customise it later as they see fit.  Using Berkeley
DB might do well at this level, but given the complexity of the
application (which I don't know much about), I would rather not propose
it.

 ii) It does not offer idiocies/features that would limit implementations.
E.g. since it relies on the OS to take care of storage, it will run with
the same peace of mind on an IDE disk as on a fibre array (or whatever the
highest-end tech available is).  One need not wait for a storage driver
from the database vendor.

Fewer unneeded features == cleaner implementation.

iii) Since a table is split into segments of 1GB each, there is no upper
limit on table size, and the table can be spread across storage devices
etc.. (hand-tweaking, basically; a sketch of the on-disk layout follows).
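
(For illustration, this is how the 1GB segments show up on disk; the
directory and file numbers here are hypothetical examples, assuming a
stock $PGDATA layout:)

  ls -l $PGDATA/base/16556/   # 16556: a hypothetical database OID
  # 16995      <- first 1GB segment of a big table (hypothetical file node)
  # 16995.1    <- second 1GB segment
  # 16995.2    <- and so on, one extra file per additional GB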

Some weak points (these guys are considering distributed databases, but
won't mind spending on hardware if the proposal is worth it):

  i) A database can not span multiple machines, so clustering is out.  If
the data is split into multiple databases on multiple machines, the
application will have to do the merging etc.

Any pointers on this? Partitioning etc?

 ii) No *out of the box* replication.  (OK, I can take this point down,
but when mysql pops up I've got to include it for a fair comparison.)

iii) Being a process-driven architecture, it can not process data in
parallel even when that would be possible.  E.g. say a table is 100GB in
size, and hence split across 100 segments on the file system.  It can not
return data from all 100 segments simultaneously, because there is one
process per connection.  Besides, it won't be able to spread any
computational load across multiple CPUs.

3) How to do it?

a) Get a 64-bit architecture.  Depending upon computational requirements
and the projected number of connections, add CPUs.  Start with one CPU; I
guess that would be enough, given how good PA-RISC CPUs are.  (OK, I
haven't used anything else, but PA-RISC CPUs look good to me for data
manipulation.)

b) Get loads of RAM.  4-8GB sounds good to me.  Maybe even more.

On this topic: say I decide to devote 8GB to postgres; that means 1048576
buffers at the 8K page size.  Need I push the available shared memory
beyond this value?  Can I tweak the page size?  If I can, would it help
for such an installation?
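
(The arithmetic for that buffer count, as a quick sketch assuming the
default 8K page size:)

  # 8GB = 8388608 KB; divided by the 8KB page size:
  echo $(( 8 * 1024 * 1024 / 8 ))   # 1048576 buffers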

c) Get *fast* storage.  I would rather not elaborate on this point,
because the client is supposed to know better, being in the telecom
business.  But I would offer my inputs based upon information from Scott
etc.  I don't think it will come to that, though.

Any suggestions?  Modifications?  This is a rough plot; I will finalise it
this evening or tomorrow morning.. (I am from India, BTW, just to let you
know my time zone..;-)).  I hope I haven't missed any point I have thought
of in the last two days..

Once again, thanks a lot for the help offered.  I can not put into words
how on point this thread has been..

Bye
 Shridhar

--
DeVries' Dilemma:    If you hit two keys on the typewriter, the one you don't want
hits the paper.


Re: Physical sites handling large data

From: Tom Lane
"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:
> On this topic: say I decide to devote 8GB to postgres; that means
> 1048576 buffers at the 8K page size.  Need I push the available shared
> memory beyond this value?  Can I tweak the page size?  If I can, would
> it help for such an installation?

I do not believe you can push the shared memory size past 2GB, or about
250K buffers, because the size calculations for it are done in "int"
arithmetic.  Of course, this could be fixed if anyone cared to do the
legwork.  I doubt there's much point in worrying about it though.
A larger memory is still usable, it's just that the rest of it will be
used in the form of kernel-level disk cache not Postgres buffers.

            regards, tom lane
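
(To put numbers on the ceiling Tom describes, a back-of-the-envelope
sketch, not PostgreSQL's actual code; products shown with 64-bit shell
arithmetic:)

  # A signed 32-bit int tops out at 2147483647 bytes (~2GB):
  echo $((  250000 * 8192 ))   # 2048000000 -- ~250K buffers still fits
  echo $(( 1048576 * 8192 ))   # 8589934592 -- 1M buffers overflows an int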

Re: Physical sites handling large data

From: Ericson Smith
Hmmm...

Using the bigmem kernel and RH 7.3, we were able to set PostgreSQL's
shared memory to 3.2GB (out of 6GB of RAM).  Does this mean that
PostgreSQL will only use the first 2GB?

Our settings are:

shmmax = 3192000000
shared_buffers = 38500

ipcs output:
0x0052e2c1 98304      postgres  600        324018176  51

- Ericson Smith
eric@did-it.com




Re: Physical sites handling large data

From: Tom Lane
Ericson Smith <eric@did-it.com> writes:
> Using the bigmem kernel and RH 7.3, we were able to set PostgreSQL's
> shared memory to 3.2GB (out of 6GB of RAM).  Does this mean that
> PostgreSQL will only use the first 2GB?

I think you are skating on thin ice there --- there must have been some
integer overflows in the shmem size calculations.  It evidently worked out
as an unsigned result, but...

IIRC we have an open bug report from someone who tried to set
shared_buffers so large that the shmem size would have been ~5GB;
the overflowed size request was ~1GB and then it promptly dumped
core from trying to access memory beyond that.  We need to put in
some code to detect overflows in those size calculations.

In any case, pushing PG's shared memory to 50% of physical RAM is
completely counterproductive.  See past discussions (mostly on
-hackers and -admin if memory serves) about appropriate sizing of
shared buffers.  There are different schools of thought about this,
but I think everyone agrees that a shared-buffer pool that's roughly
equal to the size of the kernel's disk buffer cache is a waste of
memory.  One should be much bigger than the other.  I personally think
it's appropriate to let the kernel cache do most of the work, and so
I favor a shared_buffers setting of just a few thousand.

            regards, tom lane
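
(A sketch of the kind of setting Tom is describing, for postgresql.conf;
the exact number is workload-dependent:)

  shared_buffers = 4096   # "a few thousand" buffers = 32MB at the 8K
                          # page size, leaving the rest of RAM to the
                          # kernel's disk cache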

Re: Physical sites handling large data

From: "Shridhar Daithankar"
On 15 Sep 2002 at 11:47, Tom Lane wrote:

> Ericson Smith <eric@did-it.com> writes:
> > Using the bigmem kernel and RH 7.3, we were able to set PostgreSQL's
> > shared memory to 3.2GB (out of 6GB of RAM).  Does this mean that
> > PostgreSQL will only use the first 2GB?
> In any case, pushing PG's shared memory to 50% of physical RAM is
> completely counterproductive.  See past discussions (mostly on
> -hackers and -admin if memory serves) about appropriate sizing of
> shared buffers.  There are different schools of thought about this,
> but I think everyone agrees that a shared-buffer pool that's roughly
> equal to the size of the kernel's disk buffer cache is a waste of
> memory.  One should be much bigger than the other.  I personally think
> it's appropriate to let the kernel cache do most of the work, and so
> I favor a shared_buffers setting of just a few thousand.

So you mean that at large sites, tuning the kernel disk buffers becomes
part of tuning postgres?

IIRC kernel disk caching behaviour varies across Unices, and I really
don't know if the kernel will honour a caching request with a cache size
of more than, say, 2GB.. The Linux kernel is a different story: it eats
everything it can, but still..

That comes back to my other question.. Is it possible to change the page
size, and would that be beneficial under some conditions?
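
(For scale, what a different page size would mean under the ~2GB shared
memory ceiling Tom mentioned; a rough sketch:)

  # ~2GB of shared memory divided by the page size gives the buffer count:
  echo $(( 2 * 1024 * 1024 / 8  ))   # 262144 buffers at 8K pages
  echo $(( 2 * 1024 * 1024 / 32 ))   # 65536 buffers at 32K pages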

I am not asking because I don't want to listen to you, but I would rather
back up my claims when I am making an evaluation proposal.  Do you have
any numbers handy?  (I will search for them as well..)


Bye
 Shridhar

--
Painting, n.:    The art of protecting flat surfaces from the weather, and
exposing them to the critic.        -- Ambrose Bierce


Re: Physical sites handling large data

From: Manfred Koizar
On 15 Sep 2002 11:33:59 -0400, Ericson Smith <eric@did-it.com> wrote:
> shared memory to 3.2GB (out of 6GB of RAM). [...]
>shared_buffers = 38500
>
>ipcs output:
>0x0052e2c1 98304      postgres  600        324018176  51

Ericson, this looks more like 300MB to me, which might be a good
choice anyway ;-)

Servus
 Manfred
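
(The arithmetic behind Manfred's estimate, for reference:)

  echo $(( 38500 * 8192 ))          # 315392000 bytes from shared_buffers
  echo $(( 324018176 / 1048576 ))   # the ipcs segment: ~309MB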

Re: Physical sites handling large data

From: Ericson Smith
... that sound you hear is the sound of me knocking my head against the
brick wall in here...

Well, it looks like Tom Lane was right (as always) on this one.  On our
previous server, we had 4GB of RAM and 1.6GB of shared memory.  Does this
mean that the OS is efficiently caching the disk, and that our 320MB of
shared memory is good enough?

Our database is about 4GB at this point, with some tables having hundreds
of thousands or millions of records.

Running free looks like this:

[root@pg root]# free
             total       used       free     shared    buffers     cached
Mem:       5939524    5868720      70804          0      90732    5451808
-/+ buffers/cache:     326180    5613344
Swap:      2096440          0    2096440

There are 58 client processes running, at times up to 220.  The load on
this machine never runs above 1, with dual CPUs.

Top looks like this:

97 processes: 96 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:  1.2% user,  3.2% system,  0.0% nice, 94.5% idle
CPU1 states:  0.1% user,  0.0% system,  0.0% nice, 99.4% idle
CPU2 states:  0.3% user,  0.2% system,  0.0% nice, 99.0% idle
CPU3 states:  0.3% user,  0.2% system,  0.0% nice, 99.0% idle
Mem:  5939524K av, 5874740K used,   64784K free,  0K shrd,  91344K buff
Swap: 2096440K av,       0K used, 2096440K free,         5451892K cached

Any definitive insight here as to why I'm running so well at this point?
- Ericson







Re: Physical sites handling large data

From: "Shridhar Daithankar"
On 16 Sep 2002 at 17:01, Ericson Smith wrote:

> ... that sound you hear is the sound of me knocking my head against the
> brick wall in here...
>
> Well, it looks like Tom Lane was right (as always) on this one.  On our
> previous server, we had 4GB of RAM and 1.6GB of shared memory.  Does
> this mean that the OS is efficiently caching the disk, and that our
> 320MB of shared memory is good enough?

Looks like you are asking, but if you ask me, you just proved that it's
enough..

> Our database is about 4GB at this point, with some tables having
> hundreds of thousands or millions of records.
> Any definitive insight here as to why I'm running so well at this point?

I would suggest looking at the pg metadata regarding memory usage, as well
as the ipcs stats.  Besides, what are the kernel disk buffer settings?  I
believe you are using Linux, and these buffer settings can be controlled
via bdflush.

Your typical ipcs usage would be a much more valuable figure, along with
free..
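
(The sort of thing I mean, on Linux; a sketch:)

  ipcs -m                    # shared memory segments actually allocated
  cat /proc/sys/vm/bdflush   # the kernel write-back (disk buffer) knobs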

And BTW, what's your vacuum frequency? Just to count that in..






Bye
 Shridhar

--
Worst Vegetable of the Year:    The brussels sprout.  This is also the worst
vegetable of next year.        -- Steve Rubenstein


Re: Physical sites handling large data

From: Ericson Smith
Our vacuum frequency is once daily, with EXPLAINs happening every 2
hours.

We use the default kernel buffer settings.

- Ericson




Re: Physical sites handling large data

From: "Shridhar Daithankar"
On 17 Sep 2002 at 8:59, Ericson Smith wrote:

> Our vacuum frequency is once daily, with EXPLAINs happening every 2
> hours.
>
> We use the default kernel buffer settings.

I must say you have a good database installation.  As many people have
found and documented before, a timely vacuum can almost double the
performance.
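
(For the record, the sort of arrangement I mean; a sketch assuming a
nightly cron job and a hypothetical database name "mydb":)

  # /etc/crontab entry: VACUUM ANALYZE every night at 03:00
  0 3 * * *  postgres  vacuumdb --analyze mydb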

And I am really amazed that things worked with the default buffer
settings.  AFAIK, the kernel takes up to half the physical RAM for
buffers, but many distros (for Linux at least) limit it to 200MB or half
the RAM, so that on a huge-RAM box it would not eat up everything..

Bye
 Shridhar

--
MIT:    The Georgia Tech of the North