Thread:

From
"Gregory S. Williamson"
Date:
One of our sysads sent this link ... wondering if there is any comment on it from the world of actual users of linux
and a database.

<http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>

Greg Williamson
DBA GlobeXplorer LLC

Linux ready for high-volume databases?

From
Ron Johnson
Date:
On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
> One of our sysads sent this link ... wondering if there is any comment on it from the world of actual users of linux
> and a database.
>
> <http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>

"Weak points include lack of available tools, ease of use and ease
of installation"

Sounds like he needs point-and-drool tools...

On the other hand, could even a beefy Linux 2.4 *today* system handle
a 24x7 500GB db that must process 6-8M OLTP-style transactions per
day, while also getting hit by report queries?
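
(Back-of-the-envelope: 6-8M transactions spread over the 86,400 seconds in a
day averages out to roughly 70-93 transactions per second, before allowing
for peaks several times higher.)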

Don't think of this as a troll, because I really don't know, even
though I do know that MVS, OpenVMS & Solaris can.  (I won't even
ask about toys like Windows and FreeBSD.)

--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA

"Knowledge should be free for all."
Harcourt Fenton Mudd, Star Trek:TOS, "I, Mudd"


Re: Linux ready for high-volume databases?

From
"Shridhar Daithankar"
Date:
On 26 Aug 2003 at 2:55, Ron Johnson wrote:

> On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
> > One of our sysads sent this link ... wondering if there is any comment on it from the world of actual users of
> > linux and a database.
> >
> > <http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>
>
> "Weak points include lack of available tools, ease of use and ease
> of installation"
>
> Sounds like he needs point-and-drool tools...
>
> On the other hand, could even a beefy Linux 2.4 *today* system handle
> a 24x7 500GB db that must process 6-8M OLTP-style transactions per
> day, while also getting hit by report queries?

If linux isn't limited to intel, probably yes. Of course, that does not carry
any weight beyond an opinion. A mainframe running postgresql could probably
handle that..:-)

If you have a monster database running, that is proof. Well, I don't know of
any.

BTW, Tom mentioned a couple of huge databases running under postgresql, the 2u
survey and I forgot the other one. What were they running on?

> Don't think of this as a troll, because I really don't know, even
> though I do know that MVS, OpenVMS & Solaris can.  (I won't even
> ask about toys like Windows and FreeBSD.)

Well, given that Windows holds more top TPC-C spots than any other single
combination, it remains a possibility if you have money...:-)

Bye
 Shridhar

--
Peers's Law:    The solution to a problem changes the nature of the problem.


Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "RJ" == Ron Johnson <ron.l.johnson@cox.net> writes:

RJ> Don't think of this as a troll, because I really don't know, even
RJ> though I do know that MVS, OpenVMS & Solaris can.  (I won't even
RJ> ask about toys like Windows and FreeBSD.)

Well, you must be smoking something funny if you think FreeBSD is a
'toy' to be lumped in with windows....

I run a 24x7x365 db on FreeBSD which has *never* crashed in the 3
years it has been in production.  Only downtime was the upgrade from
PG 7.1 to 7.2 and once for a switchover from RAID5 to RAID10.  I *may*
have a few minutes of down time shortly when updating from PG 7.2 to
7.4 on a new box since I'm saturating the disk I/O bandwidth on the
old box.  The eRServer software will be doing the data migration on
the live db so I don't have significant down time.


The DB is currently about 27Mb on disk (including indexes) and
processes several million inserts and updates daily, and a few million
deletes once every two weeks.


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Linux ready for high-volume databases?

From
Ron Johnson
Date:
On Tue, 2003-08-26 at 10:00, Dennis Gearon wrote:
> FreeBSD may be a toy, maybe not, but it runs more of the webhosting
> domains than any other OS.

That was supposed to be a joke.  Putting FreeBSD in the same class
with Winblows is a prima facie absurdity, no matter how you cut it...

> Ron Johnson wrote:
>
> >On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
> >
[snip]
> >though I do know that MVS, OpenVMS & Solaris can.  (I won't even
> >ask about toys like Windows and FreeBSD.)

--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA

"Millions of Chinese speak Chinese, and it's not hereditary..."
Dr. Dean Edell


Re: Linux ready for high-volume databases?

From
Dennis Gearon
Date:
FreeBSD may be a toy, maybe not, but it runs more of the webhosting
domains than any other OS.

Ron Johnson wrote:

>On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
>
>
>>One of our sysads sent this link ... wondering if there is any comment on it from the world of actual users of linux
>>and a database.
>>
>><http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>
>>
>>
>
>"Weak points include lack of available tools, ease of use and ease
>of installation"
>
>Sounds like he needs point-and-drool tools...
>
>On the other hand, could even a beefy Linux 2.4 *today* system handle
>a 24x7 500GB db that must process 6-8M OLTP-style transactions per
>day, while also getting hit by report queries?
>
>Don't think of this as a troll, because I really don't know, even
>though I do know that MVS, OpenVMS & Solaris can.  (I won't even
>ask about toys like Windows and FreeBSD.)
>
>
>


Re: Linux ready for high-volume databases?

From
Al Hulaton
Date:
After seeing this article yesterday, I did a bit of research. One _big_ reason
why Sourceforge/VA/OSDN is moving over to IBM/WebSphere/DB2 from PostgreSQL is
the resulting product will be jointly marketed by Sourceforge and IBM's
zillions of sales people. So not only will they get a shiny, new db, but
backend revenue.

"The companies will jointly market and sell the software as part of the
commercial agreement. "-- 4th paragraph, last sentence.
http://www.eweek.com/print_article/0,3668,a=30025,00.asp

"In a separate announcement today, VA Software announced a significant
commercial agreement with IBM focused on the joint marketing and sales of the
next generation of SourceForge™ Enterprise Edition." -- 7th paragraph from
their press release at
http://www.vasoftware.com/news/press.php/2002/1070.html

Perhaps the PostgreSQL team bidding for the job, if any were even consulted,
didn't frame the project as IBM did -- a product joint venture. It's a good
tactic and I don't blame Sourceforge one bit for the opportunity.

The decision wasn't entirely technical so I don't see this as a loss for
PostgreSQL. DB2 isn't a slouch db by any means but not many companies will be
able to bargain with IBM as Sourceforge did. If you're a retailer in Topeka
with 3 locations, I doubt IBM would give you the same attention or joint
marketing deal they gave Sourceforge. DB2 ain't cheap.

--
Best,
Al Hulaton    |  Sr. Account Engineer  |  Command Prompt, Inc.
503.222.2783  |  ahulaton@commandprompt.com
Home of Mammoth PostgreSQL and 'Practical PostgreSQL'
Managed PostgreSQL, Linux services and consulting
Read and Search O'Reilly's 'Practical PostgreSQL' at
http://www.commandprompt.com


Re: Linux ready for high-volume databases?

From
Ron Johnson
Date:
On Tue, 2003-08-26 at 03:06, Shridhar Daithankar wrote:
> On 26 Aug 2003 at 2:55, Ron Johnson wrote:
>
> > On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
> > > One of our sysads sent this link ... wondering if there is any comment on it from the world of actual users of
linuxand a database. 
> > >
> > > <http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>
> >
> > "Weak points include lack of available tools, ease of use and ease
> > of installation"
> >
> > Sounds like he needs point-and-drool tools...
> >
> > On the other hand, could even a beefy Linux 2.4 *today* system handle
> > a 24x7 500GB db that must process 6-8M OLTP-style transactions per
> > day, while also getting hit by report queries?
>
> If linux isn't limited to intel, probably yes. Of course, that does not carry

I think the worry is more about the kernel than anything else.
After all, an 8-way 2.8GHz Xeon w/ 16GB RAM and a bunch of 64-bit
66MHz PCI Ultra320 SCSI controllers or SAN HBAs can handle that much
database, I think.

--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA

"Experience hath shewn, that even under the best forms [of
government] those entrusted with power have, in time, and by
slow operations, perverted it into tyranny."
Thomas Jefferson


Re: Linux ready for high-volume databases?

From
Andrew Sullivan
Date:
On Tue, Aug 26, 2003 at 09:15:14AM -0700, Al Hulaton wrote:
> Perhaps the PostgreSQL team bidding for the job, if any were even consulted,
> didn't frame the project as IBM did -- a product joint venture. It's a good
> tactic and I don't blame Sourceforge one bit for the opportunity.

Well, since the main point was to get some $$ into the company, bucks
which IBM has and PostgreSQL doesn't, it's not too surprising that
the PostgreSQL team didn't win.  The move to DB2 was apparently a
quid pro quo for the cash.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Linux ready for high-volume databases?

From
"scott.marlowe"
Date:
On Tue, 26 Aug 2003, Al Hulaton wrote:

> After seeing this article yesterday, I did a bit of research. One _big_ reason
> why Sourceforge/VA/OSDN is moving over to IBM/WebSphere/DB2 from PostgreSQL is
> the resulting product will be jointly marketed by Sourceforge and IBM's
> zillions of sales people. So not only will they get a shiny, new db, but
> backend revenue.
>
> "The companies will jointly market and sell the software as part of the
> commercial agreement. "-- 4th paragraph, last sentence.
> http://www.eweek.com/print_article/0,3668,a=30025,00.asp
>
> "In a separate announcement today, VA Software announced a significant
> commercial agreement with IBM focused on the joint marketing and sales of the
> next generation of SourceForge™ Enterprise Edition." -- 7th paragraph from
> their press release at
> http://www.vasoftware.com/news/press.php/2002/1070.html
>
> Perhaps the PostgreSQL team bidding for the job, if any were even consulted,
> didn't frame the project as IBM did -- a product joint venture. It's a good
> tactic and I don't blame Sourceforge one bit for the opportunity.
>
> The decision wasn't entirely technical so I don't see this as a loss for
> PostgreSQL. DB2 isn't a slouch db by any means but not many companies will be
> able to bargain with IBM as Sourceforge did. If you're a retailer in Topeka
> with 3 locations, I doubt IBM would give you the same attention or joint
> marketing deal they gave Sourceforge. DB2 ain't cheap.

Actually, I remember quite clearly the incredibly bad performance of
sourceforge's search engine for the better part of a year after switching
out postgresql for db2.  It had been quite snappy, and I could enter
database or some other keyword and have a page display in ~2 seconds or
less.  For the first three months or so after the switch, most searches
simply timed out to PHP's default 30 seconds.  Even when they got it
working better, it only had maybe 1/10th or less of the keywords indexed
that they had had in postgresql (i.e. words like index or email weren't
being indexed. :-)

It was probably at least 9 months later that the search engine was finally
back to being usable, and another 3 or 4 before it was about as good as
postgresql.  And we're talking an older version (I believe it was 7.1) of
postgresql as well.

The switch to db2 was driven by partnering business needs, not by poor
performance of postgresql.


Re: Linux ready for high-volume databases?

From
Greg Stark
Date:
Vivek Khera <khera@kcilink.com> writes:

> I run a 24x7x365 db on FreeBSD which has *never* crashed in the 3
> years it has been in production.  Only downtime was the upgrade from
> PG 7.1 to 7.2 and once for a switchover from RAID5 to RAID10.

I would be interested to know what backup strategy you use for this. Without
online backups this means that if you had crashed you would have lost data up
to the last pg_dump you took? Had you done tests to see how long it would have
taken to restore from the pg_dump?

Online backups with archived transaction logs are the next big killer feature
(the last one remaining?) for 24x7 operation I think.

> The DB is currently about 27Mb on disk (including indexes) and
> processes several million inserts and updates daily, and a few million
> deletes once every two weeks.

Oh, it's a really small database. That helps a lot with the backup problems of
24x7 operation. Still I would be interested.


--
greg

Re: Linux ready for high-volume databases?

From
Jonathan Bartlett
Date:
> Online backups with archived transaction logs are the next big killer feature
> (the last one remaining?) for 24x7 operation I think.

I believe this is at least theoretically possible using Linux device layer
tricks.  Using network block devices, you can have a network RAID1, with
the transaction logs living over NFS.  Never tried this, but it seems all
the tools are there.



Re: Linux ready for high-volume databases?

From
Oleg Bartunov
Date:
On Tue, 26 Aug 2003, scott.marlowe wrote:

> On Tue, 26 Aug 2003, Al Hulaton wrote:
>
> > After seeing this article yesterday, I did a bit of research. One _big_ reason
> > why Sourceforge/VA/OSDN is moving over to IBM/WebSphere/DB2 from PostgreSQL is
> > the resulting product will be jointly marketed by Sourceforge and IBM's
> > zillions of sales people. So not only will they get a shiny, new db, but
> > backend revenue.
> >
> > "The companies will jointly market and sell the software as part of the
> > commercial agreement. "-- 4th paragraph, last sentence.
> > http://www.eweek.com/print_article/0,3668,a=30025,00.asp
> >
> > "In a separate announcement today, VA Software announced a significant
> > commercial agreement with IBM focused on the joint marketing and sales of the
> > next generation of SourceForge™ Enterprise Edition." -- 7th paragraph from
> > their press release at
> > http://www.vasoftware.com/news/press.php/2002/1070.html
> >
> > Perhaps the PostgreSQL team bidding for the job, if any were even consulted,
> > didn't frame the project as IBM did -- a product joint venture. It's a good
> > tactic and I don't blame Sourceforge one bit for the opportunity.
> >
> > The decision wasn't entirely technical so I don't see this as a loss for
> > PostgreSQL. DB2 isn't a slouch db by any means but not many companies will be
> > able to bargain with IBM as Sourceforge did. If you're a retailer in Topeka
> > with 3 locations, I doubt IBM would give you the same attention or joint
> > marketing deal they gave Sourceforge. DB2 ain't cheap.
>
> Actually, I remember quite clearly the incredibly bad performance of
> sourceforge's search engine for the better part of a year after switching
> out postgresql for db2.  It had been quite snappy, and I could enter
> database or some other keyword and have a page display in ~2 seconds or
> less.  For the first three months or so after the switch, most searches
> simply timed out to PHP's default 30 seconds.  Even when they got it
> working better, it only had maybe 1/10th or less of the keywords indexed
> that they had had in postgresql (i.e. words like index or email weren't
> being indexed. :-)
>
> It was probably at least 9 months later that the search engine was finally
> back to being usable, and another 3 or 4 before it was about as good as
> postgresql.  And we're talking an older version (I believe it was 7.1) of
> postgresql as well.
>
> The switch to db2 was driven by partnering business needs, not by poor
> performance of postgresql.
>

It would be interesting to get the projects' metadata so I could set up a simple
tsearch2-based full text search and see how fast it could be now.



    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: Linux ready for high-volume databases?

From
Christopher Browne
Date:
After a long battle with technology, khera@kcilink.com (Vivek Khera), an earthling, wrote:
>>>>>> "RJ" == Ron Johnson <ron.l.johnson@cox.net> writes:
>
> RJ> Don't think of this as a troll, because I really don't know, even
> RJ> though I do know that MVS, OpenVMS & Solaris can.  (I won't even
> RJ> ask about toys like Windows and FreeBSD.)
>
> Well, you must be smoking something funny if you think FreeBSD is a
> 'toy' to be lumped in with windows....

I suspect your irony-meter didn't get activated when it was supposed
to.

Please keep in mind that to the sorts of people that read and believe
and act on the source article, any system that doesn't have a vendor
to "certify" its fitness for database use is manifestly a "toy" that
only fools and Englishmen would use for any purpose that was the
slightest bit important.

Your injection of technical fact just confuses the matter for people
who prefer to get their "technical expertise" from some white-paper
writer at the Gartner Group.

And there is a very slight bit of genuine technical reality to this,
too.  People _assert_ that there are technical reasons to prefer
FreeBSD over other systems, but it is difficult to get much past
the anecdotal evidence.  The fact that you have had a system running,
apparently quite successfully, for a while, is not a _proof_ that
FreeBSD is more or less satisfactory than other OSes for the purpose.
It is merely an anecdote.

Unfortunately, we seldom see _anything_ better than anecdotes.  People
report anecdotes that they heard that someone lost data to ext2.
Others report anecdotes that they have had good results with one
filesystem or another or one OS or another.

When there are problems, there isn't a good "certifiable" (or
'statistically significant') way of evaluating whether the faults
resulted from:
 a) A PG bug
 b) An OS filesystem bug
 c) An OS device driver bug
 d) Bad disk controller
 e) Bad disk drive

It's quite easy for these to feed into one another so that a severe
problem compounds into a tragedy of errors.  (Been there :-(.)  Is
there a way to "certify" that the composition of your particular
hardware with FreeBSD with PostgreSQL can't lead to tragedy?  I'd
think not.

There's some pathos in with that irony...
--
http://cbbrowne.com/info/wp.html
Rules  of the  Evil Overlord  #41. "Once  my power  is secure,  I will
destroy all those pesky time-travel devices."
<http://www.eviloverlord.com/>

Re: Linux ready for high-volume databases?

From
Dennis Gearon
Date:
With the low cost of disks, it might be a good idea to just copy to
disks, that one can put back in.

Greg Stark wrote:

>Vivek Khera <khera@kcilink.com> writes:
>
>
>
>>I run a 24x7x365 db on FreeBSD which has *never* crashed in the 3
>>years it has been in production.  Only downtime was the upgrade from
>>PG 7.1 to 7.2 and once for a switchover from RAID5 to RAID10.
>>
>>
>
>I would be interested to know what backup strategy you use for this. Without
>online backups this means that if you had crashed you would have lost data up
>to the last pg_dump you took? Had you done tests to see how long it would have
>taken to restore from the pg_dump?
>
>Online backups with archived transaction logs are the next big killer feature
>(the last one remaining?) for 24x7 operation I think.
>
>
>
>>The DB is currently about 27Mb on disk (including indexes) and
>>processes several million inserts and updates daily, and a few million
>>deletes once every two weeks.
>>
>>
>
>Oh, it's a really small database. That helps a lot with the backup problems of
>24x7 operation. Still I would be interested.
>
>
>
>


Re: Linux ready for high-volume databases?

From
Greg Stark
Date:
Dennis Gearon <gearond@fireserve.net> writes:

> With the low cost of disks, it might be a good idea to just copy to disks, that
> one can put back in.

Uh, sure, using hardware raid 1 and breaking one set of drives out of the
mirror to perform the backup is an old trick. And for small databases backups
are easy that way. Just store a few dozen copies of the pg_dump output on your
live disks for local backups and burn CD-Rs for offsite backups.

But when you have hundreds of gigabytes of data and you want to be able to
keep multiple snapshots of your database both on-site and off-site... No, you
can't just buy another hard drive and call it a business continuity plan.

As it turns out, my current project will be quite small. I may well be adopting
the first approach. I'm thinking of taking a pg_dump regularly (nightly if I can
get away with doing it that infrequently) keeping the past n dumps, and
burning a CD with those dumps.
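
Roughly the kind of thing I have in mind -- a minimal sketch only, where the
database name, directory and retention count are made up, and pg_dump is
assumed to be on the PATH:

    import datetime, glob, os, subprocess

    DB = "mydb"                       # made-up database name
    BACKUP_DIR = "/var/backups/pg"    # made-up destination directory
    KEEP = 7                          # how many past dumps to keep around

    stamp = datetime.datetime.now().strftime("%Y%m%d")
    dump_file = os.path.join(BACKUP_DIR, "%s-%s.sql" % (DB, stamp))

    # plain-text logical dump of the whole database
    subprocess.check_call(["pg_dump", "-f", dump_file, DB])

    # prune dumps beyond the retention window (names sort oldest-first)
    dumps = sorted(glob.glob(os.path.join(BACKUP_DIR, "%s-*.sql" % DB)))
    for old in dumps[:-KEEP]:
        os.remove(old)

The CD burn would then just pick up whatever is left in that directory.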

This doesn't provide what online backups do -- recovery to the minute of the
crash. And I get nervous having only logical pg_dump output, no backups of the
actual blocks on disk. But is that what everybody does?

--
greg

Re: Linux ready for high-volume databases?

From
"Shridhar Daithankar"
Date:
On 26 Aug 2003 at 9:15, Al Hulaton wrote:

> After seeing this article yesterday, I did a bit of research. One _big_ reason
> why Sourceforge/VA/OSDN is moving over to IBM/Webshere/DB2 from PostgreSQL is
> the resulting product will be jointly marketed by Sourceforge and IBM's
> zillions of sales people. So not only will they get a shiny, new db, but
> backend revenue.
>
> "The companies will jointly market and sell the software as part of the
> commercial agreement. "-- 4th paragraph, last sentence.
> http://www.eweek.com/print_article/0,3668,a=30025,00.asp

<From vague memory somewhere from some article>

One of the technical reasons sourceforge went to DB2 was that DB2 had
clustering. Postgresql could not scale beyond a single machine, and on a single
machine it had scaling limitations as well.

Note that this was done quite a while back. Today postgresql might be as
scalable as sourceforge requires, but they needed it then and had to move.

</From vague memory somewhere from some article>

<rant>
However, if DB clustering was the problem, personally I would have split the
data across two machines in two different databases and had the app consolidate
that data, along the lines of the sketch below. The effort of rewriting the app
could have been well repaid, given the performance hit sourceforge took
immediately after the move.

But that's me..
</rant>
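
Purely as a sketch of what that app-side consolidation might look like -- the
host, database, table and column names here are invented, and it assumes a
DB-API style driver such as psycopg:

    import psycopg2   # any DB-API 2.0 PostgreSQL driver would do

    def fetch_projects(dsn, pattern):
        # run the same query against one of the per-shard databases
        conn = psycopg2.connect(dsn)
        try:
            cur = conn.cursor()
            cur.execute(
                "SELECT name, downloads FROM projects WHERE name ILIKE %s",
                (pattern,))
            return cur.fetchall()
        finally:
            conn.close()

    # two machines, each holding half of the projects
    shards = ["host=db1 dbname=sf_a_m", "host=db2 dbname=sf_n_z"]

    rows = []
    for dsn in shards:
        rows.extend(fetch_projects(dsn, "%database%"))

    # consolidate in the app: one result set, as if from a single database
    rows.sort(key=lambda r: r[1], reverse=True)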

Bye
 Shridhar

--
There are always alternatives.        -- Spock, "The Galileo Seven", stardate 2822.3


Re: Linux ready for high-volume databases?

From
"Shridhar Daithankar"
Date:
On 26 Aug 2003 at 8:12, Ron Johnson wrote:

> On Tue, 2003-08-26 at 03:06, Shridhar Daithankar wrote:
> > If linux isn't limited to intel, probably yes. Of course, that does not carry
>
> I think the worry is more about the kernel than anything else.
> After all, an 8-way 2.8GHz Xeon w/ 16GB RAM and a bunch of 64-bit
> 66MHz PCI Ultra320 SCSI controllers or SAN HBAs can handle that much
> database, I think.

Hmm.. I would say a 64 bit CPU will make a difference as well.

I am impressed by 2.6. Yesterday I installed 2.6.0-test4. Full postgresql CVS
head compile took 9 min 18 sec on 2.4.20 and 7 min 14 sec on 2.6.0-test4.
Impressive.

I am going to run pgbench on it. That will add another dimension to the 7.3.4
vs. 7.4beta1 performance bouts.
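
Nothing fancy on the pgbench side; roughly the following, where the scale
factor and client count are just what I would start with and "bench" is an
empty database created for the purpose:

    import subprocess

    # initialize the standard pgbench tables at scale factor 10,
    # then run 10 concurrent clients for 1000 transactions each
    subprocess.check_call(["pgbench", "-i", "-s", "10", "bench"])
    subprocess.check_call(["pgbench", "-c", "10", "-t", "1000", "bench"])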

Bye
 Shridhar

--
Bizoos, n.:    The millions of tiny individual bumps that make up a basketball.        --
 Rich Hall, "Sniglets"


Re: Linux ready for high-volume databases?

From
Ron Johnson
Date:
On Tue, 2003-08-26 at 23:35, Greg Stark wrote:
> Dennis Gearon <gearond@fireserve.net> writes:
[snip]
> This doesn't provide what online backups do -- recovery to the minute of the
> crash. And I get nervous having only logical pg_dump output, no backups of the
> actual blocks on disk. But is that what everybody does?

Gak!!  It can never be guaranteed that the "actual blocks on disk"
are transactionally consistent.  Thus, the pg_dump output is sufficient.

However, there is still the large problem of PITR.  Unless you
double your h/w and run Postgresql-R, you can not guarantee recovery
to an exact point in time if there is a hardware failure that
destroys the database.

Therefore, you can only restore databases to the time that the
last pg_dump was taken.

--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA

The difference between Rock&Roll and Country Music?
Old Rockers still on tour are pathetic, but old Country singers
are still great.


Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:

GS> Vivek Khera <khera@kcilink.com> writes:

>> I run a 24x7x365 db on FreeBSD which has *never* crashed in the 3
>> years it has been in production.  Only downtime was the upgrade from
>> PG 7.1 to 7.2 and once for a switchover from RAID5 to RAID10.

GS> I would be interested to know what backup strategy you use for
GS> this. Without online backups this means that if you had crashed
GS> you would have lost data up to the last pg_dump you took? Had you
GS> done tests to see how long it would have taken to restore from the
GS> pg_dump?

Currently it is pg_dump.  Once the new server is online this week,
we'll be using eRServer to keep a 'hot spare' slave ready for quick
switchover.

Both systems use RAID10 hardware arrays for the database.

Restore from dump takes about an hour for the data, and then the rest
of eternity (something like 18 hours last time I did it) for index
generation.

The pg_dump process takes about 52 minutes across the network.

GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.

Well, perhaps, but it is big enough and pounded on enough
(read/insert/update very often) that it saturates the disk.  I have
memory to spare according to the system stats.

I personally *really* wonder how people run DBs that are much larger
and have a high rate of read/insert/update across large tables with RI
checks and all that normal good stuff.  The tuning recommendations I
have been through are insufficient to really help for my load.  Perhaps
my current server hardware just isn't up to it.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:

GS> the first approach. I'm thinking of taking a pg_dump regularly
GS> (nightly if I can get away with doing it that infrequently)
GS> keeping the past n dumps, and burning a CD with those dumps.

Basically what I do.  I burn a set of CDs from one of my dumps once a
week, and keep the rest online for a few days.   I'm really getting
close to splurging for a DVD writer since my dumps are way too big for
a single CD.

GS> This doesn't provide what online backups do, of recovery to the
GS> minute of the crash. And I get nervous having only logical pg_dump
GS> output, no backups of the actual blocks on disk. But is that what
GS> everybody does?

Well, if you want backups of the blocks on disk, then you need to shut
down the postmaster so that it is a consistent copy.  You can't copy
the table files "live" this way.
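
If you really do want the blocks, the shape of it is stop, copy, restart -- a
rough sketch, with made-up paths, assuming pg_ctl is how you drive the
postmaster:

    import subprocess

    PGDATA = "/usr/local/pgsql/data"          # made-up data directory
    ARCHIVE = "/backups/pgdata-cold.tar.gz"   # made-up archive location

    # a block-level copy is only consistent with the postmaster stopped
    subprocess.check_call(["pg_ctl", "stop", "-D", PGDATA, "-m", "fast"])
    try:
        subprocess.check_call(["tar", "czf", ARCHIVE, PGDATA])
    finally:
        subprocess.check_call(["pg_ctl", "start", "-D", PGDATA,
                               "-l", "/tmp/postmaster.log"])

The downtime window is however long the tar takes.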

So, yes, having the pg_dump is pretty much your safest bet to have a
consistent dump.  And using a replicated slave with, e.g., eRServer, is
another way, but that requires more hardware.



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "AH" == Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

AH> On Wed, Aug 27, 2003 at 12:21:53PM -0400, Vivek Khera wrote:
>> >>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:
>>
GS> Vivek Khera <khera@kcilink.com> writes:

GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.
>>
>> Well, perhaps, but it is big enough and pounded on enough
>> (read/insert/update very often) that it saturates the disk.  I have
>> memory to spare according to the system stats.

AH> Well, was it really a 27 MB database, or was it a typo and you meant 27
AH> GB?  The latter doesn't fit in my "really small database" category...

Yes, typo.  It is  27GB *not* just 27MB.  Heck, I could do 27MB on
my solid state drive at an incredible speed ;-)

Re: Linux ready for high-volume databases?

From
Alvaro Herrera
Date:
On Wed, Aug 27, 2003 at 12:21:53PM -0400, Vivek Khera wrote:
> >>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:
>
> GS> Vivek Khera <khera@kcilink.com> writes:

> GS> Oh, it's a really small database. That helps a lot with the backup
> GS> problems of 24x7 operation. Still I would be interested.
>
> Well, perhaps, but it is big enough and pounded on enough
> (read/insert/update very often) that it saturates the disk.  I have
> memory to spare according to the system stats.

Well, was it really a 27 MB database, or was it a typo and you meant 27
GB?  The latter doesn't fit in my "really small database" category...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)

Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:

>> The DB is currently about 27Mb on disk (including indexes) and
>> processes several million inserts and updates daily, and a few million
>> deletes once every two weeks.

GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.

Ok... so I re-read my post.  I meant 27Gb on disk.  Duh.  Sorry for the
confusion!


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Linux ready for high-volume databases?

From
Vivek Khera
Date:
>>>>> "DG" == Dennis Gearon <gearond@fireserve.net> writes:

DG> With the low cost of disks, it might be a good idea to just copy to
DG> disks, that one can put back in.

Your file system copies will be broken, since they only work if
PG is shut down.  For this very reason I don't even bother doing a
file system dump of the PG data partition.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Linux ready for high-volume databases?

From
Dennis Gearon
Date:
I meant, use disks as the medium.  Put dumps on it.

Vivek Khera wrote:

>>>>>> "DG" == Dennis Gearon <gearond@fireserve.net> writes:
>
> DG> With the low cost of disks, it might be a good idea to just copy to
> DG> disks, that one can put back in.
>
> Your file system copies will be broken, since they only work if
> PG is shut down.  For this very reason I don't even bother doing a
> file system dump of the PG data partition.

Re: Linux ready for high-volume databases?

From
Andrew Sullivan
Date:
On Wed, Aug 27, 2003 at 12:35:29AM -0400, Greg Stark wrote:
> This doesn't provide what online backups do -- recovery to the minute of the
> crash. And I get nervous having only logical pg_dump output, no backups of the
> actual blocks on disk. But is that what everybody does?

Well, you can replicate and thereby get a copy of what was on the
master.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Optimizing pgsql for read-only access

From
Steve Lane
Date:
Hello all:

I'm building a web-based app that is purely a query tool: no data can be
added or edited. Postgres is the back end.

What steps do I need to take to make the setup as fast as possible for
read-only access? Are there any default settings I can disable because I
don't need them, and gain some speed that way? AFAIK there's no way to turn
transactions off, but what about something like fsync? Will I get a
performance boost by turning that off?

I'm aware of the "standard" pgsql optimizations and I'll do my best to put
those in place. I'm wondering whether there's anything extra I can do, that
might not normally be "safe", but might become so in a read-only
environment.

All the data will be scrubbed out every night and refreshed from the
original source. Should I be running a VACUUM ANALYZE after each refresh?
Any other optimizations or hints I can pass along to the query processor
that reflect the fact that the data will NEVER change between VACUUM passes?
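
For concreteness, the nightly cycle I have in mind is roughly the following,
where the file and database names are placeholders and psql does the work:

    import subprocess

    DB = "reporting"                          # placeholder database name
    REFRESH_SQL = "/data/nightly_load.sql"    # placeholder dump from the source system

    # reload the data, then VACUUM ANALYZE so the planner has fresh
    # statistics on what will be a read-only data set until tomorrow
    subprocess.check_call(["psql", "-d", DB, "-f", REFRESH_SQL])
    subprocess.check_call(["psql", "-d", DB, "-c", "VACUUM ANALYZE"])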

Thanks for any thoughts or advice.

-- sgl


=======================================================
Steve Lane

Vice President
The Moyer Group
14 North Peoria St Suite 2H
Chicago, IL 60607

Voice: (312) 433-2421       Email: slane@moyergroup.com
Fax:   (312) 850-3930       Web:   http://www.moyergroup.com
=======================================================


Re: Optimizing pgsql for read-only access

From
Greg Stark
Date:
Steve Lane <slane@moyergroup.com> writes:

> Ron Johnson <ron.l.johnson@cox.net> writes:
>
> > > Dennis Gearon <gearond@fireserve.net> writes:
> > >
> > > This doesn't provide what online backups do, of recovery to the minute of the
> > > crash. And I get nervous having only logical pg_dump output, no backups of the
> > > actual blocks on disk. But is that what everybody does?
> >
> > Gak!!  It can never be guaranteed that the "actual blocks on disk"
> > are transactionally consistent.  Thus, the pg_dump output is sufficient.
>
> > Hello all:
> >
> > I'm building a web-based app that is purely a query tool: no data can be
> > added or edited. Postgres is the back end.

What does this have to do with online backups vs pg_dump?
Please don't follow up to threads with unrelated questions.

In any case you're far more likely to see answers if you post a message
properly as your message won't show up buried inside old threads in people's
mail user agents.

--
greg