Re: What popular, large commercial websites run - Mailing list pgsql-general

From Nigel J. Andrews
Subject Re: What popular, large commercial websites run
Date
Msg-id Pine.LNX.4.21.0205012342060.9945-100000@ponder.fairway2k.co.uk
Whole thread Raw
In response to Re: What popular, large commercial websites run  (postgres@vrane.com)
Responses Re: What popular, large commercial websites run
Re: What popular, large commercial websites run
List pgsql-general
On Wed, 1 May 2002 postgres@vrane.com wrote:

> On Wed, May 01, 2002 at 02:52:21PM -0500, Shaun Thomas wrote:
> > On Mon, 29 Apr 2002, pgsql-gen Newsgroup wrote:
> >
> > > The way I see it, some managers will buy Oracle. They will have low
> > > profit margines. Some programers will use PostgreSQL. They will have
> > > high margins.
> >
> >
> > Why are our databases bloating, even after hourly full vacuums?  Because
> > we have a database with a 50-100% data turnover rate at about 100,000
> > rows, and postgres just can't handle it.  I've watched our 100mb
> > database grow to 500mb, then 2gigs.  Full dump and restore?  70mb
> > again.  Oh, and the spiking load, and table locks that occur during
> > full vacuums?  Just take the hit, web-surfers be damned.
> >
>
> I'm very curious to know why you have problem with growing
> database.  Does the performance suffer significantly
> if you don't do the FULL vacuum?  Surely if you can
> afford the oracle you can afford relatively much
> cheaper storage.  You must have other reasons
> than just not liking large database


What you want to do to see an issue is:

Create a table with a boolean column
Then fill it with 50000 rows with the boolean field set to true and 50000 rows
with the boolean field set to false;
Analyse the table

Then run
   EXPLAIN ANALYZE SELECT * FROM TABLE <mytable> WHERE <booleanfield> = true;

Then run
   UPDATE <mytable> SET <booleanfield> = false WHERE <booleanfield> = false;

(several times)

Then retry the EXPLAIN ANALYZE SELECT ... from above again, after analysing
the table again obviously.

What you should see is a seqscan in both plans with the final number of rows
the same but the second will have taken longer.

Now, I haven't tried that exact scenario myself so don't quote me :) and I'm
also assuming there's no caching of the results from the first explain through
to the second.


--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


pgsql-general by date:

Previous
From: Paul M Foster
Date:
Subject: Re: Fwd: Postfix Relay Hub SMTP server: errors from postgresql.org[64.49.215.8]
Next
From: postgres@vrane.com
Date:
Subject: Re: What popular, large commercial websites run