On Tue, 2004-06-15 at 21:09, C. Bensend wrote:
> > Right. The only thing that will do for you is waste space with padding
> > blanks. The performance improvement you saw was due to something else,
> > which I theorize is not having to waste time reading dead space.
> >
> > Since you obviously doubt this assertion, try the same experiment except
> > load the data into a new table with the *same* structure as the original
> > (ie, all varchar). You'll see the same or probably better performance.
>
> Hi Tom,
>
> I don't doubt your assertion, I just don't understand it all yet. :)
> Let me assure you, you're the expert here, not I.
>
> I did as you suggested - I created a third copy of the table, using
> the exact same structure. And you're absolutely right - it was lightning
> fast (around 100ms).
>
> >> Actually, all databases on this server are vacuumed nightly, right
> >> before backups.
> >
> > Not often enough evidently...
>
> This statement worries me a bit. The data in the original table was
> bulk-loaded the other night (less than three days ago, I'm sure), and new
> entries have been added at the approximate rate of 300 per day. Is this
> going to continue to happen?
>
> Or do I just need to vacuum more often? I _did_ try a vacuum before
> asking the list for help, but it didn't give any improvement (just a
> vacuum analyze).
>
> > Note the difference in "pages". emails is nearly double the physical
> > size, even though the live data in it is doubtless smaller. (You could
> > use the contrib/pgstattuple functions to see exactly how much live data
> > there is.)
>
> OK. I see (and understand) the pages value now.
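For what it's worth, once the pgstattuple functions from contrib/ are loaded
into the database (I believe it's just a matter of running the module's SQL
script with psql), checking exactly how much of the table is live vs. dead is
a one-liner:

    SELECT * FROM pgstattuple('emails');

That should report the live tuple length, dead tuple length, and free space,
which ought to line up with the difference in pages Tom pointed out.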
>
> > I was actually expecting to see more than a 2:1 difference in file size,
> > seeing that you reported more than a 2:1 difference in read time. It
> > could be that there is also some question of physical layout of the file
> > on disk. The original table probably accreted over time and was given
> > space that's not very consecutive on disk. The copied table was
> > written in one swoop, more or less, and is very likely stored in a more
> > nearly consecutive set of physical disk blocks. I'm not sure how you'd
> > investigate this theory though --- AFAIK there isn't any real easy way
> > to find out how badly fragmented a file is in most Unix filesystems.
>
> Ugh. So, what would you recommend as a fix? I see the problem, and I
> see the fix that just worked, but I certainly can't be the only person
> around who is using a "wide" table with a lot of character data being
> added at a rather slow rate...
You might want to look into the autovacuum daemon, and/or increase the
free space map (fsm) settings so they're large enough to track all of the
space freed up by vacuuming.