Re: VARCHAR -vs- CHAR: huge performance difference? - Mailing list pgsql-admin

From Scott Marlowe
Subject Re: VARCHAR -vs- CHAR: huge performance difference?
Date
Msg-id 1087363871.1582.20.camel@localhost.localdomain
Whole thread Raw
In response to Re: VARCHAR -vs- CHAR: huge performance difference?  ("C. Bensend" <benny@bennyvision.com>)
Responses Re: VARCHAR -vs- CHAR: huge performance difference?
List pgsql-admin
On Tue, 2004-06-15 at 21:09, C. Bensend wrote:
> > Right.  The only thing that will do for you is waste space with padding
> > blanks.  The performance improvement you saw was due to something else,
> > which I theorize is not having to waste time reading dead space.
> >
> > Since you obviously doubt this assertion, try the same experiment except
> > load the data into a new table with the *same* structure as the original
> > (ie, all varchar).  You'll see the same or probably better performance.
>
> Hi Tom,
>
>    I don't doubt your assertion, I just don't understand it all yet.  :)
> Let me assure you, you're the expert here, not I.
>
>    I did as you suggested - I created a third copy of the table, using
> the exact same structure.  And you're absolutely right - it was lightning
> fast (around 100ms).
>
> >> Actually, all databases on this server are vacuumed nightly, right
> >> before backups.
> >
> > Not often enough evidently...
>
> This statement worries me a bit.  The data in the original table was
> bulk-loaded the other night (less than three days I'm sure), and new
> entries have been added at the approximate rate of 300 per day.  Is this
> going to continue to happen?
>
> Or do I just need to vacuum more often?  I _did_ try a vacuum before
> asking the list for help, but it didn't give any improvement (just a
> vacuum analyze).
>
> > Note the difference in "pages".  emails is nearly double the physical
> > size, even though the live data in it is doubtless smaller.  (You could
> > use the contrib/pgstattuple functions to see exactly how much live data
> > there is.)
>
> OK.  I see (and understand) the pages value now.
>
> > I was actually expecting to see more than a 2:1 difference in file size,
> > seeing that you reported more than a 2:1 difference in read time.  It
> > could be that there is also some question of physical layout of the file
> > on disk.  The original table probably accreted over time and was given
> > space that's not very consecutive on disk.  The copied table was
> > written in one swoop, more or less, and is very likely stored in a more
> > nearly consecutive set of physical disk blocks.  I'm not sure how you'd
> > investigate this theory though --- AFAIK there isn't any real easy way
> > to find out how badly fragmented a file is in most Unix filesystems.
>
> Ugh.  So, what would you recommend as a fix?  I see the problem, and I
> see the fix that just worked, but I certainly can't be the only person
> around that is using a "wide" table with a lot of character data being
> added at a rather slow rate...

You might want to look into the autovacuum daemon, and / or increasing
fsm settings to be large enough to hold all the spare tuples released by
vacuuming.


pgsql-admin by date:

Previous
From: "Iain"
Date:
Subject: Re: VARCHAR -vs- CHAR: huge performance difference?
Next
From: Rajesh Kumar Mallah
Date:
Subject: weired behavior... after pg_resetxlog-> dump->initdb-->reload.