Home > mailing lists

Re: ShmemAlloc errors - Mailing list pgsql-general

From	Nick Burrett
Subject	Re: ShmemAlloc errors
Date	October 20, 2003 10:41:57
Msg-id	3F93BBD4.8090804@dsvr.net Whole thread Raw
In response to	Re: ShmemAlloc errors (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: ShmemAlloc errors (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-general

Tree view

Tom Lane wrote:
> Nick Burrett <nick@dsvr.net> writes:
>
>>The original table definition was exactly this:
>
>
>>CREATE TABLE fiveminute ( server CHAR(32),
>>                           stamp TIMESTAMP,
>>                           bytesin BIGINT CHECK (bytesin >= 0),
>>                           bytesout BIGINT CHECK (bytesout >= 0));
>
>
>>CREATE UNIQUE INDEX fiveminute_idx ON fiveminute(server,stamp);
>
>
>>The database never survived operational use.  The original import of
>>data took so long (2+ days) that the re-design was done almost immediately.
>
>
> It'd probably be better to load the initial data first and then build
> the index; boosting sort_mem while building the index should be a help
> too.  But there's still something very fishy here.  You said you had
> 1500 * 3000 = 4.5million rows in the tables.  Even with the dumbest
> approach to loading the data, I can't conceive of it taking 2 days.
> Even on my oldest and slowest development machine, I can load a table
> that size in about ten minutes, and create an index on it in another ten.

What I'll do is try the setup again based on the recommendations that
have appeared here and see what results I get.

The original data was inserted using INSERT statements rather than COPY.
  I can't remember whether indexes were dropped before-hand or not.

> One thing that comes to mind is that what you eliminated appears to have
> been an indexed char(32) column.  We have seen reports of very slow
> behavior on certain platforms if a non-C locale is used --- apparently
> strcoll() is just horrendously slow in some implementations.  It'd be
> worth ensuring that your database is initdb'd in C locale.  Also, is
> there a good reason to pad every server name to 32 characters?  Is the
> 32-character limit even real, or did you pluck that number from the air?
> Usually I recommend type text (or equivalently, varchar with no specific
> length limit) unless you have a clear application-driven need for a
> specific length limit --- and even then it should be varchar(n) not
> char(n).

The 32-character limit is a very real limit.  However the all the data
is machine generated, so removing the limits is certainly possible.
Though the limit was put in place to retain some data integrity.

The original database used to be in MySQL.  The CREATE INDEX
documentation for MySQL states that you must specify a prefix length for
TEXT columns.

I expected that specifying a specific length for a column would allow
for more efficient indexing and searching because:

a) you already know the exact length of the column
b) potentially the column-data could be stored starting on a word-boundary
c) apply string comparison functions that are optimised for data
starting on word-boundaries (i.e. by comparing words rather than bytes).
  Certainly for the C-locale.

Regards,

Nick.

--
Nick Burrett
Network Engineer, Designer Servers Ltd.   http://www.dsvr.co.uk

pgsql-general by date:

From: Peter Childs
Date: 20 October 2003, 10:27:45
Subject: Re: Recomended FS

From: "Ben-Nes Michael"
Date: 20 October 2003, 11:04:02
Subject: Re: Recomended FS

Re: ShmemAlloc errors - Mailing list pgsql-general

Previous

Next