Re: Performance while loading data and indexing - Mailing list pgsql-general

From Shridhar Daithankar
Subject Re: Performance while loading data and indexing
Date
Msg-id 3D936C6D.12380.14936AEC@localhost
Whole thread Raw
In response to Re: Performance while loading data and indexing  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Performance while loading data and indexing
Re: [HACKERS] Performance while loading data and indexing
Re: Performance while loading data and indexing
List pgsql-general
On 26 Sep 2002 at 10:33, Tom Lane wrote:

> "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:
> > RedHat7.2/PostgreSQL7.1.3
>
> I'd suggest a newer release of Postgres ... 7.1.3 is pretty old ...

I agree.. downloadind 7.2.2 right away..

> > Create unique composite index on 2 char and a timestamp field:  25226 sec.
>
> What do you mean by "char" exactly?  If it's really char(N), how much
> are you paying in padding space?  There are very very few cases where
> I'd not say to use varchar(N), or text, instead.  Also, does it have to
> be character data?  If you could use an integer or float datatype
> instead the index operations should be faster (though I can't say by
> how much).  Have you thought carefully about the order in which the
> composite index columns are listed?

I have forwarded the idea of putting things into number. If it causes speedup
in index lookup/creation, it would do. Looks like bigint is the order of the
day..

>
> > sort_mem = 12000
>
> To create an index of this size, you want to push sort_mem as high as it
> can go without swapping.  12000 sounds fine for the global setting, but
> in the process that will create the index, try setting sort_mem to some
> hundreds of megs or even 1Gb.  (But be careful: the calculation of space
> actually used by CREATE INDEX is off quite a bit in pre-7.3 releases
> :-(.  You should probably expect the actual process size to grow to two
> or three times what you set sort_mem to.  Don't let it get so big as to
> swap.)

Great. I was skeptical to push it beyond 100MB. Now I can push it to corners..

> > wal_buffers = 65536
>
> The above is a complete waste of memory space, which would be better
> spent on letting the kernel expand its disk cache.  There's no reason
> for wal_buffers to be more than a few dozen.

That was a rather desparate move. Nothing was improving performance and then we
started pushing numbers.. WIll get it back..  Same goes for 64 WAL files.. A GB
looks like waste to me..

I might have found the bottleneck, although by accident. Mysql was running out
of space while creating index. So my friend shut down mysql and tried to move
things by hand to create links. He noticed that even things like cp were
terribly slow and it hit us.. May be the culprit is the file system. Ext3 in
this case.

My friend argues for ext2 to eliminate journalling overhead but I favour
reiserfs personally having used it in pgbench with 10M rows on paltry 20GB IDE
disk for 25 tps..

We will be attempting raiserfs and/or XFS if required. I know how much speed
difference exists between resiserfs and ext2. Would not be surprised if
everythng just starts screaming in one go..

Bye
 Shridhar

--
Cropp's Law:    The amount of work done varies inversly with the time spent in the
office.


pgsql-general by date:

Previous
From: Ericson Smith
Date:
Subject: Unixtime (epoch) into timestamp?
Next
From: "Shridhar Daithankar"
Date:
Subject: Re: [HACKERS] Performance while loading data and indexing