Thread: Inserts are going slower ...

Inserts are going slower ...

From: Hervé Piedvache
Hi,

I have a database with 10 tables, each holding about 50,000,000 records.
Every day I have to delete about 20,000 records and insert about the same
number into one of these tables.

Then I run some aggregations into the other tables to get weekly results,
and global results per user. That means about 180,000 to 300,000 inserts per
table each day.

The calculation itself takes about 2 to 4 minutes ... I run the query into a
temporary table ... then I do an
INSERT INTO my_table SELECT * FROM temp_table;

And that's the problem: the INSERT takes (depending on the table) from 41
minutes up to 2 hours ... for only 180,000 to 300,000 rows ...
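For scale, the figures above work out to a surprisingly low per-row rate. A quick back-of-the-envelope check (an editorial sketch using only the numbers quoted in this message):

```python
# Implied INSERT throughput from the figures quoted above.
rows_low, rows_high = 180_000, 300_000
secs_fast, secs_slow = 41 * 60, 2 * 60 * 60   # "41 minutes up to 2 hours"

best = rows_high / secs_fast    # most rows in the least time
worst = rows_low / secs_slow    # fewest rows in the most time

print(f"roughly {worst:.0f} to {best:.0f} rows/second")
```

Even the best case is only around a hundred rows per second, far below what a bulk INSERT ... SELECT should achieve on this class of hardware.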

The tables have indexes that are really useful ... so please do not tell me
to delete some of them ... and I can't drop them before inserting the data,
because regenerating them takes far too long ...

I'm configured with no flush (fsync = off), I have 8 GB of RAM, and RAID 5 on
SCSI 7200 RPM hard drives ... I'm running Debian Linux, with PostgreSQL 7.4.3
compiled by myself.

What can I do to get better results? (Configuration options and/or a hardware
upgrade?)
What further information can I give you to help diagnose this?

Regards,
--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

Re: Inserts are going slower ...

From: Josh Berkus
Herve,

> What can I do to get better results? (Configuration options and/or a
> hardware upgrade?)
> What further information can I give you to help diagnose this?

1) What PostgreSQL version are you using?

2) What's your VACUUM, ANALYZE, VACUUM FULL, REINDEX schedule?

3) Can you list the non-default settings in your postgresql.conf?
Particularly shared_buffers, sort_mem, checkpoint_segments,
effective_cache_size, and max_fsm_pages?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Inserts are going slower ...

From: Hervé Piedvache
Josh,

On Tuesday, July 13, 2004 at 19:10, Josh Berkus wrote:
>
> > What can I do to get better results? (Configuration options and/or a
> > hardware upgrade?)
> > What further information can I give you to help diagnose this?
>
> 1) What PostgreSQL version are you using?

v7.4.3

> 2) What's your VACUUM, ANALYZE, VACUUM FULL, REINDEX schedule?

VACUUM FULL VERBOSE ANALYZE;

Every day, after the calculation I was talking about ...

> 3) Can you list the non-default settings in your postgresql.conf?
> Particularly shared_buffers, sort_mem, checkpoint_segments,
> effective_cache_size, and max_fsm_pages?

shared_buffers = 48828
sort_mem =   512000
vacuum_mem = 409600
max_fsm_pages = 50000000
max_fsm_relations = 2000
max_files_per_process = 2000
wal_buffers = 1000
checkpoint_segments = 3
effective_cache_size = 5000000
random_page_cost = 3
default_statistics_target = 20
join_collapse_limit = 10

Regards,
--
Hervé Piedvache


Re: Inserts are going slower ...

From: "gnari"
From: "Hervé Piedvache" <footcow@noos.fr>
Sent: Tuesday, July 13, 2004 11:42 PM


> effective_cache_size = 5000000

That looks like madness to me. My (modest) understanding is that you are
telling Postgres to assume a 40 GB disk cache (correct me if I am wrong).
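The 40 GB figure checks out if we assume PostgreSQL's default 8 kB page size (this arithmetic is an editorial sketch, not from the thread):

```python
# effective_cache_size is measured in disk pages (8 kB each by default).
pages = 5_000_000
page_bytes = 8 * 1024   # default PostgreSQL block size (BLCKSZ)

total = pages * page_bytes
print(f"{total / 10**9:.1f} GB decimal, {total / 2**30:.1f} GiB")
```

That is about 41 GB decimal or 38 GiB binary, which matches both the "40 Gb" estimate here and the "38GB" figure quoted later in the thread.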

BTW, how much effect does this setting have on the planner?

Is there a recommended procedure to estimate the best value for
effective_cache_size on a dedicated DB server?

gnari

Re: Inserts are going slower ...

From: Shridhar Daithankar
gnari wrote:
> is there a recommended procedure to estimate
> the best value for effective_cache_size on a
> dedicated DB server ?

Rule of thumb (on Linux): on a typically loaded machine, observe the cache
memory of the machine and allocate a good chunk of it as the effective cache.

To decide what a good chunk is, you need to consider how many other things are
running on that machine. If it is a file server + web server + database server,
you have to divide the resources according to the requirements.

But remember, this does not guarantee a good value. It is just a starting
point. :-) You have to tune it further if required.
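The rule of thumb above could be mechanized roughly like this (an illustrative sketch; the `cached` figure would come from the Linux `free` command, and the 90% safety factor is an assumption, not from the thread):

```python
def effective_cache_pages(cached_kb: int, page_kb: int = 8,
                          safety: float = 0.9) -> int:
    """Turn the 'cached' column of `free` (in kB) into an
    effective_cache_size value, counted in 8 kB disk pages."""
    return int(cached_kb * safety // page_kb)

# Example: a machine reporting about 7 GB of kernel page cache.
print(effective_cache_pages(7_165_704))
```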

HTH

  Shridhar

Re: Inserts are going slower ...

From: Hervé Piedvache
On Wednesday, July 14, 2004 at 12:13, Shridhar Daithankar wrote:
> gnari wrote:
> > is there a recommended procedure to estimate
> > the best value for effective_cache_size on a
> > dedicated DB server ?
>
> Rule of thumb (on Linux): on a typically loaded machine, observe the cache
> memory of the machine and allocate a good chunk of it as the effective
> cache.
>
> To decide what a good chunk is, you need to consider how many other things
> are running on that machine. If it is a file server + web server + database
> server, you have to divide the resources according to the requirements.
>
> But remember, this does not guarantee a good value. It is just a starting
> point. :-) You have to tune it further if required.

In my case it's a dedicated PostgreSQL server ...

effective_cache_size = 5000000

As I understand it, I'm telling the planner that the kernel is able to cache
5,000,000 disk pages in RAM

>free
             total       used       free     shared    buffers     cached
Mem:       7959120    7712164     246956          0      17372    7165704
-/+ buffers/cache:     529088    7430032
Swap:      2097136       9880    2087256

What should I put ?

Regards,
--
Hervé Piedvache


Re: Inserts are going slower ...

From: Shridhar Daithankar
Hervé Piedvache wrote:
> In my case it's a PostgreSQL dedicated server ...
>
> effective_cache_size = 5000000
>
> As I understand it, I'm telling the planner that the kernel is able to
> cache 5,000,000 disk pages in RAM

That is what? 38GB of RAM?
>
>
>>free
>
>              total       used       free     shared    buffers     cached
> Mem:       7959120    7712164     246956          0      17372    7165704
> -/+ buffers/cache:     529088    7430032
> Swap:      2097136       9880    2087256
>
> What should I put ?

7165704 / 8 = 895713

So allowing for variation, I would say 875000. That is an 8GB box, right? So
875000 pages is about 7000MB, which should be rather practical. Of course you
can give it everything you can, but that's up to you.

Can you get EXPLAIN ANALYZE output for the inserts? I think some foreign key
checks etc. are taking long and hence it accumulates. But that is just a wild
guess.

Off the top of my head, you have allocated roughly 48K shared buffers, which
seems a bit on the higher side. Can you check with something like 10K-15K?
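For reference, shared_buffers is also counted in 8 kB pages, so the figures under discussion translate to real memory like this (a quick editorial sketch):

```python
def buffers_mb(shared_buffers: int, page_kb: int = 8) -> float:
    """shared_buffers is a count of 8 kB buffer pages."""
    return shared_buffers * page_kb / 1024

print(f"current 48828 buffers = {buffers_mb(48_828):.0f} MB")
print(f"suggested 10K-15K     = {buffers_mb(10_000):.0f}-{buffers_mb(15_000):.0f} MB")
```

So the current setting is roughly 381 MB of shared buffers, and the suggestion is to try something in the 78-117 MB range.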

HTH

  Shridhar

Re: Inserts are going slower ...

From: Josh Berkus
Herve'

I forgot to ask about your hardware.   How much RAM, and what's your disk
setup?  CPU?

> sort_mem =   512000

Huh?   sort_mem is in kB.   The above says that you've allocated 512MB of
sort memory.  Is this process the *only* thing going on on the machine?

> vacuum_mem = 409600

Again, 409.6MB of vacuum memory?   That's an odd number, and quite high.

> max_fsm_pages = 50000000

50 million?   That's quite high.   Certainly enough to have an effect on your
memory usage.   How did you calculate this number?

> checkpoint_segments = 3

You should probably increase this if you have the disk space.  For massive
insert operations, I've found it useful to have as many as 128 segments
(although this means about 1.5GB of disk space)
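The disk cost of checkpoint_segments can be bounded; the 7.4-era documentation describes a worst case of roughly 2n+1 WAL segment files of 16 MB each (treat this sketch as an estimate; steady-state usage is usually lower, which is presumably where the 1.5GB figure above comes from):

```python
def wal_disk_mb(checkpoint_segments: int, segment_mb: int = 16) -> int:
    """Worst-case pg_xlog size: up to 2*n + 1 WAL segment files."""
    return (2 * checkpoint_segments + 1) * segment_mb

for n in (3, 16, 128):
    print(f"checkpoint_segments = {n:3d} -> up to {wal_disk_mb(n)} MB")
```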

> effective_cache_size = 5000000

If you actually have that much RAM, I'd love to play on your box.  Please?

> Off the top of my head, you have allocated roughly 48K shared buffers,
> which seems a bit on the higher side. Can you check with something like
> 10K-15K?

Shridhar, that depends on how much RAM he has.   On 4GB dedicated machines,
I've set shared_buffers as high as 750MB.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Inserts are going slower ...

From: Hervé Piedvache
Josh,

On Wednesday, July 14, 2004 at 18:28, Josh Berkus wrote:
>
> I forgot to ask about your hardware.   How much RAM, and what's your disk
> setup?  CPU?

8 GB of RAM
Dual Intel Xeon 2.00GHz
Hard drives in SCSI RAID 5
/dev/sdb6             101G   87G  8.7G  91% /usr/local/pgsql/data
/dev/sda7             1.8G  129M  1.6G   8% /usr/local/pgsql/data/pg_xlog

Server dedicated to PostgreSQL with only one database.

> > sort_mem =   512000
>
> Huh?   sort_mem is in kB.   The above says that you've allocated 512MB of
> sort memory.  Is this process the *only* thing going on on the machine?

Yes, it's a dedicated PostgreSQL server ... so is that too much?
How do you decide on a good value?

> > vacuum_mem = 409600
>
> Again, 409.6MB vacuum mem?   That's an odd number, and quite high.

Yep, but I have 8 GB of memory ... ;o) So why not?
Just explain to me why it's not a good choice ... I did this because of this
text from you, found somewhere:
"As this setting only uses RAM when VACUUM is running, you may wish to
increase it on high-RAM machines to make VACUUM run faster (but never more
than 20% of available RAM!)"
So that's less than 20% of my memory ...
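The 20% claim does check out; vacuum_mem is in kB, so (an editorial arithmetic sketch):

```python
vacuum_mem_kb = 409_600
ram_mb = 8 * 1024          # the 8 GB box under discussion

vacuum_mb = vacuum_mem_kb / 1024
limit_mb = ram_mb * 0.20   # the "never more than 20% of RAM" guideline
print(f"{vacuum_mb:.0f} MB used vs {limit_mb:.0f} MB limit")
```

400 MB is indeed well under the 1.6 GB that 20% of 8 GB would allow; the objection is about whether it is useful, not whether it fits.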

> > max_fsm_pages = 50000000
>
> 50 million?   That's quite high.   Certainly enough to have an effect on
> your memory usage.   How did you calculate this number?

That was not set by me ... and the guy who did is away ... but at the same
time, with 8 GB of RAM ... is that really a crazy number?

> > checkpoint_segments = 3
>
> You should probably increase this if you have the disk space.  For massive
> insert operations, I've found it useful to have as much as 128 segments
> (although this means about 1.5GB disk space)
>
> > effective_cache_size = 5000000
>
> If you actually have that much RAM, I'd love to play on your box.  Please?

Hmm ... yes, as Shridhar told me, the number was a crazy one, and it's now
down to 875000 ...

> > Off the top of my head, you have allocated roughly 48K shared buffers,
> > which seems a bit on the higher side. Can you check with something like
> > 10K-15K?
>
> Shridhar, that depends on how much RAM he has.   On 4GB dedicated machines,
> I've set Shared_Buffers as high as 750MB.

Could you explain to me the point of reducing this size?
I really don't understand this point ...

regards,
--
Bill Footcow


Re: Inserts are going slower ...

From: Hervé Piedvache
Josh,

On Wednesday, July 14, 2004 at 18:28, Josh Berkus wrote:
>
> > checkpoint_segments = 3
>
> You should probably increase this if you have the disk space.  For massive
> insert operations, I've found it useful to have as much as 128 segments
> (although this means about 1.5GB disk space)

Another point: I have also read this:
"NOTE: Since 7.2, turning fsync off does NOT stop WAL. It does stop
checkpointing."

So ... is that still true for 7.4.3? And since I run with fsync = off, does
the value of checkpoint_segments have no effect?

Thanks for your help...
--
Bill Footcow


Re: Inserts are going slower ...

From: Shridhar Daithankar
Hervé Piedvache wrote:

> Josh,
>
> On Wednesday, July 14, 2004 at 18:28, Josh Berkus wrote:
>
>>>checkpoint_segments = 3
>>
>>You should probably increase this if you have the disk space.  For massive
>>insert operations, I've found it useful to have as much as 128 segments
>>(although this means about 1.5GB disk space)
>
>
> Another point: I have also read this:
> "NOTE: Since 7.2, turning fsync off does NOT stop WAL. It does stop
> checkpointing."
>
> So ... is that still true for 7.4.3? And since I run with fsync = off, does
> the value of checkpoint_segments have no effect?
>
> Thanks for your help...

I suggest you check this first. See the performance tuning guide:

http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

That is a starting point. As Josh suggested, increase checkpoint_segments if
you have the disk space. Correspondingly, WAL disk space requirements go up
as well.

HTH

  Shridhar


Re: Inserts are going slower ...

From: Josh Berkus
Shridhar,

> I suggest you check this first. See the performance tuning guide:
>
> http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
>
> That is a starting point. As Josh suggested, increase checkpoint_segments
> if you have the disk space. Correspondingly, WAL disk space requirements
> go up as well.

Well, not if he has fsync=off.   But having fsync off is a very bad idea.  You
do realize, Herve', that if you lose power on that machine you'll most likely
have to restore from backup?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Inserts are going slower ...

From: Hervé Piedvache
Josh,

On Thursday, July 15, 2004 at 20:09, Josh Berkus wrote:
> > I suggest you check this first. See the performance tuning guide:
> >
> > http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
> >
> > That is a starting point. As Josh suggested, increase checkpoint_segments
> > if you have the disk space. Correspondingly, WAL disk space requirements
> > go up as well.
>
> Well, not if he has fsync=off.   But having fsync off is a very bad idea.
> You do realize, Herve', that if you lose power on that machine you'll most
> likely have to restore from backup?

Hmm ... it was only for the speed aspect ... I have been using PostgreSQL
with this option since 7.0.1 ... and for me fsync=on was so slow ...
Is it really not costly for the system to turn it ON now with v7.4.3?

Tell me ...
--
Hervé Piedvache


Re: Inserts are going slower ...

From: Josh Berkus
Herve'

> Hmm ... it was only for the speed aspect ... I have been using PostgreSQL
> with this option since 7.0.1 ... and for me fsync=on was so slow ...
> Is it really not costly for the system to turn it ON now with v7.4.3?

Well, I wouldn't do it until you've figured out the current performance
problem.

The issue with having fsync=off is that, if someone yanks the power cord on
your server, there is a significant chance that you will have to restore the
database from backup because it will be corrupted.   But clearly you've been
living with that risk for some time.

It *is* true that there is significantly less performance difference between
7.4 with fsync off and on than there was between 7.1 with fsync off and on.
But there is still a difference.   In 7.0 and 7.1 (I think), turning fsync
off turned WAL off completely, resulting in a substantial difference in disk
activity.   Now, it just stops checkpointing; WAL is still being written,
meaning that disk activity decreases some but not a lot.   The difference is
more noticeable the more vulnerable your disk system is to contention.

The other reason not to think of fsync=off as a permanent performance tweak is
that we're likely to remove the option sometime in the next 2 versions, since
an increasing number of features depend on WAL behavior, and the option is
largely a legacy of the 7.0 days, when WAL was sometimes buggy and needed to
be turned off to get the database to start.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Inserts are going slower ...

From: Gaetano Mendola
Josh Berkus wrote:

> Herve'
>
> I forgot to ask about your hardware.   How much RAM, and what's your disk
> setup?  CPU?
>
>
>>sort_mem =   512000
>
>
> Huh?   sort_mem is in kB.   The above says that you've allocated 512MB of
> sort memory.  Is this process the *only* thing going on on the machine?

And also, it is not system-wide but, let's say, per backend ...



Regards
Gaetano Mendola

Re: Inserts are going slower ...

From: Gaetano Mendola
Hervé Piedvache wrote:

> Josh,
>
> On Tuesday, July 13, 2004 at 19:10, Josh Berkus wrote:
>
>>>What can I do to get better results? (Configuration options and/or a
>>>hardware upgrade?)
>>>What further information can I give you to help diagnose this?
>>
>>1) What PostgreSQL version are you using?
>
>
> v7.4.3
>
>
>>2) What's your VACUUM, ANALYZE, VACUUM FULL, REINDEX schedule?
>
>
> VACUUM FULL VERBOSE ANALYZE;
>
> Every day after the calculation I was talking about ...
>
>
>>3) Can you list the non-default settings in your postgresql.conf?
>>Particularly shared_buffers, sort_mem, checkpoint_segments,
>>effective_cache_size, and max_fsm_pages?
>

> sort_mem =   512000

This is too much: you are instructing Postgres to use 512MB
for each backend (and sometimes a backend can use this quantity
more than once).
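This warning can be quantified with a hypothetical load (the connection and sort counts below are assumptions for illustration only):

```python
sort_mem_kb = 512_000   # the setting under discussion, in kB

def worst_case_gb(backends: int, sorts_each: int = 1) -> float:
    """Each backend may allocate up to sort_mem per concurrent sort."""
    return backends * sorts_each * sort_mem_kb / 1024 / 1024

# e.g. 20 concurrent backends, each running a query with 2 sorts:
print(f"{worst_case_gb(20, 2):.1f} GB")
```

That would exceed the 8 GB of physical RAM on this box several times over, which is why a per-session SET is the safer way to use a value this large.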

> vacuum_mem = 409600
> max_fsm_pages = 50000000
> max_fsm_relations = 2000

50 million? Ugh.
What does Postgres tell you in the log after performing
a VACUUM FULL?

> max_files_per_process = 2000
> wal_buffers = 1000
> checkpoint_segments = 3

For massive inserts you have to increase this number;
pump it up to 16.


> effective_cache_size = 5000000

5GB on an 8GB system is too much.

> random_page_cost = 3

on your hardware you can decrease it to 2,
and also decrease the other CPU cost settings

Regards
Gaetano Mendola


BTW, I live in Paris too, if you need a hand...

Re: Inserts are going slower ...

From: "Scott Marlowe"
On Mon, 2004-07-26 at 08:20, Gaetano Mendola wrote:
> Hervé Piedvache wrote:

SNIP

> > sort_mem =   512000
>
> This is too much: you are instructing Postgres to use 512MB
> for each backend (and sometimes a backend can use this quantity
> more than once)

Agreed.  If any one process needs this much sort memory, you can set it in
that session with SET sort_mem anyway, so letting every sort consume up to
512 MB is asking for trouble.

> > effective_cache_size = 5000000
>
> 5GB for 8 GB system is too much

No, it's not.  Assuming that PostgreSQL with all its shared buffers is
using <2 GB, it's quite likely that the kernel is caching at least 5
GB of disk data.  effective_cache_size doesn't set any cache size; it
tells the planner roughly how much the kernel is caching.

> > random_page_cost = 3
>
> on your HW you can decrease it to 2
> and also decrease the other cpu costs

On fast machines it often winds up needing to be set somewhere around
1.2 to 2.0.