Re: Memory usage - indexes - Mailing list pgsql-performance

From Tobias Brox
Subject Re: Memory usage - indexes
Date
Msg-id AANLkTikgHeENV_OKofSBR8Xm0PMvRheYq=ejioveryKP@mail.gmail.com
Whole thread Raw
In response to Re: Memory usage - indexes  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Memory usage - indexes
List pgsql-performance
Thanks for spending your time on this ... amidst all the useful
feedback I've received, I'd rate your post as the most useful post.

>> 1) Are there any good ways to verify my hypothesis?
>
> You can confim easily whether the contents of the PostgreSQL buffer cache
> contain when you think they do by installing pg_buffercache.  My paper and
> sample samples at http://www.pgcon.org/2010/schedule/events/218.en.html go
> over that.

I've asked the sysadmin to consider installing it.  From what I
understood from other posts, the pg_statio_user_indexes and
pg_statio_user_tables would also indicate to what extent data is found
in shared buffers and not.  Monitoring it over time could possibly
help us predicting the "tipping points" before they happen.  Though
still, if most of the cacheing takes place on the OS level, one
wouldn't learn that much from studying the shared buffers usage ...

> You can also integrate that with a look at the OS level information by using
> pgfincore:  http://www.pgcon.org/2010/schedule/events/261.en.html

... ah, right ... :-)

> I've found that if shared_buffers is set to a largish size, you can find out
> enough information from look at it to have a decent idea what's going on
> without going to that depth.  But it's available if you want it.

Haven't studied it in details yet, but the information value in itself
may be a "selling point" for increasing the buffer size.

> I have easily measurable improvements on client systems increasing
> shared_buffers into the 4GB - 8GB range.  Popular indexes move into there,
> stay there, and only get written out at checkpoint time rather than all the
> time.

Ours is at 12 GB, out of 70 GB total RAM.

>  However, if you write heavily enough that much of this space gets
> dirty fast, you may not be be able to go that high before checkpoint issues
> start to make such sizes impractical.

I think we did have some issues at some point ... we do have some
applications that are very sensitive towards latency.  Though, I think
the problem was eventually solved.  I think I somehow managed to
deliver the message that it was not a good idea to store
keep-alive-messages sent every second from multiple clients into the
main production database, and that it was an equally bad idea to
disconnect the clients after a three seconds timeout :-)  Anyway,
today we have mostly issues with read access, not write access.

> Using minimal values works for some people, particularly on
> Windows,

Huh ... does it mean Windows have better OS cache handling than Linux?
 To me it sounds insane to run a database under a buggy GUI ... but I
suppose I should keep that to myself :-)

> Yes, it would run slower, because now it has to sort through blocks in a
> larger index in order to find anything.  How significant that is depends on
> the relative size of the indexes.  To give a simple example, if (a) is 1GB,
> while (a,b) is 2GB, you can expect dropping (a) to halve the speed of index
> lookups.  Fatter indexes just take longer to navigate through.

Linear relationship between the time it takes to do index lookups vs
the fatness of the index?  That's not what I guessed in the first
place ... but I suppose you're right.

> P.S. You seem busy re-inventing pgstatspack this week:
>  http://pgfoundry.org/projects/pgstatspack/ does all of this "take a
> snapshot of the stats and store it in the database for future analysis" work
> for you.  Working on that intead of continuing to hack individual
> storage/retrieve scripts for each statistics counter set would be a better
> contribution to the PostgreSQL community.

Sometimes it takes more work to implement work already done by others
than to reimplement the logics ... but anyway, I will have a look
before I make more snapshot tables ;-)

pgsql-performance by date:

Previous
From: Tobias Brox
Date:
Subject: Re: Memory usage - indexes
Next
From: Greg Smith
Date:
Subject: Re: Memory usage - indexes