Re: extremly low memory usage - Mailing list pgsql-performance
From | Jeremiah Jahn |
---|---|
Subject | Re: extremly low memory usage |
Date | |
Msg-id | 1124482288.27881.182.camel@bluejay.goodinassociates.com Whole thread Raw |
In response to | Re: extremly low memory usage (John A Meinel <john@arbash-meinel.com>) |
Responses |
Re: extremly low memory usage
|
List | pgsql-performance |
On Fri, 2005-08-19 at 14:23 -0500, John A Meinel wrote: > Ron wrote: > > At 01:18 PM 8/19/2005, John A Meinel wrote: > > > >> Jeremiah Jahn wrote: > >> > Sorry about the formatting. > >> > > >> > On Thu, 2005-08-18 at 12:55 -0500, John Arbash Meinel wrote: > >> > > >> >>Jeremiah Jahn wrote: > >> >> > >> >> > >> > >> ... > >> > >> >>The expensive parts are the 4915 lookups into the litigant_details > >> (each > >> >>one takes approx 4ms for a total of ~20s). > >> >>And then you do it again on case_data (average 3ms each * 4906 loops = > >> >>~15s). > >> > > >> > Is there some way to avoid this? > >> > > >> > >> Well, in general, 3ms for a single lookup seems really long. Maybe your > >> index is bloated by not vacuuming often enough. Do you tend to get a lot > >> of updates to litigant_details? > > > > > > Given that the average access time for a 15Krpm HD is in the 5.5-6ms > > range (7.5-8ms for a 10Krpm HD), having an average of 3ms for a single > > lookup implies that ~1/2 (the 15Krpm case) or ~1/3 (the 10Krpm case) > > table accesses is requiring a seek. > > I think LVM may be a problem, since it also seems to break things up on the file system. My access time on the seek should be around 1/7th the 15Krpm I believe since it's a 14 disk raid 10 array. And no other traffic at the moment. > > > Well, from what he has said, the total indexes are < 1GB and he has 6GB > of ram. So everything should fit. Not to mention he is only accessing > 5000/several million rows. I table spaced some of the indexes and they are around 211066880 bytes for the name_speed index and 149825330 for the lit_actor_speed index tables seem to be about a gig. > > > > This implies a poor match between physical layout and access pattern. > > This seems to be the case. But since this is not the only query, it may > be that other access patterns are more important to optimize for. > > > > > If I understand correctly, the table should not be very fragmented given > > that this is a reasonably freshly loaded DB? That implies that the > > fields being looked up are not well sorted in the table compared to the > > query pattern. > > > > If the entire table could fit in RAM, this would be far less of a > > consideration. Failing that, the physical HD layout has to be improved > > or the query pattern has to be changed to reduce seeks. > > > > > > ... > > >> After CLUSTER, the current data will stay clustered, but new data will > >> not, so you have to continually CLUSTER, the same way that you might > >> VACUUM. *However*, IIRC CLUSTER grabs an Exclusive lock, so it is as > >> expensive as a VACUUM FULL. Be aware of this, but it might vastly > >> improve your performance, so it would be worth it. > > > > > > CLUSTER can be a very large maintenance overhead/problem if the table(s) > > in question actually need to be "continually" re CLUSTER ed. > > > > If there is no better solution available, then you do what you have to, > > but it feels like there should be a better answer here. > > > > Perhaps the DB schema needs examining to see if it matches up well with > > its real usage? > > > > Ron Peacetree > > > > I certainly agree that CLUSTER is expensive, and is an on-going > maintenance issue. If it is the normal access pattern, though, it may be > worth it. The query I've sent you is one of the most common I get just change the name. I handle about 180K of them a day mostly between 8 and 5. The clustering has never really been a problem. Like I said before I do it about once a week. I handle about 3000 update an hour consisting of about 1000-3000 statement per update. ie about 2.5 million updates per hour. In the last few months or so I've filtered these down to about 400K update/delete/insert statements per hour. > > I also wonder, though, if his table is properly normalized. Which, as > you mentioned, might lead to improved access patterns. The system is about as normalized as I can get it. In general the layout is the following: courts have cases, cases have litigant_details. Actors have identities and litigant_details. > > John > =:-> -- Speak softly and carry a +6 two-handed sword.
pgsql-performance by date: