Thread: Help request: how to tune performance?
Hi, We just upgraded Best Practical's RT from 3.6 to 3.81 and gave the database completely own machine. And the users still complain that it is dog slow. :-( I installed pg_top and it seems that at the beginning of the ticket display RT-issues a query that eats everything the database has. Query is as follows: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_3 ON ( CachedGroupMembers_3.MemberId = Principals_1.id ) WHERE (Principals_1.Disabled = '0') AND (ACL_2.PrincipalId = CachedGroupMembers_3.GroupId) AND (Principals_1.id != '1') AND (ACL_2.PrincipalType = 'Group') AND (Principals_1.PrincipalType = 'User') AND (ACL_2.RightName = 'OwnTicket') AND ((ACL_2.ObjectType = 'RT::Queue' AND ACL_2.ObjectId = 18) OR (ACL_2.ObjectType = 'RT::System')) ORDER BY main.Name ASC and explain tells me that: -> Nested Loop (cost=16.17..9953.51 rows=1 width=4318) Join Filter: ("outer".principalid = "inner".groupid) -> Index Scan using acl1 on acl acl_2 (cost=0.00..49.38 rows=1 width=4) Index Cond: ((principaltype)::text = 'Group'::text) Filter: ((((rightname)::text = 'OwnTicket'::text) OR ((rightname)::text = 'SuperUser'::text)) AND (((objecttype)::text = 'RT::Queue'::text) OR ((objecttype)::text = 'RT::System'::text))) -> Nested Loop (cost=16.17..9903.98 rows=12 width=4322) -> Nested Loop (cost=0.00..6343.98 rows=1 width=4322) -> Seq Scan on principals principals_1 (cost=0.00..6308.00 rows=6 width=4) Filter: ((disabled = 0) AND (id <> 1) AND ((principaltype)::text = 'User'::text)) -> Index Scan using users_pkey on users main (cost=0.00..5.98 rows=1 width=4318) Index Cond: ("outer".id = main.id) -> Bitmap Heap Scan on cachedgroupmembers cachedgroupmembers_3 (cost=16.17..3527.23 rows=2621 width=8) Recheck Cond: (cachedgroupmembers_3.memberid = "outer".id) -> Bitmap Index Scan on cachedgroupmembers2 (cost=0.00..16.17 rows=2621 width=0) Index Cond: (cachedgroupmembers_3.memberid = "outer".id) Is there something I can do to improve performance with tuning something on postgresql.conf? Or adding/dropping indexes? What I read from that query plan is that the single most expensive thing is sequential scan on Principals. Principals already has indexes for both id and object.id! Database version: Name : postgresql-server Relocations: (not relocatable) Version : 8.1.11 Vendor: CentOS Release : 1.el5_1.1 Build Date: Sat 12 Jan 2008 04:45:09 PM EET pg_top: last pid: 7201; load avg: 0.62, 0.90, 0.62; up 0+19:17:00 11:28:10 13 processes: 1 running, 12 sleeping CPU states: 8.2% user, 0.0% nice, 42.4% system, 49.5% idle, 0.0% iowait Memory: 1083M used, 2722M free, 234M buffers, 759M cached Swap: 1024M free PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND 6015 postgres 17 0 22M 12M run 0:32 13.28% 99.82% postgres: rt rt 6018 postgres 15 0 22M 12M sleep 3:25 0.01% 0.00% postgres: rt rt 6035 postgres 15 0 22M 12M sleep 2:11 1.31% 0.00% postgres: rt rt 6037 postgres 15 0 22M 12M sleep 1:33 0.01% 0.00% postgres: rt rt
On Tue, Sep 16, 2008 at 2:31 AM, Mauri Sahlberg <Mauri.Sahlberg@claymountain.com> wrote: > Hi, > > We just upgraded Best Practical's RT from 3.6 to 3.81 and gave the > database completely own machine. And the users still complain that it is > dog slow. Moved up from below: > Version : 8.1.11 Vendor: CentOS So, you built it its own machine, but you didn't upgrade to at least 8.2? Last place I worked we ran rt 3.6.1 and got a noticeable performance boost from switching to 8.2 but the only thing that was ever really slow was viewing the rather large approval queue. > :-( I installed pg_top and it seems that at the beginning of > the ticket display RT-issues a query that eats everything the database > has. Query is as follows: > > SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN > Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN > CachedGroupMembers CachedGroupMembers_3 ON > ( CachedGroupMembers_3.MemberId = Principals_1.id ) WHERE > (Principals_1.Disabled = '0') AND (ACL_2.PrincipalId = > CachedGroupMembers_3.GroupId) AND (Principals_1.id != '1') AND > (ACL_2.PrincipalType = 'Group') AND (Principals_1.PrincipalType = > 'User') AND (ACL_2.RightName = 'OwnTicket') AND ((ACL_2.ObjectType = > 'RT::Queue' AND ACL_2.ObjectId = 18) OR (ACL_2.ObjectType = > 'RT::System')) ORDER BY main.Name ASC Please post the output of explain analyze as an attachment. explain is only half the answer. > Is there something I can do to improve performance with tuning something > on postgresql.conf? Or adding/dropping indexes? What I read from that > query plan is that the single most expensive thing is sequential scan on > Principals. Principals already has indexes for both id and object.id! Possibly. explain analyze will help you identify where stats are wrong. sometimes just cranking the stats target on a few columns and re-analyzing gets you a noticeable performance boost. It's cheap and easy. When the estimated and actual number of rows are fairly close, then look for the slowest thing and see if an index can help. What have to already done to tune the install? shared_buffers, work_mem, random_page_cost, effective_cache_size. Is your db bloating during the day? Why no try 8.3 on this? Are you running on a single SATA hard drive? How big's the database directory? I'm guessing from your top output that the db is about 500 meg or so. it should all fit in memory.
On Tue, Sep 16, 2008 at 11:10:01AM -0600, Scott Marlowe wrote: > On Tue, Sep 16, 2008 at 2:31 AM, Mauri Sahlberg > <Mauri.Sahlberg@claymountain.com> wrote: > > Hi, > > > > We just upgraded Best Practical's RT from 3.6 to 3.81 and gave the > > database completely own machine. And the users still complain that it is > > dog slow. > > Moved up from below: > > > Version : 8.1.11 Vendor: CentOS > > So, you built it its own machine, but you didn't upgrade to at least 8.2? > > Last place I worked we ran rt 3.6.1 and got a noticeable performance > boost from switching to 8.2 but the only thing that was ever really > slow was viewing the rather large approval queue. > > > :-( I installed pg_top and it seems that at the beginning of > > the ticket display RT-issues a query that eats everything the database > > has. Query is as follows: > > > > SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN > > Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN > > CachedGroupMembers CachedGroupMembers_3 ON > > ( CachedGroupMembers_3.MemberId = Principals_1.id ) WHERE > > (Principals_1.Disabled = '0') AND (ACL_2.PrincipalId = > > CachedGroupMembers_3.GroupId) AND (Principals_1.id != '1') AND > > (ACL_2.PrincipalType = 'Group') AND (Principals_1.PrincipalType = > > 'User') AND (ACL_2.RightName = 'OwnTicket') AND ((ACL_2.ObjectType = > > 'RT::Queue' AND ACL_2.ObjectId = 18) OR (ACL_2.ObjectType = > > 'RT::System')) ORDER BY main.Name ASC > What is your version of DBIx-SearchBuilder? In particular, the DISTINCT definition was changed from: $$statementref = "SELECT DISTINCT main.* FROM $$statementref"; to: $$statementref = "SELECT main.* FROM ( SELECT DISTINCT main.id FROM $$statementref ) distinctquery, $table main WHERE (main.id= distinctquery.id) "; You can change the line in your version of SearchBuilder and restart RT with a cache flush to get approximately the same benefit. Ken
Hi, Thanks for the reply and advice. Scott Marlowe kirjoitti: >> Version : 8.1.11 Vendor: CentOS >> > > So, you built it its own machine, but you didn't upgrade to at least 8.2? > > Now it is: 8.4devel_15092008 The machine was installed by the production team from the standard CentOS template. I tried to adhere to the standard and installed the standard CentOS binary for Postgresql. I am not part of production team so I try to be extra careful with the "rule book". > > Please post the output of explain analyze as an attachment. explain > is only half the answer. > > I did what Kenneth Marshall suggested and edited DBIx::Searchbuilder's Handle/Pg.pm. I will post the explain analyze for the new query it now generates if it becomes necessary. > Possibly. explain analyze will help you identify where stats are > wrong. sometimes just cranking the stats target on a few columns and > re-analyzing gets you a noticeable performance boost. It's cheap and > easy. > > When the estimated and actual number of rows are fairly close, then > look for the slowest thing and see if an index can help. > > What have to already done to tune the install? shared_buffers, > work_mem, random_page_cost, effective_cache_size. Is your db bloating > during the day? > > When I upgraded to 8.4 I also checked newer Postgresql manual for the memory consumption and found comment by Steven Citron-Pousty and increased accordingly: - shared_buffers to 320MB - wal_buffers to 8MB - effective_cache_size to 2048MB - maintenance_work_mem to 384MB Sorry, I do not understand what you mean by bloating. The db size is: rt=# select pg_size_pretty(pg_database_size('rt')); pg_size_pretty ---------------- 350 MB (1 row) > Are you running on a single SATA hard drive? How big's the database > directory? I'm guessing from your top output that the db is about 500 > meg or so. it should all fit in memory. > > -bash-3.2$ du --si -s data 524M data I don't know what kind of drives there actually are. The machine is vmware virtual with two virtual CPU's clocking 2,33GHz, 4 GB ram, 1 GB swap. The disk is probably given from either MSA or from EVA. The disk shows up as one virtual drive and everything is on it. Filesystem is ext3 on lvm. Database data is on /var which is it's own volume. I have also added 5 more mason processes to the web frontend machine. For me the results look promising. Opening search builder went from 42 seconds to 4 seconds and opening one particular long chain takes now only 27 seconds. But again I am not from the support team either so I do not get to define what is fast enough. The verdict is now in for the jury to decide. Thank you.
Hi, The only other thing to check is what indexes are defined for your schema. You can look at a previous post about PostgreSQL indexing for RT to see what we are using here at Rice. Let me know if you have any questions. Cheers, Ken On Thu, Sep 18, 2008 at 09:00:14PM +0300, Mauri Sahlberg wrote: > Hi, > > Thanks for the reply and advice. > > Scott Marlowe kirjoitti: >>> Version : 8.1.11 Vendor: CentOS >>> >> >> So, you built it its own machine, but you didn't upgrade to at least 8.2? >> >> > Now it is: 8.4devel_15092008 > > The machine was installed by the production team from the standard CentOS > template. I tried to adhere to the standard and installed the standard > CentOS binary for Postgresql. I am not part of production team so I try to > be extra careful with the "rule book". >> >> Please post the output of explain analyze as an attachment. explain >> is only half the answer. >> >> > I did what Kenneth Marshall suggested and edited DBIx::Searchbuilder's > Handle/Pg.pm. I will post the explain analyze for the new query it now > generates if it becomes necessary. >> Possibly. explain analyze will help you identify where stats are >> wrong. sometimes just cranking the stats target on a few columns and >> re-analyzing gets you a noticeable performance boost. It's cheap and >> easy. >> >> When the estimated and actual number of rows are fairly close, then >> look for the slowest thing and see if an index can help. >> >> What have to already done to tune the install? shared_buffers, >> work_mem, random_page_cost, effective_cache_size. Is your db bloating >> during the day? >> >> > When I upgraded to 8.4 I also checked newer Postgresql manual for the > memory consumption and found comment by Steven Citron-Pousty and increased > accordingly: > - shared_buffers to 320MB > - wal_buffers to 8MB > - effective_cache_size to 2048MB > - maintenance_work_mem to 384MB > > Sorry, I do not understand what you mean by bloating. The db size is: > rt=# select pg_size_pretty(pg_database_size('rt')); > pg_size_pretty > ---------------- > 350 MB > (1 row) > >> Are you running on a single SATA hard drive? How big's the database >> directory? I'm guessing from your top output that the db is about 500 >> meg or so. it should all fit in memory. >> >> > -bash-3.2$ du --si -s data > 524M data > > I don't know what kind of drives there actually are. The machine is vmware > virtual with two virtual CPU's clocking 2,33GHz, 4 GB ram, 1 GB swap. The > disk is probably given from either MSA or from EVA. The disk shows up as > one virtual drive and everything is on it. Filesystem is ext3 on lvm. > Database data is on /var which is it's own volume. > > I have also added 5 more mason processes to the web frontend machine. > > For me the results look promising. Opening search builder went from 42 > seconds to 4 seconds and opening one particular long chain takes now only > 27 seconds. But again I am not from the support team either so I do not get > to define what is fast enough. The verdict is now in for the jury to > decide. > > Thank you. > > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin >
On Thu, Sep 18, 2008 at 12:00 PM, Mauri Sahlberg <mauri.sahlberg@claymountain.com> wrote: >> So, you built it its own machine, but you didn't upgrade to at least 8.2? >> >> > > Now it is: 8.4devel_15092008 I don't think I'd be running production data on a dev version of the db. Not that it's likely to crash and eat all your data, which is a distinct possibility, but that you might have to dump and reload a couple of times before you get to 8.4 production. Plus if there's a weird performance corner case you might get to be the lucky one to report it. 8.3.3 is quite stable and quite a bit faster than 8.1. I haven't had a chance to even test 8.4 yet, but I'm sure it's got its own performance enhancements as well. > The machine was installed by the production team from the standard CentOS > template. I tried to adhere to the standard and installed the standard > CentOS binary for Postgresql. I am not part of production team so I try to > be extra careful with the "rule book". Understood... I prefer to install the PGDG rpms on centos / redhat, as it lets me choose the version I want instead of using the old version that rh/centos supports for that version. They're easy to install and uninstall. So, how's the performance of 8.4 now compared to 8.1? > When I upgraded to 8.4 I also checked newer Postgresql manual for the memory > consumption and found comment by Steven Citron-Pousty and increased > accordingly: > - shared_buffers to 320MB > - wal_buffers to 8MB > - effective_cache_size to 2048MB > - maintenance_work_mem to 384MB Seems reasonable. What's work_mem set to? I'd suggest something in the 4 to 8 meg range for starters, unless you're trying to handle hundreds and hundreds of connections. > Sorry, I do not understand what you mean by bloating. Every time pgsql updates or deletes a row it leaves a dead row in its place. Enough of these without vacuuming up the dead tuples and you wind up with a table with 90% dead space etc... Bad for performance. > The db size is: > rt=# select pg_size_pretty(pg_database_size('rt')); > pg_size_pretty > ---------------- > 350 MB > (1 row) Cool, between OS kernel cache and pgsql's shared_buffers it should all be in memory after a bit. >> Are you running on a single SATA hard drive? How big's the database >> directory? I'm guessing from your top output that the db is about 500 >> meg or so. it should all fit in memory. >> >> > > -bash-3.2$ du --si -s data > 524M data > > I don't know what kind of drives there actually are. The machine is vmware > virtual with two virtual CPU's clocking 2,33GHz, 4 GB ram, 1 GB swap. The > disk is probably given from either MSA or from EVA. The disk shows up as one > virtual drive and everything is on it. Filesystem is ext3 on lvm. Database > data is on /var which is it's own volume. The one thing that should NEVER run on a VM or on an LVM vol is a database. This is because most VMs and LVM for sure, do NOT provide proper write barriers, which means a crash could cost you your database being corrupted beyond repair. also, VMs tend to slow down heavily switched apps like databases and LVM has a maximum throughput in the 300Meg/sec range. Not a big deal for a couple of mirrored disks, but a big deal if you're running a 32 disk RAID-10 array under it. > For me the results look promising. Opening search builder went from 42 > seconds to 4 seconds and opening one particular long chain takes now only 27 > seconds. But again I am not from the support team either so I do not get to > define what is fast enough. The verdict is now in for the jury to decide. hehe. I know how that works. best of luck. I'd push or a dedicated db server. They can't give you a pick up truck and be upset it's not a dragster later on.