Thread: Poor Query
Hi Everybody, I have a performance problem with this query , it takes lot of time on the production database. is there a way to improve it ? i do vacuumdb on this database and do anlyze on the users table separately daily select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where 1=1 AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678') AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678') order by UserID desc limit 500 QUERY PLAN ----------------------------------------------------------------------------------------------------- Limit (cost=0.00..12752.61 rows=500 width=120) -> Index Scan Backward using users_pkey on users u (cost=0.00..2460462.79 rows=96469 width=120) Filter: ((subplan) AND (subplan)) SubPlan -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.07 rows=1 width=0) Index Cond: (bankaccountid = $1) Filter: (routingnumber = '12345678'::text) -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.07 rows=1 width=0) Index Cond: (bankaccountid = $1) Filter: (accountnumber = '12345678'::text) I tried changing it but it still takes lot of time select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where 1=1 AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678' and ba.routingNumber = '12345678') order by UserID desc limit 500 QUERY PLAN ------------------------------------------------------------------------------------------------------- Limit (cost=0.00..3309.62 rows=500 width=120) -> Index Scan Backward using users_pkey on users u (cost=0.00..1277101.86 rows=192938 width=120) Filter: (subplan) SubPlan -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.07 rows=1 width=0) Index Cond: (bankaccountid = $1) Filter: ((accountnumber = '12345678'::text) AND (routingnumber = '12345678'::text)) the users_pkey index on the primary key userid is on users table. it seems to be using index but it still takes lot of time. here is the output from the pg_class for the users and bankaccount table . Table doesnt have lot of records but this query take anywhere from 3 to 5 min to run which is really bad for us. Can we improve the performance on this query ? relname | relpages | reltuples ---------+----------+----------- users | 39967 | 385875 bankaccount | 242 | 16453 Thanks! Pallav
How many rows do the following queries return : select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678' select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678' Can you post EXPLAIN ANALYZE for these two queries ? Regards.
Pierre-Frédéric Caillaud wrote: > > How many rows do the following queries return : > > select userID > from bankaccount ba > where ba.bankaccountID = u.bankaccountID > and ba.accountnumber = '12345678' > > select userID > from bankaccount ba > where ba.bankaccountID = u.bankaccountID > and ba.routingNumber = '12345678' > > Can you post EXPLAIN ANALYZE for these two queries ? > Regards. > Thanks! for the quick reply. It should usually return just one account for that user so its only one record. Actually userid column doesnt exist on bankaccount table it exists only on the user table and it is joined with bankaccountid column, if i run this query separately i wouldnt able to run it .
Just wanted to know the selectivity of the accountnumber and routingNumber columns. I shoulda written : >> How many rows do the following queries return : One or few at most, or a lot ? >> >> select userID >> from bankaccount >> WHERE accountnumber = '12345678' >> >> select userID >> from bankaccount >> WHERE routingNumber = '12345678' >> >> Can you post EXPLAIN ANALYZE for these two queries ? >> Regards. >> > Thanks! for the quick reply. It should usually return just one account > for that user so its only one record. Actually userid column doesnt > exist on bankaccount table it exists only on the user table and it is > joined with bankaccountid column, if i run this query separately i > wouldnt able to run it . > >
Pierre-Frédéric Caillaud wrote: > > > Just wanted to know the selectivity of the accountnumber and > routingNumber columns. > I shoulda written : > >>> How many rows do the following queries return : >> > One or few at most, or a lot ? Just One, user can i have only one bankaccount. > >>> >>> select userID >>> from bankaccount >>> WHERE accountnumber = '12345678' >>> >>> select userID >>> from bankaccount >>> WHERE routingNumber = '12345678' >>> >>> Can you post EXPLAIN ANALYZE for these two queries ? >>> Regards. >>> >> Thanks! for the quick reply. It should usually return just one >> account for that user so its only one record. Actually userid column >> doesnt exist on bankaccount table it exists only on the user table >> and it is joined with bankaccountid column, if i run this query >> separately i wouldnt able to run it . >> >> > > >
> Just One, user can i have only one bankaccount. Ah well, in that case : This is your query : select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where 1=1 AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678') AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678') order by UserID desc limit 500 What it does is scan all users, and for each user, test if it has the accountnumber or the routingNumber you seek. You're reversing the problem : you should first look for accountnumber and routingNumber, THEN look for the user : SELECT * FROM Users WHERE bankaccountID IN (SELECT bankaccountID FROM bankaccount WHERE accountnumber = '12345678' OR/AND routingNumber = '12345678') or : SELECT * FROM Users WHERE userID IN (SELECT userID FROM bankaccount WHERE accountnumber = '12345678' OR/AND routingNumber = '12345678') There is something very strange in your query, it seems that bankaccount and Users both have a UserID column and a bankaccountID column. Is this normal ? It looks denormalized to me...
Pierre-Frédéric Caillaud wrote: > > >> Just One, user can i have only one bankaccount. > > > Ah well, in that case : > This is your query : > > select userID, fname, lname, email, phone, dateEntered, dateCanceled, > dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as > searches > from Users u > where 1=1 AND exists (select userID > from bankaccount ba > where ba.bankaccountID = u.bankaccountID > and ba.accountnumber = '12345678') > AND exists (select userID > from bankaccount ba > where ba.bankaccountID = u.bankaccountID > and ba.routingNumber = '12345678') > order by UserID desc > limit 500 > > What it does is scan all users, and for each user, test if it has > the accountnumber or the routingNumber you seek. You're reversing the > problem : you should first look for accountnumber and routingNumber, > THEN look for the user : > > > SELECT * FROM Users WHERE bankaccountID IN > (SELECT bankaccountID FROM bankaccount WHERE accountnumber = > '12345678' OR/AND routingNumber = '12345678') > > or : > > SELECT * FROM Users WHERE userID IN > (SELECT userID FROM bankaccount WHERE accountnumber = '12345678' > OR/AND routingNumber = '12345678') > > There is something very strange in your query, it seems that > bankaccount and Users both have a UserID column and a bankaccountID > column. Is this normal ? It looks denormalized to me... > Userid column is only in users table not in bankaccounts table , based on your suggestion i made changes to the query and here are the explain plans : select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where bankaccountid in (select bankaccountid from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678') AND bankaccountid in (select bankaccountid from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678') order by UserID desc limit 500 QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..6642.59 rows=500 width=121) (actual time=40180.116..93650.837 rows=1 loops=1) -> Index Scan Backward using users_pkey on users u (cost=0.00..1087936.69 rows=81891 width=121) (actual time=40180.112..93650.829 rows=1 loops=1) Filter: ((subplan) AND (subplan)) SubPlan -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.08 rows=1 width=4) (actual time=0.019..0.019 rows=0 loops=3) Index Cond: (bankaccountid = $0) Filter: (routingnumber = '12345678'::text) -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.08 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=385914) Index Cond: (bankaccountid = $0) Filter: (accountnumber = '12345678'::text) Total runtime: 93684.307 ms select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where bankaccountid in (select bankaccountid from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678' and ba.routingNumber = '12345678') order by UserID desc limit 500 QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..1777.53 rows=500 width=121) (actual time=18479.669..63584.437 rows=1 loops=1) -> Index Scan Backward using users_pkey on users u (cost=0.00..582250.93 rows=163781 width=121) (actual time=18479.663..63584.428 rows=1 loops=1) Filter: (subplan) SubPlan -> Index Scan using bankaccount_pkey on bankaccount ba (cost=0.00..3.09 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=385914) Index Cond: (bankaccountid = $0) Filter: ((accountnumber = '12345678'::text) AND (routingnumber = '12345678'::text)) Total runtime: 63596.222 ms What's wierd is even though there is a index on bankaccountid table it doesnt use that index, it uses the index on the userid table and the execution time is little better but it still takes over a minute to execute .
Your suffering comes from the "where ba.bankaccountID = u.bankaccountID" in the subselect. It means postgres has to run the subselect once for each row in Users. You want the subselect to run only once, and return one (or more?) bankaccountid's, then fetch the users from Users. Just remove the "where ba.bankaccountID = u.bankaccountID" ! > select userID, fname, lname, email, phone, dateEntered, dateCanceled, > dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as > searches > from Users u > where bankaccountid in (select bankaccountid > from bankaccount ba > where ba.bankaccountID = u.bankaccountID > and ba.accountnumber = '12345678' > and ba.routingNumber = '12345678') > order by UserID desc > limit 500 New version : select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where bankaccountid in (select bankaccountid from bankaccount ba WHERE ba.accountnumber = '12345678' and ba.routingNumber = '12345678') You could also do this : select u.* from Users u, bankaccount ba where u.bankaccountid = ba.bankaccountid and ba.accountnumber = '12345678' and ba.routingNumber = '12345678')
Pierre-Frédéric Caillaud wrote: > > Your suffering comes from the "where ba.bankaccountID = > u.bankaccountID" in the subselect. It means postgres has to run the > subselect once for each row in Users. You want the subselect to run > only once, and return one (or more?) bankaccountid's, then fetch the > users from Users. > > Just remove the "where ba.bankaccountID = u.bankaccountID" ! > >> select userID, fname, lname, email, phone, dateEntered, dateCanceled, >> dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as >> searches >> from Users u >> where bankaccountid in (select bankaccountid >> from bankaccount ba >> where ba.bankaccountID = u.bankaccountID >> and ba.accountnumber = '12345678' >> and ba.routingNumber = '12345678') >> order by UserID desc >> limit 500 > > > New version : > > select userID, fname, lname, email, phone, dateEntered, dateCanceled, > dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as > searches > from Users u > where bankaccountid in (select bankaccountid > from bankaccount ba > WHERE ba.accountnumber = '12345678' > and ba.routingNumber = '12345678') > > You could also do this : > > select u.* from Users u, bankaccount ba > where u.bankaccountid = ba.bankaccountid > and ba.accountnumber = '12345678' > and ba.routingNumber = '12345678') > > > Thanks! a lot that was it , it is way much better now.