Thread: Improving performance on multicolumn query
Hi, all!

I've been using PostgreSQL for a long time now, but today I ran into a
problem I couldn't solve properly - I hope some more experienced users
here have some hints for me.

First, I'm using PostgreSQL 7.4.7 on a 2GHz machine with 1.5GB of RAM,
and I have a table with about 220 columns and 20000 rows - the first
five columns form the primary key (and a unique index).

Now my problem: I need to run a great many queries that look up rows by
their primary key and fetch about five columns each, but these are quite
slow (about 10 queries per second; since I have other databases that
manage about 300 queries per second, I think this is slow):

transfer=> explain analyse SELECT * FROM test WHERE test_a=9091150001
           AND test_b=1 AND test_c=2 AND test_d=0 AND test_e=0;

 Index Scan using test_idx on test  (cost=0.00..50.27 rows=1 width=1891)
                                    (actual time=0.161..0.167 rows=1 loops=1)
   Index Cond: (test_a = 9091150001::bigint)
   Filter: ((test_b = 1) AND (test_c = 2) AND (test_d = 0) AND (test_e = 0))

So, what can I do to speed things up? If I understand this output
correctly, the planner uses my index (test_idx is the same as test_pkey,
created along with the table), but only for the first column.
Unfortunately I can't refactor these tables, as they were given to me
like this.

Thanks for any hint!
Jan
Jan Kesten wrote:
> First, I'm using PostgreSQL 7.4.7 on a 2GHz machine with 1.5GB of RAM,
> and I have a table with about 220 columns and 20000 rows - the first
> five columns form the primary key (and a unique index).
>
> transfer=> explain analyse SELECT * FROM test WHERE test_a=9091150001
>            AND test_b=1 AND test_c=2 AND test_d=0 AND test_e=0;
>
>  Index Scan using test_idx on test  (cost=0.00..50.27 rows=1 width=1891)
>                                     (actual time=0.161..0.167 rows=1 loops=1)
>    Index Cond: (test_a = 9091150001::bigint)
>    Filter: ((test_b = 1) AND (test_c = 2) AND (test_d = 0) AND (test_e = 0))

This says it's taking less than a millisecond - which is almost
certainly too fast to measure accurately anyway. Are you sure this query
is the problem?

> So, what can I do to speed things up? If I understand this output
> correctly, the planner uses my index (test_idx is the same as test_pkey,
> created along with the table), but only for the first column.

1. Are all of test_a/b/c/d/e bigint rather than int?

2. Have you tried explicitly casting your query parameters?
   ...WHERE test_a=123::bigint AND test_b=456::bigint...

--
  Richard Huxton
  Archonet Ltd
On Wed, Nov 09, 2005 at 01:08:07PM +0100, Jan Kesten wrote:
> First, I'm using PostgreSQL 7.4.7 on a 2GHz machine with 1.5GB of RAM,
> and I have a table with about 220 columns and 20000 rows - the first
> five columns form the primary key (and a unique index).

I forgot this earlier, but it should be mentioned: a primary key already
acts as a unique index, so you don't need both.

/* Steinar */
--
Homepage: http://www.sesse.net/
> transfer=> explain analyse SELECT * FROM test WHERE test_a=9091150001
>            AND test_b=1 AND test_c=2 AND test_d=0 AND test_e=0;
>
>  Index Scan using test_idx on test  (cost=0.00..50.27 rows=1 width=1891)
>                                     (actual time=0.161..0.167 rows=1 loops=1)
>    Index Cond: (test_a = 9091150001::bigint)
>    Filter: ((test_b = 1) AND (test_c = 2) AND (test_d = 0) AND (test_e = 0))
>
> So, what can I do to speed things up? If I understand this output
> correctly, the planner uses my index (test_idx is the same as test_pkey,
> created along with the table), but only for the first column.

Hi Jan,

If you're using 7.4.x, the planner can't use the index for unquoted
bigints. Try this:

SELECT * FROM test WHERE test_a='9091150001' AND test_b='1'
  AND test_c='2' AND test_d='0' AND test_e='0';

Chris
On Wed, Nov 09, 2005 at 01:08:07PM +0100, Jan Kesten wrote:
> Now my problem: I need to run a great many queries that look up rows by
> their primary key and fetch about five columns each, but these are quite
> slow (about 10 queries per second; since I have other databases that
> manage about 300 queries per second, I think this is slow):
>
> transfer=> explain analyse SELECT * FROM test WHERE test_a=9091150001
>            AND test_b=1 AND test_c=2 AND test_d=0 AND test_e=0;
>
>  Index Scan using test_idx on test  (cost=0.00..50.27 rows=1 width=1891)
>                                     (actual time=0.161..0.167 rows=1 loops=1)
>    Index Cond: (test_a = 9091150001::bigint)
>    Filter: ((test_b = 1) AND (test_c = 2) AND (test_d = 0) AND (test_e = 0))

You don't post your table definitions (please do), but it looks like
test_b, test_c, test_d and test_e might be bigints? If so, you may want
to write explicit casts, "AND test_b=1::bigint AND test_c=2::bigint" and
so on -- 7.4 doesn't figure this out for you. (8.0 and higher does.)

/* Steinar */
--
Homepage: http://www.sesse.net/
Hi all!

First, thanks for all the answers so far :-)

> You don't post your table definitions (please do), but it looks like
> test_b, test_c, test_d and test_e might be bigints? If so, you may want
> to write explicit casts, "AND test_b=1::bigint AND test_c=2::bigint" and
> so on -- 7.4 doesn't figure this out for you. (8.0 and higher does.)

I didn't post the table definition, but you are all right: test_a to
test_e are all bigint. I use JDBC to connect to this database, with a
prepared statement for the queries, and I set all parameters with the
pst.setLong() method. Perhaps that is the problem?

I'll try 'normal' statements with explicit typecasting, because as far
as I can see the query itself is the problem (postgresql takes more than
98% CPU while running these statements), or possibly the overhead
produced (but not the network, which has only 1-2% load). Queries
against other tables (not as big - far fewer rows and columns) run quite
fast with the same code.

So, thanks again - I'll try it and report back :-) It can't really be
that slow; I have some self-built databases with millions of rows and
they run very fast - but they don't use bigint ;-)

Cheers,
Jan
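PS: This is roughly what I plan to try first - just a minimal sketch,
keeping the existing pst.setLong() calls and only casting the
placeholders to bigint in the statement text. The table and column names
are the ones from my mails above; the class, method and variable names
are made up, and I'm assuming the JDBC driver passes the ?::bigint casts
through to the server unchanged:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class BigintLookup {
        // Fetch one row by its five-column primary key. The explicit
        // ?::bigint casts make the 7.4 planner see bigint comparisons,
        // so it can match all five key columns instead of only the
        // first one and then filtering the rest.
        public static void fetchRow(Connection conn) throws Exception {
            String sql = "SELECT * FROM test"
                       + " WHERE test_a = ?::bigint"
                       + "   AND test_b = ?::bigint"
                       + "   AND test_c = ?::bigint"
                       + "   AND test_d = ?::bigint"
                       + "   AND test_e = ?::bigint";
            PreparedStatement pst = conn.prepareStatement(sql);
            pst.setLong(1, 9091150001L);
            pst.setLong(2, 1L);
            pst.setLong(3, 2L);
            pst.setLong(4, 0L);
            pst.setLong(5, 0L);
            ResultSet rs = pst.executeQuery();
            while (rs.next()) {
                // ... read the few columns we actually need ...
            }
            rs.close();
            pst.close();
        }
    }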