Thread: Takes too long to fetch the data from database
What is the query?
Use LIMIT or a restricting WHERE clause.
regds
mallah.
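Mallah's advice, sketched as a query (the table, column, and cutoff here are illustrative placeholders, not taken from the thread):

```sql
-- Cap the rows fetched instead of pulling the whole table to the client
SELECT *
FROM some_table                  -- illustrative table name
WHERE created >= '2006-01-01'    -- a restricting WHERE clause
LIMIT 100;                       -- LIMIT caps the rows returned
```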
Hello,

I have difficulty fetching records from the database. The table contains more than 1 GB of data, and fetching the records takes more than 1 hour, which is slowing everything down. Please provide some help on improving performance: how do I run the query so that the records are fetched in less time?
I have flushed the database, so currently the "lan" table holds 665280 records,
but the table can grow beyond 1 GB, and in that case fetching takes more than 1 hour.
Below is explain analyze output taken from the table having 665280 records
pdb=# explain analyze SELECT sdate, stime, rbts from lan WHERE (
( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate
>= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) ) ORDER BY sdate, stime ;
NOTICE: QUERY PLAN:
Sort (cost=17.13..17.13 rows=1 width=16) (actual time=619140.18..619140.29 rows=288 loops=1)
  -> Index Scan using lan_pkey on lan (cost=0.00..17.12 rows=1 width=16) (actual time=7564.44..619121.61 rows=288 loops=1)
Total runtime: 619140.76 msec
EXPLAIN
bsdb=# explain analyze SELECT DISTINCT sdate, stime, rbts from lan
WHERE ( ( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) )
ORDER BY sdate, stime ;
NOTICE: QUERY PLAN:
Unique (cost=17.13..17.14 rows=1 width=16) (actual time=610546.66..610564.31 rows=288 loops=1)
-> Sort (cost=17.13..17.13 rows=1 width=16) (actual time=610546.65..610546.75 rows=288 loops=1)
-> Index Scan using lan_pkey on lan (cost=0.00..17.12 rows=1 width=16) (actual time=7524.47..610533.50 rows=288 loops=1)
Total runtime: 610565.51 msec
EXPLAIN
pdb=# explain analyze SELECT ALL sdate, stime, rbts from lan WHERE ( ( bname = 'neptune' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) ) ORDER BY sdate, stime ;
NOTICE: QUERY PLAN:
Sort (cost=17.13..17.13 rows=1 width=16) (actual time=1260756.66..1260756.76 rows=288 loops=1)
-> Index Scan using lan_pkey on lan (cost=0.00..17.12 rows=1 width=16) (actual time=7725.97..1260752.47 rows=288 loops=1)
Total runtime: 1260757.09 msec
pdb=# \d lan
Table "lan"
Column | Type | Modifiers
------------------+-----------------------+-----------
bname | character varying(64) | not null
sdate | date | not null
stime | integer | not null
cno | smallint | not null
pno | smallint | not null
rbts | bigint |
tbts | bigint |
u_inpkt | bigint |
u_outpkt | bigint |
m_inpkt | bigint |
m_outpkt | bigint |
b_inpkt | bigint |
b_outpkt | bigint |
Primary key: lan_pkey
Check constraints: "lan_stime" ((stime >= 0) AND (stime < 86400))
Rajesh Kumar Mallah wrote:
>
> what is the query ?
> use LIMIT or a restricting where clause.
You could also use a cursor.
Joshua D. Drake
>
>
> regds
> mallah.
>
> On 4/10/06, *soni de* <soni.de@gmail.com> wrote:
>
> Hello,
>
> I have difficulty in fetching the records from the database.
> Database table contains more than 1 GB data.
> For fetching the records it is taking more the 1 hour and that's why
> it is slowing down the performance.
> please provide some help regarding improving the performance and how
> do I run query so that records will be fetched in a less time.
>
>
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
soni de wrote:
> I have flushed the database, so currently records in the "lan" table are:
> 665280
>
> but records can be increased more than 1GB and in that case it takes more
> than 1 hour
>
> Below is explain analyze output taken from the table having 665280 records
>
> pdb=# explain analyze SELECT sdate, stime, rbts from lan WHERE (
> ( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate
> >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) ) ORDER BY sdate, stime ;
>
> NOTICE: QUERY PLAN:
> Sort (cost=17.13..17.13 rows=1 width=16) (actual time=619140.18..619140.29 rows=288 loops=1)
>   -> Index Scan using lan_pkey on lan (cost=0.00..17.12 rows=1 width=16) (actual time=7564.44..619121.61 rows=288 loops=1)
>
> Total runtime: 619140.76 msec

OK - there is clearly something wrong here when you take 10 minutes to
fetch 288 rows from an index.

1. VACUUM FULL VERBOSE lan;
2. test again, and if that doesn't work...
3. REINDEX TABLE lan;
4. test again

I'm guessing you have a *lot* of dead rows in there.

--
Richard Huxton
Archonet Ltd
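Richard's maintenance sequence, collected in one place (the trailing ANALYZE is not in his message; it is a common companion step to refresh planner statistics, added here as a suggestion):

```sql
VACUUM FULL VERBOSE lan;  -- reclaim space held by dead rows, reporting progress
REINDEX TABLE lan;        -- rebuild the table's indexes if vacuuming alone doesn't help
ANALYZE lan;              -- refresh planner statistics (added here; not part of the original advice)
```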
Richard Huxton <dev@archonet.com> writes:
> soni de wrote:
>> NOTICE: QUERY PLAN:
>> Sort (cost=17.13..17.13 rows=1 width=16) (actual time=619140.18..619140.29 rows=288 loops=1)
>>   -> Index Scan using lan_pkey on lan (cost=0.00..17.12 rows=1 width=16) (actual time=7564.44..619121.61 rows=288 loops=1)
>> Total runtime: 619140.76 msec

> OK - there is clearly something wrong here when you take 10 minutes to
> fetch 288 rows from an index.

> I'm guessing you have a *lot* of dead rows in there.

Yeah. The other small problem here is that EXPLAIN output hasn't looked
like that since PG 7.2 (unless Soni has just omitted the index-condition
lines). I'd recommend updating to something modern.

			regards, tom lane
Please provide me some help on how I could use a cursor in the following cases:
I want to fetch 50 records at a time, starting from the largest stime.
Total no. of records in the "wan" table: 82019
pdb=# \d wan
Table "wan"
Column | Type | Modifiers
-------------+--------------------------+-----------
stime | bigint | not null
kname | character varying(64) |
eid | smallint |
rtpe | smallint |
taddr | character varying(16) |
ntime | bigint |
Primary key: wan_pkey
stime is the primary key.
pdb=#
SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * FROM wan ORDER BY stime LIMIT 50
OFFSET 81900;
NOTICE: QUERY PLAN:
Limit (cost=17995.15..17995.15 rows=50 width=95) (actual time=9842.92..9843.20 rows=50 loops=1)
  -> Sort (cost=17995.15..17995.15 rows=82016 width=95) (actual time=9364.56..9793.00 rows=81951 loops=1)
    -> Seq Scan on wan (cost=0.00..3281.16 rows=82016 width=95) (actual time=0.11..3906.29 rows=82019 loops=1)
Total runtime: 10010.76 msec
EXPLAIN
pdb=#
SELECT * FROM wan where kname='pluto' ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * from wan where kname='pluto' order by stime limit 50 offset 81900;
NOTICE: QUERY PLAN:
Limit (cost=3494.13..3494.13 rows=1 width=95) (actual time=9512.85..9512.85 rows=0 loops=1)
-> Sort (cost=3494.13..3494.13 rows=206 width=95) (actual time=9330.74..9494.90 rows=27485 loops=1)
-> Seq Scan on wan (cost=0.00..3486.20 rows=206 width=95) (actual time=0.28..4951.76 rows=27485 loops=1)
Total runtime: 9636.96 msec
EXPLAIN
SELECT * FROM wan where kname='pluto' and rtpe=20 ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * from wan where kname='pluto' and rtpe = 20 order by stime limit 50 offset 81900;
NOTICE: QUERY PLAN:
Limit (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
-> Sort (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
-> Seq Scan on wan (cost=0.00..3691.24 rows=1 width=95) (actual time=7361.30..7361.30 rows=0 loops=1)
Total runtime: 7361.71 msec
EXPLAIN
pdb=#
All the above queries take around 7-10 seconds to fetch the last 50 records. I want to reduce this time, because the table is growing; once it holds more than 1 GB of data these queries will take far too long.
I do not understand how to use a cursor to fetch records starting from the last record; in the above case the offset can be any number (less than the total number of records).
I have used the following cursor, but it takes the same time as the query:
BEGIN;
DECLARE crs cursor FOR SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
FETCH ALL in crs;
CLOSE crs;
COMMIT;
> pdb=# explain analyze SELECT sdate, stime, rbts from lan WHERE (
> ( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate
> >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) ) ORDER BY sdate, stime
> ;
this query would benefit from an index on
bname, cno, pno, sdate:
create index lan_idx on lan (bname, cno, pno, sdate);
> pdb=# explain analyze SELECT ALL sdate, stime, rbts from lan WHERE ( (
> bname = 'neptune' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate >=
> '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ) ) ORDER BY sdate, stime ;
ditto above. Generally, the more closely the fields in the where clause
are matched by the index, the more it will speed up your query.
Merlin
On Thu, Apr 20, 2006 at 11:07:31 +0530, soni de <soni.de@gmail.com> wrote:
> Please provide me some help regarding how could I use cursor in following
> cases? :
>
> I want to fetch 50 records at a time starting from largest stime.
>
> SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;

Something like the following may be faster:

SELECT * FROM wan ORDER BY stime DESC LIMIT 50;
> SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
you need to try and solve the problem without using 'offset'. you could do:
BEGIN;
DECLARE crs cursor FOR SELECT * FROM wan ORDER BY stime;
FETCH ABSOLUTE 81900 in crs;
FETCH 49 in crs;
CLOSE crs;
COMMIT;
this may be a bit faster but will not solve the fundamental problem.
the more interesting question is why you want to query exactly 81900
rows into a set. This type of thinking will always get you into
trouble; absolute positioning will not really work in a true SQL
sense. If you are browsing a table sequentially, there are much
better methods.
merlin
I’ve never used a cursor in Postgres, but I don’t think it will help you a lot. In theory cursors make it easier to do paging, but your main problem is that getting the first page is slow. A cursor isn’t going to be any faster at getting the first page than OFFSET/LIMIT is.
Did you try Bruno’s suggestion of:
SELECT * FROM wan ORDER BY stime DESC OFFSET 0 LIMIT 50;
You should run an EXPLAIN ANALYZE on that query to see if it is using an index scan. Also what version of Postgres are you using? You can run select version(); to check.
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of soni de
Sent: Thursday, April 20, 2006 11:42 PM
To: Merlin Moncure
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Takes too long to fetch the data from database
I don't want to query exactly 81900 rows into a set. I just want to fetch 50 or 100 rows at a time in decreasing order of stime (i.e. 50 or 100 rows starting from the last record and working backwards).
If we fetch sequentially there is also a problem: fetching all the records (select * from wan where kname='pluto' order by stime) takes more than 4-5 minutes. I tried it on the same table with more than 326054 records.
On 4/20/06, Merlin Moncure <mmoncure@gmail.com> wrote:
> SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
you need to try and solve the problem without using 'offset'. you could do:
BEGIN;
DECLARE crs cursor FOR SELECT * FROM wan ORDER BY stime;
FETCH ABSOLUTE 81900 in crs;
FETCH 49 in crs;
CLOSE crs;
COMMIT;
this may be a bit faster but will not solve the fundamental problem.
the more interesting question is why you want to query exactly 81900
rows into a set. This type of thinking will always get you into
trouble, absolute positioning will not really work in a true sql
sense. if you are browsing a table sequentially, there are much
better methods.
merlin
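Dave's two suggested checks, written out as commands (the DESC query is Bruno's suggestion quoted above):

```sql
-- does the backwards sort use an index scan?
EXPLAIN ANALYZE SELECT * FROM wan ORDER BY stime DESC OFFSET 0 LIMIT 50;

-- which PostgreSQL version is running?
SELECT version();
```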
On 4/21/06, soni de <soni.de@gmail.com> wrote:
> I don't want to query exactly 81900 rows into set. I just want to fetch 50
> or 100 rows at a time in a decreasing order of stime. (i.e 50 or 100 rows
> starting from last to end).

aha! you need to implement a 'sliding window' query. simplest is when
you are ordering one field that is unique:

1st 50:
select * from t order by k limit 50;
2nd 50:
select * from t where k > k1 order by k limit 50;

if you are ordering on two fields or on a field that is not unique, you must do:

1st 50:
select * from t order by j, k limit 50;
2nd 50:
select * from t where j >= j1 and (j > j1 or k > k1) order by j, k limit 50;

3 fields:
select * from t where i >= i1 and (i > i1 or j >= j1) and (i > i1 or j > j1 or k > k1) order by i,j,k limit 50;

i1,j1,k1 are the values of the 50th record you pulled out of the last
query. if this seems a little complicated, either wait for pg 8.2 or
get cvs tip and rewrite as:

select * from t where (i,j,k) > (i1,j1,k1) order by i,j,k limit 50;
On Fri, Apr 21, 2006 at 10:12:24 +0530, soni de <soni.de@gmail.com> wrote:
> I don't want to query exactly 81900 rows into set. I just want to fetch 50
> or 100 rows at a time in a decreasing order of stime. (i.e 50 or 100 rows
> starting from last to end).

You can do this efficiently if stime has an index and you can deal with
using the stime from the previous query instead of the record count. The
idea is to select up to 50 or 100 records in descending order where the
stime is <= the previous stime. This can give you some overlapping
records, so you need some way to deal with them.
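Since stime is the wan table's primary key and therefore unique, the overlap Bruno describes can be avoided by using a strict comparison instead of `<=`; a sketch (`:last_stime` stands for the smallest stime returned by the previous batch):

```sql
-- first batch, newest first
SELECT * FROM wan ORDER BY stime DESC LIMIT 50;

-- each following batch
SELECT * FROM wan
WHERE stime < :last_stime   -- strict <, safe here because stime is unique
ORDER BY stime DESC
LIMIT 50;
```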
On Fri, Apr 21, 2006 at 09:44:25AM -0400, Merlin Moncure wrote:
> 2nd 50:
> select * from t where j >= j1 and (j > j1 or k > k1) order by j, k limit 50;
> 3 fields:
> select * from t where i >= i1 and (i > i1 or j >= j1) and (i > i1 or j >
> k1 or k > k1) order by i,j,k limit 50;

Note that in 8.2 you'll be able to do:

WHERE (i, j, k) >= (i1, j1, k1)

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
Hello,
NOTICE: QUERY PLAN:
Limit (cost=0.00..12.10 rows=50 width=95) (actual time=24.29..50.24 rows=50 loops=1)
-> Index Scan Backward using wan_pkey on wan (cost=0.00..19983.31 rows=82586 width=95) (actual time=24.28..50.14 rows=51 loops=1)
Total runtime: 50.55 msec
EXPLAIN
NOTICE: QUERY PLAN:
Aggregate (cost=3507.84..3507.84 rows=1 width=0) (actual time=214647.53..214647.54 rows=1 loops=1)
-> Seq Scan on wan (cost=0.00..3507.32 rows=208 width=0) (actual time=13.65..214599.43 rows=18306 loops=1)
Total runtime: 214647.87 msec
EXPLAIN
pdb=# EXPLAIN ANALYZE select * from wan where kname = 'pluto' order by stime limit 50;
NOTICE: QUERY PLAN:
Limit (cost=3515.32..3515.32 rows=50 width=95) (actual time=230492.69..230493.07 rows=50 loops=1)
-> Sort (cost=3515.32..3515.32 rows=208 width=95) (actual time=230492.68..230493.00 rows=51 loops=1)
    -> Seq Scan on wan (cost=0.00..3507.32 rows=208 width=95) (actual time=0.44..229217.38 rows=18306 loops=1)
Total runtime: 230631.62 msec
EXPLAIN
pdb=# EXPLAIN ANALYZE SELECT * FROM wan WHERE stime >= 20123 AND stime <= 24000 ORDER BY stime limit 50;
NOTICE: QUERY PLAN:
Limit (cost=0.00..2519.70 rows=50 width=95) (actual time=7346.74..7351.42 rows=50 loops=1)
-> Index Scan using wan_pkey on wan (cost=0.00..20809.17 rows=413 width=95) (actual time=7346.73..7351.32 rows=51 loops=1)
Total runtime: 7351.71 msec
EXPLAIN
For the above queries, if I use descending order then they take too much time.
On Tue, May 09, 2006 at 09:24:15 +0530, soni de <soni.de@gmail.com> wrote:
> pdb=# EXPLAIN ANALYZE select * from wan where kname = 'pluto' order by stime limit 50;
> NOTICE: QUERY PLAN:
>
> Limit (cost=3515.32..3515.32 rows=50 width=95) (actual time=230492.69..230493.07 rows=50 loops=1)
>   -> Sort (cost=3515.32..3515.32 rows=208 width=95) (actual time=230492.68..230493.00 rows=51 loops=1)
>     -> Seq Scan on wan (cost=0.00..3507.32 rows=208 width=95) (actual time=0.44..229217.38 rows=18306 loops=1)
> Total runtime: 230631.62 msec

Unless you have an index on (kname, stime), the query is going to need to
find the records with a value for kname of 'pluto' and then get the most
recent 50 of them. It looks like there are enough estimated records with
kname = 'pluto' that a sequential scan is being preferred. Creating an
extra index will slow down inserts somewhat, but will speed up queries
like the above significantly, so it may be worthwhile for you.

I think later versions of Postgres are smarter, but for sure in 7.2 you
will need to write the query like:

SELECT * FROM wan WHERE kname = 'pluto' ORDER BY kname DESC, stime DESC LIMIT 50;
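The index Bruno recommends but does not spell out might look like this (the index name is illustrative):

```sql
CREATE INDEX wan_kname_stime_idx ON wan (kname, stime);

-- with that index in place, the 7.2-friendly form of the query can walk it backwards:
SELECT * FROM wan WHERE kname = 'pluto'
ORDER BY kname DESC, stime DESC LIMIT 50;
```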