Thread: psql nested queries with 2000+ records
I have noted with 2000+ records in a table (using a btree unique index on the primary key, btw) performance of a query takes 15+ minutes to respond. The query is in the structure of this select foo from bar where foo in (select foo from bar where bazzfoo like 'Word%'); Just reporting a possible bug.. With smaller amounts of records, responce time is within a few seconds. Other information you might want: postgres version: 6.3 os: BSDI 3.1 Thanks -coronach@hill-b-073.resnet.purdue.edu
> > I have noted with 2000+ records in a table (using a btree unique index on > the primary key, btw) performance of a query takes 15+ minutes to respond. > > The query is in the structure of this > > select foo from bar where foo in (select foo from bar where bazzfoo like > 'Word%'); Very strange. 15+ minutes? Wow, that is terrible, even longer than a sequential scan of the table. Try EXPLAIN and tell us what it says in the two cases. > > Just reporting a possible bug.. > > With smaller amounts of records, responce time is within a few seconds. > > Other information you might want: > > postgres version: 6.3 > os: BSDI 3.1 Same OS here. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
At 12:24 PM 3/20/98 -0500, Bruce Momjian wrote: >Very strange. 15+ minutes? Wow, that is terrible, even longer than a >sequential scan of the table. Try EXPLAIN and tell us what it says in >the two cases. This is within the 2717 record database. explain select name from games where name in (select name from games where name like 'A%'); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) EXPLAIN This is within the 24 record database of the same type of data explain select name from games where name in (select name from game s where name like 'A%'); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) EXPLAIN Obviously, I get the same information, any suggestions? -Coronach@hill-b-073.resnet.purdue.edu
OK, I think I see the problem. Indexes are not being used in this case. Basically for every row in the outer query, you are doing a sequential scan in the inner table. Can you try this: select * into games2 from games; explain select name from games where name in (select name from games2 where name like 'A%'); I would like to know if the use of the same table in the subquery is causing the problem. I assume you have run vacuum and vacuum analyze. > > At 12:24 PM 3/20/98 -0500, Bruce Momjian wrote: > > >Very strange. 15+ minutes? Wow, that is terrible, even longer than a > >sequential scan of the table. Try EXPLAIN and tell us what it says in > >the two cases. > > This is within the 2717 record database. > > explain select name from games where name in (select name from games where > name like 'A%'); > > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games (cost=207.95 size=1 width=12) > > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games (cost=207.95 size=1 width=12) > > EXPLAIN > > This is within the 24 record database of the same type of data > > explain select name from games where name in (select name from game > s where name like 'A%'); > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games (cost=207.95 size=1 width=12) > > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games (cost=207.95 size=1 width=12) > > EXPLAIN > > Obviously, I get the same information, any suggestions? > > -Coronach@hill-b-073.resnet.purdue.edu > > -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
>select * into games2 from games; amusements=> select * into games2 from games; ERROR: parser: parse error at or near "games2" But, I just dumped it and changed a few things to fit the scenario... >explain select name from games where name in (select name from games2 where >name like 'A%'); amusements=> explain select name from games where name in (select name from game s2 where name like 'A%'); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games2 (cost=10.23 size=1 width=12) EXPLAIN Upon execution of the query, it still took quite a long time. >causing the problem. I assume you have run vacuum and vacuum analyze. Yep, ran vacuum on it and also had it work with newly generated databases too. Any other suggestions would be welcome, thanks -Coronach@hill-b-073.resnet.purdue.edu
Coronach wrote: > > I have noted with 2000+ records in a table (using a btree unique index on > the primary key, btw) performance of a query takes 15+ minutes to respond. > > The query is in the structure of this > > select foo from bar where foo in (select foo from bar where bazzfoo like > 'Word%'); EXPLAIN select foo from bar where bazzfoo like 'Word%'; Is index used ? Vadim
> > >select * into games2 from games; > > amusements=> select * into games2 from games; > ERROR: parser: parse error at or near "games2" Oops, we haven't released 6.3p1 yet, so 'select * into table games2 from games;' > > But, I just dumped it and changed a few things to fit the scenario... > > >explain select name from games where name in (select name from games2 where > >name like 'A%'); > > amusements=> explain select name from games where name in (select name from > game > s2 where name like 'A%'); > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games2 (cost=10.23 size=1 width=12) > > EXPLAIN > > Upon execution of the query, it still took quite a long time. > > >causing the problem. I assume you have run vacuum and vacuum analyze. > > Yep, ran vacuum on it and also had it work with newly generated databases too. Try it without the LIKE, with just an equals. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
At 09:54 AM 3/21/98 -0500, Bruce Momjian wrote: >Try it without the LIKE, with just an equals. amusements=> explain select name from games where name in (select name from game s where mfr = ''); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) EXPLAIN -- amusements=> explain select name from games where name in (select name from game s2 where mfr = ''); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games2 (cost=10.23 size=1 width=12) EXPLAIN The first query approved to be lengthy, but the last query was executed and returned rows in just a few. I then did an explain on a query that refered to the games table within the sub query. amusements=> explain select name from games where name in (select name from game s where mfr = ''); NOTICE: QUERY PLAN: Seq Scan on games (cost=207.95 size=446 width=12) SubPlan -> Seq Scan on games (cost=207.95 size=1 width=12) EXPLAIN For the nature of the searches that the front end send to postgres (the database is used with a web interface), it is nec. to use the like expression. Where should I go from here? Thanks once again, -Coronach@hill-b-073.resnet.purdue.edu
What does this show in explain? Does it use an index? select name from games where mfr = '' > > At 09:54 AM 3/21/98 -0500, Bruce Momjian wrote: > > >Try it without the LIKE, with just an equals. > > amusements=> explain select name from games where name in (select name from > game > s where mfr = ''); > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=207.95 size=446 width=12) > SubPlan > -> Seq Scan on games (cost=207.95 size=1 width=12) > > EXPLAIN I ran this test: test=> explain select * from pg_proc where oid in (select oid from pg_proc where oid =3); NOTICE: QUERY PLAN: Seq Scan on pg_proc (cost=48.90 size=91 width=122) SubPlan -> Index Scan on pg_proc (cost=2.05 size=1 width=4) and it used the index in the subquery. Please let us about the above test. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Aparently, the postgresql server was moved from the BSDI box to a linux 2.0.30 box, so the numbers that where given where not from the bsdi box. So now I am doing the queries personally. I'm sorry for the run around. This is the responce from the explain on the BSDI 3.1 box. amusements=> explain select name from games where name in (select name from game s where name like 'A%'); NOTICE: QUERY PLAN: Seq Scan on games (cost=0.00 size=0 width=12) SubPlan -> Seq Scan on games (cost=0.00 size=0 width=12) EXPLAIN amusements=> explain select name from games where name in (select name from game s2 where mfr = ''); NOTICE: QUERY PLAN: Seq Scan on games (cost=0.00 size=0 width=12) SubPlan -> Seq Scan on games2 (cost=0.00 size=0 width=12) EXPLAIN amusements=> explain select * from games where mfr = ''; NOTICE: QUERY PLAN: Seq Scan on games (cost=0.00 size=0 width=160) EXPLAIN Once again, I appologize for this misinformation and hope this sheds some light to the original problem. I'm making a note to change permissions on who can do what from now on. On a side note, the query was done on the 2.0.30 box in 5 minutes compared to the BSDI box at 15+. -Coronach@hill-b-073.resnet.purdue.edu
> amusements=> explain select * from games where mfr = ''; > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=0.00 size=0 width=160) > > EXPLAIN OK. This is the base of the problem. You are vacuum. How varied is the data in the mfr field. Is the usually just a few unique values. If so, the index really does you no good, and perhaps that is why it is not being used. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Coronach@hill-b-073.resnet.purdue.edu blushes and says: > Aparently, the postgresql server was moved from the BSDI box to a linux > 2.0.30 box, so the numbers that where given where not from the bsdi box. So > now I am doing the queries personally. I'm sorry for the run around. > > This is the responce from the explain on the BSDI 3.1 box. > > amusements=> explain select name from games where name in (select name from > game > s where name like 'A%'); > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=0.00 size=0 width=12) > SubPlan > -> Seq Scan on games (cost=0.00 size=0 width=12) > > EXPLAIN > > amusements=> explain select name from games where name in (select name from > game > s2 where mfr = ''); > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=0.00 size=0 width=12) > SubPlan > -> Seq Scan on games2 (cost=0.00 size=0 width=12) > > EXPLAIN > > amusements=> explain select * from games where mfr = ''; > NOTICE: QUERY PLAN: > > Seq Scan on games (cost=0.00 size=0 width=160) > > EXPLAIN > > Once again, I appologize for this misinformation and hope this sheds some > light to the original problem. I'm making a note to change permissions on > who can do what from now on. > > On a side note, the query was done on the 2.0.30 box in 5 minutes compared > to the BSDI box at 15+. Well of course... ;-) I am sure this has been mentioned, but could you post the exact schema for these tables and their indexes? It might even be worth dropping and recreating the indexes just to be _sure_. -dg David Gould dg@illustra.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 - I realize now that irony has no place in business communications.