Thread: Postgres-7.0.2 optimization question
Hi,

I've got a slight optimization problem with Postgres and I was hoping
someone could give me a clue as to what could be tweaked.

I have a couple of tables which contain fairly little data (around 500,000
tuples each), yet most operations take an insanely long time to complete.
The primary keys in both tables are ints (int8, iirc). When I perform a
delete (with a where clause on part of a primary key), an strace shows that
postgres reads the entire table sequentially (lseek() and read()). Since
each table is around 200MB, that takes time.

I tried vacuumdb --analyze. It did not help. I tried creating an index on
the part of the primary key that is used in the abovementioned delete. That
did not help either.

Has anyone encountered this kind of problem before? If so, has anyone found
a solution? (The problem is that the DB can very quickly grow 20 times
larger (i.e. 10,000,000 tuples per table is a moderate size), and I'd
rather not witness a delete that takes around 90 minutes for 100,000
deleted tuples more than once.)

TIA,

ivr
-- 
Women wearing Wonder bras and low-cut blouses lose their right to
complain about having their boobs stared at.
                                        "Things men wish women knew"
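[For reference: EXPLAIN is the quickest way to confirm what the strace
suggests. A minimal sketch, using the simplified table and column names
Igor posts further down the thread (foo, foo2), so they are assumptions at
this point:]

    -- On a 7.0-era server the plan is printed as a NOTICE.
    EXPLAIN DELETE FROM foo WHERE foo2 = 42;
    -- A plan line beginning with "Seq Scan on foo" confirms that the
    -- index is not being used for this delete.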
Post the query you're using; there may be a way to rewrite it to use the
index. I've found this to be true on all kinds of DBMSs.

-- 
Adam Ruth
InterCation, Inc.
www.intercation.com

"Igor V. Rafienko" <igorr@ifi.uio.no> wrote in message
news:Pine.SOL.4.21.0010131345100.23627-100000@vigrid.ifi.uio.no...
> I've got a slight optimization problem with Postgres and I was hoping
> someone could give me a clue as to what could be tweaked.
> [...]
> I tried vacuumdb --analyze. It did not help. I tried creating an index on
> the part of the primary key that is used in the abovementioned delete. That
> did not help either.
on Oct 13, 2000, 07:55, Adam Ruth std::cout'ed:

| Post the query you're using, there may be a way to rewrite it to use the
| index. I've found this to be true on all kinds of DBMSs.

Okidoki (somewhat simplified; there are 5 other columns as well, but they
have nothing to do with the delete):

create table foo (
    foo1  int8 not null,
    foo2  int8 not null,
    data1 varchar check( data1 in ( 'zot', 'qux' ) ) not null,
    data2 varchar null,

    primary key (foo1, foo2),
    unique (foo2, data1)
);

And then an index on foo2:

create index foobar on foo( foo2 );

And then a vacuumdb --analyze.

The query cannot be made simpler, imvho, but since you've asked:

delete from foo where foo2 = 42;

There is also a foreign key into this table from a primary key of another
table. Yes, there is an index on the other table as well.

ivr
-- 
"... for it must be admitted that God created women only to tame men"
                                                            Voltaire
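[The foreign key Igor mentions also matters for delete speed: every row
deleted from foo makes the FK check look up referencing rows in the other
table, and that referencing column needs its own index (one is not created
automatically). A hedged sketch; the table and column names bar, bar_id,
bar_foo1, bar_foo2 are invented here for illustration only:]

    -- Hypothetical referencing table; names are assumptions.
    create table bar (
        bar_id   int8 not null primary key,
        bar_foo1 int8 not null,
        bar_foo2 int8 not null,
        foreign key (bar_foo1, bar_foo2) references foo (foo1, foo2)
    );

    -- Without an index like this, each row deleted from foo costs a
    -- sequential scan of bar during the FK check.
    create index bar_fk_idx on bar (bar_foo1, bar_foo2);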
On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:
> I tried vacuumdb --analyze. It did not help. I tried creating an index on

If I parse that right, this is exactly VACUUM without ANALYZE, because
'--' is the SQL comment start ... no?

-- 
marko
> On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:
> > I tried vacuumdb --analyze. It did not help. I tried creating an index on
>
> If I parse that right, this is exactly VACUUM without ANALYZE, because
> '--' is the SQL comment start ... no?

"vacuumdb" is invoked at the shell, not in an SQL editor, so "--analyze" is
just an argument passed to the vacuumdb program, causing it to analyze as
it vacuums. "vacuumdb --help" tells you all of the nifty arguments.

: )

steve
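[For anyone else confused by the naming: the shell command and the SQL
statement do the same work. A small sketch; "mydb" in the comment is a
placeholder database name, and foo is the table from this thread:]

    -- What "vacuumdb --analyze mydb" (run from the shell) issues for the
    -- whole database; from inside psql the equivalent is simply:
    VACUUM ANALYZE;

    -- Or, limited to the one table being discussed here:
    VACUUM ANALYZE foo;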
on Oct 13, 2000, 17:26, Marko Kreen std::cout'ed:

| On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:
| > I tried vacuumdb --analyze. It did not help. I tried creating an index on
|
| If I parse that right, this is exactly VACUUM without ANALYZE, because
| '--' is the SQL comment start ... no?

Ehh... no. Not quite:

$ ./vacuumdb --help | grep analyze
vacuumdb cleans and analyzes a PostgreSQL database.
  -z, --analyze                   Update optimizer hints
$

The analyze was performed from outside the database via the vacuumdb
command, not via a VACUUM command in psql.

ivr
-- 
Everybody has to have a philosophy. Some people believe in laissez
faire economics, others believe in reincarnation. Some people even
believe that COBOL is a real programming language.
                                S. Meyers, "Effective C++", item 26
* Igor V. Rafienko <igorr@ifi.uio.no> [001013 05:09] wrote:
> I've got a slight optimization problem with Postgres and I was hoping
> someone could give me a clue as to what could be tweaked.
>
> I have a couple of tables which contain fairly little data (around 500,000
> tuples each), yet most operations take an insanely long time to complete.
> The primary keys in both tables are ints (int8, iirc). When I perform a
> delete (with a where clause on part of a primary key), an strace shows that
> postgres reads the entire table sequentially (lseek() and read()). Since
> each table is around 200MB, that takes time.

Postgresql fails to use the index on several of our tables as well; an
'EXPLAIN <query>' would probably show a lot of lines about doing a
'sequential scan'.

The only solution I've been able to come up with is to issue a
'set enable_seqscan=off;' SQL statement before most of my queries to force
postgresql to use an index.

hope this helps,
-Alfred
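[A sketch of that workaround applied to Igor's delete; enable_seqscan is
the planner setting Alfred names, the surrounding statements are just
illustrative:]

    -- Steer the planner away from sequential scans, for this session only.
    set enable_seqscan = off;

    explain delete from foo where foo2 = 42;   -- confirm an index scan is now chosen
    delete from foo where foo2 = 42;

    -- Restore the default so other queries are planned normally.
    set enable_seqscan = on;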
If you could post the queries in question along with the table structure
and EXPLAIN output of the queries, I'm sure someone might be able to
suggest something..

-Mitch

----- Original Message -----
From: "Alfred Perlstein" <bright@wintelcom.net>
To: "Igor V. Rafienko" <igorr@ifi.uio.no>
Cc: <pgsql-general@postgresql.org>
Sent: Friday, October 13, 2000 10:47 AM
Subject: Re: [GENERAL] Postgres-7.0.2 optimization question

> Postgresql fails to use the index on several of our tables as well; an
> 'EXPLAIN <query>' would probably show a lot of lines about doing a
> 'sequential scan'.
>
> The only solution I've been able to come up with is to issue a
> 'set enable_seqscan=off;' SQL statement before most of my queries to force
> postgresql to use an index.
>
> hope this helps,
> -Alfred
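[The EXPLAIN output being asked for looks roughly like this on a 7.0-era
server, where the plan arrives as a NOTICE; the cost, row, and width
numbers below are made-up illustrations, not from Igor's database:]

    explain delete from foo where foo2 = 42;

    -- NOTICE:  QUERY PLAN:
    --
    -- Seq Scan on foo  (cost=0.00..12915.00 rows=5000 width=46)
    --
    -- EXPLAIN
    --
    -- A plan that used the foobar index would instead begin with
    -- "Index Scan using foobar on foo".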
You might want to try:

delete from foo where foo2 = 42::int8

IIRC, there is an issue with int8 indexes and integer constants (which are
treated as int4).

Stephan Szabo
sszabo@bigpanda.com

On Fri, 13 Oct 2000, Igor V. Rafienko wrote:

> Okidoki (somewhat simplified; there are 5 other columns as well, but they
> have nothing to do with the delete):
>
> create table foo (
>     foo1  int8 not null,
>     foo2  int8 not null,
>     data1 varchar check( data1 in ( 'zot', 'qux' ) ) not null,
>     data2 varchar null,
>
>     primary key (foo1, foo2),
>     unique (foo2, data1)
> );
>
> And then an index on foo2:
>
> create index foobar on foo( foo2 );
>
> And then a vacuumdb --analyze.
>
> The query cannot be made simpler, imvho, but since you've asked:
>
> delete from foo where foo2 = 42;
>
> There is also a foreign key into this table from a primary key of another
> table. Yes, there is an index on the other table as well.
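[A hedged sketch of the two commonly suggested forms of this fix on
7.0-era servers: an explicit cast, or a quoted literal that gets resolved
against the int8 column. The EXPLAIN calls are only there to confirm the
index scan:]

    -- Explicit cast, as suggested above:
    explain delete from foo where foo2 = 42::int8;

    -- Alternative: a quoted literal is coerced to the column's type,
    -- which also lets the planner consider the foobar index.
    explain delete from foo where foo2 = '42';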