Thread: its really SLOW!!!!!
oh boy, I now realize how little I know about databases... OK, so I'm switching from MySQL to postgresql. I have this database in MySQL which used the enum data type, which looks to be a MySQL extension since there are no enum data types in postgresql. What I do is store a bunch of values of the setting of a magnet and I have about 6 entries per master record number. (i.e. I have a table which I call the master table, one row entry per master record, one table called the magnet table, which has many entries per Master record entry.) So what I do is loop over the master record entries, say record numbers 5000-5300, and for each master record entry, I look up in the magnet table data stored for that master record. The deal is this, its really slow pulling out the data from the magnet table. Like 15 seconds for 500 selects. The equivalent select in MySQL ran at least 10 times faster. Any ideas of what I'm doing wrong? Did I give you guys enough information so that you understand what I'm doing? Cheers. Steve.
On Mon, Dec 02, 2002 at 04:43:18PM -0500, Adler, Stephen wrote: > oh boy, I now realize how little I know about databases... > OK, so I'm switching from MySQL to postgresql. I have > this database in MySQL which used the enum data type, > which looks to be a MySQL extension since there are no > enum data types in postgresql. What I do is store a bunch > of values of the setting of a magnet and I have about 6 > entries per master record number. (i.e. I have a table > which I call the master table, one row entry per master record, > one table called the magnet table, which has many entries > per Master record entry.) > So what I do is loop over the master record entries, say > record numbers 5000-5300, and for each master record entry, > I look up in the magnet table data stored for that master > record. The deal is this, its really slow pulling out the > data from the magnet table. Like 15 seconds for 500 selects. > The equivalent select in MySQL ran at least 10 times faster. > > Any ideas of what I'm doing wrong? Did I give you guys > enough information so that you understand what I'm doing? Steve -- Why don't you post a dump of the create statements for your table, and the select statement that you think is running so slowly? -- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant
create master ( masterindex integer not null primary key ); create magnet ( masterindex integer, current integer, voltage integer ); insert into master values (1); insert into master values (1); insert into magnet values (1, 100, 100); insert into magnet values (1, 102, 99); insert into magnet values (1, 99,100); insert into magnet values (2, 100, 101); insert into magnet values (2, 99,103); insert into magnet values (2, 100, 99); The idea being that I have a group of values which are related to the master index. Here I have 2 records in my master table, and 3 groups of values for each master record. The select goes like this select * from magnet where masterindex=1; select * from magnet where masterindex=2; These look ups are very slow. One thing I should remind you guys, is that the master table has over 50,000, the magnet table has about 6 times that many entries, 6 entries for each master record, unlike the example above where I only have 3 entries per record. What I'm wondering is that I need to make masterindex a real index. The idea of course is that I want to do a select on the master table with a join with the magnet table so that I pull out a list of magnet current and voltage settings for each master record. But right now I'm just doing the queries on the magnet table. I hope all of the above makes sense. Cheers. Steve. On Mon, 2002-12-02 at 16:56, Joel Burton wrote: > On Mon, Dec 02, 2002 at 04:43:18PM -0500, Adler, Stephen wrote: > > oh boy, I now realize how little I know about databases... > > OK, so I'm switching from MySQL to postgresql. I have > > this database in MySQL which used the enum data type, > > which looks to be a MySQL extension since there are no > > enum data types in postgresql. What I do is store a bunch > > of values of the setting of a magnet and I have about 6 > > entries per master record number. (i.e. I have a table > > which I call the master table, one row entry per master record, > > one table called the magnet table, which has many entries > > per Master record entry.) > > So what I do is loop over the master record entries, say > > record numbers 5000-5300, and for each master record entry, > > I look up in the magnet table data stored for that master > > record. The deal is this, its really slow pulling out the > > data from the magnet table. Like 15 seconds for 500 selects. > > The equivalent select in MySQL ran at least 10 times faster. > > > > Any ideas of what I'm doing wrong? Did I give you guys > > enough information so that you understand what I'm doing? > > Steve -- > > Why don't you post a dump of the create statements for your table, and > the select statement that you think is running so slowly? > > -- > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > Independent Knowledge Management Consultant > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
On Mon, Dec 02, 2002 at 05:11:45PM -0500, Adler, Stephen wrote: > create master ( > masterindex integer not null primary key > ); > > create magnet ( > masterindex integer, > current integer, > voltage integer > ); > > insert into master values (1); > insert into master values (1); > insert into magnet values (1, 100, 100); > insert into magnet values (1, 102, 99); > insert into magnet values (1, 99,100); > insert into magnet values (2, 100, 101); > insert into magnet values (2, 99,103); > insert into magnet values (2, 100, 99); > > The idea being that I have a group of values which > are related to the master index. Here I have 2 records > in my master table, and 3 groups of values for each > master record. The select goes like this > > select * from magnet where masterindex=1; > select * from magnet where masterindex=2; > > These look ups are very slow. One thing I should remind > you guys, is that the master table has over 50,000, > the magnet table has about 6 times that many entries, > 6 entries for each master record, unlike the example > above where I only have 3 entries per record. What I'm > wondering is that I need to make masterindex a real > index. The idea of course is that I want to do a select > on the master table with a join with the magnet table > so that I pull out a list of magnet current and voltage > settings for each master record. But right now I'm just > doing the queries on the magnet table. Stephen -- What, exactly, is your query? Are you running 50,000 statements in a row like "SELECT * FROM magnet WHERE masterindex=[x]"? Do you need all this raw data? Or are you trying to aggregate it or summarize it? Are you trying to get index current1 current2 current3 voltage1 voltage2 voltage3 1 100 101 105 100 95 94 2 99 102 102 99 101 95 (for example, made-up data) or what? What indexes are on your tables? Definitely, magnet.masterindex should be indexed. -- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant
Adler, Stephen <adler@bnl.gov> wrote: > create master ( > masterindex integer not null primary key > ); > > create magnet ( > masterindex integer, > current integer, > voltage integer > ); Just some thoughts: First I would create an index on magnet.masterindex. (Indexes are automatically created only on the primary key.) CREATE INDEX idx_magnet_masterindex ON magnet (masterindex); After loading all your data, don't forget to analyze the tables. VACUUM ANALYZE; or ANALYZE <tablename>; for each table. > select * from magnet where masterindex=1; > select * from magnet where masterindex=2; These two queries will do a complete table scan because of the lack of an index. See EXPLAIN in the manuals for details about looking at query plans. EXPLAIN select * from magnet where masterindex=1; If the data set changes a lot it could be wise to cluster the tables once in a while. What exactly do you want to get out of the data set? Best Regards, Michael Paesold
From: "Adler, Stephen" <adler@bnl.gov> To: Joel Burton <joel@joelburton.com> Copies to: pgsql-novice@postgresql.org Subject: Re: [NOVICE] its really SLOW!!!!! Date sent: 02 Dec 2002 17:11:45 -0500 > create master ( > masterindex integer not null primary key > ); > > create magnet ( > masterindex integer, > current integer, > voltage integer > ); > > insert into master values (1); > insert into master values (1); > insert into magnet values (1, 100, 100); > insert into magnet values (1, 102, 99); > insert into magnet values (1, 99,100); > insert into magnet values (2, 100, 101); > insert into magnet values (2, 99,103); > insert into magnet values (2, 100, 99); The first thing you notice is that that there is no primary key for the magnet table You could do away with the master table by creating a new table: create master2(test serial primary key,magnet integer,voltage integer,current integer); insert into master2(magnet,voltage,current) select magnet,voltage,current from magnet; this takes a few seconds (I'm using pg thru cygwin on an ancient p2 machine) (of course you can maintain integrity on the magnet field by using it as a foreign key referencing the master field, I don't think mysql has grasped the importance of foreign keys yet) The test field will act as a primary key for each set of results and give you an unbroken sequence numbered 1-500 000 (or how many records there are) to look at the results of a batch of tests 140001->140500 select * from magnet where test > 140000 limit 500 order by magnet returns the results Takes no time at all to look at the results of a particular magnet in this batch select * from master2 where test > 140000 and test <= 140500 and magnet = 1; Index Scan using master2_pkey on master2 (cost=0.00..16.25 rows=86 width=16) NOTICE: QUERY PLAN: Index Scan using master2_pkey on master2 (cost=0.00..16.25 rows=86 width=16) to look at the results of a particular magnet in this batch select * from master2 where test > 40000 and test <= 40500 and magnet = 1; to look at the results of a magnets in this batch select * from master2 where test > 40000 and test <= 40500 and magnet in (1,2); These results come back (subjectively)immediately Hope this helps Paul Butler > > The idea being that I have a group of values which > are related to the master index. Here I have 2 records > in my master table, and 3 groups of values for each > master record. The select goes like this > > select * from magnet where masterindex=1; > select * from magnet where masterindex=2; > > These look ups are very slow. One thing I should remind > you guys, is that the master table has over 50,000, > the magnet table has about 6 times that many entries, > 6 entries for each master record, unlike the example > above where I only have 3 entries per record. What I'm > wondering is that I need to make masterindex a real > index. The idea of course is that I want to do a select > on the master table with a join with the magnet table > so that I pull out a list of magnet current and voltage > settings for each master record. But right now I'm just > doing the queries on the magnet table. > > I hope all of the above makes sense. Cheers. Steve. > > > On Mon, 2002-12-02 at 16:56, Joel Burton wrote: > > On Mon, Dec 02, 2002 at 04:43:18PM -0500, Adler, Stephen wrote: > > > oh boy, I now realize how little I know about databases... > > > OK, so I'm switching from MySQL to postgresql. I have > > > this database in MySQL which used the enum data type, > > > which looks to be a MySQL extension since there are no > > > enum data types in postgresql. What I do is store a bunch > > > of values of the setting of a magnet and I have about 6 > > > entries per master record number. (i.e. I have a table > > > which I call the master table, one row entry per master record, > > > one table called the magnet table, which has many entries > > > per Master record entry.) > > > So what I do is loop over the master record entries, say > > > record numbers 5000-5300, and for each master record entry, > > > I look up in the magnet table data stored for that master > > > record. The deal is this, its really slow pulling out the > > > data from the magnet table. Like 15 seconds for 500 selects. > > > The equivalent select in MySQL ran at least 10 times faster. > > > > > > Any ideas of what I'm doing wrong? Did I give you guys > > > enough information so that you understand what I'm doing? > > > > Steve -- > > > > Why don't you post a dump of the create statements for your table, and > > the select statement that you think is running so slowly? > > > > -- > > > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > > Independent Knowledge Management Consultant > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
That should be: insert into master2(magnet,voltage,current) select masterindex,voltage,current from magnet; to create your new table Cheers Paul Butler
From: "Adler, Stephen" <adler@bnl.gov> To: Joel Burton <joel@joelburton.com> Copies to: pgsql-novice@postgresql.org Subject: Re: [NOVICE] its really SLOW!!!!! Date sent: 02 Dec 2002 17:11:45 -0500 Thinking again I got the wrong end of the stick again creating a new table create master2(test serial primary key,magnet integer,voltage integer,current integer); insert into master2(magnet,voltage,current) select masterindex,voltage,current from magnet; (I'm renaming your masterindex as magnet) CREATE INDEX idx_magnet ON master2 (magnet); Using 45000 distinct masters with ten voltage current sets each (ie searching 450000 records) the query would be: for a hundred record range: $ time psql -c "select magnet,voltage,current from master2 where magnet between 400 and 500 order b y magnet;" magnet real 0m0.724s user 0m0.123s sys 0m0.280s for a 500 record range $ time psql -c "select magnet,voltage,current from master2 where magnet between 1 and 500 order by magnet;" magnet real 0m14.073s user 0m0.155s sys 0m0.374s which is what you're getting, so I suppose this is no help whatsoever Are these timings pretty standard for pg? I'm as curious as you are now. Cheers Paul Butler > create master ( > masterindex integer not null primary key > ); > > create magnet ( > masterindex integer, > current integer, > voltage integer > ); > > insert into master values (1); > insert into master values (1); > insert into magnet values (1, 100, 100); > insert into magnet values (1, 102, 99); > insert into magnet values (1, 99,100); > insert into magnet values (2, 100, 101); > insert into magnet values (2, 99,103); > insert into magnet values (2, 100, 99); > > The idea being that I have a group of values which > are related to the master index. Here I have 2 records > in my master table, and 3 groups of values for each > master record. The select goes like this > > select * from magnet where masterindex=1; > select * from magnet where masterindex=2; > > These look ups are very slow. One thing I should remind > you guys, is that the master table has over 50,000, > the magnet table has about 6 times that many entries, > 6 entries for each master record, unlike the example > above where I only have 3 entries per record. What I'm > wondering is that I need to make masterindex a real > index. The idea of course is that I want to do a select > on the master table with a join with the magnet table > so that I pull out a list of magnet current and voltage > settings for each master record. But right now I'm just > doing the queries on the magnet table. > > I hope all of the above makes sense. Cheers. Steve. > > > On Mon, 2002-12-02 at 16:56, Joel Burton wrote: > > On Mon, Dec 02, 2002 at 04:43:18PM -0500, Adler, Stephen wrote: > > > oh boy, I now realize how little I know about databases... > > > OK, so I'm switching from MySQL to postgresql. I have > > > this database in MySQL which used the enum data type, > > > which looks to be a MySQL extension since there are no > > > enum data types in postgresql. What I do is store a bunch > > > of values of the setting of a magnet and I have about 6 > > > entries per master record number. (i.e. I have a table > > > which I call the master table, one row entry per master record, > > > one table called the magnet table, which has many entries > > > per Master record entry.) > > > So what I do is loop over the master record entries, say > > > record numbers 5000-5300, and for each master record entry, > > > I look up in the magnet table data stored for that master > > > record. The deal is this, its really slow pulling out the > > > data from the magnet table. Like 15 seconds for 500 selects. > > > The equivalent select in MySQL ran at least 10 times faster. > > > > > > Any ideas of what I'm doing wrong? Did I give you guys > > > enough information so that you understand what I'm doing? > > > > Steve -- > > > > Why don't you post a dump of the create statements for your table, and > > the select statement that you think is running so slowly? > > > > -- > > > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > > Independent Knowledge Management Consultant > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Paul and Michael, I want to thank you guys so much for your input! What I ended up doing was creating an index on magnet(masterindex) and my access time has dropped by a factor of 10! Indexing really works!!!!! Here's a general question, In mysql, I can define a non-unique index in the create table command. create table magnet ( masterindex int, current int, voltage int, key (masterindex) ); and the masterindex key could be non-unique. (i.e. there can be many entries in the magnet table with the same masterindex value, since masterindex refers to a unique masterindex value in another table.) When I went to fill this table, under psql, psql immediately gave me errors saying that magent(masterindex) was being filled with identical values. So, is there a way of defining an index as part of the create table command, or is this a mysql extention? After reading the chapter on Indexs in the users guide, there is no mention of creating an index as part of the create table, but rather an index is a separate entity used to aid the select command. Am I getting keys and indexes mixed up? (I assume they are the same thing....) Cheers. Steve. On Tue, 2002-12-03 at 04:52, paul butler wrote: > From: "Adler, Stephen" <adler@bnl.gov> > To: Joel Burton <joel@joelburton.com> > Copies to: pgsql-novice@postgresql.org > Subject: Re: [NOVICE] its really SLOW!!!!! > Date sent: 02 Dec 2002 17:11:45 -0500 > > > create master ( > > masterindex integer not null primary key > > ); > > > > create magnet ( > > masterindex integer, > > current integer, > > voltage integer > > ); > > > > insert into master values (1); > > insert into master values (1); > > insert into magnet values (1, 100, 100); > > insert into magnet values (1, 102, 99); > > insert into magnet values (1, 99,100); > > insert into magnet values (2, 100, 101); > > insert into magnet values (2, 99,103); > > insert into magnet values (2, 100, 99); > The first thing you notice is that that there is no primary key for the > magnet table > You could do away with the master table by creating > a new table: > > create master2(test serial primary key,magnet integer,voltage > integer,current integer); > > insert into master2(magnet,voltage,current) select > magnet,voltage,current from magnet; > > this takes a few seconds (I'm using pg thru cygwin on an ancient > p2 machine) > > (of course you can maintain integrity on the magnet field by using it > as a foreign key referencing the master field, I don't think mysql > has grasped the importance of foreign keys yet) > > The test field will act as a primary key for each set of results and > give you an unbroken sequence numbered 1-500 000 (or how many > records there are) > > to look at the results of a batch of tests 140001->140500 > > select * from magnet where test > 140000 limit 500 order by > magnet > returns the results > > > Takes no time at all > > to look at the results of a particular magnet in this batch > > select * from master2 where test > 140000 and test <= 140500 and > magnet = 1; > > Index Scan using master2_pkey on master2 (cost=0.00..16.25 > rows=86 width=16) > > NOTICE: QUERY PLAN: > > Index Scan using master2_pkey on master2 (cost=0.00..16.25 > rows=86 width=16) > > to look at the results of a particular magnet in this batch > > select * from master2 where test > 40000 and test <= 40500 and > magnet = 1; > > to look at the results of a magnets in this batch > > select * from master2 where test > 40000 and test <= 40500 and > magnet in (1,2); > > These results come back (subjectively)immediately > > Hope this helps > > Paul Butler > > > > > The idea being that I have a group of values which > > are related to the master index. Here I have 2 records > > in my master table, and 3 groups of values for each > > master record. The select goes like this > > > > select * from magnet where masterindex=1; > > select * from magnet where masterindex=2; > > > > These look ups are very slow. One thing I should remind > > you guys, is that the master table has over 50,000, > > the magnet table has about 6 times that many entries, > > 6 entries for each master record, unlike the example > > above where I only have 3 entries per record. What I'm > > wondering is that I need to make masterindex a real > > index. The idea of course is that I want to do a select > > on the master table with a join with the magnet table > > so that I pull out a list of magnet current and voltage > > settings for each master record. But right now I'm just > > doing the queries on the magnet table. > > > > I hope all of the above makes sense. Cheers. Steve. > > > > > > On Mon, 2002-12-02 at 16:56, Joel Burton wrote: > > > On Mon, Dec 02, 2002 at 04:43:18PM -0500, Adler, Stephen wrote: > > > > oh boy, I now realize how little I know about databases... > > > > OK, so I'm switching from MySQL to postgresql. I have > > > > this database in MySQL which used the enum data type, > > > > which looks to be a MySQL extension since there are no > > > > enum data types in postgresql. What I do is store a bunch > > > > of values of the setting of a magnet and I have about 6 > > > > entries per master record number. (i.e. I have a table > > > > which I call the master table, one row entry per master record, > > > > one table called the magnet table, which has many entries > > > > per Master record entry.) > > > > So what I do is loop over the master record entries, say > > > > record numbers 5000-5300, and for each master record entry, > > > > I look up in the magnet table data stored for that master > > > > record. The deal is this, its really slow pulling out the > > > > data from the magnet table. Like 15 seconds for 500 selects. > > > > The equivalent select in MySQL ran at least 10 times faster. > > > > > > > > Any ideas of what I'm doing wrong? Did I give you guys > > > > enough information so that you understand what I'm doing? > > > > > > Steve -- > > > > > > Why don't you post a dump of the create statements for your table, and > > > the select statement that you think is running so slowly? > > > > > > -- > > > > > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > > > Independent Knowledge Management Consultant > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 4: Don't 'kill -9' the postmaster > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
On Tue, 2002-12-03 at 09:53, Adler, Stephen wrote: > Paul and Michael, [snip] > > When I went to fill this table, under psql, psql immediately > gave me errors saying that magent(masterindex) was being > filled with identical values. So, is there a way of defining > an index as part of the create table command, or is this > a mysql extention? After reading the chapter on Indexs in Is there a pressing need to have it in only 1 statement? > the users guide, there is no mention of creating an index > as part of the create table, but rather an index is > a separate entity used to aid the select command. Am I > getting keys and indexes mixed up? (I assume they are > the same thing....) Pedanticly, a key and an index are radically different: - A key describes a set of records. - with Primary Key, that set must only contain 1 record. - Foreign Keys describe sets of 1 or more records that also have a Primary Key of the same value in a seperate table. - An index is a method speeding up access to data. Typical types of indexes are: - Hashes - Trees As you can see, a UNIQUE index (note that I am not describing how the index works) is perfect for *implementing* primary keys. Postgres takes a short cut, and implicitly creates a unique index when you define a primary key. "Duplicates allows" indexes are used by Postgres to implement foreign keys, but, as you see from magent(masterindex), they are usually used "just" to speed access. Ron -- +------------------------------------------------------------+ | Ron Johnson, Jr. mailto:ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "they love our milk and honey, but preach about another | | way of living" | | Merle Haggard, "The Fighting Side Of Me" | +------------------------------------------------------------+