Thread: Select query takes long to execute
Hello, I'm running a simple query on a table and I'm getting a very long response time. The table has 56,000 rows in it. It has a full text field, but it is not being referenced in this query. The query I'm running is select row_key, column1, column2, column3, column4, column5 from table1 where column6 = 1 order by column3 desc limit 21; There is an index on the table message_index btree (column6, column3, column7) Column 3 is a date type, column 6 is an integer and column 7 is unused in this query. The total query time is 6 seconds, but I can bring that down to 4.5 if I append "offset 0" to the end of the query. By checking query using "explain analyze" it shows that it is using the index. If anyone has any ideas as to why the query is taking so long and what I can do to make it more efficient I would love to know. Thanks Kevin
On 29 May 2003 at 8:58, Kevin Schroeder wrote: > If anyone has any ideas as to why the query is taking so long and what I can > do to make it more efficient I would love to know. Check yor shared buffers setting and effective OS cache setting. If these are appropriately tuned, then it should be fast enough. Is the table vacuumed? Is index taking too much space? Then try reindexing. It might help as vacuum does not reclaim wasted space in index. HTH Bye Shridhar -- Wait! You have not been prepared! -- Mr. Atoz, "Tomorrow is Yesterday", stardate 3113.2
See if lowering random_page_cost to 1.5 or so helps here. That and effective_cache_size are two of the more important values the planner uses to decide between seq scans and index scans. On Thu, 29 May 2003, Kevin Schroeder wrote: > Hello, > I'm running a simple query on a table and I'm getting a very long > response time. The table has 56,000 rows in it. It has a full text field, > but it is not being referenced in this query. The query I'm running is > > select row_key, column1, column2, column3, column4, column5 from table1 > where column6 = 1 order by column3 desc limit 21; > > There is an index on the table > > message_index btree (column6, column3, column7) > > Column 3 is a date type, column 6 is an integer and column 7 is unused in > this query. > > The total query time is 6 seconds, but I can bring that down to 4.5 if I > append "offset 0" to the end of the query. By checking query using "explain > analyze" it shows that it is using the index. > > If anyone has any ideas as to why the query is taking so long and what I can > do to make it more efficient I would love to know. > > Thanks > Kevin > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
Kevin, How about creating a new index just on column6? That should be much more effective than the multicolumn index. Regards, Nikolaus On Thu, 29 May 2003 08:58:07 -0500, "Kevin Schroeder" wrote: > > Hello, > I'm running a simple query on a table and I'm > getting a very long > response time. The table has 56,000 rows in it. It > has a full text field, > but it is not being referenced in this query. The > query I'm running is > > select row_key, column1, column2, column3, column4, > column5 from table1 > where column6 = 1 order by column3 desc limit 21; > > There is an index on the table > > message_index btree (column6, column3, column7) > > Column 3 is a date type, column 6 is an integer and > column 7 is unused in > this query. > > The total query time is 6 seconds, but I can bring that > down to 4.5 if I > append "offset 0" to the end of the query. By checking > query using "explain > analyze" it shows that it is using the index. > > If anyone has any ideas as to why the query is taking > so long and what I can > do to make it more efficient I would love to know. > > Thanks > Kevin > > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
Nikolaus, I think that shouldn't be any more effective. As I experienced, it's irrelevant how many cols an index has as long as you only use the first column. And, after that, if you use another column, how could a missing second column be any better? G. ------------------------------- cut here ------------------------------- ----- Original Message ----- From: "Nikolaus Dilger" <nikolaus@dilger.cc> Sent: Friday, May 30, 2003 4:44 AM > Kevin, > > How about creating a new index just on column6? > That should be much more effective than the multicolumn > index. > > Regards, > Nikolaus
What are the advantages to having a database relational? I am currently discussing the design of a database with some people at work and they reckon it is best to create one table with and index and all the data instead of normalizing the database. I think that they think that joining tables will slow down retrieval, is this true? Thanks Jeandre
* Jeandre du Toit <jeandre@itvs.co.za> [30.05.2003 12:57]: > > What are the advantages to having a database relational? I am currently > discussing the design of a database with some people at work and they > reckon it is best to create one table with and index and all the data > instead of normalizing the database. I think that they think that joining > tables will slow down retrieval, is this true? > Take a look at situation from another side. Let's say: You own a store and have 3 customers and 5 products on your store. All you going to keep in DB is track of all purchases. So, each time a customer will by a product, an new record will be added. What this means: 1. Customer's name will be repeated as many times, as many purchases he had made. The same for each of products. In real world, you'll have about 10,000 customers and about 100,000 products. Do you have enoght space on your disks to store all that stuff? 2. Some of your customers decided to change it's name. What you're going to do? If you're going to insert new purchases of that customer with he's new name, then in all turnover reports you'll have to specify both: old name and new one. If he will hange his name again - again, all reports are to be updated. There is much more stuff to read about Relational Data Model in books. About slowing down retrieval of data: all efforts today are put to speed up things. You should think about your convenience in data manipulation. I suggest you should try both: one huge table, and a set of normalized tables and compare, what is quicker and what is easier to use. -- Victor Yegorov
On Fri, May 30, 2003 at 11:23:10 +0200, Jeandre du Toit <jeandre@itvs.co.za> wrote: > Don't reply to existing threads to start a new thread. > What are the advantages to having a database relational? I am currently > discussing the design of a database with some people at work and they > reckon it is best to create one table with and index and all the data > instead of normalizing the database. I think that they think that joining > tables will slow down retrieval, is this true? You might want to read some books on relational database theory. Date and Pascal are two noted authors of books on relational database theory. > > Thanks > Jeandre > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
On Fri, May 30, 2003 at 10:57:33 +0200, SZUCS Gábor <surrano@mailbox.hu> wrote: > Nikolaus, > > I think that shouldn't be any more effective. As I experienced, it's > irrelevant how many cols an index has as long as you only use the first > column. And, after that, if you use another column, how could a missing > second column be any better? Because the index will be more compact and reside on less disk blocks. The planner also makes different guesses for the selectivity whne using the first column of a multicolumn index as opposed to a single column index.
On Fri, 30 May 2003, Bruno Wolff III wrote: > On Fri, May 30, 2003 at 11:23:10 +0200, > Jeandre du Toit <jeandre@itvs.co.za> wrote: > > > > Don't reply to existing threads to start a new thread. Sorry about that, I did something screwy in Pine. I thought that it would create a new mail. > > > What are the advantages to having a database relational? I am currently > > discussing the design of a database with some people at work and they > > reckon it is best to create one table with and index and all the data > > instead of normalizing the database. I think that they think that joining > > tables will slow down retrieval, is this true? > > You might want to read some books on relational database theory. > > Date and Pascal are two noted authors of books on relational database theory. Thanks, I will look at these books. > > > > > Thanks > > Jeandre > > > >
Jeandre, > instead of normalizing the database. I think that they think that joining > tables will slow down retrieval, is this true? No, it's not. I'm afraid that your co-workers learned their computer knowledge 10 years ago and have not kept up to date. They may need retraining. Modern database systems, especially PostgreSQL, are much faster with a proper relational schema than with an inadequate flat-file table, due to the efficient storage of data ... i.e., no redundancy. I highly suggest that you take a look at the book "Database Design for Mere Mortals"; if you're asking the question you posted, you are nowhere near ready to build a production database application. -- Josh Berkus Aglio Database Solutions San Francisco
On Fri, 30 May 2003, Josh Berkus wrote: > Jeandre, > > > instead of normalizing the database. I think that they think that joining > > tables will slow down retrieval, is this true? > > No, it's not. I'm afraid that your co-workers learned their computer > knowledge 10 years ago and have not kept up to date. They may need > retraining. Thought as much > > Modern database systems, especially PostgreSQL, are much faster with a proper > relational schema than with an inadequate flat-file table, due to the > efficient storage of data ... i.e., no redundancy. That is what I thought, but since they out rank me at work I needed the extra conformation. Now at least I can show them that I am not the only person that thinks a flat table structure is stone age design. I know for a fact it is better on Sybase, but I wasn't to sure about postgres and since they have been working on it for longer than I have, I am expected to follow their lead. > > I highly suggest that you take a look at the book "Database Design for Mere > Mortals"; if you're asking the question you posted, you are nowhere near > ready to build a production database application. > Thanks, I will have a look at that book. You are right, I am only first year Bsc, but I had a feeling that the facts they are giving me can't be right, it just didn't make any sense. They way I figured it, is that having a relational database, makes the database smaller because there is no duplicate data, which should make it faster. Thanks for your help. I will approach my managers. Jeandre
On Fri, May 30, 2003 at 09:06:43AM -0700, Josh Berkus wrote: > Modern database systems, especially PostgreSQL, are much faster with a proper > relational schema than with an inadequate flat-file table, due to the > efficient storage of data ... i.e., no redundancy. Are you sure you want to say it that strongly? After all, if you have a data set which needs always to be returned in the same static format, why not just denormalise it? It's sure faster that way in every system I've ever encountered. It's only when you actually have relations to cope with that it ceases to be an advantage. So, as usual, it depends on what you're trying to do. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
Andrew, > Are you sure you want to say it that strongly? After all, if you > have a data set which needs always to be returned in the same static > format, why not just denormalise it? It's sure faster that way in > every system I've ever encountered. > > It's only when you actually have relations to cope with that it > ceases to be an advantage. So, as usual, it depends on what you're > trying to do. Yeah, I suppose so ... if all they're doing is reporting on a static set of data which is not transactional ... sure. If it's a disposable, limited-time-use application. However, I have yet to see in my professional experience any application that was really this way and stayed this way once it was in use ... relations have a way of creeping in, and planning for them is less messy than refactoring. -- Josh Berkus Aglio Database Solutions San Francisco
On Fri, 30 May 2003, Josh Berkus wrote: > Andrew, > > > Are you sure you want to say it that strongly? After all, if you > > have a data set which needs always to be returned in the same static > > format, why not just denormalise it? It's sure faster that way in > > every system I've ever encountered. > > > > It's only when you actually have relations to cope with that it > > ceases to be an advantage. So, as usual, it depends on what you're > > trying to do. > > Yeah, I suppose so ... if all they're doing is reporting on a static set of > data which is not transactional ... sure. If it's a disposable, > limited-time-use application. > > However, I have yet to see in my professional experience any application that > was really this way and stayed this way once it was in use ... relations have > a way of creeping in, and planning for them is less messy than refactoring. My philosophy has been you store the data normalized, and denormalize it for performance down the line. but denormalizing for storage is usually a bad idea, as it allows your data to get filled with inconsistencies. It's funny how people start worrying about performance of flat versus normalized before really looking at the difference between the two first. On Postgresql and most other databases, there are far more important concerns to worry about when it comes to performance than whether or not you're joining a couple tables.
On Fri, May 30, 2003 at 11:20:33AM -0600, scott.marlowe wrote: > but denormalizing for storage is usually a bad idea, as it allows your > data to get filled with inconsistencies. Sure, but if performance is an important goal for certain kinds of SELECTs, using a trigger at insert or update to do denormalising is perhaps an acceptable approach. It's obvious that in most cases, denormalising instead of optimising your normalisation is silly. But if you need something to return in, say, 2ms most of the time, and it requires a wide variety of data, denormalising is a good idea. It is, of course, contrary to the RDBMS-y mind to denormalise. But there are (rare) times when it's a good idea, and I hate to see it rejected out of hand in such cases. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
Andrew, > Sure, but if performance is an important goal for certain kinds of > SELECTs, using a trigger at insert or update to do denormalising is > perhaps an acceptable approach. It's obvious that in most cases, > denormalising instead of optimising your normalisation is silly. But > if you need something to return in, say, 2ms most of the time, and it > requires a wide variety of data, denormalising is a good idea. I've done this plenty of times ... but what you're talking about is more of a "materialized view" than denormalized data. The data is still stored in normal form; it is just distilled for a particular view and saved on disk for quick reference. This is often a good approach with performance-sensitive, complex databases. > It is, of course, contrary to the RDBMS-y mind to denormalise. But > there are (rare) times when it's a good idea, and I hate to see it > rejected out of hand in such cases. There is a big difference between denormalizing normalized data and storing your data in denormalized (basically flat file) form in the first place. -- -Josh Berkus Aglio Database Solutions San Francisco
On Fri, 2003-05-30 at 12:10, Victor Yegorov wrote: > Take a look at situation from another side. > > Let's say: You own a store and have 3 customers and 5 products on your > store. All you going to keep in DB is track of all purchases. > > So, each time a customer will by a product, an new record will be added. > What this means: > > 1. Customer's name will be repeated as many times, as many purchases he had > made. The same for each of products. In real world, you'll have about > 10,000 customers and about 100,000 products. Do you have enoght space on > your disks to store all that stuff? Well, to play the devil's advocate, to do it correctly, you should probably store the customer data duplicate (one in the main record, and once in the purchase order). If you do not, you'll get an ERP system that is incapable to reproduce work done, which is basically a BAD THING(tm) :) > 2. Some of your customers decided to change it's name. What you're going to > do? If you're going to insert new purchases of that customer with he's new > name, then in all turnover reports you'll have to specify both: > old name and new one. If he will hange his name again - again, all > reports are to be updated. Well, again, a purchase order should keep records -> it shouldn't magically change the name or address of the customer, just because the customer moved. Andreas