Thread: Select query takes long to execute

Select query takes long to execute

From

"Kevin Schroeder"

Date:

29 May 2003, 09:58:15

Hello,
    I'm running a simple query on a table and I'm getting a very long
response time.  The table has 56,000 rows in it.  It has a full text field,
but it is not being referenced in this query.  The query I'm running is

select row_key, column1, column2, column3, column4, column5 from table1
where column6 = 1 order by column3 desc limit 21;

There is an index on the table

message_index btree (column6, column3, column7)

Column 3 is a date type, column 6 is an integer and column 7 is unused in
this query.

The total query time is 6 seconds, but I can bring that down to 4.5 if I
append "offset 0" to the end of the query.  By checking query using "explain
analyze" it shows that it is using the index.

If anyone has any ideas as to why the query is taking so long and what I can
do to make it more efficient I would love to know.

Thanks
Kevin

Re: Select query takes long to execute

From

"Shridhar Daithankar"

Date:

29 May 2003, 10:25:44

On 29 May 2003 at 8:58, Kevin Schroeder wrote:
> If anyone has any ideas as to why the query is taking so long and what I can
> do to make it more efficient I would love to know.

Check yor shared buffers setting and effective OS cache setting. If these are
appropriately tuned, then it should be fast enough.

Is the table vacuumed? Is index taking too much space? Then try reindexing. It
might help as vacuum does not reclaim wasted space in index.

HTH

Bye
 Shridhar

--
Wait!  You have not been prepared!        -- Mr. Atoz, "Tomorrow is Yesterday",
stardate 3113.2

Re: Select query takes long to execute

From

"scott.marlowe"

Date:

29 May 2003, 12:18:32

See if lowering random_page_cost to 1.5 or so helps here.

That and effective_cache_size are two of the more important values the
planner uses to decide between seq scans and index scans.

On Thu, 29 May 2003, Kevin Schroeder wrote:

> Hello,
>     I'm running a simple query on a table and I'm getting a very long
> response time.  The table has 56,000 rows in it.  It has a full text field,
> but it is not being referenced in this query.  The query I'm running is
>
> select row_key, column1, column2, column3, column4, column5 from table1
> where column6 = 1 order by column3 desc limit 21;
>
> There is an index on the table
>
> message_index btree (column6, column3, column7)
>
> Column 3 is a date type, column 6 is an integer and column 7 is unused in
> this query.
>
> The total query time is 6 seconds, but I can bring that down to 4.5 if I
> append "offset 0" to the end of the query.  By checking query using "explain
> analyze" it shows that it is using the index.
>
> If anyone has any ideas as to why the query is taking so long and what I can
> do to make it more efficient I would love to know.
>
> Thanks
> Kevin
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

Re: Select query takes long to execute

From

"Nikolaus Dilger"

Date:

29 May 2003, 22:44:55

Kevin,

How about creating a new index just on column6?
That should be much more effective than the multicolumn
index.

Regards,
Nikolaus

On Thu, 29 May 2003 08:58:07 -0500, "Kevin Schroeder"
wrote:

>
> Hello,
>     I'm running a simple query on a table and I'm
> getting a very long
> response time.  The table has 56,000 rows in it.  It
> has a full text field,
> but it is not being referenced in this query.  The
> query I'm running is
>
> select row_key, column1, column2, column3, column4,
> column5 from table1
> where column6 = 1 order by column3 desc limit 21;
>
> There is an index on the table
>
> message_index btree (column6, column3, column7)
>
> Column 3 is a date type, column 6 is an integer and
> column 7 is unused in
> this query.
>
> The total query time is 6 seconds, but I can bring
that
> down to 4.5 if I
> append "offset 0" to the end of the query.  By
checking
> query using "explain
> analyze" it shows that it is using the index.
>
> If anyone has any ideas as to why the query is taking
> so long and what I can
> do to make it more efficient I would love to know.
>
> Thanks
> Kevin
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

Re: Select query takes long to execute

From

SZUCS Gábor

Date:

30 May 2003, 04:59:07

Nikolaus,

I think that shouldn't be any more effective. As I experienced, it's
irrelevant how many cols an index has as long as you only use the first
column. And, after that, if you use another column, how could a missing
second column be any better?

G.
------------------------------- cut here -------------------------------
----- Original Message -----
From: "Nikolaus Dilger" <nikolaus@dilger.cc>
Sent: Friday, May 30, 2003 4:44 AM


> Kevin,
>
> How about creating a new index just on column6?
> That should be much more effective than the multicolumn
> index.
>
> Regards,
> Nikolaus

Table Relationships

From

Jeandre du Toit

Date:

30 May 2003, 05:37:24

What are the advantages to having a database relational? I am currently
discussing the design of a database with some people at work and they
reckon it is best to create one table with and index and all the data
instead of normalizing the database. I think that they think that joining
tables will slow down retrieval, is this true?

Thanks
Jeandre

Re: Table Relationships

From

"Victor Yegorov"

Date:

30 May 2003, 06:08:49

* Jeandre du Toit <jeandre@itvs.co.za> [30.05.2003 12:57]:
>
> What are the advantages to having a database relational? I am currently
> discussing the design of a database with some people at work and they
> reckon it is best to create one table with and index and all the data
> instead of normalizing the database. I think that they think that joining
> tables will slow down retrieval, is this true?
>

Take a look at situation from another side.

Let's say: You own a store and have 3 customers and 5 products on your
store. All you going to keep in DB is track of all purchases.

So, each time a customer will by a product, an new record will be added.
What this means:

1. Customer's name will be repeated as many times, as many purchases he had
   made. The same for each of products. In real world, you'll have about
   10,000 customers and about 100,000 products. Do you have enoght space on
   your disks to store all that stuff?

2. Some of your customers decided to change it's name. What you're going to
   do? If you're going to insert new purchases of that customer with he's new
   name, then in all turnover reports you'll have to specify both:
   old name and new one. If he will hange his name again - again, all
   reports are to be updated.

There is much more stuff to read about Relational Data Model in books.

About slowing down retrieval of data: all efforts today are put to speed up
things. You should think about your convenience in data manipulation.


I suggest you should try both: one huge table, and a set of normalized
tables and compare, what is quicker and what is easier to use.

--

Victor Yegorov

Re: Table Relationships

From

Bruno Wolff III

Date:

30 May 2003, 08:10:04

On Fri, May 30, 2003 at 11:23:10 +0200,
  Jeandre du Toit <jeandre@itvs.co.za> wrote:
>

Don't reply to existing threads to start a new thread.

> What are the advantages to having a database relational? I am currently
> discussing the design of a database with some people at work and they
> reckon it is best to create one table with and index and all the data
> instead of normalizing the database. I think that they think that joining
> tables will slow down retrieval, is this true?

You might want to read some books on relational database theory.

Date and Pascal are two noted authors of books on relational database theory.

>
> Thanks
> Jeandre
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

Re: Select query takes long to execute

From

Bruno Wolff III

Date:

30 May 2003, 08:12:13

On Fri, May 30, 2003 at 10:57:33 +0200,
  SZUCS Gábor <surrano@mailbox.hu> wrote:
> Nikolaus,
>
> I think that shouldn't be any more effective. As I experienced, it's
> irrelevant how many cols an index has as long as you only use the first
> column. And, after that, if you use another column, how could a missing
> second column be any better?

Because the index will be more compact and reside on less disk blocks.
The planner also makes different guesses for the selectivity whne using
the first column of a multicolumn index as opposed to a single column
index.

Re: Table Relationships

From

Jeandre du Toit

Date:

30 May 2003, 10:31:15

On Fri, 30 May 2003, Bruno Wolff III wrote:

> On Fri, May 30, 2003 at 11:23:10 +0200,
>   Jeandre du Toit <jeandre@itvs.co.za> wrote:
> >
>
> Don't reply to existing threads to start a new thread.

Sorry about that, I did something screwy in Pine. I thought that it would
create a new mail.

>
> > What are the advantages to having a database relational? I am currently
> > discussing the design of a database with some people at work and they
> > reckon it is best to create one table with and index and all the data
> > instead of normalizing the database. I think that they think that joining
> > tables will slow down retrieval, is this true?
>
> You might want to read some books on relational database theory.
>
> Date and Pascal are two noted authors of books on relational database theory.

Thanks, I will look at these books.

>
> >
> > Thanks
> > Jeandre
> >
> >

Re: Table Relationships

From

Josh Berkus

Date:

30 May 2003, 12:07:17

Jeandre,

> instead of normalizing the database. I think that they think that joining
> tables will slow down retrieval, is this true?

No, it's not.   I'm afraid that your co-workers learned their computer
knowledge 10 years ago and have not kept up to date.  They may need
retraining.

Modern database systems, especially PostgreSQL, are much faster with a proper
relational schema than with an inadequate flat-file table, due to the
efficient storage of data ... i.e., no redundancy.

I highly suggest that you take a look at the book "Database Design for Mere
Mortals"; if you're asking the question you posted, you are nowhere near
ready to build a production database application.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Table Relationships

From

Jeandre du Toit

Date:

30 May 2003, 12:22:06

On Fri, 30 May 2003, Josh Berkus wrote:

> Jeandre,
>
> > instead of normalizing the database. I think that they think that joining
> > tables will slow down retrieval, is this true?
>
> No, it's not.   I'm afraid that your co-workers learned their computer
> knowledge 10 years ago and have not kept up to date.  They may need
> retraining.

Thought as much

>
> Modern database systems, especially PostgreSQL, are much faster with a proper
> relational schema than with an inadequate flat-file table, due to the
> efficient storage of data ... i.e., no redundancy.

That is what I thought, but since they out rank me at work I needed the
extra conformation. Now at least I can show them that I am not the only
person that thinks a flat table structure is stone age design. I know for a
fact it is better on Sybase, but I wasn't to sure about postgres and since
they have been working on it for longer than I have, I am expected to
follow their lead.

>
> I highly suggest that you take a look at the book "Database Design for Mere
> Mortals"; if you're asking the question you posted, you are nowhere near
> ready to build a production database application.
>

Thanks, I will have a look at that book. You are right, I am only first
year Bsc, but I had a feeling that the facts they are giving me can't be
right, it just didn't make any sense. They way I figured it, is that
having a relational database, makes the database smaller because there is
no duplicate data, which should make it faster.

Thanks for your help. I will approach my managers.
Jeandre

Re: Table Relationships

From

Andrew Sullivan

Date:

30 May 2003, 12:39:04

On Fri, May 30, 2003 at 09:06:43AM -0700, Josh Berkus wrote:
> Modern database systems, especially PostgreSQL, are much faster with a proper
> relational schema than with an inadequate flat-file table, due to the
> efficient storage of data ... i.e., no redundancy.

Are you sure you want to say it that strongly?  After all, if you
have a data set which needs always to be returned in the same static
format, why not just denormalise it?  It's sure faster that way in
every system I've ever encountered.

It's only when you actually have relations to cope with that it
ceases to be an advantage.  So, as usual, it depends on what you're
trying to do.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Re: Table Relationships

From

Josh Berkus

Date:

30 May 2003, 13:03:53

Andrew,

> Are you sure you want to say it that strongly?  After all, if you
> have a data set which needs always to be returned in the same static
> format, why not just denormalise it?  It's sure faster that way in
> every system I've ever encountered.
>
> It's only when you actually have relations to cope with that it
> ceases to be an advantage.  So, as usual, it depends on what you're
> trying to do.

Yeah, I suppose so ... if all they're doing is reporting on a static set of
data which is not transactional ... sure.  If it's a disposable,
limited-time-use application.

However, I have yet to see in my professional experience any application that
was really this way and stayed this way once it was in use ... relations have
a way of creeping in, and planning for them is less messy than refactoring.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Table Relationships

From

"scott.marlowe"

Date:

30 May 2003, 13:37:20

On Fri, 30 May 2003, Josh Berkus wrote:

> Andrew,
>
> > Are you sure you want to say it that strongly?  After all, if you
> > have a data set which needs always to be returned in the same static
> > format, why not just denormalise it?  It's sure faster that way in
> > every system I've ever encountered.
> >
> > It's only when you actually have relations to cope with that it
> > ceases to be an advantage.  So, as usual, it depends on what you're
> > trying to do.
>
> Yeah, I suppose so ... if all they're doing is reporting on a static set of
> data which is not transactional ... sure.  If it's a disposable,
> limited-time-use application.
>
> However, I have yet to see in my professional experience any application that
> was really this way and stayed this way once it was in use ... relations have
> a way of creeping in, and planning for them is less messy than refactoring.

My philosophy has been you store the data normalized, and denormalize it
for performance down the line.

but denormalizing for storage is usually a bad idea, as it allows your
data to get filled with inconsistencies.

It's funny how people start worrying about performance of flat versus
normalized before really looking at the difference between the two first.
On Postgresql and most other databases, there are far more important
concerns to worry about when it comes to performance than whether or not
you're joining a couple tables.

Re: Table Relationships

From

Andrew Sullivan

Date:

30 May 2003, 14:18:59

On Fri, May 30, 2003 at 11:20:33AM -0600, scott.marlowe wrote:

> but denormalizing for storage is usually a bad idea, as it allows your
> data to get filled with inconsistencies.

Sure, but if performance is an important goal for certain kinds of
SELECTs, using a trigger at insert or update to do denormalising is
perhaps an acceptable approach.  It's obvious that in most cases,
denormalising instead of optimising your normalisation is silly.  But
if you need something to return in, say, 2ms most of the time, and it
requires a wide variety of data, denormalising is a good idea.

It is, of course, contrary to the RDBMS-y mind to denormalise.  But
there are (rare) times when it's a good idea, and I hate to see it
rejected out of hand in such cases.

A
--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Re: Table Relationships

From

Josh Berkus

Date:

30 May 2003, 15:55:04

Andrew,

> Sure, but if performance is an important goal for certain kinds of
> SELECTs, using a trigger at insert or update to do denormalising is
> perhaps an acceptable approach.  It's obvious that in most cases,
> denormalising instead of optimising your normalisation is silly.  But
> if you need something to return in, say, 2ms most of the time, and it
> requires a wide variety of data, denormalising is a good idea.

I've done this plenty of times ... but what you're talking about is more of a
"materialized view" than denormalized data.   The data is still stored in
normal form; it is just distilled for a particular view and saved on disk for
quick reference.  This is often a good approach with performance-sensitive,
complex databases.

> It is, of course, contrary to the RDBMS-y mind to denormalise.  But
> there are (rare) times when it's a good idea, and I hate to see it
> rejected out of hand in such cases.

There is a big difference between denormalizing normalized data and storing
your data in denormalized (basically flat file) form in the first place.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

Re: Table Relationships

From

Andreas Kostyrka

Date:

12 June 2003, 06:32:37

On Fri, 2003-05-30 at 12:10, Victor Yegorov wrote:
> Take a look at situation from another side.
>
> Let's say: You own a store and have 3 customers and 5 products on your
> store. All you going to keep in DB is track of all purchases.
>
> So, each time a customer will by a product, an new record will be added.
> What this means:
>
> 1. Customer's name will be repeated as many times, as many purchases he had
>    made. The same for each of products. In real world, you'll have about
>    10,000 customers and about 100,000 products. Do you have enoght space on
>    your disks to store all that stuff?
Well, to play the devil's advocate, to do it correctly, you should
probably store the customer data duplicate (one in the main record, and
once in the purchase order). If you do not, you'll get an ERP system
that is incapable to reproduce work done, which is basically a BAD
THING(tm) :)


> 2. Some of your customers decided to change it's name. What you're going to
>    do? If you're going to insert new purchases of that customer with he's new
>    name, then in all turnover reports you'll have to specify both:
>    old name and new one. If he will hange his name again - again, all
>    reports are to be updated.
Well, again, a purchase order should keep records -> it shouldn't
magically change the name or address of the customer, just because the
customer moved.

Andreas

Attachment

signature.asc