Thread: What's faster?

What's faster?

From

"Keith Bottner"

Date:

26 December 2003, 20:12:09

I have a database where the vast majority of information that is related to a customer never changes. However, there is a single field (i.e. balance) that changes potentially tens to hundreds of times per day per customer (customers ranging in the 1000s to 10000s). This information is not indexed. Because Postgres requires VACUUM ANALYZE more frequently on updated tables, should I break this single field out into its own table, and if so what kind of a speed up can I expect to achieve. I would be appreciative of any guidance offered.

BTW, currently using Postgres 7.3.4

Keith

Re: What's faster?

From

Tom Lane

Date:

26 December 2003, 20:49:55

"Keith Bottner" <kbottner@comcast.net> writes:
> I have a database where the vast majority of information that is related to
> a customer never changes. However, there is a single field (i.e. balance)
> that changes potentially tens to hundreds of times per day per customer
> (customers ranging in the 1000s to 10000s). This information is not indexed.
> Because Postgres requires VACUUM ANALYZE more frequently on updated tables,
> should I break this single field out into its own table,

Very likely a good idea, if the primary key that you'd need to add to
identify the balance is narrow.  Hard to say exactly how large the
benefit would be, but I'd think the update costs would be reduced
considerably.

            regards, tom lane

Re: What's faster?

From

Mike Nolan

Date:

26 December 2003, 21:07:13

> Because Postgres requires VACUUM ANALYZE more frequently on updated tables,
> should I break this single field out into its own table, and if so what kind
> of a speed up can I expect to achieve. I would be appreciative of any
> guidance offered.

Unless that field is part of the key, I wouldn't think that a vacuum
analyze would be needed, as the key distribution isn't changing.

I don't know if that is still true if that field is indexed.  Tom?

Even then, as I understand things vacuum analyze doesn't rebuild indexes,
so I could see a need to drop and rebuild indexes on a regular basis,
even if you move that field into a separate table.
--
Mike Nolan

Re: What's faster?

From

Tom Lane

Date:

27 December 2003, 00:01:00

Mike Nolan <nolan@gw.tssi.com> writes:
>> Because Postgres requires VACUUM ANALYZE more frequently on updated tables,
>> should I break this single field out into its own table, and if so what kind
>> of a speed up can I expect to achieve. I would be appreciative of any
>> guidance offered.

> Unless that field is part of the key, I wouldn't think that a vacuum
> analyze would be needed, as the key distribution isn't changing.

The "analyze" wouldn't matter ... but the "vacuum" would.  He needs to
get rid of the dead rows in a timely fashion.  The wider the rows, the
more disk space is at stake.

Also, if he has more than just a primary index on the main table,
the cost of updating the secondary indexes must be considered.
A balance-only table would presumably have just one index to update.

Against all this you have to weigh the cost of doing a join to get the
balance, so it's certainly not a no-brainer choice.  But I think it's
surely worth considering such a design.

            regards, tom lane

Re: What's faster?

From

"D'Arcy J.M. Cain"

Date:

27 December 2003, 06:52:56

On December 26, 2003 07:11 pm, Keith Bottner wrote:
> I have a database where the vast majority of information that is related to
> a customer never changes. However, there is a single field (i.e. balance)
> that changes potentially tens to hundreds of times per day per customer
> (customers ranging in the 1000s to 10000s). This information is not
> indexed. Because Postgres requires VACUUM ANALYZE more frequently on
> updated tables, should I break this single field out into its own table,
> and if so what kind of a speed up can I expect to achieve. I would be
> appreciative of any guidance offered.

We went through this recently.  One thing we found that may apply to you is
how many fields in the client record have a foreign key constraint.  We find
that tables with lots of FKeys are a lot more intensive on updates.  In our
case it was another table, think of it as an order or header table with a
balance, that has over 10 million records.  Sometimes we have 200,000
transactions a day where we have to check the balance.  We eventually moved
every field that could possibly be updated on a regular basis out to separate
tables.  The improvement was dramatic.

--
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.

Re: What's faster?

From

Tom Lane

Date:

27 December 2003, 14:31:33

"D'Arcy J.M. Cain" <darcy@druid.net> writes:
> We went through this recently.  One thing we found that may apply to you is
> how many fields in the client record have a foreign key constraint.  We find
> that tables with lots of FKeys are a lot more intensive on updates.

BTW, this should have gotten better in 7.3.4 and later --- there is
logic to skip checking an FKey reference if the referencing columns
didn't change during the update.

            regards, tom lane