Thread: Maximum number of tables per database and slowness

Maximum number of tables per database and slowness

From
Benjamin Arai
Date:
What is the current maximum number of tables per database?  Also, does
having more tables slow down performance in any way?

Re: Maximum number of tables per database and slowness

From
William Yu
Date:
Benjamin Arai wrote:
> What is the current maximum number of tables per database?  Also, does
> having more tables slow down performance in any way?

For most cases, the answer is no. However, once you get near 6 figure
table counts, pg_catalog ends up being pretty massive. The problem is
that the query planner must check pg_catalog for every query to see what
indexes are available, what the statistics & value distributions are,
etc. in order to build the optimal plan. At some point, a really large
pg_catalog can begin to bog down your system.
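
As a rough check -- just a sketch using nothing beyond the system
catalogs -- you can see how many user tables you have and how large the
catalog tables themselves have grown (relpages is counted in 8 KB blocks
and is only refreshed by VACUUM/ANALYZE):

    -- number of ordinary user tables
    SELECT count(*)
    FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r'
      AND n.nspname NOT IN ('pg_catalog', 'information_schema');

    -- largest system catalog tables
    SELECT c.relname, c.relpages
    FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'pg_catalog' AND c.relkind = 'r'
    ORDER BY c.relpages DESC
    LIMIT 10;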

See the performance list for a recent discussion about this very issue.

Performance large tables.

From
Benjamin Arai
Date:
My issue actually stems from the fact that I cannot run large weekly
updates fast enough to fit within a weekend window before the following
work week.  I am currently using a machine with RAID 1, 4GB RAM, and
dual Opterons.  I could go to RAID 0+1, but the performance increase is
only about 20% according to the benchmarks I have read.  ANY suggestions
would be greatly appreciated.


Re: Performance large tables.

From
Martijn van Oosterhout
Date:
On Sat, Dec 10, 2005 at 03:22:47PM -0800, Benjamin Arai wrote:
> My issue actually stems from the fact that I cannot run large weekly
> updates fast enough to fit within a weekend window before the following
> work week.  I am currently using a machine with RAID 1, 4GB RAM, and
> dual Opterons.  I could go to RAID 0+1, but the performance increase is
> only about 20% according to the benchmarks I have read.  ANY suggestions
> would be greatly appreciated.

Since you have given us no information to work with, you're not going
to get any specific recommendations either. If you can tell us the
things you do most commonly and are most performance sensitive about
(exact commands, timings, versions and settings please), we may be able
to help.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Performance large tables.

From
Benjamin Arai
Date:
To be more specific, there are two types of commands that are run on
the system.  There are application commands that do all different types
of joins and so on, but for the most part they are fast enough to meet
user expectations.  On the other hand there is a weekly update (this is
the problem) that updates all of the modified records for a bunch of
financial data, such as closing prices.  For the most part the records
are of the form name, date, value.  The update currently takes almost
two days.  The update does deletions, insertions, and updates depending
on what has happened during the previous week.

For the most part the updates are simple one-liners.  I currently commit
in large batches to increase performance, but it still takes a while, as
stated above.  From watching the computer's performance during an
update, the system is thrashing both memory and disk.  I am currently
using PostgreSQL 8.0.3.

Example command: "UPDATE data SET value = z WHERE name = x AND date = y;".
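
A weekly batch ends up looking roughly like this (the names and values
here are made up, but the shape is right):

    BEGIN;
    UPDATE data SET value = 101.25 WHERE name = 'ACME' AND date = '2005-12-02';
    INSERT INTO data (name, date, value) VALUES ('ACME', '2005-12-09', 103.10);
    DELETE FROM data WHERE name = 'ACME' AND date = '2005-11-25';
    -- ...many thousands more one-liners like these...
    COMMIT;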

Martijn van Oosterhout wrote:

>On Sat, Dec 10, 2005 at 03:22:47PM -0800, Benjamin Arai wrote:
>
>>My issue actually stems from the fact that I cannot run large weekly
>>updates fast enough to fit within a weekend window before the following
>>work week.  I am currently using a machine with RAID 1, 4GB RAM, and
>>dual Opterons.  I could go to RAID 0+1, but the performance increase is
>>only about 20% according to the benchmarks I have read.  ANY suggestions
>>would be greatly appreciated.
>
>Since you have given us no information to work with, you're not going
>to get any specific recommendations either. If you can tell us the
>things you do most commonly and are most performance sensitive about
>(exact commands, timings, versions and settings please), we may be able
>to help.

Re: Performance large tables.

From
Jaime Casanova
Date:
On 12/10/05, Benjamin Arai <barai@cs.ucr.edu> wrote:
> To be more specific, there are two types of commands that are run on
> the system.  There are application commands that do all different types
> of joins and so on, but for the most part they are fast enough to meet
> user expectations.  On the other hand there is a weekly update (this is
> the problem) that updates all of the modified records for a bunch of
> financial data, such as closing prices.  For the most part the records
> are of the form name, date, value.  The update currently takes almost
> two days.  The update does deletions, insertions, and updates depending
> on what has happened during the previous week.
>
> For the most part the updates are simple one-liners.  I currently commit
> in large batches to increase performance, but it still takes a while, as
> stated above.  From watching the computer's performance during an
> update, the system is thrashing both memory and disk.  I am currently
> using PostgreSQL 8.0.3.
>
> Example command: "UPDATE data SET value = z WHERE name = x AND date = y;".
>

Try using VACUUM or VACUUM FULL after those weekly updates...
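
For example, assuming the table from your example command is simply
called "data":

    VACUUM ANALYZE data;
    -- VACUUM FULL also returns the dead space to the operating system,
    -- but it takes an exclusive lock on the table while it runs:
    VACUUM FULL ANALYZE data;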

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

Re: Performance large tables.

From
Robert Treat
Date:
On Saturday 10 December 2005 19:28, Jaime Casanova wrote:
> On 12/10/05, Benjamin Arai <barai@cs.ucr.edu> wrote:
> > To be more specific, there are two types of commands that are run on
> > the system.  There are application commands that do all different types
> > of joins and so on, but for the most part they are fast enough to meet
> > user expectations.  On the other hand there is a weekly update (this is
> > the problem) that updates all of the modified records for a bunch of
> > financial data, such as closing prices.  For the most part the records
> > are of the form name, date, value.  The update currently takes almost
> > two days.  The update does deletions, insertions, and updates depending
> > on what has happened during the previous week.
> >
> > For the most part the updates are simple one-liners.  I currently commit
> > in large batches to increase performance, but it still takes a while, as
> > stated above.  From watching the computer's performance during an
> > update, the system is thrashing both memory and disk.  I am currently
> > using PostgreSQL 8.0.3.
> >
> > Example command: "UPDATE data SET value = z WHERE name = x AND date = y;".
>
> Try using VACUUM or VACUUM FULL after those weekly updates...

Once you've updated a majority of the rows in a large table, if you're still
doing further work on that table that involves complex conditionals, you'll
probably notice a slowdown, so you might even think about doing a vacuum
during the updates if you can swing it.
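
Since VACUUM can't run inside a transaction block, that means slipping
it in between the committed batches, roughly (table and values are
only illustrative):

    COMMIT;               -- finish the current batch
    VACUUM ANALYZE data;  -- reclaim dead tuples, refresh statistics
    BEGIN;                -- start the next batch
    UPDATE data SET value = 98.40 WHERE name = 'ACME' AND date = '2005-12-05';
    -- ...rest of the batch...
    COMMIT;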

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: Maximum number of tables per database and slowness

From
Tom Lane
Date:
William Yu <wyu@talisys.com> writes:
> Benjamin Arai wrote:
>> What is the current maximum number of tables per database?  Also, does
>> having more tables slow down performance in any way?

> For most cases, the answer is no. However, once you get near 6 figure
> table counts, pg_catalog ends up being pretty massive.

You also have to think about the performance implications of having tens
of thousands of files in your database directory.  While some newer
filesystems aren't fazed by that particularly, a lot of 'em bog down on
lookups when there are more than a few thousand entries in a directory.
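
Every table, index, and toast table gets at least one file of its own
under the database's subdirectory of $PGDATA/base (plus extra 1GB
segment files for large relations), so a rough count of what the
filesystem has to cope with is:

    SELECT count(*) AS relation_files
    FROM pg_class
    WHERE relkind IN ('r', 'i', 't');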

            regards, tom lane

Re: Performance large tables.

From
William Yu
Date:
Benjamin Arai wrote:
> For the most part the updates are simple one-liners.  I currently commit
> in large batches to increase performance, but it still takes a while, as
> stated above.  From watching the computer's performance during an
> update, the system is thrashing both memory and disk.  I am currently
> using PostgreSQL 8.0.3.
>
> Example command: "UPDATE data SET value = z WHERE name = x AND date = y;".

Before you start throwing the baby out with the bathwater by totally
revamping your DB architecture, try some simple debugging first to see
why these queries take a long time. Use EXPLAIN ANALYZE, test
vacuuming/analyzing mid-update, fiddle with postgresql.conf parameters
(the WAL/commit settings especially). Try committing with different
numbers of statements per transaction -- the optimal batch size won't
be the same across all development tools.
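
For instance (the values here are only illustrative):

    -- note: EXPLAIN ANALYZE actually executes the statement
    EXPLAIN ANALYZE
    UPDATE data SET value = 101.25 WHERE name = 'ACME' AND date = '2005-12-02';

If that shows a sequential scan over the whole table, a composite index
on the lookup columns is the first thing to try:

    CREATE INDEX data_name_date_idx ON data (name, date);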

My own experience is that periodic vacuuming & analyzing are very much
needed for batches of small update commands. For our batch processing,
autovacuum plus 1K-10K commit batches did the trick in keeping
performance up.

Re: Performance large tables.

From
Bruno Wolff III
Date:
Please start new threads to ask unrelated questions, rather than replying
to an existing thread. This makes the archives less useful, and may keep
people from seeing your question.

On Sat, Dec 10, 2005 at 15:37:01 -0800,
  Benjamin Arai <barai@cs.ucr.edu> wrote:
> To be more specific, there are two types of commands that are run on
> the system.  There are application commands that do all different types
> of joins and so on, but for the most part they are fast enough to meet
> user expectations.  On the other hand there is a weekly update (this is
> the problem) that updates all of the modified records for a bunch of
> financial data, such as closing prices.  For the most part the records
> are of the form name, date, value.  The update currently takes almost
> two days.  The update does deletions, insertions, and updates depending
> on what has happened during the previous week.
>
> For the most part the updates are simple one-liners.  I currently commit
> in large batches to increase performance, but it still takes a while, as
> stated above.  From watching the computer's performance during an
> update, the system is thrashing both memory and disk.  I am currently
> using PostgreSQL 8.0.3.
>
> Example command: "UPDATE data SET value = z WHERE name = x AND date = y;".

Another possibility is that foreign key checks may be slowing things down.
You may need to create some indexes on referencing tables if you are doing
deletes (or updates which change the key) in referenced tables.
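
For example, if some other table (call it "positions" -- the name is
made up) has a foreign key referencing data, PostgreSQL does not index
the referencing columns automatically, so every delete in "data" has to
scan "positions" to verify the constraint unless you add one yourself:

    CREATE INDEX positions_name_date_idx ON positions (name, date);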

Finding out more about which queries are slow and what plans they are using
is a good first step.

Re: Performance large tables.

From
Vivek Khera
Date:
On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote:

> For the most part the updates are simple one-liners.  I currently
> commit in large batches to increase performance, but it still takes
> a while, as stated above.  From watching the computer's performance
> during an update, the system is thrashing both memory and disk.  I
> am currently using PostgreSQL 8.0.3.

Then buy faster disks.  My current favorite is to use U320 15kRPM
disks with a dual-channel RAID controller, putting half the disks on one
channel and half on the other, mirroring them across channels, then
striping down the mirrors (i.e., RAID 10).

I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a
RAID1.