Thread: Postgresql capabilities question

Postgresql capabilities question

From: "John Wells"
I have a M$ Sql Server db that I'm porting to postgresql.  Approx. 24
tables from this old db can be combined in the new database into one
table, and it would be a bit more elegant to do this.

However, the combined table would be around 95000 rows in size.
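
Roughly the shape I have in mind (the names below are hypothetical, just to
illustrate): fold the ~24 old tables into a single table with a column
recording which old table each row came from, plus the columns they share.

create table combined_items (
    id      serial primary key,
    source  varchar(32) not null,  -- which of the old tables the row came from
    name    text,
    value   text
);
create index combined_items_source_idx on combined_items (source);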

Having never really used Postgresql in the past, and unable to find a
datapoint on the web, I would really like to get input from current users.
Is this an unreasonable table size from which to expect good performance when
the PHP app driving it gets a reasonable amount of traffic?  I know
performance is also heavily dependent on indexes and query structure, but
disregarding both of those for the sake of argument, would I be better off
keeping the tables separate, or is 95000 not something to worry about?
Btw, most tables in this database are quite small (<2000 rows).  My redesign
would create two tables in the 90,000+ range, but less than 100,000.

Thanks very much for your input.

John


Re: Postgresql capabilities question

From: Steve Atkins
On Wed, Apr 02, 2003 at 07:33:46PM -0500, John Wells wrote:
> I have a M$ Sql Server db that I'm porting to postgresql.  Approx. 24
> tables from this old db can be combined in the new database into one
> table, and it would be a bit more elegant to do this.
>
> However, the combined table would be around 95000 rows in size.
>
> Having never really used Postgresql in the past, and unable to find a
> datapoint on the web, I would really like to get input from current users.
>  Is this an unreasonable table size to expect good performance when the
> PHP app driving it gets a reasonable amount of traffic?  I know
> performance is also heavily dependent on indexes and query structure, but
> disregarding either of those for the sake of argument, would I be better
> off keeping the tables separate, or is 95000 not something to worry about?
>  btw, most tables in this database are quite small (<2000).  My redesign
> would create two tables in the +90000 range, but less than 100000.
>
> Thanks very much for your input.

I have a number of 1,000,000-plus row tables (very plus in some cases)
running on some nasty low-end hardware (Celerons with 5400rpm IDE drives,
Netras), and performance is quite adequate for typical use.

Cheers,
  Steve


Re: Postgresql capabilities question

From: Ryan Mahoney
You're absolutely correct that there are *many* other factors that
determine performance aside from row count.  That being said, I have
tables with over a million entries on actively used systems that perform
really well on queries that utilize an index, and acceptably well on
queries that require a sequential scan.
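
For example, something like this (made-up table and column names, just to
show the pattern):

create index orders_customer_id_idx on orders (customer_id);

-- selective, can use the index: fast even on a million-row table
select * from orders where customer_id = 1234;

-- a leading-wildcard match can't use an index: forces a sequential scan
select * from orders where note like '%timeout%';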

> Having never really used Postgresql in the past, and unable to find a
> datapoint on the web, I would really like to get input from current users.
>  Is this an unreasonable table size to expect good performance when the
> PHP app driving it gets a reasonable amount of traffic?  I know
> performance is also heavily dependent on indexes and query structure, but
> disregarding either of those for the sake of argument, would I be better
> off keeping the tables separate, or is 95000 not something to worry about?
>  btw, most tables in this database are quite small (<2000).  My redesign
> would create two tables in the +90000 range, but less than 100000.


Re: Postgresql capabilities question

From: Lincoln Yeoh
If the queries are selective and typically indexed, or you have enough RAM,
then I'd say there won't be much difference between splitting the tables and
combining them into one.

95000 rows could be considered small.  You might even have enough RAM to
cache the whole DB; you can estimate that from the DB's footprint on disk.
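
A rough way to get that footprint from inside the database (assuming the
default 8 kB block size) is to add up the pages pg_class knows about; run a
VACUUM ANALYZE first so relpages is reasonably current:

-- 'r' = tables, 'i' = indexes, 't' = TOAST tables
select sum(relpages) * 8192 as approx_bytes
from pg_class
where relkind in ('r', 'i', 't');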

Link.

At 07:33 PM 4/2/03 -0500, John Wells wrote:

>I have a M$ Sql Server db that I'm porting to postgresql.  Approx. 24
>tables from this old db can be combined in the new database into one
>table, and it would be a bit more elegant to do this.
>
>However, the combined table would be around 95000 rows in size.
>
>PHP app driving it gets a reasonable amount of traffic?  I know
>performance is also heavily dependent on indexes and query structure, but
>disregarding either of those for the sake of argument, would I be better
>off keeping the tables separate, or is 95000 not something to worry about?
>  btw, most tables in this database are quite small (<2000).  My redesign
>would create two tables in the +90000 range, but less than 100000.


Re: Postgresql capabilities question

From: "Nigel J. Andrews"
On Wed, 2 Apr 2003, Steve Atkins wrote:

> On Wed, Apr 02, 2003 at 07:33:46PM -0500, John Wells wrote:
> > I have a M$ Sql Server db that I'm porting to postgresql.  Approx. 24
> > tables from this old db can be combined in the new database into one
> > table, and it would be a bit more elegant to do this.
> >
> > However, the combined table would be around 95000 rows in size.

Almost laughably small :)

> >
> > Having never really used Postgresql in the past, and unable to find a
> > datapoint on the web, I would really like to get input from current users.
> >  Is this an unreasonable table size to expect good performance when the
> > PHP app driving it gets a reasonable amount of traffic?  I know
> > performance is also heavily dependent on indexes and query structure, but
> > disregarding either of those for the sake of argument, would I be better
> > off keeping the tables separate, or is 95000 not something to worry about?
> >  btw, most tables in this database are quite small (<2000).  My redesign
> > would create two tables in the +90000 range, but less than 100000.
> >
> > Thanks very much for your input.
>
> I have a number of 1,000,000-plus row tables (very plus in some cases)
> running on some nasty low-end (Celerons with 5400rpm IDE drives, Netras)
> and performance is quite adequate for typical use.
>

Yeah, it's those sequential scans and tsearch index scans that kill it, but
selective queries fly.


--
Nigel J. Andrews


Re: Postgresql capabilities question

From: "scott.marlowe"
On Wed, 2 Apr 2003, John Wells wrote:

> I have a M$ Sql Server db that I'm porting to postgresql.  Approx. 24
> tables from this old db can be combined in the new database into one
> table, and it would be a bit more elegant to do this.
>
> However, the combined table would be around 95000 rows in size.
>
> Having never really used Postgresql in the past, and unable to find a
> datapoint on the web, I would really like to get input from current users.
>  Is this an unreasonable table size to expect good performance when the
> PHP app driving it gets a reasonable amount of traffic?  I know
> performance is also heavily dependent on indexes and query structure, but
> disregarding either of those for the sake of argument, would I be better
> off keeping the tables separate, or is 95000 not something to worry about?
>  btw, most tables in this database are quite small (<2000).  My redesign
> would create two tables in the +90000 range, but less than 100000.

We use postgresql for lots of stuff with more than 90,000 rows.  We have
a table we use to merge log files from multiple servers and then sort them
by time.

Here's an example of it:

\d logs
                              Table "logs"
  Column  |  Type   |                     Modifiers
----------+---------+----------------------------------------------------
 logid    | integer | not null default nextval('"logs_logid_seq"'::text)
 linedate | integer |
 line     | text    |
Primary key: logs_pkey

select count(*) from logs;
 count
--------
 602346
(1 row)

 explain analyze select count(*) from logs;
NOTICE:  QUERY PLAN:

Aggregate  (cost=334868.46..334868.46 rows=1 width=0) (actual
time=2463.31..2463.31 rows=1 loops=1)
  ->  Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=0) (actual
time=0.02..1592.28 rows=602346 loops=1)
Total runtime: 2463.43 msec

(2.5 seconds to seq scan the whole table)

-- select one record

explain analyze select logid from logs limit 1 offset 100000;
NOTICE:  QUERY PLAN:

Limit  (cost=5594.04..5594.09 rows=1 width=4) (actual time=394.75..394.76
rows=1 loops=1)
  ->  Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=4) (actual
time=0.02..307.74 rows=100002 loops=1)
Total runtime: 394.86 msec

-- grab every row in the table

explain analyze select * from logs;
NOTICE:  QUERY PLAN:

Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=341) (actual
time=0.03..3284.99 rows=602346 loops=1)
Total runtime: 3881.38 msec

--

The size of this table on the hard drive is 226,689,024 bytes.  It's one
of the smaller tables we play with on this server.  Please note that this
is our well-tuned production server.  A workstation with default settings
and less memory might not be quite as fast as that, but you're still
looking at sub-second response times on well-indexed datasets with less
than, say, 5,000,000 rows or so.
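
For comparison, here's the kind of query that stays fast (no real output
pasted for this one, so treat it as a sketch).  With an index like

create index logs_linedate_idx on logs (linedate);

a selective lookup such as

explain analyze select * from logs where linedate = 1049302426;

should show an Index Scan instead of a Seq Scan and come back in
milliseconds rather than seconds, assuming the table has been analyzed so
the planner has decent statistics.  (The linedate value above is just an
arbitrary example.)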