Thread: Postgresql capabilities question
I have a M$ Sql Server db that I'm porting to postgresql. Approx. 24 tables from this old db can be combined in the new database into one table, and it would be a bit more elegant to do this.

However, the combined table would be around 95000 rows in size.

Having never really used Postgresql in the past, and unable to find a data point on the web, I would really like to get input from current users. Is this an unreasonable table size to expect good performance from when the PHP app driving it gets a reasonable amount of traffic? I know performance is also heavily dependent on indexes and query structure, but disregarding either of those for the sake of argument, would I be better off keeping the tables separate, or is 95000 not something to worry about? BTW, most tables in this database are quite small (<2000 rows). My redesign would create two tables in the 90000+ range, but less than 100000.

Thanks very much for your input.

John
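For illustration only, the kind of consolidation being described might look something like the sketch below. The table and column names are hypothetical, not taken from the actual schema; the category column stands in for "which of the 24 old tables this row came from".

-- Hypothetical sketch: 24 per-category tables collapsed into one table,
-- with a discriminator column recording where each row originated.
CREATE TABLE combined_items (
    item_id    serial PRIMARY KEY,
    category   varchar(32) NOT NULL,   -- which of the old tables the row came from
    name       text,
    created_on timestamp
);

-- An index on the discriminator keeps per-category queries selective:
CREATE INDEX combined_items_category_idx ON combined_items (category);

At roughly 95000 rows total, the discriminator index mostly matters for queries that only ever want one of the old "tables"; queries spanning all categories will read most of the table either way.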
On Wed, Apr 02, 2003 at 07:33:46PM -0500, John Wells wrote:
> However, the combined table would be around 95000 rows in size.

I have a number of 1,000,000-plus row tables (very plus in some cases) running on some nasty low-end hardware (Celerons with 5400rpm IDE drives, Netras), and performance is quite adequate for typical use.

Cheers,
  Steve
You're absolutely correct that there are *many* other factors that determine performance aside from row count. That being said, I have tables with over a million entries on actively used systems that perform really well on queries that utilize an index, and acceptably well on queries that require a sequential scan.
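A quick way to tell which of those two cases a given query falls into is to look at its plan with EXPLAIN. A minimal sketch, reusing the hypothetical combined_items table and index from the earlier illustration:

-- Hypothetical: the plan shows whether the index is used or the whole
-- table is read.
EXPLAIN ANALYZE
SELECT * FROM combined_items WHERE category = 'widgets';
-- An "Index Scan using combined_items_category_idx" node means the query
-- is selective and indexed (the fast case); a "Seq Scan on combined_items"
-- node means every row is read (the slower but still workable case).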
If the queries are selective and typically indexed, or you have enough RAM, then I'd say there won't be much difference between splitting the tables or combining them into one. 95000 rows could be considered small. You might even have enough RAM to cache the whole DB; you can estimate that from the DB's footprint on disk.

Link

At 07:33 PM 4/2/03 -0500, John Wells wrote:
>However, the combined table would be around 95000 rows in size.
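A rough sketch of the footprint estimate mentioned above. The functions shown are only available on more recent PostgreSQL releases (8.1 and later); on older releases, a du of the data directory gives a comparable ballpark number.

-- Rough check of whether the whole database could fit in RAM
-- (requires PostgreSQL 8.1+; on older releases, `du -sh` on the data
-- directory gives a similar rough figure).
SELECT pg_size_pretty(pg_database_size(current_database())) AS db_on_disk;
-- If this is comfortably smaller than available RAM, the working set will
-- generally stay cached and the split-vs-combined question matters even less.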
On Wed, 2 Apr 2003, Steve Atkins wrote:

> On Wed, Apr 02, 2003 at 07:33:46PM -0500, John Wells wrote:
> > However, the combined table would be around 95000 rows in size.

Almost laughably small :)

> I have a number of 1,000,000-plus row tables (very plus in some cases)
> running on some nasty low-end hardware (Celerons with 5400rpm IDE drives,
> Netras), and performance is quite adequate for typical use.

Yeah, it's those sequential and tsearch index scans that kill it, but selective queries fly.

--
Nigel J. Andrews
On Wed, 2 Apr 2003, John Wells wrote:

> However, the combined table would be around 95000 rows in size.

We use postgresql for lots of stuff that's more than 90,000 rows. We have a table we use to merge log files from multiple servers and then sort them by time. Here's an example of it:

\d logs
                             Table "logs"
  Column  |  Type   |                     Modifiers
----------+---------+----------------------------------------------------
 logid    | integer | not null default nextval('"logs_logid_seq"'::text)
 linedate | integer |
 line     | text    |
Primary key: logs_pkey

select count(*) from logs;
 count
--------
 602346
(1 row)

explain analyze select count(*) from logs;
NOTICE:  QUERY PLAN:

Aggregate  (cost=334868.46..334868.46 rows=1 width=0) (actual time=2463.31..2463.31 rows=1 loops=1)
  ->  Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=0) (actual time=0.02..1592.28 rows=602346 loops=1)
Total runtime: 2463.43 msec
(2.5 seconds to seq scan the whole table)

-- select one record
explain analyze select logid from logs limit 1 offset 100000;
NOTICE:  QUERY PLAN:

Limit  (cost=5594.04..5594.09 rows=1 width=4) (actual time=394.75..394.76 rows=1 loops=1)
  ->  Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=4) (actual time=0.02..307.74 rows=100002 loops=1)
Total runtime: 394.86 msec

-- grab every row in the table
explain analyze select * from logs;
NOTICE:  QUERY PLAN:

Seq Scan on logs  (cost=0.00..33695.46 rows=602346 width=341) (actual time=0.03..3284.99 rows=602346 loops=1)
Total runtime: 3881.38 msec

-- the size of this table on the hard drive is 226,689,024 bytes

It's one of the smaller tables we play with on this server. Please note that this is our well-tuned production server. A workstation with default settings and less memory might not be quite as fast as that, but you're still looking at sub-second response times on well-indexed datasets with less than, say, 5,000,000 rows or so.
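One hedged addendum to the plans above: all three of those queries are sequential scans, so the sub-second indexed case isn't actually shown. A hypothetical sketch, assuming the application filters on linedate (the index below is not part of the original logs table):

-- Hypothetical: index the column the application actually filters on.
CREATE INDEX logs_linedate_idx ON logs (linedate);

-- A selective query can then use an index scan instead of reading all
-- 602346 rows (the linedate range below is a made-up Unix-timestamp window):
EXPLAIN ANALYZE
SELECT logid, line
FROM logs
WHERE linedate BETWEEN 1049250000 AND 1049253600;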