Re: Compare rows - Mailing list pgsql-performance

From Greg Spiegelberg
Subject Re: Compare rows
Date
Msg-id 3F8451EE.2050006@cranel.com
Whole thread Raw
In response to Re: Compare rows  (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>)
List pgsql-performance
See below.


Shridhar Daithankar wrote:
> Greg Spiegelberg wrote:
>
>> The data represents metrics at a point in time on a system for
>> network, disk, memory, bus, controller, and so-on.  Rx, Tx, errors,
>> speed, and whatever else can be gathered.
>>
>> We arrived at this one 642 column table after testing the whole
>> process from data gathering, methods of temporarily storing then
>> loading to the database.  Initially, 37+ tables were in use but
>> the one big-un has saved us over 3.4 minutes.
>
>
> I am sure you changed the desing because those 3.4 minutes were
> significant to you.
>
>
> But I suggest you go back to 37 table design and see where bottleneck
> is. Probably you can tune a join across 37 tables much better than
> optimizing a difference between two 637 column rows.

The bottleneck is across the board.

On the data collection side I'd have to manage 37 different methods
and output formats whereas now I have 1 standard associative array
that gets reset in memory for each "row" stored.

On the data validation side, I have one routine to check the incoming
data for errors, missing columns, data types and so on.  Quick & easy.

On the data import it's easier and more efficient to do one COPY for
a standard format from one program instead of multiple programs or
COPY's.  We were using 37 PHP scripts to handle the import and the
time it took to load, execute, exit, reload each script was killing
us.  Now, 1 PHP and 1 COPY.


> Besides such a large number of columns will cost heavily in terms of
> defragmentation across pages. The wasted space and IO therof could be
> significant issue for large number of rows.

No arguement here.


> 642 column is a bad design. Theoretically and from implementation of
> postgresql point of view. You did it because of speed problem. Now if we
> can resolve those speed problems, perhaps you could go back to other
> design.
>
> Is it feasible for you right now or you are too much committed to the
> big table?

Pretty commited though I do try to be open.

Greg

--
Greg Spiegelberg
  Sr. Product Development Engineer
  Cranel, Incorporated.
  Phone: 614.318.4314
  Fax:   614.431.8388
  Email: gspiegelberg@Cranel.com
Cranel. Technology. Integrity. Focus.



pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: PostgreSQL vs. MySQL
Next
From: Josh Berkus
Date:
Subject: Re: PostgreSQL vs. MySQL