Re: Streaming large data into postgres [WORM like applications] - Mailing list pgsql-general

From Alban Hertroys
Subject Re: Streaming large data into postgres [WORM like applications]
Date
Msg-id 46485EB7.80103@magproductions.nl
Whole thread Raw
In response to Re: Streaming large data into postgres [WORM like applications]  ("John D. Burger" <john@mitre.org>)
List pgsql-general
John D. Burger wrote:
> Dhaval Shah wrote:
>
>> 2. Most of the streamed rows are very similar. Think syslog rows,
>> where for most cases only the timestamp changes. Of course, if the
>> data can be compressed, it will result in improved savings in terms of
>> disk size.
>
> If it really is usually just the timestamp that changes, one way to
> "compress" such data might be to split your logical row into two
> tables.  First table has all the original columns but the timestanp,
> plus an ID.  Second table has the  timestamp and a foreign key into the
> first table.  Depending on how wide your original row is, and how often
> it's only the timestamp that changes, this could result in decent
> "compression".
>
> Of course, now you need referential integrity.

I thought of something similar. Maybe you could put those timestamps in
an array column; saves you a referential integrity check that you don't
seem to need very much.

OTOH, _if_ your log messages(?) look very similar each time, you may be
able to turn the problem around; you store unique log messages, with the
timestamps that they occured on. That way you rarely need to store more
than a timestamp. It'll add time to look up the matching log message
(there'll be quite a few less of them though). I'm quite confident
you'll save time inserting records this way, although that's hand waving
at this point.

You may be able to parameterize some log messages and store the
parameters with the timestamps. Takes a bit more processing though
(regular expression matches maybe?), and you'll want to now all
different log message permutations beforehand.

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

pgsql-general by date:

Previous
From: Ron Johnson
Date:
Subject: Re: primary key index
Next
From: Tom Lane
Date:
Subject: Re: EXCEPTION clause not identified