Thread: Live sort-of-'warehousing' database how-to?

Live sort-of-'warehousing' database how-to?

From: Mario Splivalo
Date:

Suppose I have a 'stupid' database with just one table, like this:

CREATE TABLE messages (
    message_id uuid NOT NULL PRIMARY KEY,
    message_time_created timestamp with time zone NOT NULL,
    message_phone_number character varying NOT NULL,
    message_state type_some_state_enum NOT NULL,
    message_value numeric(10,4)
);

Now, let's say that I end up with around 1,000,000 records each week. I
actually need just the last week or two's worth of data for the whole
system to function normally.

But sometimes I do need to peek into 'messages' for some old message,
let's say a year old.

So I would like to keep the 'running' messages on the 'main' server,
holding a month's worth of data there. On the 'auxiliary' server I'd like
to keep all the data. (Messages on the 'auxiliary' server are in their
final state; no change to that data will ever be made.)

Is there a solution to achieve something like that? It is fairly easy to
implement something like:

INSERT INTO auxiliary.database.messages
SELECT * FROM main.database.messages
WHERE message_id NOT IN
    (SELECT message_id FROM auxiliary.database.messages ...)

using Python/dblink or something like that. But is there already an
existing solution that does something like this?
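
For instance, something like this dblink pull, run on the 'auxiliary'
server, is what I have in mind (the connection string is made up, and
dblink has to be installed on that side):

INSERT INTO messages
SELECT m.*
FROM dblink('host=mainserver dbname=maindb',
            'SELECT message_id, message_time_created,
                    message_phone_number, message_state, message_value
             FROM messages')
     AS m(message_id uuid,
          message_time_created timestamp with time zone,
          message_phone_number character varying,
          message_state type_some_state_enum,
          message_value numeric(10,4))
WHERE m.message_id NOT IN (SELECT message_id FROM messages);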

Or is there a better way to achieve the desired functionality?

    Mike



Re: Live sort-of-'warehousing' database how-to?

From: "Igor Neyman"
Date:

> -----Original Message-----
> From: Mario Splivalo [mailto:mario.splivalo@megafon.hr]
> Sent: Wednesday, March 31, 2010 10:20 AM
> To: pgsql-admin@postgresql.org
> Subject: Live sort-of-'warehousing' database how-to?
>
> [snip]

Partition your MESSAGES table by week or month (read up on table
partitioning in the PG docs).
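
For example, a bare-bones sketch of one monthly child table plus an
insert-routing trigger (table names and date ranges here are purely
illustrative):

CREATE TABLE messages_2010_03 (
    CHECK (message_time_created >= '2010-03-01'
       AND message_time_created <  '2010-04-01')
) INHERITS (messages);

-- note: the parent's PRIMARY KEY is not inherited; create indexes
-- and constraints on each child as needed
CREATE FUNCTION messages_insert() RETURNS trigger AS $$
BEGIN
    -- route each row to the child table covering its month
    IF NEW.message_time_created >= '2010-03-01'
       AND NEW.message_time_created <  '2010-04-01' THEN
        INSERT INTO messages_2010_03 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for %', NEW.message_time_created;
    END IF;
    RETURN NULL;  -- skip the insert into the (empty) parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER messages_insert_trg
    BEFORE INSERT ON messages
    FOR EACH ROW EXECUTE PROCEDURE messages_insert();

With constraint_exclusion enabled, queries that filter on
message_time_created will only scan the matching child tables.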

pg_dump "old" partitions from the "current" server when they are not
needed any more. Move the dumped partitions to your "auxiliary" server,
and pg_restore them there.
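
Hypothetically, shipping one month off the "current" server could then
look like this (database and file names are made up):

# on the "current" server, once the partition is no longer written to:
pg_dump -Fc -t messages_2010_03 maindb > messages_2010_03.dump

# on the "auxiliary" server -- this assumes a parent messages table
# already exists there, since the dump recreates the child with its
# INHERITS clause:
pg_restore -d archivedb messages_2010_03.dump

# back on the "current" server, once the restore is verified:
psql -d maindb -c 'DROP TABLE messages_2010_03;'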

Igor Neyman

Re: Live sort-of-'warehousing' database how-to?

From: Mario Splivalo
Date:

Igor Neyman wrote:
>
> Partition your MESSAGES table by week or month (read up on table
> partitioning in the PG docs).
>
> pg_dump "old" partitions from the "current" server when they are not
> needed any more. Move the dumped partitions to your "auxiliary" server,
> and pg_restore them there.

Hm. I never actually wanted to use partitioning (I always thought my
databases were too small to need it) because of what seems to be the
awful overhead of maintaining insert rules or insert triggers. But your
idea sounds plausible for exactly what I need.
The only problem is that the "auxiliary" server would always be missing
the last month or week or so of data, and I'd love to sync it every day
or maybe every hour.
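
Maybe a small scheduled dblink pull would cover that gap; roughly
(connection string made up again), grabbing only recent rows and
de-duplicating against what is already here:

INSERT INTO messages
SELECT m.*
FROM dblink('host=mainserver dbname=maindb',
            'SELECT message_id, message_time_created,
                    message_phone_number, message_state, message_value
             FROM messages
             WHERE message_time_created > now() - interval ''1 day''')
     AS m(message_id uuid,
          message_time_created timestamp with time zone,
          message_phone_number character varying,
          message_state type_some_state_enum,
          message_value numeric(10,4))
WHERE m.message_id NOT IN (SELECT message_id FROM messages);
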
Will see, thank you! :)

    Mike