Thread: Live sort-of-'warehousing' database how-to?
Suppose I have a 'stupid' database with just one table, like this:

CREATE TABLE messages (
    message_id uuid NOT NULL PRIMARY KEY,
    message_time_created timestamp with time zone NOT NULL,
    message_phone_number character varying NOT NULL,
    message_state type_some_state_enum NOT NULL,
    message_value numeric(10,4)
)

Now, let's say that I end up with around 1,000,000 records each week. I actually need just the last week or two worth of data for the whole system to function normally.

But sometimes I do need to peek into 'messages' for some old message, let's say a year old.

So I would like to keep 'running' messages on the 'main' server, holding about a month's worth of data there. On the 'auxiliary' server I'd like to keep all the data. (Messages on the 'auxiliary' server are in their final state; no change to that data will ever be made.)

Is there a solution to achieve something like that? It is fairly easy to implement something like

INSERT INTO auxiliary.database.messages
SELECT * FROM main.database.messages
WHERE message_id NOT IN (SELECT message_id FROM auxiliary.database.messages ...)

using python/dblink or something like that. But is there already a solution that would do something like that?

Or is there a better way to achieve the desired functionality?

Mike
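For reference, a minimal sketch of how that ad-hoc copy could look with dblink, run on the 'auxiliary' server. The connection string, database names, and the assumption that type_some_state_enum is defined on both servers are all placeholders, not anything from the thread:

-- Rough sketch: pull rows from the main server and insert any the
-- auxiliary server doesn't have yet. Requires the dblink contrib module
-- on the auxiliary server.
INSERT INTO messages
SELECT *
FROM dblink('host=main-server dbname=maindb user=archiver',
            'SELECT message_id, message_time_created, message_phone_number,
                    message_state, message_value
             FROM messages')
     AS m(message_id uuid,
          message_time_created timestamp with time zone,
          message_phone_number character varying,
          message_state type_some_state_enum,
          message_value numeric(10,4))
WHERE m.message_id NOT IN (SELECT message_id FROM messages);

The NOT IN mirrors the pseudocode above; it scans every archived message_id on each run, which is fine for an occasional copy but gets expensive as the archive grows.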
> -----Original Message-----
> From: Mario Splivalo [mailto:mario.splivalo@megafon.hr]
> Sent: Wednesday, March 31, 2010 10:20 AM
> To: pgsql-admin@postgresql.org
> Subject: Live sort-of-'warehousing' database how-to?
>
> [snip]

Partition your MESSAGES table by week or month (read up on table partitioning in the PG docs).

pg_dump the "old" partitions from the "current" server when they are not needed any more. Move the backups of the dumped partitions to your "auxiliary" server and pg_restore them there.

Igor Neyman
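A rough sketch of what that suggestion could look like with the inheritance-based partitioning described in the PostgreSQL docs; the partition name, date range, and archive database name are illustrative only:

-- One child table per month, attached to the existing parent by
-- inheritance; the CHECK constraint lets constraint_exclusion skip
-- partitions that cannot match a query.
CREATE TABLE messages_2010_03 (
    CHECK (message_time_created >= DATE '2010-03-01'
       AND message_time_created <  DATE '2010-04-01')
) INHERITS (messages);

-- Constraints and indexes are not inherited, so add them per child.
ALTER TABLE messages_2010_03 ADD PRIMARY KEY (message_id);

-- Redirect inserts on the parent into the right child.
CREATE OR REPLACE FUNCTION messages_insert_trigger()
RETURNS trigger AS $$
BEGIN
    IF NEW.message_time_created >= DATE '2010-03-01'
       AND NEW.message_time_created <  DATE '2010-04-01' THEN
        INSERT INTO messages_2010_03 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'No partition for timestamp %',
            NEW.message_time_created;
    END IF;
    RETURN NULL;  -- row is stored in the child only, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER messages_insert
    BEFORE INSERT ON messages
    FOR EACH ROW EXECUTE PROCEDURE messages_insert_trigger();

-- Once a month is final, archive it and drop it from the main server, e.g.:
--   pg_dump -Fc -t messages_2010_03 maindb > messages_2010_03.dump
--   pg_restore -d archivedb messages_2010_03.dump
-- and then, on the main server:
--   DROP TABLE messages_2010_03;

Dropping a whole partition is much cheaper than a DELETE of a month's worth of rows, which is the main payoff of this scheme.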
Igor Neyman wrote:
>
> Partition your MESSAGES table by week or month (read up on table
> partitioning in the PG docs).
>
> pg_dump the "old" partitions from the "current" server when they are
> not needed any more. Move the backups of the dumped partitions to your
> "auxiliary" server and pg_restore them there.

Hm. I never actually wanted to use partitioning (I always thought my databases were too small to need it) because of what seems to be the awful overhead of maintaining insert rules or insert triggers. But your idea sounds plausible for exactly what I need.

The only problem is that on the "auxiliary" server I wouldn't have the last month/week or so, and I'd love to do so every day or maybe every hour.

Will see, thank you! :)

Mike
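That gap could be covered by combining the two ideas: archive finished months with pg_dump/pg_restore as above, and run a small incremental pull for recent rows from cron via psql on the auxiliary server. A sketch under the same assumptions as the earlier dblink example (placeholder connection string, same enum type on both servers); the two-day window is an arbitrary choice that bounds the remote scan and the number of rows probed against the local primary key, so it stays cheap enough to run hourly:

-- Rough sketch of an hourly/daily incremental copy, run on the
-- auxiliary server.
INSERT INTO messages
SELECT *
FROM dblink('host=main-server dbname=maindb user=archiver',
            $q$SELECT message_id, message_time_created, message_phone_number,
                      message_state, message_value
               FROM messages
               WHERE message_time_created > now() - interval '2 days'$q$)
     AS m(message_id uuid,
          message_time_created timestamp with time zone,
          message_phone_number character varying,
          message_state type_some_state_enum,
          message_value numeric(10,4))
WHERE NOT EXISTS (
    SELECT 1 FROM messages a WHERE a.message_id = m.message_id
);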