Re: Weird procedure question - Mailing list pgsql-general

From Tim Cross
Subject Re: Weird procedure question
Date
Msg-id 87h8id2p62.fsf@gmail.com
Whole thread Raw
In response to Weird procedure question  (digimer <lists@alteeve.ca>)
Responses Re: Weird procedure question
List pgsql-general
digimer <lists@alteeve.ca> writes:

> Hi all,
>
>   I've got an interesting use case that I am stuck on. It's a bit of a 
> complicated environment, but I'll try to keep it simple.
>
>   In short; I have a history schema that has tables that match the 
> public schema, plus one 'history_id' column that has a simple sequential 
> bigserial value. Then I have a procedure and trigger that, on UPDATE or 
> INSERT, copies the data to history. Example use case is that I can 
> UPDATE a sensor value in the public table and it's also INSERTs the data 
> into history. So public shows just the most recent values, but I can see 
> changes over time in the history schema.
>
>   I have built my system to support writing to one or more DBs. I keep 
> a list of connected DBs and send INSERT/UPDATE calls to a method that 
> then runs the UPDATE/INSERT against all connected databases, as a form 
> of redundancy. This all works fine.
>
>   The problem I've hit is that the 'history_id' differs between the 
> various databases. So I want to switch this to 'history_uuid' and use 
> UUIDs instead of bigserial.
>
>   Now the question;
>
>   Can I tell a produce to use a specific UUID?
>
>   The idea is to generate a UUID for 'history_uuid' so that I have 
> consistency across databases. Of course, if an UPDATE will change 
> multiple rows, then I'll need to predefine multiple UUIDs. This is where 
> things start to get really complicated I think... Maybe I could pass an 
> array of UUIDs? I don't care if I find out which UUID was used for which 
> record, just that the same UUID was used for the same record when the 
> procedure is (re)run on other DBs.
>
>   The databases are not clustered, on purpose. I've been trying to 
> handle all the HA stuff in my application for various reasons.
>
> If it helps, here is an example pair of tables, the procedure and the 
> trigger I currently use;
>
> ====
> CREATE TABLE host_variable (
>   host_variable_uuid uuid not null primary key,
>   host_variable_host_uuid uuid not null,
>   host_variable_name text not null,
>   host_variable_value text not null,
>   modified_date timestamp with time zone not null
> );
> ALTER TABLE host_variable OWNER TO admin;
>
> CREATE TABLE history.host_variable (
>   history_id bigserial,
>   host_variable_uuid uuid,
>   host_variable_host_uuid uuid,
>   host_variable_name text,
>   host_variable_value text,
>   modified_date timestamp with time zone not null
> );
> ALTER TABLE history.host_variable OWNER TO admin;
>
> CREATE FUNCTION history_host_variable() RETURNS trigger
> AS $$
> DECLARE
>   history_host_variable RECORD;
> BEGIN
>   SELECT INTO history_host_variable * FROM host_variable WHERE 
> host_uuid = new.host_uuid;
>   INSERT INTO history.host_variable
>   (host_variable_uuid,
>   host_variable_host_uuid,
>   host_variable_name,
>   host_variable_value,
>   modified_date)
>   VALUES
>   (history_host_variable.host_variable_uuid,
>   history_host_variable.host_variable_host_uuid,
>   history_host_variable.host_variable_name,
>   history_host_variable.host_variable_value,
>   history_host_variable.modified_date);
>   RETURN NULL;
> END;
> $$
> LANGUAGE plpgsql;
> ALTER FUNCTION history_host_variable() OWNER TO admin;
>
> CREATE TRIGGER trigger_host_variable
>   AFTER INSERT OR UPDATE ON host_variable
>   FOR EACH ROW EXECUTE PROCEDURE history_host_variable();
> ====
>
>   I know this might sound odd, but I didn't want to complicate things 
> with how my system works. However, if it would help solve the problem, 
> I'm happy to dig into more detail.
>
>   Thanks!

I think James has probably given you the input you need - basically,
don't allow the system to automatically set the modified time - make
that parameter to your function or set that value before the copy to the
history tables - content would then be the same, so uuid v3 should work.

However, I do think you have another big problem lurking in the
shadows. What happens if any of your connected databases are unavailable
or unreachable for a period of time? I suspect your going to run into
update anomalies and depending on your setup/environment, possibly even
partitioning problems (depending on number of clients and typology
etc). These are well known problems in distributed or replication
systems.

You appear to be implementing a 'poor mans' replication system. There
are lots of complex issues to deal with and I wonder why you want to
take them on when PG has already got well tested and robust solutions
for this that would simplify your architecture and avoid the need to
re-implement functionality which already exists?

regards,

Tim

-- 
Tim Cross


pgsql-general by date:

Previous
From: Tory M Blue
Date:
Subject: Re: Out of Memory
Next
From: digimer
Date:
Subject: Re: Weird procedure question