replication woes - Mailing list pgsql-general

From Justin Banks
Subject replication woes
Msg-id 14932.63643.338566.217197@flotsam.cops.wamnet.com
List pgsql-general
Hello -

I'm hoping this is a better place for this topic than -hackers, so
here goes.

Having gotten rather tired of waiting for the erserver thing, and
needing replication, I rolled my own. Essentially, I modified several
header files (removing static declarations on functions), added some
command line options to postgres, and hacked tcop/postgres.c to do
unidirectional synchronous replication. It doesn't work with anything
except CMD_UPDATE, CMD_INSERT, and CMD_DELETE (because that's all I
need it to do ;). If a query could affect replication (via the
above-mentioned command types), postgres dlopen(3C)s a .so (specified on
the command line), and uses dlsym(3C) to look for a function called
'replication'. If it finds one, it passes the connect string that was
used to open the original connection and the query string. The .so can
do anything as long as it contains that symbol; in my case, it
looks in a pool of cached connections for one that matches the one
specified in the connect string. If it finds one, it uses that one,
and if not, it connects. Then, it attempts to perform the query on the
replicant slave. If it succeeds, all is well, and if not, the original
change is not applied.
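
For readers who want to picture the hook mechanism described above, here is a rough sketch of the dlopen/dlsym lookup and the "slave first, then local" ordering. It is not the actual patch. The function-pointer signature, the 0-means-success return convention, and the names load_replication_hook, replicate_then_apply and apply_local are all assumptions made for illustration; the post only says the function is called 'replication' and receives the connect string and the query string.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature: the post only states that the hook receives the
 * connect string and the query string. Returning 0 on success is an
 * assumption made for this sketch. */
typedef int (*replication_fn)(const char *conninfo, const char *query);

/* Open the shared object at so_path and look up the symbol "replication".
 * Returns the hook, or NULL if the library or the symbol is missing.
 * On success *handle_out holds the dlopen handle (for a later dlclose). */
replication_fn load_replication_hook(const char *so_path, void **handle_out)
{
    assert(so_path != NULL && handle_out != NULL);
    *handle_out = NULL;

    void *handle = dlopen(so_path, RTLD_NOW);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }

    dlerror(); /* clear any stale error state before dlsym */
    void *sym = dlsym(handle, "replication");
    if (sym == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol not found");
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return (replication_fn)sym;
}

/* Synchronous ordering as described in the post: run the query on the
 * slave first, and apply it locally only if the slave accepted it.
 * apply_local is a hypothetical stand-in for the backend's own execution. */
int replicate_then_apply(replication_fn hook, const char *conninfo,
                         const char *query,
                         int (*apply_local)(const char *query))
{
    assert(apply_local != NULL);
    if (hook != NULL && hook(conninfo, query) != 0)
        return -1; /* slave rejected the change, so skip the local one */
    return apply_local(query);
}
```

On older glibc systems this needs to be linked with -ldl. Note that in this ordering every write waits for a full round trip to the slave before the local change proceeds.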

It works beautifully for what I need it to do (I'm not using copy or
any CMD_UTILITY), but it's slow. I hesitate to spam the world with my
modifications, but I was hoping that someone out there could say what
could be making it slow (by a *great* deal)? Essentially, it's just a
libpq-fe connection from the replicant master to the replicant slave,
applying modifications as they arrive locally. I could obviously make
it asynch., but even though it's slow, it's workable for my purposes,
and I don't want to put a huge amount of work into it just to have it
not get any better, if that's not where the trouble is.

Any pointers? I'm happy to provide code, but I was hoping for a
general idea on where to start looking. The code block doing the
replication is quick enough to make time() calls meaningless, so I
don't think that's where it is.
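
Since time() only has one-second resolution, a finer clock may help narrow down where the time goes. Below is a small sketch using POSIX clock_gettime() with CLOCK_MONOTONIC. The helper names elapsed_seconds and time_call are made up for this example, and whether CLOCK_MONOTONIC is available depends on the platform.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime under strict modes */
#endif

#include <assert.h>
#include <time.h>

/* Elapsed time in seconds between two timespec samples. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Run fn(arg) once and return how long it took, in seconds.
 * Returns a negative value if the clock could not be read. */
double time_call(void (*fn)(void *arg), void *arg)
{
    struct timespec t0, t1;

    assert(fn != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    fn(arg);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;
    return elapsed_seconds(t0, t1);
}
```

Wrapping the replication call and the local execution separately in something like time_call, and summing the results over many queries, would show which side accounts for the slowdown.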

Thanks much,

-justinb

--
Justin Banks - WAM!NET Inc., Eagan MN justinb@wamnet.com
"Big Brother does not watch us, by his choice. We watch him, by ours."
  -- Neil Postman, Amusing Ourselves to Death

