Re: Query on postgres_fdw extension - Mailing list pgsql-general

From Vijaykumar Jain
Subject Re: Query on postgres_fdw extension
Date
Msg-id CAM+6J97sFkWdk1xXwxZA8mvYE-MQRvGjP1YiwMn3zoFPsAH7Ow@mail.gmail.com
In response to Re: Query on postgres_fdw extension  (Duarte Carreira <dncarreira@gmail.com>)
Responses Re: Query on postgres_fdw extension
List pgsql-general
On Thu, 20 Jan 2022 at 21:29, Duarte Carreira <dncarreira@gmail.com> wrote:
Hello everyone.

I don't know... realistically what do you guys see as a best/simple approach? 

We implemented a custom sharding layer (directory sharding with lookup tables) of 10 shards, but it was write local, read global.
the api was responsible for all rebalancing in case of hotspots (roughly along the lines of the directory-table sketch below).
other api sharding examples ...
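purely for illustration (this is not our actual schema; names like tenant_id and shard_no are made up), the directory lookup table was roughly along these lines:

-- directory table on the coordinator: maps each key to the shard that owns it;
-- the api consults this before routing a write and updates it when rebalancing
CREATE TABLE shard_directory (
    tenant_id bigint PRIMARY KEY,
    shard_no  int    NOT NULL CHECK (shard_no BETWEEN 0 AND 9)
);

-- the api routes a write by looking up the owning shard first
SELECT shard_no FROM shard_directory WHERE tenant_id = 42;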

although it worked really well, maintaining it on your own gets really painful, and the pain goes well beyond generating ids globally.

i will not go into the details, but in short, a sharded setup is not the same as a local setup. many things that work really well on a standalone setup will not work as expected once you shard.

writes across shards may work, but remember they go over the network, so you can end up locking your table for a much longer duration and causing a much more serious outage.
if you really want distributed writes with unique keys, you can go with uuid i think, or have your own global sequence generator (see further below).
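for the uuid route, a minimal sketch (assuming postgres 13+, where gen_random_uuid() is built in; on older versions it comes from the pgcrypto extension):

-- every shard can generate ids locally, no global coordination needed
CREATE TABLE t_uuid (
    id   uuid DEFAULT gen_random_uuid() PRIMARY KEY,
    col1 int
);
INSERT INTO t_uuid (col1) VALUES (100);  -- id is filled in locally, globally unique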

Move ID generation out of the database to an ID generation service outside of the database… As soon as a piece of work enters their system, an ID gets assigned to it… and that ID is generated in a way that is known to be globally unique within their system.

Index of /shard_manager/shard_manager-0.0.1/ (pgxn.org) (pretty old, but if you can use your coordinator server as an id_generator(), you can generate ids which are globally unique)

imho, do not try sharding manually unless you have enough dbas to maintain the shards; try using citus, it makes a lot of the manual stuff easier.
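as a rough sketch of what that looks like (assuming the citus extension is installed on the coordinator and workers; table and column names here are just examples):

CREATE EXTENSION citus;
CREATE TABLE t (id bigint, col1 int);
-- citus shards t by id across the worker nodes and routes reads/writes for you,
-- instead of you maintaining fdw tables and lookup tables by hand
SELECT create_distributed_table('t', 'id');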

also, the workarounds below are bad, in case you just want to rush through:
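(for context, the foreign table t used in the session below would have been created roughly along these lines; host, dbname and the user mapping are assumptions, adjust to your setup)

CREATE EXTENSION postgres_fdw;
CREATE EXTENSION dblink;   -- used later by the getnext() workaround
CREATE SERVER remote_server FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'remotedb');
CREATE USER MAPPING FOR postgres SERVER remote_server
    OPTIONS (user 'postgres');
-- expose the remote table public.t locally as a foreign table
IMPORT FOREIGN SCHEMA public LIMIT TO (t)
    FROM SERVER remote_server INTO public;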

postgres=# \c localdb
You are now connected to database "localdb" as user "postgres".
localdb=#
localdb=# \dt
Did not find any relations.
localdb=# \det
     List of foreign tables
 Schema | Table |    Server
--------+-------+---------------
 public | t     | remote_server
(1 row)

localdb=# \det+ t
                                List of foreign tables
 Schema | Table |    Server     |              FDW options               | Description
--------+-------+---------------+----------------------------------------+-------------
 public | t     | remote_server | (schema_name 'public', table_name 't') |
(1 row)

localdb=# \det t
     List of foreign tables
 Schema | Table |    Server
--------+-------+---------------
 public | t     | remote_server
(1 row)

localdb=# create or replace function getnext() returns int as $_$ select id from dblink('dbname = remotedb', $$ select nextval('t_id_seq') $$) as newtable(id int); $_$ language sql;
CREATE FUNCTION
localdb=# \c remotedb
You are now connected to database "remotedb" as user "postgres".
remotedb=# \dt t
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | t    | table | postgres
(1 row)

remotedb=# \ds t_id_seq
            List of relations
 Schema |   Name   |   Type   |  Owner
--------+----------+----------+----------
 public | t_id_seq | sequence | postgres
(1 row)

remotedb=# \c localdb
You are now connected to database "localdb" as user "postgres".
localdb=# insert into t values (getnext(), 100);
INSERT 0 1
localdb=# insert into t values (getnext(), 100);
INSERT 0 1
localdb=# select * from t;
 id | col1
----+------
 11 |    4
 12 |    5
 13 |  100
 14 |  100
(4 rows)
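if you do not want to call getnext() in every insert, a possible next step (just a sketch) is to make it the column default on the foreign table:

-- ids now come from the remote sequence implicitly on local inserts
ALTER FOREIGN TABLE t ALTER COLUMN id SET DEFAULT getnext();
INSERT INTO t (col1) VALUES (100);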

just my opinion, ignore if not useful.

