Thread: Keeping state in a foreign data wrapper

Keeping state in a foreign data wrapper

From
Stelios Sfakianakis
Date:
Hi,

I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for
examplethe total number of requests submiited. Ideally these data should be kept “per-user” and of course with the
properlocking to eliminate race conditions. So the question I have is how to design such a global (but per user and
database)state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private
fieldsin the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another
possiblityseems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is
evenless clear how to do it. 

So is there any simple way to do implement such shared thread/process safe state?
Thank you
Stelios






Re: Keeping state in a foreign data wrapper

From
Ian Lawrence Barwick
Date:
2020年8月4日(火) 1:24 Stelios Sfakianakis <sgsfak@gmail.com>:
>
> Hi,
>
> I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for
examplethe total number of requests submiited. Ideally these data should be kept “per-user” and of course with the
properlocking to eliminate race conditions. So the question I have is how to design such a global (but per user and
database)state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private
fieldsin the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another
possiblityseems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is
evenless clear how to do it. 

Shared memory would probably work; you'd need to load the FDW via
"shared_preload_libraries" and have the FDW handler function(s) update
shared memory with whatever statistics you want to track. You could
then define SQL functions to retrieve the stored values, and possibly
persist them over server restarts by storing/retrieving them from a
file.

Look at "pg_stat_statements" for a good example of how to do that kind of thing.

Regards

Ian Barwick

--
Ian Barwick                   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Keeping state in a foreign data wrapper

From
Stelios Sfakianakis
Date:
Thank you Ian for the prompt reply! I will certainly have a look at pg_stat_statements

I also came across mysql_fdw (https://github.com/EnterpriseDB/mysql_fdw) that features a connection pool shared across queries. It uses a hash table with the serverid and userid as lookup key : https://github.com/EnterpriseDB/mysql_fdw/blob/REL-2_5_4/connection.c#L55 The hash table is allocated in the cache memory context but it worries me that 1) no locks are used, 2) the "ConnectionHash" variable is declared static so in the multi-process architecture of Postgres could have been the case that multiple copies of this exist when the shared library of mysql_fdw is loaded?

Best regards
Stelios

On 4 Aug 2020, at 06:25, Ian Lawrence Barwick <barwick@gmail.com> wrote:

2020年8月4日(火) 1:24 Stelios Sfakianakis <sgsfak@gmail.com>:

Hi,

I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for example the total number of requests submiited. Ideally these data should be kept “per-user” and of course with the proper locking to eliminate race conditions. So the question I have is how to design such a global (but per user and database) state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private fields in the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another possiblity seems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even less clear how to do it.

Shared memory would probably work; you'd need to load the FDW via
"shared_preload_libraries" and have the FDW handler function(s) update
shared memory with whatever statistics you want to track. You could
then define SQL functions to retrieve the stored values, and possibly
persist them over server restarts by storing/retrieving them from a
file.

Look at "pg_stat_statements" for a good example of how to do that kind of thing.

Regards

Ian Barwick

--
Ian Barwick                   https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Keeping state in a foreign data wrapper

From
Ian Lawrence Barwick
Date:
2020年8月4日(火) 14:54 Stelios Sfakianakis <sgsfak@gmail.com>:
> On 4 Aug 2020, at 06:25, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> 2020年8月4日(火) 1:24 Stelios Sfakianakis <sgsfak@gmail.com>:
>
> Hi,
>
>>> I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for
examplethe total number of requests submiited. Ideally these data should be kept “per-user” and of course with the
properlocking to eliminate race conditions. So the question I have is how to design such a global (but per user and
database)state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private
fieldsin the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another
possiblityseems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is
evenless clear how to do it. 
>
>
>> Shared memory would probably work; you'd need to load the FDW via
>> "shared_preload_libraries" and have the FDW handler function(s) update
>> shared memory with whatever statistics you want to track. You could
>> then define SQL functions to retrieve the stored values, and possibly
>> persist them over server restarts by storing/retrieving them from a
>> file.
>
>> Look at "pg_stat_statements" for a good example of how to do that kind of thing.
>
> Thank you Ian for the prompt reply! I will certainly have a look at pg_stat_statements
>
> I also came across mysql_fdw (https://github.com/EnterpriseDB/mysql_fdw) that features a connection pool shared
acrossqueries. It uses a hash table with the serverid and userid as lookup key :
https://github.com/EnterpriseDB/mysql_fdw/blob/REL-2_5_4/connection.c#L55

This is essentially the same as what "postgres_fdw" and similar FDW
implementations do.

> The hash table is allocated in the cache memory context but it worries me that 1) no locks are used, 2) the
"ConnectionHash"variable is declared static so in the multi-process architecture of Postgres could have been the case
thatmultiple copies of this exist when the shared library of mysql_fdw is loaded? 

The hash table is specific to each running backend so will only be
accessed by that process.

Pre-loading a shared library just gives the library an opportunity to
set up shared memory etc. You can always try adding one of the FDW
libraries to "shared_preload_libraries" and see what happens
(theoretically nothing).

Regards

Ian Barwick



>
> Best regards
> Stelios
>
>
> Regards
>
> Ian Barwick
>
> --
> Ian Barwick                   https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>



Re: Keeping state in a foreign data wrapper

From
Stelios Sfakianakis
Date:
Thank you again, I have another question in order to make sure I have a clear understanding:


> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> The hash table is specific to each running backend so will only be
> accessed by that process.
>
> Pre-loading a shared library just gives the library an opportunity to
> set up shared memory etc. You can always try adding one of the FDW
> libraries to "shared_preload_libraries" and see what happens
> (theoretically nothing).
>

My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in the
backend(https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not global
perse and shared among the different client / users sessions. But that defeats the purpose, no?  

Thank you
Best
Stelios





Re: Keeping state in a foreign data wrapper

From
Ian Barwick
Date:
On 2020/08/04 19:21, Stelios Sfakianakis wrote:
 > Thank you again, I have another question in order to make sure I have a clear understanding:
 >
 >
 >> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <barwick@gmail.com> wrote:
 >>
 >> The hash table is specific to each running backend so will only be
 >> accessed by that process.
 >>
 >> Pre-loading a shared library just gives the library an opportunity to
 >> set up shared memory etc. You can always try adding one of the FDW
 >> libraries to "shared_preload_libraries" and see what happens
 >> (theoretically nothing).
 >>
 >
 > My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in
thebackend (https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not
globalper se and shared among the different client / users sessions.
 

Correct, the connections are specific to each individual backend.

 > But that defeats the purpose, no?

The purpose is to cache connections within the session, to avoid the overhead
of reconnecting to the remote server each time a query for that server is issued
in that session.


Regards

Ian Barwick




-- 
Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Keeping state in a foreign data wrapper

From
Ian Barwick
Date:
On 2020/08/04 19:21, Stelios Sfakianakis wrote:
 > Thank you again, I have another question in order to make sure I have a clear understanding:
 >
 >
 >> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <barwick@gmail.com> wrote:
 >>
 >> The hash table is specific to each running backend so will only be
 >> accessed by that process.
 >>
 >> Pre-loading a shared library just gives the library an opportunity to
 >> set up shared memory etc. You can always try adding one of the FDW
 >> libraries to "shared_preload_libraries" and see what happens
 >> (theoretically nothing).
 >>
 >
 > My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in
thebackend (https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not
globalper se and shared among the different client / users sessions.
 

Correct, the connections are specific to each individual backend.

 > But that defeats the purpose, no?

The purpose is to cache connections within the session, to avoid the overhead
of reconnecting to the remote server each time a query for that server is issued
in that session.


Regards

Ian Barwick



-- 
Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services