Thread: Create a Foreign Table for PostgreSQL CSV Logs

Create a Foreign Table for PostgreSQL CSV Logs

From
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/12/runtime-config-logging.html
Description:

There is some imperfection  in PostgreSQL doc. There are two section in
documentation how to import PostgreSQL log into sql space and both not
complete. One of them is:
https://www.postgresql.org/docs/current/file-fdw.html
Example F.1. Create a Foreign Table for PostgreSQL CSV Logs
Here is described how to config file-fdw to get access to the PostgreSQL
logs, but there is not description or a link how to config PostgreSQL log
for this.

Other section is:
https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
19.8.4. Using CSV-Format Log Output
Here is described how to configure PostgreSQL log to import it into
previously created ordinary table by COPY command. May be this configuration
is compatible with "Example F.1. Create a Foreign Table for PostgreSQL CSV
Logs", but this method is not mentioned.

My idea to improve documentation by adding cross link: to section «Example
F.1. Create a Foreign Table for PostgreSQL CSV Logs» add link to «19.8.4.
Using CSV-Format Log Output» as example of log configuration and to section
«19.8.4. Using CSV-Format Log Output» add link to «Example F.1. Create a
Foreign Table for PostgreSQL CSV Logs» as one more example how PostgreSQL
logs can be imported inside PostgreSQL.

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Bruce Momjian
Date:
On Fri, Aug 21, 2020 at 09:25:20AM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/12/runtime-config-logging.html
> Description:
> 
> There is some imperfection  in PostgreSQL doc. There are two section in
> documentation how to import PostgreSQL log into sql space and both not
> complete. One of them is:
> https://www.postgresql.org/docs/current/file-fdw.html
> Example F.1. Create a Foreign Table for PostgreSQL CSV Logs
> Here is described how to config file-fdw to get access to the PostgreSQL
> logs, but there is not description or a link how to config PostgreSQL log
> for this.
> 
> Other section is:
> https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
> 19.8.4. Using CSV-Format Log Output
> Here is described how to configure PostgreSQL log to import it into
> previously created ordinary table by COPY command. May be this configuration
> is compatible with "Example F.1. Create a Foreign Table for PostgreSQL CSV
> Logs", but this method is not mentioned.
> 
> My idea to improve documentation by adding cross link: to section «Example
> F.1. Create a Foreign Table for PostgreSQL CSV Logs» add link to «19.8.4.
> Using CSV-Format Log Output» as example of log configuration and to section
> «19.8.4. Using CSV-Format Log Output» add link to «Example F.1. Create a
> Foreign Table for PostgreSQL CSV Logs» as one more example how PostgreSQL
> logs can be imported inside PostgreSQL.

Good idea.  People have been confused about this before.  Attached is a
patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee


Attachment

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
"David G. Johnston"
Date:
On Fri, Aug 21, 2020 at 2:58 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Aug 21, 2020 at 09:25:20AM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/12/runtime-config-logging.html
> Description:
>
> There is some imperfection  in PostgreSQL doc. There are two section in
> documentation how to import PostgreSQL log into sql space and both not
> complete. One of them is:
> https://www.postgresql.org/docs/current/file-fdw.html
> Example F.1. Create a Foreign Table for PostgreSQL CSV Logs
> Here is described how to config file-fdw to get access to the PostgreSQL
> logs, but there is not description or a link how to config PostgreSQL log
> for this.
>
> Other section is:
> https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
> 19.8.4. Using CSV-Format Log Output
> Here is described how to configure PostgreSQL log to import it into
> previously created ordinary table by COPY command. May be this configuration
> is compatible with "Example F.1. Create a Foreign Table for PostgreSQL CSV
> Logs", but this method is not mentioned.
>
> My idea to improve documentation by adding cross link: to section «Example
> F.1. Create a Foreign Table for PostgreSQL CSV Logs» add link to «19.8.4.
> Using CSV-Format Log Output» as example of log configuration and to section
> «19.8.4. Using CSV-Format Log Output» add link to «Example F.1. Create a
> Foreign Table for PostgreSQL CSV Logs» as one more example how PostgreSQL
> logs can be imported inside PostgreSQL.

Good idea.  People have been confused about this before.  Attached is a
patch.

+ It is also possible to access the file as a foreign data wrapper
+        using <xref linkend="file-fdw"/>.

Seems more accurate to say "It is also possible to access the file as a foreign table, using the supplied <xref linkend="file-fdw"/> module."

The file_fdw -> config change looks good.


A bit off-topic, but since this is being touched anyway - the listing of fields in the paragraph is not particularly readable (but maybe we want to keep it for accessibility reasons?) while the CREATE TABLE statement is very readable and more accurate, though it could be better.  Adding CHECK constraints and -- comments to the CREATE TABLE command would be a welcome addition.  In particular I noticed:

paragraph: client host:port number
example: connection_from text,

could become:

connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the host and port of the client, colon-separated

I've been mentally playing around with the idea of having the Config section with the CREATE TABLE somehow describe both the plain table and foreign table variants directly and removing the example from the file_fdw section and instead leaving the cross-references in place from file_fdw to config to see the example and from config to file_fdw to get clarity on the options and the SERVER syntax.  As they are being written for copy-and-paste though, and it's not like we are going to change the format, having the table definition duplicated isn't a terrible option.  But consolidation is something to consider.

I may pick this up in the future unless someone thinks it wouldn't be a good idea.  I would be removing the paragraph of field names and make the table specification authoritative.

David J.

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Bruce Momjian
Date:
On Fri, Aug 21, 2020 at 08:41:54PM -0700, David G. Johnston wrote:
> On Fri, Aug 21, 2020 at 2:58 PM Bruce Momjian <bruce@momjian.us> wrote:
>     Good idea.  People have been confused about this before.  Attached is a
>     patch.
> 
> 
> + It is also possible to access the file as a foreign data wrapper
> +        using <xref linkend="file-fdw"/>.
> 
> Seems more accurate to say "It is also possible to access the file as a foreign
> table, using the supplied <xref linkend="file-fdw"/> module."
> 
> The file_fdw -> config change looks good.

OK, updated patch attached.

> A bit off-topic, but since this is being touched anyway - the listing of fields
> in the paragraph is not particularly readable (but maybe we want to keep it for
> accessibility reasons?) while the CREATE TABLE statement is very readable and
> more accurate, though it could be better.  Adding CHECK constraints and --
> comments to the CREATE TABLE command would be a welcome addition.  In
> particular I noticed:
> 
> paragraph: client host:port number
> example: connection_from text,
> 
> could become:
> 
> connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the host and
> port of the client, colon-separated
> 
> I've been mentally playing around with the idea of having the Config section
> with the CREATE TABLE somehow describe both the plain table and foreign table
> variants directly and removing the example from the file_fdw section and
> instead leaving the cross-references in place from file_fdw to config to see
> the example and from config to file_fdw to get clarity on the options and the
> SERVER syntax.  As they are being written for copy-and-paste though, and it's
> not like we are going to change the format, having the table definition
> duplicated isn't a terrible option.  But consolidation is something to
> consider.
> 
> I may pick this up in the future unless someone thinks it wouldn't be a good
> idea.  I would be removing the paragraph of field names and make the table
> specification authoritative.

I am a little worried about adding this since the data is generated in
an automated way, and might change, or some config value might change
its format.  I think the example is to show how to load, and adding extra
constraints would just detract from the illustration, I think.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee


Attachment

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
"David G. Johnston"
Date:
On Sat, Aug 22, 2020 at 10:51 AM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Aug 21, 2020 at 08:41:54PM -0700, David G. Johnston wrote:
> Adding CHECK constraints and --
> comments to the CREATE TABLE command would be a welcome addition.  In
> particular I noticed:
>
> paragraph: client host:port number
> example: connection_from text,
>
> could become:
>
> connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the host and
> port of the client, colon-separated

> I may pick this up in the future unless someone thinks it wouldn't be a good
> idea.  I would be removing the paragraph of field names and make the table
> specification authoritative.

I am a little worried about adding this since the data is generated in
an automated way, and might change, or some config value might change
its format.  I think the example is to show how to load, and adding extra
constraints would just detract from the illustration, I think.

Good constraints wouldn't include specific values, just general structure when appropriate.  The existing documentation already does this in the provided example - though relegating things to just comments would still be an improvement.  I see it less as an illustration in the Config section and more of a specification.  A config value changing its format is going to be something anyone using this example would have to understand and adapt to - it still seems worthwhile to have it fleshed out for the default.

The middle of the config section doesn't seem like a great place to teach that the COPY command exists.

David J.

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Олег Самойлов
Date:
There must not be constraints at all. Constraints are needed to check incoming data to the table. But here table is
read-onlyfor database! 

So all your constraints is totally useless.
> > could become:
> >
> > connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the host and
> > port of the client, colon-separated
> David J.




Re: Create a Foreign Table for PostgreSQL CSV Logs

From
"David G. Johnston"
Date:
On Mon, Aug 24, 2020 at 6:17 AM Олег Самойлов <splarv@ya.ru> wrote:
There must not be constraints at all. Constraints are needed to check incoming data to the table. But here table is read-only for database!

Please don't top-post.


So all your constraints is totally useless.
> > could become:
> >
> > connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the host and
> > port of the client, colon-separated

I'll agree that the benefit for adding the constraints to a foreign table are less than for a normal table but it is still not zero.  Constraints are also a form of documentation.  And also can be used (at least non-null ones) during optimization.

David J.

Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Bruce Momjian
Date:
On Mon, Aug 24, 2020 at 07:40:49AM -0700, David G. Johnston wrote:
> On Mon, Aug 24, 2020 at 6:17 AM Олег Самойлов <splarv@ya.ru> wrote:
> 
>     There must not be constraints at all. Constraints are needed to check
>     incoming data to the table. But here table is read-only for database!
> 
> 
> Please don't top-post.
> 
> 
> 
>     So all your constraints is totally useless.
>     > > could become:
>     > >
>     > > connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the
>     host and
>     > > port of the client, colon-separated
> 
> 
> I'll agree that the benefit for adding the constraints to a foreign table are
> less than for a normal table but it is still not zero.  Constraints are also a
> form of documentation.  And also can be used (at least non-null ones) during
> optimization.

I feel constraints are going to lose focus of what we are trying to
show.  Do the constraints actually do anything on a foreign table?  If
not, we would have to mention that here too, which might be fine.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Олег Самойлов
Date:

> 24 авг. 2020 г., в 18:07, Bruce Momjian <bruce@momjian.us> написал(а):
>
> On Mon, Aug 24, 2020 at 07:40:49AM -0700, David G. Johnston wrote:
>> On Mon, Aug 24, 2020 at 6:17 AM Олег Самойлов <splarv@ya.ru> wrote:
>>
>>    There must not be constraints at all. Constraints are needed to check
>>    incoming data to the table. But here table is read-only for database!
>>
>>
>> Please don't top-post.
>>
>>
>>
>>    So all your constraints is totally useless.
>>>> could become:
>>>>
>>>> connection_from text check(connection_from ~ '^[^:]+:[0-9]+$) -- the
>>    host and
>>>> port of the client, colon-separated
>>
>>
>> I'll agree that the benefit for adding the constraints to a foreign table are
>> less than for a normal table but it is still not zero.  Constraints are also a
>> form of documentation.  And also can be used (at least non-null ones) during
>> optimization.
>
> I feel constraints are going to lose focus of what we are trying to
> show.  Do the constraints actually do anything on a foreign table?

I'll add: "on foreign table based on read only text file without indexes, etc". :) But in this case I indeed added some
CHECK()constraints for old PostgreSQL and change them for the partition syntax sugar on new PostgreSQL to point on
differentfiles according to PostgreSQL default log config, where logs of each day of a week is kept in the different
files.For example: 

ALTER SYSTEM SET log_destination=csvlog;
SELECT pg_reload_conf();
CREATE EXTENSION file_fdw;
CREATE SERVER file_fdw FOREIGN DATA WRAPPER file_fdw;
BEGIN;
CREATE SCHEMA pglog;
CREATE TABLE pglog.pglog (
    log_time timestamp(3),
    user_name text,
    database_name text,
    process_id integer,
    connection_from text,
    session_id text,
    session_line_num bigint,
    command_tag text,
    session_start_time timestamp with time zone,
    virtual_transaction_id text,
    transaction_id bigint,
    error_severity text,
    sql_state_code text,
    message text,
    detail text,
    hint text,
    internal_query text,
    internal_query_pos integer,
    context text,
    query text,
    query_pos integer,
    location text,
    application_name text
) PARTITION BY LIST (date_part('isodow', log_time));
CREATE FOREIGN TABLE pglog.Mon
    PARTITION OF pglog.pglog FOR VALUES IN (1)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Mon.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Tue
    PARTITION OF pglog.pglog FOR VALUES IN (2)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Tue.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Wed
    PARTITION OF pglog.pglog FOR VALUES IN (3)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Wed.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Thu
    PARTITION OF pglog.pglog FOR VALUES IN (4)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Thu.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Fri
    PARTITION OF pglog.pglog FOR VALUES IN (5)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Fri.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Sat
    PARTITION OF pglog.pglog FOR VALUES IN (6)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Sat.csv', format 'csv' );
CREATE FOREIGN TABLE pglog.Sun
    PARTITION OF pglog.pglog FOR VALUES IN (7)
    SERVER file_fdw
    OPTIONS ( filename '/var/lib/pgsql/12/data/log/postgresql-Sun.csv', format 'csv' );
COMMIT;

But there is another point of view. The same table schema used now not only for file_fdw access to the log, but also in
examplewhere this is an ordinary table. In this case may be some other optimisation: indexes, etc. 

But do we really need in the simple example such detailed and specific code? May be better give as simple as possible
exampleas example, which everyone will can adapt for his own needs. 


Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Олег Самойлов
Date:

> 26 Aug 2020, в 09:24, Николай Чадаев <nick-ch58@yandex.ru> написал(а):
>
> Thank You Oleg.
>
> A simple example is extremely important and necessary.
> I will wait for this example.

It's already exists. The idea was to add cross links to this.




Re: Create a Foreign Table for PostgreSQL CSV Logs

From
Bruce Momjian
Date:
On Sat, Aug 22, 2020 at 01:51:56PM -0400, Bruce Momjian wrote:
> On Fri, Aug 21, 2020 at 08:41:54PM -0700, David G. Johnston wrote:
> > On Fri, Aug 21, 2020 at 2:58 PM Bruce Momjian <bruce@momjian.us> wrote:
> >     Good idea.  People have been confused about this before.  Attached is a
> >     patch.
> > 
> > 
> > + It is also possible to access the file as a foreign data wrapper
> > +        using <xref linkend="file-fdw"/>.
> > 
> > Seems more accurate to say "It is also possible to access the file as a foreign
> > table, using the supplied <xref linkend="file-fdw"/> module."
> > 
> > The file_fdw -> config change looks good.
> 
> OK, updated patch attached.

Patch applied.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee