Thread: I do not get the point of the information_schema

I do not get the point of the information_schema

From

Thiemo Kellner

Date:

13 February 2018, 01:01:41

I try to implement SCD2 on trigger level and try to generated needed 
code on the fly. Therefore I need to read data about the objects in the 
database. So far so good. I know of the information_schema and the 
pg_catalog. The documentation for the information_schema states that it 
'is defined in the SQL standard and can therefore be expected to be 
portable and remain stable'. I can think of a sensible meaning of 
portable. One cannot port it to MariaDB, can one? Maybe different 
PostreSQL version but then a one fits all implementation would mean only 
parts of the catalogue that never ever change can be exposed by the 
information_schema. Coming from Oracle I consider the information_schema 
the analogy to Oracles data dictionary views giving a stable interface 
on the database metadata hiding catalogue structure changes. But I 
dearly miss some information therein. I created following query to get 
the index columns of an index. I fear breakage when not run on the 
specific version I developed it against. Is there a more elegant way by 
the information_schema?

with INDEX_COLUMN_VECTOR as(
   select
     i.indkey
   from
     pg_catalog.pg_index i
   inner join pg_catalog.pg_class c on
     i.indexrelid = c.oid
   where
     c.relname = 'idiom_hist'
),
COLUMNS as(
   select
     a.attname,
     a.attnum
   from
     pg_catalog.pg_attribute a
   inner join pg_catalog.pg_class c on
     a.attrelid = c.oid
   where
     c.relname = 'idiom'
) select
   c.attname
from
   COLUMNS c
inner join INDEX_COLUMN_VECTOR v on
   c.attnum = any(v.indkey)
order by
   c.attnum asc;

An other simpler case.

     select
       indexname
     from
       pg_catalog.pg_indexes
     where
       schemaname = 'act'
       and tablename = i_table_name
       and indexname = i_table_name || '_hist';


-- 
Öffentlicher PGP-Schlüssel:
http://pgp.mit.edu/pks/lookup?op=get&search=0xCA167FB0E717AFFC

Attachment

thiemo.vcf

Re: I do not get the point of the information_schema

From

Tom Lane

Date:

13 February 2018, 01:23:14

Thiemo Kellner <thiemo@gelassene-pferde.biz> writes:
> I try to implement SCD2 on trigger level and try to generated needed 
> code on the fly. Therefore I need to read data about the objects in the 
> database. So far so good. I know of the information_schema and the 
> pg_catalog. The documentation for the information_schema states that it 
> 'is defined in the SQL standard and can therefore be expected to be 
> portable and remain stable'. I can think of a sensible meaning of 
> portable. One cannot port it to MariaDB, can one?

If MariaDB implements information_schema according to the spec, then
yes.  (If they don't, that's something to complain about to them,
not us.)

> I created following query to get 
> the index columns of an index. I fear breakage when not run on the 
> specific version I developed it against. Is there a more elegant way by 
> the information_schema?

No, because indexes are not a part of the SQL standard.  (I'm not here
to debate the wisdom of that choice; we didn't make it.)  You can get
information about constraints out of the information_schema, so to the
extent that what you're interested in is the indexes underlying PK or
UNIQUE constraints, that's an option.  Otherwise, you're dealing with an
implementation-specific feature and you shouldn't be surprised that the
way of finding out about it is likewise implementation-specific.

FWIW, the stuff used in your sample query has all been there for a
very long time; we don't like to break plausible client queries lightly.

            regards, tom lane

Re: I do not get the point of the information_schema

From

"Peter J. Holzer"

Date:

14 February 2018, 01:57:55

On 2018-02-12 23:01:41 +0100, Thiemo Kellner wrote:
> I try to implement SCD2 on trigger level and try to generated needed code on
> the fly. Therefore I need to read data about the objects in the database. So
> far so good. I know of the information_schema and the pg_catalog. The
> documentation for the information_schema states that it 'is defined in the
> SQL standard and can therefore be expected to be portable and remain
> stable'. I can think of a sensible meaning of portable. One cannot port it
> to MariaDB, can one?

You don't port the information schema to MariaDB. The information schema
is provided by the database.

The *use* of the information schema is portable between standard-
conforming databases, however.

You can use

    select table_schema, table_name
    from information_schema.tables
    where table_type='BASE TABLE';

on both PostgreSQL and MariaDB to get a list of tables.

(That said, it looks like both PostgreSQL and MariaDB include additional
columns beyond those mandated by the standard - you can't rely on those,
of course. And some databases like Oracle don't even have an information
schema.)

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

signature.asc

Re: I do not get the point of the information_schema

From

"David G. Johnston"

Date:

14 February 2018, 02:06:43

On Tue, Feb 13, 2018 at 3:57 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

(That said, it looks like both PostgreSQL and MariaDB include additional
columns beyond those mandated by the standard - you can't rely on those,
of course. And some databases like Oracle don't even have an information
schema.)

Given the documented charter of information_schema I'd present your conclusion and evidence to pgsql-bugs...while I suppose extra columns are not inherently harmful at minimum they would need to be documented if kept.

David J.

Re: I do not get the point of the information_schema

From

"Peter J. Holzer"

Date:

14 February 2018, 02:17:31

On 2018-02-13 16:06:43 -0700, David G. Johnston wrote:
> On Tue, Feb 13, 2018 at 3:57 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
>     (That said, it looks like both PostgreSQL and MariaDB include additional
>     columns beyond those mandated by the standard - you can't rely on those,
>     of course. And some databases like Oracle don't even have an information
>     schema.)
>
>
> Given the documented charter of information_schema I'd present your conclusion
> and evidence to pgsql-bugs...while I suppose extra columns are not inherently
> harmful at minimum they would need to be documented if kept.

I don't have evidence, as I don't have access to a recent SQL standard.
But I noticed that for example information_schema.tables have only a few
columns in common between PostgreSQL and MariaDB:

hjp=> select * from information_schema.tables where table_type='BASE TABLE' limit 1;
─[ RECORD 1 ]────────────────┬───────────
table_catalog                │ hjp
table_schema                 │ public
table_name                   │ hjpnet
table_type                   │ BASE TABLE
self_referencing_column_name │ (∅)
reference_generation         │ (∅)
user_defined_type_catalog    │ (∅)
user_defined_type_schema     │ (∅)
user_defined_type_name       │ (∅)
is_insertable_into           │ YES
is_typed                     │ NO
commit_action                │ (∅)

MariaDB [simba]> select * from information_schema.tables where table_type='BASE TABLE' limit 1\G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: simba
     TABLE_NAME: archived_versions
     TABLE_TYPE: BASE TABLE
         ENGINE: MyISAM
        VERSION: 10
     ROW_FORMAT: Fixed
     TABLE_ROWS: 0
 AVG_ROW_LENGTH: 0
    DATA_LENGTH: 0
MAX_DATA_LENGTH: 7036874417766399
   INDEX_LENGTH: 1024
      DATA_FREE: 0
 AUTO_INCREMENT: 1
    CREATE_TIME: 2008-01-28 01:24:48
    UPDATE_TIME: 2008-01-28 01:24:48
     CHECK_TIME: NULL
TABLE_COLLATION: utf8_general_ci
       CHECKSUM: NULL
 CREATE_OPTIONS:
  TABLE_COMMENT:
1 row in set (0.01 sec)

The first 4 are the same, all others are different.
It is possible that all the columns that PostgreSQL has are required by
the standard and that MariaDB is non-conforming by omitting them, but at
least some of the names look quite PostgreSQL-specific to me. So my
guess is that the standard only requires the first 4 and the rest are
RDBMS-specific.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

signature.asc

Re: I do not get the point of the information_schema

From

"David G. Johnston"

Date:

14 February 2018, 02:25:58

On Tue, Feb 13, 2018 at 4:17 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

It is possible that all the columns that PostgreSQL has are required by
the standard and that MariaDB is non-conforming by omitting them, but at
least some of the names look quite PostgreSQL-specific to me. So my
guess is that the standard only requires the first 4 and the rest are
RDBMS-specific.

Unless our docs are completely misleading I'd say that PostgreSQL is being conforming while MariaDB is treating information_schema as their version of pg_catalog (or at least our system views over top of pg_catalog).

https://www.postgresql.org/docs/10/static/infoschema-tables.html

If 5 and 6 and the last columns were not standard conforming it would seem pointless to include them since we don't have/implement the features they cover.

David J.

Re: I do not get the point of the information_schema

From

Tom Lane

Date:

14 February 2018, 02:46:25

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Unless our docs are completely misleading I'd say that PostgreSQL is being
> conforming while MariaDB is treating information_schema as their version of
> pg_catalog (or at least our system views over top of pg_catalog).

Our project policy is that information_schema should show exactly the
columns mandated by whichever spec version we consider current.  If you
see some that are not in your copy of the standard, you probably need a
newer copy.  (There are links in our wiki to free draft versions of the
spec, which I think are what most of us rely on in practice.)

MariaDB has a much laxer notion of what compliance to the standard means
here, and AFAIK they feel free to add columns that are not in the
standard.  We do not do that.

            regards, tom lane

Re: I do not get the point of the information_schema

From

Eric Hanson

Date:

15 February 2018, 06:46:53

You might find Aquameta's meta module helpful, it reimplements information_schema in a more normalized layout, as updatable views:

http://blog.aquameta.com/2015/08/29/intro-meta/

https://github.com/aquametalabs/aquameta/tree/master/core/000-meta

Best,

Eric

On Mon, Feb 12, 2018 at 2:02 PM Thiemo Kellner <thiemo@gelassene-pferde.biz> wrote:

I try to implement SCD2 on trigger level and try to generated needed
code on the fly. Therefore I need to read data about the objects in the
database. So far so good. I know of the information_schema and the
pg_catalog. The documentation for the information_schema states that it
'is defined in the SQL standard and can therefore be expected to be
portable and remain stable'. I can think of a sensible meaning of
portable. One cannot port it to MariaDB, can one? Maybe different
PostreSQL version but then a one fits all implementation would mean only
parts of the catalogue that never ever change can be exposed by the
information_schema. Coming from Oracle I consider the information_schema
the analogy to Oracles data dictionary views giving a stable interface
on the database metadata hiding catalogue structure changes. But I
dearly miss some information therein. I created following query to get
the index columns of an index. I fear breakage when not run on the
specific version I developed it against. Is there a more elegant way by
the information_schema?

with INDEX_COLUMN_VECTOR as(
select
i.indkey
from
pg_catalog.pg_index i
inner join pg_catalog.pg_class c on
i.indexrelid = c.oid
where
c.relname = 'idiom_hist'
),
COLUMNS as(
select
a.attname,
a.attnum
from
pg_catalog.pg_attribute a
inner join pg_catalog.pg_class c on
a.attrelid = c.oid
where
c.relname = 'idiom'
) select
c.attname
from
COLUMNS c
inner join INDEX_COLUMN_VECTOR v on
c.attnum = any(v.indkey)
order by
c.attnum asc;

An other simpler case.

select
indexname
from
pg_catalog.pg_indexes
where
schemaname = 'act'
and tablename = i_table_name
and indexname = i_table_name || '_hist';

--
Öffentlicher PGP-Schlüssel:
http://pgp.mit.edu/pks/lookup?op=get&search=0xCA167FB0E717AFFC

-- Eric Hanson CEO, Aquameta 503-929-1073