Re: WIP - xmlvalidate implementation from TODO list - Mailing list pgsql-hackers

From Marcos Magueta
Subject Re: WIP - xmlvalidate implementation from TODO list
Date
Msg-id CAN3aFCfdGp6TGTQNOVO1im1u2vO_E2jnTGVV2xhea7eNY7GtuQ@mail.gmail.com
In response to Re: WIP - xmlvalidate implementation from TODO list  (Jim Jones <jim.jones@uni-muenster.de>)
Responses Re: WIP - xmlvalidate implementation from TODO list
List pgsql-hackers
Thank you all for the careful review!

I'll go through the review points and fix the tests and code today, but first I have a couple of questions about the catalog idea.

If we were to implement a catalog, I believe registration would either copy the schema into a user-specified relation (created on demand) or insert it into a system catalog, something like pg_xmlschema. That is a realistic change I could work on. But what about the privilege level and file fetch support? I believe fetching files is not really an issue if the user is sufficiently privileged, so should it mirror COPY FROM? I haven't looked at its implementation, but I suppose it already has security checks on the user's privilege level. A valid alternative is to not deal with privileges at all and keep the restrictions already in place against fetching anything external to the specified schema; that way we are just moving the schema definition into a separate command that runs before validation is invoked, and ignoring any references outside of the plain text supplied (so no file://, which IBM uses, just text).
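
To make the "just text" option concrete, here is a minimal sketch in Python (not backend C code) of a registration-time check that rejects a schema whose text points at other documents. The function names are hypothetical and nothing here is part of the patch; it only looks at schemaLocation on xs:import, xs:include and xs:redefine, so DTD external entities would need separate handling.

```python
import xml.etree.ElementTree as ET
from typing import List

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# XSD elements that can pull in another schema document via schemaLocation.
EXTERNAL_REF_TAGS = {
    "{%s}%s" % (XSD_NS, name) for name in ("import", "include", "redefine")
}


def external_schema_locations(schema_text: str) -> List[str]:
    """Return every schemaLocation named by xs:import, xs:include or xs:redefine."""
    root = ET.fromstring(schema_text)
    return [
        elem.attrib["schemaLocation"]
        for elem in root.iter()
        if elem.tag in EXTERNAL_REF_TAGS and "schemaLocation" in elem.attrib
    ]


def check_self_contained(schema_text: str) -> None:
    """Raise ValueError if the schema refers to anything outside its own text."""
    locations = external_schema_locations(schema_text)
    if locations:
        raise ValueError(
            "schema references external documents: " + ", ".join(locations)
        )
```

Whether such references should be rejected outright or silently ignored is exactly the open question; the sketch picks rejection only because it is easier to show.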

Surprisingly, the standard (I only have the 2016 edition here) leaves a great deal of freedom in how to implement the registration. It just specifies what a registered schema should have:

An XML namespace NS contained in a registered XML Schema is non-deterministic if NS contains a global
element declaration schema component that is non-deterministic.
A registered XML Schema is non-deterministic if it contains a non-deterministic XML namespace.
A registered XML Schema is described by a registered XML Schema descriptor. A registered XML Schema
descriptor includes:
— The target namespace URI of the registered XML Schema.
— The schema location URI of the registered XML Schema.
— The <registered XML Schema name> of the registered XML Schema.
— An indication of whether the registered XML Schema is permanently registered.
— An indication of whether the registered XML Schema is non-deterministic.
— An unordered collection of the namespaces defined by the registered XML Schema (the target namespace
is one of these namespaces).
— For each namespace defined by the registered XML Schema, an unordered collection of the global element
declaration schema components in that namespace, with an indication for each global element declaration
schema component whether that global element declaration schema component is non-deterministic.
NOTE 9 — Without Feature X161, “Advanced Information Schema for registered XML Schemas”, information whether an XML
Schema is deterministic, information about the collection of namespaces defined in that XML Schema, and, for each such namespace
information about the global element declaration schema components in that namespace, is not available in the XML_SCHEMAS,
XML_SCHEMA_NAMESPACES, and XML_SCHEMA_ELEMENTS views.
A registered XML Schema is identified by its <registered XML Schema name>.
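
As a rough illustration of the descriptor quoted above, the following Python sketch models its fields and the rule that non-determinism propagates from global element declarations to namespaces to the schema. The class and field names are my own assumptions, not the standard's and not a proposed pg_xmlschema layout.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RegisteredXmlSchema:
    """Hypothetical in-memory form of a registered XML Schema descriptor."""

    name: str  # <registered XML Schema name>, which identifies the schema
    target_namespace_uri: str
    schema_location_uri: str
    permanently_registered: bool = False
    # namespace URI -> {global element name -> is it non-deterministic?}
    namespaces: Dict[str, Dict[str, bool]] = field(default_factory=dict)

    def namespace_is_nondeterministic(self, ns: str) -> bool:
        # A namespace is non-deterministic if any of its global element
        # declarations is non-deterministic.
        return any(self.namespaces[ns].values())

    @property
    def nondeterministic(self) -> bool:
        # A schema is non-deterministic if any of its namespaces is.
        return any(self.namespace_is_nondeterministic(ns) for ns in self.namespaces)


# Example with made-up values.
schema = RegisteredXmlSchema(
    name="orders_v1",
    target_namespace_uri="urn:example:orders",
    schema_location_uri="orders.xsd",
    namespaces={"urn:example:orders": {"order": False, "item": False}},
)
```

Here the non-deterministic flag is derived rather than stored, while the descriptor in the standard only requires that the indication be included, so a catalog could equally well store it as a column.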

I am tempted to go with a pg_xmlschema definition in the catalog and an interface like the one IBM has, but still restricting file access. Dealing with the security problems of file access sounds excruciating. Any opinions?

Regards, Magueta.
