Re: WIP - xmlvalidate implementation from TODO list - Mailing list pgsql-hackers

From Marcos Magueta
Subject Re: WIP - xmlvalidate implementation from TODO list
Date
Msg-id CAN3aFCcXwS7BrU1gHRUEBH3G59EVf_7LUhLeEWqW2Sc9Vk5k-A@mail.gmail.com
In response to Re: WIP - xmlvalidate implementation from TODO list  (Jim Jones <jim.jones@uni-muenster.de>)
List pgsql-hackers
Hey Jim!

On 06.01.26, Jim Jones <jim.jones@uni-muenster.de> wrote:
> The result of <XML validate> is R.
That was an oversight on my part. I had a hard time understanding the standard, but it now makes more sense that both DOCUMENT and CONTENT are accepted for validation.

The current patch has some issues.

>  xmloption is document_or_content. But xmlvalidate_text_schema() always validates as a document.
As Andrey noticed, we should indeed support both document and content. In content mode that entails an iterative validation (one pass for each node provided), so I should probably add the xmloption back. The fact that it worked with the example I created was just luck.

Also, I am not sure whether some of the variables used inside the PG_TRY block are memory safe. None of them is currently declared volatile, despite being accessed in different parts of the block; other functions in xml.c (like xml_parse) seem to handle this correctly.
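
To make the concern concrete, here is a standalone sketch (not code from the patch). PG_TRY is built on sigsetjmp/siglongjmp, and C leaves a non-volatile local with an indeterminate value after the jump if it was modified after the setjmp call. The sketch uses plain setjmp/longjmp in place of the PostgreSQL macros, and the names validate_or_throw and validate_with_cleanup are hypothetical, chosen only for illustration.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Stand-in for the jump target that PG_TRY sets up (assumption: simplified). */
static jmp_buf error_jmp;

/* Hypothetical stand-in for a call that may raise an error via longjmp. */
static void
validate_or_throw(int should_fail)
{
    if (should_fail)
        longjmp(error_jmp, 1);
}

/*
 * Allocates a buffer inside the "try" part and frees it after the
 * try/catch, whether or not an error was raised.
 * Returns 1 if the buffer was released, 0 otherwise.
 */
int
validate_with_cleanup(int should_fail, int *error_seen)
{
    /*
     * The pointer itself is volatile-qualified: it is assigned after
     * setjmp() and read again after a possible longjmp().  Without the
     * qualifier its value would be indeterminate on the error path.
     */
    char *volatile schema_buf = NULL;
    int released = 0;

    *error_seen = 0;

    if (setjmp(error_jmp) == 0)
    {
        /* corresponds to the PG_TRY body */
        schema_buf = malloc(64);
        validate_or_throw(should_fail);
    }
    else
    {
        /* corresponds to the PG_CATCH body */
        *error_seen = 1;
    }

    /* Cleanup runs on both paths and relies on schema_buf being valid. */
    if (schema_buf != NULL)
    {
        free(schema_buf);
        released = 1;
    }
    return released;
}
```

Calling validate_with_cleanup(1, &err) takes the error path and still frees the buffer. In the patch, the equivalent fix would presumably be to add the volatile qualifier to whichever locals are assigned inside PG_TRY and then used in PG_CATCH or afterwards.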

About the syntax proposal by Jim, I have no problem complying with it. It does considerably increase the scope beyond what I originally intended, but that's the price of having something actually nice.

I can think of several useful extensions we could consider for a follow-up implementation:

Schema Dependencies/Imports
CREATE XMLSCHEMA base AS '...';
CREATE XMLSCHEMA extended
  IMPORTS base
  AS '...';

Schema Versioning
CREATE XMLSCHEMA patient VERSION '1.0' AS '...';
CREATE XMLSCHEMA patient VERSION '2.0' AS '...';
XMLVALIDATE(doc ACCORDING TO XMLSCHEMA patient VERSION '2.0')

Custom Error Messages
CREATE XMLSCHEMA patient
  AS '...'
  ERROR MESSAGE 'Patient record does not match schema v2.0';

Schema inference from samples (if the library supports it, that is)
CREATE XMLSCHEMA patient
  INFER FROM (SELECT data FROM patient_samples);

And much more, but perhaps that's already too ambitious for a first version.

I'll wait for the others to chime in.

Regards, Magueta.
