Re: Tackling JsonPath support - Mailing list pgsql-hackers

From Nico Williams
Subject Re: Tackling JsonPath support
Date
Msg-id 20161202213223.GJ24797@localhost
Whole thread Raw
In response to Re: Tackling JsonPath support  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Tackling JsonPath support
Re: Tackling JsonPath support
List pgsql-hackers
On Fri, Dec 02, 2016 at 08:53:33AM -0500, Robert Haas wrote:
> On Tue, Nov 29, 2016 at 11:50 AM, Christian Convey
> <christian.convey@gmail.com> wrote:
> > I think I can satisfy (3) with a PG extension which provides a function that
> > approximately implements JSONPath.  My short-term plans are to submit such a
> > patch.
> 
> FWIW, I think that's a fine plan.  I don't really know whether
> JSONPath is the right standard to pick for the task of extracting bits

It's not even a standard.  Are there particular proposals that the ANSI
SQL working group is considering?

> of JSON from other bits of JSON, but I think there's some value in
> picking something is simple enough that we can implement it in our own
> code and not have to rely on a third-party library.  Of course, if
> somebody feels like adding a configure option for --with-jq and

Sure.  My main concern is that I don't want to have to parse/format JSON
around every such call.  I'd rather parsed JSON remain in an internal
form for as long as possible.

Speaking of which, you could use libjq's jv API and not support the jq
language itself.

> appropriate interfaces to integrate with JQ, we could consider that,
> too, but that imposes a packaging requirement that a home-grown
> implementation doesn't.  I'd want to hear more than one vote for such

What we do in Heimdal, OpenAFS, and other open source projects, some
times, is include a copy / git submodule / similar of some such external
dependencies.  Naturally it's not possible to do this for all external
dependencies, but it works well enough.  The jv API part of jq is small
and simple, and could be ripped out into a library that could be
included in PostgreSQL.

> a course of action before embracing it.  If JQ is a Turing-complete
> query language, integrating it might be quite difficult -- for

Even if it weren't!  (It is.)

Consider this expression using a builtin in jq:
 [range(4503599627370496)]

That is, an array of integers from 0 to 4503599627370495, inclusive.
That will "halt" given a very, very large computer and a lot of time.

(Because jq is Turning-complete, range() can be coded in jq itself, and
some variants of range() are.)

> example, we'd need a way to make sure that it does periodic
> CHECK_FOR_INTERRUPTS() calls, and that it doesn't leak resources or
> crash if those calls decide longjmp() away due to an ERROR -- and
> would we let people query database tables with it?  Would that be
> efficient?  I think it's fine to have more limited objectives than
> what a JQ implementation would apparently entail.

Agreed.  I think this means that we need either or both of a variant of
the C jq_next() function that takes either a timeout parameter, or a
jq_intr() function that can cause a running jq_next() to stop.

(Tolerating longjmp() is harder to do and I'd rather not.)

Other projects, like, say, nginx or similar where there is a per-client
or per-connection memory pool to limit memory footprint, might want
libjq to get an allocator hook, so that's another enhancement to
consider.  If that's something that PostgreSQL would need, please let me
know.

Nico
-- 



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Performance improvement for joins where outer side is unique
Next
From: Alvaro Herrera
Date:
Subject: Re: patch: function xmltable