Re: JSON for PG 9.2 - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: JSON for PG 9.2
Date
Msg-id CAHyXU0xiH-1afzY5dT-F83jBhC6D0BjdH+ib2forcA4_+U+NDA@mail.gmail.com
Whole thread Raw
In response to Re: JSON for PG 9.2  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Tue, Jan 31, 2012 at 11:46 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> The array(select...) locution turns out to have less flexibility than the
> array_agg(record-ref) locution.

Less flexible maybe, but it can cleaner for exactly the type of
queries that will tend to come up in exactly the type of functionality
people are looking for with JSON output.  libpqtypes does exactly the
same stuff but for C clients -- so I've done tons of this kind of
programming and am maybe a bit ahead of the curve here.  Note: while
the following contrived example may seem a bit complex it has a
certain elegance and shows how the postgres type system can whip out
document style 'nosql' objects to clients who can handle them.
Perhaps there is more simplification through syntax possible, but as
it stands things are pretty functional.   The equivalent production
through array_agg I find to be pretty awful looking although it can
produce a better plan since it doesn't force everything through
flattened subqueries:

create table foo
( foo_id serial primary key, a int
);

create table bar
( bar_id serial primary key, foo_id int references foo, b int
);

create table baz
( baz_id serial primary key, bar_id int references bar, c int
);

create type baz_t as
( c int
);

create type bar_t as
( bazs baz_t[], b int
);

create type foo_t as
( bars bar_t[], a int
);

INSERT INTO foo(a) VALUES (1);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 100);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 1000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 2000);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 200);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 3000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 4000);
INSERT INTO foo(a) VALUES (2);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 300);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 5000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 6000);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 400);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 7000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 8000);

-- nosql!
select array( select row(   array(     select row(       array(         select row(           c         )::baz_t from
bazwhere baz.bar_id = bar.bar_id       )::baz_t[],       b     )::bar_t from bar where bar.foo_id = foo.foo_id
)::bar_t[],  a )::foo_t from foo
 
)::foo_t[];
                               foo_t

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------{"(\"{\"\"(\\\\\"\"{(1000),(2000)}\\\\\"\",100)\"\",\"\"(\\\\\"\"{(3000),(4000)}\\\\\"\",200)\"\"}\",1)","(\"{\"\"(\\\\\"\"{(5000),(6000)}\\\\\"\",300)\"\",\"\"(\\\\\"\"{(7000),(8000)}\\\\\"\",400)\"\"}\",2)"}

as you can see, the postgres default escaping format sucks for sending
nested data -- throw even one quote or backslash in there and your
data can explode in size 10+ times -- this is why we insisted on
binary.  json, of course, is much better suited for this type of
communication.   despite the complicated-ness look of the above, this
type of code is in fact very easy to write once you get the knack.
This type of coding also leads to much simpler coding on the cilent
since relationships are directly built into the structure and don't
have to be inferred or duplicated.


merlin


pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Progress on fast path sorting, btree index creation time
Next
From: Magnus Hagander
Date:
Subject: Re: libpq: fix sslcompression leak