Thread: Benchmark of using JSON to transport query results in node.js
Hi!

I made some benchmarks of using JSON to transport results to node.js, and it seems it really makes a difference over using native or standard PostgreSQL result handling. The idea is that you simply wrap all results into JSON, like SELECT to_json(t) FROM (... original query ...) AS t. I am guessing this is because node.js/JavaScript has a really fast JSON parser, while everything else carries overhead. See my blog post for more details [1]. Any feedback welcome.

This makes me wonder: if serialization/deserialization makes such a big impact, were there any efforts to improve how results are serialized for over-the-wire transmission? For example, to use something like Capnproto [2] to serialize into a structure which can be used directly, without any real deserialization?

[1] https://mitar.tnode.com/post/181893159351/in-nodejs-always-query-in-json-from-postgresql
[2] https://capnproto.org/

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
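For reference, a minimal sketch of the to_json wrapping from node.js, assuming the node-postgres ("pg") package and a hypothetical "events" table; connection settings come from the usual PG* environment variables, so treat this as an illustration rather than the benchmarked code:

    // Sketch only: wrap the original query in to_json and read the
    // already-parsed objects back out of the single result column.
    const { Client } = require('pg');

    async function main() {
      const client = new Client(); // uses PGHOST, PGUSER, etc.
      await client.connect();

      const result = await client.query(
        'SELECT to_json(t) FROM (SELECT id, name, created_at FROM events) AS t'
      );

      // The driver parses the json column, so each row is a plain JS object.
      const rows = result.rows.map((row) => row.to_json);
      console.log(rows.length);

      await client.end();
    }

    main().catch(console.error);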
I'm fairly new to Postgres, but one question is how the node.js native driver fetches the data: fetchall, fetchmany, or fetching a single row at a time? Also, which native driver is it using?
Does the native driver do a round trip for each record fetched, or can it fetch records in batches?
For example, in the Oracle native driver (for Python, in my case), setting the cursor arraysize makes a huge performance difference when pulling back large datasets.
Pulling back 800k+ records through a cursor on a remote machine with the default arraysize took far too long (3 hours before I canceled it).
Upping the arraysize to 800 dropped that to around 40 minutes, including loading each record into a local Postgres via a function call (a more complex database structure had to be handled).
This was on low-end test equipment.
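For the node.js side, the rough equivalent of tuning arraysize would be reading through a cursor in batches. A hedged sketch, assuming the "pg" and "pg-cursor" packages and a hypothetical tracking_points table:

    // Sketch only: stream a large result set in batches instead of
    // materializing it all at once.
    const { Client } = require('pg');
    const Cursor = require('pg-cursor');

    async function main() {
      const client = new Client();
      await client.connect();

      const cursor = client.query(new Cursor('SELECT * FROM tracking_points'));

      let batch;
      do {
        batch = await cursor.read(800); // batch size, analogous to arraysize
        // ... process batch here ...
      } while (batch.length > 0);

      await cursor.close();
      await client.end();
    }

    main().catch(console.error);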
This is a relevant issue for us, as we will be developing a new front end to our application, and we still haven't finalized the architecture.
The backend built to date uses Python / Postgres. Python/Flask is one option, possibly serving the data to Android / web clients via JSON / REST.
Another option is to query directly from node.js and get JSON or native query from the database (extensive use of functions / stored procedures).
Our application is data-intensive, involving a lot of geotracking data across hundreds of devices at its core, and then quite a bit of geo/mapping/analytics around that.
Hi!

On Fri, Jan 11, 2019 at 3:06 AM Tony Shelver <tshelver@gmail.com> wrote:

> I'm fairly new to Postgres, but one question is how the node.js native driver fetches the data: fetchall, fetchmany, or fetching a single row at a time? Also, which native driver is it using?

The package I used is here [1]. It is using libpq. Is there some better native driver to use? It seems it is mostly using the PQexec call; I am not sure how that call internally fetches the data [2]. I think the reason for the slowness is that parsing of results is still done in JavaScript [3].

[1] https://github.com/brianc/node-pg-native
[2] https://www.postgresql.org/docs/devel/libpq-exec.html
[3] https://github.com/brianc/node-pg-native/blob/master/lib/build-result.js

> Does the native driver do a round trip for each record fetched, or can it fetch records in batches?

I think it waits for all results to first arrive using the native driver and only then starts processing them in JavaScript.

> Another option is to query directly from node.js and get JSON or native query from the database (extensive use of functions / stored procedures).

For web applications, I was even thinking about this crazy approach: get PostgreSQL to encode all results in JSON, and then in node.js do not parse the JSON, but send it as a string directly to the browser.

I have seen this many times in my other projects: having REST and other stuff in between requires you to read from the database, serialize it into something, then deserialize it back on the server side, maybe even wrap it into ORM objects (Django, I am looking at you), just so that you can send it to your REST code, which then converts it to JSON and sends it over. From my older benchmarks using Python/Django, this added a few hundred milliseconds very quickly when, for example, time-series GeoJSON data (a few thousand points) was being read and sent to the client through a REST interface. I hacked it so that I passed GeoJSON fields through directly as strings, by injecting the GeoJSON into a placeholder in the otherwise serialized JSON (luckily JSON is recursive) through pure string manipulation, and it improved things greatly. And I just disabled all other REST content types except for JSON. We have not used XML or YAML over our API anyway.

So to me it seems encoding all results in JSON is always a win in node.js. If you consume it in node.js, great. If you are planning to pass it on, maybe just pass it on as-is.

> Our application is data-intensive, involving a lot of geotracking data across hundreds of devices at its core, and then quite a bit of geo/mapping/analytics around that.

It seems maybe very similar to what I had in Python/Django/REST in the past. :-)

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
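A minimal sketch of that pass-through idea, assuming the "pg" package and Node's built-in http module; the "events" table and column names are illustrative, and the cast to text keeps the driver from parsing the JSON:

    // Sketch only: PostgreSQL builds the JSON, node.js relays it as a string.
    const http = require('http');
    const { Pool } = require('pg');

    const pool = new Pool(); // uses PGHOST, PGUSER, etc.

    http.createServer(async (req, res) => {
      // coalesce handles the empty result; ::text prevents the driver from
      // parsing the JSON back into objects.
      const result = await pool.query(
        "SELECT coalesce(json_agg(t), '[]')::text AS body " +
        "FROM (SELECT id, name FROM events) AS t"
      );

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(result.rows[0].body); // never parsed in node.js
    }).listen(3000);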
On Fri, Jan 11, 2019 at 10:31 AM Mitar <mmitar@gmail.com> wrote:
> Hi!
>
> > Another option is to query directly from node.js and get JSON or native query from the database (extensive use of functions / stored procedures).
>
> For web applications, I was even thinking about this crazy approach:
> get PostgreSQL to encode all results in JSON, and then in node.js do
> not parse the JSON, but send it as a string directly to the browser.
Not crazy at all, that's basically how PostgREST works.
-Michel
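For context, PostgREST exposes tables and views as HTTP endpoints that return JSON built by the database, so the application tier can relay it without parsing. An illustrative sketch, assuming Node 18+ (global fetch) and a hypothetical PostgREST instance on localhost:3000 exposing an "events" table:

    // Illustration only: fetch rows from a hypothetical PostgREST endpoint
    // and keep the response as a string, never parsing it in node.js.
    async function relayEvents() {
      const res = await fetch('http://localhost:3000/events?select=id,name');
      return res.text(); // JSON array built by PostgREST/PostgreSQL, ready to relay
    }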