Re: Benchmark of using JSON to transport query results in node.js - Mailing list pgsql-general

From	Tony Shelver
Subject	Re: Benchmark of using JSON to transport query results in node.js
Date	January 11, 2019 11:06:01
Msg-id	CAG0dhZBZhLGWu1Xqvn2jqg=djf35sYhRw3j2OCUEjjC0V3Y2Fg@mail.gmail.com
In response to	Benchmark of using JSON to transport query results in node.js (Mitar <mmitar@gmail.com>)
Responses	Re: Benchmark of using JSON to transport query results in node.js
List	pgsql-general

I'm fairly new to Postgres, but one question is how the node.js native driver fetches the data: all rows at once (fetchall), in batches (fetchmany), or one row at a time? Also, which native driver is it using?

Does the native driver do a round trip for each record fetched, or can it batch them into multiples?

For example, in the Oracle native driver (for Python, in my case), setting the cursor arraysize makes a huge performance difference when pulling back large datasets.

Pulling back 800k+ records through a cursor on a remote machine with the default arraysize took far too long (I canceled it after 3 hours).

Upping the arraysize to 800 dropped that to around 40 minutes, including loading each record into a local Postgres via a function call (more complex database structure to be handled).
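To make the arraysize point concrete, here is a minimal sketch of batched fetching through the Python DB-API. It uses the standard-library sqlite3 module as a stand-in so that it runs anywhere; the table name, row count, and the iter_rows helper are made up for illustration. With an in-memory SQLite database there is no network, so this only shows the shape of the calls. On a remote database the batch size is what determines how many rows come back per round trip, and exactly how a given driver maps arraysize onto network fetches is driver-specific.

```python
import sqlite3


def iter_rows(cursor, batch_size=800):
    """Yield rows one at a time while pulling them from the driver in batches.

    iter_rows is a hypothetical helper, not part of any driver.
    """
    # DB-API 2.0 attribute: the default number of rows fetchmany() returns.
    cursor.arraysize = batch_size
    while True:
        batch = cursor.fetchmany()  # fetches up to cursor.arraysize rows
        if not batch:
            break
        yield from batch


# Stand-in data set (assumption: any DB-API connection could be used here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings (value) VALUES (?)",
    [(i * 0.5,) for i in range(2000)],
)

cur = conn.cursor()
cur.execute("SELECT id, value FROM readings ORDER BY id")
total = sum(1 for _ in iter_rows(cur, batch_size=800))
print(total)
```

The caller still sees one row at a time, so the per-row processing code does not need to change when the batch size is tuned.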

This is on low-level test equipment.

This is a relevant issue for us, as we will be developing a new front end to our application, and we still haven't finalized the architecture.

The backend build to date uses Python / Postgres. Python/Flask is one option, possibly serving the data to Android / web via JSON / REST.

Another option is to query directly from node.js and get JSON or native query results from the database (with extensive use of functions / stored procedures).

Our application is data-intensive, involving a lot of geotracking data across hundreds of devices at its core, and then quite a bit of geo / mapping / analytics around that.

On Thu, 10 Jan 2019 at 23:52, Mitar <mmitar@gmail.com> wrote:

Hi!

I made some benchmarks of using JSON to transport results to node.js
and it seems it really makes a difference over using native or
standard PostgreSQL. So the idea is that you simply wrap all results
into JSON like SELECT to_json(t) FROM (... original query ...) AS t. I
am guessing this is because node.js/JavaScript has a really fast JSON
parser, but for everything else there is overhead. See my blog post
for more details [1]. Any feedback welcome.
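As a rough illustration of the wrapping described above, the sketch below builds the to_json query around an existing SELECT and decodes rows that arrive as JSON text. It is written in Python rather than node.js, and the wrap_as_json and decode_rows helpers, the devices query, and the sample rows are all hypothetical; no database is contacted. Some drivers decode json columns automatically, in which case the explicit json.loads step would not be needed.

```python
import json


def wrap_as_json(query: str) -> str:
    """Wrap a SELECT so that each result row comes back as one JSON value."""
    inner = query.strip().rstrip(";")  # a subquery must not end in ';'
    return f"SELECT to_json(t) FROM ({inner}) AS t"


def decode_rows(raw_rows):
    """Decode single-column rows whose only value is JSON text."""
    return [json.loads(row[0]) for row in raw_rows]


sql = wrap_as_json("SELECT id, name FROM devices;")
print(sql)

# Hypothetical rows, shaped like what a driver might return for the
# wrapped query when it hands back the json column as text.
sample_rows = [
    ('{"id": 1, "name": "tracker-a"}',),
    ('{"id": 2, "name": "tracker-b"}',),
]
decoded = decode_rows(sample_rows)
print(decoded[0]["name"])
```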

This makes me wonder. If serialization/deserialization makes such a big
impact, were there efforts to improve how results are serialized for
over-the-wire transmission? For example, to use something like
Cap'n Proto [2] to serialize into a structure which can be directly
used without any real deserialization?

[1] https://mitar.tnode.com/post/181893159351/in-nodejs-always-query-in-json-from-postgresql
[2] https://capnproto.org/

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
