Re: binary representation of datatypes - Mailing list pgsql-hackers

From Matthieu Imbert
Subject Re: binary representation of datatypes
Date
Msg-id 48FDBF08.4080200@ens-lyon.fr
Whole thread Raw
In response to Re: binary representation of datatypes  (Michael Meskes <meskes@postgresql.org>)
Responses Re: binary representation of datatypes
Re: binary representation of datatypes
List pgsql-hackers
Michael Meskes wrote:
> On Tue, Oct 21, 2008 at 10:21:38AM +0200, Matthieu Imbert wrote:
>> I would like to be able to get results from SQL commands directly in a
>> binary format, instead of a textual one. Actually, I want to be able to
>> get timestamps with their full precision (microsecond).
> 
> Are you sure you cannot get those in textual mode? If so I wonder why I got
> some numbers in a quick test:
> 
> ...
> [NO_PID]: ecpg_execute on line 37: query: select  *  from date_test where d =  $1   ; with 1 parameter(s) on
connectionregress1
 
> [NO_PID]: sqlca: code: 0, state: 00000
> [NO_PID]: ecpg_execute on line 37: using PQexecParams
> [NO_PID]: sqlca: code: 0, state: 00000
> [NO_PID]: free_params on line 37: parameter 1 = 1966-01-17
> [NO_PID]: sqlca: code: 0, state: 00000
> [NO_PID]: ecpg_execute on line 37: correctly got 1 tuples with 2 fields
> [NO_PID]: sqlca: code: 0, state: 00000
> [NO_PID]: ecpg_get_data on line 37: RESULT: 1966-01-17 offset: -1; array: yes
> [NO_PID]: sqlca: code: 0, state: 00000
> [NO_PID]: ecpg_get_data on line 37: RESULT: 2000-07-12 17:34:29.140787 offset: -1; array: yes
> [NO_PID]: sqlca: code: 0, state: 00000
> ...
> 
> What do I miss here?
> 
> Michael


Yes microseconds are available in textual mode but i do want to use binary mode. Let me explain why:

- my data will be time series. So typical requests will return lots of timestamped data (mainly floats or int).

- after extraction i need to have all timestamps stored in format convenient for calculations. I can accommodate
differentformats
 
(for example: number of microseconds since 2000-01-01, or a structure similar to pg_tm (but with microsecond
precision),or a
 
time-format similar to one defined in rfc1305), but definitely storing timestamps as text is a no go for me.

so i have two choices:

scenario 1 - parse the textual representation of all results of requests to the database and convert textual timestamps
toa binary
 
format that i choose among those ones (number of microseconds since 2000-01-01, or a structure similar to pg_tm (but
with
microsecond precision), or a time-format similar to one defined in rfc1305, or something else)

or

scenario 2 - directly use pgsql binary timestamp format. I think the latter is far more efficient. I'm new to
postgresql,but from
 
what i understand, here are the conversions involved in both scenarios (hopping that my ascii art won't be garbled by
yourmail
 
clients ;-) :


scenario 1:
.---------.  .----------.  .---------.  .----------.  .--------------.  .----------.  .---------.
|timestamp|  |pgsql     |  |timestamp|  |pgsql     |  |timestamp     |  |my        |  |my       |
|storage  |->|internal  |->|storage  |->|network   |->|as            |->|timestamp |->|timestamp|
|in       |  |to        |  |in       |  |to        |  |textual       |  |conversion|  |format   |
|database |  |network   |  |network  |  |textual   |  |representation|  |routines  |  |         |
|backend  |  |conversion|  |         |  |conversion|  |              |  |          |  |         |
|         |  |function  |  |         |  |function  |  |              |  |          |  |         |
'---------'  '----------'  '---------'  '----------'  '--------------'  '----------'  '---------'

scenario 2:
.---------.  .----------.  .---------.  .----------.  .---------.
|timestamp|  |pgsql     |  |timestamp|  |pgsql     |  |timestamp|
|storage  |->|internal  |->|storage  |->|network   |->|official |
|in       |  |to        |  |in       |  |to        |  |format   |
|database |  |network   |  |network  |  |offical   |  |         |
|backend  |  |conversion|  |         |  |conversion|  |         |
|         |  |function  |  |         |  |function  |  |         |
'---------'  '----------'  '---------'  '----------'  '---------'

if i'm correct, it seems obvious that the second scenario is more efficient (and less ugly).

In scenario 2, when talking about timestamp 'official' format, i mean timestamp expressed as number of microseconds
since
2000-01-01. But of course, it only deserves this name 'official' if it is guaranteed to stay the same across postgresql
versionsand
 
platforms

-- 
Matthieu


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: SSL cleanups/hostname verification
Next
From: Peter Eisentraut
Date:
Subject: Re: SSL cleanups/hostname verification