Re: cast bytea to double precision[] - Mailing list pgsql-novice

From Mathieu Dubois
Subject Re: cast bytea to double precision[]
Date
Msg-id 4E2EF890.9090901@limsi.fr
In response to Re: cast bytea to double precision[]  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: cast bytea to double precision[]  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-novice
On 07/26/2011 04:30 PM, Merlin Moncure wrote:
> On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois<mathieu.dubois@limsi.fr>  wrote:
>> Hello,
>>
>>> On 25/07/2011 17:58, Mathieu Dubois wrote:
>>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu
>>>> Dubois<mathieu.dubois@limsi.fr>  wrote:
>>>>> I have found a solution by myself for the conversion:
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[] (elements
>>>>> are in
>>>>> fact separated by commas).
>>>>>
>>>>> To convert my column I have used:
>>>>> ALTER TABLE my_table ALTER sig_vector TYPE double precision[] USING
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>>
>>>>> Is that correct?
>>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>> Yeah -- you are just passing a column's data into a function as an
>>>> argument -- standard practice. This will work -- your bytea is really
>>>> a text column, so it's just a matter of breaking up the string.
>>>> regexp_* functions are great for that.
>>> Thank you very much for your reply.
>>>
>>> I will launch the conversion right now.
>>>
>> The main reason to do this was to get smaller backups.
>> The size of a compressed backup was around 1GB with bytea.
>> I have converted the columns (on a copy of the database) but the expected
>> gains are not there!
>> With double precision[] it is still around 1GB (only a few MB smaller).
>>
>> Also the size on the disk is not smaller.
>> I have listed the content of /var/lib/postgres/8.4/main/base with du and the
>> 2 versions have the same size (3.1GB).
>>
>> Does it make sense?
>> My hypothesis is that the compression algorithm is able to find regularities
>> in the data, so it finds the same regularity in bytea as in double precision[].
>>
>> Is there any advantage to use double precision[] over bytea in my case?
> probably not -- arrays can be significantly smaller than a set of
> individual tuples each holding one value because of the tuple
> overhead, but you still have to pay for the array header and a 4 byte
> length/null indicator per element.
>
> A packed string is often the smallest way to store data, although not
> necessarily the best.  A double precision[] comes with a lot of syntax
> advantages.
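
For readers of the archives, here is a minimal sketch of the conversion discussed above, plus a way to measure stored value sizes with pg_column_size(). The table and column names are the placeholders from the thread, and this assumes the bytea column really holds plain comma-separated text:

```sql
-- One-shot column conversion, in the standard ALTER ... TYPE ... USING form:
ALTER TABLE my_table
    ALTER COLUMN sig_vector TYPE double precision[]
    USING regexp_split_to_array(sig_vector::text, E',')::double precision[];

-- pg_column_size() reports the stored size of a value, after any TOAST
-- compression, which makes it handy for comparing the two representations
-- row by row before and after the conversion:
SELECT pg_column_size(sig_vector) FROM my_table LIMIT 10;
```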
Thanks for your advice!

I find the result surprising because each float is encoded with many
characters (something like 20) while a double is only 8 bytes.

I have tried to run VACUUM but it changed nothing...

All of my code is based on strings, so I won't take the time to modify it
if there is no gain.
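
The size observations above can be reproduced outside the database. This sketch (with made-up sample values) compares the comma-separated text encoding against packed 8-byte doubles, raw and after zlib compression; the compressed sizes converge because the compressor exploits the redundancy in the decimal digits, which matches the backup-size result reported in the thread:

```python
import struct
import zlib

# Hypothetical sample, standing in for one sig_vector's worth of doubles.
values = [0.123456789012345, 9876.54321012345, 3.14159265358979] * 100

# Text encoding: ~17 characters per element, comma-separated (the bytea case).
text = ",".join(repr(v) for v in values).encode()

# Binary encoding: exactly 8 bytes per element (what double precision[]
# stores per element, before PostgreSQL's array header overhead).
binary = struct.pack(f"{len(values)}d", *values)

print("raw sizes:        text", len(text), "binary", len(binary))
print("zlib-compressed:  text", len(zlib.compress(text)),
      "binary", len(zlib.compress(binary)))
```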


Mathieu

> merlin
>

