Re: cast bytea to double precision[] - Mailing list pgsql-novice

From Merlin Moncure
Subject Re: cast bytea to double precision[]
Date
Msg-id CAHyXU0zP9tcLODN2iAw5DiYWPY50uvJz30V1duT_4QHqcMSNow@mail.gmail.com
In response to Re: cast bytea to double precision[]  (Mathieu Dubois <mathieu.dubois@limsi.fr>)
Responses Re: cast bytea to double precision[]  (Mathieu Dubois <mathieu.dubois@limsi.fr>)
List pgsql-novice
On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois <mathieu.dubois@limsi.fr> wrote:
> Hello,
>
> On 25/07/2011 17:58, Mathieu Dubois wrote:
>>
>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>
>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu
>>> Dubois<mathieu.dubois@limsi.fr> wrote:
>>>>
>>>> I have found a solution by myself for the conversion:
>>>> regexp_split_to_array(sig_vector, E',')::double precision[] (elements
>>>> are in
>>>> fact separated by commas).
>>>>
>>>> To convert my column I have used:
>>>> ALTER TABLE my_table ALTER sig_vector TYPE double precision[] USING
>>>> regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>
>>>> Is that correct?
>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>
>>> Yeah -- you are just passing a column's data into a function as an
>>> argument -- standard practice. This will work -- your bytea is really
>>> a text column, so it's just a matter of breaking up the string.
>>> regexp_* functions are great for that.
>>
>> Thank you very much for your reply.
>>
>> I will launch the conversion right now.
>>
> The main reason to do this was to have smaller backups.
> The size of a compressed backup was around 1GB with bytea.
> I have converted the columns (on a copy of the database), but the
> expected gains are not there!
> With double precision[] the backup is still around 1GB (only a few MB
> smaller).
>
> Also, the size on disk is not smaller.
> I have measured /var/lib/postgres/8.4/main/base with du, and the two
> versions have the same size (3.1GB).
>
> Does it make sense?
> My hypothesis is that the compression algorithm finds the same
> regularities in the data whether it is stored as bytea or as double
> precision[].
>
> Is there any advantage to use double precision[] over bytea in my case?

probably not -- arrays can be significantly smaller than a set of
individual tuples each holding one value, because of the per-tuple
overhead, but you still have to pay for the array header and some
per-element bookkeeping.
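
a rough way to see the per-value cost directly is pg_column_size,
which reports the stored size including headers (literal values here,
no table needed):

  select pg_column_size('0.1,0.2,0.3,0.4,0.5'::bytea) as packed_string,
         pg_column_size(array[0.1,0.2,0.3,0.4,0.5]::double precision[]) as as_array;

with short reprs like these the array comes out larger because of its
header; with full-precision values (~18 characters each) the text form
catches up, which is consistent with the near-identical sizes you saw.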

A packed string is often the smallest way to store data, although not
necessarily the best.  A double precision[] comes with a lot of syntax
advantages.
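
for example (illustrative literals), element access and aggregation
work directly on the array, where the packed string would have to be
re-split on every query:

  select arr[2] as second_value,
         (select avg(x) from unnest(arr) as x) as mean
  from (select array[1.5, 2.5, 3.5]::double precision[] as arr) t;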

merlin
