Re: On using doubles as primary keys - Mailing list pgsql-general

From Paul A Jungwirth
Subject Re: On using doubles as primary keys
Date
Msg-id CA+renyUXZzY5m=LW9buofO7zA9PP+G-pkoDT=323Z-ih45fSWA@mail.gmail.com
In response to On using doubles as primary keys  (Kynn Jones <kynnjo@gmail.com>)
List pgsql-general


On Apr 17, 2015 8:35 AM, "Kynn Jones" <kynnjo@gmail.com> wrote:
> (The only reason for wanting to transfer this data to a Pg table
> is the hope that it will be easier to work with it by using SQL

800 million 8-byte numbers is about 6.4 GB, which doesn't seem totally unreasonable for python/R/Matlab if you have a lot of memory. Are you sure you want it in Postgres? Load the file once, then filter it as you like. If you don't have the memory, I can see how using Postgres to get fewer rows at a time might help. Fewer columns at a time would help even more, if that's possible.
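To illustrate the load-once-then-filter idea, here's a minimal pure-Python sketch. The file layout (flat little-endian float64s, 400 per row) and every name here are assumptions; the tiny 3-row stand-in file just keeps the example self-contained.

```python
import os
import struct
import tempfile

N_COLS = 400

# Build a tiny stand-in binary file (3 rows of 400 doubles) so the
# sketch runs on its own; in practice this file already exists.
rows = [[float(r * N_COLS + c) for c in range(N_COLS)] for r in range(3)]
path = os.path.join(tempfile.mkdtemp(), "doubles.bin")
with open(path, "wb") as f:
    for row in rows:
        f.write(struct.pack(f"<{N_COLS}d", *row))

# Load the whole file once...
with open(path, "rb") as f:
    raw = f.read()
n_rows = len(raw) // (8 * N_COLS)
data = [list(struct.unpack_from(f"<{N_COLS}d", raw, r * 8 * N_COLS))
        for r in range(n_rows)]

# ...then filter in memory: e.g. keep rows whose first value is
# positive, and take only the first 10 columns of those.
subset = [row[:10] for row in data if row[0] > 0.0]
```

In numpy the load step would typically be a single `numpy.fromfile` or `numpy.memmap` call instead of the `struct` loop, but the shape of the workflow is the same.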

> In its simplest form, this would mean using
> doubles as primary keys, but this seems to me a bit weird.

I'd avoid that and just include an integer PK with your data. Data frames in the languages above support that, or you can just slice off the PK column before doing your matrix math.
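A minimal pure-Python sketch of that suggestion, with plain lists standing in for a data frame (all values here are made up; in pandas this would be something like `df.set_index("id")` or `df.iloc[:, 1:]`):

```python
# Each row carries an integer surrogate key in the first position.
rows = [
    [1, 0.50, 1.25, 2.00],
    [2, 3.75, 4.50, 5.25],
]

ids  = [r[0]  for r in rows]   # keep the keys for joins/lookups
data = [r[1:] for r in rows]   # pure doubles, ready for matrix math

# Example of math on the key-free matrix: per-column sums.
col_sums = [sum(col) for col in zip(*data)]
```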

Also, instead of 401 columns per row, maybe store all 400 doubles in an array column? Not sure if that's useful for you, but it may be worth considering.
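A schema sketch of what I mean (untested, table and column names are hypothetical):

```sql
-- One float8[] column instead of 400 individual double precision columns.
CREATE TABLE doubles_arr (
    id   integer PRIMARY KEY,
    vals float8[]            -- the 400 doubles
);

-- Individual elements are still reachable; Postgres arrays are 1-based.
SELECT id, vals[1], vals[17]
FROM doubles_arr
WHERE vals[1] > 0.5;
```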

Also, if you put the metadata in the same table as the doubles, can you leave off the PKs altogether? Why join if you don't have to? It sounds like the tables are 1-to-1. Even if some metadata isn't, maybe you can finesse it with hstore or arrays.
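Something like this, say (again untested, all names hypothetical; hstore needs `CREATE EXTENSION hstore` first):

```sql
-- Single-table layout: metadata alongside the doubles, no join and no
-- surrogate key needed if the relationship really is 1-to-1.
CREATE TABLE measurements (
    label    text,
    taken_at timestamptz,
    extra    hstore,     -- irregular per-row metadata, if any
    vals     float8[]    -- the 400 doubles
);
```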

Good luck!

Paul
