Thread: PostgreSQL add id column that increments based on data

PostgreSQL add id column that increments based on data

From

DerekW

Date:

01 August 2014, 09:15:47

I am relatively new to using Postgres, coming from a MySQL background. I am
using Postgres 9.3.4 on Windows x64.

We are being supplied data in multiple fixed length text files. The first
digit on each line is a number between 1 and 4 that indicates the record
type of the data in that row. The rows are grouped sequentially such that
there will always first be a row of type 1 followed by zero or more rows of
the other types.
   data_x.txt   ---------------------   1data01   2data02   4data03   4data04   1data05   1data06   3data07

To import this into Postgres I have used the following SQL commands:
   CREATE TABLE data_raw (       raw_data TEXT   );      COPY data_raw FROM 'C:\path\data_x.txt' ...; -- Repeated for
eachfile      ALTER TABLE data_raw       ADD COLUMN indicator integer;          UPDATE data_raw SET       indicator =
CAST(substr(raw_data,1, 1) AS integer),       raw_data = substr(raw_data, 2);
 

I then create tables for each of the 4 record types:
   CREATE TABLE table_1 SELECT raw_data FROM data_raw WHERE indicator = 1;   CREATE TABLE table_2 SELECT raw_data FROM
data_rawWHERE indicator = 2;   CREATE TABLE table_3 SELECT raw_data FROM data_raw WHERE indicator = 3;   CREATE TABLE
table_4SELECT raw_data FROM data_raw WHERE indicator = 4;
 

What I need to do, but am unsure how, is to also add an "id" column for each
group where the indicator starts with 1. We will be getting weekly updates
so I need to specify the initial id for each batch. So if this batch starts
at id = 225, then I want to get the following tables from the sample data:
   table_1   id  | raw_data   --------------------   225 | data01   226 | data05   227 | data06      table_2   id  |
raw_data  --------------------   225 | data02      table_3   id  | raw_data   --------------------   227 | data07
table_4  id  | raw_data   --------------------   225 | data03   225 | data04
 






--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-add-id-column-that-increments-based-on-data-tp5813514.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.

Re: PostgreSQL add id column that increments based on data

From

Jonas Xie

Date:

01 August 2014, 09:55:47

You can try the *sequence* :
http://www.postgresql.org/docs/9.3/static/functions-sequence.html

Am 01.08.2014 11:15, schrieb DerekW:
> I am relatively new to using Postgres, coming from a MySQL background. I am
> using Postgres 9.3.4 on Windows x64.
>
> We are being supplied data in multiple fixed length text files. The first
> digit on each line is a number between 1 and 4 that indicates the record
> type of the data in that row. The rows are grouped sequentially such that
> there will always first be a row of type 1 followed by zero or more rows of
> the other types.
>
>      data_x.txt
>      ---------------------
>      1data01
>      2data02
>      4data03
>      4data04
>      1data05
>      1data06
>      3data07
>
> To import this into Postgres I have used the following SQL commands:
>
>      CREATE TABLE data_raw (
>          raw_data TEXT
>      );
>      
>      COPY data_raw FROM 'C:\path\data_x.txt' ...; -- Repeated for each file
>      
>      ALTER TABLE data_raw
>          ADD COLUMN indicator integer;
>          
>      UPDATE data_raw SET
>          indicator = CAST(substr(raw_data, 1, 1) AS integer),
>          raw_data = substr(raw_data, 2);
>
> I then create tables for each of the 4 record types:
>
>      CREATE TABLE table_1 SELECT raw_data FROM data_raw WHERE indicator = 1;
>      CREATE TABLE table_2 SELECT raw_data FROM data_raw WHERE indicator = 2;
>      CREATE TABLE table_3 SELECT raw_data FROM data_raw WHERE indicator = 3;
>      CREATE TABLE table_4 SELECT raw_data FROM data_raw WHERE indicator = 4;
>
> What I need to do, but am unsure how, is to also add an "id" column for each
> group where the indicator starts with 1. We will be getting weekly updates
> so I need to specify the initial id for each batch. So if this batch starts
> at id = 225, then I want to get the following tables from the sample data:
>
>      table_1
>      id  | raw_data
>      --------------------
>      225 | data01
>      226 | data05
>      227 | data06
>      
>      table_2
>      id  | raw_data
>      --------------------
>      225 | data02
>      
>      table_3
>      id  | raw_data
>      --------------------
>      227 | data07
>      
>      table_4
>      id  | raw_data
>      --------------------
>      225 | data03
>      225 | data04
>
>
>
>
>
>
> --
> View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-add-id-column-that-increments-based-on-data-tp5813514.html
> Sent from the PostgreSQL - sql mailing list archive at Nabble.com.
>
>

Re: PostgreSQL add id column that increments based on data

From

Adrian Klaver

Date:

01 August 2014, 13:31:13

On 08/01/2014 02:15 AM, DerekW wrote:
> I am relatively new to using Postgres, coming from a MySQL background. I am
> using Postgres 9.3.4 on Windows x64.
>
> We are being supplied data in multiple fixed length text files. The first
> digit on each line is a number between 1 and 4 that indicates the record
> type of the data in that row. The rows are grouped sequentially such that
> there will always first be a row of type 1 followed by zero or more rows of
> the other types.
>
>      data_x.txt
>      ---------------------
>      1data01
>      2data02
>      4data03
>      4data04
>      1data05
>      1data06
>      3data07
>
> To import this into Postgres I have used the following SQL commands:
>
>      CREATE TABLE data_raw (
>          raw_data TEXT
>      );
>
>      COPY data_raw FROM 'C:\path\data_x.txt' ...; -- Repeated for each file
>
>      ALTER TABLE data_raw
>          ADD COLUMN indicator integer;
>
>      UPDATE data_raw SET
>          indicator = CAST(substr(raw_data, 1, 1) AS integer),
>          raw_data = substr(raw_data, 2);
>
> I then create tables for each of the 4 record types:
>
>      CREATE TABLE table_1 SELECT raw_data FROM data_raw WHERE indicator = 1;
>      CREATE TABLE table_2 SELECT raw_data FROM data_raw WHERE indicator = 2;
>      CREATE TABLE table_3 SELECT raw_data FROM data_raw WHERE indicator = 3;
>      CREATE TABLE table_4 SELECT raw_data FROM data_raw WHERE indicator = 4;
>
> What I need to do, but am unsure how, is to also add an "id" column for each
> group where the indicator starts with 1. We will be getting weekly updates
> so I need to specify the initial id for each batch. So if this batch starts
> at id = 225, then I want to get the following tables from the sample data:
>
>      table_1
>      id  | raw_data
>      --------------------
>      225 | data01
>      226 | data05
>      227 | data06
>
>      table_2
>      id  | raw_data
>      --------------------
>      225 | data02
>
>      table_3
>      id  | raw_data
>      --------------------
>      227 | data07
>
>      table_4
>      id  | raw_data
>      --------------------
>      225 | data03
>      225 | data04

I am not following the logic of the id numbering scheme. I see the 
sequential numbering for indicator 1. Not why indicator 2 uses 225 again 
when indicator 3 starts with 227 and indicator 4 has the same id for 
both rows. My guess, whatever the logic is, it is going to involve 
triggers and possibly another table to keep track of batch numbers

>
>
>
>
>
>
> --
> View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-add-id-column-that-increments-based-on-data-tp5813514.html
> Sent from the PostgreSQL - sql mailing list archive at Nabble.com.
>
>


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: PostgreSQL add id column that increments based on data

From

David G Johnston

Date:

01 August 2014, 16:12:58

DerekW wrote
> I am relatively new to using Postgres, coming from a MySQL background. I
> am using Postgres 9.3.4 on Windows x64.
> 
> We are being supplied data in multiple fixed length text files. The first
> digit on each line is a number between 1 and 4 that indicates the record
> type of the data in that row. The rows are grouped sequentially such that
> there will always first be a row of type 1 followed by zero or more rows
> of the other types.
> 
>     data_x.txt
>     ---------------------
>     1data01
>     2data02
>     4data03
>     4data04
>     1data05
>     1data06
>     3data07
> 
> To import this into Postgres I have used the following SQL commands:
> 
>     CREATE TABLE data_raw (
>         raw_data TEXT
>     );
>     
>     COPY data_raw FROM 'C:\path\data_x.txt' ...; -- Repeated for each file
>     

1. Add a serial column to data_raw so that relative order can be recorded
during import.
2. Alter the COPY command to explicitly list only the raw_data column - so
the serial column uses its default.

At this point if you need to re-assign the sequence numbers on the
production tables based upon some business rules you can simply apply your
logic and feed in the raw data in the previously established order of
import.

The general logic would be to query all of the 1 rows and get their import
index.  Using lead/lag over this set you can, for each 1 record, get the
valid range of child indexes.  You can then join the non-1 data by use of a
between predicate.  In the same 1 query as the lead/lag you can assign base
group numbers using the row_number function and simply add some base offset.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-add-id-column-that-increments-based-on-data-tp5813514p5813543.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.

Re: PostgreSQL add id column that increments based on data

From

DerekW

Date:

04 August 2014, 08:35:29

Thank you for the input.

I have come up with the following solution:

CREATE TABLE data_raw (   raw_data TEXT
);

COPY data_raw FROM 'C:\path\data_x.txt' ...; -- Repeated for each file

ALTER TABLE data_raw   ADD COLUMN pk_id serial,   ADD COLUMN id integer,   ADD COLUMN indicator integer;

UPDATE data_raw SET   indicator = CAST(substr(raw_data, 1, 1) AS integer),   raw_data = substr(raw_data, 2);

CREATE TABLE id_base AS
SELECT   pk_id,   sum(CASE WHEN indicator = 1 THEN 1 ELSE 0 END) OVER (ORDER BY pk_id) AS
rec_id
FROM data_raw;

CREATE INDEX id_base_pk ON id_base USING btree(pk_id);

UPDATE data_raw r SET   id = (SELECT rec_id FROM id_base b WHERE b.pk_id = r.pk_id);

DROP TABLE id_base;





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-add-id-column-that-increments-based-on-data-tp5813514p5813661.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.