Thread: Inserting rpt content from a file into a table column in json format.

Inserting rpt content from a file into a table column in json format.

From
Salim KOC
Date:
Hello everyone,

I have a 5gb rpt file.
Rpt file content is as follows:


(base) skoc@skoc ~ % head -5 '/Users/skoc/postgresql/lx/lxRo.rpt' 
lxOid|lxFlags|lxType|lxFromLat|lxFromId|lxToLat|lxToId|lxCrDate|lxRelRul|lxModDate|lxOwner|lxAltOwn1|lxAltOwn2|lxTenant|lxDOV|lxIOV|lxToType
-2147483508|67108864|-542076010|341897887|-40640797|341897887|-40639830|2020-01-24 11:14:52.000|1|2020-01-24 11:14:52.000|-1062416028|-655292577|-1283415274|1|1|1|-2099227551
-2147483247|67108864|-366961227|341897887|-1257644257|341897887|1179980124|2023-04-26 07:43:53.000|1|2023-04-26 07:43:53.000|-1536708294|-655292577|-2119128014|1|-1071039821|-1071043321|708897621
-2147483213|67108864|1076431157|341897887|1128100732|341897887|1128103344|2021-02-05 13:48:55.000|1|2021-02-05 13:48:55.000|-1400837562|-655292577|54575254|1|1|1|-1487879320
-2147482748|67108864|-1075776416|341897887|-215312115|341897887|-215311217|2022-01-24 07:31:56.000|1|2022-01-24 07:31:56.000|1282737577|-655292577|54575254|1|1|1|-60774595

Delimiters is “|”
Current posttgresql version:
PostgreSQL 16.2 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.5 (clang-1205.0.22.9), 64-bit
plm=# show work_mem;
 work_mem 
----------
 16MB
(1 row)


I created a table with the following command.

CREATE TABLE IF NOT EXISTS public.lxbo
(
    c1 serial NOT NULL,
    c2 jsonb NOT NULL,
    CONSTRAINT lxbo_pkey PRIMARY KEY (c1)
)

The process I want to do:
insert the data whose content is specified above into column c2 in jsonb format.

1-How should the rpt file content be converted to jsonb format?
2-What is the way to import the relevant jsonb file into the table I specified?
3-Do I need to update the work_mem value and/or any other value?

Thanks 
Salim KOÇ

Re: Inserting rpt content from a file into a table column in json format.

From
Erik Wienhold
Date:
On 2024-04-21 19:34 +0200, Salim KOC wrote:
> I have a 5gb rpt file.
> Rpt file content is as follows:
> 
> 
> (base) skoc@skoc ~ % head -5 '/Users/skoc/postgresql/lx/lxRo.rpt'
>
lxOid|lxFlags|lxType|lxFromLat|lxFromId|lxToLat|lxToId|lxCrDate|lxRelRul|lxModDate|lxOwner|lxAltOwn1|lxAltOwn2|lxTenant|lxDOV|lxIOV|lxToType
> -2147483508|67108864|-542076010|341897887|-40640797|341897887|-40639830|2020-01-24 11:14:52.000|1|2020-01-24
11:14:52.000|-1062416028|-655292577|-1283415274|1|1|1|-2099227551
> -2147483247|67108864|-366961227|341897887|-1257644257|341897887|1179980124|2023-04-26 07:43:53.000|1|2023-04-26
07:43:53.000|-1536708294|-655292577|-2119128014|1|-1071039821|-1071043321|708897621
> -2147483213|67108864|1076431157|341897887|1128100732|341897887|1128103344|2021-02-05 13:48:55.000|1|2021-02-05
13:48:55.000|-1400837562|-655292577|54575254|1|1|1|-1487879320
> -2147482748|67108864|-1075776416|341897887|-215312115|341897887|-215311217|2022-01-24 07:31:56.000|1|2022-01-24
07:31:56.000|1282737577|-655292577|54575254|1|1|1|-60774595
> 
> Delimiters is “|”
> Current posttgresql version:
> PostgreSQL 16.2 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.5 (clang-1205.0.22.9), 64-bit
> plm=# show work_mem;
>  work_mem
> ----------
>  16MB
> (1 row)
> 
> 
> I created a table with the following command.
> 
> CREATE TABLE IF NOT EXISTS public.lxbo
> (
>     c1 serial NOT NULL,
>     c2 jsonb NOT NULL,
>     CONSTRAINT lxbo_pkey PRIMARY KEY (c1)
> )
> 
> The process I want to do:
> insert the data whose content is specified above into column c2 in jsonb format.

You did not say how the jsonb should look.  I assume you want a separate
row for each line in the rpt file and use the same property names as in
the header row.

> 1-How should the rpt file content be converted to jsonb format?
> 2-What is the way to import the relevant jsonb file into the table I specified?

Because the rpt file looks like CSV, I'd create a temporary table with
the same columns as in the rpt file and load the file with \copy:

    CREATE TEMP TABLE tmp_lxbo (...);
    \copy tmp_lxbo from '/Users/skoc/postgresql/lx/lxRo.rpt' (format csv, delimiter '|', header match)

Then convert the tmp_lxbo rows with to_jsonb:

    INSERT INTO public.lxbo (c2) SELECT to_jsonb(lxbo_raw.*) FROM tmp_lxbo;

> 3-Do I need to update the work_mem value and/or any other value?

No, not for this import process.  work_mem is relevant for query
operations such as sorting and merging.

-- 
Erik



Hello,

I provide an example below:

{   "lxOid": -2147483213, "lxFlags": 67108864, "lxType": 1076431157, "lxFromLat": 341897887, "lxFromId": 1128100732,
"lxToLat":341897887, "lxToId": 1128103344, "lxCrDate": "2021-02-05 13:48:55.000", "lxRelRul": 1, "lxModDate":
"2021-02-0513:48:55.000", "lxOwner": -1400837562, "lxAltOwn1": -655292577, "lxAltOwn2": 54575254, "lxTenant": 1,
"lxDOV":1, "lxIOV": 1, "lxToType": -1487879320
 
},
{
    "lxOid": -2147482748, "lxFlags": 67108864, "lxType": -1075776416, "lxFromLat": 341897887, "lxFromId": -215312115,
"lxToLat":341897887, "lxToId": -215311217, "lxCrDate": "2022-01-24 07:31:56.000", "lxRelRul": 1, "lxModDate":
"2022-01-2407:31:56.000", "lxOwner": 1282737577, "lxAltOwn1": -655292577, "lxAltOwn2": 54575254, "lxTenant": 1,
"lxDOV":1, "lxIOV": 1, "lxToType": -60774595
 
},
{
    "lxOid": -2147481421, "lxFlags": 67108864, "lxType": 1517856560, "lxFromLat": 341897887, "lxFromId": -1053599722,
"lxToLat":341897887, "lxToId": 1511184627, "lxCrDate": "2021-10-13 08:05:12.000", "lxRelRul": 1, "lxModDate":
"2022-03-1807:34:33.000", "lxOwner": -1400837562, "lxAltOwn1": -655292577, "lxAltOwn2": 54575254, "lxTenant": 1,
"lxDOV":1, "lxIOV": 1, "lxToType": -1487835620
 
}

Regards.


> On Apr 21, 2024, at 9:09 PM, Erik Wienhold <ewie@ewie.name> wrote:
> 
> On 2024-04-21 19:34 +0200, Salim KOC wrote:
>> I have a 5gb rpt file.
>> Rpt file content is as follows:
>> 
>> 
>> (base) skoc@skoc ~ % head -5 '/Users/skoc/postgresql/lx/lxRo.rpt'
>>
lxOid|lxFlags|lxType|lxFromLat|lxFromId|lxToLat|lxToId|lxCrDate|lxRelRul|lxModDate|lxOwner|lxAltOwn1|lxAltOwn2|lxTenant|lxDOV|lxIOV|lxToType
>> -2147483508|67108864|-542076010|341897887|-40640797|341897887|-40639830|2020-01-24 11:14:52.000|1|2020-01-24
11:14:52.000|-1062416028|-655292577|-1283415274|1|1|1|-2099227551
>> -2147483247|67108864|-366961227|341897887|-1257644257|341897887|1179980124|2023-04-26 07:43:53.000|1|2023-04-26
07:43:53.000|-1536708294|-655292577|-2119128014|1|-1071039821|-1071043321|708897621
>> -2147483213|67108864|1076431157|341897887|1128100732|341897887|1128103344|2021-02-05 13:48:55.000|1|2021-02-05
13:48:55.000|-1400837562|-655292577|54575254|1|1|1|-1487879320
>> -2147482748|67108864|-1075776416|341897887|-215312115|341897887|-215311217|2022-01-24 07:31:56.000|1|2022-01-24
07:31:56.000|1282737577|-655292577|54575254|1|1|1|-60774595
>> 
>> Delimiters is “|”
>> Current posttgresql version:
>> PostgreSQL 16.2 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.5 (clang-1205.0.22.9), 64-bit
>> plm=# show work_mem;
>> work_mem
>> ----------
>> 16MB
>> (1 row)
>> 
>> 
>> I created a table with the following command.
>> 
>> CREATE TABLE IF NOT EXISTS public.lxbo
>> (
>>    c1 serial NOT NULL,
>>    c2 jsonb NOT NULL,
>>    CONSTRAINT lxbo_pkey PRIMARY KEY (c1)
>> )
>> 
>> The process I want to do:
>> insert the data whose content is specified above into column c2 in jsonb format.
> 
> You did not say how the jsonb should look.  I assume you want a separate
> row for each line in the rpt file and use the same property names as in
> the header row.
> 
>> 1-How should the rpt file content be converted to jsonb format?
>> 2-What is the way to import the relevant jsonb file into the table I specified?
> 
> Because the rpt file looks like CSV, I'd create a temporary table with
> the same columns as in the rpt file and load the file with \copy:
> 
>    CREATE TEMP TABLE tmp_lxbo (...);
>    \copy tmp_lxbo from '/Users/skoc/postgresql/lx/lxRo.rpt' (format csv, delimiter '|', header match)
> 
> Then convert the tmp_lxbo rows with to_jsonb:
> 
>    INSERT INTO public.lxbo (c2) SELECT to_jsonb(lxbo_raw.*) FROM tmp_lxbo;
> 
>> 3-Do I need to update the work_mem value and/or any other value?
> 
> No, not for this import process.  work_mem is relevant for query
> operations such as sorting and merging.
> 
> -- 
> Erik