Home > mailing lists

Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n' - Mailing list pgsql-hackers

From	Joel Jacobson
Subject	Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n'
Date	May 6, 2021 05:52:11
Msg-id	a8a39fbf-cf4f-4292-b994-fe997a45946c@www.fastmail.com Whole thread Raw
In response to	Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n' (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

On Wed, May 5, 2021, at 21:51, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:
> On 5/5/21 2:45 PM, Tom Lane wrote:
>> Yeah, that's because of the conversion to "chr". But a regexp
>> is overkill for that anyway. Don't we have something that will
>> split on simple substring matches?

> Not that I know of. There is split_part but I don't think that's fit for
> purpose here. Do we need one, or have I missed something?

[ checks manual ... ]

string_to_array or string_to_table would do, I think.

regards, tom lane

Thanks for these new functions, they seem really useful for a lot of cases.

However, I see two problems with using string_to_table() for this particular use-case.

- Doesn't work with files larger than 1GB, due to pg_read_file()'s limit.

- 68% slower than using the COPY-hack.

% ls -lah foo.txt

-rw-r--r-- 1 joel staff 623M May 6 07:31 foo.txt

% wc -l foo.txt

6771864 foo.txt

# \d txt

Table "public.txt"

Column | Type | Collation | Nullable | Default

--------+------+-----------+----------+---------

line | text | | |

# COPY txt (line) FROM 'foo.txt' DELIMITER '"';

COPY 6771864

Time: 9829.707 ms (00:09.830)

Time: 9552.286 ms (00:09.552)

Time: 9483.115 ms (00:09.483)

# TRUNCATE txt;

TRUNCATE TABLE

# INSERT INTO txt (line) SELECT string_to_table(pg_read_file('foo.txt'),E'\n');

INSERT 0 6771865

Time: 16556.078 ms (00:16.556)

Time: 14720.343 ms (00:14.720)

Time: 17266.088 ms (00:17.266)

/Joel

pgsql-hackers by date:

From: vignesh C
Date: 06 May 2021, 05:52:09
Subject: Re: Replication slot stats misgivings

From: Thomas Kellerer
Date: 06 May 2021, 06:10:20
Subject: Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n'

Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n' - Mailing list pgsql-hackers

Previous

Next