Re: truncate data before importing - Mailing list pgsql-novice

From Felipe Santos
Subject Re: truncate data before importing
Date
Msg-id CAPYcRiWPC6eb+sbnqh6oXcd_GsXaShNUfv9mOBbqfpW9wF-EOA@mail.gmail.com
Whole thread Raw
In response to Re: truncate data before importing  (Shmagi Kavtaradze <kavtaradze.s@gmail.com>)
Responses Re: truncate data before importing  (Shmagi Kavtaradze <kavtaradze.s@gmail.com>)
List pgsql-novice
You´re almost there, here it is what they mean:

"''s/\t/ /g" = Replaces tab (\t) for white space (" ")

"''s/^\S+\s+//''" = Replaces everything that starts (^)  + with a non-white space (S+) + and is then followed by a white space (s+) for nothing (//).

Hope it helps

2015-11-18 16:24 GMT-02:00 Shmagi Kavtaradze <kavtaradze.s@gmail.com>:
I searched and found they are for whitespace characters. But they deleted sentence coordinates from text for me. Still no idea how.

On Wed, Nov 18, 2015 at 7:18 PM, Felipe Santos <felipepts@gmail.com> wrote:


2015-11-18 16:08 GMT-02:00 Shmagi Kavtaradze <kavtaradze.s@gmail.com>:
I was not able to find any details about "\S+\s+ ", can u explain them? Also thanks a lot, this worked perfectly!

On Wed, Nov 18, 2015 at 4:36 PM, Hans Ginzel <hans@matfyz.cz> wrote:
On Wed, Nov 18, 2015 at 01:49:35PM +0100, Shmagi Kavtaradze wrote:
  I am importing sentences from txt file. They look like:
  "0,170     A       recent  statistical     analysis        by      David
  Barton  graphically     illustrates     how     America         has
  plummeted       from    righteous       living  ,       prosperity
  and     success         in      the     last    quarter         century
       .
  Each Sentence starts with coordinates and each word is delimited with tab.
  I want to import data to tables without coordinates, just text and if
  possible to convert tab delimited space with just 'space', not to have
  such a gap between words. Any solutions how to do it? maybe with shell
  script?

You can use the 'PROGRAM' in COPY syntax
http://www.postgresql.org/docs/current/static/sql-copy.html

-- DROP TABLE IF EXISTS Sentence;
CREATE TABLE IF NOT EXISTS Sentence (s text);
COPY Sentence
FROM PROGRAM 'sed -re ''s/\t/ /g; s/^\S+\s+//'' file.txt'
WITH (FORMAT text, NULL '');

Take care of escape sequences – backslashes in the file.

If the file is on the client side see the \copy command of psql client instaed.
http://www.postgresql.org/docs/9.4/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY

https://www.gnu.org/software/sed/manual/sed.html

H.




These ("S+" and "s+") are regular expressions (regex).

They are being used as part of the SED tool, which is available from your Linux bash.




pgsql-novice by date:

Previous
From: Shmagi Kavtaradze
Date:
Subject: Re: truncate data before importing
Next
From: Shmagi Kavtaradze
Date:
Subject: Re: truncate data before importing