Thread: psql \copy

psql \copy

From
Steve Clark
Date:
Hello,

I am using psql to copy data extracted from an InfluxDB in CSV format into PostgreSQL.
I have a key on the time field, which I have defined as a bigint since the time I get
from InfluxDB is an epoch time.

My question is: does psql abort the copy if it hits a duplicate key, or does it keep processing?


Thanks,
--
Stephen Clark
NetWolves Managed Services, LLC.
Sr. Applications Architect 
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com

Email Confidentiality Notice: The information contained in this transmission may contain privileged and confidential and/or protected health information (PHI) and may be subject to protection under the law, including the Health Insurance Portability and Accountability Act of 1996, as amended (HIPAA). This transmission is intended for the sole use of the individual or entity to whom it is addressed. If you are not the intended recipient, you are notified that any use, dissemination, distribution, printing or copying of this transmission is strictly prohibited and may subject you to criminal or civil penalties. If you have received this transmission in error, please contact the sender immediately and delete this email and any attachments from any computer. Vaso Corporation and its subsidiary companies are not responsible for data leaks that result from email messages received that contain privileged and confidential and/or protected health information (PHI).

Re: psql \copy

From
Adrian Klaver
Date:
On 4/24/20 8:55 AM, Steve Clark wrote:
> Hello,
> 
> I am using psql to copy data extracted from an InfluxDB in csv format 
> into postgresql.
> I have a key field on the time field which I have defined as a bigint 
> since the time I get
> from InfluxDB is an epoch time.
> 
> My question is does psql abort the copy if it hits a duplicate key, or 
> does it keep processing?

Aborts.

\copy uses COPY so:

https://www.postgresql.org/docs/12/sql-copy.html

"COPY stops operation at the first error. This should not lead to 
problems in the event of a COPY TO, but the target table will already 
have received earlier rows in a COPY FROM. These rows will not be 
visible or accessible, but they still occupy disk space. This might 
amount to a considerable amount of wasted disk space if the failure 
happened well into a large copy operation. You might wish to invoke 
VACUUM to recover the wasted space."



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: psql \copy

From
"David G. Johnston"
Date:
On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
> Hello,
>
> I am using psql to copy data extracted from an InfluxDB in csv format into postgresql.
> I have a key field on the time field which I have defined as a bigint since the time I get
> from InfluxDB is an epoch time.
>
> My question is does psql abort the copy if it hits a duplicate key, or does it keep processing?


Aborts

Re: psql \copy

From
Steve Crawford
Date:
On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
> Hello,
>
> I am using psql to copy data extracted from an InfluxDB in csv format into postgresql.
> I have a key field on the time field which I have defined as a bigint since the time I get
> from InfluxDB is an epoch time.
>
> My question is does psql abort the copy if it hits a duplicate key, or does it keep processing?


The copy will fail. You could import into a temporary table, preprocess the data, and then copy it to your permanent table, or use an ETL solution to remove unwanted data before importing. I don't know the nature of your data or project, but perhaps that column isn't suitable for a key.

Cheers,
Steve 

Re: psql \copy

From
Steve Clark
Date:
On 04/24/2020 11:59 AM, Steve Crawford wrote:
> On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
>> Hello,
>>
>> I am using psql to copy data extracted from an InfluxDB in csv format into postgresql.
>> I have a key field on the time field which I have defined as a bigint since the time I get
>> from InfluxDB is an epoch time.
>>
>> My question is does psql abort the copy if it hits a duplicate key, or does it keep processing?
>
> The copy will fail. You could import into a temporary table and preprocess then copy to your permanent table or use an ETL solution to remove unwanted data before importing. I don't know the nature of your data or project but perhaps that column isn't suitable for a key.
>
> Cheers,
> Steve

I am attempting to periodically pull time series data from an InfluxDB.
The column at issue is the timestamp. I have a script that pulls the last 15 minutes of data from the InfluxDB
as CSV data and pipes it into a psql -c "\copy...." command. I was looking for the simplest way to do this.


Re: psql \copy

From
Adrian Klaver
Date:
On 4/24/20 9:12 AM, Steve Clark wrote:
> On 04/24/2020 11:59 AM, Steve Crawford wrote:
>> On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
>>
>>     Hello,
>>
>>     I am using psql to copy data extracted from an InfluxDB in csv
>>     format into postgresql.
>>     I have a key field on the time field which I have defined as a
>>     bigint since the time I get
>>     from InfluxDB is an epoch time.
>>
>>     My question is does psql abort the copy if it hits a duplicate
>>     key, or does it keep processing?
>>
>>
>> The copy will fail. You could import into a temporary table and 
>> preprocess then copy to your permanent table or use an ETL solution to 
>> remove unwanted data before importing. I don't know the nature of your 
>> data or project but perhaps that column isn't suitable for a key.
>>
>> Cheers,
>> Steve
> I am attempting to periodically pull time series data from an InfluxDB.
> The column at issue is the timestamp. I have a script that pulls the 
> last 15 minutes of data from the InfluxDB
> as csv data and pipe it into a psql -c "\copy...." command. I was 
> looking for the simplest way to do this.

Then, as suggested above, pull the data into a staging table that has no
constraints (e.g. no PK). Verify the data and then push it into the permanent table.
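
For reference, the staging-table approach suggested above could look roughly like the following psql script. This is only a sketch under stated assumptions: the table name metrics, the staging table name metrics_staging, and the key column ts are hypothetical stand-ins for the real schema, and it assumes PostgreSQL 9.5 or later (for ON CONFLICT) with a primary key or unique index on ts in the permanent table.

```sql
-- Hypothetical names throughout: metrics (permanent table),
-- metrics_staging (staging table), ts (bigint epoch key column).
BEGIN;

-- LIKE copies column names, types and NOT NULL settings, but not the
-- primary key or other indexes, so duplicate keys load without error.
CREATE TEMP TABLE metrics_staging (LIKE metrics INCLUDING DEFAULTS)
    ON COMMIT DROP;

-- pstdin reads from psql's own standard input, so the CSV can be piped in.
-- Add HEADER to the option list if the first CSV line is a header row.
\copy metrics_staging FROM pstdin WITH (FORMAT csv)

-- DISTINCT ON keeps one (arbitrary) row per key within the batch;
-- ON CONFLICT skips keys that already exist in the permanent table.
INSERT INTO metrics
SELECT DISTINCT ON (ts) *
FROM metrics_staging
ORDER BY ts
ON CONFLICT (ts) DO NOTHING;

COMMIT;
```

Saved as, say, load.sql (a hypothetical file name), the script could be run with psql -f load.sql < data.csv, plus whatever connection options are needed. If duplicate rows within a batch can differ in their other columns, the ORDER BY would need a tie-breaker to choose which one wins.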




Re: psql \copy

From
Rob Sargent
Date:


On 4/24/20 10:12 AM, Steve Clark wrote:
> On 04/24/2020 11:59 AM, Steve Crawford wrote:
>> On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
>>> Hello,
>>>
>>> I am using psql to copy data extracted from an InfluxDB in csv format into postgresql.
>>> I have a key field on the time field which I have defined as a bigint since the time I get
>>> from InfluxDB is an epoch time.
>>>
>>> My question is does psql abort the copy if it hits a duplicate key, or does it keep processing?
>>
>> The copy will fail. You could import into a temporary table and preprocess then copy to your permanent table or use an ETL solution to remove unwanted data before importing. I don't know the nature of your data or project but perhaps that column isn't suitable for a key.
>>
>> Cheers,
>> Steve
> I am attempting to periodically pull time series data from an InfluxDB.
> The column at issue is the timestamp. I have a script that pulls the last 15 minutes of data from the InfluxDB
> as csv data and pipe it into a psql -c "\copy...." command. I was looking for the simplest way to do this.

Is the duplication due to overlapping 15-minute chunks (i.e. an imprecise definition of "15 minutes ago")? Perhaps retain the last timestamp sent to pg and use it in the get-from-influx call?
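
One way to read the suggestion above, as a sketch only: ask PostgreSQL for the newest timestamp already loaded and have the extraction script request only newer rows from InfluxDB. The table name metrics and the column name ts are hypothetical, and how the value is passed into the InfluxDB query depends on the extraction script, so that part is not shown.

```sql
-- Hypothetical names: metrics (permanent table), ts (bigint epoch key column).
-- Returns the newest epoch timestamp already stored, or 0 if the table is empty.
SELECT COALESCE(max(ts), 0) AS last_ts FROM metrics;
```

Run through psql with the -A and -t options (unaligned, tuples only), this prints just the number, which the script could then use as the exclusive lower bound of its next pull.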


Re: psql \copy

From
Steve Clark
Date:
On 04/24/2020 12:15 PM, Adrian Klaver wrote:
> On 4/24/20 9:12 AM, Steve Clark wrote:
>> On 04/24/2020 11:59 AM, Steve Crawford wrote:
>>> On Fri, Apr 24, 2020 at 8:55 AM Steve Clark <steve.clark@netwolves.com> wrote:
>>>> Hello,
>>>>
>>>> I am using psql to copy data extracted from an InfluxDB in csv
>>>> format into postgresql.
>>>> I have a key field on the time field which I have defined as a
>>>> bigint since the time I get
>>>> from InfluxDB is an epoch time.
>>>>
>>>> My question is does psql abort the copy if it hits a duplicate
>>>> key, or does it keep processing?
>>>
>>> The copy will fail. You could import into a temporary table and
>>> preprocess then copy to your permanent table or use an ETL solution to
>>> remove unwanted data before importing. I don't know the nature of your
>>> data or project but perhaps that column isn't suitable for a key.
>>>
>>> Cheers,
>>> Steve
>> I am attempting to periodically pull time series data from an InfluxDB.
>> The column at issue is the timestamp. I have a script that pulls the
>> last 15 minutes of data from the InfluxDB
>> as csv data and pipe it into a psql -c "\copy...." command. I was
>> looking for the simplest way to do this.
>
> Then as suggested above pull into staging table that has no constraints
> e.g. PK. Verify data and then push into permanent table.

Thanks for the tip. I'll head down that road. Stay safe everyone.


Re: psql \copy

From
Ron
Date:
You might want to investigate pg_bulkload for this activity.

On 4/24/20 10:55 AM, Steve Clark wrote:
> Hello,
>
> I am using psql to copy data extracted from an InfluxDB in csv format into 
> postgresql.
> I have a key field on the time field which I have defined as a bigint 
> since the time I get
> from InfluxDB is an epoch time.
>
> My question is does psql abort the copy if it hits a duplicate key, or 
> does it keep processing?
>

-- 
Angular momentum makes the world go 'round.