Re: How to import Apache parquet files? - Mailing list pgsql-general

From Softwarelimits
Subject Re: How to import Apache parquet files?
Date
Msg-id CALnJc4XX1cc=oDX8SdKwDTAb5grZPx9Zu02f7LwM5OhmrBPFxw@mail.gmail.com
In response to Re: How to import Apache parquet files?  (Imre Samu <pella.samu@gmail.com>)
Responses Re: How to import Apache parquet files?
List pgsql-general
Hi Imre, thanks for the quick response. Yes, I found that, but I was not sure whether it is production-ready yet. I would also like to use the data with the timescale extension, which is why I need a full import.

Have a nice day!

On Tue, Nov 5, 2019 at 4:09 PM Imre Samu <pella.samu@gmail.com> wrote:
>I would like to import (lots of) Apache parquet files to a PostgreSQL 11 cluster 

imho: You should check and test the Parquet FDW (Parquet Foreign Data Wrapper).

Imre

Softwarelimits <softwarelimits@gmail.com> wrote (Tue, Nov 5, 2019, 15:57):
Hi, I need to ask here: I did not find enough information, so I hope I am just having a bad day, or somebody is censoring my search results for fun... :)

I would like to import (lots of) Apache Parquet files into a PostgreSQL 11 cluster. Yes, I believe it should be done with the Python pyarrow module, but before digging into the possible traps I would like to ask whether there is a common, well-understood and documented tool that helps with that process.

It seems that the COPY command can import binary data, but I cannot spare enough resources to understand how to implement a Parquet file import with it.

I would really like to follow a person with much more knowledge than me about either PostgreSQL or the Apache Parquet format instead of inventing a bad wheel.

Any hints very welcome,
thank you very much for your attention!
John
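
For reference, the pyarrow-plus-COPY route mentioned in the message above could look roughly like the sketch below. It is not taken from the thread and is not a tested import tool. It assumes pyarrow and psycopg2 are installed, that the target table already exists with columns in the same order as the Parquet schema, and that COPY in text CSV format (rather than binary format) is acceptable. The names rows_to_csv_buffer, copy_parquet_file and the batch size are hypothetical.

```python
import csv
import io


def rows_to_csv_buffer(rows, columns):
    """Serialize an iterable of dicts into an in-memory CSV buffer.

    None is written as an empty field, which COPY ... (FORMAT csv) reads
    as NULL. Caveat: an empty string also ends up as an empty field, so
    the two are not distinguished in this sketch.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow(["" if row[c] is None else row[c] for c in columns])
    buf.seek(0)
    return buf


def copy_parquet_file(path, conn, table, batch_size=10_000):
    """Stream one Parquet file into an existing table, batch by batch.

    `conn` is assumed to be an open psycopg2 connection and `table` a
    trusted identifier (it is interpolated into the SQL as-is).
    """
    import pyarrow.parquet as pq  # third-party, imported lazily

    parquet_file = pq.ParquetFile(path)
    columns = parquet_file.schema_arrow.names
    with conn.cursor() as cur:
        # Reading in batches keeps memory bounded for large files.
        for batch in parquet_file.iter_batches(batch_size=batch_size):
            buf = rows_to_csv_buffer(batch.to_pylist(), columns)
            cur.copy_expert(f"COPY {table} FROM STDIN WITH (FORMAT csv)", buf)
    conn.commit()
```

Because the rows end up in an ordinary table, this gives the full import asked for in the reply at the top, in contrast to querying the files in place through a foreign data wrapper. Nested or binary Parquet columns would need extra handling that the sketch does not attempt.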
