
Re: what would tar file FDW look like? - Mailing list pgsql-hackers

From	Andrew Dunstan
Subject	Re: what would tar file FDW look like?
Date	August 17, 2015 18:03:37
Msg-id	55D1F7BF.2010902@dunslane.net
In response to	what would tar file FDW look like? (Bear Giles <bgiles@coyotesong.com>)
List	pgsql-hackers


On 08/17/2015 10:14 AM, Bear Giles wrote:
> I'm starting to work on a tar FDW as a proxy for a much more specific 
> FDW. (It's the 'faster to build two and toss the first away' approach 
> - tar lets me get the FDW stuff nailed down before attacking the more 
> complex container.) It could also be useful in its own right, or as 
> the basis for a zip file FDW.
>
> I have figured out that in one mode the FDW mapping would take 
> the name of the tarball as an option and produce a relation that has 
> all of the metadata for the contained files - filename, size, owner, 
> timestamp, etc. I can use the same approach I used for the /etc/passwd 
> FDW for that.
>
> (BTW the current version is at 
> https://github.com/beargiles/passwd-fdw. It's skimpy on automated 
> tests until I can figure out how to handle the user mapping but it works.)
>
> The problem is the second mode where I pull a single file out of the 
> FDW. I've identified three approaches so far:
>
> 1. A FDW mapping specific to each file. It would take the name of the 
> tarfile and the embedded file. Cleanest in some ways but it would be a 
> real pain if you're reading a tarball dynamically.
>
> 2. A user-defined function that takes the name of the tarball and file 
> and returns a blob. This is the traditional approach but why bother 
> with a FDW then? It also brings up access control issues since it 
> requires disclosure of the tarball name to the user. A FDW could hide 
> that.
>
> 3. A user-defined function that takes a tar FDW and the name of a file 
> and returns a blob. I think this is the best approach but I don't know 
> if I can specify a FDW as a parameter or how to access it.
>
> I've skimmed the existing list of FDW but didn't find anything that 
> can serve as a model. The foreign DB are closest but, again, they 
> aren't designed for dynamic use where you want to do something with 
> every file in an archive / table in a foreign DB.
>
> Is there an obvious approach? Or is it simply a bad match for FDW and 
> should be two standard UDF?  (One returns the metadata, the second 
> returns the specific file.)
>
>


I would probably do something like this:

For the second mode, define a foreign table that has <path, blob>. To get 
the blob for a single file, just do "select blob from fdwtable where path = 
'/path/to/foo'". Make sure you process the qual in the FDW, so that only 
the requested file is read rather than every file in the archive.

e.g.
   create foreign table tarblobs (path text, blob bytea)
      server tarfiles
      options (filename '/path/to/tarball', mode 'contents');
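
To illustrate what "process the qual in the FDW" might mean for the scan itself, here is a rough Python sketch. It is not FDW code and does not use any PostgreSQL API; the function name scan_tar and the path_qual parameter are made up for illustration, and it assumes the only qual pushed down is an equality on path. With the qual present it reads the contents of just the one member; without it, it falls back to returning every regular file.

```python
import tarfile
from typing import Iterator, Optional, Tuple


def scan_tar(filename: str,
             path_qual: Optional[str] = None) -> Iterator[Tuple[str, bytes]]:
    """Yield (path, blob) rows for regular files in a tarball.

    path_qual is a hypothetical stand-in for a pushed-down
    "path = '...'" condition. When it is set, only the matching
    member's contents are read.
    """
    with tarfile.open(filename, "r:*") as tar:
        if path_qual is not None:
            try:
                members = [tar.getmember(path_qual)]
            except KeyError:
                return  # no such path in the archive: empty result
        else:
            members = tar.getmembers()

        for member in members:
            if not member.isfile():
                continue  # skip directories, links, devices
            extracted = tar.extractfile(member)
            yield member.name, extracted.read()
```

Note that tar has no central index, so looking up one member still walks the headers; what the qual saves is reading the contents of all the other files.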


cheers

andrew
