Re: what would tar file FDW look like? - Mailing list pgsql-hackers
From | Bear Giles |
---|---|
Subject | Re: what would tar file FDW look like? |
Date | |
Msg-id | CALBNtw4dpNi9YJksmrju6S8BZLy-vd=kpvK7K3fH4pxqM1r9aw@mail.gmail.com |
In response to | Re: what would tar file FDW look like? (Greg Stark <stark@mit.edu>) |
List | pgsql-hackers |
<div dir="ltr"><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">I've written readers for both from scratch. Tar isn't that bad since it's blocked - you read the header, skip forward N blocks, continue. The hardest part is setting up the decompression libraries if you want to support tar.gz or tar.bz2 files.</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">Zip files are more complex. You have (iirc) 5 control blocks - start of archive, start of file, end of file, start of index, end of archive, and the information in the control block is pretty limited. That's not a huge burden since there's support for extensions for things like the unix file metadata. One complication is that you need to support compression from the start.</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">Zip files support two types of encryption. There's a really weak version that almost nobody supports and a much stronger modern version that's subject to license restrictions. (Some people use the weak version on embedded systems because of legal requirements to /do something/, no matter how lame.)</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">There are third-party libraries, of course, but that introduces dependencies. Both formats are simple enough to write from scratch.</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">I guess my bigger question is if there's an interest in either or both for "real" use. 
I'm doing this as an exercise but am willing to contrib the code if there's a general interest in it.</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">(BTW the more complex object I'm working on is the .p12 keystore for digital certificates and private keys. We have everything we need in the openssl library so there's no additional third-party dependencies. I have a minimal FDW for the digital certificate itself and am now working on a way to access keys stored in a standard format on the filesystem instead of in the database itself. A natural fit is a specialized archive FDW. Unlike tar and zip it will have two payloads, the digital certificate and the (optionally encrypted) private key. It has searchable metadata, e.g., finding all records with a specific subject.)</div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000"><br /></div><div class="gmail_default" style="font-family:tahoma,sans-serif;color:#000000">Bear</div></div><div class="gmail_extra"><br /><div class="gmail_quote">On Mon, Aug 17, 2015 at 8:29 AM, Greg Stark <span dir="ltr">&lt;<a href="mailto:stark@mit.edu" target="_blank">stark@mit.edu</a>&gt;</span> wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Mon, Aug 17, 2015 at 3:14 PM, Bear Giles &lt;<a href="mailto:bgiles@coyotesong.com">bgiles@coyotesong.com</a>&gt; wrote:<br /> &gt; I'm starting to work on a tar FDW as a proxy for a much more specific FDW.<br /> &gt; (It's the 'faster to build two and toss the first away' approach - tar lets<br /> &gt; me get the FDW stuff nailed down before attacking the more complex<br /> &gt; container.) It could also be useful in its own right, or as the basis for a<br /> &gt; zip file FDW.<br /><br /></span>Hm. tar may be a bad fit where zip may be much easier. Tar has no<br /> index or table of contents. 
You have to scan the entire file to find<br /> all the members. IIRC Zip does have a table of contents at the end of<br /> the file.<br /><br /> The most efficient way to process a tar file is to describe exactly<br /> what you want to happen with each member and then process it linearly<br /> from start to end (or until you've found the members you're looking<br /> for). Trying to return meta info and then go looking for individual<br /> members will be quite slow and have a large startup cost.<br /><span class="HOEnZb"><font color="#888888"><br /><br /> --<br /> greg<br /></font></span></blockquote></div><br /></div>
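The "read the header, skip forward N blocks, continue" loop described for tar can be sketched roughly as below. This is only an illustration, not code from the thread: the function name `list_tar_members` is made up, the input is assumed to be an uncompressed, seekable archive with plain ustar-style headers, and the ustar `prefix` field, GNU long-name records, and pax extended headers are all ignored. It also shows why a full listing means walking the whole file, since there is no index to jump to.

```python
import io

BLOCK = 512  # tar archives are laid out in 512-byte blocks


def list_tar_members(stream):
    """Yield (name, size) per member by reading each header and skipping its payload.

    Hypothetical helper for illustration; assumes simple ustar-style headers.
    """
    while True:
        header = stream.read(BLOCK)
        # A short read or an all-zero block is treated as the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            return
        # Bytes 0-99 hold the NUL-terminated member name.
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        # Bytes 124-135 hold the payload size as an octal string.
        size_field = header[124:136].strip(b"\0 ")
        size = int(size_field, 8) if size_field else 0
        yield name, size
        # The payload is padded to a whole number of blocks; skip over it.
        payload_blocks = (size + BLOCK - 1) // BLOCK
        stream.seek(payload_blocks * BLOCK, io.SEEK_CUR)
```

For a compressed archive (tar.gz, tar.bz2) the same loop would have to read and discard the payload through the decompressor rather than seek past it, which is part of why a linear pass is the natural way to process tar.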