Re: Storing files: 2.3TBytes, 17M file count - Mailing list pgsql-general

From Mike Sofen
Subject Re: Storing files: 2.3TBytes, 17M file count
Date
Msg-id 00f201d249da$e142b830$a3c82890$@runbox.com
In response to Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
Responses Re: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
List pgsql-general

From: Thomas Güttler   Sent: Monday, November 28, 2016 6:28 AM

...I have 2.3TBytes of files. File count is 17M

Since we already store our structured data in postgres, I think about storing the files in PostgreSQL, too.

Is it feasible to store files in PostgreSQL?

-------

I am doing something similar, but in reverse.  The legacy MySQL databases that I’m converting into a modern Postgres data model have very large genomic strings stored in 3 separate columns.  Out of the 25 TB of legacy data storage (in 800 dbs across 4 servers, about 22 billion rows), those 3 columns consume 90% of the total space, and they are used only for reference, never in searches or calculations.  The strings range from 1 KB to several MB.

Since I am collapsing all 800 dbs into a single PG db, being very smart about storage was critical.  Since we’re also migrating everything to AWS, we’re placing those 3 strings (per row) into a single JSON document and storing the document in an S3 bucket, with the pointer to the file being the globally unique PK for the row.  Super simple.  The app tier knows to fetch the row data from the db and the large-string JSON from S3.  Retrieval time is surprisingly fast; this is all real-time web app stuff.
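
To make the layout concrete, here is a minimal Python sketch of the idea, not our actual code.  The names (InMemoryObjectStore, object_key, seq_a/seq_b/seq_c) are all hypothetical, and the in-memory store is just a stand-in so the example runs without AWS credentials; with boto3 the put/get calls would presumably map to the S3 client's put_object and get_object.

```python
import json
import uuid


class InMemoryObjectStore:
    """Hypothetical stand-in for an S3 bucket (no AWS access needed)."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, body: bytes) -> None:
        # With boto3 this would roughly be s3.put_object(Bucket=..., Key=key, Body=body).
        self._objects[key] = body

    def get(self, key: str) -> bytes:
        # With boto3 this would roughly be s3.get_object(Bucket=..., Key=key)["Body"].read().
        return self._objects[key]


def object_key(row_pk: str) -> str:
    # The row's globally unique PK doubles as the pointer to the document,
    # so the database needs no extra "file location" column.
    return f"{row_pk}.json"


def store_large_strings(store, row_pk: str, seq_a: str, seq_b: str, seq_c: str) -> None:
    # Collapse the three large columns into one JSON document per row.
    doc = json.dumps({"seq_a": seq_a, "seq_b": seq_b, "seq_c": seq_c})
    store.put(object_key(row_pk), doc.encode("utf-8"))


def fetch_large_strings(store, row_pk: str) -> dict:
    # The app tier calls this only when it actually needs the reference strings.
    return json.loads(store.get(object_key(row_pk)).decode("utf-8"))


if __name__ == "__main__":
    store = InMemoryObjectStore()
    pk = str(uuid.uuid4())  # assumed PK format; any globally unique value works
    store_large_strings(store, pk, "ACGT", "GGCC", "TTAA")
    print(fetch_large_strings(store, pk))
```

The structured columns stay in Postgres; only the PK is shared between the two stores, which is what keeps the lookup simple.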

This is a model that could work for anyone dealing with large objects (text or binary).  The nice part is that the original 25 TB of data storage drops to 5 TB, a much more manageable number that leaves room for the significant growth on the horizon.

Mike Sofen  (Synthetic Genomics USA)
