Re: Storing files: 2.3TBytes, 17M file count - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Storing files: 2.3TBytes, 17M file count
Date
Msg-id 696ffd72-fead-6780-8605-55e29438c5ee@aklaver.com
In response to Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
Responses Re: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
List pgsql-general
On 11/28/2016 06:28 AM, Thomas Güttler wrote:
> Hi,
>
> PostgreSQL is rock solid and one of the most reliable parts of our
> toolchain.
>
>    Thank you
>
> Up to now, we don't store files in PostgreSQL.
>
> I was told, that you must not do this .... But this was 20 years ago.
>
>
> I have 2.3TBytes of files. File count is 17M
>
> Up to now we use rsync (via rsnapshot) to backup our data.
>
> But it takes longer and longer for rsync to detect
> the changes. Rsync checks many files. But daily only
> very few files really change. More than 99.9% don't.

Are you rsyncing over all the files at one time?

Or do you break it down into segments over the day?

>
> Since we already store our structured data in postgres, I think
> about storing the files in PostgreSQL, too.
>
> What is the current state of the art?

I don't know.

>
> Is it feasible to store file in PostgreSQL?

Yes, you can store a file in Postgres. Still, I am not sure that stuffing
17M files into Postgres is going to perform any better than dealing with
them on the file system. In fact, in Postgres they would still be on
the file system, but with an extra layer above them.

>
> Are there already projects which use PostgreSQL as storage backend?

The closest I remember is Bacula:

http://blog.bacula.org/documentation/documentation/

It uses a hybrid solution where the files are stored on a file server
and data about the files is stored in a database. Postgres is one of the
database backends it can work with.
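
A minimal sketch of that kind of hybrid layout, to make the idea concrete.
This is not Bacula's schema or code. The table name file_meta, its columns,
and the function index_tree are made up for illustration, and SQLite from
the Python standard library stands in for Postgres so the example runs
without a server. The file contents stay on disk; only path, size,
modification time and a checksum go into the database.

```python
import hashlib
import os
import sqlite3


def index_tree(conn, root):
    """Record metadata for files under root that are new or have changed.

    The files themselves are left on the file system. Returns the list of
    paths whose metadata row was inserted or updated on this run.
    """
    # Hypothetical schema: one row per file, keyed by its path.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_meta ("
        " path TEXT PRIMARY KEY,"
        " size INTEGER,"
        " mtime_ns INTEGER,"
        " sha256 TEXT)"
    )
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            row = conn.execute(
                "SELECT size, mtime_ns FROM file_meta WHERE path = ?",
                (path,),
            ).fetchone()
            if row == (st.st_size, st.st_mtime_ns):
                continue  # same size and mtime as last run: treat as unchanged
            # Only new or modified files are read and hashed.
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            # SQLite upsert; Postgres would use INSERT ... ON CONFLICT instead.
            conn.execute(
                "INSERT OR REPLACE INTO file_meta VALUES (?, ?, ?, ?)",
                (path, st.st_size, st.st_mtime_ns, digest),
            )
            changed.append(path)
    conn.commit()
    return changed
```

Calling index_tree(sqlite3.connect("meta.db"), "/some/dir") twice in a row
returns every file the first time and an empty list the second time if
nothing was touched. Note that it still has to stat() every file on each
run, so by itself it does not avoid the tree walk that rsync also does.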

>
> I have the hope, that it would be easier to backup only the files which
> changed.

Backup to where and how?
Are you thinking of using replication?

>
> Regards,
>    Thomas Güttler
>
>
> Related question at rsnapshot mailing list:
> https://sourceforge.net/p/rsnapshot/mailman/rsnapshot-discuss/thread/57A1A2F3.5090409@thomas-guettler.de/
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

