Re: Best Strategy for Large Number of Images - Mailing list pgsql-general

From Imre Samu
Subject Re: Best Strategy for Large Number of Images
Date
Msg-id CAJnEWw=SR8oKyEqKZo2q=R8_27jzL-mMG06iFQy=u3c_pqC_gw@mail.gmail.com
In response to Re: Best Strategy for Large Number of Images  (Estevan Rech <softrech@gmail.com>)
List pgsql-general
> ... I have about 2 million images ... 
folder structure 

The "Who's On First" gazetteer, with ~26M GeoJSON records, uses a subfolder structure built from 3-number chunks of the ID:
"Given a Who's On First ID its (relative) URI can be derived by splitting the ID in to 3-number chunks representing nested subdirectories, followed by filename consisting of the ID followed by .geojson. For example the ID for Montréal is 101736545 which becomes: 101/736/545/101736545.geojson"
This works, but it is not optimal either:
"As of this writing it remains clear that this approach (lots of tiny files parented by lots of nested directories) can be problematic. We may be forced to choose another approach, like fewer subdirectories but nothing has been decided and anything we do will be backwards compatible." ( from https://whosonfirst.org/data/principles/ )
The structure has since been migrated to per-country repositories ( https://whosonfirst.org/blog/2019/05/09/changes/ )

Maybe you can adopt some of these ideas.
IMHO: with 3-number chunks as nested subdirectories, each directory level holds at most 1,000 subdirectories, so you have a wider choice of file systems and hosting solutions.
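
For illustration, here is a minimal Python sketch of the chunking rule quoted above. The function name chunked_path and its parameters are hypothetical, not part of any Who's On First tooling, and letting the last chunk be shorter when the ID length is not a multiple of 3 is my assumption (the quote only shows a 9-digit ID).

```python
from pathlib import PurePosixPath


def chunked_path(record_id: int, ext: str = ".geojson", chunk: int = 3) -> str:
    """Derive a nested relative path from a numeric ID.

    Hypothetical helper: splits the decimal digits of the ID, left to
    right, into groups of `chunk` digits and uses them as subdirectories.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    digits = str(record_id)
    # Assumption: the final group may be shorter than `chunk` digits.
    parts = [digits[i:i + chunk] for i in range(0, len(digits), chunk)]
    # The file name is the full ID followed by the extension.
    return str(PurePosixPath(*parts, digits + ext))


print(chunked_path(101736545))  # 101/736/545/101736545.geojson
```

For images, the same helper could be called with ext=".jpg", with the database holding only the numeric ID and the path derived on demand.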

regards,
 Imre

Estevan Rech <softrech@gmail.com> wrote (on Mon, Dec 20, 2021, 11:30):
How does this folder structure work with something like 10,000 folders? And how long does it take to back it up?
