Re: Big 7.1 open items - Mailing list pgsql-hackers
From | JanWieck@t-online.de (Jan Wieck) |
---|---|
Subject | Re: Big 7.1 open items |
Date | |
Msg-id | 200006180010.CAA07405@hot.jw.home Whole thread Raw |
In response to | Re: Big 7.1 open items (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Tom Lane wrote: > JanWieck@t-online.de (Jan Wieck) writes: > > Tom Lane wrote: > >> It gets a little trickier if you want to be able to split > >> multi-gig tables across several tablespaces, though, since > >> you couldn't just append ".N" to the base table path in that > >> scenario. > >> > >> I'd be interested to know what sort of facilities Oracle > >> provides for managing huge tables... > > > Oracle tablespaces are a collection of 1...n preallocated > > files. Each table then is bound to a tablespace and > > allocates extents (chunks) from those files. > > OK, to get back to the point here: so in Oracle, tables can't cross > tablespace boundaries, but a tablespace itself could span multiple > disks? They can. The path in ALTER TABLESPACE <tsname> ADD DATAFILE ... can point to any location the db system has access to. > > Not sure if I like that better or worse than equating a tablespace > with a directory (so, presumably, all the files within it live on > one filesystem) and then trying to make tables able to span > tablespaces. We will need to do one or the other though, if we want > to have any significant improvement over the current state of affairs > for large tables. > > One way is to play the flip-the-path-ordering game some more, > and access multiple-segment tables with pathnames like this: > > .../TABLESPACE/RELATION -- first or only segment > .../TABLESPACE/N/RELATION -- N'th extension segment > > [...] In most cases all objects in one database are bound to one or two tablespaces (data and indices). So you do an estimation of the size required, create the tablespaces (and probably all their extension files), then create the schema and load it. The only reason not to do so is if your DB exceeds some size where you have to fear not beeingable to finish online backups before getting into Online-Relolog stuck. Has to do the the online backup procedureof Oracle. > This isn't any harder for md.c to deal with than what we do now, > but by making the /N subdirectories be symlinks, the dbadmin could > easily arrange for extension segments to go on different filesystems. > Also, since /N subdirectory symlinks can be added as needed, > expanding available space by attaching more disks isn't hard. > (If the admin hasn't pre-made a /N symlink when it's needed, > I'd envision the backend just automatically creating a plain > subdirectory so that it can extend the table.) So the admin allways have to leave enough freespace in the default location to keep the DB running until he cantake it offline, move the autocreated files and create the symlinks. What a pain for 24/7 systems. > We'd still want to create some tools to help the dbadmin with slinging > all these symlinks around, of course. But I think it's critical to keep > the low-level file access protocol simple and reliable, which really > means minimizing the amount of information the backend needs to know to > figure out which file to write a page in. With something like the above > you only need to know the tablespace name (or more likely OID), the > relation OID (+name or not, depending on outcome of other argument), > and the offset in the table. No worse than now from the software's > point of view. Exactly the "low-level file access" protocol is highly complicated in Postgres. Because nearly every object needs his own file, we need to deal with virtual file descriptors. With an Oracle-like tablespace concept anda fixed limit of total tablespace files (this time OS or installation specific), we could keep them allopen all the time. IMHO a big win. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
pgsql-hackers by date: