Re: Table spaces again [was Re: Threaded Sorting] - Mailing list pgsql-hackers

From Shridhar Daithankar
Subject Re: Table spaces again [was Re: Threaded Sorting]
Msg-id 3DA1E866.19584.108D05BC@localhost
In response to Re: Table spaces again [was Re: Threaded Sorting]  (Hans-Jürgen Schönig <postgres@cybertec.at>)
List pgsql-hackers
On 7 Oct 2002 at 15:52, Hans-Jürgen Schönig wrote:

> >Can anybody please tell me in detail.(Not just a pointing towards TODO items)
> >1) What a table space supposed to offer?
> They allow you to define a maximum amount of storage for a certain set 
> of data.

Use quotas.

> They help you to define the location of data.

Mount or symlink wherever you want, assuming the database supports a
one-directory-per-object layout. This gives finer control than tablespaces,
since a tablespace often hosts many objects, possibly from many databases.
Once a tablespace is created, it is difficult to keep track of everything that
goes into it. With directories, you look at the directory structure and you
get a clear picture.
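
To make the mount/symlink idea concrete, here is a minimal Python sketch of
moving one object's directory to another disk and leaving a symlink behind.
It assumes a POSIX filesystem and that nothing is writing to the directory
while it moves. The function name and the paths are made up for illustration;
nothing here is PostgreSQL's actual layout.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(object_dir: str, new_parent: str) -> str:
    """Move object_dir under new_parent and leave a symlink at the old path.

    Hypothetical helper: assumes the object is not in use during the move.
    """
    target = os.path.join(new_parent, os.path.basename(object_dir))
    shutil.move(object_dir, target)  # falls back to copy + delete across filesystems
    os.symlink(target, object_dir)   # the old path keeps resolving to the data
    return target


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    old = os.path.join(base, "data", "mydb", "orders_idx")  # illustrative path
    os.makedirs(old)
    fast_disk = os.path.join(base, "fast_disk")              # stands in for a mount point
    os.makedirs(fast_disk)
    print(relocate_with_symlink(old, fast_disk))
    print(os.path.islink(old))
```

After the move, a plain directory listing shows which objects live elsewhere,
because their entries are symlinks.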

> They help you to define how much data can be used by which resource.

Which resource? I am confused. Disk space is the only resource we are talking
about.

> >2) What a directory structure does not offer that table space does?
> You need the command line in order to manage quotas - you might not 
> want that.

Mount a directory on a partition. If the data outgrows that partition, there
will be a disk error, just like a tablespace overflowing. I have seen both
scenarios in action.

> Quotas are handled differently on every platform (if available).

Yeah, but that's the sysadmin's responsibility, not the DBA's.

> With tablespaces you can assign 30 MB to user a, 120 MB to user b etc. ...
> Table spaces are a nice abstraction layer to the file system.

Hmm. And how does that fit the database metaphor? What practical use is that?
I can't imagine one, as I am a developer and not a DBA.

> how would you handle table spaces? just propose it to the hackers' list ...
> we should definitely discuss that ...
> a bad implementation of table spaces would be painful ...

I suggest a directory-per-object structure. A database gets its own directory,
an index gets its own directory, etc.

In addition, PostgreSQL should offer the capability to suspend a database or
table so that it can be moved without restarting the database daemon.

This is to acknowledge the fact that database storage is relocatable. In the
current implementation, the database does not offer any assistance in
relocating its on-disk structure.

Besides, PostgreSQL runs a cluster of databases, as opposed to the single
database run by the likes of Oracle. So relocating a single database should
not affect the others.

Additionally, PostgreSQL should handle out-of-disk-space errors gracefully,
noting that out of space for one object may not mean out of space for all
objects. Obviously tablespaces can do better here.
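
As a rough illustration of per-object handling of disk-full errors, the
Python sketch below catches only ENOSPC and marks the affected object as
full, while writes to other objects carry on. The `ObjectStore` class and
the injected `writer` callable are hypothetical; this shows the idea only
and says nothing about how the backend actually does its I/O.

```python
import errno


class ObjectStore:
    """Tracks which objects ran out of space (hypothetical sketch)."""

    def __init__(self, writer):
        self.writer = writer  # callable(obj_name, data) that does the real I/O
        self.full = set()     # names of objects whose storage is exhausted

    def write(self, obj_name, data):
        """Return True on success, False if this object's storage is full."""
        if obj_name in self.full:
            return False
        try:
            self.writer(obj_name, data)
            return True
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise                 # any other I/O error is not handled here
            self.full.add(obj_name)   # only this object is affected
            return False
```

The point of the design is that a full partition under one object does not
turn into a failure for objects stored elsewhere.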

Let's say that for each object that gets storage, e.g. database, table, index
and transaction log, we maintain a flag in the metadata to indicate whether it
has been relocated or not. PostgreSQL can offer a command to set this flag.
Typically the command sequence would look like this:

sql> take <object> offline;
---
relocate/symlink/remount the appropriate dir.
--
sql> take <object> online mark relocated;
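
The bookkeeping behind such a sequence could be sketched in Python as below.
All names (`Catalog`, `StorageObject`, `ObjectOffline`) are hypothetical and
only model the online/relocated flags described above; the `take ... offline`
syntax itself is a proposal, not an existing command.

```python
from dataclasses import dataclass


class ObjectOffline(Exception):
    """Raised when an offline object is accessed."""


@dataclass
class StorageObject:
    name: str
    online: bool = True
    relocated: bool = False


class Catalog:
    """In-memory stand-in for the metadata flags (hypothetical)."""

    def __init__(self, names):
        self.objects = {n: StorageObject(n) for n in names}

    def take_offline(self, name):
        self.objects[name].online = False

    def take_online(self, name, mark_relocated=False):
        obj = self.objects[name]
        obj.online = True
        if mark_relocated:
            obj.relocated = True

    def check_access(self, name):
        """Return the object, or raise ObjectOffline if it is suspended."""
        obj = self.objects[name]
        if not obj.online:
            raise ObjectOffline(f"{name} is offline")
        return obj
```

Taking one object offline leaves every other object in the catalog
accessible, which is the property the proposal depends on.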

PostgreSQL should continue working if a relocated object runs out of disk
space.

Of course, if PostgreSQL could accept arguments at object creation time for
alternate directories to symlink to, that would be better than sliced bread.
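
A minimal Python sketch of that creation-time option, assuming a POSIX
filesystem: the object's directory is created at the alternate location and a
symlink is placed where the directory would normally go. The function name
and the `location` parameter are invented for illustration.

```python
import os
import tempfile


def create_object_dir(data_root, name, location=None):
    """Create storage for an object, optionally at an alternate location.

    Hypothetical helper: with no location the directory is created in place;
    otherwise it is created under location and symlinked from data_root.
    """
    path = os.path.join(data_root, name)
    if location is None:
        os.mkdir(path)
    else:
        target = os.path.join(location, name)
        os.mkdir(target)
        os.symlink(target, path)
    return path


if __name__ == "__main__":
    root = tempfile.mkdtemp()   # stands in for the data directory
    other = tempfile.mkdtemp()  # stands in for another disk
    print(create_object_dir(root, "mydb"))
    print(create_object_dir(root, "big_idx", location=other))
```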

I believe giving each database its own transaction log would be a great
advantage of this scheme.

These are some thoughts on how it would work. I believe this course of action
would make the transition easier, offer much more granular control over
object allocation on disk, and ultimately prove useful to users.

I might have lost a couple of points, as I took a long time composing this.
But having all the possible objections dealt with should be the starting
point.

Thanks once again..

Bye
 Shridhar

--
Cheit's Lament: If you help a friend in need, he is sure to remember you--
the next time he's in need.


