Thread: Speed of locating tables?
Given the name of a table, how quickly can PostgreSQL locate and access the table? How does the performance compare (ballpark estimate) with just using flat files to represent the data in each table?

I have a problem where the (to me) most natural solution is to create a large number of small tables. A new solar telescope/instrument set we're building needs to share configuration information (sets of attribute name-value pairs) across a distributed environment, plus retain these sets for possible reuse. Typically, 10-30 thousand of these sets will be created each day. Each set has a unique id string associated with it.

When an attribute set is needed, it is needed quickly - every 1/5 of a second or so a request will be made of the system that requires access to one of the sets. This request will be via the id string, never by any more complex scheme.

To me, the most natural way to encode the sets is to create a separate table for each set, since the attributes can then be indexed and referenced quickly once the table is accessed. But I don't know how fast PG is at locating a table, given its name. So, to refine the question - given a DB with (say) 100,000 tables, how quickly can PG access a table given its name?

Thanks! I'm also open to suggestions on other ways to represent the data that would provide better access performance - you can probably tell I'm new to the world of databases.

--
Steve Wampler - SOLIS Project, National Solar Observatory
swampler@noao.edu
Hi,

Steve Wampler:
> So, to refine the question - given a DB with (say) 100,000
> tables, how quickly can PG access a table given its name?

You will then have a directory with one hundred thousand entries (plus one for each index you decide to create). This will slow down your system like hell.

> Thanks! I'm also open to suggestions on other ways to
> represent the data that would provide better access
> performance -

Easy. Every table gets another column "telescope_ID", and you create a new table which maps from the 'scope's ID to its name and vice versa.

> you can probably tell I'm new to the world of databases.

You definitely want to read an introductory book / online article about the subject.

--
Matthias Urlichs  |  noris network GmbH  |  smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really. | http://smurf.noris.de/
--
There are those who argue that everything breaks even... I suppose that
because the rich man gets ice in the summer and the poor man gets it in
the winter things are breaking even for both.
                                                        -- Bat Masterson
Steve Wampler <swampler@noao.edu> writes:
> To me, the most natural way to encode the sets is to
> create a separate table for each set, since the attributes
> can then be indexed and referenced quickly once the table
> is accessed. But I don't know how fast PG is at locating
> a table, given its name.

> So, to refine the question - given a DB with (say) 100,000
> tables, how quickly can PG access a table given its name?

Don't even think about 100000 separate tables in a database :-(. It's not so much that PG's own datastructures wouldn't cope, as that very few Unix filesystems can cope with 100000 files in a directory. You'd be killed on directory search times.

I don't see a good reason to be using more than one table for your attributes --- add one more column to what you were going to use, to contain an ID for each attribute set, and you'll be a lot better off. You'll want to make sure there's an index on the ID column, of course, or on whichever columns you plan to search by.

regards, tom lane
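To make the single-table suggestion concrete, here is a minimal sketch of the kind of schema Tom and Matthias are describing. The table and column names (attributes, set_id, and the sample id value) are illustrative only, not taken from the thread:

    -- One row per attribute; every configuration set shares this table.
    CREATE TABLE attributes (
        set_id  varchar(64),   -- the configuration set's unique id string
        name    varchar(64),   -- attribute name
        value   varchar(256)   -- attribute value
    );

    -- The index that makes lookup by set id fast.
    CREATE INDEX attributes_set_id_idx ON attributes (set_id);

    -- Fetch one whole configuration set (roughly 20 rows):
    SELECT name, value FROM attributes WHERE set_id = 'camera-20000526-0042';

With the index on set_id, each lookup is a short index scan over the ~20 rows of one set, which is exactly the kind of query the later replies describe as what a DBMS "lives to do".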
Thanks for the quick response!

Matthias Urlichs wrote:
> Steve Wampler:
> > So, to refine the question - given a DB with (say) 100,000
> > tables, how quickly can PG access a table given its name?
>
> You will then have a directory with one hundred thousand entries (plus
> one for each index you decide to create). This will slow down your
> system like hell.

Ah - I hadn't appreciated that postgres stores each table as a file! I assume that means also that every database is managed as a single directory (i.e. no way to spread tables in a database across multiple directories).

> > Thanks! I'm also open to suggestions on other ways to
> > represent the data that would provide better access
> > performance -
>
> Easy. Every table gets another column "telescope_ID", and you create
> a new table which maps from the 'scope's ID to its name and vice versa.

Hmmm, but there's now only one table (with around 2 million rows) instead of 100,000 (each with around 20 rows), right? (I'm confused by the "Every table gets...".) I read what you've written as saying "add the configuration set ID to each attribute (so now it's an ID, name, value triple instead of a pair), storing all attributes for all sets in a single table and then, when given an ID, search the table, collect the matching rows into a temporary table and reference the attributes from that temporary table" - is that correct? [I don't need fast update of the attributes, just fast read access while the sun is shining.]

I guess I don't understand the "name" in the 2nd table you've mentioned - I have a "config set ID" identifying each set of attributes, not an ID and a name.

I imagine I can pull "old" configuration sets from the large table (every night, perhaps) and archive them into some other table(s) to keep the size of the active table smaller.

> > you can probably tell I'm new to the world of databases.
>
> You definitely want to read an introductory book / online article about
> the subject.

Thanks - any favorites? I've looked through a couple, but they seem to concentrate on performance issues when the query is complicated, not on quickly locating a small set of information from a larger set with a simple query (in my case, the config set ID...).

Thanks again!
--
Steve Wampler - SOLIS Project, National Solar Observatory
swampler@noao.edu
Cary O'Brien wrote:
>
> Steve Wampler wrote:
> > Thanks! I'm also open to suggestions on other ways to
> > represent the data that would provide better access
> > performance - you can probably tell I'm new to the world of
> > databases.
>
> I wouldn't want a database with 100,000 tables. How about
> something like this...
>
>   create table run_time_parameters (
>       id integer,
>       name varchar,
>       value varchar,
>       ...
>   );
>
>   create index run_time_parameters_id_idx on run_time_parameters(id);
>
>   ... insert data ...
>
>   vacuum analyze run_time_parameters;

This sounds promising. Most of the insertions will take place in the morning (before sunrise), so I should have time to do a vacuum analyze at that point. However, some insertions will take place after that, while the database is being accessed. I assume that the vacuum analyze done earlier (on 90% of the total insertions) will still be a net win? I'll have to test this.

> One final point. You have a non-trivial performance requirement.
> You should probably set up a simple working prototype and verify
> that you can get the performance you need using tables that are
> the size you expect to handle. No sense in having any surprises
> when you roll out your real system. Scripting languages like
> TCL and Python are great for generating data and running tests.

I'm in that stage now, but want to narrow down the number of possible solutions to try out... For example, I've thought about having one row per configuration (id, attributeSet) and managing the access of individual attributes outside the database. That seems clumsy to me, however.

> Feel free to ask me if you have any more questions.

Thanks! Most helpful!
--
Steve Wampler - SOLIS Project, National Solar Observatory
swampler@noao.edu
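Picking up Cary's prototyping advice, one quick check worth running (a sketch only, reusing his table name) is to confirm with EXPLAIN that lookups by id actually use the index once the statistics are in place:

    -- Refresh planner statistics after the morning bulk load.
    VACUUM ANALYZE run_time_parameters;

    -- The reported plan should be an index scan, not a sequential scan
    -- over the full two-million-row table.
    EXPLAIN SELECT name, value FROM run_time_parameters WHERE id = 12345;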
Steve Wampler <swampler@noao.edu> writes:
> Hmmm, but there's now only one table (with around 2 million rows)
> instead of 100,000 (each with around 20 rows), right? (I'm confused by
> the "Every table gets...".) I read what you've written as saying "add
> the configuration set ID to each attribute (so now it's an ID, name,
> value triple instead of a pair), storing all attributes for all sets
> in a single table

Right so far.

> and then, when given an ID, search the table,
> collect the matching rows into a temporary table and reference the
> attributes from that temporary table" - is that correct? [I don't
> need fast update of the attributes, just fast read access while the
> sun is shining.]

There's no particular reason to make a temp table. As long as you have an index on the ID column, queries like

    SELECT * FROM configurations WHERE id = 'foo';

will be fast. This is what DBMSes live to do.

> I imagine I can pull "old" configuration sets from the large table
> (every night, perhaps) and archive them into some other table(s) to
> keep the size of the active table smaller.

If you insist, but it's unlikely to be worth the trouble.

regards, tom lane
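If the nightly archiving were done anyway, a minimal sketch might look like the one below. It assumes a creation-timestamp column (here called created) that is not part of the schema discussed above, plus an archive table with identical columns; both are hypothetical additions for illustration only:

    BEGIN;
    -- Copy sets created before today into the archive table.
    INSERT INTO configurations_archive
        SELECT * FROM configurations WHERE created < current_date;
    -- Then drop them from the active table, all in one transaction.
    DELETE FROM configurations WHERE created < current_date;
    COMMIT;

As Tom notes, though, an indexed lookup stays fast as the table grows, so it is worth measuring whether archiving buys anything at all before adopting it.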
Does this also mean that if you are using large objects that you really won't be able to store large numbers of large objects in a database? (If I am correct, each large object creates two files, one for the large object and one for its index.)

If this is true for Large Objects, is there any workaround? The application I am porting from Oracle will be storing on the order of 1,000,000 large objects.

thanks,
--Barry

Tom Lane wrote:
> Don't even think about 100000 separate tables in a database :-(.
> It's not so much that PG's own datastructures wouldn't cope,
> as that very few Unix filesystems can cope with 100000 files
> in a directory. You'd be killed on directory search times.
Barry Lind <barry@xythos.com> writes:
> Does this also mean that if you are using large objects that you really
> won't be able to store large numbers of large objects in a database?
> (If I am correct, each large object creates two files, one for the large
> object and one for its index.)

Yup.

> If this is true for Large Objects, is
> there any workaround? The application I am porting from Oracle will be
> storing on the order of 1,000,000 large objects.

You are going to have some serious problems :-(

There's never been much enthusiasm among the core developers for large objects at all --- we see them as a poor substitute for allowing large values directly. (The "TOAST" work scheduled for 7.1 will finally resolve that issue, I hope.) So no one's felt like working on improving the large-object implementation.

If someone did want to work on it, what would probably make sense is to eliminate the separate-table-per-object setup in favor of one big table holding all the large objects of a database. It'd be easy enough to do; the big table would be built just like LO tables are now, but with an extra column holding the large object OID associated with each row. And you'd add that column to the index of course. You might have to think a little about how the existing LO locking semantics should translate into that environment, but I see no showstoppers.

(It might've been done the way it was done because there didn't use to be support for tables > 2gig, but in the current system I see no reason to be afraid of having one big table for LOs instead of many not-so-big ones.)

I doubt this would be a big project ... it just needs someone who cares enough about large objects to do the work ...

regards, tom lane
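Just to picture the layout Tom is describing, here is a rough, hypothetical sketch - not a real catalog definition. Only the loid column name comes from the thread; the proposed system table itself could not be created with plain SQL like this:

    -- One row per page of LO data, for every large object in the database.
    CREATE TABLE largeobjects (
        loid    oid,      -- which large object the page belongs to (the extra column)
        pageno  int4,     -- page number within that object
        data    bytea     -- the page's bytes
    );

    -- The per-object index the current LO tables have, extended with loid.
    CREATE UNIQUE INDEX largeobjects_loid_pageno ON largeobjects (loid, pageno);

With a single table, 1,000,000 large objects become 1,000,000 distinct loid values rather than roughly 2,000,000 files in the database directory.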
This sounds doable. It would probably also remove the problem I have that unlinks of large objects can't be rolled back.

thanks,
--Barry

Tom Lane wrote:
> If someone did want to work on it, what would probably make sense is to
> eliminate the separate-table-per-object setup in favor of one big table
> holding all the large objects of a database. It'd be easy enough to do;
> the big table would be built just like LO tables are now, but with an
> extra column holding the large object OID associated with each row. And
> you'd add that column to the index of course.
>
> I doubt this would be a big project ... it just needs someone who cares
> enough about large objects to do the work ...
Tom Lane wrote:
> (It might've been done the way it was done because there didn't use to
> be support for tables > 2gig, but in the current system I see no reason
> to be afraid of having one big table for LOs instead of many not-so-big
> ones.)

I am curious: how does PostgreSQL support tables larger than 2 gig, given the file-per-table architecture?

--Barry
Barry Lind <barry@xythos.com> writes:
> This sounds doable. It would probably also remove the problem I have
> that unlinks of large objects can't be rolled back.

Right, lo_unlink would become something like

    DELETE FROM pg_largeobjects WHERE loid = NNNN;

which is perfectly rollbackable. This would also solve the problem a number of people have had with accessing thousands of LOs in a single transaction. (The lock manager tends to run out of shared memory when asked to keep track of that many table locks :-(.)

This might also make dump/restore of large objects more manageable, since you'd just be talking about dumping & restoring the contents of this one table instead of having to look around to see what LOs exist.

A disadvantage I didn't think of before is that it'd take extra work to implement access protection of LOs --- we do not have any existing way of protecting individual rows in a table...

regards, tom lane
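Using the hypothetical largeobjects layout sketched earlier (names assumed, not an existing catalog), the rollback point can be seen directly: under the proposed scheme an unlink is just a row deletion inside a transaction.

    BEGIN;
    -- "Unlink" large object 12345 under the one-big-table scheme.
    DELETE FROM largeobjects WHERE loid = 12345;
    -- Changed our minds: unlike today's lo_unlink(), this is undoable.
    ROLLBACK;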
Barry Lind <barry@xythos.com> writes:
> I am curious: how does PostgreSQL support tables larger than 2 gig, given
> the file-per-table architecture?

Multiple files per table ...

regards, tom lane
At 01:36 PM 26-05-2000 -0400, Tom Lane wrote:
>Barry Lind <barry@xythos.com> writes:
>> Does this also mean that if you are using large objects that you really
>> won't be able to store large numbers of large objects in a database?
>> (If I am correct, each large object creates two files, one for the large
>> object and one for its index.)
>
>Yup.

Wow! For my webmail app that would be really bad - fortunately I went the filesystem way for storing the actual emails, with the database storing the path. In theory, if BLOBs were handled better, then storing them in the database would be quite nice, but right now the BLOBs don't seem to be helpful.

>There's never been much enthusiasm among the core developers for large
>objects at all --- we see them as a poor substitute for allowing large
>values directly. (The "TOAST" work scheduled for 7.1 will finally
>resolve that issue, I hope.) So no one's felt like working on improving
>the large-object implementation.

On the practical side, say I want to insert/read a large amount of information into/from a TOAST field. How should I do it? Is there a pipe method where I can continuously print to/read from?

My worry is that if it's just like a standard insert/select command, it will take up a lot of memory to insert/select big stuff. So if lots of people are inserting/reading 1MB email attachments at the same time it'll get nasty. For other apps with really big stuff it could become really unmanageable.

Cheerio,
Link.
Lincoln Yeoh <lylyeoh@mecomb.com> writes:
>> There's never been much enthusiasm among the core developers for large
>> objects at all --- we see them as a poor substitute for allowing large
>> values directly. (The "TOAST" work scheduled for 7.1 will finally
>> resolve that issue, I hope.) So no one's felt like working on improving
>> the large-object implementation.

> On the practical side, say I want to insert/read a large amount of
> information into/from a TOAST field. How should I do it?
> Is there a pipe method where I can continuously print to/read from?

Not at the moment, but that's obviously going to be a necessary feature if we want to make the existing flavor of large objects obsolete. There have been some preliminary discussions about it --- AFAIR no one's laid out a complete proposal yet.

regards, tom lane
At 11:22 PM 29-05-2000 -0400, Tom Lane wrote:
>Lincoln Yeoh <lylyeoh@mecomb.com> writes:
>> Is there a pipe method where I can continuously print to/read from?
>
>Not at the moment, but that's obviously going to be a necessary feature
>if we want to make the existing flavor of large objects obsolete. There
>have been some preliminary discussions about it --- AFAIR no one's laid
>out a complete proposal yet.

Will such "TOAST" values and pg_dump interoperate? That would be nice to have. But if that's not possible, then maybe new dump and load tools will be needed which cover everything. If not, there'll be a new line in the FAQ: "I dumped XXX and reloaded but it's not there."

How do the other databases do BLOBs? Which ones do it well :).

Have a nice day!
Link.
Tom Lane wrote:
> Lincoln Yeoh <lylyeoh@mecomb.com> writes:
> >> There's never been much enthusiasm among the core developers for large
> >> objects at all --- we see them as a poor substitute for allowing large
> >> values directly. (The "TOAST" work scheduled for 7.1 will finally
> >> resolve that issue, I hope.) So no one's felt like working on improving
> >> the large-object implementation.
>
> > On the practical side, say I want to insert/read a large amount of
> > information into/from a TOAST field. How should I do it?
> > Is there a pipe method where I can continuously print to/read from?
>
> Not at the moment, but that's obviously going to be a necessary feature
> if we want to make the existing flavor of large objects obsolete. There
> have been some preliminary discussions about it --- AFAIR no one's laid
> out a complete proposal yet.

Yes, we already saw that problem. And looking at some other databases, they seem to deal with it the same way I'm inclined to: the BLOB/CLOB data types have to be references. A dummy reference is created by a special function that's used in the values for INSERT. If such a dummy ref occurs in a heap tuple to be stored, a real empty reference is created. Wherever we store the content, we need some new interface in libpq to access it like a file, with seek/read/write.

It's somewhat like our current LO, but all LOs for one column reside in one shadow table, and the system definitely knows when an item is obsolete. The type output functions just output the reference, and new libpq functions then gain access to it. It will need some enhancements on the SQL level too, to make pg_dump able to save/restore them. The shadow tables might have a different relkind, so they aren't accessible via normal SQL. But if COPY can, why not?

All that is still a little vague, and there are problems I don't want to talk about right now. I need to get TOAST ready for 7.1, then go back to FOREIGN KEY and implement the file-buffered trigger queue. So I don't expect BLOB/CLOB to be ready before 7.2.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
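Purely to illustrate the shadow-table idea Jan sketches - everything below is hypothetical, since the design was explicitly still vague at this point - a BLOB/CLOB column in a user table would hold only a reference, while the content lives in a per-column shadow table that clients would reach through new file-like libpq calls rather than normal SQL:

    -- Hypothetical user table: the column stores a reference, not the bytes.
    CREATE TABLE mail_messages (
        msg_id     int4,
        attachment oid       -- dummy reference supplied by a special function at INSERT time
    );

    -- Hypothetical shadow table for the "attachment" column. In Jan's description
    -- it would get a different relkind and not be reachable through ordinary SQL;
    -- it is written as plain SQL here only to picture the layout.
    CREATE TABLE mail_messages_attachment_shadow (
        ref     oid,         -- which reference (column value) this chunk belongs to
        chunkno int4,
        data    bytea
    );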