Re: which is more scalable for the database? - Mailing list pgsql-general
From | Timasmith |
---|---|
Subject | Re: which is more scalable for the database? |
Date | |
Msg-id | 1173757150.216710.220780@30g2000cwc.googlegroups.com Whole thread Raw |
In response to | Re: which is more scalable for the database? (Shane Ambler <pgsql@Sheeky.Biz>) |
List | pgsql-general |
On Mar 8, 2:26 pm, p...@Sheeky.Biz (Shane Ambler) wrote: > Timasmithwrote: > > Suppose I have a database table with 20 fields which are lookups to a > > single table. > > > configtable(configtable_id, a_field, something_lookup_id, > > another_lookup_id, ...) > > lookup(lookup_id, value, description, ...) > > > what is going to be faster to map the rows to an Object which needs > > the 'value' for every field ending in lookup_id > > How long is ball of string 'a' compared to ball of string 'b'? > > That will depend on a lot on how large the lookup table will be (100 > rows or 10 million rows?) 100-500,000 rows and on how long your descriptions are. 10-60, perhaps averaging around 20 characters Will > you want all 20 descriptions for each query you run? Yes, it is the price I pay with Hibernate for vastly simplifying data access If your > descriptions are 200 characters each then maybe you can reduce some data > transfer by caching these at the client instead of transferring them > each time you retrieve a result. But then how will you keep your local > cache up to date? The data I would cache would change infrequently in the production environment and there would be no expectation it would be 'immediate' on change. I would provide services to flush on demand, or on save (of reference data), and/or through a scheduler etc. > > I would suggest the view for a couple of reasons - first it will > simplify your queries as you won't need all the join details in every > query you run. Second the select for the query can be parsed and stored > whereas separate selects will be parsed each time. If I was caching on the client, I would select all data only once and there would be no queries, it would only be select * from sometable and the client (actually server side bean), would populate the object with the missing lookups. > > A lot of this will depend on what you are doing and what you are using > to do it. If you are using php then you have less options for caching > than say a client program written in c. php may run pretty fast but it > won't outrun a compiled c program. Java hashtable. > > > > > > > a) View > > > select c.configtable_id, l1.value as something_lookup, l2.value as > > another_lookup > > from configtable c, > > lookup l1, > > lookup l2 > > where c.something_lookup_id = l1.lookup_id > > and c.another_lookup_id = l2.lookup_id > > > foreach row > > map values to object > > end > > > b) Cache all lookup values and populate > > > select c.* from configtable > > > foreach row > > map values to object > > if lookup_id > > find value from hashtable and map value to object > > endif > > end > > > It seems that the latter *might* be better to scale outward better, > > as > > you could add application servers to do the caching/mapping and you > > only select from a single table? > > Maybe but then postgresql may do the lookups quicker than what you have > available at the client end. Right, but could it keep up, how much extra effort is it to do the indexed lookups (almost all in memory), for the client. > > The only way you will really know is to load up some sample data and > test each method yourself. Its very hard to simulate though due to production hardware having 16 cpus, Gigs of memory, huge databases, hundereds of concurrent users - I just dont have access to that kind of environment. > > How far are you expecting to scale? The simplest method may just work > fast enough that all the extra caching and setup/maintenance of this > will outweigh any benefit. I would like to scale to 10,000s of power users on a complex OLTP system. Adding servers to expand out is generally feasible, but one can only scale upward so far before reaching a hardware peak. > > -- > > Shane Ambler > p...@Sheeky.Biz > > Get Sheeky @http://Sheeky.Biz > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match- Hide quoted text - > > - Show quoted text -
pgsql-general by date: