Thread: column size too large, is this a bug?
I have a query which gets data from a single table. When I try to get data for an RFQ which has around 5000 rows, it breaks off at the 18th row. If I reduce some columns, then it returns all the rows, and not so slowly. I have tried different sets of columns, and there is no pattern based on the columns themselves. But one thing is sure: once the size of a row grows beyond some number of bytes, the records do not get returned. Now, the following query returns all 5001 rows to me pretty fast:

select
    _level_ as l,
    nextval('seq_pk_bom_detail') as bom_detail,
    prior nextval('seq_pk_bom_detail') as parent_subassembly,
    parent_part_number,
    customer_part_number,
    /* mfr_name,
       mfr_part,
       description,*/
    commodity,
    needs_date,
    target_price,
    comments,
    case qty_per
        when null then 0.00001
        when 0 then 0.00001
        else qty_per
    end,
    qty_multiplier1,
    qty_multiplier2,
    qty_multiplier3,
    qty_multiplier4,
    qty_multiplier5
from bom_detail_work_clean
where (0=0)
    and bom_header=20252
    and file_number = 1
start with customer_part_number = 'Top Assembly 1'
connect by parent_part_number = prior customer_part_number;

But if I uncomment description, then it returns me only 18 rows:

select
    _level_ as l,
    nextval('seq_pk_bom_detail') as bom_detail,
    prior nextval('seq_pk_bom_detail') as parent_subassembly,
    parent_part_number,
    customer_part_number,
    /* mfr_name,
       mfr_part,*/
    description,
    commodity,
    needs_date,
    target_price,
    comments,
    case qty_per
        when null then 0.00001
        when 0 then 0.00001
        else qty_per
    end,
    qty_multiplier1,
    qty_multiplier2,
    qty_multiplier3,
    qty_multiplier4,
    qty_multiplier5
from bom_detail_work_clean
where (0=0)
    and bom_header=20252
    and file_number = 1
start with customer_part_number = 'Top Assembly 1'
connect by parent_part_number = prior customer_part_number;

Now these 18 rows are the level-2 records in the hierarchical query. I have a feeling the server has some memory paging mechanism, and if it cannot handle more than a certain number of bytes, it just returns what it has. During your investigation of PostgreSQL optimization, did you come across any setting that might help us? Thanks! Qing

PS: I just reloaded the file while reducing the content in the description column. The file got uploaded, so it looks like the problem is the size of the record being inserted.
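[Editorial aside: one incidental bug in the query itself, separate from the truncation problem. The simple-CASE form `case qty_per when null then ...` compares with `=`, and `qty_per = NULL` is never true, so the NULL branch can never fire; a searched CASE with `IS NULL` (or COALESCE) is the usual fix. A minimal sketch of the difference, using Python's sqlite3 in place of PostgreSQL, with an invented table and values:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (qty_per real)")
con.executemany("insert into t values (?)", [(None,), (0,), (2.5,)])

# Simple CASE compares with =, so "when null" never matches: NULL stays NULL.
simple = [r[0] for r in con.execute(
    "select case qty_per when null then 0.00001 when 0 then 0.00001 "
    "else qty_per end from t")]

# Searched CASE with IS NULL handles the NULL row as intended.
searched = [r[0] for r in con.execute(
    "select case when qty_per is null then 0.00001 when qty_per = 0 "
    "then 0.00001 else qty_per end from t")]

print(simple)    # [None, 1e-05, 2.5]
print(searched)  # [1e-05, 1e-05, 2.5]
```

The same substitution (`case when qty_per is null then 0.00001 ...`) drops straight into the queries above.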
Qing Zhao <qzhao@quotefx.net> writes: > I have a query which get's data from a single table. > When I try to get data from for an RFQ which has around 5000 rows, it > is breaking off at 18th row. > If i reduce some columns , then it returns all the rows and not so slow. What client-side software are you using? This is surely a limitation on the client side, because there is no such problem in the server. regards, tom lane
Qing Zhao <qzhao@quotefx.net> writes: > It's not through one client. I am using JDBC. But the same things > happen when I use client like psql. That's really hard to believe. Can you provide a reproducible test case? regards, tom lane
Tom, Thanks for your help! It's not just through one client. I am using JDBC, but the same things happen when I use a client like psql. Qing On Mar 25, 2004, at 10:20 AM, Tom Lane wrote: > Qing Zhao <qzhao@quotefx.net> writes: >> I have a query which get's data from a single table. >> When I try to get data from for an RFQ which has around 5000 rows, it >> is breaking off at 18th row. >> If i reduce some columns , then it returns all the rows and not so >> slow. > > What client-side software are you using? This is surely a limitation > on > the client side, because there is no such problem in the server. > > regards, tom lane >
On Thu, 25 Mar 2004, Qing Zhao wrote: > select > _level_ as l, > nextval('seq_pk_bom_detail') as bom_detail, > prior nextval('seq_pk_bom_detail') as parent_subassembly, > parent_part_number, > customer_part_number, > /* mfr_name, > mfr_part, > description,*/ > commodity, > needs_date, > target_price, > comments, > case qty_per > when null then 0.00001 > when 0 then 0.00001 > else qty_per > end, > qty_multiplier1, > qty_multiplier2, > qty_multiplier3, > qty_multiplier4, > qty_multiplier5 > from bom_detail_work_clean > where (0=0) > and bom_header=20252 > and file_number = 1 > start with customer_part_number = 'Top Assembly 1' > connect by parent_part_number = prior customer_part_number; What version are you running, and did you apply any patches (for example one to support the start with/connect by syntax used above?)
It is 7.3.4 on Mac OS X (Darwin). The patch we applied is hier-Pg7.3-0.5, which allows one to perform hierarchical queries on PostgreSQL using Oracle's syntax. Thanks! Qing

On Mar 25, 2004, at 2:57 PM, Stephan Szabo wrote: > On Thu, 25 Mar 2004, Qing Zhao wrote: > >> select >> _level_ as l, >> nextval('seq_pk_bom_detail') as bom_detail, >> prior nextval('seq_pk_bom_detail') as parent_subassembly, >> parent_part_number, >> customer_part_number, >> /* mfr_name, >> mfr_part, >> description,*/ >> commodity, >> needs_date, >> target_price, >> comments, >> case qty_per >> when null then 0.00001 >> when 0 then 0.00001 >> else qty_per >> end, >> qty_multiplier1, >> qty_multiplier2, >> qty_multiplier3, >> qty_multiplier4, >> qty_multiplier5 >> from bom_detail_work_clean >> where (0=0) >> and bom_header=20252 >> and file_number = 1 >> start with customer_part_number = 'Top Assembly 1' >> connect by parent_part_number = prior customer_part_number; > > What version are you running, and did you apply any patches (for > example > one to support the start with/connect by syntax used above?)
Stephan Szabo <sszabo@megazone.bigpanda.com> writes: > On Thu, 25 Mar 2004, Qing Zhao wrote: >> start with customer_part_number = 'Top Assembly 1' >> connect by parent_part_number = prior customer_part_number; > What version are you running, and did you apply any patches (for example > one to support the start with/connect by syntax used above?) Oh, good eye ... it's that infamous CONNECT BY patch again, without doubt. I think we should add "Have you applied any patches to your copy of Postgres?" to the standard bug report form ... regards, tom lane
Tom, > Oh, good eye ... it's that infamous CONNECT BY patch again, without doubt. Hey, who does this patch? What's wrong with it? -- -Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: >> Oh, good eye ... it's that infamous CONNECT BY patch again, without doubt. > Hey, who does this patch? What's wrong wiith it? I'm just venting my annoyance at people expecting us to support hacked-up versions, especially without telling us they're hacked-up. This is the third or fourth trouble report I can recall that was eventually traced to that patch (after considerable effort). Anyway, my guess for the immediate problem is incorrect installation of the patch, viz not doing a complete "make clean" and rebuild after patching. The patch changes the Query struct which is referenced in many more files than are actually modified by the patch, and so if you didn't build with --enable-depend then a simple "make" will leave you with a patchwork of files that have different ideas about the field offsets in Query. I'm a bit surprised it doesn't just dump core... (That's not directly the fault of the patch, though, except to the extent that it can be blamed for coming without adequate installation instructions. What is directly the fault of the patch is that it doesn't force an initdb by changing catversion. The prior trouble reports had to do with views not working because their stored rules were incompatible with the patched backend. We should not have had to deal with that, and neither should those users.) Theory B, of course, is that this is an actual bug in the patch and not just incorrect installation. I'm not interested enough to investigate though. regards, tom lane
> Theory B, of course, is that this is an actual bug in the patch and not > just incorrect installation. I'm not interested enough to investigate > though. Is there still someone around who's working on getting a similar patch into 7.5? Seems there's huge user demand for such a thing... (And no, I'm not volunteering, it's well beyond my abilities...) Chris
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > Is there still someone around who's working on getting a similar patch > into 7.5? Seems there huge user demand for such a thing... Andrew Overholt did some preliminary work toward implementing the SQL99-spec WITH functionality (which subsumes what CONNECT BY does, and a few other things too). But he's left Red Hat and gone back to school. One of the many things on my todo list is to pick up that patch and get it finished. IIRC Andrew had finished the parser work and we had a paper design for the executor support. regards, tom lane
> Andrew Overholt did some preliminary work toward implementing the > SQL99-spec WITH functionality (which subsumes what CONNECT BY does, > and a few other things too). But he's left Red Hat and gone back > to school. One of the many things on my todo list is to pick up that > patch and get it finished. Out of interest, what is your 7.5 todo list? Chris
Thanks a lot! We were migrating to Postgres from Oracle, and every now and then we run into something that we do not understand completely; it is a learning process for us. Your responses have made things much clearer for us. BTW, do you think it's better for us just to rewrite everything so we don't need to use the patch at all? Why do others still use it? Thanks! Qing On Mar 25, 2004, at 6:04 PM, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: >>> Oh, good eye ... it's that infamous CONNECT BY patch again, without >>> doubt. > >> Hey, who does this patch? What's wrong wiith it? > > I'm just venting my annoyance at people expecting us to support > hacked-up versions, especially without telling us they're hacked-up. > This is the third or fourth trouble report I can recall that was > eventually traced to that patch (after considerable effort). > > Anyway, my guess for the immediate problem is incorrect installation of > the patch, viz not doing a complete "make clean" and rebuild after > patching. The patch changes the Query struct which is referenced in > many more files than are actually modified by the patch, and so if you > didn't build with --enable-depend then a simple "make" will leave you > with a patchwork of files that have different ideas about the field > offsets in Query. I'm a bit surprised it doesn't just dump core... > > (That's not directly the fault of the patch, though, except to the > extent that it can be blamed for coming without adequate installation > instructions. What is directly the fault of the patch is that it > doesn't force an initdb by changing catversion. The prior trouble > reports had to do with views not working because their stored rules > were > incompatible with the patched backend. We should not have had to deal > with that, and neither should those users.) > > Theory B, of course, is that this is an actual bug in the patch and not > just incorrect installation. I'm not interested enough to investigate > though.
> > regards, tom lane >
Qing, > Your responses have made it much clear for us. BTW, do you > think that it's better for us just to rewrite everything so we don't > need to use the patch at all? Why do others still use it? Others use it for the same reason you do. If you want to use the patch for seamless porting, I suggest that you contact Evgen directly. He's not very active on the main project mailing lists, so you'll need to e-mail him personally. You may also need to sponsor him for bug fixes, since he is apparently an independent developer. I don't really know him. As an alternative, you may want to take a look at the connectby() function in /contrib/tablefunc in the PostgreSQL source. As this was developed by Joe Conway, who is a very active major contributor in the community, it is more likely to be bug-free. However, it will force you to change your query syntax somewhat. Of course, there are other query tree structures you could use if you're willing to modify your database design. But you may not want to go that far. -- -Josh Berkus Aglio Database Solutions San Francisco
I used to use the connect-by patch, but have since rewritten everything to use a nested set model. I was having problems that, while not immediately traceable back to the patch, showed up when I started using it and went away when I stopped (strange locking behavior, crashing with vacuum full, problems after dropping columns) . Plus the annoyance of maintaining a non-stock build across numerous installations exceeded its benefits. Relying on it for a business critical situation became too much of a risk. On Mar 26, 2004, at 12:29 PM, Qing Zhao wrote: > Thanks a lot! We were migrating to Postgres from Oracle and > every now and then, we ran into something that we do not > understand completely and it is a learning process for us. > > Your responses have made it much clear for us. BTW, do you > think that it's better for us just to rewrite everything so we don't > need to use the patch at all? Why do others still use it? > > Thanks! > > Qing > On Mar 25, 2004, at 6:04 PM, Tom Lane wrote: > >> Josh Berkus <josh@agliodbs.com> writes: >>>> Oh, good eye ... it's that infamous CONNECT BY patch again, without >>>> doubt. >> >>> Hey, who does this patch? What's wrong wiith it? >> >> I'm just venting my annoyance at people expecting us to support >> hacked-up versions, especially without telling us they're hacked-up. >> This is the third or fourth trouble report I can recall that was >> eventually traced to that patch (after considerable effort). >> >> Anyway, my guess for the immediate problem is incorrect installation >> of >> the patch, viz not doing a complete "make clean" and rebuild after >> patching. The patch changes the Query struct which is referenced in >> many more files than are actually modified by the patch, and so if you >> didn't build with --enable-depend then a simple "make" will leave you >> with a patchwork of files that have different ideas about the field >> offsets in Query. I'm a bit surprised it doesn't just dump core... 
>> >> (That's not directly the fault of the patch, though, except to the >> extent that it can be blamed for coming without adequate installation >> instructions. What is directly the fault of the patch is that it >> doesn't force an initdb by changing catversion. The prior trouble >> reports had to do with views not working because their stored rules >> were >> incompatible with the patched backend. We should not have had to deal >> with that, and neither should those users.) >> >> Theory B, of course, is that this is an actual bug in the patch and >> not >> just incorrect installation. I'm not interested enough to investigate >> though. >> >> regards, tom lane >> > -------------------- Andrew Rawnsley President The Ravensfield Digital Resource Group, Ltd. (740) 587-0114 www.ravensfield.com
Andrew, > I used to use the connect-by patch, but have since rewritten everything > to use a nested set model. Cool! You're probably the only person I know other than me using nested sets in a production environment. -- Josh Berkus Aglio Database Solutions San Francisco
Well, I don't know if I would use it in an insert-heavy environment (at least the way I implemented it), but for select-heavy stuff I don't know why you would want to use anything else. Hard to beat the performance of a simple BETWEEN. On Mar 28, 2004, at 2:25 PM, Josh Berkus wrote: > Andrew, > >> I used to use the connect-by patch, but have since rewritten >> everything >> to use a nested set model. > > Cool! You're probably the only person I know other than me using > nested sets > in a production environment. > > -- > Josh Berkus > Aglio Database Solutions > San Francisco > -------------------- Andrew Rawnsley President The Ravensfield Digital Resource Group, Ltd. (740) 587-0114 www.ravensfield.com
>Andrew, > > I used to use the connect-by patch, but have since rewritten everything > > to use a nested set model. >Cool! You're probably the only person I know other than me using nested >sets >in a production environment. Can you explain to me what a nested set is?
It's a way of representing a tree with right-left pointers in each record (basically re-inventing a hierarchical database in a relational model...). A good description is in Joe Celko's SQL For Smarties book. Selection is very fast because any node's children have node IDs between the right and left values of said node, so there's no mucking about with connect by and what not. There's a synopsis linked at the PG Cookbook pages (http://www.brasileiro.net/postgres/cookbook), but the cookbook seems to be off-line (I think I'll offer to mirror it - this happens frequently). There's another description at http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=65750. Insertion takes a fair amount of work, as you generally have to re-arrange the node IDs when you add a record. On Mar 29, 2004, at 12:05 PM, Jaime Casanova wrote: > >> Andrew, > >> > I used to use the connect-by patch, but have since rewritten >> everything >> > to use a nested set model. > >> Cool! You're probably the only person I know other than me using >> nested sets >> in a production environment. > > can you explain me what is a nested set? > -------------------- Andrew Rawnsley President The Ravensfield Digital Resource Group, Ltd. (740) 587-0114 www.ravensfield.com
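[Editorial aside: a concrete sketch of the model Andrew describes, with invented table and node names and Python's sqlite3 standing in for PostgreSQL. Each row carries lft/rgt bounds, all descendants of a node fall in one range check, and an insert shifts every bound to the right of the insertion point by 2:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table tree (name text, lft integer, rgt integer)")
# A tiny hierarchy: root (1,8) contains a (2,5) with child a1 (3,4), and b (6,7).
con.executemany("insert into tree values (?, ?, ?)", [
    ("root", 1, 8), ("a", 2, 5), ("a1", 3, 4), ("b", 6, 7),
])

def descendants(name):
    # All descendants of a node fall strictly between its lft and rgt.
    lft, rgt = con.execute(
        "select lft, rgt from tree where name = ?", (name,)).fetchone()
    return [r[0] for r in con.execute(
        "select name from tree where lft > ? and rgt < ? order by lft",
        (lft, rgt))]

def insert_child(parent, name):
    # Make room: bounds at or past the parent's rgt shift by 2, then the
    # new leaf occupies the freed slot just inside the parent.
    (rgt,) = con.execute(
        "select rgt from tree where name = ?", (parent,)).fetchone()
    con.execute("update tree set rgt = rgt + 2 where rgt >= ?", (rgt,))
    con.execute("update tree set lft = lft + 2 where lft > ?", (rgt,))
    con.execute("insert into tree values (?, ?, ?)", (name, rgt, rgt + 1))

print(descendants("root"))   # ['a', 'a1', 'b']
insert_child("b", "b1")
print(descendants("b"))      # ['b1']
```

The descendant query is the "simple BETWEEN" mentioned earlier in the thread; the two UPDATEs in insert_child are the re-arranging of node IDs that makes insertion the expensive operation.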
thanx a lot
Andrew, > Its a way of representing a tree with right-left pointers in each > record (basically re-inventing a hierarchical database > in a relational model...). A good description is in Joe Celko's SQL > For Smarties book. Selection is very fast because > any node's children have node ID's between the right and left nodes of > said node, so there's no mucking about > with connect by and what not. There's a synopsis linked at the PG > Cookbook pages (http://www.brasileiro.net/postgres/cookbook), > but the cookbook seems to off-line (I think I'll offer to mirror it - > this happens frequently). There's another description at > http://www.intelligententerprise.com/001020/celko.jhtml? > _requestid=65750. I have a full implementation of this. I was going to do it as a magazine article, so I've been holding it off line. However, publication seems to be indefinitely delayed, so I'll probably post it on TechDocs as soon as I have time. -- -Josh Berkus Aglio Database Solutions San Francisco
Robert, > If you search the pgsql-sql archives you'll find some helpful threads on > using nested sets in PostgreSQL, one in particular I was involved with was > a generic "move_tree" function that enabled moving a node from one branch > to another. I have to admit to failing to follow -SQL over the last few months. This list and Hackers are pretty much the only ones I read all of. Maybe I should get back on -SQL and we can compare move_tree functions :-) Did yours use a temp table, or some other means? -- Josh Berkus Aglio Database Solutions San Francisco
On Sunday 28 March 2004 14:25, Josh Berkus wrote: > Andrew, > > > I used to use the connect-by patch, but have since rewritten everything > > to use a nested set model. > > Cool! You're probably the only person I know other than me using nested > sets in a production environment. You cut me deep there Josh, real deep. :-) If you search the pgsql-sql archives you'll find some helpful threads on using nested sets in PostgreSQL, one in particular I was involved with was a generic "move_tree" function that enabled moving a node from one branch to another. Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
On Tuesday 30 March 2004 11:38, Josh Berkus wrote: > Robert, > > > If you search the pgsql-sql archives you'll find some helpful threads on > > using nested sets in PostgreSQL, one in particular I was involved with > > was a generic "move_tree" function that enabled moving a node from one > > branch to another. > > I have to admit to failing to follow -SQL over the last few months. This > list and Hackers are pretty much the only ones I read all of. > > Maybe I should get back on -SQL and we can compare move_tree functions :-) > > Did yours use a temp table, or some other means? Nope, Greg Mullane and I worked out the math and came up with an algorithm of sorts that we could apply to the tree when moving elements. <digs a little> http://archives.postgresql.org/pgsql-sql/2002-11/msg00355.php Seemed to work though someone else had posted yet another version after ours... and in fact the one posted is not exactly what I use now either :-) Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Robert, > http://archives.postgresql.org/pgsql-sql/2002-11/msg00355.php > > Seemed to work though someone else had posted yet another version after > ours... and in fact the one posted is not exactly what I use now either :-) Hmmm ... I'd want to do a *lot* of testing before I trusted that approach. Seems like it could be very vulnerable to order-of-execution issues. I'll start a GUIDE on it, people can post their various Nested Sets solutions. -- -Josh Berkus Aglio Database Solutions San Francisco