Thread: [GENERAL] recursive query too big to complete. are there any strategies to limit/partition?

There are over 20 million records in a self-referential database table, where one record may point to another record as
adescendant. 

Because of a bug in application code, there was no limit on recursion.  The max was supposed to be 4.  A few outlier
recordshave between 5 and 5000 descendants (there could be more.  I manually found one chain of 5000. 

I need to find all the chains of 5+ and mark them for update/deletion.  While the database is about 10GB, the recursive
searchis maxing out on diskspace and causing a failure (there was over over 100GB of workspace free) 

Is there any way to make a recursive query work, or will I have to use another means and just iterate over the entire
dataset(either in postgres or an external service) 








On Thu, Jan 26, 2017 at 4:37 PM, Jonathan Vanasco <postgres@2xlp.com> wrote:
There are over 20 million records in a self-referential database table, where one record may point to another record as a descendant.

Because of a bug in application code, there was no limit on recursion.  The max was supposed to be 4.  A few outlier records have between 5 and 5000 descendants (there could be more.  I manually found one chain of 5000.

I need to find all the chains of 5+ and mark them for update/deletion.  While the database is about 10GB, the recursive search is maxing out on diskspace and causing a failure (there was over over 100GB of workspace free)

Is there any way to make a recursive query work, or will I have to use another means and just iterate over the entire dataset (either in postgres or an external service)

​Thinking aloud - why doesn't just finding every record with 5 descendants not work?  Any chain longer than 5 would have at least 5 items.

​Even without recursion you could build out a five-way self-join and any records that make it that far are guilty.  I suppose this assumes your setup is non-cyclic.

David J.


On Jan 26, 2017, at 7:07 PM, David G. Johnston wrote:

​Thinking aloud - why doesn't just finding every record with 5 descendants not work?  Any chain longer than 5 would have at least 5 items.

Oh it works. This is why I ask these questions -- new perspectives!

​Even without recursion you could build out a five-way self-join and any records that make it that far are guilty.  I suppose this assumes your setup is non-cyclic.

There could be cyclic records, but that's easy to filter out.  A first approach took 40 seconds to run.  A little tweaking is necessary, but this is a great start.

THANK YOU!  You saved me!