Re: Megabytes of stats saved after every connection - Mailing list pgsql-general

From: Phil Endecott
Subject: Re: Megabytes of stats saved after every connection
Date:
Msg-id: 42EAA229.3070507@chezphil.org
In response to: Re: Megabytes of stats saved after every connection (Greg Stark <gsstark@mit.edu>)
Responses: Re: Megabytes of stats saved after every connection
List: pgsql-general
Greg Stark wrote:
>>The sort of question I do need to answer is this: starting from individual
>>X, find all the ancestors and descendants for n generations. This involves n
>>iterations of a loop, joining the relatives found so far with the next
>>generation. If there are p people in the tree this has something like O(n
>>log p) complexity. On the other hand, if I stored all users' data in the
>>same tables and I had u users, this operation would have O(n log (u*p))
>>complexity. My guess is that it would be about an order of magnitude slower.
>
> You're omitting the time spent finding the actual table for the correct user
> in your current scheme. That's exactly the same as the log(u) factor above.

I hope not - can anyone confirm?

I have the impression that, within a plpgsql function, the table lookup
cost is paid once, and subsequent accesses to the same table are cheap.
In fact this characteristic has caused problems for me in the past; see
http://archives.postgresql.org/pgsql-general/2004-09/msg00316.php
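
For concreteness, here is a minimal sketch of the per-generation loop
described above (the table and column names -- "edges", "found",
"parent_id", "child_id" -- are illustrative assumptions, not the actual
schema).  The point is that each SQL statement inside a plpgsql function
is planned the first time it executes, and that saved plan -- with the
table lookups already resolved -- is reused on later iterations and
later calls in the same session:

```sql
-- Illustrative sketch only: assumes an edge table edges(parent_id,
-- child_id) and a working table found(person_id) already exist.
CREATE FUNCTION ancestors_of(start_id integer, generations integer)
RETURNS void AS $$
DECLARE
    g integer;
BEGIN
    INSERT INTO found VALUES (start_id);
    FOR g IN 1 .. generations LOOP
        -- Planned once, on the function's first execution; the lookup
        -- of "edges" and "found" is baked into the cached plan, so
        -- later iterations skip the catalog search entirely.
        INSERT INTO found
            SELECT DISTINCT e.parent_id
            FROM edges e JOIN found f ON e.child_id = f.person_id
            WHERE NOT EXISTS (SELECT 1 FROM found f2
                              WHERE f2.person_id = e.parent_id);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The flip side of this caching is exactly the problem in the linked
message: the saved plan pins the table, so dropping and recreating it
(e.g. as a temp table) breaks subsequent calls within the session.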

I hope that the same is true of PQexecPrepared - can anyone confirm?

> You might be interested in the ltree contrib module and gist indexes. You
> might be able to do this recursive algorithm in a single indexed non-recursive
> query using them.

I could use something like "CONNECT BY", though last time I investigated
I believe there were some stability concerns with the patch.
Unfortunately genealogies are not trees in anything other than the
informal sense of the word, so I don't think ltree is applicable.
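
To make that concrete (the schema here is an assumption for
illustration): ltree encodes the unique root-to-node path of a tree,
but in a genealogy every child has two parents, so the ancestor
relation is a directed acyclic graph with no unique root or path.

```sql
-- Illustrative only: a child appears under each of its parents, so a
-- person can be reached along many distinct parent paths -- no single
-- ltree label can describe its position.
CREATE TABLE parent_of (
    parent_id integer NOT NULL,
    child_id  integer NOT NULL,
    PRIMARY KEY (parent_id, child_id)
);
```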

>>The individual users' sites are entirely disjoint - there are no queries that
>>overlap them.
>
> If you had a more flexible design you might find that you have a wealth of
> data that you're currently not able to see because your design hides it.

I have a wealth of data that the majority of my users want me to keep
private.  There are other sites that try to match up people's
genealogies, and I'm not competing with them.

Thanks for your suggestions, Greg, but I think I know what I'm doing.
The PostgreSQL core copes well with this setup.  It's just peripheral
things, like autovacuum and this stats-writing issue, where poor big-O
complexity had gone unnoticed.

--Phil.


