Re: Megabytes of stats saved after every connection - Mailing list pgsql-general

From Greg Stark
Subject Re: Megabytes of stats saved after every connection
Date
Msg-id 87mzo574jr.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Megabytes of stats saved after every connection  (Phil Endecott <spam_from_postgresql_general@chezphil.org>)
Responses Re: Megabytes of stats saved after every connection  (Phil Endecott <spam_from_postgresql_general@chezphil.org>)
List pgsql-general
Phil Endecott <spam_from_postgresql_general@chezphil.org> writes:

> Those aren't questions that I need to answer often.

But the fact that they're utterly infeasible in your current design is a bad
sign. Just because you don't need them now doesn't mean you won't need
*something* that spans users later. Sometimes you have to be pragmatic and
look at what your actual current needs are and make sacrifices but you should
at least be aware that you're giving up a *lot* and in this case I think for
little or no gain.

> The sort of question I do need to answer is this: starting from individual
> X, find all the ancestors and descendants for n generations. This involves n
> iterations of a loop, joining the relatives found so far with the next
> generation. If there are p people in the tree this has something like O(n
> log p) complexity. On the other hand, if I stored all users' data in the
> same tables and I had u users, this operation would have O(n log (u*p))
> complexity. My guess is that it would be about an order of magnitude slower.

You're omitting the time spent finding the actual table for the correct user
in your current scheme. That's exactly the same as the log(u) factor above. Of
course the time spent finding the table is pretty small but it's also small in
the normalized schema where it represents probably a single extra btree level.

You might be interested in the ltree contrib module and gist indexes. You
might be able to do this recursive algorithm in a single indexed non-recursive
query using them.

> The individual users' sites are entirely disjoint - there are no queries that
> overlap them.

If you had a more flexible design you might find that you have a wealth of
data that you're currently not able to see because your design hides it.

--
greg

pgsql-general by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: [pgsql-advocacy] MySQL to PostgreSQL, was ENUM type
Next
From: "Rod MacNeil"
Date:
Subject: Looking for version 7.4.7 for windows