Thread: noob inheritance question

noob inheritance question

From
Zintrigue
Date:
Hello,

I am complete noob to Postgres and to this list, and I hope this will be the appropriate list for this question.

I'm hoping the inheritance feature will be a nice alternative method for me to implement categories in particular database of products I need to keep updated. I suppose in MySQL I would probably do this by creating, for example, one table for the products, and then a table(s) for categories, and then I'd be faced with a choice between using an adjacency list or nested set paradigm for, say, breadcrumb links in my private web app.

On the other hand, in Postgres what I'd like to do it just create an empty root "product" table, then create, for example, a "spirts" table that inherits from products, and "rums" table that inherits from spirits, and then "aged rum", "flavored rum", et al, which inherit from rums.

In this scenario, my idea was to have all my fields in "products" and to not add any additional fields in the child tables. Also, only the lowest level of child tables in any given branch of products would actually contain data / rows.

Assuming this is a good design, what I'm wondering is how inheritance is actually implemented deep down inside Postgres, if it's anything at all like JOINS (say, in the case of merely doing:
SELECT * FROM "flavored_rum" (the lowest level in a given branch)
or
SELECT * FROM "spirits" (the root level, or some intermediate level in a given branch)

I'm wondering if there's any performance penalty here, analogous to the penalty of JOINs in a regular RDBMS (versus an ORDBMS).

If anyone can offer in any insight as too how inheritance is actually executed (compared to JOINs especially), I'd be most grateful.

Thank you,
DG

Re: noob inheritance question

From
Richard Broersma
Date:
On Wed, Jan 6, 2010 at 3:53 PM, Zintrigue <zintrigue@gmail.com> wrote:
> I'm wondering if there's any performance penalty here, analogous to the
> penalty of JOINs in a regular RDBMS (versus an ORDBMS).
> If anyone can offer in any insight as too how inheritance is actually
> executed (compared to JOINs especially), I'd be most grateful.

PostgreSQL inheritance is just a sugar coated form of horizontal table
partitioning.  So it suffers from all of the problems associated with
selection on UNION ALL queries.


--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

Re: noob inheritance question

From
Richard Neill
Date:
Zintrigue wrote:

> I'm hoping the inheritance feature will be a nice alternative method for
> me to implement categories in particular database of products I need to
> keep updated. I suppose in MySQL I would probably do this by creating,
> for example, one table for the products, and then a table(s) for
> categories, and then I'd be faced with a choice between using an
> adjacency list or nested set paradigm for, say, breadcrumb links in my
> private web app.
>
> On the other hand, in Postgres what I'd like to do it just create an
> empty root "product" table, then create, for example, a "spirts" table
> that inherits from products, and "rums" table that inherits from
> spirits, and then "aged rum", "flavored rum", et al, which inherit from
> rums.
>
> In this scenario, my idea was to have all my fields in "products" and to
> not add any additional fields in the child tables. Also, only the lowest
> level of child tables in any given branch of products would actually
> contain data / rows.
>
> Assuming this is a good design,

May I venture to stop you there. This sounds like you are doing it
The Hard Way.

In particular, each time you add a new category, you're going to have to
add a new database table, and your schema is going to get to be
horrible. Inserts aren't going to be much fun either.

Rather than adding multiple child tables, may I suggest some other way
of tracking which item is a subset of the other.
  You could do it by having 2 columns:
     id, parent_id      (each integer and indexed)
or you could do it by having 2 columns:
     id, list           (id is integer, list is eg "1,3,5,13")
(where the list is a comma-separated list, or an array, and holds the
full path)


Depending on scale, you may be able to choose a simple algorithm instead
of hunting for the most efficient one.


Best wishes,

Richard


P.S. This is the performance mailing list - you may find one of the
other lists better suited to your questions.

Re: noob inheritance question

From
Nikolas Everett
Date:
Inheritance would only make sense if each of your categories had more columns.  Say if you had a "wines" category and only they had a year column.  Its probably not worth it for one or two columns but if you've got a big crazy heterogeneous tree of stuff then its probably appropriate.

I'm with Richard in that it sounds like the right way to solve your problem is to have a "categories" table and a "products" table.  Let the categories table have a reference to the parent.  I suppose just like they do in the first section of http://dev.mysql.com/tech-resources/articles/hierarchical-data.html .  The other sections on the page just seem like overkill to me.

On Wed, Jan 6, 2010 at 7:13 PM, Richard Neill <rn214@cam.ac.uk> wrote:
Zintrigue wrote:

I'm hoping the inheritance feature will be a nice alternative method for me to implement categories in particular database of products I need to keep updated. I suppose in MySQL I would probably do this by creating, for example, one table for the products, and then a table(s) for categories, and then I'd be faced with a choice between using an adjacency list or nested set paradigm for, say, breadcrumb links in my private web app.

On the other hand, in Postgres what I'd like to do it just create an empty root "product" table, then create, for example, a "spirts" table that inherits from products, and "rums" table that inherits from spirits, and then "aged rum", "flavored rum", et al, which inherit from rums.

In this scenario, my idea was to have all my fields in "products" and to not add any additional fields in the child tables. Also, only the lowest level of child tables in any given branch of products would actually contain data / rows.

Assuming this is a good design,

May I venture to stop you there. This sounds like you are doing it
The Hard Way.

In particular, each time you add a new category, you're going to have to add a new database table, and your schema is going to get to be horrible. Inserts aren't going to be much fun either.

Rather than adding multiple child tables, may I suggest some other way of tracking which item is a subset of the other.
 You could do it by having 2 columns:
   id, parent_id      (each integer and indexed)
or you could do it by having 2 columns:
   id, list           (id is integer, list is eg "1,3,5,13")
(where the list is a comma-separated list, or an array, and holds the full path)


Depending on scale, you may be able to choose a simple algorithm instead of hunting for the most efficient one.


Best wishes,

Richard


P.S. This is the performance mailing list - you may find one of the other lists better suited to your questions.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: noob inheritance question

From
Robert Haas
Date:
On Wed, Jan 6, 2010 at 6:53 PM, Zintrigue <zintrigue@gmail.com> wrote:
> I'm wondering if there's any performance penalty here

There definitely is.  Your design sounds pretty painful to me...
adding a column referencing a side-table will be much nicer.

> If anyone can offer in any insight as too how inheritance is actually
> executed (compared to JOINs especially), I'd be most grateful.

You can look at the query plans for your queries using EXPLAIN.
Inheritance is really just UNION ALL under the covers - it's meant for
partitioning, not the sort of thing you're trying to do here, so your
query plans will probably not be too good with this design.

...Robert