Thread: Feature Proposal: Constant Values in Columns or Foreign Keys

Feature Proposal: Constant Values in Columns or Foreign Keys

From

Nils Gösche

Date:

17 April 2012, 13:58:23

Hi!

I have a little feature proposal. Let me try to explain the motivation
behind it.

Suppose our application has two types of objects, looking somewhat like
this:

abstract class Base
{
    public int Id;
    public int SomeData;
}

class Derived1 : Base
{
    public int Data1;
}

class Derived2 : Base
{
    public string Data2;
}

There are many ways of modeling this in a relational database. I am
interested in this one:

CREATE TYPE derived_type AS ENUM ('derived1', 'derived2);

CREATE TABLE base (
    id int PRIMARY KEY,
    some_data int NOT NULL,
    type derived_type NOT NULL
);

CREATE UNIQUE INDEX base_derived_type_index ON base (id, derived_type);

CREATE TABLE derived1 (
   id int PRIMARY KEY,
   data1 int NOT NULL,
   type derived_type NOT NULL CHECK (type = 'derived1'),
   FOREIGN KEY (id, type) REFERENCES base (id, type) ON DELETE CASCADE
)

CREATE TABLE derived2 (
    id int PRIMARY KEY,
    data2 text NOT NULL,
    type derived_type NOT NULL CHECK (type = 'derived2'),
    FOREIGN KEY (id, type) REFERENCES base (id, type) ON DELETE CASCADE
)

Note that the type column in derived1 and derived2 ensures that there is at
most one row in either derived1 or derived2 which refers to a given row in
"base".

This works fine, actually. What bugs me, though, is the redundant data in
the type columns of derived1 and derived2. It would be nice if I could
either declare the columns as constant (so the data wouldn't be stored on
disk anymore), or (even better) use a constant value in the foreign keys, as
in

    FOREIGN KEY (id, 'derived2') REFERENCES base (id, type) ON DELETE
CASCADE

In the latter case, I could omit the type column of derived1 and derived2
altogether.

I suspect that it wouldn't be terribly hard to implement this. What do you
think? Wouldn't this be nice to have?

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Bartosz Dmytrak

Date:

17 April 2012, 14:51:36

Hi,

how about inheritance in postgres?

CREATE TABLE "tblBase"

(

id serial NOT NULL, -- serial type is my assumption.

"SomeData" integer,

CONSTRAINT "tblBase_pkey" PRIMARY KEY (id )

)

WITH (

OIDS=FALSE

);

CREATE TABLE "tblDerived1"

(

-- Inherited from table "tblBase": id integer NOT NULL DEFAULT nextval('"tblBase_id_seq"'::regclass),

-- Inherited from table "tblBase": "SomeData" integer,

"Data1" integer,

CONSTRAINT "tblDerived1_pkey" PRIMARY KEY (id )

)

INHERITS ("tblBase")

WITH (

OIDS=FALSE

);

CREATE TABLE "tblDerived2"

(

-- Inherited from table "tblBase": id integer NOT NULL DEFAULT nextval('"tblBase_id_seq"'::regclass),

-- Inherited from table "tblBase": "SomeData" integer,

"Data2" text,

CONSTRAINT "tblDerived2_pkey" PRIMARY KEY (id )

)

INHERITS ("tblBase")

WITH (

OIDS=FALSE

);

inheritance is described in doc here: http://www.postgresql.org/docs/9.1/static/ddl-inherit.html

With this approach all IDs will use the same sequence so there will not be duplicated PKs in inherited tables.

This could be also modeled with "standard" SQL approach without redundant information. Solution depends on requirements.

Regards,
Bartek

2012/4/17 Nils Gösche <cartan@cartan.de>

Hi!

I have a little feature proposal. Let me try to explain the motivation
behind it.

Suppose our application has two types of objects, looking somewhat like
this:

abstract class Base
{
public int Id;
public int SomeData;
}

class Derived1 : Base
{
public int Data1;
}

class Derived2 : Base
{
public string Data2;
}

There are many ways of modeling this in a relational database. I am
interested in this one:

CREATE TYPE derived_type AS ENUM ('derived1', 'derived2);

CREATE TABLE base (
id int PRIMARY KEY,
some_data int NOT NULL,
type derived_type NOT NULL
);

CREATE UNIQUE INDEX base_derived_type_index ON base (id, derived_type);

CREATE TABLE derived1 (
id int PRIMARY KEY,
data1 int NOT NULL,
type derived_type NOT NULL CHECK (type = 'derived1'),
FOREIGN KEY (id, type) REFERENCES base (id, type) ON DELETE CASCADE
)

CREATE TABLE derived2 (
id int PRIMARY KEY,
data2 text NOT NULL,
type derived_type NOT NULL CHECK (type = 'derived2'),
FOREIGN KEY (id, type) REFERENCES base (id, type) ON DELETE CASCADE
)

Note that the type column in derived1 and derived2 ensures that there is at
most one row in either derived1 or derived2 which refers to a given row in
"base".

This works fine, actually. What bugs me, though, is the redundant data in
the type columns of derived1 and derived2. It would be nice if I could
either declare the columns as constant (so the data wouldn't be stored on
disk anymore), or (even better) use a constant value in the foreign keys, as
in

FOREIGN KEY (id, 'derived2') REFERENCES base (id, type) ON DELETE
CASCADE

In the latter case, I could omit the type column of derived1 and derived2
altogether.

I suspect that it wouldn't be terribly hard to implement this. What do you
think? Wouldn't this be nice to have?

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Nils Gösche

Date:

17 April 2012, 19:06:30

Bartosz Dmytrak wrote:

> how about inheritance in postgres?

I know about Postgres' inheritance feature, but would prefer a more standard relational solution.

> With this approach all IDs will use the same sequence so there will not be duplicated PKs in inherited tables.

In my case, the primary keys are usually uuids, actually. It was just an example, anyway.

> This could be also modeled with "standard" SQL approach without redundant information. Solution depends on
requirements.

Well, it could be modeled in the way I described, if only I could use constant values in foreign keys:

  FOREIGN KEY (id, 'derived2') REFERENCES base (id, type) ON DELETE CASCADE

I was quite surprised to find that this wasn't possible. Is there any good reason why not?

The reason I like this particular way of modeling the data is that I have a guarantee that there won't be an entry in
bothderived tables at the same time for the same row in the base table; also, I can have further constraints and
foreignkeys from and to the base table. 

Of course, I could also omit the type field and simply live with the possibility of having two rows in the derived
tablesreferring to the same row of the base table. But it would be nice if I could rule that out with simple
constraints.

Regards,
--
Nils Gösche
Don't ask for whom the <Ctrl-G> tolls.

Attachment

smime.p7s

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Tom Lane

Date:

17 April 2012, 20:02:18

=?UTF-8?Q?Nils_G=C3=B6sche?= <cartan@cartan.de> writes:
> Bartosz Dmytrak wrote:
>> how about inheritance in postgres?

> I know about Postgres' inheritance feature, but would prefer a more standard relational solution.

[ blink... ] That seems like a pretty silly argument for proposing
something that is *more* nonstandard.

> I was quite surprised to find that this wasn't possible. Is there any good reason why not?

It's contrary to SQL standard is why not.  And it's not just a matter of
being outside the spec, as inheritance is; this is messing with the
details of something that is defined in the standard.  As an example,
I would wonder how such an FK is supposed to be represented in the
spec-defined information schema views.  Other interesting questions
include what would happen to the supposed constant during actions such
as ON DELETE SET NULL or ON UPDATE CASCADE, which normally would result
in a change in the referencing row.

            regards, tom lane

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Nils Gösche

Date:

18 April 2012, 04:18:37

Tom Lane wrote:

> =?UTF-8?Q?Nils_G=C3=B6sche?= <cartan@cartan.de> writes:

> > I was quite surprised to find that this wasn't possible. Is there any
> > good reason why not?
>
> It's contrary to SQL standard is why not.  And it's not just a matter
> of
> being outside the spec, as inheritance is; this is messing with the
> details of something that is defined in the standard.  As an example,
> I would wonder how such an FK is supposed to be represented in the
> spec-defined information schema views.  Other interesting questions
> include what would happen to the supposed constant during actions such
> as ON DELETE SET NULL or ON UPDATE CASCADE, which normally would result
> in a change in the referencing row.

Well, something I can have already is a column together with a NOT NULL constraint, and a CHECK constraint that ensures
thatthe value is really constant, and a foreign key that includes this constant column. You could just handle it the
sameway. 

The information schema is a good point, though. If the only way to put this into the information schema is to introduce
somedummy column, possibly with NULL as a name or something, then I don't like the idea much myself anymore. 

I guess the best I could hope for, then, is an optimization that does not store constant columns on disk ;-)

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Vincent Veyron

Date:

18 April 2012, 11:27:43

Le mercredi 18 avril 2012 à 00:06 +0200, Nils Gösche a écrit :
> Bartosz Dmytrak wrote:
>
> >
> The reason I like this particular way of modeling the data is that I have a guarantee that there won't be an entry in
bothderived tables at the same time for the same row in the base table; also, I can have further constraints and
foreignkeys from and to the base table. 
>

use a trigger on each of the derived tables, that cancels any insert if
the same id already exists in the other table?

> Of course, I could also omit the type field and simply live with the possibility of having two rows in the derived
tablesreferring to the same row of the base table. But it would be nice if I could rule that out with simple
constraints.

You don't say how your data gets inserted, but considering how
complicated your preferred option looks, I have to ask why you can't use
something as simple as :

CREATE TABLE base (
    id int PRIMARY KEY,
    some_data int NOT NULL,
    type integer NOT NULL DEFAULT 1
);

-- type 1 = derived1, type 2 = derived2

CREATE TABLE derived1 (
   id int PRIMARY KEY,
   data1 int NOT NULL,
   FOREIGN KEY (id) REFERENCES base (id) ON DELETE CASCADE
);

CREATE TABLE derived2 (
    id int PRIMARY KEY,
    data2 text NOT NULL,
   FOREIGN KEY (id) REFERENCES base (id) ON DELETE CASCADE
);


You'll have to build the queries according to the value of type, but
this should give you the features you mention?

--
Vincent Veyron
http://marica.fr/
Logiciel de gestion des sinistres assurances et des dossiers contentieux pour le service juridique

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Nils Gösche

Date:

18 April 2012, 11:35:54

Vincent Veyron wrote:

> use a trigger on each of the derived tables, that cancels any insert if
> the same id already exists in the other table?

Yes, that would work.

> You don't say how your data gets inserted, but considering how
> complicated your preferred option looks, I have to ask why you can't
> use something as simple as :

[...]

> You'll have to build the queries according to the value of type, but
> this should give you the features you mention?

Sure, this is what I meant when I said that you could simply omit the type
field. However, if you still want the guarantee about not having two derived
rows for the same base row, you'll have to use a trigger function. The whole
point of the exercise is to avoid that trigger function and replace it with
a simple foreign key :-).

It's ok, maybe it's not worth the trouble.

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Bartosz Dmytrak

Date:

18 April 2012, 14:51:30

Hi,

according to DB theory:

1NF: Table faithfully represents a relation and has no repeating groups

2NF: No non-prime attribute in the table is functionally dependent on a proper subset of anycandidate key.

source: http://en.wikipedia.org/wiki/Database_normalization#Normal_forms

so these constants are not in line with this approach.

You can implement one to one relation:

CREATE TABLE "tblBase"(

id text NOT NULL,

"SomeData" integer,

CONSTRAINT "tblBase_pkey" PRIMARY KEY (id )

);

CREATE TABLE "tblDerived1"(

id text NOT NULL,

"Data1" integer,

CONSTRAINT "tblDerived1_pkey" PRIMARY KEY (id ),

CONSTRAINT "tblDerived1_id_fkey" FOREIGN KEY (id)

REFERENCES "tblBase" (id) MATCH FULL

ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

);

CREATE TABLE "tblDerived2"(

id text NOT NULL,

"Data1" text,

CONSTRAINT "tblDerived2_pkey" PRIMARY KEY (id ),

CONSTRAINT "tblDerived2_id_fkey" FOREIGN KEY (id)

REFERENCES "tblBase" (id) MATCH FULL

ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

);

So, You don't have any duplicated PK, every class is described in separate entity (table).

If Your business logic needs to recognize classes in other way You can use views, with constant field which is not stored on disk:

CREATE VIEW "vDerived1"

SELECT *, 'DERIVED1'::text as "ClassType" FROM

"tblBase" NATURAL JOIN "tblDerived1";

CREATE VIEW "vDerived2"

SELECT *, 'DERIVED2'::text as "ClassType" FROM

"tblBase" NATURAL JOIN "tblDerived2";

The problem is that "tblDerived1".id is not guaranteed to be not present in

"tblDerived2". This could be handled e.g. by trigger (before update) on both tables (cross check), or using a kind of "middleware", I mean a function which is responsible do perform inserts.

In my opinion it should be possible to recognize proper class based on its attributes, so it should be quite easy to implement this function. Function could be overloaded (same name, different set of attributes).

Regards,
Bartek

Re: Feature Proposal: Constant Values in Columns or Foreign Keys

From

Nils Gösche

Date:

18 April 2012, 15:37:52

Bartosz Dmytrak wrote:

> according to DB theory:
> 1NF: Table faithfully represents a relation and has no repeating groups
> 2NF: No non-prime attribute in the table is functionally dependent on a proper subset of anycandidate key.
> source: http://en.wikipedia.org/wiki/Database_normalization#Normal_forms

> so these constants are not in line with this approach.

This is true.  That's why you would have to guard the "constantness" of the column/attribute with a CHECK constraint,
toavoid update anomalies. 

Yes, the whole model would be simpler and more beautiful without the type column, and trigger functions on the derived
tablesinstead. On the other hand, the foreign key including the type field might be faster than the foreign key without
thetype field plus trigger function. So, if this approach is any good, then only if it is actually faster. No, I
haven'ttimed it :-) 

Regards,
--
Nils Gösche
Don't ask for whom the <Ctrl-G> tolls.

Attachment

smime.p7s