Thread: Enforce primary key on every table during dev?
We want to enforce a policy, partly just to protect those who might forget, for every table in a particular schema to have a primary key. This can't be done with event triggers as far as I can see, because it is quite legitimate to do:
BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;
It would be nice to have some kind of "deferrable event trigger" or some way to enforce that no transaction commits which added a table without a primary key.
Any ideas?
Thanks,
Jeremy
On 28/02/2018 15:34, Jeremy Finzel wrote: > We want to enforce a policy, partly just to protect those who might forget, for every table in a particular schema to havea primary key. This can't be done with event triggers as far as I can see, > because it is quite legitimate to do: > > BEGIN; > CREATE TABLE foo (id int); > ALTER TABLE foo ADD PRIMARY KEY (id); > COMMIT; > > It would be nice to have some kind of "deferrable event trigger" or some way to enforce that no transaction commits whichadded a table without a primary key. > > Any ideas? cron job to check for tables without PK ? Although for a short period the offending table would be there live without a PK. But IMO every table, in addition to PK, should have also natural unique keys as much as possible. Better safe than sorry. > > Thanks, > Jeremy -- Achilleas Mantzios IT DEV Lead IT DEPT Dynacom Tankers Mgmt
We want to enforce a policy, partly just to protect those who might forget, for every table in a particular schema to have a primary key. This can't be done with event triggers as far as I can see, because it is quite legitimate to do:BEGIN;CREATE TABLE foo (id int);ALTER TABLE foo ADD PRIMARY KEY (id);COMMIT;It would be nice to have some kind of "deferrable event trigger" or some way to enforce that no transaction commits which added a table without a primary key.Any ideas?Thanks,Jeremy
What stops somebody from doing:
CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text);
And then just never bother to ever insert anything into the column FILLER? It fulfills your stated requirement of every table having a primary key. Of course, you could amend the policy to say a "non-NULL primary key".
I have a theory that it's impossible to prove anything, but I can't prove it.
Maranatha! <><
John McKown
John McKown
On 02/28/2018 05:52 AM, John McKown wrote: > On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel <finzelj@gmail.com > <mailto:finzelj@gmail.com>>wrote: > > We want to enforce a policy, partly just to protect those who might > forget, for every table in a particular schema to have a primary > key. This can't be done with event triggers as far as I can see, > because it is quite legitimate to do: > > BEGIN; > CREATE TABLE foo (id int); > ALTER TABLE foo ADD PRIMARY KEY (id); > COMMIT; > > It would be nice to have some kind of "deferrable event trigger" or > some way to enforce that no transaction commits which added a table > without a primary key. > > Any ideas? > > Thanks, > Jeremy > > > > What stops somebody from doing: > > CREATE TABLE foo (filler text primary key default null, realcol1 int, > realcol2 text); > > And then just never bother to ever insert anything into the column > FILLER? It fulfills your stated requirement of every table having a Then you would get this: test=# CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text); CREATE TABLE test=# insert into foo (realcol1, realcol2) values (1, 'test'); ERROR: null value in column "filler" violates not-null constraint DETAIL: Failing row contains (null, 1, test). > primary key. Of course, you could amend the policy to say a "non-NULL > primary key". > > > > -- > I have a theory that it's impossible to prove anything, but I can't > prove it. > > Maranatha! <>< > John McKown -- Adrian Klaver adrian.klaver@aklaver.com
On Wed, Feb 28, 2018 at 8:57 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 02/28/2018 05:52 AM, John McKown wrote:On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel <finzelj@gmail.com <mailto:finzelj@gmail.com>>wrote:
We want to enforce a policy, partly just to protect those who might
forget, for every table in a particular schema to have a primary
key. This can't be done with event triggers as far as I can see,
because it is quite legitimate to do:
BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;
It would be nice to have some kind of "deferrable event trigger" or
some way to enforce that no transaction commits which added a table
without a primary key.
Any ideas?
Thanks,
Jeremy
What stops somebody from doing:
CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text);
And then just never bother to ever insert anything into the column FILLER? It fulfills your stated requirement of every table having a
Then you would get this:
test=# CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text);
CREATE TABLE
test=# insert into foo (realcol1, realcol2) values (1, 'test');
ERROR: null value in column "filler" violates not-null constraint
DETAIL: Failing row contains (null, 1, test).--primary key. Of course, you could amend the policy to say a "non-NULL primary key".
--
I have a theory that it's impossible to prove anything, but I can't prove it.
Maranatha! <><
John McKown
Adrian Klaver
adrian.klaver@aklaver.com
As Adrian pointed out, by definition, PK's create a constraint which are NOT NULLABLE;
Here is the SQL to check for tables with no primary key.
SELECT n.nspname,
c.relname as table,
c.reltuples::bigint
FROM pg_class c
JOIN pg_namespace n ON (n.oid =c.relnamespace )
WHERE relkind = 'r' AND
relhaspkey = FALSE
ORDER BY n.nspname, c.relname;
SELECT n.nspname,
c.relname as table,
c.reltuples::bigint
FROM pg_class c
JOIN pg_namespace n ON (n.oid =c.relnamespace )
WHERE relkind = 'r' AND
relhaspkey = FALSE
ORDER BY n.nspname, c.relname;
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
On 02/28/2018 05:52 AM, John McKown wrote:On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel <finzelj@gmail.com <mailto:finzelj@gmail.com>>wrote:
We want to enforce a policy, partly just to protect those who might
forget, for every table in a particular schema to have a primary
key. This can't be done with event triggers as far as I can see,
because it is quite legitimate to do:
BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;
It would be nice to have some kind of "deferrable event trigger" or
some way to enforce that no transaction commits which added a table
without a primary key.
Any ideas?
Thanks,
Jeremy
What stops somebody from doing:
CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text);
And then just never bother to ever insert anything into the column FILLER? It fulfills your stated requirement of every table having a
Then you would get this:
test=# CREATE TABLE foo (filler text primary key default null, realcol1 int, realcol2 text);
CREATE TABLE
test=# insert into foo (realcol1, realcol2) values (1, 'test');
ERROR: null value in column "filler" violates not-null constraint
DETAIL: Failing row contains (null, 1, test).
Hum, it's been so long, I totally forgot. Which makes me wonder why the parser doesn't "know" that a default of NULL for a primary key is going to fail anyway and flag it at CREATE time. Oh, well. Thanks.
primary key. Of course, you could amend the policy to say a "non-NULL primary key".
--
I have a theory that it's impossible to prove anything, but I can't prove it.
Maranatha! <><
John McKown
--
Adrian Klaver
adrian.klaver@aklaver.com
I have a theory that it's impossible to prove anything, but I can't prove it.
Maranatha! <><
John McKown
John McKown
We want to enforce a policy, partly just to protect those who might forget, for every table in a particular schema to have a primary key. This can't be done with event triggers as far as I can see, because it is quite legitimate to do:
Add a query to your test suite that queries the catalogs and fails if this policy is violated. There is nothing in a running PostgreSQL server instance that is going to enforce this for you.
David J.
Jeremy Finzel <finzelj@gmail.com> writes: > We want to enforce a policy, partly just to protect those who might forget, > for every table in a particular schema to have a primary key. This can't > be done with event triggers as far as I can see, because it is quite > legitimate to do: > > BEGIN; > CREATE TABLE foo (id int); > ALTER TABLE foo ADD PRIMARY KEY (id); > COMMIT; > > It would be nice to have some kind of "deferrable event trigger" or some > way to enforce that no transaction commits which added a table without a > primary key. > I think you would be better off having an automated report which alerts you to tables lacking a primary key and deal with that policy through other means. Using triggers in this way often leads to unexpected behaviour and difficult to identify bugs. The policy is a management policy and probably should be dealt with via management channels rather than technical ones. Besides, the likely outcome will be your developers will just adopt the practice of adding a serial column to every table, which in itself doesn't really add any value. Tim -- Tim Cross
----- Original Message ----- > From: "Tim Cross" <theophilusx@gmail.com> > Sent: Wednesday, February 28, 2018 4:07:43 PM > > Jeremy Finzel <finzelj@gmail.com> writes: > > > We want to enforce a policy, partly just to protect those who might forget, > > for every table in a particular schema to have a primary key. This can't > > be done with event triggers as far as I can see, because it is quite > > legitimate to do: > > > > BEGIN; > > CREATE TABLE foo (id int); > > ALTER TABLE foo ADD PRIMARY KEY (id); > > COMMIT; > > > > It would be nice to have some kind of "deferrable event trigger" or some > > way to enforce that no transaction commits which added a table without a > > primary key. > > > > I think you would be better off having an automated report which alerts > you to tables lacking a primary key and deal with that policy through > other means. Using triggers in this way often leads to unexpected > behaviour and difficult to identify bugs. The policy is a management > policy and probably should be dealt with via management channels rather > than technical ones. Besides, the likely outcome will be your developers > will just adopt the practice of adding a serial column to every table, > which in itself doesn't really add any value. I concur with other respondents that suggest this is more of a policy issue. In fact, you yourself identify it right therein the first sentence as a policy issue! One tool that changed my life (as a PostgreSQL enthusiast) forever is David Wheeler's pgTAP (http://pgtap.org/) tool. Itincludes a suite of functionality to assess the database schema via automated testing. Part of a rigorous development environmentmight include using this tool so that any application/database changes be driven by tests, and then your codereview process would assure that the appropriate tests are added to the pgTAP script to confirm that changes meet a policystandard such as what you are demanding. I can't imagine doing PostgreSQL development without it now. Same guy also produced a related tool called Sqitch (http://sqitch.org/) for data base change management. Use these toolstogether, so that before a developer is allowed to check in a feature branch, your teams' code review process maintainsrigorous oversight of modifications. -- B
On Wed, Feb 28, 2018 at 4:07 PM, Tim Cross <theophilusx@gmail.com> wrote:
Jeremy Finzel <finzelj@gmail.com> writes:
> We want to enforce a policy, partly just to protect those who might forget,
> for every table in a particular schema to have a primary key. This can't
> be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>
> BEGIN;
> CREATE TABLE foo (id int);
> ALTER TABLE foo ADD PRIMARY KEY (id);
> COMMIT;
>
> It would be nice to have some kind of "deferrable event trigger" or some
> way to enforce that no transaction commits which added a table without a
> primary key.
>
I think you would be better off having an automated report which alerts
you to tables lacking a primary key and deal with that policy through
other means. Using triggers in this way often leads to unexpected
behaviour and difficult to identify bugs. The policy is a management
policy and probably should be dealt with via management channels rather
than technical ones. Besides, the likely outcome will be your developers
will just adopt the practice of adding a serial column to every table,
which in itself doesn't really add any value.
Tim
--
Tim Cross
> I think you would be better off having an automated report which alerts
>you to tables lacking a primary key and deal with that policy through
>other means.
Perhaps a better solution is to have a meeting with the developers and explain to them
WHY the policy of enforcing a primary key is important. Also, explain the purpose of
primary keys and why it is not always suitable to just use an integer or serial as the key,
but rather why natural unique (even multi column) keys are better. But this begs the question,
why are "developers" allowed to design database tables? That should be the job of the DBA! At
the very minimum, the DBA should be reviewing and have the authority to approve of disapprove
of table/schema designs/changes .
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
> On 1 Mar 2018, at 1:47, Melvin Davidson <melvin6925@gmail.com> wrote: > > I think you would be better off having an automated report which alerts > >you to tables lacking a primary key and deal with that policy through > >other means. > > Perhaps a better solution is to have a meeting with the developers and explain to them > WHY the policy of enforcing a primary key is important. Also, explain the purpose of > primary keys and why it is not always suitable to just use an integer or serial as the key, > but rather why natural unique (even multi column) keys are better. But this begs the question, > why are "developers" allowed to design database tables? That should be the job of the DBA! At > the very minimum, the DBA should be reviewing and have the authority to approve of disapprove > of table/schema designs/changes . Not to mention that not all types of tables necessarily have suitable candidates for a primary key. You could add a surrogatekey based on a serial type, but in such cases that may not serve any purpose other than to have some arbitrary primarykey. An example of such tables is a monetary transaction table that contains records for deposits and withdrawals to accounts.It will have lots of foreign key references to other tables, but rows containing the same values are probably notduplicates. Adding a surrogate key to such a table just adds overhead, although that could be useful in case specific rows need updatingor deleting without also modifying the other rows with that same data - normally, only insertions and selectionshappen on such tables though, and updates or deletes are absolutely forbidden - corrections happen by insertingrows with an opposite transaction. Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
On 03/01/2018 02:20 AM, Alban Hertroys wrote: [snip] > Not to mention that not all types of tables necessarily have suitable candidates for a primary key. You could add a surrogatekey based on a serial type, but in such cases that may not serve any purpose other than to have some arbitrary primarykey. > > An example of such tables is a monetary transaction table that contains records for deposits and withdrawals to accounts.It will have lots of foreign key references to other tables, but rows containing the same values are probably notduplicates. > Adding a surrogate key to such a table just adds overhead, although that could be useful in case specific rows need updatingor deleting without also modifying the other rows with that same data - normally, only insertions and selectionshappen on such tables though, and updates or deletes are absolutely forbidden - corrections happen by insertingrows with an opposite transaction. Wouldn't the natural pk of such a table be timestamp+seqno, just as the natural pk of a transaction_detail table be transaction_no+seqno? -- Angular momentum makes the world go 'round.
>Adding a surrogate key to such a table just adds overhead, although that could be useful >in case specific rows need updating or deleting without also modifying the other rows with >that same data - normally, only insertions and selections happen on such tables though, >and updates or deletes are absolutely forbidden - corrections happen by inserting rows with >an opposite transaction. I routinely add surrogate keys like serial col to a table already having a nice candidate keys to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary key and need to join it with child tables.
> On 03/01/2018 02:20 AM, Alban Hertroys wrote: > [snip] > > Not to mention that not all types of tables necessarily have > > suitable candidates for a primary key. You could add a surrogate > > key based on a serial type, but in such cases that may not serve > > any purpose other than to have some arbitrary primary key. > > > > An example of such tables is a monetary transaction table that > > contains records for deposits and withdrawals to accounts. It will > > have lots of foreign key references to other tables, but rows > > containing the same values are probably not duplicates. Adding a > > surrogate key to such a table just adds overhead, although that > > could be useful in case specific rows need updating or deleting > > without also modifying the other rows with that same data - > > normally, only insertions and selections happen on such tables > > though, and updates or deletes are absolutely forbidden - > > corrections happen by inserting rows with an opposite transaction. > > Wouldn't the natural pk of such a table be timestamp+seqno, just as > the natural pk of a transaction_detail table be transaction_no+seqno? Start with Date's notion that a database exists to correclty represent data about the real world. Storing un-identified data breaks this since we have no idea what the data means or have any good way of getting it back out. Net result is that any workable relational database will have at least one candidate key for any table in it. If you can say that "rows containing the same values are not duplicates" then you have a database that cannot be queried, audited, or managed effectively. The problem is that you cannot identify the rows, and thus cannot select related ones, update them (e.g., to expire outdated records), or validate the content against any external values (e.g., audit POS tapes using the database). In the case of a monitary transaction you need a transaction table, which will have most of the FK's, and a ledger for the transaction amounts. A minimum candidate key for the transaction table would be account, timestamp, authorizing customer id, and channel. This allows two people to, say, make deposits at the same time or the same authorizing account (e.g., a credit card number) to be processed at the same time in two places. The data for a transaction would include things like the final status, in-house authorizing agent, completion time. The ledger entries would include the transaction SK, sequence within the transaction, amount, and account. The ledger's candidate key is a transaction SK + sequence number -- the amount and account don't work because you may end up, say, making multiple deposits of $10 to your checking account on the same transaction. The ledger's sequence value can be problematic, requiring external code or moderately messy triggers to manage. Timestamps don't always work and are subject to clock-skew. One way to avoid this is require that a single transaction contain only unique amounts and accounts. At that point the ledger becomes a degenerate table of transaction id, amount, account (i.e., the entire table is nothing but a unique index). This might require generating multiple database transactions for a single external process (e.g., a customer walking up to the teller) but does simplify processing quite a bit. In both cases, having an SK on the ledger is useful for audit queries, which might have to process a large number of ledger entries in code. Extracting the ledger SK's in one query and walking down them using a unique index can be more effecient than having to extract the values. Either way, you can identify all of the transactions as unique and all of the ledger entries for that transaction. At that point the database can be queried for data, updated as necessary, audited against external data. If you have a design with un-identified data it means that you havn't normalized it properly: something is missing from the table with un-identifiable rows. -- Steven Lembark 1505 National Ave Workhorse Computing Rockford, IL 61103 lembark@wrkhors.com +1 888 359 3508
On 1 March 2018 at 17:22, Steven Lembark <lembark@wrkhors.com> wrote: > >> On 03/01/2018 02:20 AM, Alban Hertroys wrote: >> [snip] >> > Not to mention that not all types of tables necessarily have >> > suitable candidates for a primary key. You could add a surrogate >> > key based on a serial type, but in such cases that may not serve >> > any purpose other than to have some arbitrary primary key. >> > >> > An example of such tables is a monetary transaction table that >> > contains records for deposits and withdrawals to accounts. (...) > Start with Date's notion that a database exists to correclty represent > data about the real world. Storing un-identified data breaks this > since we have no idea what the data means or have any good way of > getting it back out. Net result is that any workable relational > database will have at least one candidate key for any table in it. (...) > If you have a design with un-identified data it means that you havn't > normalized it properly: something is missing from the table with > un-identifiable rows. While that holds true for a relational model, in reporting for example, it is common practice to denormalize data without a requirement to be able to identify a single record. The use case for such tables is providing quick aggregates on the data. Often this deals with derived data. It's not that uncommon to not have a primary or even a uniquely identifiable key on such tables. I do not disagree that having a primary key on a table is a bad thing, but I do disagree that a primary key is a requirement for all tables. More generally: For every rule there are exceptions. Even for this one. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
On Thu, Mar 1, 2018 at 11:43 AM, Alban Hertroys <haramrae@gmail.com> wrote:
On 1 March 2018 at 17:22, Steven Lembark <lembark@wrkhors.com> wrote:
>
>> On 03/01/2018 02:20 AM, Alban Hertroys wrote:
>> [snip]
>> > Not to mention that not all types of tables necessarily have
>> > suitable candidates for a primary key. You could add a surrogate
>> > key based on a serial type, but in such cases that may not serve
>> > any purpose other than to have some arbitrary primary key.
>> >
>> > An example of such tables is a monetary transaction table that
>> > contains records for deposits and withdrawals to accounts.
(...)
> Start with Date's notion that a database exists to correclty represent
> data about the real world. Storing un-identified data breaks this
> since we have no idea what the data means or have any good way of
> getting it back out. Net result is that any workable relational
> database will have at least one candidate key for any table in it.
(...)
> If you have a design with un-identified data it means that you havn't
> normalized it properly: something is missing from the table with
> un-identifiable rows.
While that holds true for a relational model, in reporting for
example, it is common practice to denormalize data without a
requirement to be able to identify a single record. The use case for
such tables is providing quick aggregates on the data. Often this
deals with derived data. It's not that uncommon to not have a primary
or even a uniquely identifiable key on such tables.
I do not disagree that having a primary key on a table is a bad thing,
but I do disagree that a primary key is a requirement for all tables.
More generally: For every rule there are exceptions. Even for this one.
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
> it is common practice to denormalize data without a
>requirement to be able to identify a single record
You may perceive that to be "common practice", but in reality it is not, and in fact a bad one. As was previously stated, PosgreSQL is a _relational_ database,>requirement to be able to identify a single record
and breaking that premise will eventually land you in very big trouble. There is no solid reason not to a primary key for every table.
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
(Melvin's mail doesn't quote properly; I've tried to fix it but may have broken something) > On Mar 1, 2018, at 8:50 AM, Melvin Davidson <melvin6925@gmail.com> wrote: > > > On Thu, Mar 1, 2018 at 11:43 AM, Alban Hertroys <haramrae@gmail.com> wrote: > > >> On 1 March 2018 at 17:22, Steven Lembark <lembark@wrkhors.com> wrote: >>> If you have a design with un-identified data it means that you havn't >>> normalized it properly: something is missing from the table with >>> un-identifiable rows. >> >> While that holds true for a relational model, in reporting for >> example, it is common practice to denormalize data without a >> requirement to be able to identify a single record. The use case for >> such tables is providing quick aggregates on the data. Often this >> deals with derived data. It's not that uncommon to not have a primary >> or even a uniquely identifiable key on such tables. >> >> I do not disagree that having a primary key on a table is a bad thing, >> but I do disagree that a primary key is a requirement for all tables. >> >> More generally: For every rule there are exceptions. Even for this one. > > You may perceive that to be "common practice", but in reality it is not, and in fact a bad one. As was previously stated,PosgreSQL is a _relational_ database, > and breaking that premise will eventually land you in very big trouble. There is no solid reason not to a primary key forevery table. Sure there is. It's an additional index and significant additional insert / update overhead. If you're never going to retrieve single rows, nor join in such a way that uniqueness on this side is required there's no need for a unique identifier. It's a rare case that you won't want a primary key, and I'll often add a surrogate one for convenience even when it's not actually needed, but there are cases where it's appropriate not to have one, even in OLTP work. Log tables, for example. "Every table should have a primary key, whether natural or surrogate" is a great guideline, and everyone should follow it until they understand when they shouldn't. More generally: For every rule there are exceptions. Even for this one. Cheers, Steve
On 03/01/2018 11:07 AM, Steve Atkins wrote: [snip] > "Every table should have a primary key, whether natural or surrogate" > is a great guideline, and everyone should follow it until they understand > when they shouldn't. Most people think they know, but they don't. -- Angular momentum makes the world go 'round.
On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.
I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.
Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed! So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
On 03/01/2018 11:47 AM, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.
So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
Melvin: On Thu, Mar 1, 2018 at 1:47 AM, Melvin Davidson <melvin6925@gmail.com> wrote: > But this begs the question, > why are "developers" allowed to design database tables? That should be the > job of the DBA! That's the DBA wearing her developer hat. ( I agree with the spirit ) Francisco Olarte.
>I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the >table.
How does not having a ‘real’ PK allow you to insert garbage and a ‘real’ one prevent garbage?
If there’s no natural PK, at least a synthetic one will allow very quick record identification if used.
Martin.
On Thu, Mar 1, 2018 at 9:20 AM, Alban Hertroys <haramrae@gmail.com> wrote: > Not to mention that not all types of tables necessarily have suitable candidates for a primary key. They do if they are in 1NF. ( no dupes alllowed ) > An example of such tables is a monetary transaction table that contains records for deposits and withdrawals to accounts.It will have lots of foreign key references to other tables, but rows containing the same values are probably notduplicates. That's a bad example. They would normally have a transaction id, or a timestamp, or a sequence counter. PKs can expand all non-nullable columns. You could try to come with a real example, but all the times I've found these in one of my dessigns is because I didn't correctly model the "real world". > Adding a surrogate key to such a table just adds overhead, although that could be useful in case specific rows need updatingor deleting without also modifying the other rows with that same data - normally, only insertions and selectionshappen on such tables though, and updates or deletes are absolutely forbidden - corrections happen by insertingrows with an opposite transaction. And normally you would need to pinpoint an individual transaction for selection, hard to do if you do not have a pk. Francisco Olarte.
On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.--
Angular momentum makes the world go 'round.
Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
On Thu, Mar 1, 2018 at 1:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote: ... > I routinely add surrogate keys like serial col to a table already having a nice candidate keys > to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary > key and need to join it with child tables. It does, but many times useful, let me explain: table currencies ( code text, description text), primary key code ( i.e. "USD", "US Dollars" ) table sellers ( currency text, id number, .....), primary key (currency, id), foreign key currency references currencies table buyers ( currency text, id number, .....), primary key (currency, id) foreign key currency references currencies table transactions ( currency text, seller_id number, buyer_id number, trans_id number ....) primery key trans_id, foreign key currency references currencies, foreign key (currency, seller_id ) references sellers, foreign key (currency, buyer_id ) references buyers This is a bit unwieldy, but it expreses my example constraint, buyers can only buy from a seller with the same currency, there is no way to insert a cross-currency transaction. Of course, 3 femtoseconds after deployment the PHB will decide you can do cross-currency sales. Francisco Olarte.
On Thu, Mar 1, 2018 at 5:22 PM, Steven Lembark <lembark@wrkhors.com> wrote: > If you can say that "rows containing the same values are not > duplicates" Not a native speaker, but "Rows having the same values" seems to me the definition of duplicate ( ;-), J.K. ) > then you have a database that cannot be queried, audited, > or managed effectively. The problem is that you cannot identify the > rows, and thus cannot select related ones, update them (e.g., to > expire outdated records), or validate the content against any external > values (e.g., audit POS tapes using the database). Good point. All the times I've found myself with complete duplicates allowed I've alwasy found the correct model is no duplicates + count field ( with possible splits as you pointed below ). I would not have a "marbles" table with (red, red, blue, white, red, white), I would switch it to red=3, blue=1, white=2. Francisco Olarte.
Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
A bit contrived but it makes the point:
Company:
C1 (id c1)
C2 (id c2)
Department:
C1-D1 (id d1)
C1-D2 (id d2)
C2-D1 (id d3)
C2-D2 (id d4)
Employee:
C1-E1 (id e1)
C1-E2 (id e2)
C2-E1 (id e3)
C2-E2 (id e4)
Employee-Department:
e1-d1
e2-d2
e3-d2
e4-d4
The pair e3-d2 is invalid because e3 belongs to company c2 while d2 belongs to company c1 - but we've hidden the knowledge of c# behind the surrogate key and now we can insert garbage into employee-department.
David J.
On 03/01/2018 12:32 PM, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
On Thu, Mar 1, 2018 at 2:00 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
On 03/01/2018 12:32 PM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.--
Angular momentum makes the world go 'round.
If you are going to go to the trouble of having a surrogate/synthetic key, then you may as well have a primary key , which is much better.
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
On 01/03/2018 16:00 , Ron Johnson wrote:
IMHO, business logic can and must preclude "garbage insertion". Except you are inserting data directly to database using SQL, any n-tier architecture will be checking data validity.On 03/01/2018 12:32 PM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.--
Angular momentum makes the world go 'round.
El software de antivirus Avast ha analizado este correo electrónico en busca de virus. |
On 03/01/2018 01:05 PM, Melvin Davidson wrote:
[snip]On Thu, Mar 1, 2018 at 2:00 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 12:32 PM, Daevor The Devoted wrote:
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.--
Angular momentum makes the world go 'round.
If you are going to go to the trouble of having a surrogate/synthetic key, then you may as well have a primary key , which is much better.
I completely agree.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
On 03/01/2018 01:11 PM, marcelo wrote: > > On 01/03/2018 16:00 , Ron Johnson wrote: [snip] >> If your only unique index is a synthetic key, then you can insert the >> same "business data" multiple times with different synthetic keys. >> >> >> -- >> Angular momentum makes the world go 'round. > IMHO, business logic can and must preclude "garbage insertion". Except you > are inserting data directly to database using SQL, any n-tier architecture > will be checking data validity. Any n-tier architecture that's bug-free. -- Angular momentum makes the world go 'round.
On Thu, Mar 1, 2018 at 8:52 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.A bit contrived but it makes the point:Company:C1 (id c1)C2 (id c2)Department:C1-D1 (id d1)C1-D2 (id d2)C2-D1 (id d3)C2-D2 (id d4)Employee:C1-E1 (id e1)C1-E2 (id e2)C2-E1 (id e3)C2-E2 (id e4)Employee-Department:e1-d1e2-d2e3-d2e4-d4The pair e3-d2 is invalid because e3 belongs to company c2 while d2 belongs to company c1 - but we've hidden the knowledge of c# behind the surrogate key and now we can insert garbage into employee-department.David J.
This seems like hierarchical data, where employee's parent should be department, and department's parent is company. So it wouldn't be possible to "insert garbage" since Company is not stored in the Employee table, only a reference to Department (and Company determined via Department). Isn't that how normal hierarchical data works?
On 01/03/2018 16:42 , Ron Johnson wrote: > On 03/01/2018 01:11 PM, marcelo wrote: >> >> On 01/03/2018 16:00 , Ron Johnson wrote: > [snip] >>> If your only unique index is a synthetic key, then you can insert >>> the same "business data" multiple times with different synthetic keys. >>> >>> >>> -- >>> Angular momentum makes the world go 'round. >> IMHO, business logic can and must preclude "garbage insertion". >> Except you are inserting data directly to database using SQL, any >> n-tier architecture will be checking data validity. > > Any n-tier architecture that's bug-free. > Do you know about unit testing? --- El software de antivirus Avast ha analizado este correo electrónico en busca de virus. https://www.avast.com/antivirus
On Thu, Mar 1, 2018 at 9:00 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
On 03/01/2018 12:32 PM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.--
Angular momentum makes the world go 'round.
That might be where we're talking past each other: I do not advocate for the arbitrary primary key being the only unique index. Absolutely not. Whatever the business rules say is unique must also have unique indexes. If it's a business constraint on the data, it must be enforced in the DB (at least, that's how I try to do things).
This seems like hierarchical data
Hence the "this is contrived" disclaimer - but if one allows for employee-department to be many-to-many, and thus requiring a joining table, this still applies even if the specific choice to nouns doesn't make sense.
David J.
On 03/01/2018 02:08 PM, marcelo wrote: > > > On 01/03/2018 16:42 , Ron Johnson wrote: >> On 03/01/2018 01:11 PM, marcelo wrote: >>> >>> On 01/03/2018 16:00 , Ron Johnson wrote: >> [snip] >>>> If your only unique index is a synthetic key, then you can insert the >>>> same "business data" multiple times with different synthetic keys. >>>> >>>> >>>> -- >>>> Angular momentum makes the world go 'round. >>> IMHO, business logic can and must preclude "garbage insertion". Except >>> you are inserting data directly to database using SQL, any n-tier >>> architecture will be checking data validity. >> >> Any n-tier architecture that's bug-free. >> > Do you know about unit testing? Way Back When Dinosaurs Still Roamed The Earth and I first learned the trade, the focus was on proper design instead of throwing crud against the wall and hoping tests caught any bugs. Because, of course, unit tests are only as good as you imagination in devising tests. -- Angular momentum makes the world go 'round.
On 03/01/2018 02:09 PM, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 9:00 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 12:32 PM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:On 03/01/2018 11:47 AM, Daevor The Devoted wrote:On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com> wrote:
>Adding a surrogate key to such a table just adds overhead, although that could be useful
>in case specific rows need updating or deleting without also modifying the other rows with
>that same data - normally, only insertions and selections happen on such tables though,
>and updates or deletes are absolutely forbidden - corrections happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when you have a 3 col primary
key and need to join it with child tables.I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed!
And so you drop the existing index and build a new one. I've done it before, and I'll do it again.So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I can't stand synthetic keys. By their very nature, they're so purposelessly arbitrary, and allow you to insert garbage into the table.Could you perhaps elaborate on how a surrogate key allows one to insert garbage into the table? I'm afraid I don't quite get what you're saying.
If your only unique index is a synthetic key, then you can insert the same "business data" multiple times with different synthetic keys.--
Angular momentum makes the world go 'round.That might be where we're talking past each other: I do not advocate for the arbitrary primary key being the only unique index. Absolutely not. Whatever the business rules say is unique must also have unique indexes. If it's a business constraint on the data, it must be enforced in the DB (at least, that's how I try to do things).
Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.
The strongest case, for me, when a surrogate key is highly desirable is when there is no truly natural key and the best key for the model is potentially alterable. Specific, the "name" of something. If I add myself to a database and make name unique, so David Johnston, then someone else comes along with the same name and now I want to add the new person as, say David A. Johnston AND rename my existing record to David G. Johnston. I keep the needed uniqueness and don't need to cobble together other data elements. Or, if I were to use email address as the key the same physical entity can now change their address without me having to cascade update all FK instances too. Avoiding the FK cascade when enforcing a non-ideal PK is a major good reason to assign a surrogate.
David J.
On 01/03/2018 17:21 , Ron Johnson wrote: > On 03/01/2018 02:08 PM, marcelo wrote: >> >> >> On 01/03/2018 16:42 , Ron Johnson wrote: >>> On 03/01/2018 01:11 PM, marcelo wrote: >>>> >>>> On 01/03/2018 16:00 , Ron Johnson wrote: >>> [snip] >>>>> If your only unique index is a synthetic key, then you can insert >>>>> the same "business data" multiple times with different synthetic >>>>> keys. >>>>> >>>>> >>>>> -- >>>>> Angular momentum makes the world go 'round. >>>> IMHO, business logic can and must preclude "garbage insertion". >>>> Except you are inserting data directly to database using SQL, any >>>> n-tier architecture will be checking data validity. >>> bl >>> Any n-tier architecture that's bug-free. >>> >> Do you know about unit testing? > > Way Back When Dinosaurs Still Roamed The Earth and I first learned the > trade, the focus was on proper design instead of throwing crud against > the wall and hoping tests caught any bugs. Because, of course, unit > tests are only as good as you imagination in devising tests. > > So, you are fully convinced that there´s no bug free software... Same as I (and you) can code following the business rules, you (and me) can design unit tests not from "imagination" but from same rules. Moreover: you can have a surrogate key (to speedup foreign keys) and simultaneously put a unique constraint on the columns requiring it. What´s the question? --- El software de antivirus Avast ha analizado este correo electrónico en busca de virus. https://www.avast.com/antivirus
What´s the question?
Whether the OP, who hasn't come back, knew they were starting a flame war by asking this question...
There is no context-less "right place" to place validation logic, nor are the various options mutually exclusive.
David J.
> On Mar 1, 2018, at 12:47 , Daevor The Devoted <dollien@gmail.com> wrote: > > > I was always of the opinion that a mandatory surrogate key (as you describe) is good practice. > Sure there may be a unique key according to business logic (which may be consist of those > "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed! > So using a primary key whose sole purpose is to be a primary key makes perfect sense to me. I did not get your point. Can you explain why a change of business logic makes it difficult to change existing rows with surrogate key. thanks.
On Thu, Mar 1, 2018 at 10:32 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.The strongest case, for me, when a surrogate key is highly desirable is when there is no truly natural key and the best key for the model is potentially alterable. Specific, the "name" of something. If I add myself to a database and make name unique, so David Johnston, then someone else comes along with the same name and now I want to add the new person as, say David A. Johnston AND rename my existing record to David G. Johnston. I keep the needed uniqueness and don't need to cobble together other data elements. Or, if I were to use email address as the key the same physical entity can now change their address without me having to cascade update all FK instances too. Avoiding the FK cascade when enforcing a non-ideal PK is a major good reason to assign a surrogate.David J.
This is exactly my point: you cannot know when a Business Rule is going to change. Consider, for example, your Social Security number (or ID number as we call it in South Africa). This is unique, right?. Tomorrow, however, data of people from multiple countries gets added to your DB, and BAM! that ID number is suddenly no longer unique. Business Rules can and do change, and we do not know what may change in the future. Hence, it is safest to have the surrogate in place from the start, and avoid the potential migraine later on.
Disclaimer: this is just my opinion based on my experience (and the pain I had to go through when Business Rules changed). I have not done any research or conducted any studies on this.
On Thu, Mar 1, 2018 at 10:36 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
What´s the question?Whether the OP, who hasn't come back, knew they were starting a flame war by asking this question...There is no context-less "right place" to place validation logic, nor are the various options mutually exclusive.David J.
This I can wholeheartedly agree with. And my apologies if I came across as "flaming". Not my intention at all. I'm simply here to learn (and, well, offer my opinion from time to time :) )
On 01/03/2018 17:32 , David G. Johnston wrote:
I suffered myself what David said as an example...Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.The strongest case, for me, when a surrogate key is highly desirable is when there is no truly natural key and the best key for the model is potentially alterable. Specific, the "name" of something. If I add myself to a database and make name unique, so David Johnston, then someone else comes along with the same name and now I want to add the new person as, say David A. Johnston AND rename my existing record to David G. Johnston. I keep the needed uniqueness and don't need to cobble together other data elements. Or, if I were to use email address as the key the same physical entity can now change their address without me having to cascade update all FK instances too. Avoiding the FK cascade when enforcing a non-ideal PK is a major good reason to assign a surrogate.David J.
El software de antivirus Avast ha analizado este correo electrónico en busca de virus. |
On 03/01/2018 02:32 PM, marcelo wrote: > > > On 01/03/2018 17:21 , Ron Johnson wrote: >> On 03/01/2018 02:08 PM, marcelo wrote: >>> >>> >>> On 01/03/2018 16:42 , Ron Johnson wrote: >>>> On 03/01/2018 01:11 PM, marcelo wrote: >>>>> >>>>> On 01/03/2018 16:00 , Ron Johnson wrote: >>>> [snip] >>>>>> If your only unique index is a synthetic key, then you can insert the >>>>>> same "business data" multiple times with different synthetic keys. >>>>>> >>>>>> >>>>>> -- >>>>>> Angular momentum makes the world go 'round. >>>>> IMHO, business logic can and must preclude "garbage insertion". Except >>>>> you are inserting data directly to database using SQL, any n-tier >>>>> architecture will be checking data validity. >>>> bl >>>> Any n-tier architecture that's bug-free. >>>> >>> Do you know about unit testing? >> >> Way Back When Dinosaurs Still Roamed The Earth and I first learned the >> trade, the focus was on proper design instead of throwing crud against >> the wall and hoping tests caught any bugs. Because, of course, unit >> tests are only as good as you imagination in devising tests. >> >> > So, you are fully convinced that there´s no bug free software... Same as I > (and you) can code following the business rules, you (and me) can design > unit tests not from "imagination" but from same rules. > Moreover: you can have a surrogate key (to speedup foreign keys) and > simultaneously put a unique constraint on the columns requiring it. What´s > the question? Implementing tests to cover edge cases is much harder than implementing business rules in natural (and foreign) keys. -- Angular momentum makes the world go 'round.
On 03/01/2018 02:44 PM, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 10:32 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.The strongest case, for me, when a surrogate key is highly desirable is when there is no truly natural key and the best key for the model is potentially alterable. Specific, the "name" of something. If I add myself to a database and make name unique, so David Johnston, then someone else comes along with the same name and now I want to add the new person as, say David A. Johnston AND rename my existing record to David G. Johnston. I keep the needed uniqueness and don't need to cobble together other data elements. Or, if I were to use email address as the key the same physical entity can now change their address without me having to cascade update all FK instances too. Avoiding the FK cascade when enforcing a non-ideal PK is a major good reason to assign a surrogate.David J.This is exactly my point: you cannot know when a Business Rule is going to change. Consider, for example, your Social Security number (or ID number as we call it in South Africa). This is unique, right?.
No, the SSN is not unique. https://www.computerworld.com/article/2552992/it-management/not-so-unique.html
Tomorrow, however, data of people from multiple countries gets added to your DB, and BAM! that ID number is suddenly no longer unique. Business Rules can and do change, and we do not know what may change in the future. Hence, it is safest to have the surrogate in place from the start, and avoid the potential migraine later on.Disclaimer: this is just my opinion based on my experience (and the pain I had to go through when Business Rules changed). I have not done any research or conducted any studies on this.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
I found this thread very interesting. A pivot table is a perfectly valid use case where a compound unique key on two or more columns performs the same functionas a primary key without one. I’m not nearly as familiar with Postgres as I am with MySQL (which is why I recently joined this list)... it may be possibleto define a collection of tables as a primary key. But if only a unique key is specified in this case, everyone wouldstill be ok with the result from a logic design standpoint. I think Melvin, way up the thread, had the best answer- be the DBA and have a review process. Don’t let folks go addingtables as they like.
On 03/01/2018 02:32 PM, David G. Johnston wrote:
Why have the overhead of a second unique index? If it's "ease of joins", then I agree with Francisco Olarte and use the business logic keys in your joins even though it's a bit of extra work.The strongest case, for me, when a surrogate key is highly desirable is when there is no truly natural key and the best key for the model is potentially alterable. Specific, the "name" of something. If I add myself to a database and make name unique, so David Johnston, then someone else comes along with the same name and now I want to add the new person as, say David A. Johnston AND rename my existing record to David G. Johnston. I keep the needed uniqueness and don't need to cobble together other data elements. Or, if I were to use email address as the key the same physical entity can now change their address without me having to cascade update all FK instances too. Avoiding the FK cascade when enforcing a non-ideal PK is a major good reason to assign a surrogate.
There's always the "account number", which is usually synthetic. Credit Card numbers are also synthetic. ICD numbers are (relatively) synthetic, too.
But that doesn't mean we have to use them willy-nilly everywhere.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
Ron Johnson <ron.l.johnson@cox.net> writes: > On 03/01/2018 02:08 PM, marcelo wrote: >> >> >> On 01/03/2018 16:42 , Ron Johnson wrote: >>> On 03/01/2018 01:11 PM, marcelo wrote: >>>> >>>> On 01/03/2018 16:00 , Ron Johnson wrote: >>> [snip] >>>>> If your only unique index is a synthetic key, then you can insert the >>>>> same "business data" multiple times with different synthetic keys. >>>>> >>>>> >>>>> -- >>>>> Angular momentum makes the world go 'round. >>>> IMHO, business logic can and must preclude "garbage insertion". Except >>>> you are inserting data directly to database using SQL, any n-tier >>>> architecture will be checking data validity. >>> >>> Any n-tier architecture that's bug-free. >>> >> Do you know about unit testing? > > Way Back When Dinosaurs Still Roamed The Earth and I first learned the > trade, the focus was on proper design instead of throwing crud against the > wall and hoping tests caught any bugs. Because, of course, unit tests are > only as good as you imagination in devising tests. +1. And a good test of your underlying data model is whether you can identify a natural primary key. If you can't, chances are your model is immature/flawed and needs more analysis. -- Tim Cross
On 03/01/2018 01:03 PM, Ron Johnson wrote: > On 03/01/2018 02:32 PM, David G. Johnston wrote: > There's always the "account number", which is usually synthetic. Credit > Card numbers are also synthetic. Actually, no: https://en.wikipedia.org/wiki/Payment_card_number There is a method to the madness, not just random issuance of numbers. It was made it relatively easy for folks to generate numbers. Hence the addition of CSC codes. ICD numbers are (relatively) > synthetic, too. > > But that doesn't mean we have to use them willy-nilly everywhere. > > > -- > Angular momentum makes the world go 'round. -- Adrian Klaver adrian.klaver@aklaver.com
+1. And a good test of your underlying data model is whether you can
identify a natural primary key. If you can't, chances are your model is
immature/flawed and needs more analysis.
Unfortunately identifying a natural primary key doesn't guarantee that one's model is mature, unblemished, and complete - the model writer may just not know what they don't know. But they may know enough, or the application is constrained enough, for it to be useful anyway.
David J.
On 03/01/2018 03:14 PM, Adrian Klaver wrote:
On 03/01/2018 01:03 PM, Ron Johnson wrote:On 03/01/2018 02:32 PM, David G. Johnston wrote:There's always the "account number", which is usually synthetic. Credit Card numbers are also synthetic.
Actually, no:
https://en.wikipedia.org/wiki/Payment_card_number
There is a method to the madness, not just random issuance of numbers. It was made it relatively easy for folks to generate numbers. Hence the addition of CSC codes.
Right. And how do the issuers generate the individual account identifier within their IIN ranges?
ICD numbers are (relatively)synthetic, too.
But that doesn't mean we have to use them willy-nilly everywhere.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
On 03/01/2018 01:26 PM, Ron Johnson wrote: > On 03/01/2018 03:14 PM, Adrian Klaver wrote: >> On 03/01/2018 01:03 PM, Ron Johnson wrote: >>> On 03/01/2018 02:32 PM, David G. Johnston wrote: >> >>> There's always the "account number", which is usually synthetic. >>> Credit Card numbers are also synthetic. >> >> Actually, no: >> >> https://en.wikipedia.org/wiki/Payment_card_number >> >> There is a method to the madness, not just random issuance of numbers. >> It was made it relatively easy for folks to *generate numbers*. Hence >> the addition of CSC codes. > > Right. And how do the issuers generate the individual account > identifier within their IIN ranges? Who knows, that is their business, though there is nothing to say they don't use some sort of internal 'natural' logic. It has been awhile since we have gone down this rabbit hole on this list, mostly because it is an issue that is usually left at 'we agree to disagree'. Though the thing that always strikes me is the assumption that a number/surrogate key is less 'natural' then some other sort of tag or combination of tags. Because that is what PK's are, a tag to identify a record. > >> >> ICD numbers are (relatively) >>> synthetic, too. >>> >>> But that doesn't mean we have to use them willy-nilly everywhere. > > -- > Angular momentum makes the world go 'round. -- Adrian Klaver adrian.klaver@aklaver.com
On 02/03/18 06:47, Daevor The Devoted wrote: > > On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com > <mailto:rakeshkumar464@aol.com>> wrote: > > > >Adding a surrogate key to such a table just adds overhead, > although that could be useful > >in case specific rows need updating or deleting without also > modifying the other rows with > >that same data - normally, only insertions and selections happen > on such tables though, > >and updates or deletes are absolutely forbidden - corrections > happen by inserting rows with > >an opposite transaction. > > I routinely add surrogate keys like serial col to a table already > having a nice candidate keys > to make it easy to join tables. SQL starts looking ungainly when > you have a 3 col primary > key and need to join it with child tables. > > > I was always of the opinion that a mandatory surrogate key (as you > describe) is good practice. > Sure there may be a unique key according to business logic (which may > be consist of those "ungainly" multiple columns), but guess what, > business logic changes, and then you're screwed! So using a primary > key whose sole purpose is to be a primary key makes perfect sense to me. I once worked in a data base that had primary keys of at least 4 columns, all character fields, Primary Key could easily exceed 45 characters. Parent child structure was at least 4 deep. A child table only needs to know its parent, so there is no logical need to include its parent and higher tables primary keys, and then have to add a field to make the composite primary key unique! So if every table has int (or long) primary keys, then a child only need a single field to reference its parent. Some apparently safe Natural Keys might change unexpectedly. A few years aback there was a long thread on Natural versus Surrogate keys - plenty of examples were using Natural Keys can give grief when they had to be changed! I think it best to isolate a database from external changes as much as is practicable. Surrogate keys also simply coding, be it in SQL or Java, or whatever language is flavour of the month. Also it makes setting up testdata and debugging easier. I almost invariably define a Surrogate key when I design tables. Cheers, Gavin
On 01/03/2018 18:41 , Adrian Klaver wrote: > On 03/01/2018 01:26 PM, Ron Johnson wrote: >> On 03/01/2018 03:14 PM, Adrian Klaver wrote: >>> On 03/01/2018 01:03 PM, Ron Johnson wrote: >>>> On 03/01/2018 02:32 PM, David G. Johnston wrote: >>> >>>> There's always the "account number", which is usually synthetic. >>>> Credit Card numbers are also synthetic. >>> >>> Actually, no: >>> >>> https://en.wikipedia.org/wiki/Payment_card_number >>> >>> There is a method to the madness, not just random issuance of >>> numbers. It was made it relatively easy for folks to *generate >>> numbers*. Hence the addition of CSC codes. >> >> Right. And how do the issuers generate the individual account >> identifier within their IIN ranges? > > Who knows, that is their business, though there is nothing to say they > don't use some sort of internal 'natural' logic. It has been awhile > since we have gone down this rabbit hole on this list, mostly because > it is an issue that is usually left at 'we agree to disagree'. Though > the thing that always strikes me is the assumption that a > number/surrogate key is less 'natural' then some other sort of tag or > combination of tags. Because that is what PK's are, a tag to identify > a record. +1. > >> >>> >>> ICD numbers are (relatively) >>>> synthetic, too. >>>> >>>> But that doesn't mean we have to use them willy-nilly everywhere. >> >> -- >> Angular momentum makes the world go 'round. > > --- El software de antivirus Avast ha analizado este correo electrónico en busca de virus. https://www.avast.com/antivirus
On 01/03/2018 19:05 , Gavin Flower wrote: > On 02/03/18 06:47, Daevor The Devoted wrote: >> >> On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com >> <mailto:rakeshkumar464@aol.com>> wrote: >> >> >> >Adding a surrogate key to such a table just adds overhead, >> although that could be useful >> >in case specific rows need updating or deleting without also >> modifying the other rows with >> >that same data - normally, only insertions and selections happen >> on such tables though, >> >and updates or deletes are absolutely forbidden - corrections >> happen by inserting rows with >> >an opposite transaction. >> >> I routinely add surrogate keys like serial col to a table already >> having a nice candidate keys >> to make it easy to join tables. SQL starts looking ungainly when >> you have a 3 col primary >> key and need to join it with child tables. >> >> >> I was always of the opinion that a mandatory surrogate key (as you >> describe) is good practice. >> Sure there may be a unique key according to business logic (which may >> be consist of those "ungainly" multiple columns), but guess what, >> business logic changes, and then you're screwed! So using a primary >> key whose sole purpose is to be a primary key makes perfect sense to me. > > I once worked in a data base that had primary keys of at least 4 > columns, all character fields, Primary Key could easily exceed 45 > characters. Parent child structure was at least 4 deep. > > A child table only needs to know its parent, so there is no logical > need to include its parent and higher tables primary keys, and then > have to add a field to make the composite primary key unique! So if > every table has int (or long) primary keys, then a child only need a > single field to reference its parent. > > Some apparently safe Natural Keys might change unexpectedly. A few > years aback there was a long thread on Natural versus Surrogate keys - > plenty of examples were using Natural Keys can give grief when they > had to be changed! I think it best to isolate a database from > external changes as much as is practicable. > > Surrogate keys also simply coding, be it in SQL or Java, or whatever > language is flavour of the month. Also it makes setting up testdata > and debugging easier. > > I almost invariably define a Surrogate key when I design tables. > > > Cheers, > Gavin > > > > +5. I fully agree. --- El software de antivirus Avast ha analizado este correo electrónico en busca de virus. https://www.avast.com/antivirus
On Fri, Mar 2, 2018 at 12:05 AM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
Thank you! I think you have expressed far more clearly what I have been trying to say. +10 to you.On 02/03/18 06:47, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com <mailto:rakeshkumar464@aol.com>> wrote:
>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without also
modifying the other rows with
>that same data - normally, only insertions and selections happen
on such tables though,
>and updates or deletes are absolutely forbidden - corrections
happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already
having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when
you have a 3 col primary
key and need to join it with child tables.
I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.
Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed! So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I once worked in a data base that had primary keys of at least 4 columns, all character fields, Primary Key could easily exceed 45 characters. Parent child structure was at least 4 deep.
A child table only needs to know its parent, so there is no logical need to include its parent and higher tables primary keys, and then have to add a field to make the composite primary key unique! So if every table has int (or long) primary keys, then a child only need a single field to reference its parent.
Some apparently safe Natural Keys might change unexpectedly. A few years aback there was a long thread on Natural versus Surrogate keys - plenty of examples were using Natural Keys can give grief when they had to be changed! I think it best to isolate a database from external changes as much as is practicable.
Surrogate keys also simply coding, be it in SQL or Java, or whatever language is flavour of the month. Also it makes setting up testdata and debugging easier.
I almost invariably define a Surrogate key when I design tables.
Cheers,
Gavin
On 02/03/2018 01:10 , Daevor The Devoted wrote:
Me too. Another +10.On Fri, Mar 2, 2018 at 12:05 AM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:Thank you! I think you have expressed far more clearly what I have been trying to say. +10 to you.On 02/03/18 06:47, Daevor The Devoted wrote:
On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar <rakeshkumar464@aol.com <mailto:rakeshkumar464@aol.com>> wrote:
>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without also
modifying the other rows with
>that same data - normally, only insertions and selections happen
on such tables though,
>and updates or deletes are absolutely forbidden - corrections
happen by inserting rows with
>an opposite transaction.
I routinely add surrogate keys like serial col to a table already
having a nice candidate keys
to make it easy to join tables. SQL starts looking ungainly when
you have a 3 col primary
key and need to join it with child tables.
I was always of the opinion that a mandatory surrogate key (as you describe) is good practice.
Sure there may be a unique key according to business logic (which may be consist of those "ungainly" multiple columns), but guess what, business logic changes, and then you're screwed! So using a primary key whose sole purpose is to be a primary key makes perfect sense to me.
I once worked in a data base that had primary keys of at least 4 columns, all character fields, Primary Key could easily exceed 45 characters. Parent child structure was at least 4 deep.
A child table only needs to know its parent, so there is no logical need to include its parent and higher tables primary keys, and then have to add a field to make the composite primary key unique! So if every table has int (or long) primary keys, then a child only need a single field to reference its parent.
Some apparently safe Natural Keys might change unexpectedly. A few years aback there was a long thread on Natural versus Surrogate keys - plenty of examples were using Natural Keys can give grief when they had to be changed! I think it best to isolate a database from external changes as much as is practicable.
Surrogate keys also simply coding, be it in SQL or Java, or whatever language is flavour of the month. Also it makes setting up testdata and debugging easier.
I almost invariably define a Surrogate key when I design tables.
Cheers,
Gavin
El software de antivirus Avast ha analizado este correo electrónico en busca de virus. |