Thread: [GENERAL] looking for a globally unique row ID

[GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:
Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (e.i. during future UPDATES).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

on the other hand having a dedicated table just for the purpose of
holding that UNIQUE column with all the "client tables" having FK
pointing to it does not guarantee that multiple "client tables" want use
the same ID from that table. ... and I don't know a way to make an FK
"backfireing" from that table into a multitude of client tables.

I thought of using OIDs on tables that need this unique ID ... but
querying the web returns warnings of OID "rollover" problem. As I'm not
quite sure how to manage that, and if (3) is guaranteed (not just being
lucky during couple of tests), I think this is not an option here.

I'd appreciate any hints, as I'm bitting my head against this problem
for a long time, now.

Thenx in advance,

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
John R Pierce
Date:
On 9/14/2017 12:45 AM, Rafal Pietrak wrote:
> Can anybody help me find a way to implement an ID which:
>
> 1. guarantees being unique across multiple tables.
>
> 2. guarantees its uniqueness not only during INSERT, but also during the
> lifetime of the database/application (e.i. during future UPDATES).
>
> 3. guarantees persistence of value across database backup/restore/upgrade.


isn't that the problem that GUID are supposed to answer ?


-- 
john r pierce, recycling bits in santa cruz



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Michael Paquier
Date:
On Thu, Sep 14, 2017 at 4:45 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> Can anybody help me find a way to implement an ID which:
>
> 1. guarantees being unique across multiple tables.
>
> 2. guarantees its uniqueness not only during INSERT, but also during the
> lifetime of the database/application (e.i. during future UPDATES).
>
> 3. guarantees persistence of value across database backup/restore/upgrade.
> [...]
> I'd appreciate any hints, as I'm bitting my head against this problem
> for a long time, now.

UUID would give you some room for requirement 1. 2 and 3 can be found
with the usage of a serial column.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
George Neuner
Date:
On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
wrote:

>Hello everybody,
>
>Can anybody help me find a way to implement an ID which:
>
>1. guarantees being unique across multiple tables.
>
>2. guarantees its uniqueness not only during INSERT, but also during the
>lifetime of the database/application (e.i. during future UPDATES).
>
>3. guarantees persistence of value across database backup/restore/upgrade.

UUID is the obvious choice, but it does take a lot of space.

Something like this might do the job:
http://rob.conery.io/2014/05/28/a-better-id-generator-for-postgresql/


George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:
W dniu 14.09.2017 o 10:57, George Neuner pisze:
> On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
> wrote:
> 
>> Hello everybody,
>>
>> Can anybody help me find a way to implement an ID which:
>>
>> 1. guarantees being unique across multiple tables.
>>
>> 2. guarantees its uniqueness not only during INSERT, but also during the
>> lifetime of the database/application (e.i. during future UPDATES).
>>
>> 3. guarantees persistence of value across database backup/restore/upgrade.

Seeing the answers I feel, I should probably have added:

4. not necessarily guarantee "planetary wide" uniquness. Meaning:
backup/restore should instantiate those ID dupplication on the second
instance of the database.

> 
> UUID is the obvious choice, but it does take a lot of space.

I was hoping for something like a database-scoped "primary key" - which
in particular does not need to be anything big.... provided the dataset
is small.

As far as I can tell, UUID is an ID, that is "simple/fast" to generate,
and has "extremally low" probability of collisions.

Instead I was looking for a "mechanizms/program-sql-idioms" which don't
have to be particularly efficient, but once generated, no matter what,
the uniqueness is asurred by the database. Including UPDATEs - e.i.
assignment of a completly new ID for a particular ROW.

But I understand I may quit searching - there is nothing "so simple".

> 
> Something like this might do the job:
> http://rob.conery.io/2014/05/28/a-better-id-generator-for-postgresql/

I'll have a look, but this is not quite the tool I'm looking for.

> 
> 
> George
> 
> 
> 

Thank you all for answers.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
vinny
Date:
On 2017-09-14 15:06, Rafal Pietrak wrote:
> W dniu 14.09.2017 o 10:57, George Neuner pisze:
>> On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
>> wrote:
>> 
>>> Hello everybody,
>>> 
>>> Can anybody help me find a way to implement an ID which:
>>> 
>>> 1. guarantees being unique across multiple tables.
>>> 
>>> 2. guarantees its uniqueness not only during INSERT, but also during 
>>> the
>>> lifetime of the database/application (e.i. during future UPDATES).
>>> 
>>> 3. guarantees persistence of value across database 
>>> backup/restore/upgrade.
> 
> Seeing the answers I feel, I should probably have added:
> 
> 4. not necessarily guarantee "planetary wide" uniquness. Meaning:
> backup/restore should instantiate those ID dupplication on the second
> instance of the database.
> 
>> 
>> UUID is the obvious choice, but it does take a lot of space.
> 
> I was hoping for something like a database-scoped "primary key" - which
> in particular does not need to be anything big.... provided the dataset
> is small.
> 
> As far as I can tell, UUID is an ID, that is "simple/fast" to generate,
> and has "extremally low" probability of collisions.
> 
> Instead I was looking for a "mechanizms/program-sql-idioms" which don't
> have to be particularly efficient, but once generated, no matter what,
> the uniqueness is asurred by the database. Including UPDATEs - e.i.
> assignment of a completly new ID for a particular ROW.
> 
> But I understand I may quit searching - there is nothing "so simple".
> 

If it is only one database, on one server, then couldn't you just use 
one sequence?
If oyu prefix the value with some identifier of the current table then 
you cannot get duplicates
across tables even if you reset the sequence.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Merlin Moncure
Date:
On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> Hello everybody,
>
> Can anybody help me find a way to implement an ID which:
>
> 1. guarantees being unique across multiple tables.
>
> 2. guarantees its uniqueness not only during INSERT, but also during the
> lifetime of the database/application (e.i. during future UPDATES).
>
> 3. guarantees persistence of value across database backup/restore/upgrade.
>
> an obvious candidate - a single SERIAL() (same serial) used in every
> table that needs that ID does not guarantee (2).

A shared sequence meets all of those requirements.  I tend to prefer
this over GUID.   GUIDs are twice the size, and randomized which has
some unpleasant performance characteristics.   The advantage they
bring is being able to set up multiple generation points (say, across
multiple database servers) without fear of conflict.

Sequence generation is very fast so there is no performance related
argument not to use the approach.

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Karl Czajkowski
Date:
On Sep 14, vinny modulated:

> If it is only one database, on one server, then couldn't you just
> use one sequence?
> If oyu prefix the value with some identifier of the current table
> then you cannot get duplicates
> across tables even if you reset the sequence.
> 

I didn't follow the whole thread, so I apologize if I'm repeating
earlier suggestions.

We use a shared sequence to issue the new identifiers, and in fact
limited the sequence to 43 bits so they can serialize as JSON numbers
for the benefit of our clients.  We disabled wrapping, so it will fail
if we exhaust the range.

If you rapidly churn through identifiers and could envision exhausting
64 bits in your database's lifetime, you should probably just use
UUIDs instead of a sequence.  A timestamp-based UUID still has 
reasonably sorting and indexing properties.

To "guarantee" uniqueness with a shared sequence or UUID generator,
you can use a trigger to prevent override of identifiers from SQL. As
long as you always use the correct value generator during INSERT and
disallow mutation of identifiers during UPDATE, the rows will not
share identifiers.


Karl


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Steve Atkins
Date:
> On Sep 14, 2017, at 8:38 AM, Karl Czajkowski <karlcz@isi.edu> wrote:
> 
> On Sep 14, vinny modulated:
> 
>> If it is only one database, on one server, then couldn't you just
>> use one sequence?
>> If oyu prefix the value with some identifier of the current table
>> then you cannot get duplicates
>> across tables even if you reset the sequence.
>> 
> 
> I didn't follow the whole thread, so I apologize if I'm repeating
> earlier suggestions.
> 
> We use a shared sequence to issue the new identifiers, and in fact
> limited the sequence to 43 bits so they can serialize as JSON numbers
> for the benefit of our clients.  We disabled wrapping, so it will fail
> if we exhaust the range.
> 
> If you rapidly churn through identifiers and could envision exhausting
> 64 bits in your database's lifetime, you should probably just use

2^63 nanoseconds is about three centuries.

Unless you need to generate identifiers in multiple places a simple
bigserial is good enough. (If you do need to generate unique identifiers
at facebook / instagram / twitter scale then there are other options, but
you're almost certainly not that big and you probably don't).

For distributed ids on a system you control there are a bunch of 64 bit
id generation algorithms that work well. Twitter snowflake was one of the
earlier ones.

Where UUIDs or GUIDs shine is when you want to be able to generate
ids with a reasonably guarantee that nobody else, anywhere on the planet
or off, ever has or ever will generate the same ID. If you're not in that
situation you don't really need the behaviour they try to guarantee.

> UUIDs instead of a sequence.  A timestamp-based UUID still has 
> reasonably sorting and indexing properties.
> 
> To "guarantee" uniqueness with a shared sequence or UUID generator,
> you can use a trigger to prevent override of identifiers from SQL. As
> long as you always use the correct value generator during INSERT and
> disallow mutation of identifiers during UPDATE, the rows will not
> share identifiers.

Cheers, Steve


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 14.09.2017 o 15:54, Merlin Moncure pisze:
> On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>> Hello everybody,
>>
>> Can anybody help me find a way to implement an ID which:
>>
>> 1. guarantees being unique across multiple tables.
>>
>> 2. guarantees its uniqueness not only during INSERT, but also during the
>> lifetime of the database/application (e.i. during future UPDATES).
>>
>> 3. guarantees persistence of value across database backup/restore/upgrade.
>>
>> an obvious candidate - a single SERIAL() (same serial) used in every
>> table that needs that ID does not guarantee (2).
> 
> A shared sequence meets all of those requirements.  I tend to prefer

Not really.

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

What I mean here is functionality similar to "primary key", or "unique
constraint". Whenever somebody (application, like faulty application
IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which in
fact could possibly be generated earlier by UUID algorithms, or even a
sequence), if that value is among table that uses that (misterious)
"global primary key"; that application just fails the transaction like
any other "not unique" constraint failing.

That's the goal.

Multitude of tablas using a single sequence does not give that guarantee.

As I've said, a solution closest to my target is a separate table with
just one column of that "global primary key", which get inserted/updated
within trigger on insert/update of the "client tables" ... only I'm not
so sure how to "cleanly" manage multitude of tables using the same key
of that "global table of keys"... that is its "back references".

So I'm stuck with seriously incomplete solution.

that's why I have an impression, that I'm going into entirely wrong
direction here.

> this over GUID.   GUIDs are twice the size, and randomized which has
> some unpleasant performance characteristics.   The advantage they
> bring is being able to set up multiple generation points (say, across
> multiple database servers) without fear of conflict.
> 
> Sequence generation is very fast so there is no performance related
> argument not to use the approach.
> 
> merlin
> 
> 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rob Sargent
Date:

On 09/14/2017 11:11 AM, Rafal Pietrak wrote:
>
> Not really.
>
> As I said, I'm not looking for performance or "fair probability" of
> planetary-wide uniqueness.
>
> My main objective is the "guarantee". Which I've tried to indicate
> referring to "future UPDATEs".
>
> What I mean here is functionality similar to "primary key", or "unique
> constraint". Whenever somebody (application, like faulty application
> IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which in
> fact could possibly be generated earlier by UUID algorithms, or even a
> sequence), if that value is among table that uses that (misterious)
> "global primary key"; that application just fails the transaction like
> any other "not unique" constraint failing.
>
> That's the goal.
>
> Multitude of tablas using a single sequence does not give that guarantee.
>
> As I've said, a solution closest to my target is a separate table with
> just one column of that "global primary key", which get inserted/updated
> within trigger on insert/update of the "client tables" ... only I'm not
> so sure how to "cleanly" manage multitude of tables using the same key
> of that "global table of keys"... that is its "back references".
>
> So I'm stuck with seriously incomplete solution.
>
> that's why I have an impression, that I'm going into entirely wrong
> direction here.
>
>
So you care if the same id is used in separate, unrelated tables? What's 
your fear here?  And I completely get the confusion generated be the 
same small integer being re-used in various context ("sample id" is the 
bane for me). Could you use a sufficiently accurate time value?



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Kenneth Marshall
Date:
On Thu, Sep 14, 2017 at 07:11:19PM +0200, Rafal Pietrak wrote:
> 
> As I said, I'm not looking for performance or "fair probability" of
> planetary-wide uniqueness.
> 
> My main objective is the "guarantee". Which I've tried to indicate
> referring to "future UPDATEs".
> 
> What I mean here is functionality similar to "primary key", or "unique
> constraint". Whenever somebody (application, like faulty application
> IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which in
> fact could possibly be generated earlier by UUID algorithms, or even a
> sequence), if that value is among table that uses that (misterious)
> "global primary key"; that application just fails the transaction like
> any other "not unique" constraint failing.
> 
> That's the goal.
> 
> that's why I have an impression, that I'm going into entirely wrong
> direction here.

Hi Rafal,

How many tables do you need to support a unique key across? My approach
to problems like this is to to provide constraints that will allow normal
DB functions to provide these assurances. For example, give each table
its on serial generator as a primary key, but make the sequences disjoint.
There are 9,592 prime numbers less than 100,000. Give each table one of
those as the increment and within that table you will never hit the sequence
value generated for a second table. This will at least allow you to prevent
any table from ever using the value for another table. Obviously, this may
not fit your use case, but it provides another way to attack the problem.
Good luck.

Regards,
Ken


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 14.09.2017 o 19:30, Rob Sargent pisze:
> 
> 
> On 09/14/2017 11:11 AM, Rafal Pietrak wrote:
>>

[------------------]
>> So I'm stuck with seriously incomplete solution.
>>
>> that's why I have an impression, that I'm going into entirely wrong
>> direction here.
>>
>>
> So you care if the same id is used in separate, unrelated tables? What's
> your fear here?  And I completely get the confusion generated be the

You can call it fear, but I see it as an attempt for "schema/design
resilience".

Like with "unique constraint". You lay out the schema, and whatever bug
get planted into an application, the database does not let you put
inconsistent data there.

And answering your question, my problem is actually trivial. I need that
to systematically cover document identification within archive. There
are couple of document "Classes" I need to cover, their representation
in the DB requires different columns, so they are registered in
different table. I've started the project with one table for all those
classes, and quite soon it become monstrous.

So every document class got their own table. (and their private unique ID).

But problem rises when I need to referencing those documents along their
lifetime. Every class needs separate table for that. I cannot have a
single table of "history" (document-ID, dispatch-date, recepient), as
I'm not able to make FK from single table into multitude of tables (with
separate classes). So the forest of tables grows.

The above schema would get significantly simpler if only I could have a
"global primary key"... or something equivalent.

And as I said, this is not a problem of performance (or identification
space exhaustion). It's a problem of "schema explosion". Something that
actually is quite simple become complex (if not unmanageable) just
because I cannot see a good tool to manage the "globality".

I kick this problem for some time now - rewriting the basic schema
numerous times, and:

1. either I get monstrous "document master table" - which very
effectively leads to contents inconsistency (document get attributs
which dont belong to their classes)

2. or the schema explodes, as with documents stored in smaller (less
columns) specialized by-class table ... they all need their private
"managemnet tables" which FK into its respective document tables.

Both ways turned out to be quite expensive in maintenance.

Throwing actual numbers: 12 basic classes of documents; 17 tables
registering various operations document may undergo during its lifetime.
Variant (2) above make it 12*17 = 204 tables, which I'm currently
maintaining.... and it's too much. With variant (1) I simply wasn't able
to effectively keep document attributes consistent.

Thus I'm searching for tools (paradigms/sql-idioms) that would fit the
problem.

> same small integer being re-used in various context ("sample id" is the
> bane for me). Could you use a sufficiently accurate time value?
> 
> 
> 


But thank you all for answers.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rob Sargent
Date:

On 09/14/2017 02:39 PM, Rafal Pietrak wrote:
>
> W dniu 14.09.2017 o 19:30, Rob Sargent pisze:
>>
>> On 09/14/2017 11:11 AM, Rafal Pietrak wrote:
> [------------------]
>>> So I'm stuck with seriously incomplete solution.
>>>
>>> that's why I have an impression, that I'm going into entirely wrong
>>> direction here.
>>>
>>>
>> So you care if the same id is used in separate, unrelated tables? What's
>> your fear here?  And I completely get the confusion generated be the
> You can call it fear, but I see it as an attempt for "schema/design
> resilience".
>
> Like with "unique constraint". You lay out the schema, and whatever bug
> get planted into an application, the database does not let you put
> inconsistent data there.
>
> And answering your question, my problem is actually trivial. I need that
> to systematically cover document identification within archive. There
> are couple of document "Classes" I need to cover, their representation
> in the DB requires different columns, so they are registered in
> different table. I've started the project with one table for all those
> classes, and quite soon it become monstrous.
>
> So every document class got their own table. (and their private unique ID).
>
> But problem rises when I need to referencing those documents along their
> lifetime. Every class needs separate table for that. I cannot have a
> single table of "history" (document-ID, dispatch-date, recepient), as
> I'm not able to make FK from single table into multitude of tables (with
> separate classes). So the forest of tables grows.
>
> The above schema would get significantly simpler if only I could have a
> "global primary key"... or something equivalent.
>
> And as I said, this is not a problem of performance (or identification
> space exhaustion). It's a problem of "schema explosion". Something that
> actually is quite simple become complex (if not unmanageable) just
> because I cannot see a good tool to manage the "globality".
>
> I kick this problem for some time now - rewriting the basic schema
> numerous times, and:
>
> 1. either I get monstrous "document master table" - which very
> effectively leads to contents inconsistency (document get attributs
> which dont belong to their classes)
>
> 2. or the schema explodes, as with documents stored in smaller (less
> columns) specialized by-class table ... they all need their private
> "managemnet tables" which FK into its respective document tables.
>
> Both ways turned out to be quite expensive in maintenance.
>
> Throwing actual numbers: 12 basic classes of documents; 17 tables
> registering various operations document may undergo during its lifetime.
> Variant (2) above make it 12*17 = 204 tables, which I'm currently
> maintaining.... and it's too much. With variant (1) I simply wasn't able
> to effectively keep document attributes consistent.
>
> Thus I'm searching for tools (paradigms/sql-idioms) that would fit the
> problem.

Isn't this typically handled with an inheritance (parent-children) 
setup.  MasterDocument has id, subtype and any common columns (create 
date etc) then dependents use the same id from master to complete the 
data for a given type.  This is really common in ORM tools.  Not clear 
from the description if the operations could be similarly handled 
(operation id, operation type as master of 17 dependent 
operationSpecifics; there is also the "Activity Model")
>> same small integer being re-used in various context ("sample id" is the
>> bane for me). Could you use a sufficiently accurate time value?
>>
>>
>>
>
> But thank you all for answers.
>
> -R
>
>



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Karl Czajkowski
Date:
On Sep 14, Rafal Pietrak modulated:

> My main objective is the "guarantee". Which I've tried to indicate
> referring to "future UPDATEs".
> 

With a well-behaved application, it is sufficient to define each
ID column as:
  id int8 UNIQUE NOT NULL DEFAULT nextval('sharedsequence')

and ensure that you always use defaults for INSERT and never
set the column during UPDATE.

You can also use a BEFORE/FOR EACH ROW trigger to enforce these
conventions, if you are worried about accidental violations in your
SQL.  In that case, leave the DEFAULT as NULL in the table definition,
but make the trigger do this for INSERT:
  NEW.id := nextval('sharedsequence');

and this enforcement check for UPDATE:
  IF OLD.id != NEW.id THEN RAISE EXCEPTION 'id is immutable';

If that's not a strong enough guarantee, I'm not sure I understand
your problem statement...


Karl


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Gavin Flower
Date:
On 15/09/17 06:15, Kenneth Marshall wrote:
> On Thu, Sep 14, 2017 at 07:11:19PM +0200, Rafal Pietrak wrote:
>> As I said, I'm not looking for performance or "fair probability" of
>> planetary-wide uniqueness.
>>
>> My main objective is the "guarantee". Which I've tried to indicate
>> referring to "future UPDATEs".
>>
>> What I mean here is functionality similar to "primary key", or "unique
>> constraint". Whenever somebody (application, like faulty application
>> IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which in
>> fact could possibly be generated earlier by UUID algorithms, or even a
>> sequence), if that value is among table that uses that (misterious)
>> "global primary key"; that application just fails the transaction like
>> any other "not unique" constraint failing.
>>
>> That's the goal.
>>
>> that's why I have an impression, that I'm going into entirely wrong
>> direction here.
> Hi Rafal,
>
> How many tables do you need to support a unique key across? My approach
> to problems like this is to to provide constraints that will allow normal
> DB functions to provide these assurances. For example, give each table
> its on serial generator as a primary key, but make the sequences disjoint.
> There are 9,592 prime numbers less than 100,000. Give each table one of
> those as the increment and within that table you will never hit the sequence
> value generated for a second table. This will at least allow you to prevent
> any table from ever using the value for another table. Obviously, this may
> not fit your use case, but it provides another way to attack the problem.
> Good luck.
>
> Regards,
> Ken
>
>
Hmm...

2 4 6 8 10

3 6

5 10

Adding a prime as an increment is not sufficient to guarantee uniqueness!

You have to ensure that the product of the 2 smallest primes you use is 
greater than any number you'd need to generate.  With such large primes 
you may run out of sequence numbers faster than you would like!


Cheers,
Gavin




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Merlin Moncure
Date:
On Thu, Sep 14, 2017 at 12:11 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> W dniu 14.09.2017 o 15:54, Merlin Moncure pisze:
>> On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>>> Hello everybody,
>>>
>>> Can anybody help me find a way to implement an ID which:
>>>
>>> 1. guarantees being unique across multiple tables.
>>>
>>> 2. guarantees its uniqueness not only during INSERT, but also during the
>>> lifetime of the database/application (e.i. during future UPDATES).
>>>
>>> 3. guarantees persistence of value across database backup/restore/upgrade.
>>>
>>> an obvious candidate - a single SERIAL() (same serial) used in every
>>> table that needs that ID does not guarantee (2).
>>
>> A shared sequence meets all of those requirements.  I tend to prefer
>
> Not really.
>
> As I said, I'm not looking for performance or "fair probability" of
> planetary-wide uniqueness.
>
> My main objective is the "guarantee". Which I've tried to indicate
> referring to "future UPDATEs".
>
> What I mean here is functionality similar to "primary key", or "unique
> constraint". Whenever somebody (application, like faulty application
> IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which in
> fact could possibly be generated earlier by UUID algorithms, or even a
> sequence), if that value is among table that uses that (misterious)
> "global primary key"; that application just fails the transaction like
> any other "not unique" constraint failing.

hm, I don't understand you.  Are you trying to defend against forged
or erroneous data?  With both sequences and GUIDs, the assumption is
that the application will use the generated id and send it back to the
database.  With sequences, the database *guarantees* that the
identifier is unique with no exceptions; there never will be a unique
value.  Can you give a hypothetical example of how you think they
wouldn't work?

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
"David G. Johnston"
Date:
On Thu, Sep 14, 2017 at 12:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (e.i. during future UPDATES).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

​I don't see how PostgreSQL can provide an absolute guarantee here.  As others have been saying you can get very close, though.  Any of them have the same basic property - you need to convert client SQL into "requests for changes" and perform the actual changes within system-managed code while restricting any possibility for applications to make those changes themselves.  You can get a basic version of this using triggers.  Or you can go all-out and write API functions for every one of these tables and have the application perform CRUD only via these functions.  These functions then, and not the application, would control key generation.​  You disallow updating IDs and when inserting an ID you insert it into a "id log" table that has a unique index on it and if that insertion succeeds you can then associate it with the record being insert into the main table.  You could setup an FK as well but it wouldn't provide the needed guarantee.

Nothing will stop an administrator, or a user with incorrectly configured permissions, from bypassing all of that but properly configured application roles will be unable to do so.

David J.

Re: [GENERAL] looking for a globally unique row ID

From
"Joshua D. Drake"
Date:
On 09/14/2017 03:27 PM, David G. Johnston wrote:
> On Thu, Sep 14, 2017 at 12:45 AM, Rafal Pietrak <rafal@ztk-rp.eu 
> <mailto:rafal@ztk-rp.eu>>wrote:
> 
>     Hello everybody,
> 
>     Can anybody help me find a way to implement an ID which:
> 
>     1. guarantees being unique across multiple tables.
> 
>     2. guarantees its uniqueness not only during INSERT, but also during the
>     lifetime of the database/application (e.i. during future UPDATES).
> 
>     3. guarantees persistence of value across database
>     backup/restore/upgrade.
> 
>     an obvious candidate - a single SERIAL() (same serial) used in every
>     table that needs that ID does not guarantee (2).

Isn't this what GUIDS are for?

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
George Neuner
Date:
On Thu, 14 Sep 2017 17:02:05 -0500, Merlin Moncure
<mmoncure@gmail.com> wrote:

>... With sequences, the database *guarantees* that the
>identifier is unique with no exceptions; there never will be a unique
>value.  Can you give a hypothetical example of how you think they
>wouldn't work?

Jumping in here, but a month or so past someone here was complaining
about a restore resetting all the sequences in the database. Obviously
a strange situation [I've never seen it], but something to worry about
if you are relying on sequences for uniqueness.

Upthread I suggested the timestamp+counter approach.  I wasn't
thinking about this issue specifically, but it is immune to the
counter being reset [accidentally or otherwise].

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 14.09.2017 o 23:15, Rob Sargent pisze:
> 
> 
> On 09/14/2017 02:39 PM, Rafal Pietrak wrote:
>>
>> W dniu 14.09.2017 o 19:30, Rob Sargent pisze:
>>>
>>> On 09/14/2017 11:11 AM, Rafal Pietrak wrote:

[------------------]
>>
>> Throwing actual numbers: 12 basic classes of documents; 17 tables
>> registering various operations document may undergo during its lifetime.
>> Variant (2) above make it 12*17 = 204 tables, which I'm currently
>> maintaining.... and it's too much. With variant (1) I simply wasn't able
>> to effectively keep document attributes consistent.
>>
>> Thus I'm searching for tools (paradigms/sql-idioms) that would fit the
>> problem.
> 
> Isn't this typically handled with an inheritance (parent-children)
> setup.  MasterDocument has id, subtype and any common columns (create
> date etc) then dependents use the same id from master to complete the
> data for a given type.  This is really common in ORM tools.  Not clear
> from the description if the operations could be similarly handled
> (operation id, operation type as master of 17 dependent
> operationSpecifics; there is also the "Activity Model")

I do that, but may be I do that badly.

Currently I do have 6 levels of inheritance which partition my
document-class space. But I cannot see any way to have a unique index
(unique constraint) to cover all those partitions at once.

This is actually the core of my question: How to make one?

So far I only have separate unique indexes on all those 12 child-table
document-class subtables. Is there a way to combine those indexes? I
experimented, and an index created on parent table does not cover
content of child/inheriting tables. If it was, that would solve the problem.

.... or I've just missinterpreted you MasterDocument suggestion?


-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Alban Hertroys
Date:
On 15 September 2017 at 11:03, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

>> Isn't this typically handled with an inheritance (parent-children)
>> setup.  MasterDocument has id, subtype and any common columns (create
>> date etc) then dependents use the same id from master to complete the
>> data for a given type.  This is really common in ORM tools.  Not clear
>> from the description if the operations could be similarly handled
>> (operation id, operation type as master of 17 dependent
>> operationSpecifics; there is also the "Activity Model")
>
> I do that, but may be I do that badly.
>
> Currently I do have 6 levels of inheritance which partition my
> document-class space. But I cannot see any way to have a unique index
> (unique constraint) to cover all those partitions at once.
>
> This is actually the core of my question: How to make one?
>
> So far I only have separate unique indexes on all those 12 child-table
> document-class subtables. Is there a way to combine those indexes? I
> experimented, and an index created on parent table does not cover
> content of child/inheriting tables. If it was, that would solve the problem.
>
> .... or I've just missinterpreted you MasterDocument suggestion?

With table partitioning, provided the partitions are based on the
value(s) of a particular field that is part of the primary key of the
master table, the combination of the child tables' primary key and the
partition's check constraint on the partitioning field guarantee that
records across the partitioned tables are unique.

For example, if we define:
create table master_table (   year int
,   month int
,   example text
,   primary key (year, month)
);

create child2016_table () inherits master_table;

alter table child_table add constraint child2016_year_chk check (year = 2016);
alter table child_table add constraint child2016_pk primary key (year, month);

create child2017_table () inherits master_table;

alter table child_table add constraint child2017_year_chk check (year = 2017);
alter table child_table add constraint child2017_pk primary key (year, month);

In above, the three separate primary keys are guaranteed to contain
distinct ranges of year - provided that we forbid any records to go
directly into the master table or that those records do not have years
already covered by one of the child tables.

Perhaps you can apply this concept to your problem?

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Kenneth Marshall
Date:
> >
> Hmm...
> 
> 2 4 6 8 10
> 
> 3 6
> 
> 5 10
> 
> Adding a prime as an increment is not sufficient to guarantee uniqueness!
> 
> You have to ensure that the product of the 2 smallest primes you use
> is greater than any number you'd need to generate.  With such large
> primes you may run out of sequence numbers faster than you would
> like!
> 
> 
> Cheers,
> Gavin

Yes, you are right. That would not help.

Regards,
Ken


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 15.09.2017 o 13:36, Alban Hertroys pisze:
> On 15 September 2017 at 11:03, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> 

[-----------------]
> 
> For example, if we define:
> create table master_table (
>     year int
> ,   month int
> ,   example text
> ,   primary key (year, month)
> );
> 
> create child2016_table () inherits master_table;
> 
> alter table child_table add constraint child2016_year_chk check (year = 2016);
> alter table child_table add constraint child2016_pk primary key (year, month);
> 
> create child2017_table () inherits master_table;
> 
> alter table child_table add constraint child2017_year_chk check (year = 2017);
> alter table child_table add constraint child2017_pk primary key (year, month);
> 
> In above, the three separate primary keys are guaranteed to contain
> distinct ranges of year - provided that we forbid any records to go
> directly into the master table or that those records do not have years
> already covered by one of the child tables.
> 
> Perhaps you can apply this concept to your problem?
> 

I do it exactly this way.

The problem is, that the documents undergo "postprocessing" - 17 other
tables "would" describe those .... and MUST keep track of what's done
and to which document, which is done by FK into relevant document table.

Having this partitioning, instead of having those 17 "process-tables" I
have to create 17 * 12 = 204 tables to be able to implement those FK (ID
+ child-selector); while if only I could avoid that "child-selector"
(like YEAR in your example), it would let me reduce my schema like
10-fold (hmm, not exactly 12+17 = 29 tables v.s. 204 tables, but close
enough).

The complexity of this design have already made me stop adding
functionality ... so I'm looking for means of reducing it. An obvious
candidate would be something like "global primary key" over all the
partitions of the master document table.

But I understand, no such thing exists.

Thenx anyway,

-R



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rob Sargent
Date:
>> Isn't this typically handled with an inheritance (parent-children)
>> setup.  MasterDocument has id, subtype and any common columns (create
>> date etc) then dependents use the same id from master to complete the
>> data for a given type.  This is really common in ORM tools.  Not clear
>> from the description if the operations could be similarly handled
>> (operation id, operation type as master of 17 dependent
>> operationSpecifics; there is also the "Activity Model")
> I do that, but may be I do that badly.
>
> Currently I do have 6 levels of inheritance which partition my
> document-class space. But I cannot see any way to have a unique index
> (unique constraint) to cover all those partitions at once.
>
> This is actually the core of my question: How to make one?

I cannot image a single postgres index covering more than one physical 
table. Are you really asking for that? Remember each dependent record 
has an entry in the master so the master guarantees a unique set of keys 
across all the dependents.  Now if you have enough documents you may get 
into partitioning but that's a separate issue.
How you model the work done on (or state transition of) those documents 
is a yet another design, but at least the work flow model can safely, 
consistently refer to the master table.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Adam Brusselback
Date:
I cannot image a single postgres index covering more than one physical table. Are you really asking for that?

While not available yet, that is a feature that has had discussion before.  Global indexes are what i've seen it called in those discussions.  One of the main use cases is to provide uniqueness across multiple tables, which would also allow things like foreign keys on partitioned tables.

Re: [GENERAL] looking for a globally unique row ID

From
Rob Sargent
Date:



On 09/15/2017 12:45 PM, Adam Brusselback wrote:
I cannot image a single postgres index covering more than one physical table. Are you really asking for that?

While not available yet, that is a feature that has had discussion before.  Global indexes are what i've seen it called in those discussions.  One of the main use cases is to provide uniqueness across multiple tables, which would also allow things like foreign keys on partitioned tables.
I had a sneaking suspicion that partitioning would be the use-case, but clearly there's at least the 'notion' of a single entity

Re: [GENERAL] looking for a globally unique row ID

From
Christopher Browne
Date:
On 15 September 2017 at 14:45, Adam Brusselback
<adambrusselback@gmail.com> wrote:
>> I cannot image a single postgres index covering more than one physical
>> table. Are you really asking for that?
>
>
> While not available yet, that is a feature that has had discussion before.
> Global indexes are what i've seen it called in those discussions.  One of
> the main use cases is to provide uniqueness across multiple tables, which
> would also allow things like foreign keys on partitioned tables.

It certainly does come up periodically; it seems like a challengingly different
thing to implement (as compared to "regular" indexes), from two perspectives:

a) The present index implementation only needs to reference tuples from one
table, so the tuple references can be direct heap references.

If multiple tables (partitions being the most obvious case) were to be covered,
each index entry would also require indication of which table it comes from.

b) Referencing which index entries can be dropped (e.g. - vacuumed out)
is a fair bit more complex because the index entries depend on multiple
tables.  This adds, um, concurrency complications, if data is being deleted
from multiple tables concurrently.  (Over-simplifying question:  "When
a table that participates in the sharing is vacuumed, does the shared
index get vacuumed?  What if two such tables are vacuumed concurrently?")

This has added up to make it not an easy thing to implement.

To be sure, if a shared index required greatly worsened locking to do
maintenance, or suffered from inability to keep it tidy, that would make the
feature of rather less interest...
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 15.09.2017 o 20:49, Rob Sargent pisze:
> 
> 
> On 09/15/2017 12:45 PM, Adam Brusselback wrote:
>>
>>     I cannot image a single postgres index covering more than one
>>     physical table. Are you really asking for that?
>>
>>
>> While not available yet, that is a feature that has had discussion
>> before.  Global indexes are what i've seen it called in those
>> discussions.  One of the main use cases is to provide uniqueness
>> across multiple tables, which would also allow things like foreign
>> keys on partitioned tables.
> I had a sneaking suspicion that partitioning would be the use-case, but
> clearly there's at least the 'notion' of a single entity
> 

But (from my particular application perspective) it's quite vital.

Still, pondering ways to restructure my schema I came to conclusions
that having an index covering inherited hierarchy could help automate
partitioning - as opposed to current requirements of putting explicit
CHECKs into every child tables of that hierarchy.... and keeping those
check consisting as the schema grows. I'm not too sure though, as I'm
not familiar with postgres implementation internals.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Jehan-Guillaume de Rorthais
Date:

Le 14 septembre 2017 19:11:19 GMT+02:00, Rafal Pietrak <rafal@ztk-rp.eu> a écrit :
>
>
>W dniu 14.09.2017 o 15:54, Merlin Moncure pisze:
>> On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu>
>wrote:
>>> Hello everybody,
>>>
>>> Can anybody help me find a way to implement an ID which:
>>>
>>> 1. guarantees being unique across multiple tables.
>>>
>>> 2. guarantees its uniqueness not only during INSERT, but also during
>the
>>> lifetime of the database/application (e.i. during future UPDATES).
>>>
>>> 3. guarantees persistence of value across database
>backup/restore/upgrade.
>>>
>>> an obvious candidate - a single SERIAL() (same serial) used in every
>>> table that needs that ID does not guarantee (2).
>>
>> A shared sequence meets all of those requirements.  I tend to prefer
>
>Not really.
>
>As I said, I'm not looking for performance or "fair probability" of
>planetary-wide uniqueness.
>
>My main objective is the "guarantee". Which I've tried to indicate
>referring to "future UPDATEs".
>
>What I mean here is functionality similar to "primary key", or "unique
>constraint". Whenever somebody (application, like faulty application
>IMPORTANT!) tries to INSERT or UPDATE a not unique value there (which
>in
>fact could possibly be generated earlier by UUID algorithms, or even a
>sequence), if that value is among table that uses that (misterious)
>"global primary key"; that application just fails the transaction like
>any other "not unique" constraint failing.

I wrote something about this some years ago, this might do the trick for you, maybe with some adjustments depending on
yourschema. The main idea should help anyway. See: 

http://blog.ioguix.net/postgresql/2015/02/05/Partitionning-and-constraints-part-1.html

(no, I never wrote the second part about fk :-/)



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 15.09.2017 o 21:30, Christopher Browne pisze:
> On 15 September 2017 at 14:45, Adam Brusselback
> <adambrusselback@gmail.com> wrote:
>>> I cannot image a single postgres index covering more than one physical
>>> table. Are you really asking for that?
>>
>>
>> While not available yet, that is a feature that has had discussion before.
>> Global indexes are what i've seen it called in those discussions.  One of
>> the main use cases is to provide uniqueness across multiple tables, which
>> would also allow things like foreign keys on partitioned tables.
> 
> It certainly does come up periodically; it seems like a challengingly different
> thing to implement (as compared to "regular" indexes), from two perspectives:
> 
> a) The present index implementation only needs to reference tuples from one
> table, so the tuple references can be direct heap references.
> 
> If multiple tables (partitions being the most obvious case) were to be covered,
> each index entry would also require indication of which table it comes from.
> 
> b) Referencing which index entries can be dropped (e.g. - vacuumed out)
> is a fair bit more complex because the index entries depend on multiple
> tables.  This adds, um, concurrency complications, if data is being deleted
> from multiple tables concurrently.  (Over-simplifying question:  "When
> a table that participates in the sharing is vacuumed, does the shared
> index get vacuumed?  What if two such tables are vacuumed concurrently?")
> 
> This has added up to make it not an easy thing to implement.
> 
> To be sure, if a shared index required greatly worsened locking to do
> maintenance, or suffered from inability to keep it tidy, that would make the
> feature of rather less interest...
> 

I feel obliged to add my one final comment:

If the above rocks and logs in the way, so if all this is to be
implemented by an SQL level programmer as functions .... when
application level decisions requires it (say from the lack of
imagination or knowledge of SQL/RDBM proper designs), that SQL level
programmer will have to code most of those pitfalls at higher level -
most probably incompletely. (I'm in particular not able to pinpoint all
the cases that needs to be covered - like by triggers - to have the
global uniqueness checked in all the corner cases scenarios).

But following the signature remark of the email I'm currently replaying
to: is there a way to partition the problem?


-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:
<disclaimer>I'M NOT POSTGRES HACKER. THIS IS JUST NA INTUITION.
</disclaimer>

W dniu 15.09.2017 o 21:30, Christopher Browne pisze:
> On 15 September 2017 at 14:45, Adam Brusselback
> <adambrusselback@gmail.com> wrote:

[-----------]
> 
> b) Referencing which index entries can be dropped (e.g. - vacuumed out)
> is a fair bit more complex because the index entries depend on multiple
> tables.  This adds, um, concurrency complications, if data is being deleted
> from multiple tables concurrently.  (Over-simplifying question:  "When
> a table that participates in the sharing is vacuumed, does the shared
> index get vacuumed?  What if two such tables are vacuumed concurrently?")

This is probably postgresql-hackers knowledge, but I'd appreciate if you
elaborate: why "concurrent vacuum" of two table with common index is
such a problem?

1. why cannot it be "serialized on demand" in such case/exception (e.i
the case of tables being bond by a common index)? In other words, can
the initial concurrency be turned into serialized commands?

2. why common index cannot be implemented as "split files" - locking
with their respective tables the usual way? The problem of concurrent
locking would vanish at the expense of performance hit ... but if
someone desperately needs such "global index", I'd bet he/she will live
with performance hit. I would.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 16.09.2017 o 09:33, Jehan-Guillaume de Rorthais pisze:
> 
[-------------]
> 
> I wrote something about this some years ago, this might do the trick for you, maybe with some adjustments depending
onyour schema. The main idea should help anyway. See:
 
> 
> http://blog.ioguix.net/postgresql/2015/02/05/Partitionning-and-constraints-part-1.html
> 

For the document management schema I need a way to avoid partitioning of
"process-tables", and still be able to FK from it into the document
class forest of inherited tables. Just uniqueness of IDs across that
forest is not enough ... although nice to have, I admit. Good & handy
article. I'll keep a reference to it.

Thenx,

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Tomas Vondra
Date:
On 09/16/2017 11:57 AM, Rafal Pietrak wrote:
> 
> <disclaimer>
>     I'M NOT POSTGRES HACKER. THIS IS JUST NA INTUITION.
> </disclaimer>
> 
> W dniu 15.09.2017 o 21:30, Christopher Browne pisze:
>> On 15 September 2017 at 14:45, Adam Brusselback
>> <adambrusselback@gmail.com> wrote:
> 
> [-----------]
>>
>> b) Referencing which index entries can be dropped (e.g. - vacuumed out)
>> is a fair bit more complex because the index entries depend on multiple
>> tables.  This adds, um, concurrency complications, if data is being deleted
>> from multiple tables concurrently.  (Over-simplifying question:  "When
>> a table that participates in the sharing is vacuumed, does the shared
>> index get vacuumed?  What if two such tables are vacuumed concurrently?")
> 
> This is probably postgresql-hackers knowledge, but I'd appreciate if
> you elaborate: why "concurrent vacuum" of two table with common index
> is such a problem?
> 

Because it means vacuums on partitions will clash on the shared (global)
index, unlike today with only local indexes. Vacuum happens in about
three basic phases:

1) Remove dead tuples from tables, reduce them to item pointers and
stash the row IDs in a list.

2) Scan all indexes on the table, remove items with IDs on the list.

3) Scan the table again, finally removing the item pointers.

Currently this happens on partition level, so the individual vacuums
don't interfere by modifying the same index. With global indexes (at
least when implemented as a single object) that would no longer be the
case, and the consequences are a bit terrifying ...

> 1. why cannot it be "serialized on demand" in such case/exception
> (e.i the case of tables being bond by a common index)? In other
> words, can the initial concurrency be turned into serialized
> commands?
> 

It can. But serialization is pretty much the arch enemy of scalability.
Amdahl's law and all that ...

What's particularly bad on the vacuum serialization is that it does not
serialize client requests but maintenance tasks. That's quite dangerous
IMNSHO, as it may easily result in bloat explosion.

> 2. why common index cannot be implemented as "split files" - locking 
> with their respective tables the usual way? The problem of
> concurrent locking would vanish at the expense of performance hit ...
> but if someone desperately needs such "global index", I'd bet he/she
> will live with performance hit. I would.
> 

It can be implemented by split files. But that once again introduces
massive locking (imagine hundreds or thousands of partitions), and thus
overhead.

It's a bit foolish to say you would live with the performance hit when
you don't know how serious would it be. It may be fine but it may also
be much worse than you expected.

I ran into a number of people who used global indexes on other DBs, and
their position is mostly "I'll never use that, because it pretty much
eliminates all benefits of partitioning." So we need to thread pretty
carefully here, not to run into the same limitations.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 16.09.2017 o 12:43, Tomas Vondra pisze:
> On 09/16/2017 11:57 AM, Rafal Pietrak wrote:
>>
>> <disclaimer>
>>     I'M NOT POSTGRES HACKER. THIS IS JUST NA INTUITION.
>> </disclaimer>
>>
>> W dniu 15.09.2017 o 21:30, Christopher Browne pisze:
>>> On 15 September 2017 at 14:45, Adam Brusselback
>>> <adambrusselback@gmail.com> wrote:
[-----------]
>> This is probably postgresql-hackers knowledge, but I'd appreciate if
>> you elaborate: why "concurrent vacuum" of two table with common index
>> is such a problem?
>>
> 
> Because it means vacuums on partitions will clash on the shared (global)
> index, unlike today with only local indexes. Vacuum happens in about
> three basic phases:
> 
> 1) Remove dead tuples from tables, reduce them to item pointers and
> stash the row IDs in a list.
> 
> 2) Scan all indexes on the table, remove items with IDs on the list.
> 
> 3) Scan the table again, finally removing the item pointers.
> 
> Currently this happens on partition level, so the individual vacuums
> don't interfere by modifying the same index. With global indexes (at
> least when implemented as a single object) that would no longer be the
> case, and the consequences are a bit terrifying ...

This looks hard. True.

I'm wondering if the "global index" problem was deeply discussed before
(like on postgres-hackers) ... or discussions died out early due to all
this implementation obstacles?

In the later case, I'd like to tackle (shake) it a little here.
Obviously not to the point of implementation, but may be just to file
some pros and cons of one way or another to implement it.

So regarding the 1/2/3 above. Is it feasible (?) would it work (?) if:
1. concurrent vacuum is executed concurrently like today.
2. obviously all those descendent & being vacuumed tables will get a
lock by the vacuum process.
3. global index is at this point "opened", so access to other
descendents proceed?
4. the global index may experience some locking due to the above "other
descendent tables" actions, but tuple pointers relating to tables
currently being vacuumed are not effected.... no need to lock??
5. the concurrent processed doing (1) above may take long, but eventualy
they finish and all the collected lists of tuples for the "global index"
to purge are ready.
6. now the process of vacuuming the global index may proceed, as only
this index need updates, and it shouldn't take long - all the work
downstairs is done. Thus the global index will only get blocked for a
minimum amount of time.


[---------------]
> 
> It's a bit foolish to say you would live with the performance hit when
> you don't know how serious would it be. It may be fine but it may also
> be much worse than you expected.

very true. I admit.

But on the other hand, today you have a situation, without any tools for
such occasion. IMHO that's worse.

And with thousands of partitions we are talking billions of records.
This is not an average Joe's (like myself) scenario. Event if my 6
levels of inheritance rise 10-fold (which is very unlikely and on the
other hand ... very large), I'm way away from thousands.


> 
> I ran into a number of people who used global indexes on other DBs, and
> their position is mostly "I'll never use that, because it pretty much
> eliminates all benefits of partitioning." So we need to thread pretty
> carefully here, not to run into the same limitations.

True again.

And again, there should be a way to do it right :)


-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 16.09.2017 o 15:45, Adam Brusselback pisze:
> Here is the last discussion I saw on
> it:
https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru
>
<https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru>
> 

I can see the discussion is packed with implementation details. That's
promising :)

Thenx!

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Gmail
Date:

> On Sep 16, 2017, at 8:23 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>
>
>
> W dniu 16.09.2017 o 15:45, Adam Brusselback pisze:
>> Here is the last discussion I saw on
>> it:
https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru
>>
<https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru>
>>
>
> I can see the discussion is packed with implementation details. That's
> promising :)
>
> Thenx!
>
>
For those of us scoring at home, here's what I have:
1 You have a 6-deep hierarchy over 17 document typesyou concede that standard master-detail/inheritance accomplishes
whatyou need w.r.t to documents 
2 You don't have enough document instances to warrant partitioning
3 Your remaining problem is your workflow
4 You have an academic interest in multi-table indexing

Does  this sum up the situation?
Have you shared your current schema?









--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:
Dear robjsargent@gmail.com,

W dniu 16.09.2017 o 17:19, Gmail pisze:
> 
> 
>> On Sep 16, 2017, at 8:23 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>>
>>
>>
>> W dniu 16.09.2017 o 15:45, Adam Brusselback pisze:
>>> Here is the last discussion I saw on
>>> it:
https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru
>>>
<https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru>
>>>
>>
>> I can see the discussion is packed with implementation details. That's
>> promising :)
>>
>> Thenx!
>>
>>
> For those of us scoring at home, here's what I have:
> 1 You have a 6-deep hierarchy over 17 document types
>     you concede that standard master-detail/inheritance accomplishes what you need w.r.t to documents

I do have 17 "process tables" ... they are "class-B" tables, they DONT
need any hierarchy. One of them contain payment details and has FK do a
document (in one of the 12 tables of "class-A", which are in 6 levels of
hierachy) which this payment covers. They get multiplicated ONLY because
PK in those 12 "class-A" tables must be accessed separately. And those I
have. It goes like this:

CREATE TABLE T1 (id int, b date);    -- level 1
CREATE TABLE T2 (c text) INHERITS (T1);
CREATE TABLE T3 (d text) INHERITS (T1);
CREATE TABLE T4 (e text, tm date) INHERITS (T1);
CREATE TABLE T5 (f text) INHERITS (T1); -- level 2
CREATE TABLE T6 (ca text) INHERITS (T2);
CREATE TABLE T7 (db text, db2 text) INHERITS (T3);
CREATE TABLE T8 (ec text, db4 test) INHERITS (T4);
CREATE TABLE T9 (fd text) INHERITS (T5); -- level 3
CREATE TABLE T10 (db11 text) INHERITS (T7); -- level 4

... still counting? And I haven't yet touch any of the 12 leaf tables
I'm using.

> 2 You don't have enough document instances to warrant partitioning

I don't. My couple of thousands of documents is just nothing. I don't
have "enough documents" (that's why I say I can live with a performance
hit). true, true, true.

But I do have enough document-content variety for that. I could collapse
some of the hierarchy at the expense of some columns getting NULL for
certain rows - but that's just nonesens. I'm not doing that.

> 3 Your remaining problem is your workflow

Sorry I don't understand this point.

> 4 You have an academic interest in multi-table indexing
> 

Yes. so what?

As nobody have tried to advise me to change my schema, from my point of
view the discussion goes just fine. And I've got some very interesting
answers. Is there a problem with that?

What exactly are you trying to say?

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
rob stone
Date:

On Sat, 2017-09-16 at 19:18 +0200, Rafal Pietrak wrote:
> Dear robjsargent@gmail.com,
> 
> 
> I do have 17 "process tables" ... they are "class-B" tables, they
> DONT
> need any hierarchy. One of them contain payment details and has FK do
> a
> document (in one of the 12 tables of "class-A", which are in 6 levels
> of
> hierachy) which this payment covers. They get multiplicated ONLY
> because
> PK in those 12 "class-A" tables must be accessed separately. And
> those I
> have. It goes like this:
> 
> 
Hello Rafal,


I've been trying to follow this discussion but now I'm totally
confused. (Some people might say that this is my normal state.)

However, what do you mean by the following:-

17 "process tables"?

multiplicated -- does this mean replicated?

any of the 12 leaf tables I'm using  -- what is a "leaf" table?

collapse some of the hierarchy at the expense of some columns getting
NULL for certain rows  --  does this mean if you have two input fields
(field A and field B) that if field A is not null and field B is null
the data is inserted into one table and if it's the inverse you insert
into an entirely different table?



IMHO, you need an UML diagram that not only sets out your workflow but
will also provide the basis for your schema.
Keep in mind that a hierarchy can be 'n' tables deep. The foreign keys
point back upwards until you finally reach the parent.

You mention payments being made. Users make mistakes. They can post a
payment to the wrong account and later it has to be reversed. These
things can be modelled via your UML diagram.

Cheers,
Rob


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Gmail
Date:

> On Sep 16, 2017, at 11:18 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>
> Dear robjsargent@gmail.com,
>
> W dniu 16.09.2017 o 17:19, Gmail pisze:
>>
>>
>>> On Sep 16, 2017, at 8:23 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>>>
>>>
>>>
>>> W dniu 16.09.2017 o 15:45, Adam Brusselback pisze:
>>>> Here is the last discussion I saw on
>>>> it:
https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru
>>>>
<https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4464@postgrespro.ru>
>>>>
>>>
>>> I can see the discussion is packed with implementation details. That's
>>> promising :)
>>>
>>> Thenx!
>>>
>>>
>> For those of us scoring at home, here's what I have:
>> 1 You have a 6-deep hierarchy over 17 document types
>>    you concede that standard master-detail/inheritance accomplishes what you need w.r.t to documents
>
> I do have 17 "process tables" ... they are "class-B" tables, they DONT
> need any hierarchy. One of them contain payment details and has FK do a
> document (in one of the 12 tables of "class-A", which are in 6 levels of
> hierachy) which this payment covers. They get multiplicated ONLY because
> PK in those 12 "class-A" tables must be accessed separately. And those I
> have. It goes like this:
>
> CREATE TABLE T1 (id int, b date);    -- level 1
> CREATE TABLE T2 (c text) INHERITS (T1);
> CREATE TABLE T3 (d text) INHERITS (T1);
> CREATE TABLE T4 (e text, tm date) INHERITS (T1);
> CREATE TABLE T5 (f text) INHERITS (T1); -- level 2
> CREATE TABLE T6 (ca text) INHERITS (T2);
> CREATE TABLE T7 (db text, db2 text) INHERITS (T3);
> CREATE TABLE T8 (ec text, db4 test) INHERITS (T4);
> CREATE TABLE T9 (fd text) INHERITS (T5); -- level 3
> CREATE TABLE T10 (db11 text) INHERITS (T7); -- level 4
>
- quick question: do you have more than one table inheriting from T2?

We all skin cats different ways, of course.
Often with such things, especially with ORM tools, one puts a "type"  field in the master table indicating which exact
implementationif referenced for each row. Helps in reporting counts as well.   I assume the columns "c text" are just
placeholdersfor discussion, not the actual structure.    
> ... still counting? And I haven't yet touch any of the 12 leaf tables
> I'm using.
>
>> 2 You don't have enough document instances to warrant partitioning
>
> I don't. My couple of thousands of documents is just nothing. I don't
> have "enough documents" (that's why I say I can live with a performance
> hit). true, true, true.
>
> But I do have enough document-content variety for that. I could collapse
> some of the hierarchy at the expense of some columns getting NULL for
> certain rows - but that's just nonesens. I'm not doing that.
We pick our poisons, I guess.  I sure might be tempted to denormalize some of those if it made life much easier/faster.
Hard to say from what's been presented - still assuming we're seeing pseudo-tables. 
>
>> 3 Your remaining problem is your workflow
>
> Sorry I don't understand this point.

I meant that your document tables are fine (as described above) and that you were still having trouble with the
persistenceaspects of what happens to the documents. 
>
>> 4 You have an academic interest in multi-table indexing
>>
>
> Yes. so what?
>
Nothing.  Just trying to see if I'm following your thread.
> As nobody have tried to advise me to change my schema, from my point of
> view the discussion goes just fine. And I've got some very interesting
> answers. Is there a problem with that?
>
> What exactly are you trying to say?
> -R
>
All your documents are represented in your "T1" table.  So your processing can always refer to that table - which is
excellent. T1 guarantees unique ids across all other T-tables.  Activity related records have no need to be under the
sameunique ID space (though personally I'm a fan of UUID anyway).  I'm not seeing where you would benefit from the
titleof this thread. 



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 16.09.2017 o 22:20, rob stone pisze:
> 
> 
> On Sat, 2017-09-16 at 19:18 +0200, Rafal Pietrak wrote:
>> Dear robjsargent@gmail.com,
[-------------]
>>
>>
> Hello Rafal,

Hi,

> 
> 
> I've been trying to follow this discussion but now I'm totally
> confused. (Some people might say that this is my normal state.)
> 
> However, what do you mean by the following:-
> 
> 17 "process tables"?
> 
> multiplicated -- does this mean replicated?

To pinpoint the schema design I'm reducing the example complexity just
to  one level of hierarchy in "class-A", and just to one table in
"class-B". It goes like this:

CREATE TABLE T1 (id int, tm timestamp)
CREATE TABLE T2 (me int, him int) INHERITS (T1) -- level 1
CREATE TABLE T3 (me int, him int) INHERITS (T1)

ALTER TABLE T2 ADD CONSTRAINT t1_pk PRIMARY KEY(me,id)
ALTER TABLe T3 ACC CONSTRAINT t1_pk PRIMARY KEY(him,id)

the "logic" of processing requires here just one class-B table:
CREATE TABLE BP (id int references t1(id), info text)

but i have to create two tables instead:
CREATE TABLE BP2 (id int, me int, him int, info text, FOREIGN KEY
(id,me) references t2(id,me));
CREATE TABLE BP2 (id int, me int, him int, info text, FOREIGN KEY
(id,him) references t3(id,him));

That's the multiplication.

Every class-B table has to have the same inheritance structure as
class-A tables (frankly, not all but actually only the most, but that's
not really important).

> 
> any of the 12 leaf tables I'm using  -- what is a "leaf" table?
> 
> collapse some of the hierarchy at the expense of some columns getting
> NULL for certain rows  --  does this mean if you have two input fields
> (field A and field B) that if field A is not null and field B is null
> the data is inserted into one table and if it's the inverse you insert
> into an entirely different table?

I had this exact impression some time ago, so I've rewritten my schema
to "flat table". This didn't work. I've lost some of the constraints I
use to keep the data in order.

So I've rewritten the schema back to the inheritance hierarchy. And it
wasn't just a rollback, as the application lived with the new layout for
something over half a year and new things got implemented along the road.

Both rewrites costed me like 2 month *intense* work each. I've learned,
that current layout suits its usage best. The "features not covered
properly" are now reduced to just this single "global index problem".
And this problem is just "slower development" - I'm reluctant to put new
features, because the schema is so unnecessarily complex now. Well, be it.

> 
> 
> 
> IMHO, you need an UML diagram that not only sets out your workflow but
> will also provide the basis for your schema.
> Keep in mind that a hierarchy can be 'n' tables deep. The foreign keys
> point back upwards until you finally reach the parent.

No, really, pls don't.

I've spend A LOT of time figuring out (and numerous times taking the
wrong turns) of the optimal schema.

This is not helping.

> 
> You mention payments being made. Users make mistakes. They can post a
> payment to the wrong account and later it has to be reversed. These
> things can be modelled via your UML diagram.
> 

This kind of events just go to history table.

Meaning FK in payment table get updated, while the old value is
log-registered in history table ... that log-table does not have any FK
constraints, as it's ment to be just an audit trail.

Really. All this works just fine.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 17.09.2017 o 03:30, Gmail pisze:
> 
> 
>> On Sep 16, 2017, at 11:18 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>>
>> Dear robjsargent@gmail.com,
>>
>> W dniu 16.09.2017 o 17:19, Gmail pisze:
>>>
>>>
>>>> On Sep 16, 2017, at 8:23 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

[---------------------]
>> PK in those 12 "class-A" tables must be accessed separately. And those I
>> have. It goes like this:
>>
>> CREATE TABLE T1 (id int, b date);    -- level 1
>> CREATE TABLE T2 (c text) INHERITS (T1);
>> CREATE TABLE T3 (d text) INHERITS (T1);
>> CREATE TABLE T4 (e text, tm date) INHERITS (T1);
>> CREATE TABLE T5 (f text) INHERITS (T1); -- level 2
>> CREATE TABLE T6 (ca text) INHERITS (T2);
>> CREATE TABLE T7 (db text, db2 text) INHERITS (T3);
>> CREATE TABLE T8 (ec text, db4 test) INHERITS (T4);
>> CREATE TABLE T9 (fd text) INHERITS (T5); -- level 3
>> CREATE TABLE T10 (db11 text) INHERITS (T7); -- level 4
>>
> - quick question: do you have more than one table inheriting from T2?

Pls pls pls. Don't. This leads nowhere.

What if I say "yes" ... but for the same question regarding T3 you would
get: "no".

This is just and indication how incorrect your "counting at home" is.

> 
> We all skin cats different ways, of course.
> Often with such things, especially with ORM tools, one puts a "type"  field in the master table indicating which
exactimplementation if referenced for each row. Helps in reporting counts as well.   I assume the columns "c text" are
justplaceholders for discussion, not the actual structure.   
 

YES!

>> ... still counting? And I haven't yet touch any of the 12 leaf tables
>> I'm using.
>>
>>> 2 You don't have enough document instances to warrant partitioning
>>
>> I don't. My couple of thousands of documents is just nothing. I don't
>> have "enough documents" (that's why I say I can live with a performance
>> hit). true, true, true.
>>
>> But I do have enough document-content variety for that. I could collapse
>> some of the hierarchy at the expense of some columns getting NULL for
>> certain rows - but that's just nonesens. I'm not doing that.
> We pick our poisons, I guess.  I sure might be tempted to denormalize some of those if it made life much
easier/faster. Hard to say from what's been presented - still assuming we're seeing pseudo-tables.
 

I've been there ... and back. If I may, I'd like to avoid any discussion
of whether my schema is sound or it stinks.

As you may have noticed, I've put significant effort to focus the
discussion on my actual question: the "global index" (which btw I didn't
know is called this way here - if I new, I'd probably could have google
it instead). This was intentional. I like my schema design very much and
I'm unwilling to part with it.

>>
>>> 3 Your remaining problem is your workflow
>>
>> Sorry I don't understand this point.
> 
> I meant that your document tables are fine (as described above) and that you were still having trouble with the
persistenceaspects of what happens to the documents.
 

Hmmm... still not so clear. But I'll try:

1. I don't really have problems with "persistence aspects ... of what
happens". This I've implemented by copying the inheritance structure of
"class-A" tables into "class-B" tables. "persistence" is kept in order
by numerous FKs.

2. I do have a problem, that this results in my schema "being larger
then expected" ... but that's just a nuisance - the developer of new
feature (e.i myself) has more headaches then it should.

If that's what you asked for, so the answer is: "No I don't have a
problem with my workflow".

>>
>>> 4 You have an academic interest in multi-table indexing
>>>
>>
>> Yes. so what?
>>
> Nothing.  Just trying to see if I'm following your thread.
>> As nobody have tried to advise me to change my schema, from my point of
>> view the discussion goes just fine. And I've got some very interesting
>> answers. Is there a problem with that?
>>
>> What exactly are you trying to say?
>> -R
>>
> All your documents are represented in your "T1" table.  So your processing can always refer to that table - which is
excellent. T1 guarantees unique ids across all other T-tables.  Activity related records have no 
 

;7

no, it doesn't.

T1 is empty. It's just a head of inheritance tree.

There is no guarantee (index on T1 will have no entries). But naturally
there are ways to "smartly" partition the ID space allocated to
subtables of T1.

Only I was asking for database tools to GUARANTEE that. And across of
future UPDATEs too (not just INSERTS).

need to be under the same unique ID space (though personally I'm a fan
of UUID anyway).  I'm not seeing where you would benefit from the title
of this thread.

To some extend, you are right here.

And missing the "global unique ID" (as per subject of the posted
question) I'm currently using "a sort of" this king of workaround. And
in fact one of the responses in this thread showed me a better way to
implement that workaround - this is really, really good.

But regretably it doesn't help me avoid the explosion of the number of
required "class-B" tables. There is no way to make FK from those
"class-B" tables into subtables of T1 ... no matter how smart I'll be in
partitioning the ID space among them.

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Gmail
Date:
>
> As you may have noticed, I've put significant effort to focus the
> discussion on my actual question: the "global index" (which btw I didn't
> know is called this way here - if I new, I'd probably could have google
> it instead). This was intentional. I like my schema design very much and
> I'm unwilling to part with it.
>
>
> no, it doesn't.
>
> T1 is empty. It's just a head of inheritance tree.
>
> There is no guarantee (index on T1 will have no entries). But naturally
> there are ways to "smartly" partition the ID space allocated to
> subtables of T1.
>
>
OK. Wow, that's sure not how I read the docs on inheritance, but I've never used the construct thinking it was largely
syntacticsugar on master/detail based scheme designs. 

But since you're wed irrevocably to your scheme design, I'll bow out of this discussion.
All the best,
rjs




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
"Jehan-Guillaume (ioguix) de Rorthais"
Date:
On Sat, 16 Sep 2017 12:15:46 +0200
Rafal Pietrak <rafal@ztk-rp.eu> wrote:

> W dniu 16.09.2017 o 09:33, Jehan-Guillaume de Rorthais pisze:
> >   
> [-------------]
> > 
> > I wrote something about this some years ago, this might do the trick for
> > you, maybe with some adjustments depending on your schema. The main idea
> > should help anyway. See:
> > 
> > http://blog.ioguix.net/postgresql/2015/02/05/Partitionning-and-constraints-part-1.html
> >   
> 
> For the document management schema I need a way to avoid partitioning of
> "process-tables", and still be able to FK from it into the document
> class forest of inherited tables. Just uniqueness of IDs across that
> forest is not enough ... although nice to have, I admit. Good & handy
> article. I'll keep a reference to it.

Well, as I wrote in my email, it requiers some adjustments IRW your schema,
meaning you'll have to hack around to make it fit your need.

About FK, just look at pgsql trigger code around real FK and adjust to your
need by creating custom FK trigger either in plpgsql or C. IIRC, there's a
contrib module doing FK as well, look at "contrib/spi/refint.c".


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] looking for a globally unique row ID

From
Rafal Pietrak
Date:

W dniu 18.09.2017 o 17:26, Jehan-Guillaume (ioguix) de Rorthais pisze:
> On Sat, 16 Sep 2017 12:15:46 +0200
> Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> 
>> W dniu 16.09.2017 o 09:33, Jehan-Guillaume de Rorthais pisze:

[-------------]
> 
> About FK, just look at pgsql trigger code around real FK and adjust to your
> need by creating custom FK trigger either in plpgsql or C. IIRC, there's a
> contrib module doing FK as well, look at "contrib/spi/refint.c".
> 
> 

OK. I'll look into it, too. But the "discovery" of
pg_advisory_xact_lock() itself is a revelation for me :)

Thenx,

-R


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general