distibuted transactions, SQL+XPath+XTree - Mailing list pgsql-hackers

From Тюрин Дмитрий
Subject distibuted transactions, SQL+XPath+XTree
Date
Msg-id 1582589687.20080220082204@narod.ru
Whole thread Raw
Responses Re: distibuted transactions, SQL+XPath+XTree  (Richard Huxton <dev@archonet.com>)
List pgsql-hackers
Hi list,

I see the following business opportunity for Postgres:
I) Simple man can't program middleware to connect XML-client and
Postgres.
II) Request into several databases does not exist.
III) Notebooks need several switching-on and switching-off during
transaction.
IV) Distance between strings are not supported, that makes
aproximate searching impossible.
V) There is no possibility to hide some (not all) records of table,
granted to other users, from these users

Proposed solutions to these opportunities:
I) DBMS inserts into tables and selects from tables along correlation
Primary Key - Foreign Key.
II) Databases get nicknames, groups of databases get name of group.
These names are used in requests.
III) Operator to freeze transaction.
IV) Operator to order records on base of distance between strings.
V) Subdivide records of all tables into classes, specify class
number in record.

I ask you to implement these solutions, that Postgres get
advantage before other DBMS-es. I have prepered several drawing
http://sql50.euro.ru/sql5.11.3.ppt to explain ideas.
More details are described below.

===

1) [slides 2-12] Problem: Browser is very widespread client in epoch of internet.
Non-programmers can master 'insert', 'select', 'update', 'delete',
but are not capable to use sophisticated syntax of proprietary
web-server for input of XML. It's necessary to exclude this
syntax, and give possibility to install DBMS and immediately
use it, like user install and use Teleport, FlashGet, browser
and so on.
 Solution: DBMS itself must communicate via HTTP, accept XML, and
place data from it into tables under some agreement. My proposal
about agreement:
*) xml-element is written into table with identical name (i.e.
tag name coincides with table name), xml-attribute is written
into field with identical name (i.e. attribute name coincides
with field name);
*) tables are bound into tree by values of Primary Keys and
Foreign Keys;
*) value of primary keys of new record is assigned by trigger. If user uses simple scheme, than this is enough! An
ambiguity
can exist in complex scheme because of several refering fields,
than user must append symbol '#' and name of necessary refering
field to end of name of sending XML-tag (it looks like new
tag name with symbol '#' inside name). Let's name this by
term 'determination' [symbol '$' is used for list to have
possibility to solve ambiguity for list simultaneously with
ambiguity for enclosed XML-element, i.e. to append two refering
field to name of sending XML-tag].
 P.S. [slides 13-21] Of course, we spead decision to manual 'insert'.

2) [slides 22-31] Problem: Usage of both SQL/XML-functions, and syntax of proprietary
web-server give very bulky code to extract tree as XML. This
makes more difficulties for contact of DBMS on CML, GML,
HumanML, OPML, RCML, SBML, ebXML, MDDL, RIXML, XBRL, xCBL and
other (turing all relational fields into XML-elements is
suitable for browser, but not suitable for other cases).
 Solution: To avoid sophisticated programing, 'select' itself must return
data to client (if only 'select' is not used inside 'insert ...
select ...'). I propose laconic 'select a.b.c' to select data
from tables 'a', 'b', 'c'. Let's name this by term 'XTree' -
in analogy with 'XPath'. If user uses simple scheme, than this is enough! An ambiguity
can exist in complex scheme because of several refering fields,
than user must append symbol '#' and name of necessary refering
field to end of table name (it looks like new table name with
symbol '#' inside name). Let's name this by term 'refinement'.
 P.S. [slides 32-39] All possible compositions of determinations in XML-tree and
all possible compositions of refinement in 'select' are
considered, appropriate XML- and SQL-syntaxes are proposed. P.S. [slides 40-49] Examples of usage of refinement are
demonstrated.

3) [slides 50-58] Problem: Non-predictable/non-repeated input data (XML-elements) is
written into XML-field of relational table. XQuery is offered
to process data in these xml-fields. But user is not capable
to manipulate records by SQL and XML-elements by XQuery in
one request (even in case of refusal from relational storage
in favour of XML-database, that means in favour of
non-relational 'engine', enclosed cycles of XQuery create very
bulky code, in which user is not orientated).
 Solution: I propose to append XPath into SQL, that SQL can process
XML-elements and attributes (i.e. to avoid XQuery). Thus SQL
can process records and XML-elements simultaneously.
 P.S. [slides 59-72] Of course, we generalize XPath and XTree upto XLang, and
consider all possible use cases.

---

4) [slides 73-83-116] Problem: SQL would more flexible and convenient for distributed
request (gethering data from several databases and scattering
them into several databases), than branded programs; including
SQL is more convenient for replication, than branded programs.
But there is no necessary syntax.
 Solution: Each database has nickname. Nicknames are specified in
requests as prefix before table name. Group of databases is named society. Name of society also
can be specified as prefix before table name, and means
nicknames of all databases of group. Thus one SQL-statement,
containing society name, means a great number of SQL-statements
with nicknames. That nicknames, several societies or several
mentions of one society don't specify the same database
simultaneously, we place symbol '%' before them (let's name so
prefix as restricted prefix). That several mentions of one
society synchronously specify the same database, we place any
(identical) word and symbol '%' before this mentions (let's
name this word as marker, and this prefix as marked prefix). 'Default' database is database, in which all nicknames
and
societies are stored. And prefix 'all' means all databases,
known for default database.  Nickname can has numeric parameter NID (nick identifier).
It is not accessable to change in requests to process data,
and is designated as '%%'.
 In purpose of security, distributed requests must satisfy
some requirements. I propose whole mistrust to DBMS:
*) database does not store login of other database (that to  not give foreign login at crack)
*) database does not edit (update, insert, delete) other  database (that temporaty access to other DBMS, got at  crack,
can'tdestroy other database)
 
*) database does not get data from other database, if it’s  possible (that data from other database not become
accessabletoo at crack)
 
And i also propose to expect quite simplicity of client:
*) client does not know SQL (it can’t simplify or decompose  SQL).
So DBMS-1 can't create and enter SQL-command into DBMS-2
directly or indirectly (asking client to forward command).
And client can't derive SQL-command on base of entered
SQL-command (all, what it has, is last SQL-command, stored
in own stack). I propose to DBMS-1 to transfer __XML__-commands to client,
which force client to make simple (string) transformation of
SQL-command, stored in client stack, and send result of
transformation into DBMS-2. Transformations must be so limited,
that to not allow appearance of SQL-command, harm for DBMS-2.
I propose to arrange these XML-commands as <?name?> to
distinguish them from XML-data (traditionaly arranged as
<name>).

5) [slide 118] Problem: User makes transaction from notebook, and needs to switch-off
notebook without commit or rollback of transaction to continue
transaction from left stage at next switching-on.
 Solution: I propose command 'freeze', similar to command 'disconnet',
which save transaction in current state; and command 'unfreeze',
similar to command 'connect', which continue frozen transaction
(instead to start new transaction). 'Freeze' returns identifier
of frozen transaction, which should be used in 'Unfreeze'.
 P.S. [slide 119] Now savepoint can be used only to rollback to it. I propose
command 'commit savepoint' (commit all actions, made before
savepoint), that is useful in much cases before command
'freeze'. 
6) [slide 120] Problem: Commiting of distrubited transaction (being executed in
several databases) is not fails in all databases at once.
It's not reasonable to rollback transaction in databases,
which remain healthy, to begin transaction in them from
very beginning - it's reasonable to wait repearing of failed
databases to commit transaction together with them.
So we need to freeze command 'commit' in healthy databases in
process of executing it (as well as in case of freezing
transaction, let client messages will be SQL-commands, and
server messages will be XML-commands).
 Solution: I propose to enter client message 'postpone' to freeze commit
on second phase, and client message 'adjourn' to freeze on third
phase.

7) [slide 124] Problem: At stream processing (when new records enter quickly,
in much quantity), it's necessary to execute aggregate only
on several last entered records (to organize slip slot), but
creating index on field, sequencing records, sorting on this
index with purpose to cut only needed quantity of records
brakes processing of stream.
 Solution: Limit quantity of records in a table, make queue of records,
automatically delete records from beginning of queue at
arrival of new records - and start aggregates for all records
of such specially orginized table. To save records,
automatically deleted from beginning of the queue, it's
possible to copy them automatically into other usual table
(which will save them permanently).

8) [slides 125-129] Problem: Distance between strings are not supported, that makes
search of similar strings and ordering by degree of
resemblance.
 Solution: Method of calculation of distance between strings and
operator, ordering records by this factor.

9) [slide 132] Problem: No possibility to make some (personal) records unaccessable
for other users.
 Solution: Subdivide records of all tables into not crossing classes,
specify number of class in special field of records.




Dmitry Turin
HTML6     (6. 5.4)  http://html60.euro.ru
SQL5      (5.11.3)  http://sql50.euro.ru
Unicode7  (7. 2.1)  http://unicode70.euro.ru
Computer2 (2. 0.2)  http://computer20.euro.ru



pgsql-hackers by date:

Previous
From: Robert Treat
Date:
Subject: Re: Permanent settings
Next
From: tomas@tuxteam.de
Date:
Subject: Re: Permanent settings