Thread: Plans for 2.8

Plans for 2.8

From

Daniele Varrazzo

Date:

04 October 2018, 15:38:03

Hello,

next week I will have some time and could maybe finish the work needed
to release psycopg 2.8. I am aware there are several features which have
been waiting for some time. The state of the work can be seen in [1].

[1] https://github.com/psycopg/psycopg2/issues?q=milestone%3A"psycopg+2.8"

Psycopg 2.8 will only support Python 2.7 and 3.4+. The codebase has
been heavily hammered by Jon Dufresne who has killed a lot of Py2isms
and the whole use of 2to3 is now replaced by a minimal compatibility
module. So Python is now as modern as C in supporting both 2 and 3
with a single codebase :P Other deprecated and unused objects have
also been dropped, see the news file [2]. Fog, if you can take a look
at examples/sandbox and delete what's no longer required there (#645),
that would be great :)

[2] https://github.com/psycopg/psycopg2/blob/master/NEWS

The feature I'm most excited about (and most worried about, as far as
its reception goes) is raising a different exception for every postgres
error message (see #682). For instance `SELECT * FROM wrong_name` will
raise `UndefinedTable` rather than `ProgrammingError`. Currently
handling a specific exception requires catching a broader class and
looking at the pgcode:

    try:
        cur.execute("lock table %s in access exclusive mode nowait" % name)
    except psycopg2.OperationalError as e:
        if e.pgcode == psycopg2.errorcodes.LOCK_NOT_AVAILABLE:
            locked = True
        else:
            raise

This can become the much more natural:

    try:
        cur.execute("lock table %s in access exclusive mode nowait" % name)
    except psycopg2.errors.LockNotAvailable:
        locked = True

The error classes are generated automatically from Postgres source
code and are subclasses of the previously existing ones, so existing
code should be unaffected. I'd be happy to have input about the
feature and suggestions before releasing it.
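
To illustrate why existing handlers should keep working, here is a
minimal sketch using stand-in classes rather than psycopg2 itself, so it
runs without a database. The class names mirror the ones mentioned
above, but the hierarchy shown and the `lock_table` helper are
assumptions made for the example, not the actual implementation.

```python
# Stand-in classes: NOT the real psycopg2 ones, just the same shape.
class Error(Exception):
    pgcode = None  # SQLSTATE reported by the server


class OperationalError(Error):
    pass


class LockNotAvailable(OperationalError):
    pgcode = "55P03"  # SQLSTATE for lock_not_available


def lock_table(fail=True):
    """Hypothetical helper that pretends the NOWAIT lock failed."""
    if fail:
        raise LockNotAvailable("could not obtain lock on relation")


def old_style():
    # Existing code: catch the broad class, then inspect pgcode.
    try:
        lock_table()
    except OperationalError as e:
        if e.pgcode == "55P03":
            return True
        raise
    return False


def new_style():
    # New code: catch the specific subclass directly.
    try:
        lock_table()
    except LockNotAvailable:
        return True
    return False
```

Both functions detect the failed lock: the old handler still catches the
error because the specific class derives from the broad one.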

A tiny improvement to SQL generation is already ready^W merged in
#732: it will be possible to use `Identifier("schema", "name")` which
would be rendered in dotted notation in the query. Currently
`Identifier()` takes a single param so this extension is backward
compatible and there is no need to introduce a new `Composable` type
to represent dotted sequences of identifiers.
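
As a rough sketch of the rendering described above: each part is quoted
on its own and the parts are joined with dots. `quote_ident` and
`render_identifier` are hypothetical stand-ins written for this example;
the real `Identifier` relies on the connection to do the quoting.

```python
def quote_ident(name):
    """Quote one identifier: wrap it in double quotes and double any
    embedded double quote, as PostgreSQL expects."""
    return '"' + name.replace('"', '""') + '"'


def render_identifier(*names):
    """Hypothetical stand-in for how a multi-part Identifier might render."""
    if not names:
        raise TypeError("at least one identifier part is required")
    return ".".join(quote_ident(n) for n in names)


# A single part behaves as before; several parts become dotted.
single = render_identifier("name")
dotted = render_identifier("schema", "name")
```

Since the single-argument form renders exactly as it did before, adding
more positional arguments does not change existing calls.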

There are requests to get extra info about the connection or the
result (see #726, #661). They are reasonable and not too difficult to
implement, so I'd like to give them a go. However, they are also easy
enough for someone else to contribute, if you feel like it: that would
be much appreciated and would reduce the amount of work on my part.
Another tiny feature would be to support IntEnum out-of-the-box
(#591), which I've never used in Python.
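
For reference, a small sketch of what IntEnum support might involve,
assuming the goal is for a member to be passed to the database as its
integer value. An `IntEnum` member is an `int`, but its repr is
something like `<Priority.HIGH: 2>`, so an adapter would have to go
through `int()`. The `adapt_int_enum` function is hypothetical and not
part of psycopg2.

```python
from enum import IntEnum


class Priority(IntEnum):
    LOW = 1
    HIGH = 2


def adapt_int_enum(value):
    """Hypothetical adapter: render the member as a plain integer literal."""
    return str(int(value))


literal = adapt_int_enum(Priority.HIGH)
```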

In the other thread these days we have discussed introducing
capsules: we can take a look at that too... Added #782.

Thank you very much for any contribution, with ideas and even more with code :)

-- Daniele

Re: Plans for 2.8

From

Federico Di Gregorio

Date:

04 October 2018, 16:27:20

On 10/04/2018 02:38 PM, Daniele Varrazzo wrote:
> A tiny improvement to SQL generation is already ready^W merged in
> #732: it will be possible to use `Identifier("schema", "name")` which
> would be rendered in dotted notation in the query. Currently
> `Identifier()` takes a single param so this extension is backward
> compatible and there is no need to introduce a new `Composable` type
> to represent dotted sequences of identifiers.

I understand that from a compatibility point of view everything works
with the "schema", "name" order of arguments (you just switch on the
number of arguments) but usually such an approach causes infinite
headaches when you remove or add the namespace from the call.

`Identifier(name, schema=None)` is better IMHO, because it makes explicit
that the mandatory and first argument is always the identifier itself,
while the schema is optional.

federico

-- 
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it
      One key. One input. One enter. All right. -- An american consultant
            (then the system crashed and took down the *entire* network)

Re: Plans for 2.8

From

Rory Campbell-Lange

Date:

04 October 2018, 16:55:01

On 04/10/18, Daniele Varrazzo (daniele.varrazzo@gmail.com) wrote:
> The feature I'm the most excited about (and worried about its
> reception) is to raise a different exception for every postgres error
> message (see #682) . For instance `SELECT * FROM wrong_name` will
> raise `UndefinedTable` rather than `ProgrammingError`. Currently
> handling a specific exception requires catching a broader class and
> looking at the pgcode:
> 
>     try:
>         cur.execute("lock table %s in access exclusive mode nowait" % name)
>     except psycopg2.OperationalError as e:
>         if e.pgcode == psycopg2.errorcodes.LOCK_NOT_AVAILABLE:
>             locked = True
>         else:
>             raise
> 
> This can become a much more natural:
> 
>     try:
>         cur.execute("lock table %s in access exclusive mode nowait" % name)
>     except psycopg2.errors.LockNotAvailable:
>             locked = True
> 
> The error classes are generated automatically from Postgres source
> code and are subclasses of the previously existing ones, so existing
> code should be unaffected. I'd be happy to have input about the
> feature and suggestions before releasing it.

Hi Daniele

The greater depth of exception reporting looks great to me, particularly
if they are subclasses of the existing ones.

Regards
Rory

Re: Plans for 2.8

From

Daniele Varrazzo

Date:

04 October 2018, 17:05:51

On Thu, Oct 4, 2018 at 2:27 PM Federico Di Gregorio <fog@dndg.it> wrote:
>
> On 10/04/2018 02:38 PM, Daniele Varrazzo wrote:
> > A tiny improvement to SQL generation is already ready^W merged in
> > #732: it will be possible to use `Identifier("schema", "name")` which
> > would be rendered in dotted notation in the query. Currently
> > `Identifier()` takes a single param so this extension is backward
> > compatible and there is no need to introduce a new `Composable` type
> > to represent dotted sequences of identifiers.
>
> I understand that from a compatibility point of view everything works
> with the "schema", "name" order of arguments (you just switch on the
> number of arguments) but usually such approach causes infinite headaches
> when you remove or add the namespace from the call.
>
> `Identifier(name, schema=None)` is better, IMHO because makes explicit
> that the mandatory and first argument is always the identifier itself,
> while the schema is optional.

"schema", "table" is only an example: it could be "table"."field",
even "schema"."table"."field", or "extension"."setting"... The object
only wants to represent a dotted sequence of identifiers, at lexical
level, nothing with semantics attached such as "an optionally
schema-qualified table name" or "a field name". If the object was
`Table()` or `Field()` rather than `Identifier()` I'd totally agree
with you.

-- Daniele

Re: Plans for 2.8

From

Federico Di Gregorio

Date:

04 October 2018, 17:08:54

On 10/04/2018 04:05 PM, Daniele Varrazzo wrote:
> On Thu, Oct 4, 2018 at 2:27 PM Federico Di Gregorio<fog@dndg.it>  wrote:
>> On 10/04/2018 02:38 PM, Daniele Varrazzo wrote:
>>> A tiny improvement to SQL generation is already ready^W merged in
>>> #732: it will be possible to use `Identifier("schema", "name")` which
>>> would be rendered in dotted notation in the query. Currently
>>> `Identifier()` takes a single param so this extension is backward
>>> compatible and there is no need to introduce a new `Composable` type
>>> to represent dotted sequences of identifiers.
>> I understand that from a compatibility point of view everything works
>> with the "schema", "name" order of arguments (you just switch on the
>> number of arguments) but usually such approach causes infinite headaches
>> when you remove or add the namespace from the call.
>>
>> `Identifier(name, schema=None)` is better, IMHO because makes explicit
>> that the mandatory and first argument is always the identifier itself,
>> while the schema is optional.
> "schema", "table" is only an example: it could be "table"."field",
> even "schema"."table"."field", or "extension"."setting"... The object
> only wants to represent a dotted sequence of identifiers, at lexical
> level, nothing with semantics attached such as "an optionally
> schema-qualified table name" or "a field name". If the object was
> `Table()` or `Field()` rather than `Identifier()` I'd totally agree
> with you.

Sorry, I misread your example. Obviously you're right.

federico

p.s. yep, I'll remove all the old cruft from sandbox.

-- 
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it
                       The number of the beast: vi vi vi. -- Delexa Jones

Re: Plans for 2.8

From

Mike Bayer

Date:

04 October 2018, 17:18:08

On Thu, Oct 4, 2018 at 8:38 AM Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:
>
> Hello,
>
> next week I will have some time and maybe could end all the work to
> release psycopg 2.8. I am aware there are several features which have
> been awaiting for some time. The state of the work can be seen in [1].
>
> [1] https://github.com/psycopg/psycopg2/issues?q=milestone%3A"psycopg+2.8"
>
> Psycopg 2.8 will only support Python 2.7 and 3.4+. The codebase has
> been heavily hammered by Jon Dufresne who has killed a lot of Py2isms
> and the whole use of 2to3 is now replaced by a minimal compatibility
> module. So Python is now as modern as C in supporting both 2 and 3
> with a single codebase :P Other deprecated and unused objects have
> also been dropped, see the news file [2]. Fog, if you can take a look
> at examples/sandbox and delete what's no more required there (#645),
> that would be great :)
>
> [2] https://github.com/psycopg/psycopg2/blob/master/NEWS
>
> The feature I'm the most excited about (and worried about its
> reception) is to raise a different exception for every postgres error
> message (see #682) . For instance `SELECT * FROM wrong_name` will
> raise `UndefinedTable` rather than `ProgrammingError`. Currently
> handling a specific exception requires catching a broader class and
> looking at the pgcode:
>
>     try:
>         cur.execute("lock table %s in access exclusive mode nowait" % name)
>     except psycopg2.OperationalError as e:
>         if e.pgcode == psycopg2.errorcodes.LOCK_NOT_AVAILABLE:
>             locked = True
>         else:
>             raise
>
> This can become a much more natural:
>
>     try:
>         cur.execute("lock table %s in access exclusive mode nowait" % name)
>     except psycopg2.errors.LockNotAvailable:
>             locked = True
>
> The error classes are generated automatically from Postgres source
> code and are subclasses of the previously existing ones, so existing
> code should be unaffected. I'd be happy to have input about the
> feature and suggestions before releasing it.

I can't provide any suggestions, as the feature is very reasonable and
useful. But I will lament that PEP 249 has nothing about this, which
means that from a driver-agnostic point of view, the situation is pretty
much unchanged. Here's code I wrote for OpenStack to try to apply
more specificity to database errors, basically a library of regexes:
https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/exc_filters.py#L55

One thing that would be helpful would be if your fine-grained
exception classes included more context about the failure. For example,
UndefinedTable would include the table name as an individual
data member, e.g. exception.table_name, and an error about a foreign key
constraint would include the constraint name, e.g.
exception.constraint_name, things like that. You can see in my
oslo.db library above we are also pulling out other elements from the
error message to provide more context.

Re: Plans for 2.8

From

Daniele Varrazzo

Date:

04 October 2018, 17:47:17

On Thu, Oct 4, 2018 at 3:18 PM Mike Bayer <mike_mp@zzzcomputing.com> wrote:

> I can't provide any suggestions, as the feature is very reasonable and
> useful.   But I will lament that pep-249 has nothing about this, which
> means from a driver-agnostic point of view, the situation is pretty
> much unchanged.   Here's code I wrote for Openstack to try to apply
> more specificity to database errors, basically a library of regexes:
> https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/exc_filters.py#L55

Uhm, they also have the problem of not working if the message is
localised... :\ With postgres it would have been more robust to look at
the extension `pgcode` but of course that's not portable either.

Is there anything in common to all the databases which might be
exposed in a uniform way by the drivers? e.g. the pgcode is actually
something more standard than just postgres: "sqlstate" (or, SQLSTATE,
because '70s) is supposed to be a standard.

If you know that many/all the databases emit a sqlstate you may suggest
that the DB-SIG bless an exception attribute - e.g. `Exception.sqlstate`
- to report it. Of course if postgres says "40P01" and IBM DB2 says
"0911N" to report a deadlock, that's way out of what we can control...
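
A sketch of what driver-agnostic code could do if the code were exposed
under a uniform name. The `sqlstate` attribute name and the
`FakeDriverError` class are assumptions for the example; `pgcode` is
the psycopg2-specific attribute. One thing that does come from the
standard is that the first two characters of a SQLSTATE identify its
class, so "40P01" belongs to class "40" (transaction rollback).

```python
def get_sqlstate(exc):
    """Return the SQLSTATE carried by a driver exception, or None.

    `sqlstate` is a hypothetical, not-yet-standard attribute; `pgcode`
    is the psycopg2-specific one.
    """
    for attr in ("sqlstate", "pgcode"):
        code = getattr(exc, attr, None)
        if code:
            return code
    return None


def sqlstate_class(code):
    """The first two characters of a SQLSTATE identify its class."""
    return code[:2] if code else None


class FakeDriverError(Exception):
    """Stand-in for a driver exception, so no database is needed."""

    def __init__(self, msg, pgcode=None):
        super().__init__(msg)
        self.pgcode = pgcode


err = FakeDriverError("deadlock detected", pgcode="40P01")
code = get_sqlstate(err)
```

Matching on the class rather than the full code is one way to be less
sensitive to vendor-specific subcodes, though it cannot help when a
driver does not report a SQLSTATE at all.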

> one thing that would be helpful would be if your fine-grained
> exception classes included more context about the failure.  Like
> UndefinedTable would include the table name as an individual
> datamember e.g. exception.table_name, an error about a foreign key
> constraint would include the constraint name e.g.
> exception.constraint_name, things like that.  You can see in my
> oslo.db library above we are also pulling out other elements from the
> error message to provide more context.

We do already: further details about the error, when the database
provides them, are exposed by the exception's `diag` attribute:
see <http://initd.org/psycopg/docs/extensions.html#psycopg2.extensions.Diagnostics>.
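
A minimal sketch of reading that context, using a stand-in exception so
it runs without a database. The field names are attributes of the
`Diagnostics` object; the `error_context` helper and the sample values
are made up for the example, and the server only fills in these fields
for some kinds of error.

```python
from types import SimpleNamespace


def error_context(exc):
    """Collect a few Diagnostics fields, skipping the ones not set."""
    diag = getattr(exc, "diag", None)
    fields = ("schema_name", "table_name", "column_name", "constraint_name")
    return {
        f: getattr(diag, f)
        for f in fields
        if getattr(diag, f, None) is not None
    }


# Stand-in for an exception raised by the driver (values are invented).
fake = Exception("violates foreign key constraint")
fake.diag = SimpleNamespace(
    table_name="orders", constraint_name="orders_user_fk"
)

context = error_context(fake)
```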

-- Daniele

Re: Plans for 2.8

From

Karsten Hilbert

Date:

05 October 2018, 15:16:50

On Thu, Oct 04, 2018 at 01:38:03PM +0100, Daniele Varrazzo wrote:

> [2] https://github.com/psycopg/psycopg2/blob/master/NEWS
> 
> The feature I'm the most excited about (and worried about its
> reception) is to raise a different exception for every postgres error
> message (see #682).

You needn't as far as I'm concerned because as long as this

> The error classes are [...] subclasses of the previously existing ones

is true there should not be a problem.

Will the new classes still have .pgcode set to their
respective values?

Thanks,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B