Thread: selecting from cursor

selecting from cursor

From
Alex Pilosov
Date:
I hope I'm not taking too much time from people here...

Here's where I'm now (and don't beat me too hard :)

I'm done with change of RangeTblEntry into three different node types:
RangeTblEntryRelation,RangeTblEntrySubSelect,RangeTblEntryPortal which
have different fields. All the existing places instead of using
rte->subquery to determine type now use IsA(rte, RangeTblEntrySubSelect),
and later access fields after casting ((RangeTblEntrySubSelect *)rte)->xxx

Some functions that always work on Relation RTEs are declared to accept
RangeTblEntryRelation. Asserts are added everywhere before casting of RTE
into specific type. (Unless there was an IsA before, then I didn't put an
Assert).

Let me know if that is an acceptable way of doing things, or casting makes
things too ugly. (I believe its the best way, unions are more dangerous
in this context).

Let me know if it would make sense to rename these types into
RTERelation,RTESubSelect,RTEPortal to cut down on their length ...

Now, I'm going to more interesting stuff, planner:

SELECT blah FROM CURSOR FOO, list_of_tables

This really should place CURSOR FOO at the very top of join list, since
(generally), you don't know what cursor is opened for, and you cannot
ReScan a portal. 

I'm still trying to find out how to explain that to optimizer, but if
someone gives me a hint, I'd appreciate it.

Alternatively, we can store everything we retrieved from portal for future
ReScan purposes, but I really don't like this idea. 

By the same token, I think, optimizer must reject queries of type:
"SELECT ... FROM CURSOR FOO, CURSOR BAR" because such thing will lead to a
nested loop scan, which is impossible because, again, you can't ReScan a
cursor.

Let me know if I'm very off track with these assumptions, and if you have
to throw rocks at me, pick little ones ;)

-alex





Re: selecting from cursor

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> I'm done with change of RangeTblEntry into three different node types:
> RangeTblEntryRelation,RangeTblEntrySubSelect,RangeTblEntryPortal which
> have different fields. All the existing places instead of using
> rte->subquery to determine type now use IsA(rte, RangeTblEntrySubSelect),
> and later access fields after casting ((RangeTblEntrySubSelect *)rte)->xxx

> Some functions that always work on Relation RTEs are declared to accept
> RangeTblEntryRelation. Asserts are added everywhere before casting of RTE
> into specific type. (Unless there was an IsA before, then I didn't put an
> Assert).

> Let me know if that is an acceptable way of doing things, or casting makes
> things too ugly. (I believe its the best way, unions are more dangerous
> in this context).

And what are you doing with the places that don't care which kind of RTE
they are dealing with (which is most of them IIRC)?  While you haven't
shown us the proposed changes, I really suspect that a union would be
cleaner, because it'd avoid ugliness in those places.  Bear in mind that
the three RTE types that you have are going to become five or six real
soon now, because I have other things to fix that need to be done that
way --- so the notational advantage of a union is going to increase.

> ... you cannot ReScan a portal. 

That's gonna have to be fixed.  If you're not up for it, don't implement
this.  Given that cursors (are supposed to) support FETCH BACKWARDS,
I really don't see why they shouldn't be expected to handle ReScan...
        regards, tom lane


Re: selecting from cursor

From
Alex Pilosov
Date:
On Mon, 2 Jul 2001, Tom Lane wrote:

> Alex Pilosov <alex@pilosoft.com> writes:
> > I'm done with change of RangeTblEntry into three different node types:
> > RangeTblEntryRelation,RangeTblEntrySubSelect,RangeTblEntryPortal which
> > have different fields. All the existing places instead of using
> > rte->subquery to determine type now use IsA(rte, RangeTblEntrySubSelect),
> > and later access fields after casting ((RangeTblEntrySubSelect *)rte)->xxx
> 
> > Some functions that always work on Relation RTEs are declared to accept
> > RangeTblEntryRelation. Asserts are added everywhere before casting of RTE
> > into specific type. (Unless there was an IsA before, then I didn't put an
> > Assert).
> 
> > Let me know if that is an acceptable way of doing things, or casting makes
> > things too ugly. (I believe its the best way, unions are more dangerous
> > in this context).
> 
> And what are you doing with the places that don't care which kind of RTE
> they are dealing with (which is most of them IIRC)?  While you haven't

They just have things declared as RangeTblEntry *, and as long as they
don't access type-specific fields, they are fine.

> shown us the proposed changes, I really suspect that a union would be
> cleaner, because it'd avoid ugliness in those places.  Bear in mind that
I have attached the diff of what I have now. Please take a look through
it. Its not exactly done (only the parser and pieces of executor parts of
"FROM CURSOR" are there, and I didn't attach gram.y diff because its just
too ugly now), so its definitely not ready for application, but with these
changes, it compiles and 'make check' goes through fine ;)

> the three RTE types that you have are going to become five or six real
> soon now, because I have other things to fix that need to be done that
> way --- so the notational advantage of a union is going to increase.
> 
> > ... you cannot ReScan a portal. 
> 
> That's gonna have to be fixed.  If you're not up for it, don't implement
> this.  Given that cursors (are supposed to) support FETCH BACKWARDS,
> I really don't see why they shouldn't be expected to handle ReScan...
I thought only scrollable cursors can do that. What if cursor isn't
scrollable? Should it error during the execution?

For scrollable cursors, Rescan should be implemented as 'scroll backwards
until you can't scroll no more', correct?

-alex



Re: selecting from cursor

From
Alex Pilosov
Date:
On Mon, 2 Jul 2001, Alex Pilosov wrote:

> > shown us the proposed changes, I really suspect that a union would be
> > cleaner, because it'd avoid ugliness in those places.  Bear in mind that
> I have attached the diff of what I have now. Please take a look through
> it. Its not exactly done (only the parser and pieces of executor parts of
> "FROM CURSOR" are there, and I didn't attach gram.y diff because its just
> too ugly now), so its definitely not ready for application, but with these
> changes, it compiles and 'make check' goes through fine ;)
Apparently the diff was too large for mailserver. I have posted it at 
http://www.formenos.org/pg/rte.diff

95% of that diff is obvious changes. 5% are not-so-obvious :)

-alex



Re: selecting from cursor

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
>> And what are you doing with the places that don't care which kind of RTE
>> they are dealing with (which is most of them IIRC)?  While you haven't

> They just have things declared as RangeTblEntry *, and as long as they
> don't access type-specific fields, they are fine.

So you have four (soon to be six or seven) different structs that *must*
have the same fields?  I don't think that's cleaner than a union ...
at the very least, declare it as structs containing RangeTblEntry,
similar to the way the various Plan node types work (see plannodes.h).

> For scrollable cursors, Rescan should be implemented as 'scroll backwards
> until you can't scroll no more', correct?

No, it should be implemented as Rescan.  The portal mechanism needs to
expose the Rescan call for the contained querytree.
        regards, tom lane


Re: selecting from cursor

From
Alex Pilosov
Date:
On Tue, 3 Jul 2001, Tom Lane wrote:

> Alex Pilosov <alex@pilosoft.com> writes:
> >> And what are you doing with the places that don't care which kind of RTE
> >> they are dealing with (which is most of them IIRC)?  While you haven't
> 
> > They just have things declared as RangeTblEntry *, and as long as they
> > don't access type-specific fields, they are fine.
> 
> So you have four (soon to be six or seven) different structs that *must*
> have the same fields?  I don't think that's cleaner than a union ...
> at the very least, declare it as structs containing RangeTblEntry,
> similar to the way the various Plan node types work (see plannodes.h).
Please see my diffs. Its implemented via #define to declare all common
fields. 

I.E.:
#define RTE_COMMON_FIELDS \   NodeTag     type; \   /* \    * Fields valid in all RTEs: \    */ \   Attr       *alias;
       /* user-written alias clause, if any */ \   Attr       *eref;           /* expanded reference names */ \   bool
     inh;            /* inheritance requested? */ \   bool        inFromCl;       /* present in FROM clause */ \   bool
      checkForRead;   /* check rel for read access */ \   bool        checkForWrite;  /* check rel for write access */
\  Oid         checkAsUser;    /* if not zero, check access as this user
 
*/ \

typedef struct RangeTblEntry
{   RTE_COMMON_FIELDS
} RangeTblEntry;

typedef struct RangeTblEntryRelation
{   RTE_COMMON_FIELDS   /* Fields valid for a plain relation RTE */   char       *relname;        /* real name of the
relation*/   Oid         relid;          /* OID of the relation */
 
} RangeTblEntryRelation;


If RTEs are done the way plan nodes done, the syntax would be pretty much
the same, only with one more indirection to access common fields.

This is how code looks with my changes:
RangeTblEntry *rte=rt_fetch(..)

For common fields
rte->eref

For type-specific fields
((RangeTblEntryRelation *) rte)->relid

This is how it would look if it was done like Plan nodes are done:
RangeTblEntry *rte=rt_fetch(..)

For common fields:
rte->common->eref

For type-specific fields:
((RangeTblEntryRelation *)rte)->relid                                          

> > For scrollable cursors, Rescan should be implemented as 'scroll backwards
> > until you can't scroll no more', correct?
> 
> No, it should be implemented as Rescan.  The portal mechanism needs to
> expose the Rescan call for the contained querytree.
Ok.




Re: selecting from cursor

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> On Tue, 3 Jul 2001, Tom Lane wrote:
>> So you have four (soon to be six or seven) different structs that *must*
>> have the same fields?  I don't think that's cleaner than a union ...

> Please see my diffs. Its implemented via #define to declare all common
> fields. 
> #define RTE_COMMON_FIELDS \
>     NodeTag     type; \
>     [etc]

I don't think that technique is cleaner than a union, either ;-).
The macro definition is a pain in the neck: you have to play games with
semicolon placement, most tools won't autoindent it nicely, etc etc.

But the main point is that I think NodeType = RangeTblEntry with
a separate subtype field is a better way to go than making a bunch of
different NodeType values.  When most of the fields are common, as in
this case, it's going to be true that many places only want to know
"is it a rangetable entry or not?"
        regards, tom lane


Re: selecting from cursor

From
Alex Pilosov
Date:
On Tue, 3 Jul 2001, Tom Lane wrote:

> Alex Pilosov <alex@pilosoft.com> writes:
> > On Tue, 3 Jul 2001, Tom Lane wrote:
> >> So you have four (soon to be six or seven) different structs that *must*
> >> have the same fields?  I don't think that's cleaner than a union ...
> 
> > Please see my diffs. Its implemented via #define to declare all common
> > fields. 
> > #define RTE_COMMON_FIELDS \
> >     NodeTag     type; \
> >     [etc]
> 
> I don't think that technique is cleaner than a union, either ;-).
> The macro definition is a pain in the neck: you have to play games with
> semicolon placement, most tools won't autoindent it nicely, etc etc.

True true. On other hand, unlike union, its automatically typechecked, you
cannot by mistake reference a field you shouldn't be referencing.

Strict typechecking allows one to explicitly declare which type your
function takes if you want, and force errors if you miscast something. I
think discipline is a good thing here...

But really its your call, no point in arguing :)

> But the main point is that I think NodeType = RangeTblEntry with
> a separate subtype field is a better way to go than making a bunch of
> different NodeType values.  When most of the fields are common, as in
> this case, it's going to be true that many places only want to know
> "is it a rangetable entry or not?"

That's why I have IsA_RTE(node) macro. (Same as we have IsA_Join and
IsA_JoinPath already). It 




Re: selecting from cursor

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> True true. On other hand, unlike union, its automatically typechecked, you
> cannot by mistake reference a field you shouldn't be referencing.

Only true to the extent that you have cast a generic pointer to the
correct type to begin with.  However, we've probably wasted more time
arguing the point than it's really worth.

I would suggest leaving off the final semicolon in the macro definition
so that you can write

typedef struct RangeTblEntryRelation
{   RTE_COMMON_FIELDS;   /* Fields valid for a plain relation RTE */   char       *relname;        /* real name of the
relation*/   Oid         relid;          /* OID of the relation */
 

Without this, tools like pgindent will almost certainly mess up these
struct declarations (I know emacs' C mode will get it wrong...)
        regards, tom lane


Re: selecting from cursor

From
Alex Pilosov
Date:
Erm, forgot to attach the patch. Here it is.

On Mon, 2 Jul 2001, Alex Pilosov wrote:

> They just have things declared as RangeTblEntry *, and as long as they
> don't access type-specific fields, they are fine.
> 
> > shown us the proposed changes, I really suspect that a union would be
> > cleaner, because it'd avoid ugliness in those places.  Bear in mind that
> I have attached the diff of what I have now. Please take a look through
> it. Its not exactly done (only the parser and pieces of executor parts of
> "FROM CURSOR" are there, and I didn't attach gram.y diff because its just
> too ugly now), so its definitely not ready for application, but with these
> changes, it compiles and 'make check' goes through fine ;)

Re: selecting from cursor

From
Alex Pilosov
Date:
On Mon, 2 Jul 2001, Alex Pilosov wrote:

> Erm, forgot to attach the patch. Here it is.
(yow) don't even bother looking at this patch. mail server delayed this
message by almost a week, and by now, the code is totally changed.

I took Tom's suggestion and made RTE a union. So, the below is a new
definition of RTE:

I have most of portal-related code working, only executor needs some more
fixes. Code properly makes PortalScan Path entry, PortalScan Plan nodes,
etc. I have added PortalReScan to tell portal it needs to rescan itself. 

I'll post a correct patch next week. Thank you to everyone and especially
Tom for bearing with my often stupid questions.

--cut here--rte definition--
typedef enum RTEType {   RTE_RELATION,   RTE_SUBSELECT,   RTE_PORTAL
} RTEType;

typedef struct RangeTblEntry
{   NodeTag     type;   RTEType     rtetype;   /*    * Fields valid in all RTEs:    */   Attr       *alias;          /*
user-writtenalias clause, if any */   Attr       *eref;           /* expanded reference names */   bool        inh;
      /* inheritance requested? */   bool        inFromCl;       /* present in FROM clause */   bool
checkForRead;  /* check rel for read access */   bool        checkForWrite;  /* check rel for write access */   Oid
   checkAsUser;    /* if not zero, check access as this user
 
*/     union {       struct  {           /* Fields for a plain relation RTE (rtetype=RTE_RELATION) */           char
  *relname;        /* real name of the relation */           Oid         relid;          /* OID of the relation */
} rel;       struct {           /* Fields for a subquery RTE (rtetype=RTE_SUBSELECT) */           Query      *subquery;
     /* the sub-query */       } sub;       struct {           /* fields for portal RTE (rtetype=RTE_PORTAL) */
 char          *portalname;    /* portal's name */       } portal;   } u;
 
} RangeTblEntry;