Thread: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

[PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Ranier Vilela
Date:
Hi,
ItemPointerData, on the contrary, from what the name says,
it is not a pointer to a structure, but a structure in fact.
When assigning the name of the structure variable to a pointer, it may even work,
but, it is not the right thing to do and it becomes a nightmare,
to discover that any other error they have is at cause.

So:
1. In some cases, there may be a misunderstanding in the use of ItemPointerData.
2. When using the variable name in an assignment, the variable's address is used.
3. While this works for a structure, it shouldn't be the right thing to do.
4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage.
5. While this may be working for heapam.c, I believe it is being abused and should be compliant with
    the Postgres API and use the functions that were created for this.

The patch is primarily intended to correct the use of ItemPointerData.
But it is also changing the style, reducing the scope of some variables.
If that was not acceptable, reduce the scope and someone has objections,
I can change the patch, to focus only on the use of ItemPointerData.
But as style changes are rare, if possible, it would be good to seize the opportunity.

regards,
Ranier Vilela
Attachment

Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Mark Dilger
Date:

> On May 14, 2020, at 10:40 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Hi,
> ItemPointerData, on the contrary, from what the name says,
> it is not a pointer to a structure, but a structure in fact.

The project frequently uses the pattern

  typedef struct FooData {
  ...
  } FooData;

  typedef FooData *Foo;

where, in this example, "Foo" = "ItemPointer".

So the "Data" part of "ItemPointerData" clues the reader into the fact that ItemPointerData is a struct, not a pointer.
Granted, the "Pointer" part of that name may confuse some readers, but the struct itself does contain what is
essentiallya 48-bit pointer, so that name is not nuts. 


> When assigning the name of the structure variable to a pointer, it may even work,
> but, it is not the right thing to do and it becomes a nightmare,
> to discover that any other error they have is at cause.

Can you give a code example of the type of assigment you mean?

> So:
> 1. In some cases, there may be a misunderstanding in the use of ItemPointerData.
> 2. When using the variable name in an assignment, the variable's address is used.
> 3. While this works for a structure, it shouldn't be the right thing to do.
> 4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage.
> 5. While this may be working for heapam.c, I believe it is being abused and should be compliant with
>     the Postgres API and use the functions that were created for this.
>
> The patch is primarily intended to correct the use of ItemPointerData.
> But it is also changing the style, reducing the scope of some variables.
> If that was not acceptable, reduce the scope and someone has objections,
> I can change the patch, to focus only on the use of ItemPointerData.
> But as style changes are rare, if possible, it would be good to seize the opportunity.

I would like to see a version of the patch that only addresses your concerns about ItemPointerData, not because other
aspectsof the patch are unacceptable (I'm not ready to even contemplate that yet), but because I'm not sure what your
complaintis about.  Can you restrict the patch to just address that one issue? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> The patch is primarily intended to correct the use of ItemPointerData.

What do you think is being "corrected" here?  It looks to me like
just some random code rearrangements that aren't even clearly
bug-free, let alone being stylistic improvements.

            regards, tom lane



Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Ranier Vilela
Date:
Em qui., 14 de mai. de 2020 às 15:03, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:


> On May 14, 2020, at 10:40 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Hi,
> ItemPointerData, on the contrary, from what the name says,
> it is not a pointer to a structure, but a structure in fact.

The project frequently uses the pattern

  typedef struct FooData {
  ...
  } FooData;

  typedef FooData *Foo;

where, in this example, "Foo" = "ItemPointer".

So the "Data" part of "ItemPointerData" clues the reader into the fact that ItemPointerData is a struct, not a pointer.  Granted, the "Pointer" part of that name may confuse some readers, but the struct itself does contain what is essentially a 48-bit pointer, so that name is not nuts.


> When assigning the name of the structure variable to a pointer, it may even work,
> but, it is not the right thing to do and it becomes a nightmare,
> to discover that any other error they have is at cause.

Can you give a code example of the type of assigment you mean?
htup->t_ctid = target_tid;
htup->t_ctid = newtid;
Both target_tid and newtid are local variable, whe loss scope, memory is garbage.
 

> So:
> 1. In some cases, there may be a misunderstanding in the use of ItemPointerData.
> 2. When using the variable name in an assignment, the variable's address is used.
> 3. While this works for a structure, it shouldn't be the right thing to do.
> 4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage.
> 5. While this may be working for heapam.c, I believe it is being abused and should be compliant with
>     the Postgres API and use the functions that were created for this.
>
> The patch is primarily intended to correct the use of ItemPointerData.
> But it is also changing the style, reducing the scope of some variables.
> If that was not acceptable, reduce the scope and someone has objections,
> I can change the patch, to focus only on the use of ItemPointerData.
> But as style changes are rare, if possible, it would be good to seize the opportunity.

I would like to see a version of the patch that only addresses your concerns about ItemPointerData, not because other aspects of the patch are unacceptable (I'm not ready to even contemplate that yet), but because I'm not sure what your complaint is about.  Can you restrict the patch to just address that one issue?
Certainly.
In the same file you can find the appropriate use of the API.
ItemPointerSet(&heapTuple->t_self, blkno, offnum);

regards,
Ranier Vilela
Attachment

Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Ranier Vilela
Date:
Em qui., 14 de mai. de 2020 às 15:07, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> The patch is primarily intended to correct the use of ItemPointerData.

What do you think is being "corrected" here?  It looks to me like
just some random code rearrangements that aren't even clearly
bug-free, let alone being stylistic improvements.
It is certainly working, but trusting that the memory of a local variable will not change,
when it loses its scope, is a risk that, certainly, can cause bugs, elsewhere.
And it is certainly very difficult to discover its primary cause.
 
regards,
Ranier Vilela

Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Mark Dilger
Date:

> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> htup->t_ctid = target_tid;
> htup->t_ctid = newtid;
> Both target_tid and newtid are local variable, whe loss scope, memory is garbage.

Ok, thanks for the concrete example of what is bothering you.

In htup_details, I see that struct HeapTupleHeaderData has a field named t_ctid of type struct ItemPointerData.  I also
seein heapam that target_tid is of type ItemPointerData.  The 

    htup->t_ctid = target_tid

copies the contents of target_tid.  By the time target_tid goes out of scope, the contents are already copied.  I would
shareyour concern if t_ctid were of type ItemPointer (aka ItemPointerData *) and the code said 

    htup->t_ctid = &target_tid

but it doesn't say that, so I don't see the issue.

Also in heapam, I see that newtid is likewise of type ItemPointerData, so the same logic applies.  By the time newtid
goesout of scope, its contents have already been copied into t_ctid, so there is no problem. 

But maybe you know all that and are just complaining that the name "ItemPointerData" sounds like a pointer rather than
astruct?  I'm still unclear whether you believe this is a bug, or whether you just don't like the naming that is used. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Mark Dilger
Date:

> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Certainly.
> In the same file you can find the appropriate use of the API.
> ItemPointerSet(&heapTuple->t_self, blkno, offnum);

It took a couple reads through your patch to figure out what you were trying to accomplish, and I think you are
uncomfortablewith assigning one ItemPointerData variable from another.    ItemPointerData is just a struct with three
int16variables.  To make a standalone program that has the same structure without depending on any postgres headers,
I'musing "short int" instead of "int16" and structs "TwoData" and "ThreeData" that are analogous to BlockIdData and
OffsetNumber.

#include <stdio.h>

typedef struct TwoData {
    short int a;
    short int b;
} TwoData;

typedef struct ThreeData {
    TwoData left;
    short int right;
} ThreeData;

int main(int argc, char **argv)
{
    ThreeData x = { { 5, 10 }, 15 };
    ThreeData y = x;
    x.left.a = 0;
    x.left.b = 1;
    x.right = 2;

    printf("y = { { %d, %d }, %d }\n",
        y.left.a, y.left.b, y.right);

    return 0;
}

If you compile and run this, you'll notice it outputs:

y = { { 5, 10 }, 15 }

and not the { { 0, 1}, 2 } that you would expect if y were merely pointing at x.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Ranier Vilela
Date:
Em qui., 14 de mai. de 2020 às 19:23, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:


> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> htup->t_ctid = target_tid;
> htup->t_ctid = newtid;
> Both target_tid and newtid are local variable, whe loss scope, memory is garbage.

Ok, thanks for the concrete example of what is bothering you.

In htup_details, I see that struct HeapTupleHeaderData has a field named t_ctid of type struct ItemPointerData.  I also see in heapam that target_tid is of type ItemPointerData.  The

        htup->t_ctid = target_tid

copies the contents of target_tid.  By the time target_tid goes out of scope, the contents are already copied.  I would share your concern if t_ctid were of type ItemPointer (aka ItemPointerData *) and the code said

        htup->t_ctid = &target_tid

but it doesn't say that, so I don't see the issue. 
Even if the patch simplifies and uses the API to make the assignments.
Really, the central problem does not exist, my fault.
Perhaps because it has never made such use, structure assignment.
And I failed to do research on the subject before.
Sorry.
 

Also in heapam, I see that newtid is likewise of type ItemPointerData, so the same logic applies.  By the time newtid goes out of scope, its contents have already been copied into t_ctid, so there is no problem.

But maybe you know all that and are just complaining that the name "ItemPointerData" sounds like a pointer rather than a struct?  I'm still unclear whether you believe this is a bug, or whether you just don't like the naming that is used.
My concerns were about whether attribution really would copy the structure's content and not just its address.
The name makes it difficult, but that was not the point.

The tool warned about uninitialized variable, which I mistakenly deduced for loss of scope.
 
Thank you very much for the clarifications and your time.
We never stopped learning, and using structure assignment was a new learning experience.

regards
Ranier Vilela

Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)

From
Ranier Vilela
Date:
Em qui., 14 de mai. de 2020 às 19:49, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:


> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Certainly.
> In the same file you can find the appropriate use of the API.
> ItemPointerSet(&heapTuple->t_self, blkno, offnum);

It took a couple reads through your patch to figure out what you were trying to accomplish, and I think you are uncomfortable with assigning one ItemPointerData variable from another.    ItemPointerData is just a struct with three int16 variables.  To make a standalone program that has the same structure without depending on any postgres headers, I'm using "short int" instead of "int16" and structs "TwoData" and "ThreeData" that are analogous to BlockIdData and OffsetNumber.

#include <stdio.h>

typedef struct TwoData {
    short int a;
    short int b;
} TwoData;

typedef struct ThreeData {
    TwoData left;
    short int right;
} ThreeData;

int main(int argc, char **argv)
{
    ThreeData x = { { 5, 10 }, 15 };
    ThreeData y = x;
    x.left.a = 0;
    x.left.b = 1;
    x.right = 2;

    printf("y = { { %d, %d }, %d }\n",
        y.left.a, y.left.b, y.right);

    return 0;
}

If you compile and run this, you'll notice it outputs:

y = { { 5, 10 }, 15 }

and not the { { 0, 1}, 2 } that you would expect if y were merely pointing at x.
Thanks for the example.
But what I wanted to test was
struct1 = struct2;
Both being of the same type of structure.

What I wrongly deduced was that the address of struct2 was saved and not its content.

Again, thanks for your time and clarification.

regards,
Ranier Vilela