Thread: [PATCH] Fix ouside scope t_ctid (ItemPointerData)
Hi,
ItemPointerData, on the contrary, from what the name says,
it is not a pointer to a structure, but a structure in fact.
When assigning the name of the structure variable to a pointer, it may even work,
but, it is not the right thing to do and it becomes a nightmare,
to discover that any other error they have is at cause.
So:
1. In some cases, there may be a misunderstanding in the use of ItemPointerData.
2. When using the variable name in an assignment, the variable's address is used.
3. While this works for a structure, it shouldn't be the right thing to do.
4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage.
5. While this may be working for heapam.c, I believe it is being abused and should be compliant with
the Postgres API and use the functions that were created for this.
The patch is primarily intended to correct the use of ItemPointerData.
But it is also changing the style, reducing the scope of some variables.
If that was not acceptable, reduce the scope and someone has objections,
But it is also changing the style, reducing the scope of some variables.
If that was not acceptable, reduce the scope and someone has objections,
I can change the patch, to focus only on the use of ItemPointerData.
But as style changes are rare, if possible, it would be good to seize the opportunity.
But as style changes are rare, if possible, it would be good to seize the opportunity.
regards,
Ranier Vilela
Attachment
> On May 14, 2020, at 10:40 AM, Ranier Vilela <ranier.vf@gmail.com> wrote: > > Hi, > ItemPointerData, on the contrary, from what the name says, > it is not a pointer to a structure, but a structure in fact. The project frequently uses the pattern typedef struct FooData { ... } FooData; typedef FooData *Foo; where, in this example, "Foo" = "ItemPointer". So the "Data" part of "ItemPointerData" clues the reader into the fact that ItemPointerData is a struct, not a pointer. Granted, the "Pointer" part of that name may confuse some readers, but the struct itself does contain what is essentiallya 48-bit pointer, so that name is not nuts. > When assigning the name of the structure variable to a pointer, it may even work, > but, it is not the right thing to do and it becomes a nightmare, > to discover that any other error they have is at cause. Can you give a code example of the type of assigment you mean? > So: > 1. In some cases, there may be a misunderstanding in the use of ItemPointerData. > 2. When using the variable name in an assignment, the variable's address is used. > 3. While this works for a structure, it shouldn't be the right thing to do. > 4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage. > 5. While this may be working for heapam.c, I believe it is being abused and should be compliant with > the Postgres API and use the functions that were created for this. > > The patch is primarily intended to correct the use of ItemPointerData. > But it is also changing the style, reducing the scope of some variables. > If that was not acceptable, reduce the scope and someone has objections, > I can change the patch, to focus only on the use of ItemPointerData. > But as style changes are rare, if possible, it would be good to seize the opportunity. I would like to see a version of the patch that only addresses your concerns about ItemPointerData, not because other aspectsof the patch are unacceptable (I'm not ready to even contemplate that yet), but because I'm not sure what your complaintis about. Can you restrict the patch to just address that one issue? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Ranier Vilela <ranier.vf@gmail.com> writes: > The patch is primarily intended to correct the use of ItemPointerData. What do you think is being "corrected" here? It looks to me like just some random code rearrangements that aren't even clearly bug-free, let alone being stylistic improvements. regards, tom lane
Em qui., 14 de mai. de 2020 às 15:03, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:
> On May 14, 2020, at 10:40 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Hi,
> ItemPointerData, on the contrary, from what the name says,
> it is not a pointer to a structure, but a structure in fact.
The project frequently uses the pattern
typedef struct FooData {
...
} FooData;
typedef FooData *Foo;
where, in this example, "Foo" = "ItemPointer".
So the "Data" part of "ItemPointerData" clues the reader into the fact that ItemPointerData is a struct, not a pointer. Granted, the "Pointer" part of that name may confuse some readers, but the struct itself does contain what is essentially a 48-bit pointer, so that name is not nuts.
> When assigning the name of the structure variable to a pointer, it may even work,
> but, it is not the right thing to do and it becomes a nightmare,
> to discover that any other error they have is at cause.
Can you give a code example of the type of assigment you mean?
htup->t_ctid = target_tid;
htup->t_ctid = newtid;
Both target_tid and newtid are local variable, whe loss scope, memory is garbage.
> So:
> 1. In some cases, there may be a misunderstanding in the use of ItemPointerData.
> 2. When using the variable name in an assignment, the variable's address is used.
> 3. While this works for a structure, it shouldn't be the right thing to do.
> 4. If we have a local variable, its scope is limited and when it loses its scope, memory is certainly garbage.
> 5. While this may be working for heapam.c, I believe it is being abused and should be compliant with
> the Postgres API and use the functions that were created for this.
>
> The patch is primarily intended to correct the use of ItemPointerData.
> But it is also changing the style, reducing the scope of some variables.
> If that was not acceptable, reduce the scope and someone has objections,
> I can change the patch, to focus only on the use of ItemPointerData.
> But as style changes are rare, if possible, it would be good to seize the opportunity.
I would like to see a version of the patch that only addresses your concerns about ItemPointerData, not because other aspects of the patch are unacceptable (I'm not ready to even contemplate that yet), but because I'm not sure what your complaint is about. Can you restrict the patch to just address that one issue?
Certainly.
In the same file you can find the appropriate use of the API.
ItemPointerSet(&heapTuple->t_self, blkno, offnum);
Ranier Vilela
Attachment
Em qui., 14 de mai. de 2020 às 15:07, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> The patch is primarily intended to correct the use of ItemPointerData.
What do you think is being "corrected" here? It looks to me like
just some random code rearrangements that aren't even clearly
bug-free, let alone being stylistic improvements.
It is certainly working, but trusting that the memory of a local variable will not change,
when it loses its scope, is a risk that, certainly, can cause bugs, elsewhere.
And it is certainly very difficult to discover its primary cause.
And it is certainly very difficult to discover its primary cause.
regards,
Ranier Vilela
> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote: > > htup->t_ctid = target_tid; > htup->t_ctid = newtid; > Both target_tid and newtid are local variable, whe loss scope, memory is garbage. Ok, thanks for the concrete example of what is bothering you. In htup_details, I see that struct HeapTupleHeaderData has a field named t_ctid of type struct ItemPointerData. I also seein heapam that target_tid is of type ItemPointerData. The htup->t_ctid = target_tid copies the contents of target_tid. By the time target_tid goes out of scope, the contents are already copied. I would shareyour concern if t_ctid were of type ItemPointer (aka ItemPointerData *) and the code said htup->t_ctid = &target_tid but it doesn't say that, so I don't see the issue. Also in heapam, I see that newtid is likewise of type ItemPointerData, so the same logic applies. By the time newtid goesout of scope, its contents have already been copied into t_ctid, so there is no problem. But maybe you know all that and are just complaining that the name "ItemPointerData" sounds like a pointer rather than astruct? I'm still unclear whether you believe this is a bug, or whether you just don't like the naming that is used. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote: > > Certainly. > In the same file you can find the appropriate use of the API. > ItemPointerSet(&heapTuple->t_self, blkno, offnum); It took a couple reads through your patch to figure out what you were trying to accomplish, and I think you are uncomfortablewith assigning one ItemPointerData variable from another. ItemPointerData is just a struct with three int16variables. To make a standalone program that has the same structure without depending on any postgres headers, I'musing "short int" instead of "int16" and structs "TwoData" and "ThreeData" that are analogous to BlockIdData and OffsetNumber. #include <stdio.h> typedef struct TwoData { short int a; short int b; } TwoData; typedef struct ThreeData { TwoData left; short int right; } ThreeData; int main(int argc, char **argv) { ThreeData x = { { 5, 10 }, 15 }; ThreeData y = x; x.left.a = 0; x.left.b = 1; x.right = 2; printf("y = { { %d, %d }, %d }\n", y.left.a, y.left.b, y.right); return 0; } If you compile and run this, you'll notice it outputs: y = { { 5, 10 }, 15 } and not the { { 0, 1}, 2 } that you would expect if y were merely pointing at x. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Em qui., 14 de mai. de 2020 às 19:23, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:
> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> htup->t_ctid = target_tid;
> htup->t_ctid = newtid;
> Both target_tid and newtid are local variable, whe loss scope, memory is garbage.
Ok, thanks for the concrete example of what is bothering you.
In htup_details, I see that struct HeapTupleHeaderData has a field named t_ctid of type struct ItemPointerData. I also see in heapam that target_tid is of type ItemPointerData. The
htup->t_ctid = target_tid
copies the contents of target_tid. By the time target_tid goes out of scope, the contents are already copied. I would share your concern if t_ctid were of type ItemPointer (aka ItemPointerData *) and the code said
htup->t_ctid = &target_tid
but it doesn't say that, so I don't see the issue.
Even if the patch simplifies and uses the API to make the assignments.
Really, the central problem does not exist, my fault.
Perhaps because it has never made such use, structure assignment.
And I failed to do research on the subject before.
Sorry.
Really, the central problem does not exist, my fault.
Perhaps because it has never made such use, structure assignment.
And I failed to do research on the subject before.
Sorry.
Also in heapam, I see that newtid is likewise of type ItemPointerData, so the same logic applies. By the time newtid goes out of scope, its contents have already been copied into t_ctid, so there is no problem.
But maybe you know all that and are just complaining that the name "ItemPointerData" sounds like a pointer rather than a struct? I'm still unclear whether you believe this is a bug, or whether you just don't like the naming that is used.
My concerns were about whether attribution really would copy the structure's content and not just its address.
The name makes it difficult, but that was not the point.
The tool warned about uninitialized variable, which I mistakenly deduced for loss of scope.
The name makes it difficult, but that was not the point.
The tool warned about uninitialized variable, which I mistakenly deduced for loss of scope.
We never stopped learning, and using structure assignment was a new learning experience.
regards
Ranier Vilela
Em qui., 14 de mai. de 2020 às 19:49, Mark Dilger <mark.dilger@enterprisedb.com> escreveu:
> On May 14, 2020, at 11:34 AM, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Certainly.
> In the same file you can find the appropriate use of the API.
> ItemPointerSet(&heapTuple->t_self, blkno, offnum);
It took a couple reads through your patch to figure out what you were trying to accomplish, and I think you are uncomfortable with assigning one ItemPointerData variable from another. ItemPointerData is just a struct with three int16 variables. To make a standalone program that has the same structure without depending on any postgres headers, I'm using "short int" instead of "int16" and structs "TwoData" and "ThreeData" that are analogous to BlockIdData and OffsetNumber.
#include <stdio.h>
typedef struct TwoData {
short int a;
short int b;
} TwoData;
typedef struct ThreeData {
TwoData left;
short int right;
} ThreeData;
int main(int argc, char **argv)
{
ThreeData x = { { 5, 10 }, 15 };
ThreeData y = x;
x.left.a = 0;
x.left.b = 1;
x.right = 2;
printf("y = { { %d, %d }, %d }\n",
y.left.a, y.left.b, y.right);
return 0;
}
If you compile and run this, you'll notice it outputs:
y = { { 5, 10 }, 15 }
and not the { { 0, 1}, 2 } that you would expect if y were merely pointing at x.
Thanks for the example.
But what I wanted to test was
struct1 = struct2;
Both being of the same type of structure.
But what I wanted to test was
struct1 = struct2;
Both being of the same type of structure.
What I wrongly deduced was that the address of struct2 was saved and not its content.
Again, thanks for your time and clarification.
regards,
Ranier Vilela