Thread: Computer VARSIZE_ANY(PTR) during debugging

Computer VARSIZE_ANY(PTR) during debugging

From
Amit Langote
Date:
Hello,

Is it possible to compute VARSIZE_ANY(PTR) during debugging?

---------------------------------------------------------
#define VARSIZE_ANY(PTR) \
        (VARATT_IS_1B_E(PTR) ? VARSIZE_1B_E(PTR) : \
         (VARATT_IS_1B(PTR) ? VARSIZE_1B(PTR) : \
          VARSIZE_4B(PTR)))

#define VARATT_IS_1B_E(PTR) \
        ((((varattrib_1b *) (PTR))->va_header) == 0x80)
-----------------------------------------------------------

I tried using the above expression, but it gives the following:

(gdb) p ((((varattrib_1b *) ( tp+off ))->va_header) == 0x80)
No symbol "varattrib_1b" in current context.

Am I missing some gdb technique that would let me evaluate this
expression?

By the way, I am trying to find cause of a segmentation fault using a
core dump which occurred in slot_deform_tuple() of heaptuple.c in
8.4.2 (excuse me, but if this is unheard of, maybe there is an
issue). The segfault in question happens at line 1141:

off = att_align_pointer(off, thisatt->attalign, -1, tp + off);

char       *tp;                         /* ptr to tuple data */
long        off;                    /* offset in tuple data */

Disassembling seems to suggest (tp + off) is the faulting address.
Apparently, the segfault happens when the 5th column (text) is being
extracted from a tuple (char(n), char(n), int4, char(n), text, ...).
Since tp is fixed for the whole duration of the loop and only off
changes over iterations, it may have happened due to a wrong
offset in this iteration.
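
A note on why a bad offset would fault at exactly this line: as far as I can tell from the tuple access macros of that era, att_align_pointer() reads the byte at tp + off to decide whether a varlena needs alignment padding, so a wild offset is dereferenced right there. The C sketch below mirrors that logic under stated assumptions. The names align_of, align_nominal and align_pointer are hypothetical stand-ins, and the 4- and 8-byte alignments for 'i' and 'd' are typical values rather than anything taken from the core dump.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment in bytes for an attalign code. The values for 'i' and 'd'
 * are assumptions that hold on common platforms. */
static long align_of(char attalign)
{
    switch (attalign) {
        case 's': return 2;
        case 'i': return 4;
        case 'd': return 8;
        default:  return 1;   /* 'c' */
    }
}

/* Round off up to the next multiple of the attribute's alignment. */
static long align_nominal(long off, char attalign)
{
    long a = align_of(attalign);
    return (off + a - 1) & ~(a - 1);
}

/* Sketch of the att_align_pointer() idea: for a varlena (attlen == -1),
 * a nonzero byte at attptr is not padding, so the datum starts right
 * here; a zero byte is padding, so align normally. Note that *attptr
 * is read, which is where a bad offset would fault. */
static long align_pointer(long off, char attalign, int attlen,
                          const uint8_t *attptr)
{
    if (attlen == -1 && *attptr != 0)
        return off;
    return align_nominal(off, attalign);
}

/* Small usage example. */
void align_selfcheck(void)
{
    const uint8_t pad = 0x00;     /* zero byte: alignment padding */
    const uint8_t packed = 0x0B;  /* nonzero byte: a datum starts here */

    assert(align_pointer(5, 'i', -1, &pad) == 8);
    assert(align_pointer(5, 'i', -1, &packed) == 5);
    assert(align_pointer(5, 'i', 4, &packed) == 8);  /* fixed-length attribute */
}
```

If this reading is right, the faulting address being tp + off fits an offset that had already gone wrong in an earlier iteration.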

Has anything of this kind been encountered/reported before?


--
Amit Langote



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Alvaro Herrera
Date:
Amit Langote escribió:

> The segfault in question happens at line 1141:
> 
> off = att_align_pointer(off, thisatt->attalign, -1, tp + off);
> 
> char       *tp;                         /* ptr to tuple data */
> long        off;                    /* offset in tuple data */
> 
> Disassembling seems to suggest (tp + off) is the faulting address.
> Apparently, the segfault happens when 5th text column is being
> extracted from a tuple (char(n), char(n), int4, char(n), text, ...).
> Since, tp is fixed for the whole duration of loop and only off is
> subject to change over iterations, it may have happened due to wrong
> offset in this iteration.
> 
> Has anything of this kind been encountered/reported before?

Yes, I vaguely recall I have seen this in cases where tuples contain
corrupt data.  I think you just need the length word of the fourth datum
to be wrong.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Amit Langote
Date:
On Thu, Jun 27, 2013 at 12:02 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Amit Langote escribió:
>
>> The segfault in question happens at line 1141:
>>
>> off = att_align_pointer(off, thisatt->attalign, -1, tp + off);
>>
>> char       *tp;                         /* ptr to tuple data */
>> long        off;                    /* offset in tuple data */
>>
>> Disassembling seems to suggest (tp + off) is the faulting address.
>> Apparently, the segfault happens when 5th text column is being
>> extracted from a tuple (char(n), char(n), int4, char(n), text, ...).
>> Since, tp is fixed for the whole duration of loop and only off is
>> subject to change over iterations, it may have happened due to wrong
>> offset in this iteration.
>>
>> Has anything of this kind been encountered/reported before?
>
> Yes, I vaguely recall I have seen this in cases where tuples contain
> corrupt data.  I think you just need the length word of the fourth datum
> to be wrong.
>

The query in question is:

select col1, col2, col4, octet_length(col5) from table where
octet_length(col5) > 8000000;

In case of corrupt data, even select * from table should give
segfault, shouldn't it?

--
Amit Langote



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Amit Langote
Date:
On Thu, Jun 27, 2013 at 12:02 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Amit Langote escribió:
>
>> The segfault in question happens at line 1141:
>>
>> off = att_align_pointer(off, thisatt->attalign, -1, tp + off);
>>
>> char       *tp;                         /* ptr to tuple data */
>> long        off;                    /* offset in tuple data */
>>
>> Disassembling seems to suggest (tp + off) is the faulting address.
>> Apparently, the segfault happens when 5th text column is being
>> extracted from a tuple (char(n), char(n), int4, char(n), text, ...).
>> Since, tp is fixed for the whole duration of loop and only off is
>> subject to change over iterations, it may have happened due to wrong
>> offset in this iteration.
>>
>> Has anything of this kind been encountered/reported before?
>
> Yes, I vaguely recall I have seen this in cases where tuples contain
> corrupt data.  I think you just need the length word of the fourth datum
> to be wrong.
>

I want to find exactly that. Is there any way to get that value?
AFAIU, a tuple would not contain all of the data of individual
attributes; some might be TOAST'd, but is the total length (including
TOAST'd part) added to offset (in 'tp + offset') to point to the next
attribute in the tuple?

Looking at the attlen == -1 value in tupDescriptor of the
ResultTupleSlot, VARSIZE_ANY() is used to calculate the length and
added to offset, but I find no way to calculate that while I am
debugging.
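
For reference, the size can also be decoded by hand from the first few bytes at tp + off. The C sketch below follows the varlena header layout as I understand it from the 8.4-era postgres.h; treat the bit patterns as assumptions to check against the headers of the build in question. The function name varsize_any and the big_endian flag are hypothetical. Note that the 0x80 test quoted at the start of this thread appears to be the big-endian definition; on little-endian builds the marker should be 0x01.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the total in-tuple size of a varlena from its leading bytes.
 * Assumed layout (8.4-era): a 1-byte header is flagged by the low bit
 * (little-endian) or the high bit (big-endian); a header byte equal to
 * the bare flag marks an external TOAST pointer whose size is stored
 * in the second byte; anything else is a 4-byte header with two flag
 * bits and a 30-bit length. */
static uint32_t varsize_any(const uint8_t *p, int big_endian)
{
    uint8_t  h = p[0];
    uint8_t  flag = big_endian ? 0x80 : 0x01;
    uint32_t word;

    if (h == flag)                 /* VARATT_IS_1B_E */
        return p[1];               /* size of the pointer datum itself */

    if ((h & flag) != 0)           /* VARATT_IS_1B */
        return big_endian ? (uint32_t) (h & 0x7F)
                          : (uint32_t) ((h >> 1) & 0x7F);

    /* 4-byte header: assemble the word in the tuple's byte order. */
    if (big_endian) {
        word = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
               ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
        return word & 0x3FFFFFFF;
    }
    word = ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16) |
           ((uint32_t) p[1] << 8)  |  (uint32_t) p[0];
    return (word >> 2) & 0x3FFFFFFF;
}

/* Small usage example with hand-made header bytes. */
void varsize_selfcheck(void)
{
    const uint8_t short_le[] = { 0x0B };                   /* 1-byte header */
    const uint8_t ext_le[]   = { 0x01, 0x12 };             /* TOAST pointer */
    const uint8_t long_le[]  = { 0x40, 0x00, 0x00, 0x00 }; /* 4-byte header */

    assert(varsize_any(short_le, 0) == 5);
    assert(varsize_any(ext_le, 0) == 18);
    assert(varsize_any(long_le, 0) == 16);
}
```

Under these assumptions, an externally TOASTed attribute advances the offset only by the size of the pointer stored in the tuple, not by the length of the out-of-line data.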


--
Amit Langote



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Alvaro Herrera
Date:
Amit Langote escribió:

> Looking at the attlen == -1 value in tupDescriptor of the
> ResultTupleSlot, VARSIZE_ANY() is used to calculate the length and
> added to offset, but I find no way to calculate that while I am
> debugging.

I guess you could just add a few "macro define" lines to your .gdbinit,
containing definitions equivalent to those in postgres.h.  Haven't tried
this for the varlena macros, though I do have a couple of others in
there and they ease work at times.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Greg Stark
Date:
I think there's some magic in gdb for this but I'm not sure how to
make it happen. If you figure it out I would think it would be
generally useful and we should find a way to put it in the source tree
so it works for everyone.

You might find it useful to put breakpoints in heap_deformtuple() with
conditions that catch the tuple you're looking for. That function will
often (but not always) be the first function to see your corrupt
tuple.



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Amit Langote
Date:
On Wed, Jul 31, 2013 at 2:33 AM, Greg Stark <stark@mit.edu> wrote:
> I think there's some magic in gdb for this but I'm not sure how to
> make it happen. If you figure it out I would think it would be
> generally useful and we should find a way to put it in the source tree
> so it works for everyone.
>
> You might find it useful to put breakpoints in heap_deformtuple() with
> conditions that catch the tuple you're looking for. That function will
> often (but not always) be the first function to see your corrupt
> tuple.


In the core dump I used to investigate this problem about a month
ago, I couldn't find heap_deformtuple() in the code path that
resulted in the segfault. As I said, it was slot_deform_tuple(). Is it
possible that heap_deformtuple() would be in the code path for a query like:

select bpcharcol1, bpcharcol2, int4col3, bpcharcol4,
octet_length(textcol5) from table where octet_length(textcol5) >
8000000;

-- 
Amit Langote



Re: Computer VARSIZE_ANY(PTR) during debugging

From
Andres Freund
Date:
Hi,

On 2013-06-26 13:27:15 +0900, Amit Langote wrote:
> Is it possible to compute VARSIZE_ANY(PTR) during debugging?
>
> ---------------------------------------------------------
> #define VARSIZE_ANY(PTR) \
>         (VARATT_IS_1B_E(PTR) ? VARSIZE_1B_E(PTR) : \
>          (VARATT_IS_1B(PTR) ? VARSIZE_1B(PTR) : \
>           VARSIZE_4B(PTR)))
>
> #define VARATT_IS_1B_E(PTR) \
>         ((((varattrib_1b *) (PTR))->va_header) == 0x80)
> -----------------------------------------------------------
>
> I tried using above expression, but it gives following:
>
> (gdb) p ((((varattrib_1b *) ( tp+off ))->va_header) == 0x80)
> No symbol "varattrib_1b" in current context.

FWIW, for me, just replacing typedefs in such cases by the actual
struct's name often works. Unfortunately varattrib_1b is an anonymous
struct, but that's easy enough to change.
In HEAD it seems enough to replace the usages in VARTAG_SIZE by the
actual structs. Like in the attached patch.

If you compile postgres with -g3 or higher, it will include most macro
definitions in the binary. If you then additionally define:
macro define __builtin_offsetof(T, F) ((int) &(((T *) 0)->F))
macro define __extension__

In your .gdbinit, many macros work OOTB.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Computer VARSIZE_ANY(PTR) during debugging

From
Peter Geoghegan
Date:
On Tue, Jul 30, 2013 at 10:33 AM, Greg Stark <stark@mit.edu> wrote:
> I think there's some magic in gdb for this but I'm not sure how to
> make it happen. If you figure it out I would think it would be
> generally useful and we should find a way to put it in the source tree
> so it works for everyone.

You can write custom pretty printers for varlena types using GDB's
pretty printers (Python bindings expose this stuff). You can even
differentiate between text and bytea, even though they're both just
typedefs for varlena. I've done this myself in the past, but
unfortunately I don't control the source code. I can tell you that the
bindings are excellent, though.

I was even able to do things like printing output more or less
equivalent to what MemoryContextStats() dumps, but directly from GDB
(i.e. I could walk the tree of memory contexts), even though there is a
bunch of private macros involved - I essentially rewrote
AllocSetStats() in weirdly C-like Python.

-- 
Peter Geoghegan