Thread: BUG #5121: Segmentation Fault when using pam w/ krb5

BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Ryan Douglas"
Date:
The following bug has been logged online:

Bug reference:      5121
Logged by:          Ryan Douglas
Email address:      rdouglas@arbinet.com
PostgreSQL version: 8.4.1
Operating system:   Fedora 11
Description:        Segmentation Fault when using pam w/ krb5
Details:

Whenever I use psql to remotely connect to the database the server crashes
(see log below). If I use psql with the '-W' option then it's fine.

I also tested with pam_tacplus.so and in both cases the db didn't crash. It
just complained about not having credentials to authenticate when the -W
option is not used.

I can reproduce at will so let me know if you need more information.




----- pam configuration

auth       sufficient           pam_krb5.so no_user_check
account    required             pam_permit.so
session    required             pam_permit.so



-------- postgresql log -with krb5 configured in pam ------

<[unknown]@[unknown]  2009-10-15 16:21:11.939 EDT>LOG:  connection received:
host=10.0.20.38 port=42662
<rdouglas@tacacs 10.0.20.38(42662) 2009-10-15 16:21:11.982 EDT>LOG:  could
not receive data from client: Connection reset by peer
<@  2009-10-15 16:21:11.987 EDT>LOG:  server process (PID 16978) was
terminated by signal 11: Segmentation fault
<@  2009-10-15 16:21:11.987 EDT>LOG:  terminating any other active server
processes
<@  2009-10-15 16:21:11.989 EDT>LOG:  all server processes terminated;
reinitializing
<@  2009-10-15 16:21:12.109 EDT>LOG:  database system was interrupted; last
known up at 2009-10-15 16:21:07 EDT
<@  2009-10-15 16:21:12.109 EDT>LOG:  database system was not properly shut
down; automatic recovery in progress
<@  2009-10-15 16:21:12.110 EDT>LOG:  record with zero length at 3/B7C396B8
<@  2009-10-15 16:21:12.110 EDT>LOG:  redo is not required
<@  2009-10-15 16:21:12.137 EDT>LOG:  database system is ready to accept
connections
<@  2009-10-15 16:21:12.137 EDT>LOG:  autovacuum launcher started



-------- postgresql log -with tacplus configured in pam ------

<[unknown]@[unknown]  2009-10-15 16:41:01.544 EDT>LOG:  connection received:
host=10.0.20.38 port=58894
<rdouglas@tacacs 10.0.20.38(58894) 2009-10-15 16:41:01.575 EDT>LOG:  could
not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(58894) 2009-10-15 16:41:01.576 EDT>LOG:
pam_authenticate failed: Insufficient credentials to access authentication
data
<rdouglas@tacacs 10.0.20.38(58894) 2009-10-15 16:41:01.576 EDT>FATAL:  PAM
authentication failed for user "rdouglas"
<[unknown]@[unknown]  2009-10-15 16:41:05.298 EDT>LOG:  connection received:
host=10.0.20.38 port=58895
<rdouglas@tacacs 10.0.20.38(58895) 2009-10-15 16:41:05.681 EDT>LOG:
connection authorized: user=rdouglas database=tacacs



---- /var/log/messages ----

Oct 15 16:21:07 va-mp-db02 kernel: postgres[16971]: segfault at 0 ip
0000000000559624 sp 00007fff43dbe180 error 4 in postgres[400000+439000]
Oct 15 16:21:11 va-mp-db02 kernel: postgres[16978]: segfault at 0 ip
0000000000559624 sp 00007fff43dbe180 error 4 in postgres[400000+439000]
-

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Ryan Douglas" <rdouglas@arbinet.com> writes:
> Whenever I use psql to remotely connect to the database the server crashes
> (see log below). If I use psql with the '-W' option then it's fine.

What this looks like at first glance is a bug in the PAM module you're
using, since Postgres really has no idea which PAM configuration is
being invoked.  Can you get a stack trace to narrow down where the sig11
is happening?  (You will probably need to insert "ulimit -c unlimited"
into the .bash_profile for Postgres to get the postmaster into a context
where it will produce a core dump.)

Also, is this your own build of Postgres, or if not whose?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Heikki Linnakangas
Date:
Ryan Douglas wrote:
> The following bug has been logged online:
>
> Bug reference:      5121
> Logged by:          Ryan Douglas
> Email address:      rdouglas@arbinet.com
> PostgreSQL version: 8.4.1
> Operating system:   Fedora 11
> Description:        Segmentation Fault when using pam w/ krb5
> Details:
>
> Whenever I use psql to remotely connect to the database the server crashes
> (see log below). If I use psql with the '-W' option then it's fine.
>
> I also tested with pam_tacplus.so and in both cases the db didn't crash. It
> just complained about not having credentials to authenticate when the -W
> option is not used.
>
> I can reproduce at will so let me know if you need more information.

Can you get a stack trace with gdb? Something along the lines of:

ulimit -c unlimited
(start postmaster)
(reproduce the crash)
gdb /usr/bin/postgres $PGDATA/core
bt

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> (gdb) bt
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> #1  0x00007f738dfeedd8 in _pam_krb5_conv_call (pamh=<value optimized out>, messages=0xb51780, n_prompts=0,
responses=0x7fff2e356668)at conv.c:99 
> #2  0x00007f738dfefb38 in _pam_krb5_generic_prompter (context=<value optimized out>, data=0x7fff2e357fe0, name=<value
optimizedout>, banner=<value optimized out>, num_prompts=1, 
>     prompts=<value optimized out>, suppress_password_prompts=1) at prompter.c:330
> #3  0x00007f738dfefe10 in _pam_krb5_normal_prompter (context=0x0, data=0xb51890, name=0x7fff2e356668 "",
banner=0x79df27"", num_prompts=0, prompts=0x101010101010101) 
>     at prompter.c:409

Okay, so the dump is definitely way down inside libpam.  The next question
is whether it's really libpam's fault, or are we passing it some bad
data.  Please install the pam-debuginfo RPM that corresponds to the
pam version you have, and retry --- hopefully that will add a bit more
information to the stack trace.

(The debuginfo-install hint you got might help, though I'm not sure why
it's referencing audit-libs and not pam ... maybe this code isn't
actually in libpam?)

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
VG9tLyBIZWlra2kgLA0KDQogICBUaGlzIGlzIGEgY3VzdG9tIGJ1aWxkLiBJ
IHVzZWQgICIuL2NvbmZpZ3VyZSAtLXdpdGgtcGFtIC0td2l0aC1wZXJsIC0t
d2l0aC1weXRob24gLS1lbmFibGUtdGhyZWFkLXNhZmV0eSAtLXdpdGgtb3Bl
bnNzbCAtLXdpdGgta3JiNSIuDQoNCg0KR2RiIG91dHB1dCBiZWxvdy4uLiAg
DQoNCg0KQ29yZSB3YXMgZ2VuZXJhdGVkIGJ5IGBwb3N0Z3JlczogcmRvdWds
YXMgdGFjYWNzIDEwLjAnLiAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICANClByb2dyYW0gdGVybWluYXRlZCB3aXRoIHNpZ25hbCAxMSwg
U2VnbWVudGF0aW9uIGZhdWx0Lg0KIzAgIDB4MDAwMDAwMDAwMDU1OTYyNCBp
biBwYW1fcGFzc3dkX2NvbnZfcHJvYyAoKQ0KTWlzc2luZyBzZXBhcmF0ZSBk
ZWJ1Z2luZm9zLCB1c2U6IGRlYnVnaW5mby1pbnN0YWxsIGF1ZGl0LWxpYnMt
MS43LjEzLTEuZmMxMS54ODZfNjQNCihnZGIpIGJ0DQojMCAgMHgwMDAwMDAw
MDAwNTU5NjI0IGluIHBhbV9wYXNzd2RfY29udl9wcm9jICgpDQojMSAgMHgw
MDAwN2Y3MzhkZmVlZGQ4IGluIF9wYW1fa3JiNV9jb252X2NhbGwgKHBhbWg9
PHZhbHVlIG9wdGltaXplZCBvdXQ+LCBtZXNzYWdlcz0weGI1MTc4MCwgbl9w
cm9tcHRzPTAsIHJlc3BvbnNlcz0weDdmZmYyZTM1NjY2OCkgYXQgY29udi5j
Ojk5DQojMiAgMHgwMDAwN2Y3MzhkZmVmYjM4IGluIF9wYW1fa3JiNV9nZW5l
cmljX3Byb21wdGVyIChjb250ZXh0PTx2YWx1ZSBvcHRpbWl6ZWQgb3V0Piwg
ZGF0YT0weDdmZmYyZTM1N2ZlMCwgbmFtZT08dmFsdWUgb3B0aW1pemVkIG91
dD4sIGJhbm5lcj08dmFsdWUgb3B0aW1pemVkIG91dD4sIG51bV9wcm9tcHRz
PTEsDQogICAgcHJvbXB0cz08dmFsdWUgb3B0aW1pemVkIG91dD4sIHN1cHBy
ZXNzX3Bhc3N3b3JkX3Byb21wdHM9MSkgYXQgcHJvbXB0ZXIuYzozMzANCiMz
ICAweDAwMDA3ZjczOGRmZWZlMTAgaW4gX3BhbV9rcmI1X25vcm1hbF9wcm9t
cHRlciAoY29udGV4dD0weDAsIGRhdGE9MHhiNTE4OTAsIG5hbWU9MHg3ZmZm
MmUzNTY2NjggIiIsIGJhbm5lcj0weDc5ZGYyNyAiIiwgbnVtX3Byb21wdHM9
MCwgcHJvbXB0cz0weDEwMTAxMDEwMTAxMDEwMSkNCiAgICBhdCBwcm9tcHRl
ci5jOjQwOQ0KIzQgIDB4MDAwMDAwMzFkMzY2MGJjZSBpbiBrcmI1X2dldF9h
c19rZXlfcGFzc3dvcmQgKGNvbnRleHQ9MHhiNGU3MTAsIGNsaWVudD08dmFs
dWUgb3B0aW1pemVkIG91dD4sIGV0eXBlPTIzLCBwcm9tcHRlcj08dmFsdWUg
b3B0aW1pemVkIG91dD4sIHByb21wdGVyX2RhdGE9PHZhbHVlIG9wdGltaXpl
ZCBvdXQ+LA0KICAgIHNhbHQ9MHg3ZmZmMmUzNTZmMDAsIHBhcmFtcz0weDdm
ZmYyZTM1NmVmMCwgYXNfa2V5PTB4N2ZmZjJlMzU2ZWMwLCBnYWtfZGF0YT0w
eDdmZmYyZTM1NzEyMCkgYXQgZ2ljX3B3ZC5jOjYxDQojNSAgMHgwMDAwMDAz
MWQzNjY3NzEzIGluIHBhX2VuY190aW1lc3RhbXAgKGNvbnRleHQ9MHhiNGU3
MTAsIHJlcXVlc3Q9PHZhbHVlIG9wdGltaXplZCBvdXQ+LCBpbl9wYWRhdGE9
PHZhbHVlIG9wdGltaXplZCBvdXQ+LCBvdXRfcGFkYXRhPTB4N2ZmZjJlMzU2
ZDMwLCBzYWx0PTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwNCiAgICBzMmtwYXJh
bXM9PHZhbHVlIG9wdGltaXplZCBvdXQ+LCBldHlwZT0weDdmZmYyZTM1NmY0
YywgYXNfa2V5PTB4N2ZmZjJlMzU2ZWMwLCBwcm9tcHRlcj0weDdmNzM4ZGZl
ZmUwMCA8X3BhbV9rcmI1X25vcm1hbF9wcm9tcHRlcj4sIHByb21wdGVyX2Rh
dGE9MHg3ZmZmMmUzNTdmZTAsDQogICAgZ2FrX2ZjdD0weDMxZDM2NjA5ZjAg
PGtyYjVfZ2V0X2FzX2tleV9wYXNzd29yZD4sIGdha19kYXRhPTB4N2ZmZjJl
MzU3MTIwKSBhdCBwcmVhdXRoMi5jOjYzNQ0KIzYgIDB4MDAwMDAwMzFkMzY2
N2UwYyBpbiBrcmI1X2RvX3ByZWF1dGggKGNvbnRleHQ9PHZhbHVlIG9wdGlt
aXplZCBvdXQ+LCByZXF1ZXN0PTB4N2ZmZjJlMzU2ZTQwLCBlbmNvZGVkX3Jl
cXVlc3RfYm9keT08dmFsdWUgb3B0aW1pemVkIG91dD4sDQogICAgZW5jb2Rl
ZF9wcmV2aW91c19yZXF1ZXN0PTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwgaW5f
cGFkYXRhPTB4YjUxMDYwLCBvdXRfcGFkYXRhPTx2YWx1ZSBvcHRpbWl6ZWQg
b3V0Piwgc2FsdD0weDdmZmYyZTM1NmYwMCwgczJrcGFyYW1zPTB4N2ZmZjJl
MzU2ZWYwLCBldHlwZT0weDdmZmYyZTM1NmY0YywNCiAgICBhc19rZXk9MHg3
ZmZmMmUzNTZlYzAsIHByb21wdGVyPTB4N2Y3MzhkZmVmZTAwIDxfcGFtX2ty
YjVfbm9ybWFsX3Byb21wdGVyPiwgcHJvbXB0ZXJfZGF0YT0weDdmZmYyZTM1
N2ZlMCwgZ2FrX2ZjdD0weDMxZDM2NjA5ZjAgPGtyYjVfZ2V0X2FzX2tleV9w
YXNzd29yZD4sDQogICAgZ2FrX2RhdGE9MHg3ZmZmMmUzNTcxMjAsIGdldF9k
YXRhX3JvY2s9MHg3ZmZmMmUzNTZlZTAsIG9wdGU9MHhiNGVjNTApIGF0IHBy
ZWF1dGgyLmM6MTU4Ng0KIzcgIDB4MDAwMDAwMzFkMzY1ZjI1MSBpbiBrcmI1
X2dldF9pbml0X2NyZWRzIChjb250ZXh0PTB4YjRlNzEwLCBjcmVkcz08dmFs
dWUgb3B0aW1pemVkIG91dD4sIGNsaWVudD08dmFsdWUgb3B0aW1pemVkIG91
dD4sIHByb21wdGVyPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwNCiAgICBwcm9t
cHRlcl9kYXRhPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0Piwgc3RhcnRfdGltZT08
dmFsdWUgb3B0aW1pemVkIG91dD4sIGluX3RrdF9zZXJ2aWNlPTB4N2ZmZjJl
MzU4MDUwICJrcmJ0Z3QvVEhFWENIQU5HRS5DT01AVEhFWENIQU5HRS5DT00i
LCBvcHRpb25zPTB4YjRlYzUwLA0KICAgIGdha19mY3Q9MHgzMWQzNjYwOWYw
IDxrcmI1X2dldF9hc19rZXlfcGFzc3dvcmQ+LCBnYWtfZGF0YT0weDdmZmYy
ZTM1NzEyMCwgdXNlX21hc3Rlcj0weDdmZmYyZTM1NzE1YywgYXNfcmVwbHk9
MHg3ZmZmMmUzNTcxNTApIGF0IGdldF9pbl90a3QuYzoxMTA2DQojOCAgMHgw
MDAwMDAzMWQzNjYwZjE4IGluIGtyYjVfZ2V0X2luaXRfY3JlZHNfcGFzc3dv
cmQgKGNvbnRleHQ9MHhiNGU3MTAsIGNyZWRzPTx2YWx1ZSBvcHRpbWl6ZWQg
b3V0PiwgY2xpZW50PTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwgcGFzc3dvcmQ9
PHZhbHVlIG9wdGltaXplZCBvdXQ+LA0KICAgIHByb21wdGVyPTB4N2Y3Mzhk
ZmVmZTAwIDxfcGFtX2tyYjVfbm9ybWFsX3Byb21wdGVyPiwgZGF0YT08dmFs
dWUgb3B0aW1pemVkIG91dD4sIHN0YXJ0X3RpbWU9MCwgaW5fdGt0X3NlcnZp
Y2U9MHg3ZmZmMmUzNTgwNTAgImtyYnRndC9USEVYQ0hBTkdFLkNPTUBUSEVY
Q0hBTkdFLkNPTSIsDQogICAgb3B0aW9ucz0weGI0ZWM1MCkgYXQgZ2ljX3B3
ZC5jOjEzOQ0KIzkgIDB4MDAwMDdmNzM4ZGZmNTU3MSBpbiB2NV9nZXRfY3Jl
ZHMgKGN0eD0weGI0ZTcxMCwgcGFtaD08dmFsdWUgb3B0aW1pemVkIG91dD4s
IGNyZWRzPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwgdXNlcj08dmFsdWUgb3B0
aW1pemVkIG91dD4sIHVzZXJpbmZvPTB4YjRlZmUwLCBvcHRpb25zPTB4YjRl
Y2IwLA0KICAgIHNlcnZpY2U9MHg3ZjczOGRmZjliZjggImtyYnRndCIsIHBh
c3N3b3JkPTB4MCwgZ2ljX29wdGlvbnM9MHhiNGVjNTAsIHByb21wdGVyPTB4
N2Y3MzhkZmVmZTAwIDxfcGFtX2tyYjVfbm9ybWFsX3Byb21wdGVyPiwgcmVz
dWx0PTB4YjUwNWM0KSBhdCB2NS5jOjEwMTQNCiMxMCAweDAwMDA3ZjczOGRm
ZWIzY2YgaW4gcGFtX3NtX2F1dGhlbnRpY2F0ZSAocGFtaD0weGI1ZjQ2MCwg
ZmxhZ3M9MCwgYXJnYz08dmFsdWUgb3B0aW1pemVkIG91dD4sIGFyZ3Y9PHZh
bHVlIG9wdGltaXplZCBvdXQ+KSBhdCBhdXRoLmM6NDIzDQojMTEgMHgwMDAw
MDAzMWQwMjAyYzFlIGluIF9wYW1fZGlzcGF0Y2hfYXV4ICh1c2VfY2FjaGVk
X2NoYWluPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwgcmVzdW1lZD08dmFsdWUg
b3B0aW1pemVkIG91dD4sIGg9PHZhbHVlIG9wdGltaXplZCBvdXQ+LCBmbGFn
cz08dmFsdWUgb3B0aW1pemVkIG91dD4sDQogICAgcGFtaD08dmFsdWUgb3B0
aW1pemVkIG91dD4pIGF0IHBhbV9kaXNwYXRjaC5jOjExMA0KIzEyIF9wYW1f
ZGlzcGF0Y2ggKHVzZV9jYWNoZWRfY2hhaW49PHZhbHVlIG9wdGltaXplZCBv
dXQ+LCByZXN1bWVkPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0PiwgaD08dmFsdWUg
b3B0aW1pemVkIG91dD4sIGZsYWdzPTx2YWx1ZSBvcHRpbWl6ZWQgb3V0Piwg
cGFtaD08dmFsdWUgb3B0aW1pemVkIG91dD4pDQogICAgYXQgcGFtX2Rpc3Bh
dGNoLmM6NDA3DQojMTMgMHgwMDAwMDAzMWQwMjAyNTAwIGluIHBhbV9hdXRo
ZW50aWNhdGUgKHBhbWg9MHhiNWY0NjAsIGZsYWdzPTApIGF0IHBhbV9hdXRo
LmM6MzQNCiMxNCAweDAwMDAwMDAwMDA1NTk4ZWQgaW4gQ2hlY2tQQU1BdXRo
LmNsb25lLjAgKCkNCiMxNSAweDAwMDAwMDAwMDA1NTliOTYgaW4gQ2xpZW50
QXV0aGVudGljYXRpb24gKCkNCiMxNiAweDAwMDAwMDAwMDA1YjI1ZGMgaW4g
QmFja2VuZEluaXRpYWxpemUgKCkNCiMxNyAweDAwMDAwMDAwMDA1YjJlYmMg
aW4gU2VydmVyTG9vcCAoKQ0KIzE4IDB4MDAwMDAwMDAwMDViNTU5YyBpbiBQ
b3N0bWFzdGVyTWFpbiAoKQ0KIzE5IDB4MDAwMDAwMDAwMDU2MTdkMCBpbiBt
YWluICgpDQoNCi0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBI
ZWlra2kgTGlubmFrYW5nYXMgW21haWx0bzpoZWlra2kubGlubmFrYW5nYXNA
ZW50ZXJwcmlzZWRiLmNvbV0gDQpTZW50OiBUaHVyc2RheSwgT2N0b2JlciAx
NSwgMjAwOSA1OjIzIFBNDQpUbzogRG91Z2xhcywgUnlhbg0KQ2M6IHBnc3Fs
LWJ1Z3NAcG9zdGdyZXNxbC5vcmcNClN1YmplY3Q6IFJlOiBbQlVHU10gQlVH
ICM1MTIxOiBTZWdtZW50YXRpb24gRmF1bHQgd2hlbiB1c2luZyBwYW0gdy8g
a3JiNQ0KDQpSeWFuIERvdWdsYXMgd3JvdGU6DQo+IFRoZSBmb2xsb3dpbmcg
YnVnIGhhcyBiZWVuIGxvZ2dlZCBvbmxpbmU6DQo+IA0KPiBCdWcgcmVmZXJl
bmNlOiAgICAgIDUxMjENCj4gTG9nZ2VkIGJ5OiAgICAgICAgICBSeWFuIERv
dWdsYXMNCj4gRW1haWwgYWRkcmVzczogICAgICByZG91Z2xhc0BhcmJpbmV0
LmNvbQ0KPiBQb3N0Z3JlU1FMIHZlcnNpb246IDguNC4xDQo+IE9wZXJhdGlu
ZyBzeXN0ZW06ICAgRmVkb3JhIDExDQo+IERlc2NyaXB0aW9uOiAgICAgICAg
U2VnbWVudGF0aW9uIEZhdWx0IHdoZW4gdXNpbmcgcGFtIHcvIGtyYjUNCj4g
RGV0YWlsczogDQo+IA0KPiBXaGVuZXZlciBJIHVzZSBwc3FsIHRvIHJlbW90
ZWx5IGNvbm5lY3QgdG8gdGhlIGRhdGFiYXNlIHRoZSBzZXJ2ZXIgY3Jhc2hl
cw0KPiAoc2VlIGxvZyBiZWxvdykuIElmIEkgdXNlIHBzcWwgd2l0aCB0aGUg
Jy1XJyBvcHRpb24gdGhlbiBpdCdzIGZpbmUuIA0KPiANCj4gSSBhbHNvIHRl
c3RlZCB3aXRoIHBhbV90YWNwbHVzLnNvIGFuZCBpbiBib3RoIGNhc2VzIHRo
ZSBkYiBkaWRuJ3QgY3Jhc2guIEl0DQo+IGp1c3QgY29tcGxhaW5lZCBhYm91
dCBub3QgaGF2aW5nIGNyZWRlbnRpYWxzIHRvIGF1dGhlbnRpY2F0ZSB3aGVu
IHRoZSAtVw0KPiBvcHRpb24gaXMgbm90IHVzZWQuDQo+IA0KPiBJIGNhbiBy
ZXByb2R1Y2UgYXQgd2lsbCBzbyBsZXQgbWUga25vdyBpZiB5b3UgbmVlZCBt
b3JlIGluZm9ybWF0aW9uLg0KDQpDYW4geW91IGdldCBhIHN0YWNrIHRyYWNl
IHdpdGggZ2RiPyBTb21ldGhpbmcgYWxvbmcgdGhlIGxpbmVzIG9mOg0KDQp1
bGltaXQgLWMgdW5saW1pdGVkDQooc3RhcnQgcG9zdG1hc3RlcikNCihyZXBy
b2R1Y2UgdGhlIGNyYXNoKQ0KZ2RiIC91c3IvYmluL3Bvc3RncmVzICRQR0RB
VEEvY29yZQ0KYnQNCg0KLS0gDQogIEhlaWtraSBMaW5uYWthbmdhcw0KICBF
bnRlcnByaXNlREIgICBodHRwOi8vd3d3LmVudGVycHJpc2VkYi5jb20NCg==

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
When I initially ran gdb I got the following....

=2E
=2E
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000559624 in pam_passwd_conv_proc ()
Missing separate debuginfos, use: debuginfo-install
audit-libs-1.7.13-1.fc11.x86_64 e2fsprogs-libs-1.41.4-12.fc11.x86_64
glibc-2.10.1-5.x86_64 keyutils-libs-1.2-5.fc11.x86_64
krb5-libs-1.6.3-20.fc11.x86_64 libselinux-2.0.80-1.fc11.x86_64
nss-softokn-freebl-3.12.4-3.fc11.x86_64 openssl-0.9.8k-5.fc11.x86_64
pam-1.0.91-6.fc11.x86_64 pam_krb5-2.3.5-1.fc11.x86_64
zlib-1.2.3-22.fc11.x86_64
(gdb) bt
#0  0x0000000000559624 in pam_passwd_conv_proc ()
#1  0x00007f738dfeedd8 in ?? () from /lib64/security/pam_krb5.so
=2E
=2E


I ran debuginfo-install which successfully installed all the related
packages, including pam,  except for audit-libs due to missing
dependency. Probably because of some package I didn't install.=20

For the sake of completeness, I'll install 8.4.1 on another machine
which has all the deps met and try to reproduce the problem.=20



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Thursday, October 15, 2009 6:07 PM
To: Douglas, Ryan
Cc: pgsql-bugs@postgreSQL.org
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> (gdb) bt
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> #1  0x00007f738dfeedd8 in _pam_krb5_conv_call (pamh=3D<value optimized
out>, messages=3D0xb51780, n_prompts=3D0, responses=3D0x7fff2e356668) at
conv.c:99
> #2  0x00007f738dfefb38 in _pam_krb5_generic_prompter (context=3D<value
optimized out>, data=3D0x7fff2e357fe0, name=3D<value optimized out>,
banner=3D<value optimized out>, num_prompts=3D1,
>     prompts=3D<value optimized out>, suppress_password_prompts=3D1) at
prompter.c:330
> #3  0x00007f738dfefe10 in _pam_krb5_normal_prompter (context=3D0x0,
data=3D0xb51890, name=3D0x7fff2e356668 "", banner=3D0x79df27 "",
num_prompts=3D0, prompts=3D0x101010101010101)
>     at prompter.c:409

Okay, so the dump is definitely way down inside libpam.  The next
question
is whether it's really libpam's fault, or are we passing it some bad
data.  Please install the pam-debuginfo RPM that corresponds to the
pam version you have, and retry --- hopefully that will add a bit more
information to the stack trace.

(The debuginfo-install hint you got might help, though I'm not sure why
it's referencing audit-libs and not pam ... maybe this code isn't
actually in libpam?)

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
Ok... I compiled 8.2.14 and 8.3.8 with the same configure options. They
both segfaut when doing the authentication. Note that when I use -W
option with psql everything works ok.  I went back to 8.4.1 enabled the
max debug level then re-tested and got the info below in the log. I went
to line 769 in pqcomm.c and commented the ereport and return EOF lines
and recompiled.  Now when I connect it works without a segfault. Hope
this narrows things down.


----- Pqcomm.c ----

                        /*
                         * Careful: an ereport() that tries to write to
the client would
                         * cause recursion to here, leading to stack
overflow and core
                         * dump!  This message must go *only* to the
postmaster log.
                         */
                        /* ereport(COMMERROR,
                                        (errcode_for_socket_access(),
                                         errmsg("could not receive data
from client: %m")));
                        return EOF; */






------ postgresql log -----
=2E
=2E
=2E
<@  2009-10-15 23:39:06.725 EDT>DEBUG:  00000: forked new backend,
pid=3D23184 socket=3D8
<@  2009-10-15 23:39:06.725 EDT>LOCATION:  BackendStartup,
postmaster.c:3083
<[unknown]@[unknown]  2009-10-15 23:39:06.726 EDT>LOG:  00000:
connection received: host=3D10.0.20.38 port=3D36929
<[unknown]@[unknown]  2009-10-15 23:39:06.726 EDT>LOCATION:
BackendInitialize, postmaster.c:3259
<[unknown]@[unknown] 10.0.20.38(36929) 2009-10-15 23:39:06.754
EDT>DEBUG:  00000: SSL connection from "(anonymous)"
<[unknown]@[unknown] 10.0.20.38(36929) 2009-10-15 23:39:06.754
EDT>LOCATION:  open_server_SSL, be-secure.c:961
<rdouglas@tacacs 10.0.20.38(36929) 2009-10-15 23:39:06.759 EDT>DEBUG:
00000: SSL: read alert (0x0100)
<rdouglas@tacacs 10.0.20.38(36929) 2009-10-15 23:39:06.759 EDT>LOCATION:
info_cb, be-secure.c:699
<rdouglas@tacacs 10.0.20.38(36929) 2009-10-15 23:39:06.759 EDT>LOG:
08006: could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(36929) 2009-10-15 23:39:06.759 EDT>LOCATION:
pq_recvbuf, pqcomm.c:769
<@  2009-10-15 23:39:20.546 EDT>DEBUG:  00000: reaping dead processes
<@  2009-10-15 23:39:20.546 EDT>LOCATION:  reaper, postmaster.c:2236
<@  2009-10-15 23:39:20.546 EDT>DEBUG:  00000: server process (PID
23184) was terminated by signal 11: Segmentation fault
<@  2009-10-15 23:39:20.546 EDT>LOCATION:  LogChildExit,
postmaster.c:2725
<@  2009-10-15 23:39:20.546 EDT>LOG:  00000: server process (PID 23184)
was terminated by signal 11: Segmentation fault
<@  2009-10-15 23:39:20.546 EDT>LOCATION:  LogChildExit,
postmaster.c:2725
<@  2009-10-15 23:39:20.546 EDT>LOG:  00000: terminating any other
active server processes
=2E
=2E
=2E

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Thursday, October 15, 2009 7:26 PM
To: Douglas, Ryan
Cc: Magnus Hagander
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> My krb5.conf points to MS Active Directory...

Oh, that's *definitely* relevant information.  I have no idea how well
tested that combination is.  Magnus, can you comment on whether there
might be bugs in the combination of PG 8.4.1, Fedora 11 PAM libraries,
and an MS controller?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
Both actually...=20=20

When just the "return" is commented out the routine goes into a loop and
the following shows up in the logs...

<rdouglas@tacacs 10.0.20.38(47254) 2009-10-16 01:12:18.935 EDT>LOG:
could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(47254) 2009-10-16 01:12:18.935 EDT>LOG:
could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(47254) 2009-10-16 01:12:18.935 EDT>LOG:
could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(47254) 2009-10-16 01:12:18.935 EDT>LOG:
could not receive data from client: Connection reset by peer

When just "ereport" is commented out I still get a sigfault...=20

<[unknown]@[unknown]  2009-10-16 01:13:49.211 EDT>LOG:  connection
received: host=3D10.0.20.38 port=3D32815
<@  2009-10-16 01:13:49.258 EDT>LOG:  server process (PID 13201) was
terminated by signal 11: Segmentation fault
<@  2009-10-16 01:13:49.258 EDT>LOG:  terminating any other active
server processes
<@  2009-10-16 01:13:49.262 EDT>LOG:  all server processes terminated;
reinitializing
<@  2009-10-16 01:13:49.381 EDT>LOG:  database system was interrupted;
last known up at 2009-10-16 01:13:37 EDT
<@  2009-10-16 01:13:49.382 EDT>LOG:  database system was not properly
shut down; automatic recovery in progress
<@  2009-10-16 01:13:49.382 EDT>LOG:  record with zero length at
3/B7D75538
<@  2009-10-16 01:13:49.383 EDT>LOG:  redo is not required
<@  2009-10-16 01:13:49.409 EDT>LOG:  autovacuum launcher started
<@  2009-10-16 01:13:49.410 EDT>LOG:  database system is ready to accept
connections


But when both are commented it seems to work...

<[unknown]@[unknown]  2009-10-16 01:15:05.330 EDT>LOG:  connection
received: host=3D10.0.20.38 port=3D32816
<[unknown]@[unknown]  2009-10-16 01:15:17.436 EDT>LOG:  connection
received: host=3D10.0.20.38 port=3D32817
<rdouglas@tacacs 10.0.20.38(32817) 2009-10-16 01:15:17.486 EDT>LOG:
connection authorized: user=3Drdouglas database=3Dtacacs


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Friday, October 16, 2009 1:01 AM
To: Douglas, Ryan
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> Ok... I compiled 8.2.14 and 8.3.8 with the same configure options.
They
> both segfaut when doing the authentication. Note that when I use -W
> option with psql everything works ok.  I went back to 8.4.1 enabled
the
> max debug level then re-tested and got the info below in the log. I
went
> to line 769 in pqcomm.c and commented the ereport and return EOF lines
> and recompiled.  Now when I connect it works without a segfault. Hope
> this narrows things down.

Which is it that makes the difference ... the ereport or the return?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
Well, we certainly are not going to break pq_recvbuf in such a
fundamental way as that ;-).  I think that the more likely place
to work around this is going to be in pam_passwd_conv_proc() in
src/backend/libpq/auth.c.  In particular, I see that when it gets
a NULL from recv_password_packet (which is what's going to happen
when pq_recvbuf returns EOF), it returns PAM_CONV_ERR without
bothering to set *resp.  I wonder whether we need to be initializing
*resp to NULL, or even making it really valid?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> Program terminated with signal 11, Segmentation fault.
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> Missing separate debuginfos, use: debuginfo-install audit-libs-1.7.13-1.fc11.x86_64
> (gdb) bt
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> #1  0x00007f738dfeedd8 in _pam_krb5_conv_call (pamh=<value optimized out>, messages=0xb51780, n_prompts=0,
responses=0x7fff2e356668)at conv.c:99 
> #2  0x00007f738dfefb38 in _pam_krb5_generic_prompter (context=<value optimized out>, data=0x7fff2e357fe0, name=<value
optimizedout>, banner=<value optimized out>, num_prompts=1, 
>     prompts=<value optimized out>, suppress_password_prompts=1) at prompter.c:330

Actually, now that I look more closely at that stack trace,
pam_passwd_conv_proc *is* a Postgres function --- so the core dump
is happening when libpam calls us back.  (I wonder why gdb failed
to present any information about it?  Are you using a stripped
postgres executable?)

In a quick look at the source for pam_passwd_conv_proc, the only
very plausible explanation for why it would segfault in isolated
cases seems to be that the initial sanity check on the passed-in
message status might be assuming more than it should --- in particular
it would obviously dump core if msg is null or msg[0] is null.

I am thinking that maybe, when the KDC is Active Directory and there's
no password supplied already, libpam makes additional calls to the
conv_proc with parameter values that we're not prepared to handle.
Can you add additional debug printouts or step through the code
and verify what's happening there?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> I added some logging statements in the pam_passwd_conv_proc function and
> it gets as far as checking if the password is null then returning
> PAM_CONV_ERR.

That's what I would expect to happen if you don't use -W in psql.
What I suspect is happening with the MS KDC is that an additional
call to the conv_proc happens later with some parameter values that
we aren't prepared for.  Can you stick some debugging printout
into the checks at the very top of the routine?

>                 if (passwd == NULL)
>                                         ereport(LOG,(errmsg("RD - passwd
> is NULL... returning PAM_CONV_ERR")));
>                         return PAM_CONV_ERR;    /* client didn't want to
> send password */

I think you need to add some braces if you don't want to break the
behavior here.  This is C not Python ...

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> Any tips on using gdb to step through the code?

If you set pre_auth_delay to 30s or so in postgresql.conf, it'll
slow things down enough so that you can get the backend's PID from
"ps" and then attach to it with gdb.  Then set a breakpoint at
pam_passwd_conv_proc, continue, and away you go.

You'll definitely want a debug-enabled postgres executable, though,
which you did not seem to have before.

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
I added some logging statements in the pam_passwd_conv_proc function and
it gets as far as checking if the password is null then returning
PAM_CONV_ERR.


        if (strlen(appdata_ptr) =3D=3D 0)
        {
                char       *passwd;

                sendAuthRequest(pam_port_cludge, AUTH_REQ_PASSWORD);
                passwd =3D recv_password_packet(pam_port_cludge);

                if (passwd =3D=3D NULL)
                                        ereport(LOG,(errmsg("RD - passwd
is NULL... returning PAM_CONV_ERR")));
                        return PAM_CONV_ERR;    /* client didn't want to
send password */

                if (strlen(passwd) =3D=3D 0)
                {
                        ereport(LOG,
                                        (errmsg("empty password returned
by client")));
                        return PAM_CONV_ERR;
                }
                appdata_ptr =3D passwd;
        }


----- pg log -----

<[unknown]@[unknown]  2009-10-16 12:16:07.033 EDT>LOG:  connection
received: host=3D10.0.20.38 port=3D35945
<rdouglas@tacacs 10.0.20.38(35945) 2009-10-16 12:16:07.065 EDT>LOG:
could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(35945) 2009-10-16 12:16:07.065 EDT>LOG:  RD
- passwd is NULL... returning PAM_CONV_ERR
<@  2009-10-16 12:16:07.071 EDT>LOG:  server process (PID 2176) was
terminated by signal 11: Segmentation fault
<@  2009-10-16 12:16:07.071 EDT>LOG:  terminating any other active
server processes
<@  2009-10-16 12:16:07.074 EDT>LOG:  all server processes terminated;
reinitializing
<@  2009-10-16 12:16:07.194 EDT>LOG:  database system was interrupted;
last known up at 2009-10-16 12:15:58 EDT
<@  2009-10-16 12:16:07.195 EDT>LOG:  database system was not properly
shut down; automatic recovery in progress
<@  2009-10-16 12:16:07.196 EDT>LOG:  record with zero length at
3/B7D75778
<@  2009-10-16 12:16:07.196 EDT>LOG:  redo is not required
<@  2009-10-16 12:16:07.223 EDT>LOG:  database system is ready to accept
connections
<@  2009-10-16 12:16:07.223 EDT>LOG:  autovacuum launcher started


-Ryan

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Friday, October 16, 2009 11:11 AM
To: Douglas, Ryan
Cc: pgsql-bugs@postgreSQL.org
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


Well, we certainly are not going to break pq_recvbuf in such a
fundamental way as that ;-).  I think that the more likely place
to work around this is going to be in pam_passwd_conv_proc() in
src/backend/libpq/auth.c.  In particular, I see that when it gets
a NULL from recv_password_packet (which is what's going to happen
when pq_recvbuf returns EOF), it returns PAM_CONV_ERR without
bothering to set *resp.  I wonder whether we need to be initializing
*resp to NULL, or even making it really valid?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
Any tips on using gdb to step through the code?


[postgres@va-mp-db02 ~]$ file /usr/local/pgsql/bin/postgres
/usr/local/pgsql/bin/postgres: ELF 64-bit LSB executable, x86-64,
version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux
2.6.18, not stripped

---- Pg Log ---

<[unknown]@[unknown]  2009-10-16 12:33:33.600 EDT>LOG:  00000:
connection received: host=3D10.0.20.38 port=3D57014
<[unknown]@[unknown]  2009-10-16 12:33:33.600 EDT>LOCATION:
BackendInitialize, postmaster.c:3259
<[unknown]@[unknown] 10.0.20.38(57014) 2009-10-16 12:33:33.629
EDT>DEBUG:  00000: SSL connection from "(anonymous)"
<[unknown]@[unknown] 10.0.20.38(57014) 2009-10-16 12:33:33.629
EDT>LOCATION:  open_server_SSL, be-secure.c:961
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>DEBUG:
00000: SSL: read alert (0x0100)
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>LOCATION:
info_cb, be-secure.c:699
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>LOG:
08006: could not receive data from client: Connection reset by peer
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>LOCATION:
pq_recvbuf, pqcomm.c:769
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>LOG:
00000: RD - passwd is NULL... returning PAM_CONV_ERR
<rdouglas@tacacs 10.0.20.38(57014) 2009-10-16 12:33:33.634 EDT>LOCATION:
pam_passwd_conv_proc, auth.c:1906
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: reaping dead processes
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  reaper, postmaster.c:2236
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: server process (PID 2257)
was terminated by signal 11: Segmentation fault
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  LogChildExit,
postmaster.c:2725
<@  2009-10-16 12:33:33.641 EDT>LOG:  00000: server process (PID 2257)
was terminated by signal 11: Segmentation fault
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  LogChildExit,
postmaster.c:2725
<@  2009-10-16 12:33:33.641 EDT>LOG:  00000: terminating any other
active server processes
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  HandleChildCrash,
postmaster.c:2552
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: sending SIGQUIT to
process 2251
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  HandleChildCrash,
postmaster.c:2621
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: sending SIGQUIT to
process 2252
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  HandleChildCrash,
postmaster.c:2633
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: shmem_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  shmem_exit, ipc.c:197
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: proc_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  proc_exit_prepare, ipc.c:169
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: shmem_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  shmem_exit, ipc.c:197
<@  2009-10-16 12:33:33.641 EDT>DEBUG:  00000: proc_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.641 EDT>LOCATION:  proc_exit_prepare, ipc.c:169
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: sending SIGQUIT to
process 2253
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  HandleChildCrash,
postmaster.c:2645
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: sending SIGQUIT to
process 2254
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  HandleChildCrash,
postmaster.c:2675
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: shmem_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  shmem_exit, ipc.c:197
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: proc_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  proc_exit_prepare, ipc.c:169
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: shmem_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  shmem_exit, ipc.c:197
<@  2009-10-16 12:33:33.643 EDT>DEBUG:  00000: proc_exit(-1): 0
callbacks to make
<@  2009-10-16 12:33:33.643 EDT>LOCATION:  proc_exit_prepare, ipc.c:169
<@  2009-10-16 12:33:33.644 EDT>DEBUG:  00000: reaping dead processes
<@  2009-10-16 12:33:33.644 EDT>LOCATION:  reaper, postmaster.c:2236
<@  2009-10-16 12:33:33.644 EDT>DEBUG:  00000: reaping dead processes
<@  2009-10-16 12:33:33.644 EDT>LOCATION:  reaper, postmaster.c:2236
<@  2009-10-16 12:33:33.644 EDT>LOG:  00000: all server processes
terminated; reinitializing




-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Friday, October 16, 2009 12:19 PM
To: Douglas, Ryan
Cc: pgsql-bugs@postgreSQL.org
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


"Douglas, Ryan" <RDouglas@arbinet.com> writes:
> Program terminated with signal 11, Segmentation fault.
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> Missing separate debuginfos, use: debuginfo-install
audit-libs-1.7.13-1.fc11.x86_64
> (gdb) bt
> #0  0x0000000000559624 in pam_passwd_conv_proc ()
> #1  0x00007f738dfeedd8 in _pam_krb5_conv_call (pamh=3D<value optimized
out>, messages=3D0xb51780, n_prompts=3D0, responses=3D0x7fff2e356668) at
conv.c:99
> #2  0x00007f738dfefb38 in _pam_krb5_generic_prompter (context=3D<value
optimized out>, data=3D0x7fff2e357fe0, name=3D<value optimized out>,
banner=3D<value optimized out>, num_prompts=3D1,
>     prompts=3D<value optimized out>, suppress_password_prompts=3D1) at
prompter.c:330

Actually, now that I look more closely at that stack trace,
pam_passwd_conv_proc *is* a Postgres function --- so the core dump
is happening when libpam calls us back.  (I wonder why gdb failed
to present any information about it?  Are you using a stripped
postgres executable?)

In a quick look at the source for pam_passwd_conv_proc, the only
very plausible explanation for why it would segfault in isolated
cases seems to be that the initial sanity check on the passed-in
message status might be assuming more than it should --- in particular
it would obviously dump core if msg is null or msg[0] is null.

I am thinking that maybe, when the KDC is Active Directory and there's
no password supplied already, libpam makes additional calls to the
conv_proc with parameter values that we're not prepared to handle.
Can you add additional debug printouts or step through the code
and verify what's happening there?

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
I wrote:
> The best idea I can come up with is that the conv_proc is being called
> with zero messages and is dumping core because it tries to print the
> contents of msg[0].  However, it's far from clear why libpam would
> bother to call it with zero messages.

Hah --- found it.  (Man, it is so nice working with open source that
you can actually look at...)  prompter.c in pam_krb5 has

        /* Skip any prompt for which the supplied default answer is the
         * previously-entered password -- it's just a waste of the
         * user's time.  */

So it definitely is possible to call our proc with zero messages, and
whether this will happen or not is probably dependent on the behavior
of the KDC, and even then, ereport might or might not dump core depending
on the contents of the not-allocated msg[0] array member.

I will go and rewrite this function to look more like openssh's,
on the assumption that their version is probably pretty well battle
tested.

            regards, tom lane

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Magnus Hagander
Date:
2009/10/16 Tom Lane <tgl@sss.pgh.pa.us>:
> I wrote:
>> The best idea I can come up with is that the conv_proc is being called
>> with zero messages and is dumping core because it tries to print the
>> contents of msg[0]. =A0However, it's far from clear why libpam would
>> bother to call it with zero messages.
>
> Hah --- found it. =A0(Man, it is so nice working with open source that
> you can actually look at...) =A0prompter.c in pam_krb5 has
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Skip any prompt for which the supplied =
default answer is the
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * previously-entered password -- it's jus=
t a waste of the
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * user's time. =A0*/
>
> So it definitely is possible to call our proc with zero messages, and
> whether this will happen or not is probably dependent on the behavior
> of the KDC, and even then, ereport might or might not dump core depending
> on the contents of the not-allocated msg[0] array member.
>
> I will go and rewrite this function to look more like openssh's,
> on the assumption that their version is probably pretty well battle
> tested.

Yeah, that sounds like a reasonable thing to do.

--=20
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
Tom Lane
Date:
"Douglas, Ryan" <RDouglas@arbinet.com> writes:
>    You were right. According to the trace msg[0] is null.

Hah.  This must be triggered by something Active Directory does that
a KDC doesn't, because I'm still not seeing it here.  But anyway the
problem is clear now, we have to avoid referencing msg[0] when num_msg
is zero.

Please try the attached patch and see if it behaves sanely for you.
This is based on openssh's PAM callback, so it ought to be more
robust than what we had.  (This is against 8.4 branch tip, but it
should apply to 8.4.1 with maybe a few lines' offset.)

            regards, tom lane

Index: auth.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/libpq/auth.c,v
retrieving revision 1.183.2.2
diff -c -r1.183.2.2 auth.c
*** auth.c    14 Oct 2009 22:10:01 -0000    1.183.2.2
--- auth.c    16 Oct 2009 20:08:39 -0000
***************
*** 441,447 ****

          case uaPAM:
  #ifdef USE_PAM
-             pam_port_cludge = port;
              status = CheckPAMAuth(port, port->user_name, "");
  #else
              Assert(false);
--- 441,446 ----
***************
*** 1880,1940 ****
  pam_passwd_conv_proc(int num_msg, const struct pam_message ** msg,
                       struct pam_response ** resp, void *appdata_ptr)
  {
!     if (num_msg != 1 || msg[0]->msg_style != PAM_PROMPT_ECHO_OFF)
!     {
!         switch (msg[0]->msg_style)
!         {
!             case PAM_ERROR_MSG:
!                 ereport(LOG,
!                         (errmsg("error from underlying PAM layer: %s",
!                                 msg[0]->msg)));
!                 return PAM_CONV_ERR;
!             default:
!                 ereport(LOG,
!                         (errmsg("unsupported PAM conversation %d/%s",
!                                 msg[0]->msg_style, msg[0]->msg)));
!                 return PAM_CONV_ERR;
!         }
!     }

!     if (!appdata_ptr)
      {
          /*
           * Workaround for Solaris 2.6 where the PAM library is broken and does
           * not pass appdata_ptr to the conversation routine
           */
!         appdata_ptr = pam_passwd;
      }

!     /*
!      * Password wasn't passed to PAM the first time around - let's go ask the
!      * client to send a password, which we then stuff into PAM.
!      */
!     if (strlen(appdata_ptr) == 0)
!     {
!         char       *passwd;
!
!         sendAuthRequest(pam_port_cludge, AUTH_REQ_PASSWORD);
!         passwd = recv_password_packet(pam_port_cludge);

!         if (passwd == NULL)
!             return PAM_CONV_ERR;    /* client didn't want to send password */
!
!         if (strlen(passwd) == 0)
!         {
!             ereport(LOG,
!                     (errmsg("empty password returned by client")));
!             return PAM_CONV_ERR;
!         }
!         appdata_ptr = passwd;
!     }

      /*
       * Explicitly not using palloc here - PAM will free this memory in
       * pam_end()
       */
!     *resp = calloc(num_msg, sizeof(struct pam_response));
!     if (!*resp)
      {
          ereport(LOG,
                  (errcode(ERRCODE_OUT_OF_MEMORY),
--- 1879,1909 ----
  pam_passwd_conv_proc(int num_msg, const struct pam_message ** msg,
                       struct pam_response ** resp, void *appdata_ptr)
  {
!     char       *passwd;
!     struct pam_response *reply;
!     int            i;

!     if (appdata_ptr)
!         passwd = (char *) appdata_ptr;
!     else
      {
          /*
           * Workaround for Solaris 2.6 where the PAM library is broken and does
           * not pass appdata_ptr to the conversation routine
           */
!         passwd = pam_passwd;
      }

!     *resp = NULL;                /* in case of error exit */

!     if (num_msg <= 0 || num_msg > PAM_MAX_NUM_MSG)
!         return PAM_CONV_ERR;

      /*
       * Explicitly not using palloc here - PAM will free this memory in
       * pam_end()
       */
!     if ((reply = calloc(num_msg, sizeof(struct pam_response))) == NULL)
      {
          ereport(LOG,
                  (errcode(ERRCODE_OUT_OF_MEMORY),
***************
*** 1942,1951 ****
          return PAM_CONV_ERR;
      }

!     (*resp)[0].resp = strdup((char *) appdata_ptr);
!     (*resp)[0].resp_retcode = 0;

!     return ((*resp)[0].resp ? PAM_SUCCESS : PAM_CONV_ERR);
  }


--- 1911,1981 ----
          return PAM_CONV_ERR;
      }

!     for (i = 0; i < num_msg; i++)
!     {
!         switch (msg[i]->msg_style)
!         {
!             case PAM_PROMPT_ECHO_OFF:
!                 if (strlen(passwd) == 0)
!                 {
!                     /*
!                      * Password wasn't passed to PAM the first time around -
!                      * let's go ask the client to send a password, which we
!                      * then stuff into PAM.
!                      */
!                     sendAuthRequest(pam_port_cludge, AUTH_REQ_PASSWORD);
!                     passwd = recv_password_packet(pam_port_cludge);
!                     if (passwd == NULL)
!                     {
!                         /*
!                          * Client didn't want to send password.  We
!                          * intentionally do not log anything about this.
!                          */
!                         goto fail;
!                     }
!                     if (strlen(passwd) == 0)
!                     {
!                         ereport(LOG,
!                                 (errmsg("empty password returned by client")));
!                         goto fail;
!                     }
!                 }
!                 if ((reply[i].resp = strdup(passwd)) == NULL)
!                     goto fail;
!                 reply[i].resp_retcode = PAM_SUCCESS;
!                 break;
!             case PAM_ERROR_MSG:
!                 ereport(LOG,
!                         (errmsg("error from underlying PAM layer: %s",
!                                 msg[i]->msg)));
!                 /* FALL THROUGH */
!             case PAM_TEXT_INFO:
!                 /* we don't bother to log TEXT_INFO messages */
!                 if ((reply[i].resp = strdup("")) == NULL)
!                     goto fail;
!                 reply[i].resp_retcode = PAM_SUCCESS;
!                 break;
!             default:
!                 elog(LOG, "unsupported PAM conversation %d/\"%s\"",
!                      msg[i]->msg_style,
!                      msg[i]->msg ? msg[i]->msg : "(none)");
!                 goto fail;
!         }
!     }
!
!     *resp = reply;
!     return PAM_SUCCESS;
!
! fail:
!     /* free up whatever we allocated */
!     for (i = 0; i < num_msg; i++)
!     {
!         if (reply[i].resp != NULL)
!             free(reply[i].resp);
!     }
!     free(reply);

!     return PAM_CONV_ERR;
  }


***************
*** 1959,1968 ****
      pam_handle_t *pamh = NULL;

      /*
!      * Apparently, Solaris 2.6 is broken, and needs ugly static variable
!      * workaround
       */
      pam_passwd = password;

      /*
       * Set the application data portion of the conversation struct This is
--- 1989,2000 ----
      pam_handle_t *pamh = NULL;

      /*
!      * We can't entirely rely on PAM to pass through appdata --- it appears
!      * not to work on at least Solaris 2.6.  So use these ugly static
!      * variables instead.
       */
      pam_passwd = password;
+     pam_port_cludge = port;

      /*
       * Set the application data portion of the conversation struct This is

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
Tom,
   You were right. According to the trace msg[0] is null.



(gdb) set follow-fork-mode child
(gdb) c
Continuing.
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f5a6c2b77b0 (LWP 23208)]
0x0000000000580cf4 in pam_passwd_conv_proc (num_msg=3D0, msg=3D0x21015a0,
    resp=3D0x7fff5955a0b8, appdata_ptr=3D0x7f20c7) at auth.c:1868
1868    auth.c: No such file or directory.
        in auth.c
(gdb) backtrace
#0  0x0000000000580cf4 in pam_passwd_conv_proc (num_msg=3D0, msg=3D0x21015a=
0,
    resp=3D0x7fff5955a0b8, appdata_ptr=3D0x7f20c7) at auth.c:1868
#1  0x00007f59e36f8dd8 in _pam_krb5_conv_call (pamh=3D<value optimized out>,
    messages=3D0x2101490, n_prompts=3D0, responses=3D0x7fff5955a0b8) at con=
v.c:99
#2  0x00007f59e36f9b38 in _pam_krb5_generic_prompter (
    context=3D<value optimized out>, data=3D0x7fff5955ba30,
    name=3D<value optimized out>, banner=3D<value optimized out>, num_promp=
ts=3D1,
    prompts=3D<value optimized out>, suppress_password_prompts=3D1)
    at prompter.c:330
#3  0x00007f59e36f9e10 in _pam_krb5_normal_prompter (context=3D0x0,
    data=3D0x21015a0, name=3D0x7fff5955a0b8 "", banner=3D0x7f20c7 "",
    num_prompts=3D0, prompts=3D0x101010101010101) at prompter.c:409
#4  0x00000031d3660bce in krb5_get_as_key_password (context=3D0x20fe420,
    client=3D<value optimized out>, etype=3D23, prompter=3D<value optimized=
 out>,
    prompter_data=3D<value optimized out>, salt=3D0x7fff5955a950,
    params=3D0x7fff5955a940, as_key=3D0x7fff5955a910, gak_data=3D0x7fff5955=
ab70)
    at gic_pwd.c:61
#5  0x00000031d3667713 in pa_enc_timestamp (context=3D0x20fe420,
    request=3D<value optimized out>, in_padata=3D<value optimized out>,
    out_padata=3D0x7fff5955a780, salt=3D<value optimized out>,
    s2kparams=3D<value optimized out>, etype=3D0x7fff5955a99c,
    as_key=3D0x7fff5955a910,
    prompter=3D0x7f59e36f9e00 <_pam_krb5_normal_prompter>,
    prompter_data=3D0x7fff5955ba30,
---Type <return> to continue, or q <return> to quit---
    gak_fct=3D0x31d36609f0 <krb5_get_as_key_password>, gak_data=3D0x7fff595=
5ab70)
    at preauth2.c:635
#6  0x00000031d3667e0c in krb5_do_preauth (context=3D<value optimized out>,
    request=3D0x7fff5955a890, encoded_request_body=3D<value optimized out>,
    encoded_previous_request=3D<value optimized out>, in_padata=3D0x2100ac0,
    out_padata=3D<value optimized out>, salt=3D0x7fff5955a950,
    s2kparams=3D0x7fff5955a940, etype=3D0x7fff5955a99c, as_key=3D0x7fff5955=
a910,
    prompter=3D0x7f59e36f9e00 <_pam_krb5_normal_prompter>,
    prompter_data=3D0x7fff5955ba30,
    gak_fct=3D0x31d36609f0 <krb5_get_as_key_password>, gak_data=3D0x7fff595=
5ab70,
    get_data_rock=3D0x7fff5955a930, opte=3D0x20fe960) at preauth2.c:1586
#7  0x00000031d365f251 in krb5_get_init_creds (context=3D0x20fe420,
    creds=3D<value optimized out>, client=3D<value optimized out>,
    prompter=3D<value optimized out>, prompter_data=3D<value optimized out>,
    start_time=3D<value optimized out>,
    in_tkt_service=3D0x7fff5955baa0 "krbtgt/THEXCHANGE.COM@THEXCHANGE.COM",
    options=3D0x20fe960, gak_fct=3D0x31d36609f0 <krb5_get_as_key_password>,
    gak_data=3D0x7fff5955ab70, use_master=3D0x7fff5955abac,
    as_reply=3D0x7fff5955aba0) at get_in_tkt.c:1106
#8  0x00000031d3660f18 in krb5_get_init_creds_password (context=3D0x20fe420,
    creds=3D<value optimized out>, client=3D<value optimized out>,
    password=3D<value optimized out>,
    prompter=3D0x7f59e36f9e00 <_pam_krb5_normal_prompter>,
    data=3D<value optimized out>, start_time=3D0, ---Type <return> to conti=
nue, or q <return> to quit---
    in_tkt_service=3D0x7fff5955baa0 "krbtgt/THEXCHANGE.COM@THEXCHANGE.COM",
    options=3D0x20fe960) at gic_pwd.c:139
#9  0x00007f59e36ff571 in v5_get_creds (ctx=3D0x20fe420,
    pamh=3D<value optimized out>, creds=3D<value optimized out>,
    user=3D<value optimized out>, userinfo=3D0x20fecf0, options=3D0x20fe9c0,
    service=3D0x7f59e3703bf8 "krbtgt", password=3D0x0, gic_options=3D0x20fe=
960,
    prompter=3D0x7f59e36f9e00 <_pam_krb5_normal_prompter>, result=3D0x21002=
d4)
    at v5.c:1014
#10 0x00007f59e36f53cf in pam_sm_authenticate (pamh=3D0x210f5a0, flags=3D0,
    argc=3D<value optimized out>, argv=3D<value optimized out>) at auth.c:4=
23
#11 0x00000031d0202c1e in _pam_dispatch_aux (
    use_cached_chain=3D<value optimized out>, resumed=3D<value optimized ou=
t>,
    h=3D<value optimized out>, flags=3D<value optimized out>,
    pamh=3D<value optimized out>) at pam_dispatch.c:110
#12 _pam_dispatch (use_cached_chain=3D<value optimized out>,
    resumed=3D<value optimized out>, h=3D<value optimized out>,
    flags=3D<value optimized out>, pamh=3D<value optimized out>)
    at pam_dispatch.c:407
#13 0x00000031d0202500 in pam_authenticate (pamh=3D0x210f5a0, flags=3D0)
    at pam_auth.c:34
#14 0x00000000005810d1 in CheckPAMAuth (user=3D<value optimized out>,
    port=3D<value optimized out>, password=3D<value optimized out>) at auth=
.c:1999
#15 ClientAuthentication (user=3D<value optimized out>,
    port=3D<value optimized out>, password=3D<value optimized out>) at auth=
.c:430 ---Type <return> to continue, or q <return> to quit---
#16 0x00000000005e035c in BackendInitialize (port=3D0x20fd460)
    at postmaster.c:3324
#17 0x00000000005e0c3c in BackendStartup (port=3D<value optimized out>)
    at postmaster.c:3058
#18 ServerLoop (port=3D<value optimized out>) at postmaster.c:1387
#19 0x00000000005e354d in PostmasterMain (argc=3D1, argv=3D0x20b9010)
    at postmaster.c:1040
#20 0x0000000000588900 in main (argc=3D1, argv=3D0x20b9010) at main.c:188
(gdb) print num_msg
$1 =3D 0
(gdb) print msg[0]
$2 =3D (const struct pam_message *) 0x0
(gdb)

-----Original Message-----
From: Magnus Hagander [mailto:magnus@hagander.net]=20
Sent: Friday, October 16, 2009 2:05 PM
To: Tom Lane
Cc: Douglas, Ryan; pgsql-bugs
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5

2009/10/16 Tom Lane <tgl@sss.pgh.pa.us>:
> I wrote:
>> The best idea I can come up with is that the conv_proc is being called
>> with zero messages and is dumping core because it tries to print the
>> contents of msg[0]. =A0However, it's far from clear why libpam would
>> bother to call it with zero messages.
>
> Hah --- found it. =A0(Man, it is so nice working with open source that
> you can actually look at...) =A0prompter.c in pam_krb5 has
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* Skip any prompt for which the supplied =
default answer is the
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * previously-entered password -- it's jus=
t a waste of the
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * user's time. =A0*/
>
> So it definitely is possible to call our proc with zero messages, and
> whether this will happen or not is probably dependent on the behavior
> of the KDC, and even then, ereport might or might not dump core depending
> on the contents of the not-allocated msg[0] array member.
>
> I will go and rewrite this function to look more like openssh's,
> on the assumption that their version is probably pretty well battle
> tested.

Yeah, that sounds like a reasonable thing to do.

--=20
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #5121: Segmentation Fault when using pam w/ krb5

From
"Douglas, Ryan"
Date:
It works like champ... cool.. thanks.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20
Sent: Friday, October 16, 2009 4:15 PM
To: Douglas, Ryan
Cc: Magnus Hagander; pgsql-bugs
Subject: Re: [BUGS] BUG #5121: Segmentation Fault when using pam w/ krb5


"Douglas, Ryan" <RDouglas@arbinet.com> writes:
>    You were right. According to the trace msg[0] is null.

Hah.  This must be triggered by something Active Directory does that a
KDC doesn't, because I'm still not seeing it here.  But anyway the
problem is clear now, we have to avoid referencing msg[0] when num_msg
is zero.

Please try the attached patch and see if it behaves sanely for you.
This is based on openssh's PAM callback, so it ought to be more robust
than what we had.  (This is against 8.4 branch tip, but it should apply
to 8.4.1 with maybe a few lines' offset.)

            regards, tom lane