Thread: 500 "too many non super user connections". Guilty parameter: "statement_timeout = 0"

Hello:

Warning: this both a ranting and an advice to novices
about stupid configuration parameter
"statement_timeout", which is by default set to "0",
rendering thus statements inmortal, and increasing up
the number of connections occupied by days-old unused
statements. If you think I am wrong, please correct me
after reading on.

Problems aroused after switching from postgres 6.2,
where everything worked fine, to postgres 7.3 (details
later). So I had no old postgresql.conf to relay on.

I lost the exact wording of the 500 error, but was
like this:

"too many non super user connections"

Now, this was an error rendered by our JSP server
(caucho.resin-3.0.3, which makes roughly same things
as Tomcat), but the source of the error is obviously
the database backend (postgresql-server-7.3.2-3 RedHat
RPM), as the JSP configuration has no parameter named
like that, and postgres does have one like that.

OS is RedHat 9.0 totally up to date running 24 hours
in a Pentium 2.4 GHz, 512 Mg RAM, 80 GB hard disk, so
it is not a perfomance problem. Server is standalone
JSP+Postgres.

JSP pages use these instructions to open and close the
database.

DRIVER = "org.postgresql.Driver";

STRING = "jdbc:postgresql://127.0.0.1:5432/aip";

// get driver and connection
Driver DriverPest =
(Driver)Class.forName(DRIVER).newInstance();

Connection ConnPest =
DriverManager.getConnection(STRING,USERNAME,PASSWORD);

//
// some statements, prepared statements, etc.
//

// politely close ResultSets and connections (NOT IN
//   ALL FILES, SOME FILES "FORGET" TO CLOSE THE
//   CONNECTION, surely the  problem starts there)
rstPest.close();

ConnPest.close();


The 500 error appeared after five/six days in a row of
flawless working, so I interrogated the machine before
restarting the server, to guess the source problem.

Command "netstat -antp" gives me +150 lines the likes
of this:

tcp        0      0 127.0.0.1:5432
127.0.0.1:43741         ESTABLISHED 10033/postgres:
aip

tcp        0      0 127.0.0.1:43741
127.0.0.1:5432          ESTABLISHED 19181/java

...and another +150 lines saying CLOSE_WAIT instead of
ESTABLISHED, all in paired postgres/java ports. Really
slick.

This looks like the "java" process (the JSP server)
connecting to postgres, and the postgres connecting
back to the "java" process.

"aip" is a database name. There are only two databases
in the server, "aip" and "cel". Some connections have
"aip" name, others hace "cel" name, so I know it is
not database especific.

"ps aux" is very funny (look at the end of the
message). Postgres processes are inmortal, and more
processes keep happily popping up day after day until
they stop, in sunday. The 500 error was first noticed
in monday morning, and JSP server process was not
restarted until tuesday morning, after half an hour
collecting enough evidence to make a forensics
analysis. Once the JSP server was stopped, all extra
postgres processes and connections desappeared after a
few seconds. JSP server was restarted and worked
apparently normal.

Then I looked over the postgresql.conf file and found
the guilty line:

statement_timeout = 0        # 0 is disabled, in
milliseconds

Statements never time out! I changed it to:

statement_timeout = 600000        # 0 is disabled, in
milliseconds

Perhaps a bit too long timeout (10 minutes), but want
be on the sure side, and server is fast and new, so
what the heck. The JSP server gets only 1500-2000 hits
a day. That's 75 to  100 visits. With
max_connections=128, it should work fine.

Please understand: if all JSP pages closed connections
correctly, then *perhaps* this could not happen, but
these are legacy pages from long ago, and anyways I
can't make myself responsible for every clumsy
programmer mistake. The server has to work perfectly
*always*.

I must say that Apache+PHP+mysql (all latest RPMs from
RedHat) was running happily all that time in that
server with no problem at all. Oh, well.

Thanks for listening to my rant. Corrections welcome.


*************************
POSTGRESQL.CONF
**************************

#
#    Connection Parameters
#
tcpip_socket = true
ssl = false

max_connections = 128
superuser_reserved_connections = 2

port = 5432
#hostname_lookup = false
#show_source_port = false

#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777    # octal

#virtual_host = ''

#krb_server_keyfile = ''


#
#    Shared Memory Size
#
shared_buffers = 256        # min max_connections*2 or 16,
8KB each
max_fsm_relations = 2000    # min 10, fsm is free space
map, ~40 bytes
max_fsm_pages = 20000        # min 1000, fsm is free space
map, ~6 bytes
max_locks_per_transaction = 64    # min 10
wal_buffers = 32        # min 4, typically 8KB each

#
#    Non-shared Memory Sizes
#
sort_mem = 1024        # min 64, size in KB
vacuum_mem = 8192        # min 1024, size in KB


#
#    Write-ahead log (WAL)
#
#checkpoint_segments = 3    # in logfile segments, min 1,
16MB each
#checkpoint_timeout = 300    # range 30-3600, in seconds
#
#commit_delay = 0        # range 0-100000, in microseconds
#commit_siblings = 5        # range 1-1000
#
#fsync = true
#wal_sync_method = fsync    # the default varies across
platforms:
#                # fsync, fdatasync, open_sync, or open_datasync
#wal_debug = 0            # range 0-16


#
#    Optimizer Parameters
#
#enable_seqscan = true
#enable_indexscan = true
#enable_tidscan = true
#enable_sort = true
#enable_nestloop = true
#enable_mergejoin = true
#enable_hashjoin = true


effective_cache_size = 2000    # typically 8KB each
#random_page_cost = 4        # units are one sequential page
fetch cost
#cpu_tuple_cost = 0.01        # (same)
#cpu_index_tuple_cost = 0.001    # (same)
#cpu_operator_cost = 0.0025    # (same)

#default_statistics_target = 10    # range 1-1000

#
#    GEQO Optimizer Parameters
#
#geqo = true
#geqo_selection_bias = 2.0    # range 1.5-2.0
#geqo_threshold = 11
#geqo_pool_size = 0        # default based on tables in
statement,
                # range 128-1024
#geqo_effort = 1
#geqo_generations = 0
#geqo_random_seed = -1        # auto-compute seed


#
#    Message display
#
#server_min_messages = notice    # Values, in order of
decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   info, notice, warning, error, log, fatal,
                #   panic
#client_min_messages = notice    # Values, in order of
decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   log, info, notice, warning, error
#silent_mode = false

log_connections = true
#log_pid = false
#log_statement = false
log_duration = true
log_timestamp = true


log_min_error_statement = error # Values in order of
increasing severity:
                 #   debug5, debug4, debug3, debug2, debug1,
                 #   info, notice, warning, error, panic(off)

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false

#explain_pretty_print = true

# requires USE_ASSERT_CHECKING
#debug_assertions = true


#
#    Syslog
#
syslog = 0            # range 0-2
#syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'


#
#    Statistics
#
#show_parser_stats = false
#show_planner_stats = false
#show_executor_stats = false
#show_statement_stats = false

# requires BTREE_BUILD_STATS
#show_btree_build_stats = false


#
#    Access statistics collection
#
#stats_start_collector = true
#stats_reset_on_server_start = true
#stats_command_string = false
#stats_row_level = false
#stats_block_level = false


#
#    Lock Tracing
#
#trace_notify = false

# requires LOCK_DEBUG
#trace_locks = false
#trace_userlocks = false
#trace_lwlocks = false
#debug_deadlocks = false
#trace_lock_oidmin = 16384
#trace_lock_table = 0


#
#    Misc
#
#autocommit = true
#dynamic_library_path = '$libdir'
search_path = '$user,public'
#datestyle = 'iso, us'
#timezone = unknown        # actually, defaults to TZ
environment setting
#australian_timezones = false
#client_encoding = sql_ascii    # actually, defaults to
database encoding
#authentication_timeout = 60    # 1-600, in seconds
#deadlock_timeout = 1000    # in milliseconds
#default_transaction_isolation = 'read committed'
#max_expr_depth = 10000        # min 10
#max_files_per_process = 1000    # min 25
password_encryption = true
#sql_inheritance = true
#transform_null_equals = false
statement_timeout = 600000        # 0 is disabled, in
milliseconds
#db_user_namespace = false



#
#    Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'ca_ES.UTF-8'
LC_MONETARY = 'ca_ES.UTF-8'
LC_NUMERIC = 'ca_ES.UTF-8'
LC_TIME = 'ca_ES.UTF-8'



******************************
NETSTAT -ATNP  OUTPUT
*******************************

postgres   680  0.0  0.0 13940    0 ?        SW
Nov06   0:00 postgres: aipo ai
postgres  4641  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  4642  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6410  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6411  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6412  0.0  0.0 13940    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6414  0.0  0.0 13940    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6416  0.0  0.0 13940    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6417  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  6418  0.0  0.0 13924    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  7720  0.0  0.0 13940    0 ?        SW
Nov07   0:00 postgres: aipo ai
postgres  8275  0.0  0.1 13908  604 ?        S
Nov07   0:00 postgres: cel cel
postgres  8278  0.0  0.1 13908  592 ?        S
Nov07   0:00 postgres: cel cel
postgres  8281  0.0  0.1 13908  548 ?        S
Nov07   0:00 postgres: cel cel
postgres  8526  0.0  0.1 13940  668 ?        S
Nov07   0:00 postgres: aipo ai
postgres  8585  0.0  0.1 13924  576 ?        S
Nov07   0:00 postgres: aipo ai
postgres  8586  0.0  0.1 13924  672 ?        S
Nov07   0:00 postgres: aipo ai
postgres  8587  0.0  0.0 13940  300 ?        S
Nov07   0:00 postgres: aipo ai
postgres  9156  0.0  0.1 13924  532 ?        S
Nov08   0:00 postgres: aipo ai
postgres  9157  0.0  0.1 13924  672 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10028  0.0  0.1 13932  608 ?        S
Nov08   0:00 postgres: cel cel
postgres 10033  0.0  0.1 13924  556 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10034  0.0  0.1 13924  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10035  0.0  0.1 13940  676 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10159  0.0  0.1 13932  652 ?        S
Nov08   0:00 postgres: cel cel
postgres 10233  0.0  0.0 13924  496 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10234  0.0  0.1 13924  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 10235  0.0  0.1 13940  676 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11014  0.0  0.1 13924  608 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11015  0.0  0.1 13924  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11016  0.0  0.1 13940  676 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11163  0.0  0.1 13932  584 ?        S
Nov08   0:00 postgres: cel cel
postgres 11479  0.0  0.1 13940  676 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11643  0.0  0.1 13924  636 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11644  0.0  0.1 13924  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 11645  0.0  0.1 13940  676 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12114  0.0  0.0 13940  500 ?        S
Nov08   0:00 postgres: cel cel
postgres 12124  0.0  0.1 13940  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12138  0.0  0.1 13924  672 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12139  0.0  0.1 13924  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12140  0.0  0.1 13940  680 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12277  0.0  0.1 13932  660 ?        S
Nov08   0:00 postgres: cel cel
postgres 12282  0.0  0.1 13940  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12312  0.0  0.1 13940  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12314  0.0  0.1 13924  624 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12315  0.0  0.1 13924  688 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12316  0.0  0.1 13940  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12318  0.0  0.1 13932  520 ?        S
Nov08   0:00 postgres: cel cel
postgres 12471  0.0  0.1 13924  556 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12472  0.0  0.1 13924  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12473  0.0  0.1 13940  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12517  0.0  0.1 13924  604 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12518  0.0  0.1 13924  696 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12519  0.0  0.1 13940  688 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12625  0.0  0.1 13940  632 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12642  0.0  0.1 13940  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12657  0.0  0.1 13940  576 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12816  0.0  0.1 13932  676 ?        S
Nov08   0:00 postgres: cel cel
postgres 12824  0.0  0.0 13932  284 ?        S
Nov08   0:00 postgres: cel cel
postgres 12828  0.0  0.0 13924  432 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12829  0.0  0.1 13924  668 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12830  0.0  0.1 13940  696 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12832  0.0  0.1 13940  696 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12839  0.0  0.1 13944  540 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12842  0.0  0.1 13940  696 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12984  0.0  0.1 13924  572 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12985  0.0  0.1 13924  700 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12986  0.0  0.1 13940  660 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12993  0.0  0.1 13924  696 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12994  0.0  0.1 13924  700 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12995  0.0  0.1 13944  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12996  0.0  0.1 13928  584 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12997  0.0  0.1 13928  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12998  0.0  0.1 13944  688 ?        S
Nov08   0:00 postgres: aipo ai
postgres 12999  0.0  0.1 13928  552 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13000  0.0  0.1 13928  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13001  0.0  0.1 13944  688 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13007  0.0  0.1 13928  564 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13008  0.0  0.1 13928  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13009  0.0  0.1 13944  688 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13017  0.0  0.1 13928  684 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13018  0.0  0.1 13928  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13019  0.0  0.1 13944  692 ?        S
Nov08   0:00 postgres: aipo ai
postgres 13319  0.0  0.1 13944  692 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13321  0.0  0.1 13936  560 ?        S
Nov09   0:00 postgres: cel cel
postgres 13340  0.0  0.1 13944  696 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13496  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13497  0.0  0.1 13928  700 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13498  0.0  0.1 13944  696 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13500  0.0  0.1 13928  532 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13501  0.0  0.1 13928  696 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13502  0.0  0.1 13944  692 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13503  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13504  0.0  0.1 13928  696 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13505  0.0  0.1 13944  608 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13509  0.0  0.0 13928  460 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13510  0.0  0.1 13928  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 13511  0.0  0.1 13944  680 ?        S
Nov09   0:00 postgres: aipo ai
postgres 14274  0.0  0.1 13936  568 ?        S
Nov09   0:00 postgres: cel cel
postgres 14284  0.0  0.1 13928  600 ?        S
Nov09   0:00 postgres: aipo ai
postgres 14285  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 14286  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21700  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21990  0.0  0.0 13928  508 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21991  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21992  0.0  0.1 13944  632 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21993  0.0  0.0 13928  388 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21994  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 21995  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22003  0.0  0.1 13912  648 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22004  0.0  0.1 13912  628 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22010  0.0  0.1 13928  564 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22011  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22012  0.0  0.1 13944  680 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22016  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22019  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22022  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22025  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22028  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22030  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22033  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22034  0.0  0.0 13928  416 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22035  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22036  0.0  0.1 13944  684 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22038  0.0  0.1 13928  644 ?        S
Nov09   0:00 postgres: aipo ai
postgres 22039  0.0  0.1 13928  688 ?        S
Nov09   0:00 postgres: aipo ai

=====
Enric Naval
Estudiante de Inform�tica de Gesti�n en la Udl (Lleida)

__________________________________
Do you Yahoo!?
Protect your identity with Yahoo! Mail AddressGuard
http://antispam.yahoo.com/whatsnewfree

Enric,

I've noticed a few things that may be major problems, IMHO:

1 - If you're using JSP/Tomcat with postgresql, than you, at
least, should consider using a connection pool. You simply
can't predict the number of pageviews being requested by
web users. Period. If every page spawns a new database connection,
soon or later you're doomed.

2 - You made clear that you're dealing with legacy code, but,
in using java, the programmer should code database access like
this:
try {

  get_db_connection();
  do_some_queries();

} catch {
  deal_exceptions();
} finally {
  close_db_connection();
}

I'm not an expert in computer programming. Maybe you can think
better ways to do the same thing, but, at the end of the process,
the database connection should be closed. Otherwise you'll get
leaked connections.

3 - We have a similar setup here. But our database was being
accessed from different perspectives:
 - a JSP page for report generation
 - normal, application specific data handling, storage and retrival
 - human beings checking data and managing the database

One thing in common: everyone was connecting from the same machine
running the database (it is also a web/jsp server), with the same
user. We noticed some connection leaks and, despite I could link
the leakage with apache/jsp/tomcat (more on that later), we agreed that in
other
circumstances we simply could not be able to
determine where it was coming from. Quick solution: one database,
multiple users, so we can track what process have spawned the
db connection more easily.

4 - connection leaks from JSP

Application servers normally are set up with very specific memory
and garbage collection tuning.
In java, garbage collection has a bit of undesirability, since it
freezes the entire VM during its own execution.
Maybe your JSP pages are leaking connections because the GC isn't
running too often as you desire. Since your server isn't a very
busy one, increasing the number of garbage collection runnings should
close the undesired connections without touching legacy code.

5 - server reliability

You stated that servers should work despite client behaviour or
configuration. I respectfully disagree with that. Well, I even
don't think your issue with the number of database connections
is a problem after all. Postgres was able to determine that more
people were trying to connect to it than what was speficied for
it to handle. During this entire process, all your data were
safe, all other active connections were being serviced.
So, what's the issue here? If the number of connections were set
to "infinite", you really could get into a Denial Of Service issue.
That's why hard limits are useful. They are not mistakes or errors,
but part of the process.

That's all I had to say... Sorry for not including your original
message, but I think this email is already too long to read without
configuration files and commands output.


Marcus.