Re: Strange slowness - Mailing list psycopg
From | Russ Neufeld |
---|---|
Subject | Re: Strange slowness |
Date | |
Msg-id | D6C8F24A-5991-4184-8BD5-8A75DECA9FCE@nasuni.com Whole thread Raw |
In response to | Re: Strange slowness (Daniele Varrazzo <daniele.varrazzo@gmail.com>) |
List | psycopg |
Thanks Daniele. We're not using ssl, but I think you might be right to look at pgbouncer. It does not seem to be maintainingconnections to postgresql for very long, and as a result when a load spike comes in I suspect there's a fair bitof time and contention in reestablishing them. I'm looking there now. On Apr 30, 2012, at 5:46 PM, Daniele Varrazzo wrote: > On Mon, Apr 30, 2012 at 10:29 PM, Russ Neufeld <russn@nasuni.com> wrote: >> Hi all, >> >> I'm investigating some occasional slowness in our API servers, and I've hit >> upon some strange behavior that I can't explain. I'm hoping someone on this >> list can help. >> >> Stack is django 1.3.1, Ubuntu 10.04, python 2.6.5, psycopg 2.4.5, postgresql >> 9.0, pgbouncer 1.4.1. All of these run on the same machine with the >> exception of postgresql, which runs on a dedicated db machine. So, when >> django creates a connection to the database, it is connecting to localhost >> and really talking to pgbouncer. >> >> Here's the relevant code >> from django/db/backends/postgresql_psycopg2/base.py: >> >> self.connection = Database.connect(**conn_params) >> self.connection.set_client_encoding('UTF8') >> self.connection.set_isolation_level(self.isolation_level) >> >> cursor = self.connection.cursor() >> >> cursor.execute("SET TIME ZONE %s", [settings_dict['TIME_ZONE']]) >> >> I put timing code between these lines, and found that calling >> set_client_encoding() was sometimes taking upwards of 6 seconds. Very >> strange, but it led me to the fact that our db encoding was mistakenly set >> to sql_ascii, so I've fixed that. Now, I'm seeing occasional 6 seconds >> delays on the next line - set_isolation_level. We're using defaults, so >> this is setting the level to 1, or "read committed". Theoretically that's a >> noop, but obviously the code needs to check that the value passed in equals >> the value already set, and then return. Still, it shouldn't take upwards of >> 6 seconds to do this at times. The machines are not swapping, and not doing >> anything else besides serving some static content. >> >> It seems very strange that after I fixed the db encoding, the timing problem >> would jump to the next line. Before I fixed the db encoding, we never saw >> long times spent in set_isolation_level. This leads me to believe the issue >> is something else, perhaps network related. For example, >> supposed Database.connect lazily creates a connection, and now >> that set_client_encoding can short circuit due to the encodings matching, >> the first time the connection is used is now in set_isolation_level. This >> is just wild speculation, but I'm at a loss in explaining what I'm seeing. >> >> Anyone have any ideas about what could be going on, or where to look >> further? Thanks, > > From the description I'd look into how pgbouncer connects to the db: > if it was a normal PG connection, the slowest of the commands would be > connect() (and looking at Django's code, Database is just an alias for > psycopg2, so it's really just it). For a test I would connect directly > to the db instead of pgbouncer and see if connect() becomes the slow > part. You can also enable logging in the server to get the duration of > each statement, which should show you that set time zone or whatever > is the first command is not the slow one (you could combine it with a > couple of prints, which I bet will show you that the delay happens > between the execute() statement and the moment the server receives the > command). > > Could it be that pgbouncer is slow to connect because of some ssl > issue? If you rule it out of your setup and see the slowness at > connect time, then it would be an area to look into. > > -- Daniele