Thread: Slow Query Help

Slow Query Help

From
Will Platnick
Date:

We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries has raised significantly, especially during our peak time where lots of users are logging in.  According to New Relic, this query is now taking up the most amount of time during peak activity and my pg_stat_activity and slow log sampling agrees. We have 3 DB servers referenced here, production running 9.2.2, semi-idle (idle except for replication when I ran the test) running 9.2.2, and 9.1.3 completely idle with an old dump restored. 


HW: All servers have the same amount of RAM and CPU and all our indexes fit in RAM. HD's differ, but they're all SSD and since the explain analyze's are all indexes, I'm going to assume it doesn't mean that much in the context. 


Configuration across all three servers is the same.


Table was auto vacuumed today.


Idle DB running 9.1.3


http://explain.depesz.com/s/7DZ


Semi-Idle DB running 9.2.2 (transactions ran previously and replication happening, but no active traffic when query was ran)


http://explain.depesz.com/s/fEE


Production DB running 9.2.2 (Incredibly Active)


http://explain.depesz.com/s/qVW


                                       Table "public.users"

      Column       |           Type           |                     Modifiers                      

-------------------+--------------------------+----------------------------------------------------

 id                | integer                  | not null default nextval('users_id_seq'::regclass)

 created_dt        | timestamp with time zone | default now()

 last_login_dt     | timestamp with time zone | default now()

 email             | character varying        | not null

 verified          | boolean                  | default false

 first_name        | character varying        | 

 last_name         | character varying        | 

 username          | character varying(50)    | not null

 location          | character varying        | 

 im_type           | character varying        | 

 im_username       | character varying        | 

 website           | character varying        | 

 phone_mobile      | character varying(30)    | 

 postal_code       | character varying(10)    | 

 language_tag      | character varying(7)     | default 'en'::character varying

 settings          | text                     | default ''::text

 country           | character(2)             | default ''::bpchar

 timezone          | character varying(50)    | not null default 'UTC'::character varying

 verify_hash       | character varying        | 

 bio               | character varying(160)   | 

 twitter           | character varying        | 

 facebook          | character varying        | 

 search_updated_dt | timestamp with time zone | not null default now()

 source            | character varying        | 

 auto_verified     | boolean                  | 

 password          | character(60)            | 

 google            | character varying        | 

 gender            | smallint                 | not null default 0

 birthdate         | date                     | 

 weibo             | character varying        | 

Indexes:

    "users_pkey" PRIMARY KEY, btree (id)

    "u_auth_email_index" UNIQUE, btree (lower(email::text))

    "u_auth_uname_index" UNIQUE, btree (lower(username::text))

    "u_auth_verify_hash_idx" UNIQUE, btree (verify_hash)

    "users_search_updated_dt_idx" btree (search_updated_dt DESC)



-- 
Will Platnick
Sent with Sparrow

Re: Slow Query Help

From
Heikki Linnakangas
Date:
On 05.02.2013 05:45, Will Platnick wrote:
> We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries has raised significantly, especially
duringour peak time where lots of users are logging in.  According to New Relic, this query is now taking up the most
amountof time during peak activity and my pg_stat_activity and slow log sampling agrees. We have 3 DB servers
referencedhere, production running 9.2.2, semi-idle (idle except for replication when I ran the test) running 9.2.2,
and9.1.3 completely idle with an old dump restored. 

The only thing that stands out is that it always checks both indexes for
matches. Since you only want a single row as a result, it seems like it
would be better to first check one index, and only check the other one
if there's no match. Rewriting the query with UNION should do that:

SELECT id, username, password, email, verified, timezone FROM users
WHERE lower(username) = 'randomuser'
UNION ALL
SELECT id, username, password, email, verified, timezone FROM users
WHERE lower(email) = 'randomuser'
LIMIT 1;

Also, if you can assume that email addresses always contain the
@-character, you could take advantage of that and only do the
lower(email) = 'randomuser' search if there is one.

- Heikki


Re: Slow Query Help

From
Pavan Deolasee
Date:
On Tue, Feb 5, 2013 at 9:15 AM, Will Platnick <wplatnick@gmail.com> wrote:
> We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries
> has raised significantly, especially during our peak time where lots of
> users are logging in.  According to New Relic, this query is now taking up
> the most amount of time during peak activity and my pg_stat_activity and
> slow log sampling agrees. We have 3 DB servers referenced here, production
> running 9.2.2, semi-idle (idle except for replication when I ran the test)
> running 9.2.2, and 9.1.3 completely idle with an old dump restored.
>

The only thing that stands out is: on your production server I see
"Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows
actual time as 0.179 ms. Not sure where that additional time is being
spent though. It could be ExecutorStart/End, but have no idea why they
should take so long.

Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: Slow Query Help

From
Will Platnick
Date:
Good eye, I totally missed that!

Any ideas on how to troubleshoot this delay?


On Wednesday, February 6, 2013 at 3:51 AM, Pavan Deolasee wrote:

On Tue, Feb 5, 2013 at 9:15 AM, Will Platnick <wplatnick@gmail.com> wrote:
We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries
has raised significantly, especially during our peak time where lots of
users are logging in. According to New Relic, this query is now taking up
the most amount of time during peak activity and my pg_stat_activity and
slow log sampling agrees. We have 3 DB servers referenced here, production
running 9.2.2, semi-idle (idle except for replication when I ran the test)
running 9.2.2, and 9.1.3 completely idle with an old dump restored.

The only thing that stands out is: on your production server I see
"Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows
actual time as 0.179 ms. Not sure where that additional time is being
spent though. It could be ExecutorStart/End, but have no idea why they
should take so long.

Thanks,
Pavan
--
Pavan Deolasee

Re: Slow Query Help

From
Kevin Grittner
Date:
Will Platnick <wplatnick@gmail.com> wrote:
> Will Platnick <wplatnick@gmail.com> wrote:

>> The only thing that stands out is: on your production server I see
>> "Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows
>> actual time as 0.179 ms. Not sure where that additional time is being
>> spent though. It could be ExecutorStart/End, but have no idea why they
>> should take so long.

> Any ideas on how to troubleshoot this delay?

Is the client which is running the query on the same machine as the
server?  If not, what's the ping time between them?

-Kevin


Re: Slow Query Help

From
Will Platnick
Date:
Clients are technically our pgbouncer which is on the same machine. The explain analyze was local through psql direct to postgresql.

On Wednesday, February 6, 2013 at 11:22 AM, Kevin Grittner wrote:

Will Platnick <wplatnick@gmail.com> wrote:
Will Platnick <wplatnick@gmail.com> wrote:

The only thing that stands out is: on your production server I see
"Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows
actual time as 0.179 ms. Not sure where that additional time is being
spent though. It could be ExecutorStart/End, but have no idea why they
should take so long.

Any ideas on how to troubleshoot this delay?

Is the client which is running the query on the same machine as the
server?  If not, what's the ping time between them?

-Kevin

Re: Slow Query Help

From
Pavan Deolasee
Date:
On Wed, Feb 6, 2013 at 9:52 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Will Platnick <wplatnick@gmail.com> wrote:
>> Will Platnick <wplatnick@gmail.com> wrote:
>
>>> The only thing that stands out is: on your production server I see
>>> "Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows
>>> actual time as 0.179 ms. Not sure where that additional time is being
>>> spent though. It could be ExecutorStart/End, but have no idea why they
>>> should take so long.
>
>> Any ideas on how to troubleshoot this delay?
>
> Is the client which is running the query on the same machine as the
> server?  If not, what's the ping time between them?
>

I don't think the network latency can cause that. The "Total runtime"
is calculated on the server side itself - see ExplainOnePlan().

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee