Re: stored proc and inserting hundreds of thousands of rows - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: stored proc and inserting hundreds of thousands of rows
Date
Msg-id 4DBC0D7C020000250003D09E@gw.wicourts.gov
Whole thread Raw
In response to Re: stored proc and inserting hundreds of thousands of rows  (Joel Reymont <joelr1@gmail.com>)
Responses Re: stored proc and inserting hundreds of thousands of rows  (Joel Reymont <joelr1@gmail.com>)
List pgsql-performance
[rearranging to correct for top-posting]

Joel Reymont <joelr1@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Joel Reymont <joelr1@gmail.com> wrote:
>>
>>> We have 2 million documents now and linking an ad to all of them
>>> takes 5 minutes on my top-of-the-line SSD MacBook Pro.
>>
>> How long does it take to run just the SELECT part of the INSERT
>> by itself?

> Are you suggesting eliminating the physical linking and
> calculating matching documents on the fly?

I'm not suggesting anything other than it being a good idea to
determine where the time is being spent before trying to make it
faster.  You showed this as the apparent source of the five minute
delay:

  INSERT INTO doc_ads (doc_id, ad_id, distance)
  SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
       FROM docs
       WHERE id = doc) AS x;

What we don't know is how much of that time is due to writing to the
doc_ads table, and how much is due to reading the other tables.  We
can find that out by running this:

  SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
       FROM docs
       WHERE id = doc) AS x;

If this is where most of the time is, the next thing is to run it
with EXPLAIN ANALYZE, and post the output.  It's a whole different
set of things to try to tune if that part is fast and the INSERT
itself is slow.

Of course, be aware of caching effects when you time this.

-Kevin

pgsql-performance by date:

Previous
From: Joel Reymont
Date:
Subject: Re: stored proc and inserting hundreds of thousands of rows
Next
From: "Kevin Grittner"
Date:
Subject: Re: stored proc and inserting hundreds of thousands of rows