Re: about multiprocessingmassdata - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: about multiprocessingmassdata
Date
Msg-id CAHyXU0zKZGAcU3cpdAYGFi846OoZv6my7C7rhxOQaaSMD+8eyw@mail.gmail.com
Whole thread Raw
In response to Re: about multiprocessingmassdata  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-performance
On Mon, Apr 9, 2012 at 6:50 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 10.4.2012 00:37, Merlin Moncure wrote:
>> On Thu, Apr 5, 2012 at 9:47 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>> If you have slower drives, the dependency is about linear (half the
>>> speed -> twice the time). So either your drives are very slow, or
>>> there's something rotten.
>>>
>>> I still haven's seen iostat / vmstat output ... that'd tell us much more
>>> about the causes.
>>
>> geometry column can potentially quite wide.  one thing we need to see
>> is the table has any indexes -- in particular gist/gin on the
>> geometry.
>
> Yeah, but in one of the previous posts the OP posted this:
>
> relname      | relpages  |  reltuples
> -------------+----------+-------------
> poi_all_new |  2421133 | 6.53328e+06
>
> which means the table has ~ 19GB for 6.5 million rows, so it's like
> 2.8GB per 1 million of rows, i.e. ~3kB per row. I've been working with 1
> million rows and 1.3GB of data, so it's like 50% of the expected amount.
>
> But this does not explain why the SELECT COUNT(*) takes 364 seconds on
> that machine. That'd mean ~8MB/s.
>
> Regarding the indexes, the the OP already posted a description of the
> table and apparently there are these indexes:
>
> Indexes:
>    "poi_all_new_pk" PRIMARY KEY, btree (ogc_fid)
>    "poi_all_new_flname_idx" btree (flname)
>    "poi_all_new_geom_idx" btree (wkb_geometry)
>    "poi_all_new_ogc_fid_idx" btree (ogc_fid)
>    "poi_all_new_pinyin_idx" btree (pinyin)
>
> So none of them is GIN/GIST although some one of them is on the geometry
> column.

hm. well, there's a duplicate index in there: ogc_fid is indexed
twice.  how much bloat is on the table (let's see an ANALYZE VERBOSE)?
 what's the storage for this database?

merlin

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster
Next
From: Steve Crawford
Date:
Subject: Re: Linux machine aggressively clearing cache