Re: Slow concurrent processing - Mailing list pgsql-performance

From Steve Crawford
Subject Re: Slow concurrent processing
Date
Msg-id 513F4FD1.1080303@pinpointresearch.com
In response to Re: Slow concurrent processing  (Misa Simic <misa.simic@gmail.com>)
Responses Re: Slow concurrent processing  (Misa Simic <misa.simic@gmail.com>)
List pgsql-performance
On 03/12/2013 08:06 AM, Misa Simic wrote:
> Thanks Steve
>
> Well, the full story is too complex - but the point was - whatever the
> black box does - it takes 0.5 to 2 secs per processed record (maybe I
> was wrong, but I thought the reason why it takes that long to actually
> do the task - CPU/IO/memory, whatever - is not that
> important....) - so I really don't see the difference between: call web
> service, insert row in the table (takes 3 secs) and sleep 3 seconds -
> insert result in the table...
>
> if we do the above task for two things sequentially - it will take 6
> secs...but if we do it "concurrently" - it should take 3 secs... (in
> theory :) )

Not at all - even in "theory." Sleep involves little, if any, contention
for resources. Real processing does. So if a process requires 100% of
available CPU then one process gets it all while many running
simultaneously will have to share the available CPU resource and thus
each will take longer to complete. Or, if you prefer, think of a file
download. If it takes an hour to download a 1GB file it doesn't mean
that you can download two 1GB files concurrently in one hour even if
"simulating" the process by a sleep(3600) suggests it is possible.
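
To put rough numbers on that, here is a minimal sketch of an idealized
fair-sharing model. It is only an illustration: the function names and
the "capacity" parameter are made up for this example, and real
contention (locks, I/O queues, cache effects) is messier than a simple
division.

```python
def elapsed_sleep(n_tasks: int, seconds: float) -> float:
    """Wall-clock time for n tasks that only sleep.

    Sleeping tasks do not compete for anything, so they all finish
    together after `seconds`, however many there are.
    """
    return seconds if n_tasks > 0 else 0.0


def elapsed_shared(n_tasks: int, seconds_alone: float, capacity: int = 1) -> float:
    """Wall-clock time for n tasks that share a limited resource.

    Assumption (hypothetical model): each task needs `seconds_alone`
    of one full resource unit (a CPU core, a network link, ...) and
    `capacity` such units are shared fairly among all running tasks.
    """
    if n_tasks <= 0:
        return 0.0
    # A task never runs faster than it would with a whole unit to itself,
    # hence the lower bound of 1.0 on the slowdown factor.
    return seconds_alone * max(1.0, n_tasks / capacity)


if __name__ == "__main__":
    print(elapsed_sleep(2, 3.0))               # two sleeps overlap completely
    print(elapsed_shared(2, 3.0))              # two tasks on one saturated unit
    print(elapsed_shared(2, 3.0, capacity=4))  # spare capacity: no slowdown
```

Under this model two 3-second sleeps finish in 3 seconds, but two
3-second jobs that each saturate the same single resource take 6, which
is the same as running them one after the other.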

I should note, however, that depending on the resource that is limiting
your speed there is often room for optimization through simultaneous
processing - especially when processes are CPU bound. Since each
PostgreSQL back-end is a single process that runs on only one CPU *core*
at a time, you can have a situation where one core is spinning and the
others are more-or-less idle. In those cases you may see an improvement
by increasing the number of simultaneous processes to somewhere shy of
the number of cores.
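
As a rough sketch of "somewhere shy of the number of cores": the helper
below is hypothetical, and the default of leaving one core free is my
assumption, not a PostgreSQL recommendation. The right number depends on
what else runs on the box and has to be found by measuring.

```python
import os
from typing import Optional


def suggested_workers(cores: Optional[int] = None, reserve: int = 1) -> int:
    """Pick a worker count a little below the number of CPU cores.

    `reserve` (assumed default: 1) is how many cores to leave for the
    OS and other processes. At least one worker is always returned.
    """
    if cores is None:
        # os.cpu_count() may return None when the count cannot be determined.
        cores = os.cpu_count() or 1
    return max(1, cores - reserve)


if __name__ == "__main__":
    print("cores reported:", os.cpu_count())
    print("workers to try first:", suggested_workers())
```

Treat the result as a starting point for testing, then adjust up or down
while watching whether total throughput actually improves.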

>
> I guessed there was a lock somewhere - but it wasn't clear where/why when
> there are no updates - just inserts...
>
> But I didn't know that an INSERT also takes a row lock on the referenced
> tables - from the FK columns...
>
> So I guess that is now the cause of the problem...
>
> We will see how it goes with inserts into unlogged tables with no FK...
>

It will almost certainly go faster as you have eliminated the integrity
checks (no FK) and the data-safety guarantees (unlogged tables skip the
write-ahead log). This may be acceptable to you (non-real-time crunching
of data that can be reloaded from external sources or temporary
processing that is ultimately written back to durable storage) but it
doesn't mean you have identified the actual cause.

One thing you didn't state: is all this processing taking place in
PostgreSQL (i.e. update foo set bar = do_the_math(baz, zap, boom),
where do_the_math is a PL/pgSQL, PL/Python, ... function) or are
external processes involved?

Cheers,
Steve


