Re: Google Summer of code 2013 - Mailing list pgsql-students

From Stephen Frost
Subject Re: Google Summer of code 2013
Msg-id 20130415162745.GT4361@tamriel.snowman.net
In response to Re: Google Summer of code 2013  ("Karel K. Rozhoň" <karel.rozhon@gmail.com>)
List pgsql-students
* Karel K. Rozhoň (karel.rozhon@gmail.com) wrote:
> Of course I don't see all aspects of this problem, so I cannot tell what
> should be good for future. But I have done some profiles of group by
> select and I believe, parallel calling of some hash procedures could help.

There seems to be some confusion here.  It's certainly true that *many*
(most?  all?) pieces of query processing would benefit from parallel
execution; there is no debate on that.

The issue is that PG is not currently set up to do *any* per-query
parallel processing and it is *not* a trivial thing to change that.  We
can talk all day about how wonderful it'd be to do parallel hashing,
parallel sorting, etc, but until PG has a way to parallelize query
processing, there's really no point to writing code to parallelize
individual nodes.

> Of course I know, this simple case is only theoretical and in real tables
> the data are much more complicated, but as I can see, almost 40% of CPU
> time was spent in only one hash function: hash_search_with_hash_value.

Improvements to that would be great, but you can't simply call
pthread_create() in a PG backend and expect things to work.
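
As a rough illustration of one reason why, here is a minimal sketch. It is
not PostgreSQL code: current_context, worker_main, and
thread_clobbers_global are hypothetical names, and the sketch assumes that
backend state lives in plain process-wide globals, which is safe only while
a single flow of control runs in the process. A thread started inside that
process shares every such global:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for backend-wide state: code written for a
 * one-process-per-connection model assumes that only one flow of
 * control ever touches a global like this. */
static const char *current_context = "none";

/* Hypothetical worker that, like much single-threaded code, switches
 * the global and never expects anyone else to notice. */
static void *worker_main(void *arg)
{
    (void) arg;
    current_context = "worker scratch context";
    return NULL;
}

/* Returns 1 if the worker thread overwrote the caller's global state. */
int thread_clobbers_global(void)
{
    pthread_t tid;
    int rc;

    current_context = "query context";

    rc = pthread_create(&tid, NULL, worker_main, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    /* The join orders the worker's write before this read, so the
     * caller no longer sees the value it set before starting the thread. */
    return strcmp(current_context, "query context") != 0;
}
```

The join keeps this toy deterministic. With threads running concurrently
and no locking, the same shared global would be an unsynchronized race
rather than a predictable overwrite.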

    Thanks,

        Stephen
