Thread: PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011/9/17 Tomas Vondra <tv@fuzzy.cz> wrote:
(...)
> We've been asked by a local university for PostgreSQL-related topics of
> theses and seminar papers

I'm also interested in such proposals or ideas!

Here's a list of some topics:
* Adding WAL-support to hash indexes in PostgreSQL (see the previous
subject of this thread)
* Time in PostgreSQL
* Storing (Weather) Sensor Data in PostgreSQL
* Fast Bulk Data Insertion in PostgreSQL with Unlogged Tables (incl.
adding GiST support)
* Performance Tuning of a Read-Only PostgreSQL Database
* Materialized Views in PostgreSQL: Experiments around Jonathan
Gardner's Proposal
* more... ?

Yours, Stefan

17.09.11 23:01, Stefan Keller wrote:
> * more... ?
What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms
this would be a part of shared buffers dedicated to a relation or a set
of relations. When you have a big DB (not fitting in memory) you also
usually want some small tables/indexes to be in memory, no matter what
other load the DB has.
Complementary features are:
1) Relation preloading at startup - ensure these relations are in memory.
2) Per buffer pool (or relation) page costs - tell PostgreSQL that these
indexes/tables ARE in memory

Best regards, Vitalii Tymchyshyn.

Stefan Keller, 17.09.2011 22:01:
> I'm also interested in such proposals or ideas!
>
> Here's some list of topics:
> * Time in PostgreSQL
> * Fast Bulk Data Inserting in PostgreSQL with Unlogged tables

I don't understand these two items. Postgres does have a time data type, and it has had unlogged tables since 9.1.

Regards
Thomas

2011/9/19 Vitalii Tymchyshyn <tivv00@gmail.com>:
> 17.09.11 23:01, Stefan Keller wrote:
>>
>> * more... ?
>
> What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this
> would be a part of shared buffers dedicated to a relation or a set of
> relations. When you have a big DB (not fitting in memory) you also usually
> want some small tables/indexes to be in memory, no matter what other load
> the DB has.
> Complementary features are:
> 1) Relation preloading at startup - ensure these relations are in memory.

You can use the pgfincore extension to achieve that for the OS cache.
Doing the same for PostgreSQL's shared_buffers does not look worthwhile
(the subject has been discussed and can be discussed again, but please
check the mailing list archives first).
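
As a hedged sketch of the preloading idea: pgfincore provides
pgfadvise_willneed(), which asks the OS to read a relation's files into
its page cache. The Python snippet below only generates the calls; the
relation names and the helper preload_sql are hypothetical, and the
statements would still have to be run after each startup by whatever
means is convenient.

```python
def preload_sql(relations):
    """Return one pgfadvise_willneed() call per relation name."""
    statements = []
    for rel in relations:
        # Escape single quotes so the name is a valid SQL string literal.
        literal = "'" + rel.replace("'", "''") + "'"
        statements.append(f"SELECT * FROM pgfadvise_willneed({literal});")
    return statements


# Hypothetical small table and its index that should stay in the OS cache.
startup_statements = preload_sql(["small_lookup", "small_lookup_pkey"])
for statement in startup_statements:
    print(statement)
```

This affects only the OS cache, as noted above; it does not pin anything
in shared_buffers.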

> 2) Per buffer pool (or relation) page costs - tell PostgreSQL that these
> indexes/tables ARE in memory

You can use the tablespace parameters (*_cost) for that; the same
setting per table has been rejected in the past.
I did propose something to start working in this direction.
See "[WIP] cache estimates, cache access cost" on the pgsql-hackers
mailing list.

This proposal lets you inform the planner of a table's memory usage so
that it can take that into account.
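
As a rough illustration of the tablespace approach mentioned above:
seq_page_cost and random_page_cost can be set per tablespace with
ALTER TABLESPACE ... SET (available since PostgreSQL 9.0). The Python
sketch below only builds the statement text; the tablespace name hot_ts,
the helper names and the cost values 0.1 are assumptions for
illustration, not recommendations.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def tablespace_cost_sql(tablespace: str, seq_page_cost: float,
                        random_page_cost: float) -> str:
    """Build an ALTER TABLESPACE statement that sets both page-cost options."""
    if seq_page_cost < 0 or random_page_cost < 0:
        raise ValueError("page costs must be non-negative")
    return (
        f"ALTER TABLESPACE {quote_ident(tablespace)} "
        f"SET (seq_page_cost = {seq_page_cost}, "
        f"random_page_cost = {random_page_cost});"
    )


# "hot_ts" is a hypothetical tablespace holding the small relations that
# are expected to be cached; 0.1 is an arbitrary illustrative value.
statement = tablespace_cost_sql("hot_ts", 0.1, 0.1)
print(statement)
```

Lowering the costs only changes what the planner assumes; it does not by
itself keep the relations in memory.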


--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

Hello.

I did read, and AFAIR sometimes responded to, these long discussions. The
main point for me is that many DBAs don't want even more unpredictable
plans, with PostgreSQL knowing what's in memory right now and using this
information directly at runtime. I also think this point is valid.
What I would like is to force some relations to be in memory by giving
them a fixed part of shared buffers, and to tell PostgreSQL they are in
memory (by lowering page costs) so as to get fixed, optimal plans.

Best regards, Vitalii Tymchyshyn.

19.09.11 14:57, Cédric Villemain wrote:
> (...)


On 17/09/2011 22:01, Stefan Keller wrote:
> (...)
> * more... ?

 * Covering indexes
 * Controllable record compression
 * Memory tables