Re: About "Our CLUSTER implementation is pessimal" patch - Mailing list pgsql-hackers

From Leonardo F
Subject Re: About "Our CLUSTER implementation is pessimal" patch
Date
Msg-id 925824.32998.qm@web29018.mail.ird.yahoo.com
In response to About "Our CLUSTER implementation is pessimal" patch  (Leonardo F <m_lists@yahoo.it>)
Responses Re: About "Our CLUSTER implementation is pessimal" patch  (Leonardo F <m_lists@yahoo.it>)
Re: About "Our CLUSTER implementation is pessimal" patch  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
> I read the thread "Our CLUSTER implementation is pessimal"
> http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php .
>
> I would like to try/integrate that patch as we use CLUSTER a lot on our system.
>
> I was going to try to add the proper cost_index/cost_sort calls to decide which
> "path" should be executed, as in:
>
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg00517.php

I think I have something up and running that checks whether a table scan + sort
is estimated to be cheaper than an index scan for a given CLUSTER operation.

The way I did it is (I guess...) wrong: I built by hand the planner structures
needed by get_relation_info, create_seqscan_path, create_index_path and cost_sort.

It has been, obviously, a trial-and-error approach: I filled in struct members one
at a time, whenever a function call crashed... and I bet I didn't cover all the
corner cases. Is there a better way of doing it?

Leonardo

(this is called in copy_heap_data to decide which path to choose:)

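/* Headers needed on top of what cluster.c already includes */
#include "nodes/relation.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"

/*
 * Return true if scanning the table through index indexOid is estimated to
 * be cheaper than a sequential scan of tableOid followed by a sort.
 */
static bool
use_index_scan(Oid tableOid, Oid indexOid)
{
    RelOptInfo     *rel;
    PlannerInfo    *root;
    Query          *query;
    PlannerGlobal  *glob;
    Path           *seqAndSortPath;
    IndexPath      *indexPath;
    IndexOptInfo   *indexInfo = NULL;
    RangeTblEntry  *rte;
    ListCell       *lc;

    /* Fake a base relation at range table index 1 */
    rel = makeNode(RelOptInfo);
    rel->reloptkind = RELOPT_BASEREL;
    rel->relid = 1;
    rel->rtekind = RTE_RELATION;

    /* needed by get_relation_info */
    glob = makeNode(PlannerGlobal);

    /* needed by get_relation_info */
    query = makeNode(Query);
    query->resultRelation = 0;

    root = makeNode(PlannerInfo);
    root->parse = query;
    root->glob = glob;

    /* Fills in pages, tuples and the index list from the catalogs */
    get_relation_info(root, tableOid, false, rel);
    seqAndSortPath = create_seqscan_path(root, rel);

    /* CLUSTER has no quals, so every tuple is returned */
    rel->rows = rel->tuples;

    /* One-entry range table; slot 0 is unused */
    rte = makeNode(RangeTblEntry);
    rte->rtekind = RTE_RELATION;
    rte->relid = tableOid;

    root->simple_rel_array_size = 2;
    root->simple_rte_array = (RangeTblEntry **)
        palloc0(root->simple_rel_array_size * sizeof(RangeTblEntry *));
    root->simple_rte_array[1] = rte;

    root->total_table_pages = rel->pages;

    /* Look up the index we were asked about, not just the first in the list */
    foreach(lc, rel->indexlist)
    {
        IndexOptInfo *info = (IndexOptInfo *) lfirst(lc);

        if (info->indexoid == indexOid)
        {
            indexInfo = info;
            break;
        }
    }

    /* Assumption: if the planner does not see the index, prefer seqscan + sort */
    if (indexInfo == NULL)
        return false;

    indexPath = create_index_path(root, indexInfo, NIL, NIL,
                                  ForwardScanDirection, NULL);

    /*
     * Add the sort cost on top of the seqscan cost (overwrites the path's
     * costs in place).
     * XXX rel->width does not seem to be set by get_relation_info, so the
     * sort size estimate may be too low here.
     */
    cost_sort(seqAndSortPath, root, NIL, seqAndSortPath->total_cost,
              rel->tuples, rel->width, -1);

    return indexPath->path.total_cost < seqAndSortPath->total_cost;
}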
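
To make the comparison easier to reason about outside the backend, here is a
standalone toy model of the same decision. It is only a sketch: the toy_*
functions are hypothetical and are not the planner's cost_seqscan, cost_sort or
cost_index. The four constants are the documented default values of the
planner cost settings; the formulas are rough simplifications (in-memory sort
only, index page reads ignored, heap I/O interpolated on the square of an
assumed correlation value).

```c
#include <stdbool.h>

/* Documented defaults of the planner cost settings */
static const double SEQ_PAGE_COST = 1.0;
static const double RANDOM_PAGE_COST = 4.0;
static const double CPU_TUPLE_COST = 0.01;
static const double CPU_OPERATOR_COST = 0.0025;

/* Crude ceil(log2(n)) so the sketch does not need libm */
static double
toy_log2(double n)
{
    double      r = 0.0;

    while (n > 1.0)
    {
        n /= 2.0;
        r += 1.0;
    }
    return r;
}

/* Toy estimate: read every heap page sequentially, then sort in memory */
static double
toy_seqscan_sort_cost(double pages, double tuples)
{
    double      scan = pages * SEQ_PAGE_COST + tuples * CPU_TUPLE_COST;
    double      sort = 2.0 * CPU_OPERATOR_COST * tuples * toy_log2(tuples);

    return scan + sort;
}

/*
 * Toy estimate: full index scan.  correlation is in [0, 1]: 1 means the heap
 * is already in index order (sequential page reads), 0 means every tuple
 * costs one random page fetch.  Index page reads are ignored.
 */
static double
toy_indexscan_cost(double pages, double tuples, double correlation)
{
    double      best_io = pages * SEQ_PAGE_COST;
    double      worst_io = tuples * RANDOM_PAGE_COST;
    double      csquared = correlation * correlation;
    double      io = worst_io + csquared * (best_io - worst_io);

    return io + tuples * CPU_TUPLE_COST;
}

/* Same decision rule as use_index_scan: pick the cheaper estimate */
static bool
toy_use_index_scan(double pages, double tuples, double correlation)
{
    return toy_indexscan_cost(pages, tuples, correlation) <
        toy_seqscan_sort_cost(pages, tuples);
}
```

With 10000 pages and one million tuples, this toy model picks the index scan
when the table is already perfectly ordered (correlation 1.0) and picks
seqscan + sort when it is completely unordered (correlation 0.0), which is the
behaviour the patch is after.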





