On 2017/09/11 23:38, Robert Haas wrote:
> On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
> <yamada.tatsuro@lab.ntt.co.jp> wrote:
>> Thanks for the comment.
>>
>> As you know, the CLUSTER command chooses SEQ SCAN or INDEX SCAN as its scan
>> method based on cost estimation. With SEQ SCAN, these two phases do not
>> overlap. With INDEX SCAN, however, they do. That is why I created the "scan
>> heap and write new heap" phase for the case where INDEX SCAN is selected.
>>
>> I agree that progress reporting for sort is difficult. So the current design
>> of the CLUSTER progress monitor reports only the phase ("sorting tuples")
>> during the sort. It doesn't report a counter for the sort.
>
> Doesn't that make it almost useless? I would guess that scanning the
> heap and writing the new heap would ordinarily account for most of the
> runtime, or at least enough that you're going to want something more
> than just knowing that's the phase you're in.

Hmmm, should I add a counter in tuplesort.c (in tuplesort_performsort())?
I know that an external merge sort takes more time than a quicksort.
I'll try investigating how to get a counter from the external merge sort
processing. Is this the right way?

>>> The patch is getting the value reported as heap_tuples_total from
>>> OldHeap->rd_rel->reltuples. I think this is pointless: the user can
>>> see that value anyway if they wish. The point of the progress
>>> counters is to expose things the user couldn't otherwise see. It's
>>> also not necessarily accurate: it's only an estimate in the best case,
>>> and may be way off if the relation has recently been extended by a large
>>> amount. I think it's pretty important that we try hard to only report
>>> values that are known to be accurate, because users hate (and mock)
>>> inaccurate progress reports.
>>
>> Do you mean I should compute the number of rows with the calculation below
>> instead of using OldHeap->rd_rel->reltuples?
>>
>> estimate rows = physical table size / average row length
>
> No, I mean don't report it at all. The caller can do that calculation
> if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column in the next patch.
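
For reference, the calculation a caller could do on their own is just a
division. A minimal sketch in Python, where estimate_rows and its parameters
are hypothetical names used only for illustration (they are not part of
PostgreSQL or of the patch):

```python
def estimate_rows(table_size_bytes: int, avg_row_len_bytes: float) -> float:
    """Rough row estimate: physical table size / average row length.

    Both inputs are assumed to be supplied by the caller; this is only an
    approximation and is not a value the progress view would report.
    """
    # Guard against an unknown or zero average row length.
    if avg_row_len_bytes <= 0:
        return 0.0
    return table_size_bytes / avg_row_len_bytes
```

Like reltuples, the result is only an estimate, which is why it does not
belong in the progress view.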

Regards,
Tatsuro Yamada