Re: Pluggable Storage - Andres's take - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Pluggable Storage - Andres's take
Date
Msg-id CAJrrPGeUqNy0BxWuDd1n5cLQoWUwQ5jhKnDQ6Ty3wt=KsyOoPQ@mail.gmail.com
Whole thread Raw
In response to Re: Pluggable Storage - Andres's take  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Pluggable Storage - Andres's take
List pgsql-hackers

On Tue, Mar 12, 2019 at 7:28 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

I had a look on the patch set. I cannot see the thread structure
due to the depth and cannot get the picture on the all patches,
but I have some comments. I apologize in advance for possible
duplicate with upthread.


Thanks for the review.
 

0001-Reduce-the...

This doesn't apply master.

Yes, these patches doesn't apply to the master.
These patches can only be applied to the code present in [1].
 
>  TupleTableSlot *
>  ExecStoreHeapTuple(HeapTuple tuple,
>                                  TupleTableSlot *slot,
> +                                Oid relid,
>                                  bool shouldFree)

  The comment for ExecStoreHeapTuple is missing the description
  on "relid" parameter.

Corrected.
 
> -        if (HeapTupleSatisfiesVisibility(tuple, &SnapshotDirty, hscan->rs_cbuf))
> +        if (HeapTupleSatisfiesVisibility(tuple, RelationGetRelid(hscan->rs_scan.rs_rd),
> +                                &SnapshotDirty, hscan->rs_cbuf))

The second parameter seems to be always
RelationGetRelid(...). Actually only relid is required but isn't
it better to take Relation instead of Oid as the second
parameter?

Currently the passed relid is used only in the case of 
history MVCC verification function. Passing the direct relation
pointer will lessen the performance impact as there is no
need of calculation to find out the relid.

Will update and share it.
 


0005-Reorganize-am-as...

> +   catalog. The contents of an table are entirely under the control of its

 "of an table" => "of a table"

Corrected. 
 
0006-Doc-update-of-Create-access..

> +      be <type>index_am_handler</type> and for <literal>TABLE</literal>
> +      access methods, it must be <type>table_am_handler</type>.
> +      The C-level API that the handler function must implement varies
> +      depending on the type of access method. The index access method API
> +      is described in <xref linkend="index-access-methods"/> and the table access method
> +      API is described in <xref linkend="table-access-methods"/>.

If table is the primary object, talbe-am should precede index-am?

Changed.
 

0007-Doc-update-of-create-materi..

> +   This clause specifies optional access method for the new materialize view;

 "materialize view" => "materialized view"?

Corrected.
 
> +   If this option is not specified, then the default table access method

 I'm not sure the 'then' is needed.

> +   is chosen for the new materialized view. see <xref linkend="guc-default-table-access-method"/>

 "see" => "See"


0008-Doc-update-of-CREATE_TABLE..

> +[ USING <replaceable class="parameter">method</replaceable> ]

 *I* prefer access_method to just method.

> +  If this option is not specified, then the default table access method

 Same to 0007. "I'm not sure the 'then' is needed.".

> +  is chosen for the new table. see <xref linkend="guc-default-table-access-method"/>

 Same to 0007. " "see" => "See" "
"

0009-Doc-of-CREATE-TABLE-AS

 Same as 0008.

Corrected as per your suggestions.
 
0010-Table-access-method-API-

> +    Any new <literal>TABLE ACCSESS METHOD</literal> developers can refer the exisitng <literal>HEAP</literal>


> +    There are differnt type of API's that are defined and those details are below.

  "differnt" => "different"

> +     by the AM, in case if it support parallel scan.

  "support" => "supports"

Corrected above both.
 
> + This API to return the total size that is required for the AM to perform

  Total size of what? Shared memory chunk? Or parallel scan descriptor?

It returns the required parallel scan descriptor memory size.
 

> +     the parallel table scan. The minimum size that is required is
> +     <structname>ParallelBlockTableScanDescData</structname>.

  I don't get what the "The minimum size" tells. Just reading
  this I would always return the minimum size...


> +     This API to perform the initialization of the <literal>parallel_scan</literal>
> +     that is required for the parallel scan to be performed by the AM and also return
> +     the total size that is required for the AM to perform the parallel table scan.

 (Note: I'm not good at English..) Similar to the above. I cannot
 read what the "size" is for.

 In the code it is used as:

  > Size snapshot_off = rel->rd_tableam->parallelscan_initialize(rel, pscan);

  (The varialbe name should be snapshot_offset) It is the offset
  from the beginning of parallel scan descriptor but it should be
  described in other representation, which I'm not sure of..

  Something like this?

  > This API to initialize a parallel scan by the AM and also
  > return the consumed size so far of parallel scan descriptor.

I updated the doc around those API's to make easy to understand.
Can you please check whether that helps?
 
updated patches are attached.

[1] -  https://github.com/anarazel/postgres-pluggable-storage.git

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Next
From: Justin Pryzby
Date:
Subject: Re: Change ereport level for QueuePartitionConstraintValidation