Thread: How to accurately determine when a relation should use local buffers?

How to accurately determine when a relation should use local buffers?

From

Давыдов Виталий

Date:

21 November 2023, 05:04:05

Dear Hackers,

I would like to clarify the correct way to determine whether a given relation uses local buffers. As far as I know, local buffers are used for temporary tables in backends. There are two functions/macros (bufmgr.c): SmgrIsTemp and RelationUsesLocalBuffers. The first verifies that the current process is a regular session backend, while the second verifies the relation's persistence. It seems that using either of them on its own is not correct. I think they should be applied together to check for local buffer use, but they appear to be used independently. This works only as long as temporary tables are allowed only in session backends.

I'm concerned about how to determine the use of local buffers in other, theoretical cases. For example, what if we decide to replicate temporary tables? Are there other cases in vanilla PostgreSQL where local buffers can be used with relations? Do we allow relations with RELPERSISTENCE_TEMP to be used outside session backends?

Thank you in advance for your help!

With best regards,
Vitaly Davydov

Re: How to accurately determine when a relation should use local buffers?

From

Aleksander Alekseev

Date:

21 November 2023, 08:52:13

Hi,

> I would like to clarify the correct way to determine whether a given relation uses local buffers. As far as I know, local buffers are used for temporary tables in backends. There are two functions/macros (bufmgr.c): SmgrIsTemp and RelationUsesLocalBuffers. The first verifies that the current process is a regular session backend, while the second verifies the relation's persistence. It seems that using either of them on its own is not correct. I think they should be applied together to check for local buffer use, but they appear to be used independently. This works only as long as temporary tables are allowed only in session backends.

Could you please provide a specific example when the current code will
do something wrong/unintended?

> I'm concerned about how to determine the use of local buffers in other, theoretical cases. For example, what if we decide to replicate temporary tables? Are there other cases in vanilla PostgreSQL where local buffers can be used with relations? Do we allow relations with RELPERSISTENCE_TEMP to be used outside session backends?

Temporary tables, by definition, are visible only within one session.
I can't imagine how and why they would be replicated.

--
Best regards,
Aleksander Alekseev

Re: How to accurately determine when a relation should use local buffers?

From

Vitaly Davydov

Date:

21 November 2023, 10:18:06

Hi Aleksander,

Thank you for the reply.

> Could you please provide a specific example when the current code will do something wrong/unintended?

I can't say that something is wrong in vanilla. But if you decide to replicate DDL in a solution such as multimaster, you might want to replicate CREATE TEMPORARY TABLE. Furthermore, there is a possible inconsistency in the code listed below (REL_16_STABLE, bufmgr.c):

FlushRelationBuffers and PrefetchBuffer use RelationUsesLocalBuffers(rel).
ExtendBufferedRel_common ultimately uses BufferManagerRelation.relpersistence, which is actually rd_rel->relpersistence, so it works like RelationUsesLocalBuffers.
ReadBuffer_common uses isLocalBuf = SmgrIsTemp(smgr), which checks rlocator.backend against InvalidBackendId.

I would like to clarify: do we completely rule out the use of temporary tables in contexts other than backends, or is there some work in progress to allow other usage contexts? If we do rule it out, checking rd_rel->relpersistence is enough. I'm not sure why we use SmgrIsTemp instead of RelationUsesLocalBuffers in ReadBuffer_common.

With best regards,

Vitaly Davydov

With best regards,
Давыдов Виталий
http://www.vdavydov.ru

Re: How to accurately determine when a relation should use local buffers?

From

Aleksander Alekseev

Date:

21 November 2023, 15:01:30

Hi,

> Furthermore, there is a possible inconsistency in the code listed below (REL_16_STABLE, bufmgr.c):
>
> FlushRelationBuffers and PrefetchBuffer use RelationUsesLocalBuffers(rel).
> ExtendBufferedRel_common ultimately uses BufferManagerRelation.relpersistence, which is actually rd_rel->relpersistence, so it works like RelationUsesLocalBuffers.
> ReadBuffer_common uses isLocalBuf = SmgrIsTemp(smgr), which checks rlocator.backend against InvalidBackendId.

I didn't do a deep investigation of the code in this particular aspect
but that could be a fair point. Would you like to propose a
refactoring that unifies the way we check if the relation is
temporary?

> I would like to clarify: do we completely rule out the use of temporary tables in contexts other than backends, or is there some work in progress to allow other usage contexts? If we do rule it out, checking rd_rel->relpersistence is enough. I'm not sure why we use SmgrIsTemp instead of RelationUsesLocalBuffers in ReadBuffer_common.

According to the comments in relfilelocator.h:

```
/*
 * Augmenting a relfilelocator with the backend ID provides all the information
 * we need to locate the physical storage.  The backend ID is InvalidBackendId
 * for regular relations (those accessible to more than one backend), or the
 * owning backend's ID for backend-local relations.  Backend-local relations
 * are always transient and removed in case of a database crash; they are
 * never WAL-logged or fsync'd.
 */
typedef struct RelFileLocatorBackend
{
    RelFileLocator locator;
    BackendId    backend;
} RelFileLocatorBackend;

#define RelFileLocatorBackendIsTemp(rlocator) \
    ((rlocator).backend != InvalidBackendId)
```

And this is what ReadBuffer_common() and other callers of SmgrIsTemp()
are using. So no, you can't have a temporary table without an assigned
RelFileLocatorBackend.backend.

It is my understanding that SmgrIsTemp() and
RelationUsesLocalBuffers() are equivalent, except that the first
macro works with SMgrRelation objects and the second one with
Relation objects.
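
To make the claimed equivalence concrete, here is a minimal sketch rather than PostgreSQL source. The Demo* types and demo_* functions are hypothetical, reduced stand-ins for SMgrRelation, Relation, SmgrIsTemp() and RelationUsesLocalBuffers(), and the constant values are assumptions. It only shows that the two tests agree as long as a relation gets an owning backend id exactly when its relpersistence is temporary:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in constants; the values are assumptions for this sketch. */
#define DEMO_INVALID_BACKEND_ID       (-1)
#define DEMO_RELPERSISTENCE_PERMANENT 'p'
#define DEMO_RELPERSISTENCE_TEMP      't'

/* Hypothetical, heavily reduced stand-in for SMgrRelation. */
typedef struct DemoSMgrRelation
{
    int         backend;        /* owning backend id, or DEMO_INVALID_BACKEND_ID */
} DemoSMgrRelation;

/* Hypothetical, heavily reduced stand-in for Relation. */
typedef struct DemoRelation
{
    char        relpersistence; /* catalog-level persistence flag */
    DemoSMgrRelation smgr;      /* storage-level handle */
} DemoRelation;

/* Storage-level test, analogous to SmgrIsTemp(). */
static bool
demo_smgr_is_temp(const DemoSMgrRelation *smgr)
{
    return smgr->backend != DEMO_INVALID_BACKEND_ID;
}

/* Catalog-level test, analogous to RelationUsesLocalBuffers(). */
static bool
demo_relation_uses_local_buffers(const DemoRelation *rel)
{
    return rel->relpersistence == DEMO_RELPERSISTENCE_TEMP;
}

/*
 * Build a relation under the assumed invariant: a relation has an owning
 * backend id if and only if it is temporary.
 */
static DemoRelation
demo_make_relation(char relpersistence, int my_backend_id)
{
    DemoRelation rel;

    rel.relpersistence = relpersistence;
    rel.smgr.backend = (relpersistence == DEMO_RELPERSISTENCE_TEMP)
        ? my_backend_id
        : DEMO_INVALID_BACKEND_ID;

    /* Under the invariant above the two tests cannot disagree. */
    assert(demo_smgr_is_temp(&rel.smgr) ==
           demo_relation_uses_local_buffers(&rel));
    return rel;
}
```

If some code path built a temporary relation without an owning backend id, the assertion in demo_make_relation() is the invariant that would break.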

--
Best regards,
Aleksander Alekseev

Re: How to accurately determine when a relation should use local buffers?

From

Давыдов Виталий

Date:

22 November 2023, 10:29:30

Hi Aleksander,

Thank you for your answers. It seems that local buffers are used for temporary relations unconditionally. In that case, we may check either relpersistence or the backend id, or both.

> I didn't do a deep investigation of the code in this particular aspect but that could be a fair point. Would you like to propose a refactoring that unifies the way we check if the relation is temporary?

I would propose not to associate temporary relations with local buffers. I would say that we should choose local buffers only in a backend context; that is the primary condition. Thus, to choose local buffers, two checks should both succeed:

relpersistence (RelationUsesLocalBuffers)
backend id (SmgrIsTemp)

I know it may not be as efficient as checking relpersistence alone, but I believe it makes the internal architecture more flexible.
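
The two-check rule proposed above can be sketched as follows. This is an illustration only, not a patch: DemoRelation and demo_should_use_local_buffers() are hypothetical names, the constant values are assumptions, and the struct flattens fields that live in separate PostgreSQL structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in constants; the values are assumptions for this sketch. */
#define DEMO_INVALID_BACKEND_ID       (-1)
#define DEMO_RELPERSISTENCE_PERMANENT 'p'
#define DEMO_RELPERSISTENCE_TEMP      't'

/* Hypothetical flattened view of the two properties being tested. */
typedef struct DemoRelation
{
    char        relpersistence; /* what RelationUsesLocalBuffers looks at */
    int         backend;        /* what SmgrIsTemp looks at */
} DemoRelation;

/*
 * Choose local buffers only when both conditions hold: the relation is
 * marked temporary and it carries an owning backend id.
 */
static bool
demo_should_use_local_buffers(const DemoRelation *rel)
{
    assert(rel != NULL);

    /* Persistence check first: non-temporary relations exit early. */
    if (rel->relpersistence != DEMO_RELPERSISTENCE_TEMP)
        return false;

    /* Backend-id check second: the relation must belong to a backend. */
    return rel->backend != DEMO_INVALID_BACKEND_ID;
}
```

With this rule a relation that is marked temporary but has no owning backend id falls through to shared buffers, which is the case a relpersistence-only check cannot express.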

With best regards,
Vitaly Davydov

Re: How to accurately determine when a relation should use local buffers?

From

Aleksander Alekseev

Date:

22 November 2023, 13:38:52

Hi,

> I would propose not to associate temporary relations with local buffers

The whole point of why local buffers exist is to place the buffers of
temp tables into MemoryContexts so that these tables will not fight
for the locks for shared buffers with the rest of the system. If we
start treating them as regular tables this will cause a severe
performance degradation. I doubt that such a patch will make it.

I sort of suspect that you are working on a very specific extension
and/or feature for PG fork. Any chance you could give us more details
about the case?

-- 
Best regards,
Aleksander Alekseev

Re: How to accurately determine when a relation should use local buffers?

From

Давыдов Виталий

Date:

24 November 2023, 07:10:17

Hi Aleksander,

> I sort of suspect that you are working on a very specific extension
> and/or feature for PG fork. Any chance you could give us more details
> about the case?

I'm trying to adapt a multimaster solution to some changes in pg16. We replicate temp table DDL for our own reasons. Furthermore, such tables should be accessible from processes other than the replication receiver process on a replica, and they should still be temporary. I understand that DML replication for temporary tables would cause severe performance degradation, but that is not our case.

The ReadBuffer logic has changed compared with pg15. In pg15, ReadBuffer used SmgrIsTemp to decide which buffers to use, so the decision was based on the relation's backend id. In pg16 the decision is based on the relpersistence attribute, which caused some problems on my side. In my opinion, we should choose local buffers based on the backend id of a relation, not on its persistence. An additional check of relpersistence before the backend id may improve performance in some cases, I think. The internal design may become more flexible as a result.
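
The difference described here can be shown with a small sketch. It is not taken from either release: the demo_* names and the DemoBufferPool enum are hypothetical, the constant values are assumptions, and the two rules simply restate this message's description of the pg15 and pg16 behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in constants; the values are assumptions for this sketch. */
#define DEMO_INVALID_BACKEND_ID       (-1)
#define DEMO_RELPERSISTENCE_PERMANENT 'p'
#define DEMO_RELPERSISTENCE_UNLOGGED  'u'
#define DEMO_RELPERSISTENCE_TEMP      't'

/* Hypothetical label for the buffer pool a read would go to. */
typedef enum DemoBufferPool
{
    DEMO_POOL_SHARED,
    DEMO_POOL_LOCAL
} DemoBufferPool;

/* Rule keyed on the backend id, as the message describes for pg15. */
static DemoBufferPool
demo_pick_pool_by_backend(int backend)
{
    return (backend != DEMO_INVALID_BACKEND_ID)
        ? DEMO_POOL_LOCAL
        : DEMO_POOL_SHARED;
}

/* Rule keyed on relpersistence, as the message describes for pg16. */
static DemoBufferPool
demo_pick_pool_by_persistence(char relpersistence)
{
    /* Reject values the sketch does not model. */
    assert(relpersistence == DEMO_RELPERSISTENCE_PERMANENT ||
           relpersistence == DEMO_RELPERSISTENCE_UNLOGGED ||
           relpersistence == DEMO_RELPERSISTENCE_TEMP);

    return (relpersistence == DEMO_RELPERSISTENCE_TEMP)
        ? DEMO_POOL_LOCAL
        : DEMO_POOL_SHARED;
}

/* True when the two rules would send the same relation to different pools. */
static bool
demo_rules_disagree(char relpersistence, int backend)
{
    return demo_pick_pool_by_backend(backend) !=
        demo_pick_pool_by_persistence(relpersistence);
}
```

The rules disagree, for example, for a relation marked temporary but carrying no owning backend id. Whether that is exactly the state the multimaster setup produces is an assumption of this sketch.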

With best regards,
Vitaly Davydov

Re: How to accurately determine when a relation should use local buffers?

From

Aleksander Alekseev

Date:

24 November 2023, 12:51:59

Hi,

> The ReadBuffer logic has changed compared with pg15. In pg15, ReadBuffer used SmgrIsTemp to decide which buffers to use, so the decision was based on the relation's backend id. In pg16 the decision is based on the relpersistence attribute, which caused some problems on my side. In my opinion, we should choose local buffers based on the backend id of a relation, not on its persistence. An additional check of relpersistence before the backend id may improve performance in some cases, I think. The internal design may become more flexible as a result.

Well even assuming this patch will make it to the upstream some day,
which I seriously doubt, it will take somewhere between 2 and 5 years.
Personally I would recommend reconsidering this design.

--
Best regards,
Aleksander Alekseev

Re: How to accurately determine when a relation should use local buffers?

From

Давыдов Виталий

Date:

27 November 2023, 08:56:11

Hi Aleksander,

> Well even assuming this patch will make it to the upstream some day,
> which I seriously doubt, it will take somewhere between 2 and 5 years.
> Personally I would recommend reconsidering this design.

I understand what you are saying. I have no plans to create a patch for this issue. I would like to believe that my case will be taken into consideration in future development. Thank you very much for your help!

With best regards,
Vitaly