Re: POC: Extension for adding distributed tracing - pg_tracing - Mailing list pgsql-hackers

From Anthonin Bonnefoy
Subject Re: POC: Extension for adding distributed tracing - pg_tracing
Msg-id CAO6_Xqr5cNC-fK7kX4Pt9LkYSNjuZQjDfxbm_ckypM9LD8PT1Q@mail.gmail.com
In response to Re: POC: Extension for adding distributed tracing - pg_tracing  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Re: POC: Extension for adding distributed tracing - pg_tracing
List pgsql-hackers

I initially thought of sending the spans from PostgreSQL, since this is the usual behavior of tracing libraries.

However, this created a lot of potential issues:

- Protocol support and differences between trace collectors. OpenTelemetry seems to use gRPC, while others use HTTP, and those would require additional libraries (plus gRPC support in C doesn't look good). Any change in the protobuf definition would also require updating the extension.

- Do we send the spans within the query hooks? This means that we could block the process if the trace collector is slow to answer or we can’t connect. Sending spans from a background process instead sounded rather complex and resource-heavy.

Moving to a pull model fixed those issues and felt more natural as this is the way PostgreSQL exposes its metrics.
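
To illustrate the difference, here is a minimal sketch in C of what the pull side could look like. It is not the extension's actual code: the Span fields, SpanBuffer, SPAN_BUFFER_CAPACITY and both functions are hypothetical names chosen for illustration, and a real extension would presumably keep the buffer in shared memory and protect it with a lock, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity; kept tiny for illustration. */
#define SPAN_BUFFER_CAPACITY 4

/* Hypothetical span record; field names are illustrative only. */
typedef struct Span
{
    uint64_t trace_id;
    uint64_t span_id;
    int64_t  start_us;
    int64_t  duration_us;
} Span;

/* Fixed-size store filled by backends and drained by whoever pulls. */
typedef struct SpanBuffer
{
    Span   spans[SPAN_BUFFER_CAPACITY];
    size_t count;    /* spans currently stored */
    size_t dropped;  /* spans discarded because the buffer was full */
} SpanBuffer;

/*
 * Called from the query path. It never waits on a collector:
 * when the buffer is full the span is dropped and counted.
 * Returns 1 if the span was stored, 0 if it was dropped.
 */
static int
span_buffer_add(SpanBuffer *buf, const Span *span)
{
    assert(buf != NULL && span != NULL);
    if (buf->count == SPAN_BUFFER_CAPACITY)
    {
        buf->dropped++;
        return 0;
    }
    buf->spans[buf->count++] = *span;
    return 1;
}

/*
 * Called when a collector pulls. Copies up to max of the oldest spans
 * into out, removes them from the buffer, and returns how many were copied.
 */
static size_t
span_buffer_pull(SpanBuffer *buf, Span *out, size_t max)
{
    size_t n;

    assert(buf != NULL && out != NULL);
    n = buf->count < max ? buf->count : max;
    memcpy(out, buf->spans, n * sizeof(Span));
    memmove(buf->spans, buf->spans + n, (buf->count - n) * sizeof(Span));
    buf->count -= n;
    return n;
}
```

With this shape, the query hooks only ever call span_buffer_add(), so a slow or unreachable collector cannot block a backend; the worst case is that spans get dropped until the next pull.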


On Wed, Jul 26, 2023 at 4:11 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
Nikita,

> This patch looks very interesting, I'm working on the same subject too. But I've used
> another approach - I'm using C wrapper for C++ API library from OpenTelemetry, and
> handle span storage and output to this library. There are some nuances though, but it
> is possible. Have you tried to use OpenTelemetry APIs instead of implementing all
> functionality around spans?

I don't think that PostgreSQL accepts this kind of C++ code, not to
mention the fact that the PostgreSQL license is not necessarily
compatible with Apache 2.0 (I'm not a lawyer; this is not legal
advice). Such a design decision will probably require using separate
compile flags since the user doesn't necessarily have the corresponding
dependency installed, similar to what we do with LLVM, OpenSSL, etc.

So -1 to the OpenTelemetry C++ library and +1 to the properly licensed
C implementation without 3rd party dependencies from me. Especially
considering the fact that the implementation seems to be rather
simple.

--
Best regards,
Aleksander Alekseev
