Re: POC: Extension for adding distributed tracing - pg_tracing - Mailing list pgsql-hackers

From Anthonin Bonnefoy
Subject Re: POC: Extension for adding distributed tracing - pg_tracing
Date
Msg-id CAO6_XqpDGFw=HRW_vLTk+wSjYu1FchB8Ab3=mUZYZj_P0htQLA@mail.gmail.com
In response to Re: POC: Extension for adding distributed tracing - pg_tracing  (Nikita Malakhov <hukutoc@gmail.com>)
Responses Re: POC: Extension for adding distributed tracing - pg_tracing
List pgsql-hackers
Hi!

> 1) query_id added so span to be able to join it with pg_stat_activity and pg_stat_statements;
Sounds good, I've merged your changes into my code.

> 2) table for storing spans added, to flush spans buffer
I'm not sure about this. It means this would only be available on the primary, as replicas won't be able
to write data to the table. It would also make version updates and migrations much more complex, and I haven't seen a similar
pattern in other extensions.

> 3) added setter function for sampling_rate GUC to tweak it on-the-fly without restart
OK, I've added this to my branch.

On my side, I've made the following changes:
1) All spans are now kept in palloc'ed buffers and only added to the shared buffer during end_tracing. This way, we limit the time the shared_spans lock is held.
2) I've added a pg_tracing.drop_on_full_buffer parameter to drop all spans when the buffer is full. This could be useful to always keep
the latest spans when the consuming app is not fast enough. It is also useful for testing.
3) I'm testing more complex queries. Most of my previous tests used the simple query protocol, but the extended protocol introduces
differences that break some assumptions I made. For example, with a multi-statement transaction like
BEGIN;
SELECT 1;
SELECT 2;
The parse of SELECT 2 will happen before the ExecutorEnd (and the end_tracing) of SELECT 1. For now, I'm skipping the post-parse
hook if a trace is still ongoing.
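
To make the ordering problem concrete, here is a minimal standalone sketch in C of the "skip the post-parse hook while a trace is ongoing" guard. It does not use the PostgreSQL hook API; TraceState, on_post_parse and on_executor_end are hypothetical names standing in for the real hooks and for the extension's state, so treat it as an illustration of the idea rather than the actual pg_tracing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-backend tracing state (not the real pg_tracing struct). */
typedef struct TraceState {
    bool tracing_in_progress; /* a trace was started and not yet ended */
    int  traces_started;      /* how many traces the parse step started */
    int  parses_skipped;      /* parse events ignored because a trace was ongoing */
} TraceState;

/* Stand-in for a post-parse hook: start a trace unless one is still ongoing. */
static void on_post_parse(TraceState *st)
{
    if (st->tracing_in_progress) {
        /* Extended protocol: the next statement was parsed before the
         * previous one reached ExecutorEnd, so leave the state untouched. */
        st->parses_skipped++;
        return;
    }
    st->tracing_in_progress = true;
    st->traces_started++;
}

/* Stand-in for ExecutorEnd followed by end_tracing. */
static void on_executor_end(TraceState *st)
{
    st->tracing_in_progress = false;
}
```

Replaying the order described above (parse of SELECT 1, parse of SELECT 2, then ExecutorEnd of SELECT 1) starts one trace and skips one parse. In this simplified model the second statement is simply not traced, which may or may not match what the extension ends up doing.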
I've also started running https://github.com/anse1/sqlsmith on a db with full sampling; it's currently failing some assertions, and I'm
working on fixing those.
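
Changes 1) and 2) above can be sketched roughly as follows. This is a standalone C model, not the extension's code: malloc/realloc stand in for palloc, a counter stands in for the shared_spans lock, and the Span, SharedSpans, LocalSpans, local_add and flush_spans names are all hypothetical. The behaviour when drop_on_full is off (discarding the spans that do not fit) is also an assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified span; a real span carries many more fields. */
typedef struct Span {
    long query_id;
    long start_ns;
    long end_ns;
} Span;

/* Stand-in for the shared span buffer that a lock would protect. */
typedef struct SharedSpans {
    Span  *spans;
    size_t capacity;
    size_t count;
    bool   drop_on_full;      /* models pg_tracing.drop_on_full_buffer */
    size_t lock_acquisitions; /* how many times the "lock" was taken */
    size_t dropped;           /* spans discarded so far */
} SharedSpans;

/* Backend-local buffer, standing in for a palloc'ed array. */
typedef struct LocalSpans {
    Span  *spans;
    size_t capacity;
    size_t count;
} LocalSpans;

static void shared_init(SharedSpans *s, size_t capacity, bool drop_on_full)
{
    assert(capacity > 0);
    s->spans = malloc(capacity * sizeof(Span));
    assert(s->spans != NULL);
    s->capacity = capacity;
    s->count = 0;
    s->drop_on_full = drop_on_full;
    s->lock_acquisitions = 0;
    s->dropped = 0;
}

/* Record a span locally; no lock is needed because the buffer is private. */
static void local_add(LocalSpans *l, Span span)
{
    if (l->count == l->capacity) {
        size_t new_cap = l->capacity ? l->capacity * 2 : 8;
        Span  *p = realloc(l->spans, new_cap * sizeof(Span));
        assert(p != NULL);
        l->spans = p;
        l->capacity = new_cap;
    }
    l->spans[l->count++] = span;
}

/* end_tracing-like step: take the lock once and copy all local spans. */
static void flush_spans(SharedSpans *s, LocalSpans *l)
{
    s->lock_acquisitions++; /* a real lock acquire would go here */
    for (size_t i = 0; i < l->count; i++) {
        if (s->count == s->capacity) {
            if (!s->drop_on_full) {
                /* Assumed behaviour: discard the spans that do not fit. */
                s->dropped += l->count - i;
                break;
            }
            /* Drop everything stored so far to make room for newer spans. */
            s->dropped += s->count;
            s->count = 0;
        }
        s->spans[s->count++] = l->spans[i];
    }
    /* a real lock release would go here */
    l->count = 0;
}
```

The point of the design is that local_add never touches shared state, so the lock is taken once per traced query instead of once per span.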


On Thu, Aug 3, 2023 at 9:13 PM Nikita Malakhov <hukutoc@gmail.com> wrote:
Hi!

Please check some suggested improvements -
1) query_id added so span to be able to join it with pg_stat_activity and pg_stat_statements;
2) table for storing spans added, to flush spans buffer, for maintenance reasons - to keep track of spans,
with SQL function that flushes buffer into table instead of recordset;
3) added setter function for sampling_rate GUC to tweak it on-the-fly without restart.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company