Re: Index for range queries on JSON (user defined fields) - Mailing list pgsql-performance

From Marco Colli
Subject Re: Index for range queries on JSON (user defined fields)
Date
Msg-id CAFvCgN4fZc4ZGw5z-y_xbj9sJrK9nMNgzDiUkXBDQQuunZgstQ@mail.gmail.com
Whole thread Raw
In response to Re: Index for range queries on JSON (user defined fields)  (Nick Cleaton <nick@cleaton.net>)
List pgsql-performance
Thanks for the suggestion: I had already considered that solution (first link), but the fear is having to JOIN large tables with hundreds of millions of records. 

For my understanding **using JOIN when dealing with big data is bad and a nightmare for performance**: can you confirm? Or am I missing something?

That tables would be frequently read and updated and are the core of the application: that also means that every update on a user would produce **many dead rows** - not just 1 user row, as in the case of JSON, but many rows in the user metadata table.





On Fri, Dec 4, 2020 at 11:40 PM Nick Cleaton <nick@cleaton.net> wrote:
On Fri, 4 Dec 2020 at 15:39, Marco Colli <collimarco91@gmail.com> wrote:
Hello!

We have a multi-tenant service where each customer has millions of users (total: ~150M rows). Now we would like to let each customer define some custom columns for his users and then let the customer search his users efficiently based on these columns.

This problem seems really hard to solve with PostgreSQL:

In particular the easiest way would be to add a JSON field on the users table (e.g. user metadata). However the PostgreSQL GIN index only supports exact matches and not range queries. This means that a query on a range (e.g. age > 30) would be extremely inefficient and would result in a table scan.

You could have a table of (tenant, customer, setting_name, setting_value) so that a btree index on (tenant, setting_name, setting_value) would work for "select customer from my_table where tenant=$1 and setting_name='age' and setting_value > 30"

That doesn't deal with setting values having a variety of types, but you could have a distinct user defined settings table for each setting value type that you want to support.

pgsql-performance by date:

Previous
From: Nick Cleaton
Date:
Subject: Re: Index for range queries on JSON (user defined fields)
Next
From: Atul Kumar
Date:
Subject: "Required checkpoints occurs too frequently"