Thread: Removing a key from jsonb is sloooow

Removing a key from jsonb is sloooow

From
Volkan Unsal
Date:
I'm trying to remove a key from a jsonb column in a table with 10K rows, and the performance is abysmal. When the key is missing, it takes 5 minutes. When the key is present, it takes even longer.

Test with non-existent key:

>> update projects set misc = misc - 'foo';
Time: 324711.960 ms (05:24.712)

What can I do to improve this?

Re: Removing a key from jsonb is sloooow

From
Adrian Klaver
Date:
On 7/17/19 7:30 AM, Volkan Unsal wrote:
> I'm trying to remove a key from a jsonb column in a table with 10K rows, 
> and the performance is abysmal. When the key is missing, it takes 5 
> minutes. When the key is present, it takes even longer.
> 
> Test with non-existent key:
> 
>  >> update projects set misc = misc - 'foo';
> Time: 324711.960 ms (05:24.712)
> 
> What can I do to improve this?

Provide some useful information:

1) Postgres version

2) Table schema

3) Explain analyze of query



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Removing a key from jsonb is sloooow

From
Tumasgiu Rossini
Date:
Hi,

Your current query actually process your 10K rows,
it is a bit of an overkill if only a few row contains the key you want to delete.
Depending on how big your json data is, this could be problematic.

Have you considered adding a where clause to your query ?

Also, maybe you could create an index on your jsonb column
to improve the identification of rows which
contains the key you want to delete ?



Le mer. 17 juil. 2019 à 16:31, Volkan Unsal <spocksplanet@gmail.com> a écrit :
I'm trying to remove a key from a jsonb column in a table with 10K rows, and the performance is abysmal. When the key is missing, it takes 5 minutes. When the key is present, it takes even longer.

Test with non-existent key:

>> update projects set misc = misc - 'foo';
Time: 324711.960 ms (05:24.712)

What can I do to improve this?

Re: Removing a key from jsonb is sloooow

From
Adrian Klaver
Date:
On 7/17/19 7:58 AM, Volkan Unsal wrote:

Please post to list also.
Ccing list
> @Adrian
> 
> More information about my setup:
> 
> Postgres version:
> PostgreSQL 10.9 (Debian 10.9-1.pgdg90+1) on x86_64-pc-linux-gnu, 
> compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
> 
> Table schema:
> CREATE TABLE public.projects (
>     misc jsonb DEFAULT '{}'::jsonb NOT NULL
> );
> 
> Explain analyze:
> explain analyze update projects set misc = misc - 'foo';
> 
> Update on projects  (cost=0.00..4240.93 rows=10314 width=1149) (actual 
> time=346318.291..346318.295 rows=0 loops=1)
>    ->  Seq Scan on projects  (cost=0.00..4240.93 rows=10314 width=1149) 
> (actual time=1.011..266.435 rows=10314 loops=1)
> Planning time: 40.087 ms
> Trigger trigger_populate_tsv_body_on_projects: time=341202.492 calls=10314
> Execution time: 346320.260 ms
> 
> Time: 345969.035 ms (05:45.969)
> 
> 
> 
> On Wed, Jul 17, 2019 at 10:39 AM Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 7/17/19 7:30 AM, Volkan Unsal wrote:
>      > I'm trying to remove a key from a jsonb column in a table with
>     10K rows,
>      > and the performance is abysmal. When the key is missing, it takes 5
>      > minutes. When the key is present, it takes even longer.
>      >
>      > Test with non-existent key:
>      >
>      >  >> update projects set misc = misc - 'foo';
>      > Time: 324711.960 ms (05:24.712)
>      >
>      > What can I do to improve this?
> 
>     Provide some useful information:
> 
>     1) Postgres version
> 
>     2) Table schema
> 
>     3) Explain analyze of query
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 
> 
> -- 
> *Volkan Unsal*
> /Product Engineer/
> volkanunsal.com <http://volkanunsal.com>


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Removing a key from jsonb is sloooow

From
Volkan Unsal
Date:
More information about my setup:

Postgres version:
PostgreSQL 10.9 (Debian 10.9-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Table schema:
CREATE TABLE public.projects (
   misc jsonb DEFAULT '{}'::jsonb NOT NULL
);

Explain analyze:
explain analyze update projects set misc = misc - 'foo';

Update on projects  (cost=0.00..4240.93 rows=10314 width=1149) (actual time=346318.291..346318.295 rows=0 loops=1)     
  ->  Seq Scan on projects  (cost=0.00..4240.93 rows=10314 width=1149) (actual time=1.011..266.435 rows=10314 loops=1) 
Planning time: 40.087 ms                                                                                               
Trigger trigger_populate_tsv_body_on_projects: time=341202.492 calls=10314                                             
Execution time: 346320.260 ms                                                                                          

Time: 345969.035 ms (05:45.969)


Figured out that it's due to the trigger. Thanks for your help, Adrian!


On Wed, Jul 17, 2019 at 10:39 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 7/17/19 7:30 AM, Volkan Unsal wrote:
> I'm trying to remove a key from a jsonb column in a table with 10K rows,
> and the performance is abysmal. When the key is missing, it takes 5
> minutes. When the key is present, it takes even longer.
>
> Test with non-existent key:
>
>  >> update projects set misc = misc - 'foo';
> Time: 324711.960 ms (05:24.712)
>
> What can I do to improve this?

Provide some useful information:

1) Postgres version

2) Table schema

3) Explain analyze of query



--
Adrian Klaver
adrian.klaver@aklaver.com


--
Volkan Unsal
Product Engineer

Re: Removing a key from jsonb is sloooow

From
Adrian Klaver
Date:
On 7/17/19 7:59 AM, Volkan Unsal wrote:

> Aha, it's due to the trigger, isn't it?

Yes.
> 
> On Wed, Jul 17, 2019 at 10:58 AM Volkan Unsal <spocksplanet@gmail.com 
> <mailto:spocksplanet@gmail.com>> wrote:
> 
>     @Adrian
> 
>     More information about my setup:
> 
>     Postgres version:
>     PostgreSQL 10.9 (Debian 10.9-1.pgdg90+1) on x86_64-pc-linux-gnu,
>     compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
> 
>     Table schema:
>     CREATE TABLE public.projects (
>         misc jsonb DEFAULT '{}'::jsonb NOT NULL
>     );
> 
>     Explain analyze:
>     explain analyze update projects set misc = misc - 'foo';
> 
>     Update on projects  (cost=0.00..4240.93 rows=10314 width=1149)
>     (actual time=346318.291..346318.295 rows=0 loops=1)
>        ->  Seq Scan on projects  (cost=0.00..4240.93 rows=10314
>     width=1149) (actual time=1.011..266.435 rows=10314 loops=1)
>     Planning time: 40.087 ms
>     Trigger trigger_populate_tsv_body_on_projects: time=341202.492
>     calls=10314
>     Execution time: 346320.260 ms
> 
>     Time: 345969.035 ms (05:45.969)
> 
> 
> 
>     On Wed, Jul 17, 2019 at 10:39 AM Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>         On 7/17/19 7:30 AM, Volkan Unsal wrote:
>          > I'm trying to remove a key from a jsonb column in a table
>         with 10K rows,
>          > and the performance is abysmal. When the key is missing, it
>         takes 5
>          > minutes. When the key is present, it takes even longer.
>          >
>          > Test with non-existent key:
>          >
>          >  >> update projects set misc = misc - 'foo';
>          > Time: 324711.960 ms (05:24.712)
>          >
>          > What can I do to improve this?
> 
>         Provide some useful information:
> 
>         1) Postgres version
> 
>         2) Table schema
> 
>         3) Explain analyze of query
> 
> 
> 
>         -- 
>         Adrian Klaver
>         adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 
> 
>     -- 
>     *Volkan Unsal*
>     /Product Engineer/
>     volkanunsal.com <http://volkanunsal.com>
> 
> 
> 
> -- 
> *Volkan Unsal*
> /Product Engineer/
> volkanunsal.com <http://volkanunsal.com>


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Removing a key from jsonb is sloooow

From
Tom Lane
Date:
Volkan Unsal <spocksplanet@gmail.com> writes:
> I'm trying to remove a key from a jsonb column in a table with 10K rows,
> and the performance is abysmal. When the key is missing, it takes 5
> minutes. When the key is present, it takes even longer.

How wide are the jsonb values?  It seems likely that most of this
is TOAST overhead [1], ie time to reassemble wide jsonb values
and then split them up again.

As Tumasgiu already mentioned, it'd likely be useful to suppress
updates of rows that don't actually need to change, assuming that
the key appears in a minority of rows.  And an index could help
even more, by avoiding the need to reconstruct wide values to
see if the key appears in them.

Of course, if most of the rows need an update, neither of these
will help and you just gotta live with it.  Possibly reconsider
your approach of using a large JSONB value to contain fields
you need to update individually.  That's never going to be great
for performance.  SQL (or at least Postgres) is incapable of
updating portions of columns efficiently.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/storage-toast.html