Thread: Batch insert heavily affecting query performance.

Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

One table only, called Messages:

Uuid
Country  (ISO)
Role (Text)
User id  (Text)
GroupId (integer)
Channel (text)
Title (Text)
Payload (JSON, up to 20kb)
Starts_in (UTC)
Expires_in (UTC)
Seen (boolean)
Deleted (boolean)
LastUpdate (UTC)
Created_by (UTC)
Created_in (UTC)

Indexes:

UUID (PK)
UserID + Country (main index)
LastUpdate 
GroupID 


We inserted 160MM rows, around 2KB each. No partitioning.

Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.

Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!

We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql. 

The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)

We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.

We haven't done any performance tuning in the DB. 

With all that said, the question is:

What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).

We are limited to use AWS RDS at the moment.

Cheers


Re: Batch insert heavily affecting query performance.

From
"MichaelDBA@sqlexec.com"
Date:
Are the inserts being done through one connection or multiple connections concurrently?

Sent from my iPhone

> On Dec 24, 2017, at 2:51 PM, Jean Baro <jfbaro@gmail.com> wrote:
>
> Hi there,
>
> We are testing a new application to try to find performance issues.
>
> AWS RDS m4.large 500GB storage (SSD)
>
> One table only, called Messages:
>
> Uuid
> Country  (ISO)
> Role (Text)
> User id  (Text)
> GroupId (integer)
> Channel (text)
> Title (Text)
> Payload (JSON, up to 20kb)
> Starts_in (UTC)
> Expires_in (UTC)
> Seen (boolean)
> Deleted (boolean)
> LastUpdate (UTC)
> Created_by (UTC)
> Created_in (UTC)
>
> Indexes:
>
> UUID (PK)
> UserID + Country (main index)
> LastUpdate
> GroupID
>
>
> We inserted 160MM rows, around 2KB each. No partitioning.
>
> Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows
increased. In the end we got around 500 inserts per second. 
>
> Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over
20seconds!!! 
>
> We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql.
>
> The insert performance is important, but we would slow it down if needed in order to ensure a more flat query
performance.(Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered
andorder by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text
searchin Postgres. In some ways we use it more like a glorified file system. :) 
>
> We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the
queryperformance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each)
under2 seconds, even when there is millions of messages on SQS being inserted into PG. 
>
> We haven't done any performance tuning in the DB.
>
> With all that said, the question is:
>
> What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low
priority).
>
> We are limited to use AWS RDS at the moment.
>
> Cheers
>
>



Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Multiple connections, but we are going to test it with only one. Would it make any difference?

Thanks 



Em 24 de dez de 2017 21:52, "MichaelDBA@sqlexec.com" <michaeldba@sqlexec.com> escreveu:
Are the inserts being done through one connection or multiple connections concurrently?

Sent from my iPhone

> On Dec 24, 2017, at 2:51 PM, Jean Baro <jfbaro@gmail.com> wrote:
>
> Hi there,
>
> We are testing a new application to try to find performance issues.
>
> AWS RDS m4.large 500GB storage (SSD)
>
> One table only, called Messages:
>
> Uuid
> Country  (ISO)
> Role (Text)
> User id  (Text)
> GroupId (integer)
> Channel (text)
> Title (Text)
> Payload (JSON, up to 20kb)
> Starts_in (UTC)
> Expires_in (UTC)
> Seen (boolean)
> Deleted (boolean)
> LastUpdate (UTC)
> Created_by (UTC)
> Created_in (UTC)
>
> Indexes:
>
> UUID (PK)
> UserID + Country (main index)
> LastUpdate
> GroupID
>
>
> We inserted 160MM rows, around 2KB each. No partitioning.
>
> Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.
>
> Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!
>
> We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql.
>
> The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)
>
> We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.
>
> We haven't done any performance tuning in the DB.
>
> With all that said, the question is:
>
> What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).
>
> We are limited to use AWS RDS at the moment.
>
> Cheers
>
>

Re: Batch insert heavily affecting query performance.

From
MichaelDBA
Date:
Yes it would/does make a difference!  When you do it with one connection you should see a big performance gain.  Delayed, granted, extend locks (locktype=extend) can happen due to many concurrent connections trying to insert into the same table at the same time. Each insert request results in an extend lock (8k extension), which blocks other writers. What normally happens is the these extend locks happen so fast that you hardly ever see them in the pg_locks table, except in the case where many concurrent connections are trying to do a lot of inserts into the same table. The following query will show if this is the case:

select * from pg_locks where granted = false and locktype = 'extend';

Sunday, December 24, 2017 7:09 PM
Multiple connections, but we are going to test it with only one. Would it make any difference?

Thanks 



Sunday, December 24, 2017 6:52 PM
Are the inserts being done through one connection or multiple connections concurrently?

Sent from my iPhone



Sunday, December 24, 2017 2:51 PM
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

One table only, called Messages:

Uuid
Country  (ISO)
Role (Text)
User id  (Text)
GroupId (integer)
Channel (text)
Title (Text)
Payload (JSON, up to 20kb)
Starts_in (UTC)
Expires_in (UTC)
Seen (boolean)
Deleted (boolean)
LastUpdate (UTC)
Created_by (UTC)
Created_in (UTC)

Indexes:

UUID (PK)
UserID + Country (main index)
LastUpdate 
GroupID 


We inserted 160MM rows, around 2KB each. No partitioning.

Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.

Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!

We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql. 

The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)

We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.

We haven't done any performance tuning in the DB. 

With all that said, the question is:

What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).

We are limited to use AWS RDS at the moment.

Cheers



Re: Batch insert heavily affecting query performance.

From
Danylo Hlynskyi
Date:
I had an opportunity to perform insertion of 700MM rows into Aurora Postgresql, for which performance insights are available. Turns out, that there are two stages of insert slowdown - first happens when max WAL buffers limit reached, second happens around 1 hour after.

The first stage cuts insert performance twice, and WALWrite lock is main bottleneck. I think WAL just can't sync changes log that fast, so it waits while older log entries are flushed. This creates both read and write IO.

The second stage is unique to Aurora/RDS and is characterized by excessive read data locks and total read IO. I couldn't figure out why does it read so much in a write only process, and AWS support didn't answer yet.

So, for you, try to throttle inserts so WAL is never overfilled and you don't experience WALWrite locks, and then increase wal buffers to max.

24 груд. 2017 р. 21:51 "Jean Baro" <jfbaro@gmail.com> пише:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

One table only, called Messages:

Uuid
Country  (ISO)
Role (Text)
User id  (Text)
GroupId (integer)
Channel (text)
Title (Text)
Payload (JSON, up to 20kb)
Starts_in (UTC)
Expires_in (UTC)
Seen (boolean)
Deleted (boolean)
LastUpdate (UTC)
Created_by (UTC)
Created_in (UTC)

Indexes:

UUID (PK)
UserID + Country (main index)
LastUpdate 
GroupID 


We inserted 160MM rows, around 2KB each. No partitioning.

Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.

Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!

We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql. 

The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)

We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.

We haven't done any performance tuning in the DB. 

With all that said, the question is:

What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).

We are limited to use AWS RDS at the moment.

Cheers



Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Thanks for the clarification guys.

It will be super useful. After trying this I'll post the results!

Merry Christmas!

Em 25 de dez de 2017 00:59, "Danylo Hlynskyi" <abcz2.uprola@gmail.com> escreveu:
I had an opportunity to perform insertion of 700MM rows into Aurora Postgresql, for which performance insights are available. Turns out, that there are two stages of insert slowdown - first happens when max WAL buffers limit reached, second happens around 1 hour after.

The first stage cuts insert performance twice, and WALWrite lock is main bottleneck. I think WAL just can't sync changes log that fast, so it waits while older log entries are flushed. This creates both read and write IO.

The second stage is unique to Aurora/RDS and is characterized by excessive read data locks and total read IO. I couldn't figure out why does it read so much in a write only process, and AWS support didn't answer yet.

So, for you, try to throttle inserts so WAL is never overfilled and you don't experience WALWrite locks, and then increase wal buffers to max.

24 груд. 2017 р. 21:51 "Jean Baro" <jfbaro@gmail.com> пише:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

One table only, called Messages:

Uuid
Country  (ISO)
Role (Text)
User id  (Text)
GroupId (integer)
Channel (text)
Title (Text)
Payload (JSON, up to 20kb)
Starts_in (UTC)
Expires_in (UTC)
Seen (boolean)
Deleted (boolean)
LastUpdate (UTC)
Created_by (UTC)
Created_in (UTC)

Indexes:

UUID (PK)
UserID + Country (main index)
LastUpdate 
GroupID 


We inserted 160MM rows, around 2KB each. No partitioning.

Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.

Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!

We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql. 

The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)

We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.

We haven't done any performance tuning in the DB. 

With all that said, the question is:

What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).

We are limited to use AWS RDS at the moment.

Cheers



RE: Batch insert heavily affecting query performance.

From
"Mike Sofen"
Date:

Hi Jean,

 

I’ve used Postgres on a regular EC2 instance (an m4.xlarge), storing complex genomic data, hundreds of millions of rows in a table and “normal” queries that used an index returned in 50-100ms, depending on the query…so this isn’t a Postgres issue per se. 

 

Your table and index structures look ok, although in PG, use the “text” datatype instead of varchar, it is the optimized type for storing string data of any size (even a 2 char country code).  Since you have 2 such columns that you’ve indexed and are querying for, there is a chance you’ll see an improvement. 

 

I have not yet used Aurora or RDS for any large data…it sure seems like the finger could be pointing there, but it isn’t clear what mechanism in Aurora could be creating the slowness.

 

Is there a possibility of you creating the same db on a normal EC2 instance with PG installed and running the same test?  There is nothing else obvious about your data/structure that could result in such terrible performance.

 

Mike Sofen

 

From: Jean Baro [mailto:jfbaro@gmail.com]
Sent: Wednesday, December 27, 2017 7:14 AM

Hello,

 

We are still seeing queries  (by UserID + UserCountry) taking over 2 seconds, even when there is no batch insert going on at the same time.

 

Each query returns from 100 to 200 messagens, which would be a 400kb pay load, which is super tiny.

 

I don't know what else I can do with the limitations (m4.large), 167MM rows, almost 500GB database and 29GB of indexes (all indexes).

 

I am probably to optimistic, but I was expecting queries (up to 50 queries per second) to return  (99th) under 500ms or even less, as the index is simple, there is no aggregation or join involves.

 

Any suggestion?

 

The table structure:

CREATE TABLE public.card

(

    id character(36) NOT NULL,

    user_id character varying(40) NOT NULL,

    user_country character(2) NOT NULL,

    user_channel character varying(40),

    user_role character varying(40),

    created_by_system_key character(36) NOT NULL,

    created_by_username character varying(40),

    created_at timestamp with time zone NOT NULL,

    last_modified_at timestamp with time zone NOT NULL,

    date_start timestamp with time zone NOT NULL,

    date_end timestamp with time zone NOT NULL,

    payload json NOT NULL,

    tags character varying(500),

    menu character varying(50),

    deleted boolean NOT NULL,

    campaign character varying(500) NOT NULL,

    correlation_id character varying(50),

    PRIMARY KEY (id)

);

 

CREATE INDEX idx_user_country

    ON public.card USING btree

    (user_id COLLATE pg_catalog."default", user_country COLLATE pg_catalog."default");

 

CREATE INDEX idx_last_modified_at

    ON public.card USING btree

    (last_modified_at ASC NULLS LAST);

 

CREATE INDEX idx_campaign

    ON public.card USING btree

    (campaign ASC NULLS LAST)

 

The EXPLAIN

 

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'

'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

 

 

 

Em 25 de dez de 2017 01:10, "Jean Baro" <jfbaro@gmail.com> escreveu:

Thanks for the clarification guys.

 

It will be super useful. After trying this I'll post the results!

 

Merry Christmas!

 

Re: Batch insert heavily affecting query performance.

From
Jeremy Finzel
Date:


The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.

Re: Batch insert heavily affecting query performance.

From
Jeff Janes
Date:
On Sun, Dec 24, 2017 at 11:51 AM, Jean Baro <jfbaro@gmail.com> wrote:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

Is that general purpose SSD, or provisioned IOPS SSD?  If provisioned, what is the level of provisioning?

Cheers,

Jeff

Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Thanks Rick,

We are now partitioning the DB (one table) into 100 sets of data.

As soon as we finish this new experiment we will provide a better EXPLAIN as you suggested. :)

Em 27 de dez de 2017 13:38, "Rick Otten" <rottenwindfish@gmail.com> escreveu:


On Wed, Dec 27, 2017 at 10:13 AM, Jean Baro <jfbaro@gmail.com> wrote:
Hello,

We are still seeing queries  (by UserID + UserCountry) taking over 2 seconds, even when there is no batch insert going on at the same time.

Each query returns from 100 to 200 messagens, which would be a 400kb pay load, which is super tiny.

I don't know what else I can do with the limitations (m4.large), 167MM rows, almost 500GB database and 29GB of indexes (all indexes).

I am probably to optimistic, but I was expecting queries (up to 50 queries per second) to return  (99th) under 500ms or even less, as the index is simple, there is no aggregation or join involves.

Any suggestion?


Although you aren't querying by it, if your id column is actually a UUID, as a best practice I strongly recommend switching the column type to uuid.  If you do query by the primary key, a uuid query will be much faster than a char or varchar column query.

You'll need to submit a more complete explain plan than what you have below.
  Try using:
       explain (analyze, costs, verbose, buffers) select ...



The table structure:
CREATE TABLE public.card
(
    id character(36) NOT NULL,
    user_id character varying(40) NOT NULL,
    user_country character(2) NOT NULL,
    user_channel character varying(40),
    user_role character varying(40),
    created_by_system_key character(36) NOT NULL,
    created_by_username character varying(40),
    created_at timestamp with time zone NOT NULL,
    last_modified_at timestamp with time zone NOT NULL,
    date_start timestamp with time zone NOT NULL,
    date_end timestamp with time zone NOT NULL,
    payload json NOT NULL,
    tags character varying(500),
    menu character varying(50),
    deleted boolean NOT NULL,
    campaign character varying(500) NOT NULL,
    correlation_id character varying(50),
    PRIMARY KEY (id)
);

CREATE INDEX idx_user_country
    ON public.card USING btree
    (user_id COLLATE pg_catalog."default", user_country COLLATE pg_catalog."default");
 
CREATE INDEX idx_last_modified_at
    ON public.card USING btree
    (last_modified_at ASC NULLS LAST);
 
CREATE INDEX idx_campaign
    ON public.card USING btree
    (campaign ASC NULLS LAST)

The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'



Em 25 de dez de 2017 01:10, "Jean Baro" <jfbaro@gmail.com> escreveu:
Thanks for the clarification guys.

It will be super useful. After trying this I'll post the results!

Merry Christmas!

Em 25 de dez de 2017 00:59, "Danylo Hlynskyi" <abcz2.uprola@gmail.com> escreveu:
I had an opportunity to perform insertion of 700MM rows into Aurora Postgresql, for which performance insights are available. Turns out, that there are two stages of insert slowdown - first happens when max WAL buffers limit reached, second happens around 1 hour after.

The first stage cuts insert performance twice, and WALWrite lock is main bottleneck. I think WAL just can't sync changes log that fast, so it waits while older log entries are flushed. This creates both read and write IO.

The second stage is unique to Aurora/RDS and is characterized by excessive read data locks and total read IO. I couldn't figure out why does it read so much in a write only process, and AWS support didn't answer yet.

So, for you, try to throttle inserts so WAL is never overfilled and you don't experience WALWrite locks, and then increase wal buffers to max.

24 груд. 2017 р. 21:51 "Jean Baro" <jfbaro@gmail.com> пише:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

One table only, called Messages:

Uuid
Country  (ISO)
Role (Text)
User id  (Text)
GroupId (integer)
Channel (text)
Title (Text)
Payload (JSON, up to 20kb)
Starts_in (UTC)
Expires_in (UTC)
Seen (boolean)
Deleted (boolean)
LastUpdate (UTC)
Created_by (UTC)
Created_in (UTC)

Indexes:

UUID (PK)
UserID + Country (main index)
LastUpdate 
GroupID 


We inserted 160MM rows, around 2KB each. No partitioning.

Insert started at around  3.000 inserts per second, but (as expected) started to slow down as the number of rows increased.  In the end we got around 500 inserts per second.

Queries by Userd_ID + Country took less than 2 seconds, but while the batch insert was running the queries took over 20 seconds!!!

We had 20 Lambda getting messages from SQS and bulk inserting them into Postgresql. 

The insert performance is important, but we would slow it down if needed in order to ensure a more flat query performance. (Below 2 seconds). Each query (userId + country) returns around 100 diferent messages, which are filtered and order by the synchronous Lambda function. So we don't do any special filtering, sorting, ordering or full text search in Postgres. In some ways we use it more like a glorified file system. :)

We are going to limit the number of lambda workers to 1 or 2, and then run some queries concurrently to see if the query performance is not affect too much. We aim to get at least 50 queries per second (returning 100 messages each) under 2 seconds, even when there is millions of messages on SQS being inserted into PG.

We haven't done any performance tuning in the DB. 

With all that said, the question is:

What can be done to ensure good query performance (UserID+ country) even when the bulk insert is running (low priority).

We are limited to use AWS RDS at the moment.

Cheers





Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Thanks Jeremy,

We will provide a more complete EXPLAIN as other people have suggested. 

I am glad we might end up with a much better performance (currently each query takes around 2 seconds!).

Cheers


Em 27 de dez de 2017 14:02, "Jeremy Finzel" <finzelj@gmail.com> escreveu:


The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.

RE: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Thanks Mike, 

We are using the standard RDS instance m4.large, it's not Aurora, which is a much more powerful server  (according to AWS).

Yes, we could install it on EC2, but it would take some extra effort from our side, it can be an investment though in case it will help us finding the bottle neck, BUT after tuning the database it must run on RDS for production use. As the company I work for demands we run microseconds DB as a managed service (RDS in this case).

Mike, what can we expect to see if we run PG on EC2? More logging? More tuning options? Let me know what your intention is so that I can convince other people on the team. But keep in mind in the end that payload should run on RDS m4.large (500gb to 1TB of general purpose SSD).

Again, thanks a lot!

Em 27 de dez de 2017 13:59, "Mike Sofen" <msofen@runbox.com> escreveu:

Hi Jean,

 

I’ve used Postgres on a regular EC2 instance (an m4.xlarge), storing complex genomic data, hundreds of millions of rows in a table and “normal” queries that used an index returned in 50-100ms, depending on the query…so this isn’t a Postgres issue per se. 

 

Your table and index structures look ok, although in PG, use the “text” datatype instead of varchar, it is the optimized type for storing string data of any size (even a 2 char country code).  Since you have 2 such columns that you’ve indexed and are querying for, there is a chance you’ll see an improvement. 

 

I have not yet used Aurora or RDS for any large data…it sure seems like the finger could be pointing there, but it isn’t clear what mechanism in Aurora could be creating the slowness.

 

Is there a possibility of you creating the same db on a normal EC2 instance with PG installed and running the same test?  There is nothing else obvious about your data/structure that could result in such terrible performance.

 

Mike Sofen

 

From: Jean Baro [mailto:jfbaro@gmail.com]
Sent: Wednesday, December 27, 2017 7:14 AM

Hello,

 

We are still seeing queries  (by UserID + UserCountry) taking over 2 seconds, even when there is no batch insert going on at the same time.

 

Each query returns from 100 to 200 messagens, which would be a 400kb pay load, which is super tiny.

 

I don't know what else I can do with the limitations (m4.large), 167MM rows, almost 500GB database and 29GB of indexes (all indexes).

 

I am probably to optimistic, but I was expecting queries (up to 50 queries per second) to return  (99th) under 500ms or even less, as the index is simple, there is no aggregation or join involves.

 

Any suggestion?

 

The table structure:

CREATE TABLE public.card

(

    id character(36) NOT NULL,

    user_id character varying(40) NOT NULL,

    user_country character(2) NOT NULL,

    user_channel character varying(40),

    user_role character varying(40),

    created_by_system_key character(36) NOT NULL,

    created_by_username character varying(40),

    created_at timestamp with time zone NOT NULL,

    last_modified_at timestamp with time zone NOT NULL,

    date_start timestamp with time zone NOT NULL,

    date_end timestamp with time zone NOT NULL,

    payload json NOT NULL,

    tags character varying(500),

    menu character varying(50),

    deleted boolean NOT NULL,

    campaign character varying(500) NOT NULL,

    correlation_id character varying(50),

    PRIMARY KEY (id)

);

 

CREATE INDEX idx_user_country

    ON public.card USING btree

    (user_id COLLATE pg_catalog."default", user_country COLLATE pg_catalog."default");

 

CREATE INDEX idx_last_modified_at

    ON public.card USING btree

    (last_modified_at ASC NULLS LAST);

 

CREATE INDEX idx_campaign

    ON public.card USING btree

    (campaign ASC NULLS LAST)

 

The EXPLAIN

 

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'

'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

 

 

 

Em 25 de dez de 2017 01:10, "Jean Baro" <jfbaro@gmail.com> escreveu:

Thanks for the clarification guys.

 

It will be super useful. After trying this I'll post the results!

 

Merry Christmas!

 


Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
General purpose, 500GB but we are planing to increase it to 1TB before going into production.

500GB 1.500 iops  (some burst of 3.000 iops)

1TB 3.000 iops

Em 27 de dez de 2017 14:23, "Jeff Janes" <jeff.janes@gmail.com> escreveu:
On Sun, Dec 24, 2017 at 11:51 AM, Jean Baro <jfbaro@gmail.com> wrote:
Hi there,

We are testing a new application to try to find performance issues.

AWS RDS m4.large 500GB storage (SSD)

Is that general purpose SSD, or provisioned IOPS SSD?  If provisioned, what is the level of provisioning?

Cheers,

Jeff

Re: Batch insert heavily affecting query performance.

From
Jean Baro
Date:
Sorry guys,

The performance problem is not caused by PG. 

'Index Scan using idx_user_country on public.old_card  (cost=0.57..1854.66 rows=460 width=922) (actual time=3.442..76.606 rows=200 loops=1)'
'  Output: id, user_id, user_country, user_channel, user_role, created_by_system_key, created_by_username, created_at, last_modified_at, date_start, date_end, payload, tags, menu, deleted, campaign, correlation_id'
'  Index Cond: (((old_card.user_id)::text = '1234'::text) AND (old_card.user_country = 'BR'::bpchar))'
'  Buffers: shared hit=11 read=138 written=35'
'Planning time: 7.748 ms'
'Execution time: 76.755 ms'

77ms on an 8GB database with 167MM rows and almost 500GB in size is amazing!!

Now we are investigating other bottlenecks, is it the creation of a new connection to PG  (no connection poller at the moment, like PGBouncer), is it the Lambda start up time? Is it the network performance  between PG and Lambda?

I am sorry for wasting your time guys, it helped us to find the problem though, even if it wasn't a PG problem. 

BTW, what a performance! I am impressed. 

Thanks PG community! 

Em 27 de dez de 2017 14:34, "Jean Baro" <jfbaro@gmail.com> escreveu:
Thanks Jeremy,

We will provide a more complete EXPLAIN as other people have suggested. 

I am glad we might end up with a much better performance (currently each query takes around 2 seconds!).

Cheers


Em 27 de dez de 2017 14:02, "Jeremy Finzel" <finzelj@gmail.com> escreveu:


The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.

Re: Batch insert heavily affecting query performance.

From
Alvaro Hernandez
Date:


On 27/12/17 18:02, Jean Baro wrote:
Sorry guys,

The performance problem is not caused by PG. 

'Index Scan using idx_user_country on public.old_card  (cost=0.57..1854.66 rows=460 width=922) (actual time=3.442..76.606 rows=200 loops=1)'
'  Output: id, user_id, user_country, user_channel, user_role, created_by_system_key, created_by_username, created_at, last_modified_at, date_start, date_end, payload, tags, menu, deleted, campaign, correlation_id'
'  Index Cond: (((old_card.user_id)::text = '1234'::text) AND (old_card.user_country = 'BR'::bpchar))'
'  Buffers: shared hit=11 read=138 written=35'
'Planning time: 7.748 ms'
'Execution time: 76.755 ms'

77ms on an 8GB database with 167MM rows and almost 500GB in size is amazing!!


    gp2 disks are of *variable* performance. Once you exhaust the I/O credits, you are capped to a baseline IOPS that are proportional to the size. I guess you would experience low performance in this scenario since your disk is not big. And actually performance numbers with gp2 disks are unreliable as you don't know in which credit status you are.

    Benchmark with provisioned iops to get a right picture of the desired performance.


    Cheers,

    Álvaro


-- 

Alvaro Hernandez


-----------
OnGres





Now we are investigating other bottlenecks, is it the creation of a new connection to PG  (no connection poller at the moment, like PGBouncer), is it the Lambda start up time? Is it the network performance  between PG and Lambda?

I am sorry for wasting your time guys, it helped us to find the problem though, even if it wasn't a PG problem. 

BTW, what a performance! I am impressed. 

Thanks PG community! 

Em 27 de dez de 2017 14:34, "Jean Baro" <jfbaro@gmail.com> escreveu:
Thanks Jeremy,

We will provide a more complete EXPLAIN as other people have suggested. 

I am glad we might end up with a much better performance (currently each query takes around 2 seconds!).

Cheers


Em 27 de dez de 2017 14:02, "Jeremy Finzel" <finzelj@gmail.com> escreveu:


The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.


RE: Batch insert heavily affecting query performance.

From
"Mike Sofen"
Date:

In my experience, that 77ms will stay quite constant even if your db grew to > 1TB.  Postgres IS amazing.  BTW, for a db, you should always have provisioned IOPS or else your performance can vary wildly, since the SSDs are shared.

 

Re Lambda:  another team is working on a new web app using Lambda calls and they were also experiencing horrific performance, just like yours (2 seconds per call).  They discovered it was the Lambda connection/spin-up time causing the problem.  They solved it by keeping several Lambda’s “hot”, for an instant connection…solved the problem, the last I heard.  Google for that topic, you’ll find solutions.

 

Mike

 

From: Jean Baro [mailto:jfbaro@gmail.com]
Sent: Wednesday, December 27, 2017 9:03 AM

Sorry guys,

 

The performance problem is not caused by PG. 

 

'Index Scan using idx_user_country on public.old_card  (cost=0.57..1854.66 rows=460 width=922) (actual time=3.442..76.606 rows=200 loops=1)'

'  Output: id, user_id, user_country, user_channel, user_role, created_by_system_key, created_by_username, created_at, last_modified_at, date_start, date_end, payload, tags, menu, deleted, campaign, correlation_id'

'  Index Cond: (((old_card.user_id)::text = '1234'::text) AND (old_card.user_country = 'BR'::bpchar))'

'  Buffers: shared hit=11 read=138 written=35'

'Planning time: 7.748 ms'

'Execution time: 76.755 ms'

 

77ms on an 8GB database with 167MM rows and almost 500GB in size is amazing!!

 

Now we are investigating other bottlenecks, is it the creation of a new connection to PG  (no connection poller at the moment, like PGBouncer), is it the Lambda start up time? Is it the network performance  between PG and Lambda?

 

I am sorry for wasting your time guys, it helped us to find the problem though, even if it wasn't a PG problem. 

 

BTW, what a performance! I am impressed. 

 

Thanks PG community! 

 

Em 27 de dez de 2017 14:34, "Jean Baro" <jfbaro@gmail.com> escreveu:

Thanks Jeremy,

 

We will provide a more complete EXPLAIN as other people have suggested. 

 

I am glad we might end up with a much better performance (currently each query takes around 2 seconds!).

 

Cheers

 

 

Em 27 de dez de 2017 14:02, "Jeremy Finzel" <finzelj@gmail.com> escreveu:

 

 

The EXPLAIN

 

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'

'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

 

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

 

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.

 

Re: Batch insert heavily affecting query performance.

From
David Miller
Date:
Jean,

It is very likely you are running out of IOPS with that size of server. We have several Postgres databases running at AWS. We consistently run out of IOPS on our development servers due to the types queries and sizing of our development databases. I would check the AWS monitoring graphs to determine the cause. We typically see low CPU and high IOPS just prior to our degraded performance. Our production environment runs provisioned IOPS to avoid this very issue.

Regards,
 
David 



From: Jean Baro <jfbaro@gmail.com>
To: Jeremy Finzel <finzelj@gmail.com>
Cc: Danylo Hlynskyi <abcz2.uprola@gmail.com>; pgsql-performance@postgresql.org
Sent: Wednesday, December 27, 2017 11:03 AM
Subject: Re: Batch insert heavily affecting query performance.

Sorry guys,

The performance problem is not caused by PG. 

'Index Scan using idx_user_country on public.old_card  (cost=0.57..1854.66 rows=460 width=922) (actual time=3.442..76.606 rows=200 loops=1)'
'  Output: id, user_id, user_country, user_channel, user_role, created_by_system_key, created_by_username, created_at, last_modified_at, date_start, date_end, payload, tags, menu, deleted, campaign, correlation_id'
'  Index Cond: (((old_card.user_id)::text = '1234'::text) AND (old_card.user_country = 'BR'::bpchar))'
'  Buffers: shared hit=11 read=138 written=35'
'Planning time: 7.748 ms'
'Execution time: 76.755 ms'

77ms on an 8GB database with 167MM rows and almost 500GB in size is amazing!!

Now we are investigating other bottlenecks, is it the creation of a new connection to PG  (no connection poller at the moment, like PGBouncer), is it the Lambda start up time? Is it the network performance  between PG and Lambda?

I am sorry for wasting your time guys, it helped us to find the problem though, even if it wasn't a PG problem. 

BTW, what a performance! I am impressed. 

Thanks PG community! 

Em 27 de dez de 2017 14:34, "Jean Baro" <jfbaro@gmail.com> escreveu:
Thanks Jeremy,

We will provide a more complete EXPLAIN as other people have suggested. 

I am glad we might end up with a much better performance (currently each query takes around 2 seconds!).

Cheers


Em 27 de dez de 2017 14:02, "Jeremy Finzel" <finzelj@gmail.com> escreveu:


The EXPLAIN

'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460 width=922)'
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country = 'BR'::bpchar))'

Show 3 runs of the full explain analyze plan on given condition so that we can also see cold vs warm cache performance.

There is definitely something wrong as there is no way a query like that should take 500ms.  Your instinct is correct there.



Re: Batch insert heavily affecting query performance.

From
Claudio Freire
Date:


On Wed, Dec 27, 2017 at 2:10 PM, Mike Sofen <msofen@runbox.com> wrote:

In my experience, that 77ms will stay quite constant even if your db grew to > 1TB.  Postgres IS amazing.  BTW, for a db, you should always have provisioned IOPS or else your performance can vary wildly, since the SSDs are shared.

 

Re Lambda:  another team is working on a new web app using Lambda calls and they were also experiencing horrific performance, just like yours (2 seconds per call).  They discovered it was the Lambda connection/spin-up time causing the problem.  They solved it by keeping several Lambda’s “hot”, for an instant connection…solved the problem, the last I heard.  Google for that topic, you’ll find solutions.


You should try to implement an internal connection pool in your lambda.

Lambda functions are reused. You have no guarantees as to how long these processes will live, but they will live for more than one request. So if you keep a persistent connection in your lambda code, the first invocation may be slow, but further invocations will be fast. Lambda will try to batch several calls at once. In fact, you can usually configure batching in the event source to try to maximize this effect.

In my experience, your lambda will be most probably network-bound. Increase the lambda's memory allocation, to get a bigger chunk of the available network bandwidth (why they decided to call that "memory" nobody will ever be able to tell).