Home > mailing lists

Re: How different is AWS-RDS postgres? - Mailing list pgsql-general

From	Rob Sargent
Subject	Re: How different is AWS-RDS postgres?
Date	May 27, 2021 22:23:49
Msg-id	dbd962e7-5dd4-eeab-aca0-3a0fd39ed36a@gmail.com Whole thread Raw
In response to	Re: How different is AWS-RDS postgres? (Ron <ronljohnsonjr@gmail.com>)
Responses	Re: How different is AWS-RDS postgres?
List	pgsql-general

Tree view

On 5/27/21 4:10 PM, Ron wrote:

On 5/27/21 4:58 PM, Rob Sargent wrote:
On 5/27/21 3:08 PM, Sam Gendler wrote:
The same JDBC connection that is resulting in lost data? Sounds to me like you aren't connecting to the DB you think you are connecting to.

I almost wish that were true.

However, looking at AWS "Performance Insights" is see the sql statements generate by my app begin executed on the server. Not coincidentally this is from the "Top SQL (10)/Load by waits" view. Now that view does NOT, in point of fact, name the database in which the sql is running, but the rest of my environment pretty much rules out silently switching tracks on myself. I have to read from the correct database, using a UUID, to get data to analyze, then save the results of the analysis back. I'm using my wonderful webapp to run this and I've successfully analyzed and stored result for small starting data sets.

I just notice the "Top database" tab on that screen: there is only one and it's the correct one.

I've reproduced the behaviour. I'm pretty convinced it a size thing, but which part of the system is being max'd out is not clear. Not CPU, but that's the only resource the "Performance Insight" mentions (suggesting this UI wasn't designed by a database person).

The loss of the staging table is most spectacular. It filled from a file which has 7.5M records. It's clear in the tomcat logs that is was created and written to, one record read. Then the "top sql" happens:
insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= ? and s.probandset_id < ?
the "bulk" table is the staging table. The params are filled in with a quasi-uuid which grab roughly 1/16th of the data. In the stack trace on my tomcat server I get
Caused by: org.jooq.exception.DataAccessException: SQL [insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= '30000000-0000-0000-0000-00000000\
0000' and s.probandset_id < '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.
So this would have been the fourth such insert statement read from that staging table.

That table is not deleted by my code. It is renamed after the last insert into segment, by appending "_done" to the name. But that table, by either name, is nowhere to be seen on the server.

For those scoring at home, the trouble in the tomcat log start with
0:02:11.452 [https-jsse-nio-10.0.2.28-15002-exec-7] INFO edu.utah.camplab.jx.PayloadFromMux -

Is there any way to replicate this in psql?

I have the json file which feeds the staging table and I have code which is designed to load such files. I suppose an sql file with those 16 insert statements would approximate the the apps call. Let me give that a shot.

--
Angular momentum makes the world go 'round.

pgsql-general by date:

From: Ron
Date: 27 May 2021, 22:10:15
Subject: Re: How different is AWS-RDS postgres?

From: Sam Gendler
Date: 27 May 2021, 22:25:43
Subject: Re: How different is AWS-RDS postgres?

Re: How different is AWS-RDS postgres? - Mailing list pgsql-general

Previous

Next