Thread: Help with insert query

Help with insert query

From
Glenn Schultz
Date:
All,

The query below is designed to insert into a table.  This works when I have a single loan which I insert.  However, if remove the part of the where clause of a single loan the insert does not work.  The table fnmloan is a large table with 500mm + rows and the query runs for about 4 hours.  Any idea of how to get this to work?  I am a little stumped since the query works with one loan.

Glenn

SET max_parallel_workers_per_gather = 8;
SET random_page_cost = 1;

truncate fnmloan_balance;
insert into fnmloan_balance (
fctrdt, loanseqnum, secmnem, beginbal, scheduled,endbal,smm
)

select
fnmloan.fctrdt
,fnmloan.loanseqnum
,fnmloan.secmnem
--,fnmloan.orignoterate
--,fnmloan.loanage
--,fnmloan.origloanamt
,fnmloan.currrpb as beginbal
,round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) as scheduled
,coalesce(endbal.currrpb,0) as endbal
,abs(round(
  cast((fnmloan.currrpb - coalesce(endbal.currrpb,0) - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) )/(fnmloan.currrpb - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) ) as numeric)
  ,4)) as SMM

from
(
 select * from fnmloan 
 where 
 fctrdt < '03-01-2019'
 and 
 loanseqnum = '5991017042'
) as fnmloan


left outer join
(select
fctrdt - interval '1 month' as fctrdt
,loanseqnum
,orignoterate
,loanage
,origloanamt
,currrpb
from fnmloan
) as endbal

on fnmloan.loanseqnum = endbal.loanseqnum
and fnmloan.fctrdt = endbal.fctrdt  

Re: Help with insert query

From
Michel Pelletier
Date:
On Mon, Apr 1, 2019 at 10:37 AM Glenn Schultz <glenn@bondlab.io> wrote:
All,

The query below is designed to insert into a table.  This works when I have a single loan which I insert.  However, if remove the part of the where clause of a single loan the insert does not work.  The table fnmloan is a large table with 500mm + rows and the query runs for about 4 hours.  Any idea of how to get this to work?  I am a little stumped since the query works with one loan.


Inserting one row is fast, inserting 500 million rows is going to take quite a bit longer.  I suggest your break your query up into batches, and insert, say, 1 million rows at a time.  Also it might be a good idea to drop your indexes on the target table and re-create them after you do the bulk insert, and also do an 'ANALYZE' on the target table after you have inserted all the records.

-Michel

 
Glenn

SET max_parallel_workers_per_gather = 8;
SET random_page_cost = 1;

truncate fnmloan_balance;
insert into fnmloan_balance (
fctrdt, loanseqnum, secmnem, beginbal, scheduled,endbal,smm
)

select
fnmloan.fctrdt
,fnmloan.loanseqnum
,fnmloan.secmnem
--,fnmloan.orignoterate
--,fnmloan.loanage
--,fnmloan.origloanamt
,fnmloan.currrpb as beginbal
,round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) as scheduled
,coalesce(endbal.currrpb,0) as endbal
,abs(round(
  cast((fnmloan.currrpb - coalesce(endbal.currrpb,0) - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) )/(fnmloan.currrpb - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) ) as numeric)
  ,4)) as SMM

from
(
 select * from fnmloan 
 where 
 fctrdt < '03-01-2019'
 and 
 loanseqnum = '5991017042'
) as fnmloan


left outer join
(select
fctrdt - interval '1 month' as fctrdt
,loanseqnum
,orignoterate
,loanage
,origloanamt
,currrpb
from fnmloan
) as endbal

on fnmloan.loanseqnum = endbal.loanseqnum
and fnmloan.fctrdt = endbal.fctrdt  

Re: Help with insert query

From
Ron
Date:
On 4/1/19 12:37 PM, Glenn Schultz wrote:
> All,
>
> The query below is designed to insert into a table.  This works when I 
> have a single loan which I insert. However, if remove the part of the 
> where clause of a single loan the insert does not work.  The table fnmloan 
> is a large table with 500mm + rows and the query runs for about 4 hours. 
> Any idea of how to get this to work?  I am a little stumped since the 
> query works with one loan.

Following up to Michael's answer... 500MM rows in 4 hours is an insert rate 
of 34,722.222... records per second.

-- 
Angular momentum makes the world go 'round.



Re: Help with insert query

From
Adrian Klaver
Date:
On 4/1/19 10:37 AM, Glenn Schultz wrote:
> All,
> 
> The query below is designed to insert into a table.  This works when I 
> have a single loan which I insert.  However, if remove the part of the 
> where clause of a single loan the insert does not work.  The table 
> fnmloan is a large table with 500mm + rows and the query runs for about 
> 4 hours.  Any idea of how to get this to work?  I am a little stumped 
> since the query works with one loan.

If you pull the SELECT portion of the query out and run it without 
restricting the loanseqnum and use EXPLAIN ANALYZE how many rows do you 
get and what does the EXPLAIN show?

> 
> Glenn
> 
> SET max_parallel_workers_per_gather = 8;
> SET random_page_cost = 1;
> 
> truncate fnmloan_balance;
> insert into fnmloan_balance (
> fctrdt, loanseqnum, secmnem, beginbal, scheduled,endbal,smm
> )
> 
> select
> fnmloan.fctrdt
> ,fnmloan.loanseqnum
> ,fnmloan.secmnem
> --,fnmloan.orignoterate
> --,fnmloan.loanage
> --,fnmloan.origloanamt
> ,fnmloan.currrpb as beginbal
> ,round(nullif(schedprin(cast(fnmloan.orignoterate as numeric),
>    cast(fnmloan.remterm - 1 as numeric),
>    cast(fnmloan.currrpb as numeric)),4)) as scheduled
> ,coalesce(endbal.currrpb,0) as endbal
> ,abs(round(
>    cast((fnmloan.currrpb - coalesce(endbal.currrpb,0) - 
> round(nullif(schedprin(cast(fnmloan.orignoterate as numeric),
>    cast(fnmloan.remterm - 1 as numeric),
>    cast(fnmloan.currrpb as numeric)),4)) )/(fnmloan.currrpb - 
> round(nullif(schedprin(cast(fnmloan.orignoterate as numeric),
>    cast(fnmloan.remterm - 1 as numeric),
>    cast(fnmloan.currrpb as numeric)),4)) ) as numeric)
>    ,4)) as SMM
> 
> from
> (
>   select * from fnmloan
>   where
>   fctrdt < '03-01-2019'
>   and
>   loanseqnum = '5991017042'
> ) as fnmloan
> 
> 
> left outer join
> (select
> fctrdt - interval '1 month' as fctrdt
> ,loanseqnum
> ,orignoterate
> ,loanage
> ,origloanamt
> ,currrpb
> from fnmloan
> ) as endbal
> 
> on fnmloan.loanseqnum = endbal.loanseqnum
> and fnmloan.fctrdt = endbal.fctrdt
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Help with insert query

From
Michel Pelletier
Date:
Please reply-all to the list and not just me directly.

I didn't realize you meant the table was empty when you did the full insert.  As Adrian pointed out, run your select using explain, it will show you why you are producing no rows.  Looking at your query just superficially, the outer join looks suspicious, maybe using a subqery to get the ending balance is a better approach.




On Mon, Apr 1, 2019 at 11:02 AM Glenn Schultz <glenn@bondlab.io> wrote:
Hi Michael,

I will try that.  What I don’t understand is why, when using just one loan the insert is successful but when working with the whole table once the query is done there is nothing inserted into the table.

Best,
Glenn

Sent from my iPhone


On Apr 1, 2019, at 1:55 PM, Michel Pelletier <pelletier.michel@gmail.com> wrote:

On Mon, Apr 1, 2019 at 10:37 AM Glenn Schultz <glenn@bondlab.io> wrote:
All,

The query below is designed to insert into a table.  This works when I have a single loan which I insert.  However, if remove the part of the where clause of a single loan the insert does not work.  The table fnmloan is a large table with 500mm + rows and the query runs for about 4 hours.  Any idea of how to get this to work?  I am a little stumped since the query works with one loan.


Inserting one row is fast, inserting 500 million rows is going to take quite a bit longer.  I suggest your break your query up into batches, and insert, say, 1 million rows at a time.  Also it might be a good idea to drop your indexes on the target table and re-create them after you do the bulk insert, and also do an 'ANALYZE' on the target table after you have inserted all the records.

-Michel

 
Glenn

SET max_parallel_workers_per_gather = 8;
SET random_page_cost = 1;

truncate fnmloan_balance;
insert into fnmloan_balance (
fctrdt, loanseqnum, secmnem, beginbal, scheduled,endbal,smm
)

select
fnmloan.fctrdt
,fnmloan.loanseqnum
,fnmloan.secmnem
--,fnmloan.orignoterate
--,fnmloan.loanage
--,fnmloan.origloanamt
,fnmloan.currrpb as beginbal
,round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) as scheduled
,coalesce(endbal.currrpb,0) as endbal
,abs(round(
  cast((fnmloan.currrpb - coalesce(endbal.currrpb,0) - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) )/(fnmloan.currrpb - round(nullif(schedprin(cast(fnmloan.orignoterate as numeric), 
  cast(fnmloan.remterm - 1 as numeric), 
  cast(fnmloan.currrpb as numeric)),4)) ) as numeric)
  ,4)) as SMM

from
(
 select * from fnmloan 
 where 
 fctrdt < '03-01-2019'
 and 
 loanseqnum = '5991017042'
) as fnmloan


left outer join
(select
fctrdt - interval '1 month' as fctrdt
,loanseqnum
,orignoterate
,loanage
,origloanamt
,currrpb
from fnmloan
) as endbal

on fnmloan.loanseqnum = endbal.loanseqnum
and fnmloan.fctrdt = endbal.fctrdt