Thread: Initial insert
Laurent CATHALA
Architecte
lca@sylob.com
7 rue Marcel Dassault - Z.A. La Mouline - 81990 Cambon d'Albi - FRANCE
Tel : 05 63 53 08 18 - Fax : 05 63 53 07 42 - www.sylob.com
Support : 05 63 53 78 35 - support@sylob.com
Entreprise certifiée ISO 9001 version 2008 par Bureau Veritas.
Hi,

I'm trying to create data during an initial import and I'm running into a
performance issue. I have 2 tables; my process creates a record in each table
and then executes a sum with a join over these 2 tables (there are other
queries as well, but those are very fast). Both tables are empty before the
import starts.

My sum query is:

    select sum(quantitest0_.quantite_valeur) as col_0_0_
      from dm5_quantitestock quantitest0_,
           dm5_caracteristiquearticlestock caracteris1_
     where quantitest0_.id_caracteristiquearticlestock = caracteris1_.id
       and caracteris1_.id_article = '4028804c4a311178014a346546967c59'

I use parameterized queries.

My process creates only 6000 records in each table, yet over the course of the
import this sum query takes longer and longer.

The auto-explain plan shows a seq scan:

----------
Query Text: select sum(quantitest0_.quantite_valeur) as col_0_0_ from dm5_quantitestock quantitest0_, dm5_caracteristiquearticlestock caracteris1_ where quantitest0_.id_caracteristiquearticlestock=caracteris1_.id and caracteris1_.id_article=$1
Aggregate  (cost=2.04..2.05 rows=1 width=26) (actual time=862.621..862.621 rows=1 loops=1)
  Output: sum(quantitest0_.quantite_valeur)
  ->  Nested Loop  (cost=0.00..2.04 rows=1 width=26) (actual time=862.618..862.618 rows=0 loops=1)
        Output: quantitest0_.quantite_valeur
        Join Filter: ((quantitest0_.id_caracteristiquearticlestock)::text = (caracteris1_.id)::text)
        Rows Removed by Join Filter: 1869
        ->  Seq Scan on public.dm5_quantitestock quantitest0_  (cost=0.00..1.01 rows=1 width=164) (actual time=0.004..0.408 rows=1869 loops=1)
              Output: quantitest0_.id, quantitest0_.datefinvalidite, quantitest0_.quantite_valeur, quantitest0_.id_caracteristiquearticlestock, quantitest0_.id_caracteristiquelieustock, quantitest0_.datecreationsysteme, quantitest0_.datemodificationsysteme, quantitest0_.id_creeparsysteme, quantitest0_.id_modifieparsysteme
        ->  Seq Scan on public.dm5_caracteristiquearticlestock caracteris1_  (cost=0.00..1.01 rows=1 width=42) (actual time=0.456..0.456 rows=1 loops=1869)
              Output: caracteris1_.id, caracteris1_.datefinvalidite, caracteris1_.id_lot, caracteris1_.id_article, caracteris1_.id_numeroserie, caracteris1_.datecreationsysteme, caracteris1_.datemodificationsysteme, caracteris1_.id_modifieparsysteme, caracteris1_.id_creeparsysteme
              Filter: ((caracteris1_.id_article)::text = ($1)::text)
              Rows Removed by Filter: 1869
-----------

If I launch an ANALYZE during the process, the plan uses the indexes, but the
time remains the same:

---------
Query Text: select sum(quantitest0_.quantite_valeur) as col_0_0_ from dm5_quantitestock quantitest0_, dm5_caracteristiquearticlestock caracteris1_ where quantitest0_.id_caracteristiquearticlestock=caracteris1_.id and caracteris1_.id_article=$1
Aggregate  (cost=16.55..16.56 rows=1 width=26) (actual time=654.998..654.998 rows=1 loops=1)
  Output: sum(quantitest0_.quantite_valeur)
  ->  Nested Loop  (cost=0.00..16.55 rows=1 width=26) (actual time=654.994..654.994 rows=0 loops=1)
        Output: quantitest0_.quantite_valeur
        Join Filter: ((quantitest0_.id_caracteristiquearticlestock)::text = (caracteris1_.id)::text)
        Rows Removed by Join Filter: 1651
        ->  Index Scan using x_dm5_quantitestock_00 on public.dm5_quantitestock quantitest0_  (cost=0.00..8.27 rows=1 width=164) (actual time=0.011..0.579 rows=1651 loops=1)
              Output: quantitest0_.id, quantitest0_.datefinvalidite, quantitest0_.quantite_valeur, quantitest0_.id_caracteristiquearticlestock, quantitest0_.id_caracteristiquelieustock, quantitest0_.datecreationsysteme, quantitest0_.datemodificationsysteme, quantitest0_.id_creeparsysteme, quantitest0_.id_modifieparsysteme
        ->  Index Scan using dm5_caracteristiquearticlestock_pkey on public.dm5_caracteristiquearticlestock caracteris1_  (cost=0.00..8.27 rows=1 width=42) (actual time=0.395..0.395 rows=1 loops=1651)
              Output: caracteris1_.id, caracteris1_.datefinvalidite, caracteris1_.id_lot, caracteris1_.id_article, caracteris1_.id_numeroserie, caracteris1_.datecreationsysteme, caracteris1_.datemodificationsysteme, caracteris1_.id_modifieparsysteme, caracteris1_.id_creeparsysteme
              Filter: ((caracteris1_.id_article)::text = '4028804c4a311178014a346547307cce'::text)
              Rows Removed by Filter: 1651
----------

If I create the first 1000 records, commit, and end the transaction, the whole
import is very fast. But I can't change my process to split it into small
parts...

Any idea?

Thanks.

Laurent CATHALA
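As a reference point, both plans above probe caracteris1_ by its primary key and then discard almost every row with the id_article filter, which hints that no index covers that column. A sketch of indexes that would serve the filter and the join directly, assuming they don't already exist (the index names are illustrative, and the FK side may already be covered by x_dm5_quantitestock_00):

```sql
-- Illustrative names; check pg_indexes first to avoid duplicating
-- existing indexes such as x_dm5_quantitestock_00.
CREATE INDEX idx_caracteris_id_article
    ON dm5_caracteristiquearticlestock (id_article);

CREATE INDEX idx_quantitestock_fk
    ON dm5_quantitestock (id_caracteristiquearticlestock);
```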
Hi,

On 22.1.2015 17:46, Laurent Cathala wrote:
> The auto-explain plan shows a seq scan
> [... plan snipped ...]
>               Filter: ((caracteris1_.id_article)::text = ($1)::text)
>
> If I launch an ANALYZE during the process, the plan uses the indexes,
> but the time remains the same:
> [... plan snipped ...]
>               Filter: ((caracteris1_.id_article)::text =
>                 '4028804c4a311178014a346547307cce'::text)

Why is the first query using a parameter ($1) while the second one uses a
string literal? Have you executed them differently?

> If I create the first 1000 records, commit, and end the transaction, the
> whole import is very fast.

And what plans do the queries use then?

> But I can't change my process to split it into small parts...

We're talking about a 600-800 ms query - even if you cut it down to 1 ms,
I don't see how that would make a difference in a batch-style job.

If you're running many such queries (with different id_article values), you
might do something like this:

    select caracteris1_.id_article,
           sum(quantitest0_.quantite_valeur) as col_0_0_
      from dm5_quantitestock quantitest0_,
           dm5_caracteristiquearticlestock caracteris1_
     where quantitest0_.id_caracteristiquearticlestock = caracteris1_.id
     group by caracteris1_.id_article

and then query this (supposedly much smaller) table.

regards

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
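The suggestion above can be materialized once and then probed per article; a sketch, where the table name article_sums is illustrative and not from the thread:

```sql
-- Compute every per-article sum in a single pass over the two tables.
CREATE TEMP TABLE article_sums AS
SELECT caracteris1_.id_article,
       sum(quantitest0_.quantite_valeur) AS total
  FROM dm5_quantitestock quantitest0_
  JOIN dm5_caracteristiquearticlestock caracteris1_
    ON quantitest0_.id_caracteristiquearticlestock = caracteris1_.id
 GROUP BY caracteris1_.id_article;

-- Each subsequent lookup is then a single-row fetch instead of a join.
SELECT total
  FROM article_sums
 WHERE id_article = '4028804c4a311178014a346546967c59';
```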