Re: left join with smaller table or index on (XXX is not null) to avoid upsert - Mailing list pgsql-general
From | Martin Gainty |
---|---|
Subject | Re: left join with smaller table or index on (XXX is not null) to avoid upsert |
Date | |
Msg-id | BLU142-W238FAB627BAFA95B733BC8AED30@phx.gbl Whole thread Raw |
In response to | left join with smaller table or index on (XXX is not null) to avoid upsert (Ivan Sergio Borgonovo <mail@webthatworks.it>) |
List | pgsql-general |
Sergio-
1)Index all joined columns
2)Put your NOT NULL test up front e.g
where
pd.DiscountPrice is not null
AND
left join ProductDiscount pd on p.ProductID=pd.ProductID
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Sun, 18 Jan 2009 22:12:07 +0100
> From: mail@webthatworks.it
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] left join with smaller table or index on (XXX is not null) to avoid upsert
>
> I've to apply a discounts to products.
>
> For each promotion I've a query that select a list of products and
> should apply a discount.
>
> Queries may have intersections, in these intersections the highest
> discount should be applied.
>
> Since queries may be slow I decided to proxy the discount this way:
>
> create table Product(
> ProductID int primary key,
> ListPrice numeric
> );
>
> create table ProductPrice(
> ProductID int references Products (ProcuctID),
> DiscountedPrice numeric
> );
>
> Case A)
> If I want the ProductPrice to contain just products with a
> discount I'll have to update, then see if the update was successful
> otherwise insert.
> I expect that the products involved may be around 10% of the overall
> products.
>
> To retrieve a list of products I could:
> select [some columns from Product],
> least(coalesce(p.ListPrice,0),
> coalesce(pp.DiscountedPrice,0)) as Price
> from Product
> left join ProductPrice pp on p.ProductID=pp.ProductID
> where [some conditions on Product table];
>
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
>
> Case B)
> Or ProductPrice may just contain ALL the products and everything
> will be managed with updates.
>
> select [some columns from Product],
> least(coalesce(p.ListPrice,0),
> coalesce(pd.DiscountedPrice,0))
> from Product
> left join ProductDiscount pd on p.ProductID=pd.ProductID and
> pd.DiscountPrice is not null
> where [some conditions on Product table];
>
> create index ProductDiscount_DiscountedPrice_idx on DiscountPrice
> (DiscountPrice is not null);
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
>
> I'm expecting that:
> - ProductPrice will contain roughly but less than 10% of the
> catalogue.
> - I may have from 0 to 60% overlap on queries generating the list of
> products to be discounted.
> - The overall number of promotions/query running concurrently may be
> in the range of 20-100
> - promotions will be created/deletes at a rate of 5-10 a day, so
> that discount will have to be recalculated
> - searches in the catalogue have to be fast
>
> Since I haven't been able to find a quick way to build up a
> hierarchy of promotions to apply/re-apply discounts when promotion
> are added/deleted, creating/deleting promotions looks critical as
> well.
> The best thing I was able to plan was just to reapply all promotions
> if one is deleted.
>
> So it looks to me that approach B is going to make updating of
> discounts easier, but I was wondering if it makes retrieval of
> Products and Prices slower.
>
> Having a larger table that is being updated at a rate of 5% to 10% a
> day may make it a bit "fragmented".
>
> Advices on the overall problem of discount overlap management will
> be appreciated too.
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
>
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live™: Keep your life in sync. Check it out.
1)Index all joined columns
2)Put your NOT NULL test up front e.g
where
pd.DiscountPrice is not null
AND
left join ProductDiscount pd on p.ProductID=pd.ProductID
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Sun, 18 Jan 2009 22:12:07 +0100
> From: mail@webthatworks.it
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] left join with smaller table or index on (XXX is not null) to avoid upsert
>
> I've to apply a discounts to products.
>
> For each promotion I've a query that select a list of products and
> should apply a discount.
>
> Queries may have intersections, in these intersections the highest
> discount should be applied.
>
> Since queries may be slow I decided to proxy the discount this way:
>
> create table Product(
> ProductID int primary key,
> ListPrice numeric
> );
>
> create table ProductPrice(
> ProductID int references Products (ProcuctID),
> DiscountedPrice numeric
> );
>
> Case A)
> If I want the ProductPrice to contain just products with a
> discount I'll have to update, then see if the update was successful
> otherwise insert.
> I expect that the products involved may be around 10% of the overall
> products.
>
> To retrieve a list of products I could:
> select [some columns from Product],
> least(coalesce(p.ListPrice,0),
> coalesce(pp.DiscountedPrice,0)) as Price
> from Product
> left join ProductPrice pp on p.ProductID=pp.ProductID
> where [some conditions on Product table];
>
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
>
> Case B)
> Or ProductPrice may just contain ALL the products and everything
> will be managed with updates.
>
> select [some columns from Product],
> least(coalesce(p.ListPrice,0),
> coalesce(pd.DiscountedPrice,0))
> from Product
> left join ProductDiscount pd on p.ProductID=pd.ProductID and
> pd.DiscountPrice is not null
> where [some conditions on Product table];
>
> create index ProductDiscount_DiscountedPrice_idx on DiscountPrice
> (DiscountPrice is not null);
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
>
> I'm expecting that:
> - ProductPrice will contain roughly but less than 10% of the
> catalogue.
> - I may have from 0 to 60% overlap on queries generating the list of
> products to be discounted.
> - The overall number of promotions/query running concurrently may be
> in the range of 20-100
> - promotions will be created/deletes at a rate of 5-10 a day, so
> that discount will have to be recalculated
> - searches in the catalogue have to be fast
>
> Since I haven't been able to find a quick way to build up a
> hierarchy of promotions to apply/re-apply discounts when promotion
> are added/deleted, creating/deleting promotions looks critical as
> well.
> The best thing I was able to plan was just to reapply all promotions
> if one is deleted.
>
> So it looks to me that approach B is going to make updating of
> discounts easier, but I was wondering if it makes retrieval of
> Products and Prices slower.
>
> Having a larger table that is being updated at a rate of 5% to 10% a
> day may make it a bit "fragmented".
>
> Advices on the overall problem of discount overlap management will
> be appreciated too.
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
>
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live™: Keep your life in sync. Check it out.
pgsql-general by date: