Thread: Using expression names in subsequent calculation
Hello, I'm migrating an application from Access, and having great fun so far, but I'm running into problems with some queries. What I'm trying to do boils down to this: SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val, val * tax_rate AS tax_amount, val + tax_amount as total but get a "column qty does not exist" error. In a real query - qty, price and tax_rate would come from actual columns, but then the error is on the "val" column. I'd like to use the name of the expression rather than repeat the calculation. Near the bottom of the SELECT docs, there's a note about namespaces that explains why this won't work, but I'm wondering how to go about this type of query. So far I've come up with: WITH order_details AS (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate), order_values AS (SELECT qty * price AS val from order_details), order_taxes AS (SELECT val * tax_rate AS tax_amount FROM order_details, order_values) SELECT qty, price, tax_rate, val, tax_amount, val + tax_amount as total FROM order_details, order_values, order_taxes However, that looks at lot more complicated than simply repeating the calculations ;) What's the best way go about calculations like this that build upon previous results? Thanks, Tony
Tony Theodore <tony.theodore@gmail.com> writes: > I'm migrating an application from Access, and having great fun so far, > but I'm running into problems with some queries. What I'm trying to do > boils down to this: > SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val, > val * tax_rate AS tax_amount, val + tax_amount as total > but get a "column qty does not exist" error. Yeah, this is entirely contrary to the SQL standard. In the standard, the columns of the result are notionally computed in parallel, so there is no way for one to refer to another. MS hasn't done the world any favors by inventing this incompatible extension. > What's the best way go about calculations like this that build upon > previous results? What you need to do is use a sub-select to create columns that can be referenced at the next level. The particular example you're showing here requires multiple levels of sub-select because you're chaining the operations. Something like SELECT *, val + tax_amount as total FROM (SELECT *, val * tax_rate AS tax_amount FROM (SELECT *, qty * price AS val FROM (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate) as ss1 ) as ss2 ) as ss3; Keep in mind that the Postgres planner will typically flatten sub-selects used this way into a single level of plan, with the same result as if you hadn't chained the calculations but just expanded all the expressions into their primitive constitutents by hand. You can see that in this slightly less silly version of your example: regression=# create table sales (qty int, price numeric, tax_rate numeric); CREATE TABLE regression=# explain verbose SELECT *, val + tax_amount as total FROM (SELECT *, val * tax_rate AS tax_amount FROM (SELECT *, qty * price AS val FROM sales ) as ss2 ) as ss3; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on public.sales (cost=0.00..41.12 rows=830 width=68) Output: sales.qty, sales.price, sales.tax_rate, ((sales.qty)::numeric * sales.price), (((sales.qty)::numeric * sales.price)* sales.tax_rate), (((sales.qty)::numeric * sales.price) + (((sales.qty)::numeric * sales.price) * sales.tax_rate)) (2 rows) In this particular example that's probably just fine, because the individual calculations are cheap enough that repeating them probably beats incurring the overhead of multiple run-time plan levels. But if you were trying to use a structure like this to avoid multiple evaluations of a very expensive function, you'd want to stick an "OFFSET 0" into the sub-select level that had the expensive function, so as to create an optimization fence. regards, tom lane
On 12 September 2011 02:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Tony Theodore <tony.theodore@gmail.com> writes: >> I'm migrating an application from Access, and having great fun so far, >> but I'm running into problems with some queries. What I'm trying to do >> boils down to this: > >> SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val, >> val * tax_rate AS tax_amount, val + tax_amount as total > >> but get a "column qty does not exist" error. > > Yeah, this is entirely contrary to the SQL standard. In the standard, > the columns of the result are notionally computed in parallel, so there > is no way for one to refer to another. MS hasn't done the world any > favors by inventing this incompatible extension. > >> What's the best way go about calculations like this that build upon >> previous results? > > What you need to do is use a sub-select to create columns that can be > referenced at the next level. The particular example you're showing > here requires multiple levels of sub-select because you're chaining > the operations. Something like > > SELECT *, val + tax_amount as total > FROM > (SELECT *, val * tax_rate AS tax_amount > FROM > (SELECT *, qty * price AS val > FROM > (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate) as ss1 > ) as ss2 > ) as ss3; Excellent, thanks for that, I didn't think that was possible and the WITH statement was the only way to do such things. > Keep in mind that the Postgres planner will typically flatten > sub-selects used this way into a single level of plan, with the same > result as if you hadn't chained the calculations but just expanded all > the expressions into their primitive constitutents by hand. You can see > that in this slightly less silly version of your example: > > regression=# create table sales (qty int, price numeric, tax_rate numeric); > CREATE TABLE > regression=# explain verbose SELECT *, val + tax_amount as total > FROM > (SELECT *, val * tax_rate AS tax_amount > FROM > (SELECT *, qty * price AS val > FROM > sales > ) as ss2 > ) as ss3; > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Seq Scan on public.sales (cost=0.00..41.12 rows=830 width=68) > Output: sales.qty, sales.price, sales.tax_rate, ((sales.qty)::numeric * sales.price), (((sales.qty)::numeric * sales.price)* sales.tax_rate), (((sales.qty)::numeric * sales.price) + (((sales.qty)::numeric * sales.price) * sales.tax_rate)) > (2 rows) > > In this particular example that's probably just fine, because the > individual calculations are cheap enough that repeating them probably > beats incurring the overhead of multiple run-time plan levels. > But if you were trying to use a structure like this to avoid multiple > evaluations of a very expensive function, you'd want to stick an > "OFFSET 0" into the sub-select level that had the expensive function, > so as to create an optimization fence. > > regards, tom lane > Okay, I'll do some reading up on that. Thanks again, Tony