Thread: TPC-DS queries
I played with TPC-DS and found some of them can't be executed because of SQL errors and I am not sure why. For example with query 36: select sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin ,i_category ,i_class ,grouping(i_category)+grouping(i_class) as lochierarchy ,rank() over ( partition by grouping(i_category)+grouping(i_class), case when grouping(i_class) = 0 then i_category end order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as rank_within_parent from store_sales ,date_dim d1 ,item ,store where d1.d_year = 2000 and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and s_state in ('TN','TN','TN','TN', 'TN','TN','TN','TN') group by rollup(i_category,i_class) order by lochierarchy desc ,case when lochierarchy = 0 then i_category end -- line 25 is here. ,rank_within_parent limit 100; psql:query_0.sql:1935: ERROR: column "lochierarchy" does not exist LINE 25: ,case when lochierarchy = 0 then i_category end I have follwed the instruction here. https://ankane.org/tpc-ds PostgreSQL is master branch HEAD. For me, the SQL above looks to be valid. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Hi, I think that the sql is not valid. Based on the order by documentation, a column label cannot be used in an expression. from https://www.postgresql.org/docs/11/queries-order.html > Note that an output column name has to stand alone, that is, it cannot be used in an expression. Regards s. On 11.03.2019 06:30, Tatsuo Ishii wrote: > I played with TPC-DS and found some of them can't be executed because > of SQL errors and I am not sure why. > > For example with query 36: > > select > sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin > ,i_category > ,i_class > ,grouping(i_category)+grouping(i_class) as lochierarchy > ,rank() over ( > partition by grouping(i_category)+grouping(i_class), > case when grouping(i_class) = 0 then i_category end > order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as rank_within_parent > from > store_sales > ,date_dim d1 > ,item > ,store > where > d1.d_year = 2000 > and d1.d_date_sk = ss_sold_date_sk > and i_item_sk = ss_item_sk > and s_store_sk = ss_store_sk > and s_state in ('TN','TN','TN','TN', > 'TN','TN','TN','TN') > group by rollup(i_category,i_class) > order by > lochierarchy desc > ,case when lochierarchy = 0 then i_category end -- line 25 is here. > ,rank_within_parent > limit 100; > psql:query_0.sql:1935: ERROR: column "lochierarchy" does not exist > LINE 25: ,case when lochierarchy = 0 then i_category end > > I have follwed the instruction here. > https://ankane.org/tpc-ds > > PostgreSQL is master branch HEAD. For me, the SQL above looks to be > valid. > > Best regards, > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese:http://www.sraoss.co.jp >
> Hi, > > I think that the sql is not valid. Based on the order by > documentation, a column label cannot be used in an expression. > > from https://www.postgresql.org/docs/11/queries-order.html > > Note that an output column name has to stand alone, that is, it > cannot be used in an expression. Thanks. Yes, you are correct. The line should be something like: ,case when grouping(i_category)+grouping(i_class) = 0 then i_category end > Regards > s. > > On 11.03.2019 06:30, Tatsuo Ishii wrote: >> I played with TPC-DS and found some of them can't be executed because >> of SQL errors and I am not sure why. >> >> For example with query 36: >> >> select >> sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin >> ,i_category >> ,i_class >> ,grouping(i_category)+grouping(i_class) as lochierarchy >> ,rank() over ( >> partition by grouping(i_category)+grouping(i_class), >> case when grouping(i_class) = 0 then i_category end >> order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as >> rank_within_parent >> from >> store_sales >> ,date_dim d1 >> ,item >> ,store >> where >> d1.d_year = 2000 >> and d1.d_date_sk = ss_sold_date_sk >> and i_item_sk = ss_item_sk >> and s_store_sk = ss_store_sk >> and s_state in ('TN','TN','TN','TN', >> 'TN','TN','TN','TN') >> group by rollup(i_category,i_class) >> order by >> lochierarchy desc >> ,case when lochierarchy = 0 then i_category end -- line 25 is here. >> ,rank_within_parent >> limit 100; >> psql:query_0.sql:1935: ERROR: column "lochierarchy" does not exist >> LINE 25: ,case when lochierarchy = 0 then i_category end >> >> I have follwed the instruction here. >> https://ankane.org/tpc-ds >> >> PostgreSQL is master branch HEAD. For me, the SQL above looks to be >> valid. >> >> Best regards, >> -- >> Tatsuo Ishii >> SRA OSS, Inc. Japan >> English: http://www.sraoss.co.jp/index_en.php >> Japanese:http://www.sraoss.co.jp >> >
I found this error in queries generated from templates query36.tpl, query70.tpl, and query86.tpl. The problem is, lochierarchy is an alias defined in the SELECT statement, and the alias isn't being recognized in the CASE statement. PostgreSQL does not allow a column alias to be referenced in a CASE statement, you have to use the actual column name. Modify each of the queries throwing errors, and replace the lochierarchy alias with the actual column name you see in the SELECT statement.
-Mark
On Mon, Mar 11, 2019 at 4:00 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Hi,
>
> I think that the sql is not valid. Based on the order by
> documentation, a column label cannot be used in an expression.
>
> from https://www.postgresql.org/docs/11/queries-order.html
> > Note that an output column name has to stand alone, that is, it
> cannot be used in an expression.
Thanks. Yes, you are correct. The line should be something like:
,case when grouping(i_category)+grouping(i_class) = 0 then i_category end
> Regards
> s.
>
> On 11.03.2019 06:30, Tatsuo Ishii wrote:
>> I played with TPC-DS and found some of them can't be executed because
>> of SQL errors and I am not sure why.
>>
>> For example with query 36:
>>
>> select
>> sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin
>> ,i_category
>> ,i_class
>> ,grouping(i_category)+grouping(i_class) as lochierarchy
>> ,rank() over (
>> partition by grouping(i_category)+grouping(i_class),
>> case when grouping(i_class) = 0 then i_category end
>> order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as
>> rank_within_parent
>> from
>> store_sales
>> ,date_dim d1
>> ,item
>> ,store
>> where
>> d1.d_year = 2000
>> and d1.d_date_sk = ss_sold_date_sk
>> and i_item_sk = ss_item_sk
>> and s_store_sk = ss_store_sk
>> and s_state in ('TN','TN','TN','TN',
>> 'TN','TN','TN','TN')
>> group by rollup(i_category,i_class)
>> order by
>> lochierarchy desc
>> ,case when lochierarchy = 0 then i_category end -- line 25 is here.
>> ,rank_within_parent
>> limit 100;
>> psql:query_0.sql:1935: ERROR: column "lochierarchy" does not exist
>> LINE 25: ,case when lochierarchy = 0 then i_category end
>>
>> I have follwed the instruction here.
>> https://ankane.org/tpc-ds
>>
>> PostgreSQL is master branch HEAD. For me, the SQL above looks to be
>> valid.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>