Thread: Creating a query to structure results grouped by two columns, pivoting only a third colum

Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:

````
SELECT 
  t1.date, 
  t1.area, 
  t1.canal,
  SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3 

````

and source query generates a initial structure as in:

###########################
date area  | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
. ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################




Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````
SELECT  
  t2.area,
  t2.canal,
  ( SELECT 
month,
peso_valor
FROM (
    SELECT 
      month(t1.date) month, 
      t1.area, 
      t1.canal,
      SUM(t1.peso_valor) AS peso_valor
    FROM tbl_data t1
    GROUP BY 1, 2, 3
    ORDER BY 1, 2, 3
) source_table 
) PIVOT (  
  peso
  FOR month IN (
    1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
  )  
) as pivot_table
ORDER BY month
) as t2.peso 
FROM tbl_data t2
GROUP BY 1, 2
````


what would be the SQL implementation to achieve a structure as such  ?  
(.i.e.  grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

 illustrated bellow?


Best wishes,
I








###########################
area  canal  2021-03-01  |  2021-04-01 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) ...


area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################




On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
 [FROM] 
###########################
date area  | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
. ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################


[TO] 

###########################
area  canal  2021-03-01  |  2021-04-01 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) ...


area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################


I think you'd want to use the "crosstab" function in the module "tablefunc" https://www.postgresql.org/docs/current/tablefunc.html

This post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905

Does that get you close enough?
Steve 

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:

````
SELECT 
  t1.date, 
  t1.area, 
  t1.canal,
  SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3 

````

and source query generates a initial structure as in:

###########################
date area  | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
. ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################




Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````
SELECT  
  t2.area,
  t2.canal,
  ( SELECT 
month,
peso_valor
FROM (
    SELECT 
      month(t1.date) month, 
      t1.area, 
      t1.canal,
      SUM(t1.peso_valor) AS peso_valor
    FROM tbl_data t1
    GROUP BY 1, 2, 3
    ORDER BY 1, 2, 3
) source_table 
) PIVOT (  
  peso
  FOR month IN (
    1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
  )  
) as pivot_table
ORDER BY month
) as t2.peso 
FROM tbl_data t2
GROUP BY 1, 2
````


what would be the SQL implementation to achieve a structure as such  ?  
(.i.e.  grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

 illustrated bellow?


Best wishes,
I








###########################
area  canal  2021-03-01  |  2021-04-01 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) ...


area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################


Using PySpark API, the solution would be 

````
df = df_table1.groupBy("area", "canal").pivot("date").sum("peso")
````

In oracle 11g,  it would be sort of 

````
SELECT * FROM table1 PIVOT (sum(peso) FOR date IN (
   '2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01’ 
 ) P
````

Oracle has this strange way of dealing with PIVOT, without even need to group columns. 


If I were to write the query, ut would be somehtinglike this (i.e. similar to Oracle but more explicit).

SELECT * FROM (
  SELECT 
    date, 
    area, 
    sub_canal,
    SUM(peso) AS peso
  FROM table1
  GROUP BY 1, 2, 3
  ORDER BY 1, 2, 3 
) t1 
PIVOT (
  peso FOR dat_referencia_mes IN ('2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01' )
) p



another thing that I would have done is to replace the static set of dates to a dynamic one. (i.e. SELECT DISTINCT date FROM table1)



However, I’m dealing with syntax errors now, plus once  I fix them I’m not sure if that will work ( I mean  



On Sep 25, 2021, at 21:53, Steve Midgley <science@misuse.org> wrote:



On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
 [FROM] 
###########################
date area  | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
. ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################


[TO] 

###########################
area  canal  2021-03-01  |  2021-04-01 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) ...


area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################


I think you'd want to use the "crosstab" function in the module "tablefunc" https://www.postgresql.org/docs/current/tablefunc.html

This post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905

Does that get you close enough?
Steve 

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:

````
SELECT 
  t1.date, 
  t1.area, 
  t1.canal,
  SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3 

````

and source query generates a initial structure as in:

###########################
date area  | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
. ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################




Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````
SELECT  
  t2.area,
  t2.canal,
  ( SELECT 
month,
peso_valor
FROM (
    SELECT 
      month(t1.date) month, 
      t1.area, 
      t1.canal,
      SUM(t1.peso_valor) AS peso_valor
    FROM tbl_data t1
    GROUP BY 1, 2, 3
    ORDER BY 1, 2, 3
) source_table 
) PIVOT (  
  peso
  FOR month IN (
    1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
  )  
) as pivot_table
ORDER BY month
) as t2.peso 
FROM tbl_data t2
GROUP BY 1, 2
````


what would be the SQL implementation to achieve a structure as such  ?  
(.i.e.  grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

 illustrated bellow?


Best wishes,
I








###########################
area  canal  2021-03-01  |  2021-04-01 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) ...


area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################