Thread: Creating a query to structure results grouped by two columns, pivoting only a third colum

Creating a query to structure results grouped by two columns, pivoting only a third colum

From

Iuri Sampaio

Date:

25 September 2021, 19:39:34

Hello there,

In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?

The source query is:

````

SELECT

t1.date,

t1.area,

t1.canal,

SUM(t1.peso) AS peso

FROM table1 t1

GROUP BY 1, 2, 3

ORDER BY 1, 2, 3

````

and source query generates a initial structure as in:

###########################

date | area | canal | peso

2021-03-01 area1 can1 45.6768

2021-03-01 area1 can2 54.6768

2021-03-01 area1 can3 87.6

2021-03-01 area2 can1 1.87

2021-03-01 area2 can2 12.7687

2021-03-01 area2 can3 965.568

2021-03-01 area3 can1 968.95

2021-03-01 area3 can2 1.6867

2021-03-01 area3 can3 8.897

…. … … ...

2021-06-01 area1 can1

2021-06-01 area1 can2

2021-06-01 area1 can3

2021-06-01 area2 can1

… … ...

2021-12-01 area1

###########################

Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".

Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````

SELECT

t2.area,

t2.canal,

( SELECT

month,

peso_valor

FROM (

SELECT

month(t1.date) month,

t1.area,

t1.canal,

SUM(t1.peso_valor) AS peso_valor

FROM tbl_data t1

GROUP BY 1, 2, 3

ORDER BY 1, 2, 3

) source_table

) PIVOT (

peso

FOR month IN (

1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec

)

) as pivot_table

ORDER BY month

) as t2.peso

FROM tbl_data t2

GROUP BY 1, 2

````

what would be the SQL implementation to achieve a structure as such ?

(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

illustrated bellow?

Best wishes,

###########################

area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...

peso peso peso peso

area1 can1 45.6768 875.98 1.232 …

area1 can2 54.6768 665.8 2.43 ...

area1 can3 87.6 65.8 4.76 ...

area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 … ...

area2 can2 12.7687

area2 can3 965.568

area2 total SUM(1.87+12.7687+965.568) … ...

area3 can1 968.95

area3 can2 1.6867

area3 can3 8.897

...

###########################

Re: Creating a query to structure results grouped by two columns, pivoting only a third colum

From

Steve Midgley

Date:

26 September 2021, 03:53:39

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:

[FROM]

###########################
date | area | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
…. … … ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
… … ...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################

[TO]

###########################
area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232 …
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 … ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) … ...

area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################

I think you'd want to use the "crosstab" function in the module "tablefunc" https://www.postgresql.org/docs/current/tablefunc.html

This post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905

Does that get you close enough?

Steve

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:

Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:

````
SELECT
t1.date,
t1.area,
t1.canal,
SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3

````

and source query generates a initial structure as in:

###########################
date | area | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
…. … … ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
… … ...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################

Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````
SELECT
t2.area,
t2.canal,
( SELECT
month,
peso_valor
FROM (
SELECT
month(t1.date) month,
t1.area,
t1.canal,
SUM(t1.peso_valor) AS peso_valor
FROM tbl_data t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
) source_table
) PIVOT (
peso
FOR month IN (
1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
)
) as pivot_table
ORDER BY month
) as t2.peso
FROM tbl_data t2
GROUP BY 1, 2
````

what would be the SQL implementation to achieve a structure as such ?
(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

illustrated bellow?

Best wishes,
I

###########################
area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232 …
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 … ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) … ...

area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################

Re: Creating a query to structure results grouped by two columns, pivoting only a third colum

From

Iuri Sampaio

Date:

27 September 2021, 17:06:47

Using PySpark API, the solution would be

````

df = df_table1.groupBy("area", "canal").pivot("date").sum("peso")

````

In oracle 11g, it would be sort of

````

SELECT * FROM table1 PIVOT (sum(peso) FOR date IN (

'2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01’

) P

````

Oracle has this strange way of dealing with PIVOT, without even need to group columns.

If I were to write the query, ut would be somehtinglike this (i.e. similar to Oracle but more explicit).

SELECT * FROM (

SELECT

date,

area,

sub_canal,

SUM(peso) AS peso

FROM table1

GROUP BY 1, 2, 3

ORDER BY 1, 2, 3

) t1

PIVOT (

peso FOR dat_referencia_mes IN ('2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01' )

) p

another thing that I would have done is to replace the static set of dates to a dynamic one. (i.e. SELECT DISTINCT date FROM table1)

However, I’m dealing with syntax errors now, plus once I fix them I’m not sure if that will work ( I mean

On Sep 25, 2021, at 21:53, Steve Midgley <science@misuse.org> wrote:

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
[FROM]
###########################
date | area | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
…. … … ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
… … ...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################

[TO]

###########################
area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232 …
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 … ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) … ...

area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################

I think you'd want to use the "crosstab" function in the module "tablefunc" https://www.postgresql.org/docs/current/tablefunc.html

This post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905

Does that get you close enough?
Steve

On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:

````
SELECT
t1.date,
t1.area,
t1.canal,
SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3

````

and source query generates a initial structure as in:

###########################
date | area | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
…. … … ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
… … ...

2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################

Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .

I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso

````
SELECT
t2.area,
t2.canal,
( SELECT
month,
peso_valor
FROM (
SELECT
month(t1.date) month,
t1.area,
t1.canal,
SUM(t1.peso_valor) AS peso_valor
FROM tbl_data t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
) source_table
) PIVOT (
peso
FOR month IN (
1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
)
) as pivot_table
ORDER BY month
) as t2.peso
FROM tbl_data t2
GROUP BY 1, 2
````

what would be the SQL implementation to achieve a structure as such ?
(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)

illustrated bellow?

Best wishes,
I

###########################
area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...
peso peso peso peso

area1 can1 45.6768 875.98 1.232 …
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...

area2 can1 1.87 … ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) … ...

area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897

...
###########################