Re: Query to return normalized floats - Mailing list pgsql-novice

From Andreas Kretschmer
Subject Re: Query to return normalized floats
Date
Msg-id 20160228082132.GA5287@tux
Whole thread Raw
In response to Query to return normalized floats  (Kip Warner <kip@thevertigo.com>)
Responses Re: Query to return normalized floats  (Kip Warner <kip@thevertigo.com>)
List pgsql-novice
Kip Warner <kip@thevertigo.com> wrote:

> The some_value_* fields contain floating point data and are at least a
> dozen in number.

maybe a wrong table design, but that's another question.


>
> I would like to be able to perform queries on the table on the
> normalized versions of these values. By normalized I don't mean in the
> database nomenclature, but where all some_value_i's are in the range of
> [0,1].
>
> To do this, I must find the min() and max() of each row's some_value_i
> within the table and divide each some_value_i by the absolute
> difference of these two values.
>
> As an example, if a row contained the lowest some_value_3 of -4.0 and
> the largest row 1.5, then the normalized version of any some_value_3
> field is some_value_3 / (1.5 - -4.0) or some_value_3 / 5.5.
>
> I am having difficulty expressing this as a query to just list every
> row in the table for starters (e.g. SELECT * FROM my_table;). I
> considered creating a VIEW, my_table_normalized, but I'm not sure if
> that is the appropriate strategy here.

simple example:

test=*# select * from bla;
 id |       col1        |        col2
----+-------------------+--------------------
  1 | -27.3061781190336 |   9.23637737520039
  2 | -34.9188138730824 |   4.02728125452995
  3 |  27.7425193693489 |  -1.71850152313709
  4 |  18.2173402048647 |   1.78571328520775
  5 | -49.3932147044688 |   3.25902994722128
  6 | -21.3868645019829 | 0.0395399890840054
  7 |  48.6888256389648 | -0.219368590041995
  8 |  -26.943267416209 |  -2.84633947536349
  9 | -47.2060812171549 |   1.46993971429765
 10 |  -16.008263733238 |   5.15772333368659
 11 |   12.651920504868 |   4.91552650928497
 12 |  38.5760291945189 |   6.94333815015852
 13 |   -47.87487979047 |  -4.18941779062152
 14 | -2.24363747984171 |  0.686697596684098
 15 |  -2.6916132774204 |   7.83255377784371
 16 | -4.97196828946471 |    4.0004417207092
 17 |  35.3446557652205 |    2.0218435768038
 18 | -9.86138512380421 |   2.61743502691388
 19 | -48.4832897316664 |  -8.32880898378789
 20 | -27.9842584393919 |  -9.57530088722706
(20 rows)

test=*# create view my_factor as select max(col1) - min(col1) as c1,
max(col2) - min(col2) as c2 from bla;
CREATE VIEW
test=*# select id, col1 / f.c1, col2 / f.c2 from bla cross join
my_factor f;
 id |      ?column?       |      ?column?
----+---------------------+---------------------
  1 |  -0.278401407876724 |   0.490991672638172
  2 |  -0.356016389451263 |   0.214084102351125
  3 |   0.282850145370229 | -0.0913529085052159
  4 |   0.185735738582485 |   0.094925782819407
  5 |  -0.503590815724457 |   0.173245039690613
  6 |  -0.218050770835282 | 0.00210188525087517
  7 |   0.496409184275543 | -0.0116612982096414
  8 |  -0.274701334942333 |  -0.151307046381315
  9 |  -0.481291794622778 |   0.078139743503564
 10 |  -0.163212996764598 |   0.274176671625737
 11 |   0.128993243417117 |   0.261301859446679
 12 |   0.393303698204536 |   0.369097219997987
 13 |   -0.48811056155476 |  -0.222703032242957
 14 | -0.0228751101831245 |  0.0365037923307268
 15 | -0.0274424682438878 |   0.416366560632055
 16 | -0.0506919337327751 |   0.212657353847013
 17 |   0.360358080250589 |   0.107478107407462
 18 |  -0.100542210268818 |   0.139138836546109
 19 |  -0.494313633381836 |  -0.442746727197808
 20 |  -0.285314807291989 |  -0.509008327361828
(20 rows)

test=*#



is that okay? (you should define aliases for the columns...)



>
> The my_table table also contains a large number of rows. I am therefore
> worried about every query on the normalized variant taking a while to
> find the min() and the max() before it can do anything else.


create indexes on the columns, so it should use indexes for the min/max.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


pgsql-novice by date:

Previous
From: Kip Warner
Date:
Subject: Query to return normalized floats
Next
From: Shmagi Kavtaradze
Date:
Subject: Divide table raw into chunks