Thread: Random Weighted Result Ordering

Random Weighted Result Ordering

From
Eliot Gable
Date:
I have a set of results that I am selecting from a set of tables which I want to return in a random weighted order for each priority group returned. Each row has a priority column and a weight column. I sort by the priority column with 1 being highest priority. Then, for each distinct priority, I want to do a weighted random ordering of all rows that have that same priority. I select the set of rows and pass it to a custom-built function that does the ordering. I have tested both the prioritize and the random weighted ordering functions and they do exactly what I want them to do for ordering the data that I send them.

The problem comes from the fact that I tried to make them generalized. They take an array of a small complex type which holds just an arbitrary ID, the priority, and the weight. The output is the same information but the rows are in the correct order. I thought I could take that output and just INNER JOIN it back to my original data rows from which I obtained the ID, priority, and weight values. However, when I do that, the ordering no longer seems to be preserved, even though there is no ORDER BY clause on the query that joins the data back to the original rows. Basically, I thought that if the original data was:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
34, 2, 0, 'data3'
90, 2, 1, 'data4'
95, 2, 1, 'data5

And the input to the functions was:

50, 1, 5
55, 1, 4
34, 2, 0
90, 2, 1
95, 2, 1

And the prioritized and weighted order came back:

50, 1, 5
55, 1, 4
95, 2, 1
90, 2, 1
34, 2, 0

Then, if I INNER JOINED them like:

  50, 1, 5
  55, 1, 4
  95, 2, 1
  90, 2, 1
  34, 2, 0
) AS randomized INNER JOIN (
  50, 1, 5, 'data1'
  55, 1, 4, 'data2'
  34, 2, 0, 'data3'
  90, 2, 1, 'data4'
  95, 2, 1, 'data5
) AS data ON (
)

Then the rows would come back as:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
95, 2, 1, 'data5'
90, 2, 1, 'data4'
34, 2, 0, 'data3

Unfortunately, that does not seem to be happening. Before I spend a ton of time digging into this issue, I thought I would pose the questions here:

Does anyone know for certain why this would not work? Or, does anyone know why this should work? 

I assumed that the order of the joins would preserve the ordering of the first set of data. However, I am worried about how the planner might re-arrange the joins on me, and I am wondering whether the order is guaranteed to be preserved like this in the first place... Does anyone know for sure about these assumptions?

Thanks in advance for any assistance.


--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero

Re: Random Weighted Result Ordering

From
Dimitri Fontaine
Date:
Eliot Gable <egable+pgsql-general@gmail.com> writes:

> I have a set of results that I am selecting from a set of tables which I want to return in a random weighted order
foreach priority group returned. Each row has a 
> priority column and a weight column. I sort by the priority column with 1 being highest priority. Then, for each
distinctpriority, I want to do a weighted random 
> ordering of all rows that have that same priority. I select the set of rows and pass it to a custom-built function
thatdoes the ordering. I have tested both the 
> prioritize and the random weighted ordering functions and they do exactly what I want them to do for ordering the
datathat I send them. 
>
> The problem comes from the fact that I tried to make them generalized. They take an array of a small complex type
whichholds just an arbitrary ID, the priority, 
> and the weight. The output is the same information but the rows are in
> the correct order.

I'd try having the function return just numbers in the right order, then
use that in the ORDER BY. To have those numbers, you'd still need to
join with the result of the function, tho.

Hope this helps you already, I don't have time to go deeper in the
subject!

Regards,
--
dim

Re: Random Weighted Result Ordering

From
Lew
Date:
Eliot Gable wrote:
> rows. Basically, I thought that if the original data was:
>
> 50, 1, 5, 'data1'
> 55, 1, 4, 'data2'
> 34, 2, 0, 'data3'
> 90, 2, 1, 'data4'
> 95, 2, 1, 'data5
>
> And the input to the functions was:
>
> 50, 1, 5
> 55, 1, 4
> 34, 2, 0
> 90, 2, 1
> 95, 2, 1
>
> And the prioritized and weighted order came back:
>
> 50, 1, 5
> 55, 1, 4
> 95, 2, 1
> 90, 2, 1
> 34, 2, 0
>
> Then, if I INNER JOINED them like:
>
> (
>    50, 1, 5
>    55, 1, 4
>    95, 2, 1
>    90, 2, 1
>    34, 2, 0
> ) AS randomized INNER JOIN (
>    50, 1, 5, 'data1'
>    55, 1, 4, 'data2'
>    34, 2, 0, 'data3'
>    90, 2, 1, 'data4'
>    95, 2, 1, 'data5
> ) AS data ON (
> randomized.id <http://randomized.id> = data.id <http://data.id>
> )
>
> Then the rows would come back as:
>
> 50, 1, 5, 'data1'
> 55, 1, 4, 'data2'
> 95, 2, 1, 'data5'
> 90, 2, 1, 'data4'
> 34, 2, 0, 'data3
>
> Unfortunately, that does not seem to be happening. Before I spend a ton
> of time digging into this issue, I thought I would pose the questions here:
>
> Does anyone know for certain why this would not work? Or, does anyone
> know why this should work?

It should not "work" in the sense you mean, but it does "work" in the way that
SQL promises, namely that the order can be anything if you omit an ORDER BY
clause in the SELECT.

> I assumed that the order of the joins would preserve the ordering of the
> first set of data. However, I am worried about how the planner might
> re-arrange the joins on me, and I am wondering whether the order is
> guaranteed to be preserved like this in the first place... Does anyone
> know for sure about these assumptions?

SELECT makes no promise about the order of returned rows absent an ORDER BY
clause.  Your query lacks an ORDER BY clause.  You could therefore get any
order back, including the possibility of different orders from different runs
of the same query.

Tables in a relational database have no inherent order.  You took no steps
whatsoever to guarantee the order of rows returned by the SELECT, so you
should not be surprised at any order that comes back.

--
Lew

Re: Random Weighted Result Ordering

From
Eliot Gable
Date:
Great suggestion. Thanks. Don't know why I didn't think of that. I do almost exactly the same thing further down in my stored procedure.

On Mon, Jun 7, 2010 at 4:34 AM, Dimitri Fontaine <dfontaine@hi-media.com> wrote:
Eliot Gable <egable+pgsql-general@gmail.com> writes:

> I have a set of results that I am selecting from a set of tables which I want to return in a random weighted order for each priority group returned. Each row has a
> priority column and a weight column. I sort by the priority column with 1 being highest priority. Then, for each distinct priority, I want to do a weighted random
> ordering of all rows that have that same priority. I select the set of rows and pass it to a custom-built function that does the ordering. I have tested both the
> prioritize and the random weighted ordering functions and they do exactly what I want them to do for ordering the data that I send them.
>
> The problem comes from the fact that I tried to make them generalized. They take an array of a small complex type which holds just an arbitrary ID, the priority,
> and the weight. The output is the same information but the rows are in
> the correct order.

I'd try having the function return just numbers in the right order, then
use that in the ORDER BY. To have those numbers, you'd still need to
join with the result of the function, tho.

Hope this helps you already, I don't have time to go deeper in the
subject!

Regards,
--
dim



--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero