Random Weighted Result Ordering - Mailing list pgsql-general

From Eliot Gable
Subject Random Weighted Result Ordering
Date
Msg-id AANLkTimF6IgK7EHhv01W_hljsUcbuNY2O8MjzoC31-b_@mail.gmail.com
Whole thread Raw
Responses Re: Random Weighted Result Ordering  (Dimitri Fontaine <dfontaine@hi-media.com>)
Re: Random Weighted Result Ordering  (Lew <noone@lewscanon.com>)
List pgsql-general
I have a set of results that I am selecting from a set of tables which I want to return in a random weighted order for each priority group returned. Each row has a priority column and a weight column. I sort by the priority column with 1 being highest priority. Then, for each distinct priority, I want to do a weighted random ordering of all rows that have that same priority. I select the set of rows and pass it to a custom-built function that does the ordering. I have tested both the prioritize and the random weighted ordering functions and they do exactly what I want them to do for ordering the data that I send them.

The problem comes from the fact that I tried to make them generalized. They take an array of a small complex type which holds just an arbitrary ID, the priority, and the weight. The output is the same information but the rows are in the correct order. I thought I could take that output and just INNER JOIN it back to my original data rows from which I obtained the ID, priority, and weight values. However, when I do that, the ordering no longer seems to be preserved, even though there is no ORDER BY clause on the query that joins the data back to the original rows. Basically, I thought that if the original data was:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
34, 2, 0, 'data3'
90, 2, 1, 'data4'
95, 2, 1, 'data5

And the input to the functions was:

50, 1, 5
55, 1, 4
34, 2, 0
90, 2, 1
95, 2, 1

And the prioritized and weighted order came back:

50, 1, 5
55, 1, 4
95, 2, 1
90, 2, 1
34, 2, 0

Then, if I INNER JOINED them like:

  50, 1, 5
  55, 1, 4
  95, 2, 1
  90, 2, 1
  34, 2, 0
) AS randomized INNER JOIN (
  50, 1, 5, 'data1'
  55, 1, 4, 'data2'
  34, 2, 0, 'data3'
  90, 2, 1, 'data4'
  95, 2, 1, 'data5
) AS data ON (
)

Then the rows would come back as:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
95, 2, 1, 'data5'
90, 2, 1, 'data4'
34, 2, 0, 'data3

Unfortunately, that does not seem to be happening. Before I spend a ton of time digging into this issue, I thought I would pose the questions here:

Does anyone know for certain why this would not work? Or, does anyone know why this should work? 

I assumed that the order of the joins would preserve the ordering of the first set of data. However, I am worried about how the planner might re-arrange the joins on me, and I am wondering whether the order is guaranteed to be preserved like this in the first place... Does anyone know for sure about these assumptions?

Thanks in advance for any assistance.


--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero

pgsql-general by date:

Previous
From: "John T. Dow"
Date:
Subject: Error on Windows server could not open relation base/xxx/xxx Permission denied
Next
From: Craig Ringer
Date:
Subject: Re: WINDOWS : PostgreSQL 8.4 Server Start Error