Thread: Is there any chance to get some kind of a result set sifting mechanism in Postgres?

Hello Everyone!

Is there any chance to get some kind of a result set sifting mechanism in Postgres? 

What I am looking for is a way to get for example: "nulls last" in a result set, without having to call "order by" or having to use UNION ALL, and if possible to get this in a single result set pass.

Something on this line: SELECT a, b, c FROM my_table WHERE a nulls last OFFSET 0 LIMIT 25

I don't want to use order by or union all because these are time consuming operations, especially on  large data sets and when comparations are done on dynamic values (eg: geolocation distances in between a mobile and a static location) 

What I would expect from such a feature, will be speeds comparable with non sorted selects, while getting a very rudimentary ordering.

A use case for such a mechanism will be the implementation of QUICK relevant search results for a search engine.

I'm not familiar with how Postgres logic handles simple select queries, but the way I would envision a result set sifting logic, would be to collect the result set, in 2 separate lists, based on the sifting condition, and then concatenate these 2 lists and return the result, when the pagination requests conditions are met.

Any idea if such a functionality is feasible ?

Thank you.

  PS: if ever implemented, the  sifting mechanism could be extended to accommodate any type of thresholds, not just null values.



Hi,

do I interpret your idea correctly: You want some sort of ordering without ordering?

Kind regards
WW

Am Montag, 13. Mai 2024 um 10:40:38 MESZ hat aa <ghevge@gmail.com> Folgendes geschrieben:


Hello Everyone!

Is there any chance to get some kind of a result set sifting mechanism in Postgres? 

What I am looking for is a way to get for example: "nulls last" in a result set, without having to call "order by" or having to use UNION ALL, and if possible to get this in a single result set pass.

Something on this line: SELECT a, b, c FROM my_table WHERE a nulls last OFFSET 0 LIMIT 25

I don't want to use order by or union all because these are time consuming operations, especially on  large data sets and when comparations are done on dynamic values (eg: geolocation distances in between a mobile and a static location) 

What I would expect from such a feature, will be speeds comparable with non sorted selects, while getting a very rudimentary ordering.

A use case for such a mechanism will be the implementation of QUICK relevant search results for a search engine.

I'm not familiar with how Postgres logic handles simple select queries, but the way I would envision a result set sifting logic, would be to collect the result set, in 2 separate lists, based on the sifting condition, and then concatenate these 2 lists and return the result, when the pagination requests conditions are met.

Any idea if such a functionality is feasible ?

Thank you.

  PS: if ever implemented, the  sifting mechanism could be extended to accommodate any type of thresholds, not just null values.




Hi,
If you call the action of "sifting" ordering, then yes. If you don't call it ordering, then no.

In essence, is the output of a filtering mechanism, done in a single result set pass. And this pass should be the same pass in charge of collecting the result set in the first place.

Thanks


On Mon, May 13, 2024 at 5:48 AM Wolfgang Wilhelm <wolfgang20121964@yahoo.de> wrote:
Hi,

do I interpret your idea correctly: You want some sort of ordering without ordering?

Kind regards
WW

Am Montag, 13. Mai 2024 um 10:40:38 MESZ hat aa <ghevge@gmail.com> Folgendes geschrieben:


Hello Everyone!

Is there any chance to get some kind of a result set sifting mechanism in Postgres? 

What I am looking for is a way to get for example: "nulls last" in a result set, without having to call "order by" or having to use UNION ALL, and if possible to get this in a single result set pass.

Something on this line: SELECT a, b, c FROM my_table WHERE a nulls last OFFSET 0 LIMIT 25

I don't want to use order by or union all because these are time consuming operations, especially on  large data sets and when comparations are done on dynamic values (eg: geolocation distances in between a mobile and a static location) 

What I would expect from such a feature, will be speeds comparable with non sorted selects, while getting a very rudimentary ordering.

A use case for such a mechanism will be the implementation of QUICK relevant search results for a search engine.

I'm not familiar with how Postgres logic handles simple select queries, but the way I would envision a result set sifting logic, would be to collect the result set, in 2 separate lists, based on the sifting condition, and then concatenate these 2 lists and return the result, when the pagination requests conditions are met.

Any idea if such a functionality is feasible ?

Thank you.

  PS: if ever implemented, the  sifting mechanism could be extended to accommodate any type of thresholds, not just null values.



On Mon, 13 May 2024 at 04:40, aa <ghevge@gmail.com> wrote:
Hello Everyone!

Is there any chance to get some kind of a result set sifting mechanism in Postgres? 

What I am looking for is a way to get for example: "nulls last" in a result set, without having to call "order by" or having to use UNION ALL, and if possible to get this in a single result set pass.

Something on this line: SELECT a, b, c FROM my_table WHERE a nulls last OFFSET 0 LIMIT 25

I don't want to use order by or union all because these are time consuming operations, especially on  large data sets and when comparations are done on dynamic values (eg: geolocation distances in between a mobile and a static location) 

This already exists: ORDER BY a IS NULL

I've found it to be more useful than one might initially expect to order by a boolean expression.
aa <ghevge@gmail.com> writes:
> If you call the action of "sifting" ordering, then yes. If you don't call
> it ordering, then no.
> In essence, is the output of a filtering mechanism, done in a single result
> set pass. And this pass should be the same pass in charge of collecting the
> result set in the first place.

Sounds a lot like a WHERE clause to me.

            regards, tom lane



On 05/13/24 09:35, aa wrote:
> If you call the action of "sifting" ordering, then yes. If you don't call
> it ordering, then no.


One thing seems intriguing about this idea: normally, an expected
property of any ORDER BY is that no result row can be passed down
the pipe until all input rows have been seen.

In the case of ORDER BY <boolean expression>, or more generally
ORDER BY <expression type with small discrete value space>, a
pigeonhole sort could be used—and rows mapping to the ordered-first
pigeonhole could be passed down the pipe on sight. (Rows mapping to
any later pigeonhole still have to be held to the end, unless some
further analysis can identify when all rows for earlier pigeonholes
must have been seen).

I don't know whether any such ORDER BY strategy is already implemented,
or would be useful enough to be worth implementing, but it might be
handy in cases where a large number of rows are expected to map to
the first pigeonhole. Intermediate storage wouldn't be needed for those,
and some follow-on processing could go on concurrently.

The usage example offered here ("sift" nulls last, followed by
a LIMIT) does look a lot like a job for a WHERE clause though.

Regards,
-Chap