Re: ~* + LIMIT => infinite time? - Mailing list pgsql-performance
From | Hannu Krosing |
---|---|
Subject | Re: ~* + LIMIT => infinite time? |
Date | |
Msg-id | 1039997033.12952.6.camel@rh72.home.ee Whole thread Raw |
In response to | Re: ~* + LIMIT => infinite time? (<typea@l-i-e.com>) |
Responses |
Re: ~* + LIMIT => infinite time?
|
List | pgsql-performance |
typea@l-i-e.com kirjutas P, 15.12.2002 kell 05:41: > It occurred to me last night that the actual data *MIGHT* be involved -- > It's some OCR text, and there are a few scattered non-ASCII characters > involved... So *MAYBE* the actual text getting scanned could also be > important. > > It seems unlikely, since the non-LIMIT query returns all the data just > fine, but just in case... Have you tried using DECLARE CURSOR...; FETCH 1; CLOSE CURSOR; instead of LIMIT ? > Here's a schema and a full dump for anybody that wants to dig in: > http://bulletinarchive.org/pg_dump/ gzipping the data could make sense - data.sql goes from 200M to 60M ;) > I could provide PHP source as well, or the query posted in this thread can > serve as the test case. > > At the moment, I've altered the application to not use LIMIT when I have > ~* in the query, and it "works" -- only the paging of results is broken, > and the whole page takes twice as long to load as it should in those > cases, since it's doing the same query twice and snarfing all the monster > data and then throwing away the majority of rows in both cases. I need > the first row to get the highest score, and the rows for paging in the > real application... > > Anyway, my point is that the queries seem fine without the LIMIT clause, > and "hang" with both "~*" and LIMIT, and I've even gone so far as to > incorporate that into the application logic for now, just to have a page > that loads at all instead of one that hangs. > > Meanwhile, I guess I should flail at it and try 7.3 in the hopes the bug > disappeared. I tested (part of) it on 7.3 , had to manually change ::int to case-when-then-else-end as there is no cast from bool to int in7.3 This ran fine: SELECT DISTINCT *, 0 + case when (title ilike '%albert einstein%') then 10 else 0 end + case when ( title iLIKE '%einstein%' AND title iLIKE '%albert%' AND ( (title ~* 'einstein.{0,20}albert') OR (title ~* 'albert.{0,20}einstein'))) then 8 else 0 end as points FROM article WHERE FALSE OR (title iLIKE '%albert%') OR (author_flattened iLIKE '%albert%') OR (subject_flattened iLIKE '%albert%') OR (title iLIKE '%einstein%') OR (author_flattened iLIKE '%einstein%') OR (subject_flattened iLIKE '%einstein%') ORDER BY points desc, volume, number, article.article LIMIT 1 OFFSET 1; I also changed "lower(field) like '%albert%'" to "field ilike '%albert%'" and got about 20% speed boost - EXPLAIN ANALYZE reported 0.189 insead of 0.263 sec as actual time. > I was hoping to know for sure that it was a fixed bug in > that upgrade path. > > Boss actually said we should go ahead and upgrade just on principle > anyway. It's nice to have a smart boss. :-) > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -- Hannu Krosing <hannu@tm.ee>
pgsql-performance by date: