Re: [PATCH] pgweb: Search: bugfix: replace sites.baseurl with sites.hostname - Mailing list pgsql-www

From Magnus Hagander
Subject Re: [PATCH] pgweb: Search: bugfix: replace sites.baseurl with sites.hostname
Date
Msg-id CABUevExDQhiAwL7qAKR1=DFmrFVzb+zHKpfurUo8P78hoATe6A@mail.gmail.com
Whole thread Raw
In response to [PATCH] pgweb: Search: bugfix: replace sites.baseurl with sites.hostname  (Célestin Matte <celestin.matte@cmatte.me>)
Responses Re: [PATCH] pgweb: Search: bugfix: replace sites.baseurl with sites.hostname  (Célestin Matte <celestin.matte@cmatte.me>)
List pgsql-www


On Fri, Oct 22, 2021 at 4:59 PM Célestin Matte <celestin.matte@cmatte.me> wrote:
Hello,

There is an issue in the sql files for search in pgweb, as schema.sql does not correspond to what is used in functions.sql.
functions.sql uses sites.baseurl, which is not a field in schema.sql. I replaced it with sites.hostname, which I guess is the targeted field.

Without that patch, I get the following error when performing a search:

Traceback (most recent call last):                                                                                     
  File "/srv/pgweb/local/pgweb/search/views.py", line 284, in search                                                   
    'internal': include_internal,                                                                                     
psycopg2.errors.UndefinedColumn: column sites.baseurl does not exist                                                   
LINE 1: SELECT sites.id AS siteid, sites.baseurl, webpages.suburl, t...                                               
                                   ^                                                                                   
QUERY:  SELECT sites.id AS siteid, sites.baseurl, webpages.suburl, ts_rank_cd(fti,tsq) * relprio AS ts_rank_cd FROM web
pages INNER JOIN sites ON webpages.site=sites.id WHERE fti @@ tsq AND site=1 AND (includeinternal OR NOT isinternal) OR
DER BY ts_rank_cd(fti,tsq) * relprio DESC LIMIT 1000                                                                   
CONTEXT:  PL/pgSQL function site_search(text,integer,integer,text,boolean) line 21 at OPEN


Is that really right/enough?

All the instances I run have both fields, and for example hostname is www.postgresql.org and baseurl is https://www.postgresql.org.

Now, today that's the only site we have, so we could just replace it with putting the https:// part into it int he templates, but I think this originally comes from the times when we indexed multiple sites in a previous system -- and it appears to have been incorrectly imported in the very first commit. The search system is really due for a cleanup/replacement since that has been removed, because a lot of things could be simplified around that.

But short term, I believe the correct thing to do is to add the baseurl field to the table in the SQL file. I'll go push a patch taht does that.
 
--

pgsql-www by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: [PATCH] pgweb: Add SEARCH_DSN to example settings.py
Next
From: Magnus Hagander
Date:
Subject: Re: [PATCH] pgarchives: Fix database install procedure: remove redundant tables in schema.sql