Thread: Tsearch2 index silently fails on PG 7.3.2

Tsearch2 index silently fails on PG 7.3.2

From
"Justin L. Kennedy"
Date:
The short question is why does this:

select to_tsvector('default', coalesce(name, '') ||' '||
coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items;

give different results than this:

update link_items set linksfti=to_tsvector('default', coalesce(name, '')
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
select linksfti from link_items;


Here are more details:

I am working with Tsearch2 on a server with version string:
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96

I have a table with the following schema:
CREATE TABLE link_items
(
  link_id int4,
  name varchar(255),
  url varchar(255),
  description text,
  spanish int4,
  spanishurl varchar(255),
  lastmod date,
  visible int4,
  state varchar(25),
  promisepractice int4,
  keywords text,
  linksfti tsvector
)
WITH OIDS;
ALTER TABLE link_items OWNER TO gate;

I want linksfti to hold the search engine's indexing data (indexed on
'name', 'description', and 'keywords'), so I run the following command:

update link_items set linksfti=to_tsvector('default', coalesce(name, '')
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));

The results are pretty empty.  Most have empty strings for data, other
only index one or two items in the 3 input columns.

For example, after running, my table looks like:
<name>;<description>;<keywords>;<linksfti>
"American Occupational Therapy Association (AOTA) ";"Nationally recognized
professional association for over 60,000 occupational therapists and
occupational therapy assistants. ";"Rehabilitation Professional
Associations and Councils";"'60':1 '000':2"
"American Physical Therapy Association (APTA)";"Represents more than
70,000 physical therapists, physical therapist assistants, and students of
physical therapy. ";"Rehabilitation Professional Associations and
Councils";"'70':1 '000':2"
"U.S. Deaf Ski & Snowboard Association";"Winter sports for people who are
deaf & relevant links.";"Recreation Winter Sports";"'u.s':1"
"Texas Adaptive Aquatics";"Adaptive water skiing program for people with
physical and/or mental disabilities. ";"Recreation Water
Sports";"'and/or':1"
"World T.E.A.M. Sports";"Inclusive sports activities.";"Recreation Team
Sports";"'t.e.a.m':1"
"Tennessee";"Official State Web Site";"Legal State Agencies";""
"Project Vote Smart";"By entering zip code, users get list of all their
elected officials. Links to elected officials' and candidates' web sites,
etc. ";"Government / Public Policy General";""
"TRIPOD Captioned Films";"Captioned Films for people who are deaf or hard
of hearing.";"Recreation Captioned Movies";""


When don't do it as an UPDATE and just print the contents to the screen, I
get the full expected output:

select name, description, keywords, to_tsvector('default', coalesce(name,
'') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from
link_items;

"United States of America Deaf Track and Field";"Promotes training of
track athletes who are deaf and coaches who are deaf and hearing.
";"Recreation Track";"'of':3,11 'and':7,17,22 'are':15,20 'who':14,19
'deaf':5,16,21 'hear':23 'unit':1 'coach':18 'field':8 'state':2
'track':6,12,25 'train':10 'athlet':13 'promot':9 'america':4
'recreat':24"
"Adventure Pursuit, Inc.";"Adventure Pursuit is a group of volunteers who
like spending time with all kinds of people and focus on adventure sports
like kayaking.";"Recreation Water Sports";"'a':7 'is':6 'of':9,18 'on':22
'all':16 'and':20 'inc':3 'who':11 'kind':17 'like':12,25 'time':14
'with':15 'focus':21 'group':8 'kayak':26 'peopl':19 'spend':13
'sport':24,29 'water':28 'volunt':10 'pursuit':2,5 'recreat':27
'adventur':1,4,23"


Using pgAdminIII, I copied (default backup/restore) the database from our
production server and put in on my personal desktop (Windows 2000, PgSQL
8.0.0) and re-ran the update query and it gave proper results.

Is it a known issue with 7.3.2, and is there a workaround without
upgrading the server to 8.0.0?  We will upgrade in a few months, but we
can't take the server offline now because we have too many websites that
depend on it.

--
Justin Kennedy
Systems Analyst I


Re: Tsearch2 index silently fails on PG 7.3.2

From
Oleg Bartunov
Date:
I don't remember such problem ? What's your tsearch2 setup ?


Oleg
On Thu, 17 Mar 2005, Justin L. Kennedy wrote:

>
> The short question is why does this:
>
> select to_tsvector('default', coalesce(name, '') ||' '||
> coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items;
>
> give different results than this:
>
> update link_items set linksfti=to_tsvector('default', coalesce(name, '')
> ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
> select linksfti from link_items;
>
>
> Here are more details:
>
> I am working with Tsearch2 on a server with version string:
> PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96
>
> I have a table with the following schema:
> CREATE TABLE link_items
> (
>  link_id int4,
>  name varchar(255),
>  url varchar(255),
>  description text,
>  spanish int4,
>  spanishurl varchar(255),
>  lastmod date,
>  visible int4,
>  state varchar(25),
>  promisepractice int4,
>  keywords text,
>  linksfti tsvector
> )
> WITH OIDS;
> ALTER TABLE link_items OWNER TO gate;
>
> I want linksfti to hold the search engine's indexing data (indexed on
> 'name', 'description', and 'keywords'), so I run the following command:
>
> update link_items set linksfti=to_tsvector('default', coalesce(name, '')
> ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
>
> The results are pretty empty.  Most have empty strings for data, other
> only index one or two items in the 3 input columns.
>
> For example, after running, my table looks like:
> <name>;<description>;<keywords>;<linksfti>
> "American Occupational Therapy Association (AOTA) ";"Nationally recognized
> professional association for over 60,000 occupational therapists and
> occupational therapy assistants. ";"Rehabilitation Professional
> Associations and Councils";"'60':1 '000':2"
> "American Physical Therapy Association (APTA)";"Represents more than
> 70,000 physical therapists, physical therapist assistants, and students of
> physical therapy. ";"Rehabilitation Professional Associations and
> Councils";"'70':1 '000':2"
> "U.S. Deaf Ski & Snowboard Association";"Winter sports for people who are
> deaf & relevant links.";"Recreation Winter Sports";"'u.s':1"
> "Texas Adaptive Aquatics";"Adaptive water skiing program for people with
> physical and/or mental disabilities. ";"Recreation Water
> Sports";"'and/or':1"
> "World T.E.A.M. Sports";"Inclusive sports activities.";"Recreation Team
> Sports";"'t.e.a.m':1"
> "Tennessee";"Official State Web Site";"Legal State Agencies";""
> "Project Vote Smart";"By entering zip code, users get list of all their
> elected officials. Links to elected officials' and candidates' web sites,
> etc. ";"Government / Public Policy General";""
> "TRIPOD Captioned Films";"Captioned Films for people who are deaf or hard
> of hearing.";"Recreation Captioned Movies";""
>
>
> When don't do it as an UPDATE and just print the contents to the screen, I
> get the full expected output:
>
> select name, description, keywords, to_tsvector('default', coalesce(name,
> '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from
> link_items;
>
> "United States of America Deaf Track and Field";"Promotes training of
> track athletes who are deaf and coaches who are deaf and hearing.
> ";"Recreation Track";"'of':3,11 'and':7,17,22 'are':15,20 'who':14,19
> 'deaf':5,16,21 'hear':23 'unit':1 'coach':18 'field':8 'state':2
> 'track':6,12,25 'train':10 'athlet':13 'promot':9 'america':4
> 'recreat':24"
> "Adventure Pursuit, Inc.";"Adventure Pursuit is a group of volunteers who
> like spending time with all kinds of people and focus on adventure sports
> like kayaking.";"Recreation Water Sports";"'a':7 'is':6 'of':9,18 'on':22
> 'all':16 'and':20 'inc':3 'who':11 'kind':17 'like':12,25 'time':14
> 'with':15 'focus':21 'group':8 'kayak':26 'peopl':19 'spend':13
> 'sport':24,29 'water':28 'volunt':10 'pursuit':2,5 'recreat':27
> 'adventur':1,4,23"
>
>
> Using pgAdminIII, I copied (default backup/restore) the database from our
> production server and put in on my personal desktop (Windows 2000, PgSQL
> 8.0.0) and re-ran the update query and it gave proper results.
>
> Is it a known issue with 7.3.2, and is there a workaround without
> upgrading the server to 8.0.0?  We will upgrade in a few months, but we
> can't take the server offline now because we have too many websites that
> depend on it.
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83