Dear PostgreSQL people,
A little while ago I posted a message about selecting a varchar
range through JDBC. Thomas was nice enough to give me some pointers, and I
went back to the database and tested some things out. The clincher came when
I realized that one of my JUnit tests was actually passing using the varchar
range; some of the regular expressions worked, while others did not.
What I found is that using the collating elements in regex
comparisons does not work if the collating arguments are more than one
character in length. According to
http://www.postgresql.org/idocs/index.php?functions-matching.html#FUNCTIONS-
REGEXP, regular expressions should work with multi-character collating
arguments. I did a bunch of tests at the PostgreSQL command-line to verify
this finding. Here are some highlights:
testdb=# select 'test' ~* '^[[.a.]-[.z.]]';
?column?
----------
t
(1 row)
testdb=# select 'test' ~* '^[[.aa.]-[.za.]]';
ERROR: Invalid regular expression: invalid collating element
testdb=# select 'test' ~* '^[[.a.]-[.za.]]';
ERROR: Invalid regular expression: invalid collating element
testdb=# select 'test' ~* '^[[.aa.]-[.z.]]';
ERROR: Invalid regular expression: invalid collating element
testdb=# select 'chchcc' ~ '[[.ch.]]*c';
ERROR: Invalid regular expression: invalid collating element
That last test was taken straight from the PostgreSQL interactive
documentation at the link that I provided above. The documentation states
that it should match.
Is this a bug? How do I get around it? I'd appreciate any advice.
Sincerely,
Daryl.
P.S. I'm using PostgreSQL 7.2.1.
P.P.S. I changed my regex's slightly in the code found far below (as per
Thomas' suggestions) so that they read '^[[.testArtist1.]-[.testArtist2.]]'
instead of '[[.testArtist1.]-[.testArtist2.]].*'. I believe both should
work, but the one Thomas suggested should be faster.
> -----Original Message-----
> From: Daryl Beattie [mailto:dbeattie@insystems.com]
> Sent: Friday, August 30, 2002 9:18 AM
> To: pgsql-general@postgresql.org; pgsql-jdbc@postgresql.org
> Subject: Re: [JDBC] [GENERAL] Selecting Varchar range (through JDBC).
>
>
> Dear PostgreSQL people,
>
> Thanks for your help from before, Vince. I didn't get around to
> attempting to implement that varchar range function until
> last night, and
> when I did I ran into some problems. Whenever I tried to execute the
> statement:
>
> SELECT ArtistId FROM Artist WHERE SortName ~*
> '[[.testArtist1.]-[.testArtist2.]].*'
>
> I get the following JDBC error:
>
> javax.ejb.FinderException: SQLException while finding Artists:
> java.sql.SQLException: ERROR: Invalid regular expression:
> invalid collating
> element
>
> The SQLException is coming from the database, however,
> it MAY be a
> JDBC driver issue with it not escaping the statement properly.
> My suspicion, however, is that I'm simply not using the regular
> expressions correctly. Can anybody kindly explain to me how to do this
> correctly? (I am trying to search a varchar range, for
> example, any Artists
> whose names start with the letters A to C.)
> I have attached my Java code to the bottom of this e-mail for
> reference.
> For your convenience, here's the link to the idocs that
> describes
> regex functionality in PostgreSQL:
> http://www.postgresql.org/idocs/index.php?functions-matching.h
> tml#FUNCTIONS-
> REGEXP
>
> Sincerely,
>
> Daryl.
>
> P.S. Some notes:
> - I'm using PostgreSQL 7.2.1 on RH Linux 7.2, and the
> latest binary
> PostgreSQL JDBC driver.
> - It doesn't matter what I put in the [. .], it never
> seems to work
> any better. For example, a good test would be WHERE SortName ~*
> '[[.A.]-[.C.]].*'
> - I tried using a straight Statement-from-a-query-string that I
> constructed instead of a PreparedStatement (to bypass any
> automatic escaping
> done by the PreparedStatement), and got the same results.
>
>
> > -----Original Message-----
> > From: Vince Vielhaber [mailto:vev@michvhf.com]
> > Sent: Monday, August 12, 2002 8:49 AM
> > To: Daryl Beattie
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] Selecting Varchar range.
> >
> >
> > On Mon, 12 Aug 2002, Daryl Beattie wrote:
> >
> > > Dear PostgreSQL people,
> > >
> > > I didn't find this in the archives or the idocs, so...
> > >
> > > How does one select a varchar range? The meaning of my
> > question is
> > > best illustrated by the following completely hypothetical
> > and totally
> > > imaginary situation:
> > > Let's say I have a table of musical artists:
> > >
> > > CREATE TABLE Artists (
> > > Id SERIAL PRIMARY KEY,
> > > Name VARCHAR(100)
> > > );
> > >
> > > And I want to eventually display only the artists from
> > 'Aa' to 'Am'.
> > > Naturally doing a select where Name like 'A%' will not
> > return me what I
> > > want.
> > > Any ides on how to do this? [Will "SELECT Name FROM
> > Artists WHERE
> > > Name > 'Aa%' AND Name < 'Am%';" work?]
> > >
> > > Bonus question: How do I select artists whose names start with a
> > > number?
> >
> > Think REGEX. A[a-m]% [0-9]%
> >
> > If any of your artists are indies, send them to
> www.unknown-artists.com
> >
> > Vince.
>
>
>
>
>
> /**
> * Finds all of the Artists whose sort-names fall within a
> particular range,
> * and returns them in a Collection. An example of this would
> be to ask for
> * all of the artists that are within "Aa" and "Cz", and it
> would return all
> * the Artists that start with A, B, and C. Case is ignored
> for the search,
> * but <i>be careful</i> because some bands could start with numbers!
> *
> * @param start The start String of the Artists we want to find.
> * @param end The end String of the Artists we want to find.
> * @return A Collection containing all the Artists whose
> sort-names fall
> * within a particular range.
> * @throws FinderException If there was a problem finding
> such Artists.
> */
> public Collection
> ejbFindArtistsWhoseSortNamesAreWithin(String start, String
> end) throws FinderException {
> ArrayList artists = new ArrayList(); // All of the Artists.
> Connection dbConnection = null; // The database connection.
> PreparedStatement ps = null; // The
> statement(s) to execute.
> ResultSet resultSet = null; // The results of
> the query.
>
> if ((start.indexOf("-") > 0) || (start.indexOf("[") > 0) ||
> (start.indexOf("]") > 0) || (end.indexOf("-") > 0) ||
> (end.indexOf("[") > 0)
> || (end.indexOf("]") > 0)) throw new FinderException("The
> start and end
> strings for findArtistsWhoseSortNamesAreWithin() cannot have
> '-', '[' or ']'
> in them.");
> if ((start == null) || ("".equals(start)) || (end == null) ||
> ("".equals(end))) return artists;
>
> // STEP 1: Get the database connection.
> try { dbConnection = getConnection(); } catch
> (Exception e) { throw
> new EJBException(e.getMessage()); }
>
> // STEP 2: Find the Artists.
> try {
> String queryStr = "SELECT " +
> DatabaseNames.ARTIST_ID + "
> FROM " +
> DatabaseNames.ARTIST_TABLE + " WHERE " +
> DatabaseNames.ARTIST_SORTNAME + " ~* ?";
> ps = dbConnection.prepareStatement(queryStr);
> ps.setString(1, "[[." + start + ".]-[." + end +
> ".]].*");
> resultSet = ps.executeQuery();
> ArtistPK artistPK = null;
> while (resultSet.next()) {
> artistPK = new ArtistPK();
>
> artistPK.setId(resultSet.getInt(DatabaseNames.ARTIST_ID));
> artists.add(artistPK);
> }
> }
> catch(SQLException e) {
> context.setRollbackOnly();
> throw new FinderException("SQLException while finding
> Artists:\n" + e);
> }
> finally {
> try { closeResultSet(resultSet); } catch (Exception e) {
> log.warning("Failed to close ResultSet: " + e.getMessage()); }
> try { closeStatement(ps); } catch (Exception e) {
> log.warning("Failed to close Statement: " + e.getMessage()); }
> try { closeConnection(dbConnection); } catch
> (Exception e) {
> log.warning("Failed to close Connection: " + e.getMessage()); }
> }
>
> // STEP 3: Return that Collection of ArtistPKs.
> return artists;
> }
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>