Re: Small PosgreSQL locking function request - with bounty - Mailing list pgsql-general

From Ralf Schuchardt
Subject Re: Small PosgreSQL locking function request - with bounty
Date
Msg-id F69C9B15-48B3-4EAB-8905-2CC79C50E28C@gmx.de
Whole thread Raw
In response to Small PosgreSQL locking function request - with bounty  (David Noel <david.i.noel@gmail.com>)
Responses Re: Small PosgreSQL locking function request - with bounty
List pgsql-general
Hi David,

have you used the "for update" clause in your select statements? With this clause "select" locks the selected row(s) in
atable for modifications and other "select for updates". 
My understanding is, that "for update" does what you need. You can execute your select and update statements in a
singletransaction in your Java application and get the desired effect (if the "URL" column is the primary key). 

Regards,
Ralf


Am 12.09.2013 um 13:40 schrieb David Noel <david.i.noel@gmail.com>:

> I have a few database queries that I've been running from within a
> Java project. I have recently come to the understanding that I need to
> run them instead within the PostgreSQL server as stored functions. I
> have that understanding because I need to make use of locking
> functionality, and that seems only able to be done by means of
> PostgreSQL functions. Transactions don't seem to be able to provide
> this. I've never written functions for postgres, so I thought maybe
> someone here could help.
>
> To provide some context: the code is a part of a webcrawler. More
> specifically, it is a part of the queuing system that handles the
> management of URL's to be crawled. The system takes URL's from the
> queue with one query, and marks them as active with a second. It then
> sends the results on to the crawler. Once the crawler has crawled the
> URL, a third query removes the URL from the queue.
>
> The code is running concurrently in multiple threads on multiple
> servers, and in scaling it to multiple servers I've run into some
> problems. It seems that due to the way postgres is designed I am
> unable to lock tables, or utilize transactions in Java to acheive
> concurrency. So I need it instead to be run as a postgres
> function/stored procedure. It seems. Am I correct in this, or did I
> misread the PosgreSQL Transactions documentation?
>
> Assuming the only way to accomplish this is with a postgres function,
> would anyone care to implement this for a small reward (though
> relative to the amount of work required I'd say it's probably a decent
> to large reward)?
>
> The queries are as follows:
>
> String querySelect =
> "select \"URL\",\"PublishDate\",\"SiteName\",\"Classification\",\"Special\",\"NextCrawlDate\"
> from \"crawlq\" " +
> "where \"Special\" = ? " +
> "AND \"Active\" = 'true' " +
> "AND \"TimeoutDate\" <= now() " +
> "AND \"CrawlError\" = 'false' " +
> "OR " +
> "\"Special\" = ? " +
> "AND \"Active\" = 'false' " +
> "AND \"CrawlError\" = 'false' " +
> "order by \"NextCrawlDate\" asc limit 1";
>
> String queryUpdateActive =
> "update \"crawlq\" " +
> "set \"Active\" = 'true', " +
> "\"TimeoutDate\" = now() + interval '5 minutes' " +
> "where \"URL\" = ? " ;
>
> This is what I need the function to do:
>
> I need the PostgreSQL function to first lock the table "crawlq".
> I then need it to perform the "querySelect" query.
> I then need it to perform the "queryUpdateActive" query.
> I then need it to unlock the table.
> I then need it to return the values from the select query to the Java project.
>
> Deliverables: I need the postgres function and a simple java program
> that calls the function and returns a result set. It doesn't need to
> do anything with the data, just call the function and return the value
> to the program. This should be a very simple project that shouldn't
> take more than 15 minutes for anyone familiar with writing postgres
> functions. Would $50 via PayPal be enough to entice anyone to offer a
> solution? I could also offer the payment in LiteCoins if you'd rather
> do it that way. Of course if you're feeling benevolent I wouldn't
> object to anyone who felt like doing it for free.
>
> The Java function I'm using that fetches elements from the queue
> currently is as follows:
>
>     public synchronized FetchType fetch(String cs){
>         if(debug_level == 1)
>             System.out.println(new java.util.Date(System.currentTimeMillis()) +
> " : DAO : fetching element from database");
>
>         /**
>          * prepare the select query
>          * execute it -- pull the items from the queue database
>          * load the query results into a return container
>          * clean up
>          * prepare the update query
>          * execute it -- update the record as active
>          * clean up
>          * commit the transaction
>          * return the query results
>          */
>
>         try {
>             if(!dbq.isValid(10))
>                 connectQ();
>         } catch (SQLException e1) {
>             e1.printStackTrace();
>         }
>
>         PreparedStatement stmt = null;
>         PreparedStatement stmt2 = null;
>         ResultSet rset = null;
>         FetchType ret = null;
>
>         // TODO: use a stored function
>         String     querySelect    =
>                 "select \"URL\",\"PublishDate\",\"SiteName\",\"Classification\",\"Special\",\"NextCrawlDate\"
> from \"crawlq\" "    +
>                 "where \"Special\" = ? "        +
>                 "AND \"Active\" = 'true' "            +
>                 "AND \"TimeoutDate\" <= now() "     +
>                 "AND \"CrawlError\" = 'false' "        +
>                 "OR "                                 +
>                 "\"Special\" = ? "             +
>                 "AND \"Active\" = 'false' "            +
>                 "AND \"CrawlError\" = 'false' "        +
>                 "order by \"NextCrawlDate\" asc limit 1";
>
>         String queryUpdateActive =
>                 "update \"crawlq\" "                 +
>                     "set \"Active\" = 'true', "     +
>                     "\"TimeoutDate\" = now() + interval '5 minutes' "     +
>                 "where \"URL\" = ? "                ;
>
>         try {
>             stmt = dbq.prepareStatement(querySelect);
>             stmt.setEscapeProcessing(true);
>             stmt.setString(1, cs);
>             stmt.setString(2, cs);
>             rset = stmt.executeQuery();
>
>             if(rset.next()){
>                 ret = new FetchType(
>                         rset.getString("URL"),
>                         rset.getString("SiteName"),
>                         rset.getString("Classification"),
>                         rset.getDate("PublishDate"),
>                         rset.getString("Special")
>                         );
>             } else
>                 ret = null;
>
>             rset.close();
>             stmt.close();
>
>             if (ret != null){
>                 stmt2 = dbq.prepareStatement(queryUpdateActive);
>                 stmt2.setEscapeProcessing(true);
>                 stmt2.setString(1, ret.getURL());
>                 stmt2.execute();
>                 stmt2.close();
>                 dbq.commit();
>             }
>
>             if(debug_level == 1)
>                 System.out.println(new java.util.Date(System.currentTimeMillis())
> + " : DAO : fetch complete " + ret.getURL());
>
>             return ret;
>         } catch (SQLException e) {
>             try {
>                 e.printStackTrace();
>                 dbq.rollback();
>                 stmt.close();
>                 stmt2.close();
>                 e.printStackTrace();
>                 return null;
>             } catch (SQLException e2) {
>                 e2.printStackTrace();
>                 return null;
>             }
>         }
>     }
>
> Running on one machine I'm bypassing the transaction concurrency issue
> by synchronizing the method. But the Java concurrency constructs I'm
> using here don't scale to multiple machines.
>
> At any rate, have I provided enough information to get the solution
> I'm looking for? Have I provided enough financial incentive to get
> this implemented? If so, please respond with code here to the list so
> multiple people don't implement it and expect to be paid. I can only
> pay one person, though if another person fixes a bug in a proposed
> solution I'm open to splitting the bounty however seems fair.
>
> Thanks for reading, hope to hear back!
>
> -David Noel
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



pgsql-general by date:

Previous
From: Andreas 'ads' Scherbaum
Date:
Subject: Re: Call for design: PostgreSQL mugs
Next
From: againstdemons84
Date:
Subject: Re: Trouble with replication