Re: How to perform a long running dry run transaction without blocking - Mailing list pgsql-general

From Robert Leach
Subject Re: How to perform a long running dry run transaction without blocking
Date
Msg-id 15A999B4-9D53-4F35-84B1-7B8696256EE9@princeton.edu
Whole thread Raw
In response to How to perform a long running dry run transaction without blocking  (Robert Leach <rleach@princeton.edu>)
List pgsql-general
>> Anyway, thanks so much for your help.  This discussion has been very useful, and I think I will proceed at first, exactly how you suggested, by queuing every validation job (using celery).  Then I will explore whether or not I can apply the "on timeout" strategy in a small patch.
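The dry-run-in-a-transaction pattern behind that queued validation can be sketched with stdlib sqlite3 (the real project presumably runs this in a celery worker against PostgreSQL via Django; all names here are illustrative, not from the actual codebase):

```python
import sqlite3

class DryRunRollback(Exception):
    """Raised at the end of a clean validation pass to force a rollback."""

def dry_run(conn, load_func):
    """Run load_func inside a transaction, then roll back unconditionally,
    so a successful validation pass never commits anything."""
    try:
        conn.execute("BEGIN")
        load_func(conn)           # all validation/load work happens here
        raise DryRunRollback      # reached only on success; still roll back
    except DryRunRollback:
        conn.rollback()

# Usage: the insert validates cleanly but is never committed.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions explicitly
conn.execute("CREATE TABLE samples (name TEXT UNIQUE)")
dry_run(conn, lambda c: c.execute("INSERT INTO samples VALUES ('s1')"))
print(conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0])  # 0
```

Constraint violations raised inside load_func propagate to the caller as ordinary exceptions, so the worker can report them back as validation errors.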
>> Incidentally, during our Wednesday meeting this week, we actually opened our public instance to the world for the first time, in preparation for the upcoming publication.  This discussion is about the data submission interface, but that interface is actually disabled on the public-facing instance.  The other part of the codebase that I was primarily responsible for was the advanced search.  Everything else was primarily done by other team members.  If you would like to check it out, let me know what you think: http://tracebase.princeton.edu
>
> I would have to hit the books again to understand all of what is going on here.

It's a mass spec tracing database.  Animals are infused with radiolabeled compounds and mass spec is used to see what the animal's biochemistry turns those compounds into.  (My undergrad was biochem, so I've been resurrecting my biochem knowledge, as needed, for this project.  I've been mostly doing RNA and DNA sequence analysis since undergrad, and most of that was prokaryotic.)

> One quibble with the Download tab, there is no indication of the size of the datasets. I generally like to know what I am getting into before I start a download. Also, is there explicit throttling going on? I am seeing 10.2kb/sec, whereas from here https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page I downloaded a 47.65M file at 41.9MB/s

Thank you!  Not knowing the download size is exactly a complaint I had.  That download actually uses my advanced search interface (in browse mode).  There is the same issue with the download buttons on the advanced search.  With the streaming, we're not dealing with temp files, which is nice, at least for the advanced search, but we can't know the download size that way.  So I had wanted a progress bar to at least show progress (current record out of total).  I could even estimate the size (an option I explored for a few days).  Eventually, I proposed a celery solution for that and I was overruled.
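A streamed export can still report record-count progress even when the byte size is unknowable up front. A minimal stdlib sketch of that idea (in Django, a StreamingHttpResponse could wrap such a generator; all names here are hypothetical):

```python
import csv
import io

def stream_csv(records, total, progress=lambda done, total: None):
    """Yield one CSV line per record.  The total byte size is unknown up
    front, but (records emitted / total records) progress is reported."""
    for done, rec in enumerate(records, start=1):
        buf = io.StringIO()
        csv.writer(buf).writerow(rec)
        progress(done, total)      # e.g. update a progress bar: done/total
        yield buf.getvalue()

# Usage: rows stream out with no temp file; the callback sees (1, 2) then (2, 2).
rows = [("sample1", 1.2), ("sample2", 3.4)]
ticks = []
body = "".join(stream_csv(rows, len(rows), lambda d, t: ticks.append((d, t))))
```

The generator never materializes the full result set, which is the same property that makes the final size unknowable, so record count is the natural progress unit.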

As for the download in the nav bar, we have an issue to change that to a listing of actual files broken down by study (3 files per study).  There's not much actual utility from a user perspective in downloading everything anyway.  We've just been focused on other things.  In fact, we have a request from a user for that specific feature, done in a way that's compatible with curl/scp.  We just have to figure out how to avoid CAS-authenticating each command, something I don't have experience with.

