Re: Review: query result history in psql - Mailing list pgsql-hackers

From Maciej Gajewski
Subject Re: Review: query result history in psql
Date
Msg-id CAEcSYXJT8SbFuicGJSp5Ujp4-8r+GFhBbDxpYcjs=hdcskHvog@mail.gmail.com
Whole thread Raw
In response to Review: query result history in psql  (ian link <ian@ilink.io>)
Responses Re: Review: query result history in psql
List pgsql-hackers
Thank you for the review!


There were a few english/grammatical mistakes that I went ahead and fixed.

Thank  you for that. If you could send me a patch-to-a-patch so I can correct all the mistakes in the next release?
 
Additionally, I think some of the string manipulation might be placed outside of the main ans.c file. I don't know if there's a better place for 'EscapeForCopy' and 'GetEscapedLen'. Not really a big deal, just an organizational idea. I also changed 'EscapeForCopy' to 'EscapeAndCopy'. I think that better describes the functionality. 'EscapeForCopy' kind of implies that another function is needed to copy the string. 


The 'EscapeForCopy' was meant to mean 'Escape string in a format require by the COPY TEXT format', so 'copy' in the name refers to the escaping format, not the action performed by the function.

They could be, indeed, placed in separate module. I'll do it.
 

What does 'ans' stand for? I am not sure how it relates to the concept of a query history. It didn't stop my understanding of the code, but I don't know if a user will immediately know the meaning.

Some mathematical toolkits, like Matlab or Mathematica, automatically set a variable called 'ans' (short for "answer") containing the result of the last operation. I was trying to emulate exactly this behaviour.

 
Probably the biggest problem is that the query history list is missing a maximum size variable. I think this could be valuable for preventing users from shooting themselves in the foot. If the user is running large queries, they might accidentally store too much data. This probably somewhat of an edge-case but I believe it is worth considering. We could provide a sensible default limit (10 queries?) and also allow the user to change it. 

I was considering such a behaviour. But since the feature is turned off by default, I decided that whoever is using it, is aware of cost. Instead of truncating the history automatically (which could lead to a nasty surprise), I decided to equip the user with \ansclean , a command erasing the history. I believe that it is better to let the user decide when history should be erased, instead of doing it automatically.


Finally, is it worth resetting the query history every time a user reconnects to the database? I can see how this might interrupt a user's workflow. If the user suddenly disconnects (network connection interrupted, etc) then they would lose their history. I think this is definitely up for debate. It would add more management overhead (psql options etc) and might just be unnecessary. However, with a sane limit to the size of the query history, I don't know if there would be too many drawbacks from a storage perspective.

The history is not erased. The history is always stored in the client's memory. When a history item is used for the first time, a TEMPORARY table is created in the database that stores the data server-side. When user disconnects from the database, the session ends and all these tables are dropped.
Tables names have to be removed from the history, so next time the item is used, the table will be created and populated again.

I use the feature while switching often between databases, and it works seamlessly. Actually, it's quite useful to move bits of data across databases:
Connect to database A, run a query, connect to database B, run another query joining local data with the results of the previous query.
 
Those issues aside - I think it's a great feature! I can add the grammatical fixes I made whenever the final patch is ready. Or earlier, whatever works for you. Also, this is my first time reviewing a patch, so please let me know if I can improve on anything. Thanks!

This is  my first submitted patch, so I can't really comment on the process. But if you could add the author's email to CC, the message would be much easier to spot. I replied after two days only because I missed the message in the flood of other pgsql-hacker messages. I think I need to scan the list more carefully...

Maciej

pgsql-hackers by date:

Previous
From: "MauMau"
Date:
Subject: Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Next
From: Amit Kapila
Date:
Subject: Re: Reduce maximum error in tuples estimation after vacuum.