Thread: BUG #18527: Imported data using a csv file and made a latest entry into same table using a form. Got an issue!
BUG #18527: Imported data using a csv file and made a latest entry into same table using a form. Got an issue!
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18527 Logged by: Vishnu Vardhan Vithanala Email address: vishnu.vardhann05@gmail.com PostgreSQL version: 16.3 Operating system: Windows Description: When I bulk uploaded some data using a csv file, it was all good. There were 543 rows in order and it went into the table and sat cleanly! I've connected the same table to an input form. When I hit submit, the latest entry which is supposed to be the last entry with a number id of 544 didn't go to the end of the table (bottom most) but instead sat on top of all previous bulk uploaded entries. Each entry has this kind of standard ID: DCMARIN10042024001 Subsequent entries will have the same ID with last 3 digits incrementing by 1. DCMARIN10042024002 DCMARIN10042024003 ... ... DCMARIN10042024543 These ID's are generated using the below python code (flask): def generate_unique_id(dept, sub_dept, table): ProjectCode = "" # Automatic Start Date capture - Production #start_date = date.today() # Start Date - Testing start_date = datetime.strptime('01-08-2024', '%d-%m-%Y').date() # To check if there is at least 1 entry in the DB or not first_entry = table.query.first() lastRow = str(table.query.order_by(table.id.desc()).first().id) if first_entry else None # Format Date in lastRow from String to Date Object prev_date = datetime.strptime(lastRow[7:15], "%d%m%Y").date() if lastRow is not None else None # If today > latest date that is already existing in DB - For first entry of today if first_entry is None or start_date > prev_date: ProjectCode = f"{dept}{sub_dept}{start_date.strftime('%d%m%Y')}001" # Subsequent entries for the same date elif start_date == prev_date: ProjectCode = f"{dept}{sub_dept}{start_date.strftime('%d%m%Y')}{str(int(lastRow[15:18]) + 1).zfill(3)}" return ProjectCode The difference in ordering of entries between importing data from csv and manually creating a new entry into the table is causing a hindrance in generating proper project codes. When ordering the rows based on project_code in either ascending or descending order, the latest entry which I made using the form is not filtering properly. In short, the order is not being applied to it! This is causing the generation of wrong & repeated project code and dumping into a column that acts as primary key and which by default is supposed to be unique! It's throwing an error that the key already exists! This issue based on my observation is due to the variation of data entry into the table - between bulk upload and manual upload (using a form). When I dumped the same data by using some python code based on pandas library and running it through the csv file and using flask to create and save entries, the issue seems to be resolved! I'm not sure if this is a bug or not but it seemed like a problem that is worth solving!
Re: BUG #18527: Imported data using a csv file and made a latest entry into same table using a form. Got an issue!
From
"David G. Johnston"
Date:
On Mon, Jul 1, 2024 at 8:32 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 18527
Logged by: Vishnu Vardhan Vithanala
Email address: vishnu.vardhann05@gmail.com
PostgreSQL version: 16.3
Operating system: Windows
Description:
When I bulk uploaded some data using a csv file, it was all good. There were
543 rows in order and it went into the table and sat cleanly!
There is no such thing as "in order" when it comes to the contents of a table.
The rest of what you posted I cannot follow but unless you can distill this down to a test case whereby you only use psql and server features it is not going to be recognized as a bug in PostgreSQL.
If you need help writing programs using PostgreSQL the -general mailing list is where to seek it. Right now it seems you have made incorrect assumptions and built on top of those. That will require some work on your end to redesign things to work in a manner consistent with the non-ordering of tables in PostgreSQL. Usually just adding an order by clause to your queries solves the issue.
David J.
Re: BUG #18527: Imported data using a csv file and made a latest entry into same table using a form. Got an issue!
From
Erik Wienhold
Date:
On 2024-07-01 17:08 +0200, PG Bug reporting form wrote: > When I bulk uploaded some data using a csv file, it was all good. There were > 543 rows in order and it went into the table and sat cleanly! > > I've connected the same table to an input form. When I hit submit, the > latest entry which is supposed to be the last entry with a number id of 544 > didn't go to the end of the table (bottom most) but instead sat on top of > all previous bulk uploaded entries. > > Each entry has this kind of standard ID: DCMARIN10042024001 > Subsequent entries will have the same ID with last 3 digits incrementing by > 1. > DCMARIN10042024002 > DCMARIN10042024003 > ... > ... > DCMARIN10042024543 > > These ID's are generated using the below python code (flask): > def generate_unique_id(dept, sub_dept, table): > ProjectCode = "" > > # Automatic Start Date capture - Production > #start_date = date.today() > > # Start Date - Testing > start_date = datetime.strptime('01-08-2024', '%d-%m-%Y').date() > > # To check if there is at least 1 entry in the DB or not > first_entry = table.query.first() > lastRow = str(table.query.order_by(table.id.desc()).first().id) if > first_entry else None Do you expect that lastRow is the one with the latest date encoded in its ID? If so, then date format %d%m%Y won't give you the expected chronological ordering. Use %Y%m%d (big-endian) instead. > # Format Date in lastRow from String to Date Object > prev_date = datetime.strptime(lastRow[7:15], "%d%m%Y").date() if lastRow > is not None else None > > # If today > latest date that is already existing in DB - For first > entry of today > if first_entry is None or start_date > prev_date: > ProjectCode = > f"{dept}{sub_dept}{start_date.strftime('%d%m%Y')}001" > > # Subsequent entries for the same date > elif start_date == prev_date: > ProjectCode = > f"{dept}{sub_dept}{start_date.strftime('%d%m%Y')}{str(int(lastRow[15:18]) + > 1).zfill(3)}" > This will generate duplicate IDs if you generate more than 1000 IDs per day because this uses some weird modular arithmetic with str(int(lastRow[15:18])+1).zfill(3), which will eventually generate 998, 999, 1000, 101, 102, ... The row with number 1000 won't necessarily be the "last row", even with date format %Y%m%d. For example, 101 will come before 1000: test=> select unnest('{DCMARIN202404101000,DCMARIN20240410101}'::text[]) order by 1 desc; unnest --------------------- DCMARIN20240410101 DCMARIN202404101000 (2 rows) > return ProjectCode -- Erik