Re: change natural column order - Mailing list pgsql-general
From | Greg Stark |
---|---|
Subject | Re: change natural column order |
Date | |
Msg-id | 87mzwy7gyl.fsf@stark.xeocode.com Whole thread Raw |
In response to | Re: change natural column order ("Dann Corbit" <DCorbit@connx.com>) |
Responses |
Re: change natural column order
|
List | pgsql-general |
"Dann Corbit" <DCorbit@connx.com> writes: > Putting "SELECT * FROM <table_name>" into a compiled program using libpq or > ESQL is a code defect. Period. > > ALTER TABLE ADD COLUMN /* Most frequent defect maker for SELECT * */ Whatever are you talking about? I've never tried ESQL precisely because it seems a terribly old-fashioned interface. Does it not support referring to columns by name? Even in libpq you can use PQfnumber or load all the columns found with PQfname into an associative array like higher level language drivers typically do automatically. > DROP TABLE/CREATE TABLE /* New version may have the same name and the > same number of columns, and they may even have the same data types but there is > no guarantee that the meaning is the same. */ Um, well, there was no guarantee the meaning was the intended meaning in the first place except for your own interface documentation. Obviously if you replace the table with a new one you're either maintaining compatible semantics or else you're changing the interface and it will require some code changes. > It is a defect of equal magnitude to assume that columns are returned in any > particular order unless specified in a column list (again, from a program and > not interactively). Actually the spec does guarantee that the columns have a fixed defined ordering. However I would agree it would be a poor design to depend on that ordering since it's not self-documenting and requires close synchronization between far distant pieces of code. But then I think it's bad to depend on ordering even when it is an explicitly listed column list because it requires close synchronization between two pieces of code even if they aren't too far distant. I prefer referring to columns by name in all but the simplest cases because it means the only synchronization is the presence or lack of a column, not the precise position in the output list. I'm free to add columns to a select list in the logical position without having to make adjustments elsewhere in the code. > Another typical defect is to assume that columns come backed ordered by the > primary key if the table is clustered on the primary key column. You can > have a page split with many database systems and so there is no guarantee > that data will be returned in order without an ORDER BY clause -- clustered > or not. You're confusing columns with rows. Without an ORDER BY clause there's no guarantee of the order of the rows. Not because of page splits or any other specific technical reason. There just isn't. There are any number of reasons the database might choose another ordering to return rows. In the case of Postgres clustering doesn't really "work" anyways so *any* sequential table scan without an explicit ORDER BY will be out of order unless you haven't made any data changes since the clustering. > Any of (ASSUME NO COLUMN CHANGES/ASSUME COLUMN ORDER/ASSUME CLUSTERED KEY SORT > ORDER) would cause me to fail code in a code review. > > IMO-YMMV Well with Postgres and reasonably modern drivers none of the above impact at all on whether "SELECT *" is a good idea or not. Like I said in my post. Because of historical problems with various other databases the rule of thumb that "SELECT *" is bad has become entrenched in standard DBA dogma. One of those historical problems is that some interfaces depend on column ordering in result sets and have particular problems dealing when the columns change in any way. As you point out this can happen for lots of reasons, not just because "SELECT *" is used. Thankfully nowadays we get to use much more flexible interfaces that find columns based on column names that don't suffer from these problems. -- greg
pgsql-general by date: