Re: Confusing terminology - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Confusing terminology
Date
Msg-id Pine.LNX.4.30.0201181605390.708-100000@peter.localdomain
In response to Re: Confusing terminology  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Confusing terminology  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Tom Lane writes:

> from the client".  Look at it this way: if Postgres were implemented as
> a monolithic server process, would your documentation still be correct
> and sensible?  If so, say "server".

That was exactly my thought.

> Use the other two terms when you need to distinguish the parts.
> Example:
>
>     After receiving a connection request, the postmaster spawns
>     a backend process to handle that client session.

This is OK, because it's true:  There's a new process and it's at the
backend side of the wire.  (Actually, a session is something that exists
between a client and a server.)  What I don't like is language like "how
many backends are active on this database?" -- It's one: PostgreSQL.  It
would be correct to say "how many (PostgreSQL) backend *processes* are
active...", or maybe just "how many clients are connected to this
database".

> While this is certainly project-specific language, it's useful to people
> who may actually have to look at the code; and if they're reading
> documentation that is talking about the parts of the server in the first
> place, they're not that far away from wanting to look at code.

Right, but there are only specific chapters in the documentation that talk
about this.

> > "tuple" is described in one place as "A tuple is an individual state of a
> > row; each update of a row creates a new tuple for the same logical row."
> > This definition is inconsistent with common usage -- and even the rest of
> > the manual.
>
> Give us "common usage" that distinguishes these two concepts, please.

The libpq API uses tuple to mean row (and field to mean column).  Other
APIs like pgtcl and libpq++ have copied that.  I think that that's more
common usage than xmin and xmax.

> I agree that we've not been consistent, but unless someone lays down
> a clear definition for everyone to follow, it won't get better.

I think it's OK to use tuple == row, and "row state" or "tuple state" when
you're talking about MVCC (which is only rarely done anyway).  A row can
have more than one state at the same time under MVCC, but a row can have
more than one tuple???
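
To make the two readings concrete, here is a minimal sketch of the "a tuple is one state of a row" definition quoted above. It is a toy model, not PostgreSQL code: the `TupleVersion` class, the `row_id` field, and the `update` and `visible` helpers are hypothetical names, and the visibility test assumes every transaction ID at or below the snapshot has committed, which the real rules do not.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    row_id: int                 # hypothetical logical-row identifier
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that superseded it, if any


def update(heap: List[TupleVersion], row_id: int, new_value: str, xid: int) -> None:
    """Supersede the current version of the row and append a new one."""
    for t in heap:
        if t.row_id == row_id and t.xmax is None:
            t.xmax = xid
    heap.append(TupleVersion(row_id, new_value, xmin=xid))


def visible(heap: List[TupleVersion], row_id: int, snapshot_xid: int) -> Optional[TupleVersion]:
    """Return the version seen at snapshot_xid (simplified: no commit status)."""
    for t in heap:
        created = t.xmin <= snapshot_xid
        still_live = t.xmax is None or t.xmax > snapshot_xid
        if t.row_id == row_id and created and still_live:
            return t
    return None


heap = [TupleVersion(row_id=1, value="a", xmin=100)]
update(heap, row_id=1, new_value="b", xid=105)

old_view = visible(heap, row_id=1, snapshot_xid=102)
new_view = visible(heap, row_id=1, snapshot_xid=110)
```

In this model one logical row is stored as two versions, yet each snapshot sees exactly one of them. That is the distinction at issue: whether "tuple" names the stored version or the row the client sees.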

> Maybe it's time for someone to prepare an "official" glossary that sets
> out all these terms carefully, so that people will have something to
> refer to when they're trying to pick a word to use.

Yeah, I think I'd like to set something like this up as part of the
program message style guide that I've talked about recently.

-- 
Peter Eisentraut   peter_e@gmx.net


