Thread: Some questions about PostgreSQL source code

Some questions about PostgreSQL source code

From

Олег Царев

Date:

06 May 2009, 11:55:02

Hello all!
I need help studying the internal structures of PostgreSQL. Sorry for my bad English.
I have tried to get information from the source code and spent five days on it, but now I have many questions and little understanding =(
The source code is clear and well commented, but studying a system as complex as a DBMS from source code alone is very hard.

How does PostgreSQL process a query?
I found a description at
http://anoncvs.postgresql.org/cvsweb.cgi/~checkout~/pgsql/src/tools/backend/index.html
Nevertheless, I have questions.

The parser translates the query text into an AST.

1) Then the AST goes to the planner for plan normalization and optimization.
Does the planner work on the AST structures, or does it build its own internal tree for the logical plan?

2) Who sets the types of the columns? The parser or the planner?

After the planner comes the physical plan, the executor.

1) Where in the source are the executor's nodes built from the logical plan (the result of the planner)?

2) How are the executor's nodes built and linked, and how do they use one another? For example, how are Table Scan and Sort linked for the query
select a,b,c,d from table order by a,b? Let's assume the query works without indexes, for simplicity.

3) Which functions are called on Prepare/Execute? How are these calls translated to the executor's nodes?

I tried to look for this information in the source code and found execAmi.c, with a big switch.
That switch mixes branches for nodes, node states, some expressions, and aggregation.
What is the meaning of that switch in execAmi.c? How do Prepare/Execute/Fetch work with the executor's nodes?

4) How is data manipulated in the nodes? I understand from the comments that every node uses its own children to get a "tuple", where a "tuple" is a list of "cells". I didn't find "cells" in the source code =(
Can you describe how one node gets data from its source node and returns data to its parent, what "data" is, and where I can find this entity in the source code?

For a start, these questions are very important to me.
Thank you.

Re: Some questions about PostgreSQL source code

From

Tom Lane

Date:

06 May 2009, 12:34:58

Олег Царев <zabivator@gmail.com> writes:
> I need help studying the internal structures of PostgreSQL. Sorry for my bad
> English.
> I have tried to get information from the source code and spent five days on
> it, but now I have many questions and little understanding =(
> The source code is clear and well commented, but studying a system as complex
> as a DBMS from source code alone is very hard.

Have you read
http://developer.postgresql.org/pgdocs/postgres/overview.html
?  Also, many of the backend modules have README files that are
worth looking at.

> 1) Then the AST goes to the planner for plan normalization and optimization.
> Does the planner work on the AST structures, or does it build its own internal
> tree for the logical plan?

Well, both.  The input is a query tree and the output is a plan tree.

> 2) Who sets the types of the columns? The parser or the planner?

The parse analysis phase determines all data types.  In principle the
semantics of the query are fully specified by the query tree.

> 1) Where in the source are the executor's nodes built from the logical plan
> (the result of the planner)?

The planner builds the plan tree (see createplan.c).  There's also
a "plan state" tree that's built during ExecutorStart to hold run-time
variables for each plan node.  This is needed because the plan tree is
read-only as far as the executor is concerned.

> I tried to look for this information in the source code and found execAmi.c,
> with a big switch.
> That switch mixes branches for nodes, node states, some expressions, and
> aggregation.

Uh, no, execAmi just works with planstate trees (I think there's one
function in it that works with plan trees).
        regards, tom lane

Re: Some questions about PostgreSQL source code

From

Heikki Linnakangas

Date:

06 May 2009, 12:43:29

Олег Царев wrote:
> The parser translates the query text into an AST.
>
> 1) Then the AST goes to the planner for plan normalization and optimization.
> Does the planner work on the AST structures, or does it build its own internal
> tree for the logical plan?

The planner works with different structures in different phases of 
planning. Some transformations are made directly to the Query-tree, 
which is the format that the parser outputs. In intermediate phases, 
various other structures are built, e.g. Path-trees. The final result of 
the planner is a Plan-tree.

> 2) Who sets the types of the columns? The parser or the planner?

That's done in the so-called "parse analysis" phase. The entry point for 
that is the parse_analyze() function.

> After the planner comes the physical plan, the executor.
>
> 1) Where in the source are the executor's nodes built from the logical plan
> (the result of the planner)?

InitPlan().

> 2) How are the executor's nodes built and linked, and how do they use one
> another? For example, how are Table Scan and Sort linked for the query
> select a,b,c,d from table order by a,b? Let's assume the query works without
> indexes, for simplicity.

The structure used by the executor is a tree of PlanState nodes (which 
reflects the planner's Plan-tree). See PlanState struct in execnodes.h. 
Each executor node (= PlanState) has pointers to the nodes below it, 
usually in the lefttree and righttree fields, although some node types 
like AppendState use a different method (the AppendState.appendplans array).

> 3) Which functions are called on Prepare/Execute? How are these calls
> translated to the executor's nodes?
>
> I tried to look for this information in the source code and found execAmi.c,
> with a big switch.
> That switch mixes branches for nodes, node states, some expressions, and
> aggregation.
> What is the meaning of that switch in execAmi.c? How do Prepare/Execute/Fetch
> work with the executor's nodes?

That's used for internal parameters in the executor, not for 
prepare/execute. They're used for things like correlated subqueries, 
where the subquery is run repeatedly with different values in the 
enclosing query.

For prepare/execute, the executor is initialized, run, and shut down for 
each execution. The Plan tree that came from the planner is reused, but 
the corresponding executor tree (PlanState-tree) is recreated at each 
execution.

> 4) How is data manipulated in the nodes? I understand from the comments that
> every node uses its own children to get a "tuple", where a "tuple" is a list
> of "cells". I didn't find "cells" in the source code =(
> Can you describe how one node gets data from its source node and returns data
> to its parent, what "data" is, and where I can find this entity in the source
> code?

This question I didn't quite understand. The basic mechanism is that the 
top node of the executor tree is executed, and that asks for a tuple 
from the node(s) below it as needed (by calling ExecProcNode()), which 
in turn ask for tuples from their child nodes and so forth. IOW it's a 
"pull" system, where the top node pulls the tuples through the tree.

The intermediate tuples are stored in so-called tuple table slots.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com