Thread: directory tree query with big planner variation
Hi group,

this is a directory tree query for a backup system
(http://sourceforge.net/projects/bacula). You provide a path and get back
the names of the children plus a boolean telling if the child itself has
children. The "%@" stands for the initial path:
---------------------------------------------------------------
explain analyze SELECT X.name AS name, COUNT(CH) > 1 AS children
FROM
  ( SELECT RTRIM( REPLACE( NLPC.path, '%@/', ''),'/') AS name,
           FN.name AS CH
    FROM
      ( SELECT P.path,P.pathid
        FROM bacula.path P
        WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
    LEFT OUTER JOIN bacula.file F
      ON NLPC.pathid = F.pathid
    LEFT OUTER JOIN bacula.filename FN
      ON F.filenameid = FN.filenameid
    GROUP BY NLPC.path, FN.name
  UNION
    SELECT FN.name AS name, FN.name AS CH
    FROM bacula.path P, bacula.file F, bacula.filename FN
    WHERE
      P.path = '%@/' AND
      P.pathid = F.pathid AND
      F.filenameid = FN.filenameid
  ) AS X
WHERE X.name <> ''
GROUP BY X.name
---------------------------------------------------------------
The 1st part of the union takes care of directories, the 2nd one of flat
files. The application allows user navigation in a browser (clicking on a
name in one column triggers the query and fills the next browser column).

An initial path of "/Users/axel/Library/Preferences" results in:
---------------------------------------------------------------
Sort  (cost=1295.24..1295.47 rows=92 width=64) (actual time=818.987..819.871 rows=527 loops=1)
  Sort Key: upper(x.name)
  ->  GroupAggregate  (cost=1288.56..1292.24 rows=92 width=64) (actual time=784.069..814.059 rows=527 loops=1)
    ->  Unique  (cost=1288.56..1289.25 rows=92 width=112) (actual time=783.931..809.708 rows=684 loops=1)
      ->  Sort  (cost=1288.56..1288.79 rows=92 width=112) (actual time=783.921..793.150 rows=5350 loops=1)
        Sort Key: name, ch
        ->  Append  (cost=642.03..1285.55 rows=92 width=112) (actual time=335.134..723.917 rows=5350 loops=1)
          ->  Subquery Scan "*SELECT* 1"  (cost=642.03..643.18 rows=46 width=112) (actual time=335.130..338.564 rows=184 loops=1)
            ->  HashAggregate  (cost=642.03..642.72 rows=46 width=112) (actual time=335.121..337.843 rows=184 loops=1)
              ->  Nested Loop Left Join  (cost=2.00..641.80 rows=46 width=112) (actual time=39.293..326.831 rows=1685 loops=1)
                ->  Nested Loop Left Join  (cost=0.00..502.63 rows=46 width=97) (actual time=21.026..202.206 rows=1685 loops=1)
                  ->  Index Scan using path_name_idx on path p  (cost=0.00..3.02 rows=1 width=97) (actual time=15.480..56.935 rows=27 loops=1)
                      Index Cond: ((path >= '/Users/axel/Library/Preferences/'::text) AND (path < '/Users/axel/Library/Preferences0'::text))
                      Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/$'::text) AND (rtrim("replace"(path, '/Users/axel/Library/Preferences/'::text, ''::text), '/'::text) <> ''::text))
                  ->  Index Scan using file_path_idx on file f  (cost=0.00..493.28 rows=506 width=8) (actual time=0.473..5.119 rows=62 loops=27)
                      Index Cond: ("outer".pathid = f.pathid)
                ->  Bitmap Heap Scan on filename fn  (cost=2.00..3.01 rows=1 width=23) (actual time=0.058..0.061 rows=1 loops=1685)
                    Recheck Cond: ("outer".filenameid = fn.filenameid)
                    ->  Bitmap Index Scan on filename_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1685)
                        Index Cond: ("outer".filenameid = fn.filenameid)
          ->  Nested Loop  (cost=2.00..641.91 rows=46 width=19) (actual time=3.349..377.758 rows=5166 loops=1)
            ->  Nested Loop  (cost=0.00..502.62 rows=46 width=4) (actual time=3.118..97.375 rows=5200 loops=1)
              ->  Index Scan using path_name_idx on path p  (cost=0.00..3.01 rows=1 width=4) (actual time=0.045..0.052 rows=1 loops=1)
                  Index Cond: (path = '/Users/axel/Library/Preferences/'::text)
              ->  Index Scan using file_path_idx on file f  (cost=0.00..493.28 rows=506 width=8) (actual time=3.058..76.014 rows=5200 loops=1)
                  Index Cond: ("outer".pathid = f.pathid)
            ->  Bitmap Heap Scan on filename fn  (cost=2.00..3.02 rows=1 width=23) (actual time=0.037..0.039 rows=1 loops=5200)
                Recheck Cond: ("outer".filenameid = fn.filenameid)
                Filter: (name <> ''::text)
                ->  Bitmap Index Scan on filename_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=5200)
                    Index Cond: ("outer".filenameid = fn.filenameid)
Total runtime: 832.458 ms
---------------------------------------------------------------
which is ok, but an initial path of "/Users/axel" results in (which is not ok):
---------------------------------------------------------------
Sort  (cost=125533.67..125534.17 rows=200 width=64) (actual time=84273.963..84274.260 rows=181 loops=1)
  Sort Key: upper(x.name)
  ->  GroupAggregate  (cost=123493.01..125526.03 rows=200 width=64) (actual time=84263.411..84272.427 rows=181 loops=1)
    ->  Unique  (cost=123493.01..124169.51 rows=90201 width=112) (actual time=84263.215..84270.129 rows=522 loops=1)
      ->  Sort  (cost=123493.01..123718.51 rows=90201 width=112) (actual time=84263.208..84265.632 rows=1432 loops=1)
        Sort Key: name, ch
        ->  Append  (cost=113172.83..116069.08 rows=90201 width=112) (actual time=83795.274..84251.660 rows=1432 loops=1)
          ->  Subquery Scan "*SELECT* 1"  (cost=113172.83..115426.71 rows=90155 width=112) (actual time=83795.270..83803.996 rows=410 loops=1)
            ->  HashAggregate  (cost=113172.83..114525.16 rows=90155 width=112) (actual time=83795.258..83802.369 rows=410 loops=1)
              ->  Hash Left Join  (cost=3124.38..112722.06 rows=90155 width=112) (actual time=56254.547..83779.903 rows=3648 loops=1)
                Hash Cond: ("outer".filenameid = "inner".filenameid)
                ->  Merge Left Join  (cost=0.00..106216.87 rows=90155 width=97) (actual time=54926.198..82430.621 rows=3648 loops=1)
                  Merge Cond: ("outer".pathid = "inner".pathid)
                  ->  Index Scan using path_pkey on path p  (cost=0.00..2567.57 rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1)
                      Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim("replace"(path, '/Users/axel/'::text, ''::text), '/'::text) <> ''::text))
                  ->  Index Scan using file_path_idx on file f  (cost=0.00..95191.99 rows=3020363 width=8) (actual time=17.561..74392.318 rows=2941790 loops=1)
                ->  Hash  (cost=2723.30..2723.30 rows=160430 width=23) (actual time=1299.103..1299.103 rows=160430 loops=1)
                  ->  Seq Scan on filename fn  (cost=0.00..2723.30 rows=160430 width=23) (actual time=3.884..684.918 rows=160430 loops=1)
          ->  Nested Loop  (cost=2.00..641.91 rows=46 width=19) (actual time=93.252..442.196 rows=1022 loops=1)
            ->  Nested Loop  (cost=0.00..502.62 rows=46 width=4) (actual time=49.375..209.694 rows=1050 loops=1)
              ->  Index Scan using path_name_idx on path p  (cost=0.00..3.01 rows=1 width=4) (actual time=29.455..29.462 rows=1 loops=1)
                  Index Cond: (path = '/Users/axel/'::text)
              ->  Index Scan using file_path_idx on file f  (cost=0.00..493.28 rows=506 width=8) (actual time=19.898..175.869 rows=1050 loops=1)
                  Index Cond: ("outer".pathid = f.pathid)
            ->  Bitmap Heap Scan on filename fn  (cost=2.00..3.02 rows=1 width=23) (actual time=0.206..0.208 rows=1 loops=1050)
                Recheck Cond: ("outer".filenameid = fn.filenameid)
                Filter: (name <> ''::text)
                ->  Bitmap Index Scan on filename_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.087..0.087 rows=1 loops=1050)
                    Index Cond: ("outer".filenameid = fn.filenameid)
Total runtime: 84295.927 ms
---------------------------------------------------------------
It happened once that the planner resolved the 2nd query with the 1st
plan, but this is not reproducible. How can I avoid the 2nd plan?

This is 8.1.4 on OpenBSD 3.9 with 2x1GHz PIII and 2GB.

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
On Mon, Jul 31, 2006 at 12:48:11PM +0200, Axel Rau wrote:
>        WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC

This can't be indexed. You might try something like

  WHERE P.path LIKE '%@%' AND P.path ~ '^%@/[^/]*/$'

The schema could be a lot more intelligent here. (E.g., store path
separately from file/directory name, store type (file or directory)
separately, etc.) Without improving the schema I don't think this will
ever be a speed demon.

Mike Stone
Am 31.07.2006 um 13:15 schrieb Michael Stone:

> On Mon, Jul 31, 2006 at 12:48:11PM +0200, Axel Rau wrote:
>> WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
>
> This can't be indexed. You might try something like WHERE P.path
> LIKE '%@%' AND P.path ~ '^%@/[^/]*/$'

Why does it do quite well in this case:
---------------------------------------
-> Index Scan using path_name_idx on path p (cost=0.00..3.02 rows=1 width=97) (actual time=15.480..56.935 rows=27 loops=1)
   Index Cond: ((path >= '/Users/axel/Library/Preferences/'::text) AND (path < '/Users/axel/Library/Preferences0'::text))
   Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/$'::text) AND (rtrim("replace"(path, '/Users/axel/Library/Preferences/'::text, ''::text), '/'::text) <> ''::text))
---------------------------------------
as compared to this case (ignoring the index on path):
---------------------------------------
-> Index Scan using path_pkey on path p (cost=0.00..2567.57 rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1)
   Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim("replace"(path, '/Users/axel/'::text, ''::text), '/'::text) <> ''::text))
---------------------------------------
? With all longer path names, I get the above (good) result.
Should I put the rtrim/replace on the client side?

> The schema could be a lot more intelligent here. (E.g., store path
> separately from file/directory name, store type (file or directory)
> separately, etc.) Without improving the schema I don't think this
> will ever be a speed demon.

PATH holds complete pathnames of directories, FILENAME holds filenames
and pathname components. Currently the schema is the lowest common
denominator between SQLite, MySQL and pg, and the bacula people will
stay with that (-;).

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
On Mon, Jul 31, 2006 at 01:54:24PM +0200, Axel Rau wrote:
>Am 31.07.2006 um 13:15 schrieb Michael Stone:
>>On Mon, Jul 31, 2006 at 12:48:11PM +0200, Axel Rau wrote:
>>> WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
>>
>>This can't be indexed. You might try something like WHERE P.path
>>LIKE '%@%' AND P.path ~ '^%@/[^/]*/$'

Ignore that, I wasn't awake yet.

>Why does it do quite well in this case:
>---------------------------------------
>-> Index Scan using path_name_idx on path p (cost=0.00..3.02 rows=1
>width=97) (actual time=15.480..56.935 rows=27 loops=1)
>   Index Cond: ((path >= '/Users/axel/Library/Preferences/'::text)
>   AND (path < '/Users/axel/Library/Preferences0'::text))
>   Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/$'::text)
>   AND (rtrim("replace"(path, '/Users/axel/Library/Preferences/'::text,
>   ''::text), '/'::text) <> ''::text))
>---------------------------------------
>as compared to this case (ignoring the index on path):
>---------------------------------------
>-> Index Scan using path_pkey on path p (cost=0.00..2567.57
>rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1)
>   Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND
>   (rtrim("replace"(path, '/Users/axel/'::text, ''::text),
>   '/'::text) <> ''::text))
>---------------------------------------
>? With all longer path names, I get the above (good) result.
>Should I put the rtrim/replace on the client side?

That's not the slow part. The slow part is retrieving every single file
for each of the subdirectories in order to determine whether there are
any files in the subdirectories.

>>The schema could be a lot more intelligent here. (E.g., store path
>>separately from file/directory name, store type (file or directory)
>>separately, etc.) Without improving the schema I don't think this
>>will ever be a speed demon.

>PATH holds complete pathnames of directories, FILENAME holds
>filenames and pathname components.
>Currently the schema is the lowest common denominator between SQLite,
>MySQL and pg and the bacula people will stay with that (-;).

Nothing I suggested raises the bar for the "lowest common denominator".
If I understand the intent of this SQL, you're pulling all the entries
in a directory in two parts. The first part (files) is fairly
straightforward. The second part (directories) consists of pulling any
file whose parent is a subdirectory of the directory you're looking for
(this is *all* children of the directory, since you have to retrieve
every element that begins with the directory, then discard those that
have an additional / in their name), counting how many of these there
are for each subdirectory, and discarding those results except for a
binary (yes there are children or no there aren't). This is a lot of
useless work to go through, and is going to be slow if you've got a lot
of stuff in a subdirectory. An alternative approach would be, for each
directory, to store all its children (files and subdirectories) along
with a flag indicating which it is. This would allow you to create the
collapsed tree view without walking all the children of a subdirectory.
(See the sketch after this message.)

Assuming you can't make changes to the schema, what about the query?
You've got this:

explain analyze SELECT X.name AS name, COUNT(CH) > 1 AS children
FROM
  ( SELECT RTRIM( REPLACE( NLPC.path, '%@/', ''),'/') AS name,
           FN.name AS CH
    FROM
      ( SELECT P.path,P.pathid
        FROM bacula.path P
        WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
    LEFT OUTER JOIN bacula.file F
      ON NLPC.pathid = F.pathid
    LEFT OUTER JOIN bacula.filename FN
      ON F.filenameid = FN.filenameid
    GROUP BY NLPC.path, FN.name
  UNION
    SELECT FN.name AS name, FN.name AS CH
    FROM bacula.path P, bacula.file F, bacula.filename FN
    WHERE
      P.path = '%@/' AND
      P.pathid = F.pathid AND
      F.filenameid = FN.filenameid
  ) AS X
WHERE X.name <> ''
GROUP BY X.name

I'm only looking at the first part, which reduces to

SELECT X.name AS name, COUNT(CH) > 1 AS children FROM
  SELECT NLPC.path AS name, FN.name as CH
  FROM ( SELECT P.path,P.pathid FROM bacula.path AS NLPC
  LEFT OUTER JOIN bacula.file F ON NLPC.pathid=F.pathid
  LEFT OUTER JOIN bacula.filename FN ON F.filenameid=FN.filenameid
  GROUP BY NLPC.path,FN.name

Why is the filename table even accessed? Would the results be the same
if you did
  SELECT NLPC.path AS name, F.fileid AS CH
and drop the LEFT OUTER JOIN bacula.filename altogether?

And then what happens if you try something like

SELECT X.name,X.children
FROM
  (SELECT [rtrim]P.path,
          (SELECT count(*) FROM bacula.file F
           WHERE F.pathid = P.pathid
           LIMIT 2) > 1
   FROM bacula.path P
   WHERE P.path ~ '^%@/[^/]*/$'
   UNION
   SELECT FN.name,0
   FROM bacula.path P, bacula.file F, bacula.filename FN
   WHERE
     P.path = '%@/' AND
     P.pathid = F.pathid AND
     F.filenameid = FN.filenameid
  ) AS X
WHERE X.name <> ''
GROUP BY X.name

It's hard to say without knowing what's actually *in* the tables, but
the existing query definitely doesn't scale well for what I think it's
trying to do.

Mike Stone
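A minimal sketch of the per-directory children table Mike describes above.
The table and column names here are hypothetical illustrations, not part
of the existing bacula schema:
---------------------------------------------------------------
-- Hypothetical helper table: one row per directory entry, with the
-- type flag Mike mentions.
CREATE TABLE dir_entry (
    parent_pathid integer NOT NULL,  -- directory containing this entry
    name          text    NOT NULL,  -- file or subdirectory name
    is_dir        boolean NOT NULL,  -- true for subdirectories
    pathid        integer,           -- the entry's own pathid, if a directory
    PRIMARY KEY (parent_pathid, name)
);

-- One browser column then needs only an index lookup plus a single
-- EXISTS probe per subdirectory, instead of walking all grandchildren:
SELECT e.name,
       e.is_dir AND EXISTS (SELECT 1 FROM dir_entry c
                            WHERE c.parent_pathid = e.pathid) AS children
FROM dir_entry e
WHERE e.parent_pathid = 42;  -- hypothetical pathid of '%@/'
---------------------------------------------------------------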
It seems like you might be able to avoid the expensive directory lookups
entirely without changing the schema by defining an immutable function
dir_depth(path), which would just count the number of forward slashes.
Then create a functional index on dir_depth(path) and in the query do a
check for directories with a given prefix and the expected dir_depth.
In 8.1 and later, this kind of query might be able to use a bitmap index
combining thingamajigger (the actual name escapes me right now).

This is just a hunch and I haven't looked too carefully at the
schema/query requirements to see if it's feasible, but seems like a
reasonable approach.

-- Mark Lewis

On Mon, 2006-07-31 at 09:30 -0400, Michael Stone wrote:
> That's not the slow part. The slow part is retrieving every single file
> for each of the subdirectories in order to determine whether there are
> any files in the subdirectories.
[...]
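A sketch of what Mark's suggestion might look like. The dir_depth function
and the index name are hypothetical, and whether the planner actually
combines the two indexes would need testing:
---------------------------------------------------------------
-- Hypothetical immutable helper: counts the '/' characters in a path.
CREATE FUNCTION dir_depth(text) RETURNS integer AS
  'SELECT length($1) - length(replace($1, ''/'', ''''))'
LANGUAGE sql IMMUTABLE;

-- Functional (expression) index on the depth.
CREATE INDEX path_depth_idx ON bacula.path (dir_depth(path));

-- Immediate subdirectories of '/Users/axel/' are exactly one level
-- deeper than the parent, so the regex is no longer needed:
SELECT P.path, P.pathid
FROM bacula.path P
WHERE P.path LIKE '/Users/axel/%'
  AND dir_depth(P.path) = dir_depth('/Users/axel/') + 1;
---------------------------------------------------------------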
Am 31.07.2006 um 15:30 schrieb Michael Stone:

> If I understand the intent of this SQL,

Let me show the tables first:

Table "bacula.path"  (65031 rows)
 Column | Type    | Modifiers
--------+---------+--------------------------------------------------------
 pathid | integer | not null default nextval('path_pathid_seq'::regclass)
 path   | text    | not null   (complete pathnames of all directories)
Indexes:
    "path_pkey" PRIMARY KEY, btree (pathid)
    "path_name_idx" btree (path)

Table "bacula.file"  (3021903 rows)
 Column     | Type    | Modifiers
------------+---------+-------------------------------------------------------
 fileid     | integer | not null default nextval('file_fileid_seq'::regclass)
 fileindex  | integer | not null default 0
 jobid      | integer | not null
 pathid     | integer | not null (FK)
 filenameid | integer | not null (FK)
 markid     | integer | not null default 0
 lstat      | text    | not null
 md5        | text    | not null
Indexes:
    "file_pkey" PRIMARY KEY, btree (fileid)
    "file_fp_idx" btree (filenameid, pathid)
    "file_jobid_idx" btree (jobid)
    "file_path_idx" btree (pathid)

Table "bacula.filename"  (160559 rows)
 Column     | Type    | Modifiers
------------+---------+----------------------------------------------------------------
 filenameid | integer | not null default nextval('filename_filenameid_seq'::regclass)
 name       | text    | not null
Indexes:
    "filename_pkey" PRIMARY KEY, btree (filenameid)
    "filename_name_idx" btree (name)

And now the query. Task: Return the names of subdirectories and files
immediately below a given path. For each non-empty subdirectory return
children=true.

The 1st part of the union selects all subdirectories (per regex) and the
flatfiles contained in them, plus one entry for the subdirectory itself
(left outer joins). More than one joined filename means: "The
subdirectory has children". The 2nd part of the union returns all
flatfiles contained in the given path. The surrounding SELECT removes
the given path and the trailing "/", keeping only the subdirectory names
from the pathnames, so they can be merged with the flatfile names.

> you're pulling all the entries in a directory in two parts. The
> first (second) part (files) is fairly straightforward. The second
> (first) part (directories) consists of pulling any file whose parent
> is a subdirectory of the directory you're looking for (this is *all*
> children of the directory, since you have to retrieve every element
> that begins with the directory, then discard those that have an
> additional / in their name), counting how many of these there are
> for each subdirectory, and discarding those results except for a
> binary (yes there are children or no there aren't). This is a lot
> of useless work to go through, and is going to be slow if you've
> got a lot of stuff in a subdirectory.

I agree, but did not yet find another way.

> An alternative approach would be, for each directory, to store all
> its children (files and subdirectories) along with a flag indicating
> which it is. This would allow you to create the collapsed tree view
> without walking all the children of a subdirectory.

Perhaps in a temporary table?

> Assuming you can't make changes to the schema, what about the query?

Can be changed.

> You've got this:

Please reconsider your proposals with the above.

> It's hard to say without knowing what's actually *in* the tables,
> but the existing query definitely doesn't scale well for what I
> think it's trying to do.
>
> Mike Stone

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
On Mon, Jul 31, 2006 at 05:06:00PM +0200, Axel Rau wrote:
>Please reconsider your proposals with the above

I'm not sure what you're getting at; could you be more specific?

Mike Stone
Am 31.07.2006 um 17:21 schrieb Michael Stone:

> On Mon, Jul 31, 2006 at 05:06:00PM +0200, Axel Rau wrote:
>> Please reconsider your proposals with the above
>
> I'm not sure what you're getting at; could you be more specific?

Let's see...

Am 31.07.2006 um 15:30 schrieb Michael Stone:

> And then what happens if you try something like SELECT
> X.name,X.children
> FROM (SELECT [rtrim]P.path,(SELECT count(*) FROM bacula.file F

The file table is the biggest one, because it contains one row per
backup job and file (see my column description). You need the filename
table here.

> WHERE F.pathid = P.pathid
> LIMIT 2) > 1
> FROM bacula.path P
> WHERE P.path ~ '^%@/[^/]*/$'
> UNION
> SELECT FN.name,0
> FROM bacula.path P, bacula.file F, bacula.filename FN
> WHERE
>   P.path = '%@/' AND
>   P.pathid = F.pathid AND
>   F.filenameid = FN.filenameid
> ) AS X
> WHERE X.name <> ''
> GROUP BY X.name

Tweaking your query and omitting the RTRIM/REPLACE stuff, I get:
-------------------------------
SELECT X.path,X.children
FROM (SELECT P.path,
             (SELECT count(*)
              FROM bacula.file F, bacula.filename FN
              WHERE F.pathid = P.pathid
                AND F.filenameid = FN.filenameid
              LIMIT 2) > 1 AS children
      FROM bacula.path P
      WHERE P.path ~ '^/Users/axel/ports/[^/]*/$'
      UNION
      SELECT FN.name, 0=1
      FROM bacula.path P, bacula.file F, bacula.filename FN
      WHERE P.path = '/Users/axel/ports/' AND
            P.pathid = F.pathid AND
            F.filenameid = FN.filenameid
     ) AS X
WHERE X.path <> ''
GROUP BY X.path, X.children;

             path             | children
------------------------------+----------
 .cvsignore                   | f
 /Users/axel/ports/CVS/       | t
 /Users/axel/ports/archivers/ | t
 INDEX                        | f
 Makefile                     | f
 README                       | f
(6 rows)

Time: 35.221 ms
-------------------------------
While my version returns:
-------------------------------
    name    | children
------------+----------
 .cvsignore | f
 archivers  | t
 CVS        | t
 INDEX      | f
 Makefile   | f
 README     | f
(6 rows)

Time: 30.263 ms
-------------------------------
How would you complete your version?

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
Am 31.07.2006 um 17:54 schrieb Axel Rau:

> Tweaking your query and omitting the RTRIM/REPLACE stuff, I get:

My example did not cover the case of empty subdirectories, in which
case your simplified query fails:
-------------------------------
             path             | children
------------------------------+----------
 .DS_Store                    | f
 /Users/axel/Projects/ADMIN/  | t
 /Users/axel/Projects/DB/     | t
 /Users/axel/Projects/HW/     | t
 /Users/axel/Projects/JETSEC/ | t
 /Users/axel/Projects/MISC/   | t
 /Users/axel/Projects/NET/    | t
 /Users/axel/Projects/SW/     | t
 /Users/axel/Projects/TOOLS/  | t
(9 rows)
-------------------------------
Where it should be:
-------------------------------
   name    | children
-----------+----------
 .DS_Store | f
 ADMIN     | t
 DB        | t
 HW        | f
 JETSEC    | f
 MISC      | f
 NET       | t
 SW        | t
 TOOLS     | t
(9 rows)
-------------------------------

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
Am 31.07.2006 um 15:53 schrieb Mark Lewis:

> It seems like you might be able to avoid the expensive directory
> lookups entirely without changing the schema by defining an immutable
> function dir_depth(path), which would just count the number of
> forward slashes. Then create a functional index on dir_depth(path)
> and in the query do a check for directories with a given prefix and
> the expected dir_depth.

Still I must check for flatfiles in those subdirectories...
See my clarification here:
http://archives.postgresql.org/pgsql-performance/2006-07/msg00311.php

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
On Mon, Jul 31, 2006 at 05:54:41PM +0200, Axel Rau wrote:
>The file table is the biggest one, because it contains one row per
>backup job and file (see my column description).

I never saw a column description--that would certainly help. :) I saw a
schema, but not an explanation of what the elements do.

From what I can understand of what you're saying, it is sounding as
though the bacula.file table contains an entry for the subdirectory
itself as well as entries for each file in the subdirectory? And the
reason you need to join back to the filename table is that there may be
multiple copies of the filename from multiple backups? Does the
subdirectory itself have an entry in the filename table? What is the
content of the lstat column; can it be used to distinguish a file from
a directory? Similarly for the md5 column--what would it contain for a
directory?

Mike Stone
Am 31.07.2006 um 19:08 schrieb Michael Stone:

> I never saw a column description--that would certainly help. :) I saw
> a schema, but not an explanation of what the elements do. From what I
> can understand of what you're saying, it is sounding as though the
> bacula.file table contains an entry for the subdirectory itself as
> well as entries for each file in the subdirectory?

It is the junction relation between path and filename and job, and
describes
1. which files (identified by bacula.filename) are in a directory
   (identified by bacula.path), and
2. for each of those files, a snapshot with its characteristics (lstat
   [base64 encoded], md5 checksum and a backup job [via jobid], which
   itself has backup time etc.).

> And the reason you need to join back to the filename table is that
> there may be multiple copies of the filename from multiple backups?

One entry per backup (job) for each bacula.path/bacula.filename pair in
bacula.file.

> Does the subdirectory itself have an entry in the filename table?

Yes. Directories reference an entry containing '' in
bacula.filename.name.

> What is the content of the lstat column

File status info -- see stat(2).

> ; can it be used to distinguish a file from a directory?

Yes, the S_IFDIR bit identifies directories, but the whole lstat column
is base64 encoded.

> Similarly for the md5 column--what would it contain for a directory?

It seems to contain 0.

Axel
Axel Rau, ☀Frankfurt, Germany  +49-69-951418-0
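Given the '' convention just described (each directory owns one ''
filename row per backup job), a hedged sketch of a children test that is
not fooled by those placeholder rows. It reproduces the COUNT(CH) > 1
semantics of the original query, but has not been run against the real
data:
-------------------------------
-- Sketch only: children = "directory contains at least one real file".
-- The FN.name <> '' test skips the directory's own placeholder rows,
-- so multiple backup jobs no longer produce false positives (the flaw
-- shown in the empty-subdirectory example above).
SELECT P.path,
       EXISTS (SELECT 1
               FROM bacula.file F
               JOIN bacula.filename FN ON FN.filenameid = F.filenameid
               WHERE F.pathid = P.pathid
                 AND FN.name <> '') AS children
FROM bacula.path P
WHERE P.path ~ '^/Users/axel/Projects/[^/]*/$';
-------------------------------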
I am looking for some general guidelines on what is the performance overhead of enabling point-in-time recovery (archive_command config) on an 8.1 database. Obviously it will depend on a multitude of factors, but some broad-brush statements and/or anecdotal evidence will suffice. Should one worry about its performance implications? Also, what can one do to mitigate it? Thanks, George
In response to "George Pavlov" <gpavlov@mynewplace.com>: > I am looking for some general guidelines on what is the performance > overhead of enabling point-in-time recovery (archive_command config) on > an 8.1 database. Obviously it will depend on a multitude of factors, but > some broad-brush statements and/or anecdotal evidence will suffice. > Should one worry about its performance implications? Also, what can one > do to mitigate it? Prior to implementing PITR, I did some testing to see what kind of overhead it would add. It was negligible. I don't remember the details, but I seem to remember the performance hit was barely measurable. Note that in our usage scenarios, we have very little IO compared to CPU usage. The result is that our DB servers have plenty of disk bandwidth to spare. Since the log backup occurs as a background process, it made almost no difference in our tests. If your DB is very IO intensive, you may have different results. -- Bill Moran Collaborative Fusion Inc. **************************************************************** IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. ****************************************************************
On 8/1/06, George Pavlov <gpavlov@mynewplace.com> wrote:
> I am looking for some general guidelines on what is the performance
> overhead of enabling point-in-time recovery (archive_command config)
> on an 8.1 database. Obviously it will depend on a multitude of
> factors, but some broad-brush statements and/or anecdotal evidence
> will suffice. Should one worry about its performance implications?
> Also, what can one do to mitigate it?

pitr is extremely cheap both in performance drag and administration
overhead for the benefits it provides. it comes almost for free, just
make sure you can handle all the wal files and do sane backup
scheduling. in fact, pitr can actually reduce the load on a server due
to running less frequent backups. if your server is heavily i/o loaded,
it might take a bit of planning.

merlin
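For reference, on 8.1 WAL archiving is switched on with a single
postgresql.conf setting; a minimal sketch, with the destination
directory as a placeholder:
-------------------------------
# %p expands to the path of the WAL segment to archive, %f to its file
# name. The archive directory is a placeholder; a production command
# should also refuse to overwrite a file that already exists.
archive_command = 'cp %p /mnt/server/archivedir/%f'
-------------------------------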
If your server is heavily I/O bound AND you care about your data AND
you are throwing out your WAL files in the middle of the day... You are
headed for a cliff.
I'm sure this doesn't apply to anyone on this thread, just a general reminder to all you DBA's out there who sometimes are too busy to implement PITR until after a disaster strikes. I know that in the past I've personally been guilty of this on several occasions.
--Denis
EnterpriseDB (yeah, rah, rah...)
On 8/1/06, Merlin Moncure <mmoncure@gmail.com> wrote:
> pitr is extremely cheap both in performance drag and administration
> overhead for the benefits it provides. it comes almost for free, just
> make sure you can handle all the wal files and do sane backup
> scheduling.
[...]