Thread: PG planning randomly?

From: "Laurent Raufaste"

Hi,

I'm having some issues with this simple query:

SELECT
  _comment.*,
  _article.title AS article_title,
  _article.reference AS article_reference
FROM
  _comment
  INNER JOIN _article
    ON _article.id = _comment.parent_id
WHERE
  _comment.path <@ '0.1.3557034'
ORDER BY
  _comment.date_publishing DESC
OFFSET 0
LIMIT 5
;

The varying information here is the ltree path "0.1.3557034".

Usually it's quite fast (<1s), but sometimes, after an ANALYZE on the
_comment table, it gets so slow that it's killing our servers. And it's
really random.
We run our servers with default_statistics_target=100; I tried raising
it to 1000 (the max) but it does not change this wrong behavior.
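For reference, this is roughly what I ran on each server to raise the
target (the "after" plans below were captured right after this):

SET default_statistics_target = 1000 ;
ANALYZE _comment ;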

I executed the same query on our 11 servers; 3 of them executed the
query slowly after the ANALYZE. Sometimes it happens on more servers,
sometimes on fewer.
Here is the EXPLAIN ANALYZE output on those 3 servers, before and
after the ANALYZE execution.

===== Server 1 =( =====

===== The Query on server 1 before an ANALYZE =====
 Limit  (cost=16286.04..16286.06 rows=5 width=567) (actual time=62.521..62.526 rows=5 loops=1)
   ->  Sort  (cost=16286.04..16289.89 rows=1539 width=567) (actual time=62.519..62.520 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..16204.57 rows=1539 width=567) (actual time=2.063..44.517 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment  (cost=0.00..4736.73 rows=1539 width=534) (actual time=2.038..20.487 rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..7.44 rows=1 width=41) (actual time=0.004..0.004 rows=1 loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 64.844 ms

===== The Query on server 1 after an ANALYZE =====
 Limit  (cost=0.00..11082.13 rows=5 width=569) (actual time=313945.051..693805.921 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..34057601.77 rows=15366 width=569) (actual time=313945.049..693805.912 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on _comment  (cost=0.00..33949736.04 rows=15366 width=536) (actual time=313923.129..693755.772 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..7.01 rows=1 width=41) (actual time=10.016..10.018 rows=1 loops=5)
               Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 693806.044 ms

===== Poor Server 2 ='( =====

===== The Query on server 2 before an ANALYZE =====
 Limit  (cost=21096.49..21096.51 rows=5 width=586) (actual time=34.184..34.187 rows=5 loops=1)
   ->  Sort  (cost=21096.49..21100.33 rows=1535 width=586) (actual time=34.182..34.184 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..21015.26 rows=1535 width=586) (actual time=0.119..25.232 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment  (cost=0.00..6325.53 rows=1535 width=553) (actual time=0.100..11.066 rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.56 rows=1 width=41) (actual time=0.002..0.003 rows=1 loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 34.658 ms

===== The Query on server 2 after an ANALYZE =====
 Limit  (cost=0.00..18806.13 rows=5 width=585) (actual time=363344.748..575823.722 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..57764897.33 rows=15358 width=585) (actual time=363344.747..575823.715 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on _comment  (cost=0.00..57618270.03 rows=15358 width=552) (actual time=363344.681..575823.502 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.53 rows=1 width=41) (actual time=0.036..0.036 rows=1 loops=5)
               Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 575823.796 ms

===== Poor Server 3 ='(((( =====

===== The Query on server 3 before an ANALYZE =====
 Limit  (cost=20563.80..20563.81 rows=5 width=585) (actual time=31.424..31.428 rows=5 loops=1)
   ->  Sort  (cost=20563.80..20567.64 rows=1539 width=585) (actual time=31.423..31.424 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..20482.32 rows=1539 width=585) (actual time=1.198..22.912 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment  (cost=0.00..6341.85 rows=1539 width=552) (actual time=1.160..9.641 rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.18 rows=1 width=41) (actual time=0.002..0.003 rows=1 loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 31.850 ms

===== The Query on server 3 after an ANALYZE =====
 Limit  (cost=0.00..18726.66 rows=5 width=585) (actual time=171378.294..286416.273 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..57577000.69 rows=15373 width=585) (actual time=171378.293..286416.269 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on _comment  (cost=0.00..57436080.63 rows=15373 width=552) (actual time=171378.249..286416.062 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.15 rows=1 width=41) (actual time=0.034..0.034 rows=1 loops=5)
               Index Cond: (_article.id = _comment.parent_id)
 Total runtime: 286416.339 ms

How can we make the planner stick to the faster execution plan?

Please help our poor servers, they are tired ;)

--
Laurent Raufaste
<http://www.glop.org/>

Re: PG planning randomly?

From: "Laurent Raufaste"

2008/2/26, Laurent Raufaste <analogue@glop.org>:
> Hi,
>
>  I'm having some issues with this simple query:
>
>  SELECT
>   _comment.*,
>   _article.title AS article_title,
>   _article.reference AS article_reference
>  FROM
>   _comment
>   INNER JOIN _article
>     ON _article.id = _comment.parent_id
>  WHERE
>   _comment.path <@ '0.1.3557034'
>  ORDER BY
>   _comment.date_publishing DESC
>  OFFSET 0
>  LIMIT 5
>  ;

I forgot the table definition; here it is ;)

                                              Table "ob2._comment"
      Column       |            Type             |
     Modifiers
-------------------+-----------------------------+--------------------------------------------------------------
 id                | bigint                      | not null default
nextval('element_id_sequence'::regclass)
 parent_id         | bigint                      |
 path              | ltree                       |
 data              | text                        |
 date_creation     | timestamp without time zone | not null default now()
 date_publishing   | timestamp without time zone | not null default now()
 date_modification | timestamp without time zone | not null default now()
 counters          | hstore                      |
 reference         | integer                     | not null default
nextval('_comment_reference_seq'::regclass)
 text              | text                        |
Indexes:
    "_comment_pkey" PRIMARY KEY, btree (id), tablespace "indexspace"
    "gist_idx_comment_path" gist (path), tablespace "indexspace"
    "idx_comment_date_creation" btree (date_creation), tablespace "indexspace"
    "idx_comment_date_publishing" btree (date_publishing), tablespace
"indexspace"
    "idx_comment_parent_id" btree (parent_id), tablespace "indexspace"
    "idx_comment_reference" btree (reference), tablespace "indexspace"
Inherits: _element

Thanks for looking into our problem!

--
Laurent Raufaste
<http://www.glop.org/>

Re: PG planning randomly?

From: Tom Lane

"Laurent Raufaste" <analogue@glop.org> writes:
> I'm having some issues with this simple query:

> SELECT
>   _comment.*,
>   _article.title AS article_title,
>   _article.reference AS article_reference
> FROM
>   _comment
>   INNER JOIN _article
>     ON _article.id = _comment.parent_id
> WHERE
>   _comment.path <@ '0.1.3557034'
> ORDER BY
>   _comment.date_publishing DESC
> OFFSET 0
> LIMIT 5
> ;

> The varying information here is the ltree path "0.1.3557034"

What PG version is this?

If it's 8.2 or later then increasing the stats target for _comment.path
to 100 or more would likely help.
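(For a single column that would be something along these lines,
followed by a fresh ANALYZE -- untested here, adjust names as needed:)

-- per-column statistics target, overrides default_statistics_target:
ALTER TABLE _comment ALTER COLUMN path SET STATISTICS 100;
ANALYZE _comment;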

            regards, tom lane

Re: PG planning randomly?

From: "Laurent Raufaste"

2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>
> What PG version is this?
>
>  If it's 8.2 or later then increasing the stats target for _comment.path
>  to 100 or more would likely help.
>

I'm using PG 8.2.4.
We are using 100 as default_statistics_target by default and all our
columns are using this value:
# SELECT attname,attstattarget FROM pg_attribute WHERE attrelid=16743
AND attname='path' ;
 attname | attstattarget
---------+---------------
 path    |            -1

I tried increasing the stats target with the command:
SET default_statistics_target=1000 ;
That's the command I launched before executing the ANALYZE shown in
the previous mail.
The ANALYZE took longer to complete, but it did not change the planner
behavior (sometimes right, sometimes wrong).

I did not try setting the stats target directly with an ALTER TABLE,
because it implies some LOCK on our replication cluster. Do you think
the planner will act differently with an ALTER TABLE rather than just
the "SET default_statistics_target" command?

If so, I will try it =)

Thanks.

--
Laurent Raufaste
<http://www.glop.org/>

Re: PG planning randomly?

From: Tom Lane

"Laurent Raufaste" <analogue@glop.org> writes:
> 2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>> If it's 8.2 or later then increasing the stats target for _comment.path
>> to 100 or more would likely help.

> I'm using PG 8.2.4.
> We are using 100 as default_statistics_target by default and all our
> columns are using this value:

Hmm, that ought to be enough to activate the better selectivity
estimator.

Unless ... did you update this database from a pre-8.2 DB that already
had contrib/ltree in it?  If so, did you just load the existing old
definition of ltree as part of your dump, or did you install 8.2's
version fresh?  I'm wondering if you have a definition of operator <@
that doesn't specify the new selectivity estimator.  Please try a
pg_dump -s and see what it shows as the definition of <@.
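(If a full schema dump is inconvenient, a catalog query along these
lines should show the same thing -- the RESTRICT estimator is the
oprrest column:)

-- list every <@ operator with its argument types and restriction estimator
SELECT oprleft::regtype, oprright::regtype, oprrest
FROM pg_operator
WHERE oprname = '<@';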

            regards, tom lane

Re: PG planning randomly?

From: "Laurent Raufaste"

2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
> "Laurent Raufaste" <analogue@glop.org> writes:
>
> > 2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>
> >> If it's 8.2 or later then increasing the stats target for _comment.path
>  >> to 100 or more would likely help.
>
>  > I'm using PG 8.2.4.
>  > We are using 100 as default_statistics_target by default and all our
>  > columns are using this value:
>
>
> Hmm, that ought to be enough to activate the better selectivity
>  estimator.
>
>  Unless ... did you update this database from a pre-8.2 DB that already
>  had contrib/ltree in it?  If so, did you just load the existing old
>  definition of ltree as part of your dump, or did you install 8.2's
>  version fresh?  I'm wondering if you have a definition of operator <@
>  that doesn't specify the new selectivity estimator.  Please try a
>  pg_dump -s and see what it shows as the definition of <@.
>
>                         regards, tom lane
>

Here's the first definition of the <@ operator in my dump:

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--
CREATE OPERATOR <@ (
    PROCEDURE = ltree_risparent,
    LEFTARG = ltree,
    RIGHTARG = ltree,
    COMMUTATOR = @>,
    RESTRICT = ltreeparentsel,
    JOIN = contjoinsel
);
ALTER OPERATOR public.<@ (ltree, ltree) OWNER TO postgres;

Our data was created on an older PG (8.1.x), but we installed 8.2.x
from scratch, only restoring the schema and the data into it. I used
the ltree found in the 8.2.4 sources.

Do you think an update of ltree, or better, of the whole database,
will fix the problem?

We plan on upgrading to the 8.3 branch in the next few weeks, but this
behavior can't wait that long, as our servers are overburdened from
time to time =(

Thanks for your help ;)

--
Laurent Raufaste
<http://www.glop.org/>

Re: PG planning randomly?

From: "Scott Marlowe"

On Tue, Feb 26, 2008 at 11:12 AM, Laurent Raufaste <analogue@glop.org> wrote:
> 2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>
> >
>  > What PG version is this?
>  >
>  >  If it's 8.2 or later then increasing the stats target for _comment.path
>  >  to 100 or more would likely help.
>  >
>
>  I'm using PG 8.2.4.
>  We are using 100 as default_statistics_target by default and all our
>  columns are using this value:
>  # SELECT attname,attstattarget FROM pg_attribute WHERE attrelid=16743
>  AND attname='path' ;
>   attname | attstattarget
>  ---------+---------------
>   path    |            -1
>
>  I tried increasing the stats target with the command:
>  SET default_statistics_target=1000 ;
>  That's the command I launched before executing the ANALYZE shown in
>  the previous mail.
>  The ANALYZE took longer to complete, but it did not change the planner
>  behavior (sometimes right, sometimes wrong).

You're doing it wrong.  The default target affects newly created
columns / tables.  You need to use alter table to change a stats
target after creation.  Like so:

alter table abc alter column xyz set statistics 100;

Re: PG planning randomly?

From: Tom Lane

"Laurent Raufaste" <analogue@glop.org> writes:
> 2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>> ... I'm wondering if you have a definition of operator <@
>> that doesn't specify the new selectivity estimator.  Please try a
>> pg_dump -s and see what it shows as the definition of <@.

> Here's the first definition of the <@ operator in my dump:

> CREATE OPERATOR <@ (
>     PROCEDURE = ltree_risparent,
>     LEFTARG = ltree,
>     RIGHTARG = ltree,
>     COMMUTATOR = @>,
>     RESTRICT = ltreeparentsel,
>     JOIN = contjoinsel
> );

That's the right RESTRICT function, but what exactly did you mean by
"first definition"?  Are there more?

It may be that it's just not possible for the estimator to come up with
accurate rowcount estimates given the amount of info it has available.
The query you are complaining about confuses the issue quite a lot by
involving other issues.  Would you try just "explain analyze select 1
from _comment where path <@ '....';" for various typical path values,
and see if it's coming up with estimated rowcounts that are in the right
ballpark compared to the actual ones?

            regards, tom lane

Re: PG planning randomly?

From: Tom Lane

"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> On Tue, Feb 26, 2008 at 11:12 AM, Laurent Raufaste <analogue@glop.org> wrote:
>> I tried increasing the stats target with the command:
>> SET default_statistics_target=1000 ;
>> That's the command I launched before executing the ANALYZE shown in
>> the previous mail.

> You're doing it wrong.  The default target affects newly created
> columns / tables.  You need to use alter table to change a stats
> target after creation.  Like so:
> alter table abc alter column xyz set statistics 100;

That's completely incorrect.  If the column doesn't have a specific
stats target (indicated by -1 in attstattarget, which Laurent showed
us was the case), then ANALYZE will use the current value of
default_statistics_target.  The table-creation-time value of that
parameter isn't relevant at all.
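(To double-check which target the last ANALYZE actually used, one rough
sanity check -- assuming a histogram was collected for the column -- is
the size of histogram_bounds, which holds about stats-target + 1
entries:)

-- rough estimate of the target that produced the current stats
SELECT array_upper(histogram_bounds, 1) - 1 AS effective_target
FROM pg_stats
WHERE tablename = '_comment' AND attname = 'path';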

            regards, tom lane

Re: PG planning randomly?

From: "Laurent Raufaste"

2008/2/26, Tom Lane <tgl@sss.pgh.pa.us>:
>
> That's the right RESTRICT function, but what exactly did you mean by
>  "first definition"?  Are there more?

I thought that was enough; here are all the definitions of the <@ operator:

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
    PROCEDURE = ltree_risparent,
    LEFTARG = ltree,
    RIGHTARG = ltree,
    COMMUTATOR = @>,
    RESTRICT = ltreeparentsel,
    JOIN = contjoinsel
);


ALTER OPERATOR public.<@ (ltree, ltree) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
    PROCEDURE = _ltree_r_isparent,
    LEFTARG = ltree,
    RIGHTARG = ltree[],
    COMMUTATOR = @>,
    RESTRICT = contsel,
    JOIN = contjoinsel
);


ALTER OPERATOR public.<@ (ltree, ltree[]) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
    PROCEDURE = _ltree_risparent,
    LEFTARG = ltree[],
    RIGHTARG = ltree,
    COMMUTATOR = @>,
    RESTRICT = contsel,
    JOIN = contjoinsel
);


ALTER OPERATOR public.<@ (ltree[], ltree) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
    PROCEDURE = hs_contained,
    LEFTARG = hstore,
    RIGHTARG = hstore,
    COMMUTATOR = @>,
    RESTRICT = contsel,
    JOIN = contjoinsel
);


ALTER OPERATOR public.<@ (hstore, hstore) OWNER TO postgres;


>
>  It may be that it's just not possible for the estimator to come up with
>  accurate rowcount estimates given the amount of info it has available.
>  The query you are complaining about confuses the issue quite a lot by
>  involving other issues.  Would you try just "explain analyze select 1
>  from _comment where path <@ '....';" for various typical path values,
>  and see if it's coming up with estimated rowcounts that are in the right
>  ballpark compared to the actual ones?
>

It might be the source of the problem =)
I executed the following query on all the servers, before and after
launching ANALYZE _comment, with a varying path (but the same path for
both runs on any given server).

EXPLAIN ANALYZE SELECT 1
FROM _comment
WHERE path <@ '0.1.810879'
;

On every server except one it showed the same plan before and after the ANALYZE:
 Bitmap Heap Scan on _comment  (cost=174.87..6163.31 rows=1536 width=0) (actual time=1.072..1.495 rows=1070 loops=1)
   Recheck Cond: (path <@ '0.1.14155763'::ltree)
   ->  Bitmap Index Scan on gist_idx_comment_path  (cost=0.00..174.48 rows=1536 width=0) (actual time=1.058..1.058 rows=1070 loops=1)
         Index Cond: (path <@ '0.1.14155763'::ltree)
 Total runtime: 1.670 ms

On a random server, the plan before the ANALYZE was:
 Bitmap Heap Scan on _comment  (cost=15833.00..440356.99 rows=155649 width=0) (actual time=1.581..2.885 rows=1070 loops=1)
   Recheck Cond: (path <@ '0.1.14155763'::ltree)
   ->  Bitmap Index Scan on gist_idx_comment_path  (cost=0.00..15794.09 rows=155649 width=0) (actual time=1.552..1.552 rows=1070 loops=1)
         Index Cond: (path <@ '0.1.14155763'::ltree)
 Total runtime: 3.160 ms

The runtime is OK, but the planned cost is huge, because the index
scan's row count is estimated at 100x the actual number of rows. After
the ANALYZE it was like the others. When this wrong row count happens,
I understand why the planner tries to find an alternative plan for the
first query I showed you in a previous mail.

How can I help it better estimate the row count? Setting
default_statistics_target to 1000 did not help =(

--
Laurent Raufaste
<http://www.glop.org/>

Re: PG planning randomly?

From: Tom Lane

"Laurent Raufaste" <analogue@glop.org> writes:
> On a random server, the plan before the ANALYZE was:
>  Bitmap Heap Scan on _comment  (cost=15833.00..440356.99 rows=155649 width=0) (actual time=1.581..2.885 rows=1070 loops=1)
>    Recheck Cond: (path <@ '0.1.14155763'::ltree)
>    ->  Bitmap Index Scan on gist_idx_comment_path  (cost=0.00..15794.09 rows=155649 width=0) (actual time=1.552..1.552 rows=1070 loops=1)
>          Index Cond: (path <@ '0.1.14155763'::ltree)
>  Total runtime: 3.160 ms

> The runtime is OK, but the planned cost is huge, because the index
> scan's row count is estimated at 100x the actual number of rows. After
> the ANALYZE it was like the others. When this wrong row count happens,
> I understand why the planner tries to find an alternative plan for the
> first query I showed you in a previous mail.

> How can I help it better estimate the row count? Setting
> default_statistics_target to 1000 did not help =(

Are you sure the table had been analyzed recently at all on that server?

If it had, then what you must be dealing with is a different result from
a different random sample.  The laws of statistics say that sometimes a
random sample won't be very representative ... but if the sample is
reasonably large they also say that won't happen very often.  You could
try ANALYZEing over and over and seeing what rowcount estimate you get
after each one.  If you frequently get a bad estimate, maybe it would be
worth looking at the pg_stats row for _comment.path to see if there's
anything obviously bogus about the bad samples.

            regards, tom lane

Re: PG planning randomly?

From: "Laurent Raufaste"

2008/2/27, Tom Lane <tgl@sss.pgh.pa.us>:
>
> Are you sure the table had been analyzed recently at all on that server?
>
>  If it had, then what you must be dealing with is a different result from
>  a different random sample.  The laws of statistics say that sometimes a
>  random sample won't be very representative ... but if the sample is
>  reasonably large they also say that won't happen very often.  You could
>  try ANALYZEing over and over and seeing what rowcount estimate you get
>  after each one.  If you frequently get a bad estimate, maybe it would be
>  worth looking at the pg_stats row for _comment.path to see if there's
>  anything obviously bogus about the bad samples.
>

Thanks for your help Tom, it's greatly appreciated here =)

Yes, I ANALYZE the table before any batch of requests.
I isolated the problem: it happens when a lot (thousands) of rows in
the _comment table match the query. I can now reproduce the problem at
will, on any server, even on our development server.

- I took 3 paths matching thousands of rows: 0.1.4108047, 0.1.15021804
and 0.1.4749259
- I wrote the following script (default_statistics_target is 100):
ANALYZE _comment ;
EXPLAIN ANALYZE SELECT 1 FROM _comment WHERE path <@ '0.1.4108047' ;
EXPLAIN ANALYZE SELECT 1 FROM _comment WHERE path <@ '0.1.15021804' ;
EXPLAIN ANALYZE SELECT 1 FROM _comment WHERE path <@ '0.1.4749259' ;
SELECT * FROM pg_stats WHERE tablename = '_comment' AND attname='path';

If I execute it a lot of times, approx. 2/3 of the executions produce
OK cost estimates:

Bitmap Heap Scan on _comment  (cost=114.24..4634.75 rows=1540 width=0) (actual time=6.715..13.836 rows=12589 loops=1)
  Recheck Cond: (path <@ '0.1.15021804'::ltree)
  ->  Bitmap Index Scan on gist_idx_comment_path  (cost=0.00..113.85 rows=1540 width=0) (actual time=6.515..6.515 rows=12589 loops=1)
        Index Cond: (path <@ '0.1.15021804'::ltree)

And for the other 1/3, the estimates are huge:

Bitmap Heap Scan on _comment  (cost=10366.65..342840.71 rows=156174 width=0) (actual time=6.513..12.984 rows=12589 loops=1)
  Recheck Cond: (path <@ '0.1.15021804'::ltree)
  ->  Bitmap Index Scan on gist_idx_comment_path  (cost=0.00..10327.61 rows=156174 width=0) (actual time=6.313..6.313 rows=12589 loops=1)
        Index Cond: (path <@ '0.1.15021804'::ltree)

The pg_stats row shows no strange values, and the only fields not
constant across runs are null_frac and correlation. avg_width goes
from 56 to 57 and n_distinct stays at -1 (which is OK, all paths are
distinct) after each ANALYZE:
   [schemaname] => ob2
   [tablename] => _comment
   [attname] => path
   [null_frac] => 6.66667e-05
   [avg_width] => 56
   [n_distinct] => -1
   [most_common_vals] =>
   [most_common_freqs] =>
   [correlation] => -0.256958

If I do the same test with default_statistics_target=1000, I get the
same behavior (huge row counts) but a bit closer to reality. Instead
of getting only 2 different estimates across all the requests
(rows=1540 and rows=156174), I get 3 different ones (rows=1543,
rows=15446 and rows=61784).

The problem is that the estimate is still huge compared to reality,
and the query we use in our production environment switches to a
different plan.

Fast version:

Limit  (cost=15557.29..15557.30 rows=5 width=570) (actual time=1305.824..1305.829 rows=5 loops=1)
  ->  Sort  (cost=15557.29..15561.14 rows=1540 width=570) (actual time=1305.822..1305.825 rows=5 loops=1)
        Sort Key: _comment.date_publishing
        ->  Nested Loop  (cost=0.00..15475.75 rows=1540 width=570) (actual time=0.185..847.502 rows=61537 loops=1)
              ->  Index Scan using gist_idx_comment_path on _comment  (cost=0.00..4746.26 rows=1540 width=537) (actual time=0.118..307.553 rows=64825 loops=1)
                    Index Cond: (path <@ '0.1.4108047'::ltree)
              ->  Index Scan using _article_pkey on _article  (cost=0.00..6.95 rows=1 width=41) (actual time=0.006..0.006 rows=1 loops=64825)
                    Index Cond: (_article.id = _comment.parent_id)


Slow version:

Limit  (cost=0.00..1047.60 rows=5 width=566) (actual time=0.352..1.625 rows=5 loops=1)
  ->  Nested Loop  (cost=0.00..32663447.76 rows=155897 width=566) (actual time=0.351..1.620 rows=5 loops=1)
        ->  Index Scan Backward using idx_comment_date_publishing on _comment  (cost=0.00..31719108.69 rows=155897 width=533) (actual time=0.286..1.412 rows=5 loops=1)
              Filter: (path <@ '0.1.4108047'::ltree)
        ->  Index Scan using _article_pkey on _article  (cost=0.00..6.04 rows=1 width=41) (actual time=0.038..0.039 rows=1 loops=5)
              Index Cond: (_article.id = _comment.parent_id)

Don't you think increasing some RAM parameter would help the server
with this kind of query? We have 20+ GB of RAM on those servers.
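(To be concrete, these are the memory-related settings I have in mind;
just my assumption about which knobs matter here, effective_cache_size
being the one that mainly influences planner costing:)

SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;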

Thanks.

--
Laurent Raufaste
<http://www.glop.org/>