Thread: Quick estimate of num of rows & table size
Hi all,
I read somewhere that the following query gives a quick estimate of the # of rows in a table regardless of the table's size (which would matter in a simple SELECT count(*)?):
SELECT (CASE WHEN reltuples > 0 THEN
pg_relation_size('mytable')/(8192*relpages/reltuples)
ELSE 0
END)::bigint AS estimated_row_count
FROM pg_class
WHERE oid = 'mytable'::regclass;
If relpages & reltuples are recorded accurately each time VACUUM is run, wouldn't it be the same to just grab directly the value of reltuples like:
SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass;
In the same manner, are pg_relation_size('mytable') and 8192*relpages the same?
I run both assumptions against a freshly VACUUMed table and they seem correct.
TIA,
Thalis K.
I read somewhere that the following query gives a quick estimate of the # of rows in a table regardless of the table's size (which would matter in a simple SELECT count(*)?):
SELECT (CASE WHEN reltuples > 0 THEN
pg_relation_size('mytable')/(8192*relpages/reltuples)
ELSE 0
END)::bigint AS estimated_row_count
FROM pg_class
WHERE oid = 'mytable'::regclass;
If relpages & reltuples are recorded accurately each time VACUUM is run, wouldn't it be the same to just grab directly the value of reltuples like:
SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass;
In the same manner, are pg_relation_size('mytable') and 8192*relpages the same?
I run both assumptions against a freshly VACUUMed table and they seem correct.
TIA,
Thalis K.
On Mon, Nov 5, 2012 at 2:02 PM, Thalis Kalfigkopoulos <tkalfigo@gmail.com> wrote: > Hi all, > > I read somewhere that the following query gives a quick estimate of the # of > rows in a table regardless of the table's size (which would matter in a > simple SELECT count(*)?): > > SELECT (CASE WHEN reltuples > 0 THEN > pg_relation_size('mytable')/(8192*relpages/reltuples) > ELSE 0 > END)::bigint AS estimated_row_count > FROM pg_class > WHERE oid = 'mytable'::regclass; > > If relpages & reltuples are recorded accurately each time VACUUM is run, > wouldn't it be the same to just grab directly the value of reltuples like: > > SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass; > > In the same manner, are pg_relation_size('mytable') and 8192*relpages the > same? > > I run both assumptions against a freshly VACUUMed table and they seem > correct. This doesn't seem to work for me. I get an estimated row_count of 0 on a table that I know has millions of rows.
On Mon, Nov 5, 2012 at 7:14 PM, Lonni J Friedman <netllama@gmail.com> wrote:
This doesn't seem to work for me. I get an estimated row_count of 0On Mon, Nov 5, 2012 at 2:02 PM, Thalis Kalfigkopoulos
<tkalfigo@gmail.com> wrote:
> Hi all,
>
> I read somewhere that the following query gives a quick estimate of the # of
> rows in a table regardless of the table's size (which would matter in a
> simple SELECT count(*)?):
>
> SELECT (CASE WHEN reltuples > 0 THEN
> pg_relation_size('mytable')/(8192*relpages/reltuples)
> ELSE 0
> END)::bigint AS estimated_row_count
> FROM pg_class
> WHERE oid = 'mytable'::regclass;
>
> If relpages & reltuples are recorded accurately each time VACUUM is run,
> wouldn't it be the same to just grab directly the value of reltuples like:
>
> SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass;
>
> In the same manner, are pg_relation_size('mytable') and 8192*relpages the
> same?
>
> I run both assumptions against a freshly VACUUMed table and they seem
> correct.
on a table that I know has millions of rows.
Which one doesn't work exactly? The larger query? Are you on a 9.x?
regards,
thalis k.
On Mon, Nov 5, 2012 at 3:56 PM, Thalis Kalfigkopoulos <tkalfigo@gmail.com> wrote: > > On Mon, Nov 5, 2012 at 7:14 PM, Lonni J Friedman <netllama@gmail.com> wrote: >> >> On Mon, Nov 5, 2012 at 2:02 PM, Thalis Kalfigkopoulos >> <tkalfigo@gmail.com> wrote: >> > Hi all, >> > >> > I read somewhere that the following query gives a quick estimate of the >> > # of >> > rows in a table regardless of the table's size (which would matter in a >> > simple SELECT count(*)?): >> > >> > SELECT (CASE WHEN reltuples > 0 THEN >> > pg_relation_size('mytable')/(8192*relpages/reltuples) >> > ELSE 0 >> > END)::bigint AS estimated_row_count >> > FROM pg_class >> > WHERE oid = 'mytable'::regclass; >> > >> > If relpages & reltuples are recorded accurately each time VACUUM is run, >> > wouldn't it be the same to just grab directly the value of reltuples >> > like: >> > >> > SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass; >> > >> > In the same manner, are pg_relation_size('mytable') and 8192*relpages >> > the >> > same? >> > >> > I run both assumptions against a freshly VACUUMed table and they seem >> > correct. >> >> This doesn't seem to work for me. I get an estimated row_count of 0 >> on a table that I know has millions of rows. > > > Which one doesn't work exactly? The larger query? Are you on a 9.x? doh, sorry. The first/larger doesn't work. As it turns out the 2nd actually does work well. I'm on 9.1.x.
On Mon, Nov 5, 2012 at 9:04 PM, Lonni J Friedman <netllama@gmail.com> wrote:
On Mon, Nov 5, 2012 at 3:56 PM, Thalis Kalfigkopoulosdoh, sorry. The first/larger doesn't work. As it turns out the 2nd<tkalfigo@gmail.com> wrote:
>
> On Mon, Nov 5, 2012 at 7:14 PM, Lonni J Friedman <netllama@gmail.com> wrote:
>>
>> On Mon, Nov 5, 2012 at 2:02 PM, Thalis Kalfigkopoulos
>> <tkalfigo@gmail.com> wrote:
>> > Hi all,
>> >
>> > I read somewhere that the following query gives a quick estimate of the
>> > # of
>> > rows in a table regardless of the table's size (which would matter in a
>> > simple SELECT count(*)?):
>> >
>> > SELECT (CASE WHEN reltuples > 0 THEN
>> > pg_relation_size('mytable')/(8192*relpages/reltuples)
>> > ELSE 0
>> > END)::bigint AS estimated_row_count
>> > FROM pg_class
>> > WHERE oid = 'mytable'::regclass;
>> >
>> > If relpages & reltuples are recorded accurately each time VACUUM is run,
>> > wouldn't it be the same to just grab directly the value of reltuples
>> > like:
>> >
>> > SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass;
>> >
>> > In the same manner, are pg_relation_size('mytable') and 8192*relpages
>> > the
>> > same?
>> >
>> > I run both assumptions against a freshly VACUUMed table and they seem
>> > correct.
>>
>> This doesn't seem to work for me. I get an estimated row_count of 0
>> on a table that I know has millions of rows.
>
>
> Which one doesn't work exactly? The larger query? Are you on a 9.x?
actually does work well. I'm on 9.1.x.
That's weird. I'm on a 9.1.5 as well and it works fine (!?)
Still the question remains: are they equivalent?
Thalis Kalfigkopoulos <tkalfigo@gmail.com> writes: > I read somewhere that the following query gives a quick estimate of the # > of rows in a table regardless of the table's size (which would matter in a > simple SELECT count(*)?): > SELECT (CASE WHEN reltuples > 0 THEN > pg_relation_size('mytable')/(8192*relpages/reltuples) > ELSE 0 > END)::bigint AS estimated_row_count > FROM pg_class > WHERE oid = 'mytable'::regclass; This seems a bit dubious, as it's protecting against only one of two possible zero-divide conditions, and ignoring the risk of integer overflow of 8192*relpages. It also seems rather inconvenient to have to specify 'mytable' twice. I'd try something like select case when relpages > 0 then (pg_relation_size(oid)::float8 * reltuples / relpages / 8192)::bigint else 0::bigint end from pg_class where oid = 'mytable'::regclass; > If relpages & reltuples are recorded accurately each time VACUUM is run, > wouldn't it be the same to just grab directly the value of reltuples like: > SELECT reltuples FROM pg_class WHERE oid='mytable'::regclass; Uh, no. The whole point of the more complicated query is to scale the reltuples/relpages tuple-density ratio up to the table's current physical size, so that you get an estimate that is not too far off even if vacuum hasn't been run lately. It's also worth noting that in recent PG versions, reltuples and relpages are themselves only moving-average estimates, and so your premise is faulty anyway: they are *not* necessarily exact even immediately after a vacuum. regards, tom lane