Thread: "No Free extents", table using all allocated space but no rows!

"No Free extents", table using all allocated space but no rows!

From
"Dolan, Sean"
Date:

Environment: PostGres 13 on RedHat 7.9.  

 

I am using logical replication (publisher/subscriber) between two databases and there are times when one of our schemas gets to 100% of allocated space (No Free Extents).   I went into the schema and did a \dt+ to see the amount of space being used and I could see that one of the tables somehow shows 16GB, essentially the entire allocated size.    Wanting to see what is in that table, I did a simple select * from the table and it returns no rows.  Doing a count(*) also returns 0 rows.

 

How can the table be using all that space when there is nothing “in” the table?    I don’t care about the data (as I am testing), so I can drop and recreate that one table; \dt+ would then show 0 bytes.    Later, when I create a subscription again, I get a No Free Extents error again and again the table has filled up.

 

What can I look for?

 

Thank you

Re: "No Free extents", table using all allocated space but no rows!

From
Laurenz Albe
Date:
On Thu, 2023-03-16 at 01:58 +0000, Dolan, Sean wrote:
> Environment: PostGres 13 on RedHat 7.9.  
>  
> I am using logical replication (publisher/subscriber) between two databases and there
> are times where one of our schemas gets to 100% of allocated space (No Free Extents).
> I went into the schema and did a \dt+ to see the amount of size being used and I
> could see one of the tables somehow shows 16GB, essentially the amount of allocated size.
> Wanting to see what is in that table, I did a simple select * from the table and it
> returns no rows.  Doing a count(*) also returns 0 rows.
>  
> How can the table be using all that space but there is nothing “in” the table?
> I don’t care about the data (as I am testing) so I can drop and recreate that one table.
> \dt+ would then show 0 bytes.    Later, I will then create a subscription and then
> I will get a No Free Extents error again and again the table has filled up.
>  
> What can I look for?

I don't think that there is an error message "no free extents".

It can easily happen that a table is large, but SELECT count(*) returns 0.
That would mean that either the table is empty and VACUUM truncation didn't work,
or that the table contains tuples that are not visible to your user, either
because VACUUM didn't process the table yet, or because your snapshot is too
old to see the data, or because the transaction that created the rows is still open.
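
For example, a rough check along these lines would help distinguish the cases (big_table is just a placeholder for the table in question):

    -- dead tuples and last vacuum times for the table
    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'big_table';

    -- long-open transactions that could hold back VACUUM or hide rows
    SELECT pid, xact_start, state, query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start;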

If you don't care about the data, your easiest option is to TRUNCATE the table.
If TRUNCATE is blocked, kill all transactions that block it.
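
A sketch for finding the blockers while the TRUNCATE is waiting (run this from a second session; the pid passed to pg_terminate_backend is a placeholder):

    SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
    FROM pg_stat_activity
    WHERE cardinality(pg_blocking_pids(pid)) > 0;

    -- then terminate each blocking pid, e.g.
    SELECT pg_terminate_backend(12345);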

Yours,
Laurenz Albe



I messed up and confused issues.   The error is:
ERROR: Could not extend pg_tblspc/16555/PG_13_20200/xxxx/xxxx  No space left on device
HINT: Check free disk space

So the schema is "full" and the offender is this one table.   I can't TRUNCATE as there needs to be space to perform
theaction.   Is there a way to see if there is a transaction on that table like you allude to?
 

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Wednesday, March 15, 2023 11:45 PM
To: Dolan, Sean (US N-ISYS Technologies Inc.) <sean.dolan@lmco.com>; pgsql-general@lists.postgresql.org
Subject: EXTERNAL: Re: "No Free extents", table using all allocated space but no rows!

On Thu, 2023-03-16 at 01:58 +0000, Dolan, Sean wrote:
> Environment: PostGres 13 on RedHat 7.9.
>  
> I am using logical replication (publisher/subscriber) between two 
> databases and there are times where one of our schemas gets to 100% of allocated space (No Free Extents).
> I went into the schema and did a \dt+ to see the amount of size being 
> used and I could see one of the tables somehow shows 16GB, essentially the amount of allocated size.
> Wanting to see what is in that table, I did a simple select * from the 
> table and it returns no rows.  Doing a count(*) also returns 0 rows.
>  
> How can the table be using all that space but there is nothing “in” the table?
> I don’t care about the data (as I am testing) so I can drop and recreate that one table.
> \dt+ would then show 0 bytes.    Later, I will then create a 
> subscription and then I will get a No Free Extents error again and again the table has filled up.
>  
> What can I look for?

I don't think that there is an error message "no free extents".

It can easily happen that a table is large, but SELECT count(*) returns 0.
That would mean that either the table is empty and VACUUM truncation didn't work, or that the table contains tuples that are not visible to your user, either because VACUUM didn't process the table yet, or because your snapshot is too old to see the data, or because the transaction that created the rows is still open.
 

If you don't care about the data, your easiest option is to TRUNCATE the table.
If TRUNCATE is blocked, kill all transactions that block it.

Yours,
Laurenz Albe

On Thu, 2023-03-16 at 13:20 +0000, Dolan, Sean wrote:
> I messed up and confused issues.   The error is : 
> ERROR: Could not extend pg_tblspc/16555/PG_13_20200/xxxx/xxxx  No space left on device
> HINT: Check free disk space
>
> So the schema is "full" and the offender is this one table.
> I can't TRUNCATE as there needs to be space to perform the action.
> Is there a way to see if there is a transaction on that table like you allude to?

Ah, that's different.

If you don't have enough space to run TRUNCATE, and you don't feel like
extending the disk space, DROP TABLE would be a convenient alternative.
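
As for seeing whether some transaction is sitting on that table, something like this should show any session holding a lock on it (big_table is a placeholder for your table name):

    SELECT l.pid, l.mode, l.granted, a.xact_start, a.state, a.query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.relation = 'big_table'::regclass;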

Yours,
Laurenz Albe



The initial email below was the end result of something "run-away" in my logical replication.

PostGres 13.6, RedHat 7.9

Database A is the publisher; Database B is the subscriber.    Within Database A are multiple schemas, and the publication ensures that all the schemas and their tables are added.    There is one table in particular that has 1.4 million rows of data.
 

I create a subscription on Database B and can see in the log:
    LOG: logical replication table synchronization for subscription "sub_to_dbaseA", table "alert_history" has started.
    CONTEXT:  COPY alert_history line 6668456
    LOG: logical replication table synchronization for subscription "sub_to_dbaseA", table "alert_history" has started.
    CONTEXT:  COPY alert_history line 5174606
    LOG: logical replication table synchronization for subscription "sub_to_dbaseA", table "alert_history" has started.
    CONTEXT:  COPY alert_history line 4325283

Normally I would see a line for "finished", but I never do.
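
(For reference, a query like the following should show the sync state of each table in the subscription; per the documentation, srsubstate is i = initialize, d = data copy, s = synchronized, r = ready:)

    SELECT s.subname, sr.srrelid::regclass AS table_name, sr.srsubstate
    FROM pg_subscription_rel sr
    JOIN pg_subscription s ON s.oid = sr.srsubid
    WHERE s.subname = 'sub_to_dbaseA';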

I then actively watch the schema/table and do \dt+   and can see that table grow in 2GB increments until I fill up the entire drive and run out of room for that schema.
 

I am NOT getting any "checkpoints are occurring too frequently" warnings, as I have updated my WAL size:
    max_wal_size=4GB
    min_wal_size=1GB

Is the system having trouble synching this amount of data in a quick fashion and therefore kicking off more synchronization threads?   Is there anything I can do to prevent this?
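
Would limiting the number of table-sync workers be worth trying?  Just a guess on my part, e.g.:

    ALTER SYSTEM SET max_sync_workers_per_subscription = 1;   -- default is 2
    SELECT pg_reload_conf();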
 

Thank you



-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Thursday, March 16, 2023 1:25 PM
To: Dolan, Sean (US N-ISYS Technologies Inc.) <sean.dolan@lmco.com>; pgsql-general@lists.postgresql.org
Subject: EXTERNAL: Re: EXTERNAL: Re: "No Free extents", table using all allocated space but no rows!

On Thu, 2023-03-16 at 13:20 +0000, Dolan, Sean wrote:
> I messed up and confused issues.   The error is :  
> ERROR: Could not extend pg_tblspc/16555/PG_13_20200/xxxx/xxxx  No space left on device
> HINT: Check free disk space
> 
> So the schema is "full" and the offender is this one table.
> I can't TRUNCATE as there needs to be space to perform the action.
> Is there a way to see if there is a transaction on that table like you allude to?

Ah, that's different.

If you don't have enough space to run TRUNCATE, and you don't feel like
extending the disk space, DROP TABLE would be a convenient alternative.

Yours,
Laurenz Albe

Re: EXTERNAL: Re: "No Free extents", table using all allocated space but no rows!

From
"Peter J. Holzer"
Date:
On 2023-03-16 20:24:49 +0100, Laurenz Albe wrote:
> On Thu, 2023-03-16 at 13:20 +0000, Dolan, Sean wrote:
> > I messed up and confused issues.   The error is : 
> > ERROR: Could not extend pg_tblspc/16555/PG_13_20200/xxxx/xxxx  No space left on device
> > HINT: Check free disk space
> >
> > So the schema is "full" and the offender is this one table.
> > I can't TRUNCATE as there needs to be space to perform the action.
> > Is there a way to see if there is a transaction on that table like you allude to?
>
> If you don't have enough space to run TRUNCATE, and you don't feel like
> extending the disk space, DROP TABLE would be a convenient alternative.

If you need to get just a little bit free space, you may be able to
reduce the reserved space on the file system. For example, on ext4 there
are typically 5 % reserved for root, so by reducing that to 2 % you get
about 3 % of extra space:

kitty:~/tmp 0:26 :-) 19% dd if=/dev/zero of=fill bs=1024k
dd: error writing 'fill': No space left on device

kitty:~/tmp 0:27 :-( 20% df -h .
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/kitty--vg-home  4.8G  4.5G     0 100% /home

kitty:~/tmp 0:27 :-) 21% echo test > small
echo: write error: no space left on device

kitty# tune2fs -m 2 /dev/kitty-vg/home
tune2fs 1.46.2 (28-Feb-2021)
Setting reserved blocks percentage to 2% (25784 blocks)

kitty:~/tmp 0:27 :-( 22% df -h .
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/kitty--vg-home  4.8G  4.5G  152M  97% /home

kitty:~/tmp 0:29 :-) 23% echo test > small

kitty:~/tmp 0:29 :-) 24% cat small
test

(You should restore the reserved space to the default afterwards. The
BSD file system and its descendants (like ext4) don't like getting
completely full.)

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Why using a partial index is doing slightly more logical I/O than a normal index

From
"Dirschel, Steve"
Date:

Table definition:

workflow_db=> \d workflow_execution_test

                            Table "public.workflow_execution_test"

              Column              |           Type           | Collation | Nullable | Default

----------------------------------+--------------------------+-----------+----------+---------

execution_id                     | bigint                   |           |          |

state_machine_id                 | bigint                   |           |          |

workflow_id                      | bigint                   |           |          |

started_datetime                 | timestamp with time zone |           |          |

completed_datetime               | timestamp with time zone |           |          |

status                           | character varying(50)    |           |          |

execution_context_s3_arn         | character varying(200)   |           |          |

ol_version                       | integer                  |           |          |

created_datetime                 | timestamp with time zone |           |          |

updated_datetime                 | timestamp with time zone |           |          |

deleted_millis                   | bigint                   |           |          |

acquisition_channel_id           | bigint                   |           |          |

correlation_id                   | character varying(36)    |           |          |

result                           | character varying(50)    |           |          |

state_machine_execution_arn      | character varying(200)   |           |          |

created_by_id                    | bigint                   |           |          |

updated_by_id                    | bigint                   |           |          |

acquired_gcs_s3_object           | text                     |           |          |

sqs_trigger_id                   | bigint                   |           |          |

trigger_message                  | text                     |           |          |

acquired_gcs_s3_object_uuid      | character varying(36)    |           |          |

api_trigger_id                   | bigint                   |           |          |

scheduled_trigger_id             | bigint                   |           |          |

notification_trigger_workflow_id | bigint                   |           |          |

acquired_object_name             | text                     |           |          |

subscription_guid                | uuid                     |           |          |

processing_class_code            | character varying(50)    |           |          |

root_execution_guid              | uuid                     |           |          |

Indexes:

    "test_workflow_execution_active_pending_ordered_by_created_date_" btree (workflow_id, status, deleted_millis, created_datetime) WHERE (status::text = ANY (ARRAY['ACTION_NEEDED'::character varying, 'NOT_STARTED'::character varying, 'PAUSED'::character varying, 'PENDING'::character varying, 'RUNNING'::character varying]::text[])) AND deleted_millis <= 0

    "test_workflow_execution_initial_ui_tabs" btree (workflow_id, status, result, completed_datetime DESC NULLS LAST)

 

I created/populated this table with 22 million rows.  Afterwards I created the 2 indexes.  So those indexes are packed tightly.

 

As is, the optimizer decides to use the partial index.  Below shows it did 33 logical reads.  This index should be very good for this query:  the leading 3 columns of the index are the 3 criteria in the WHERE clause, and the partial index only stores rows that match the status and deleted_millis filters.

 

explain (analyze, buffers)
select * from workflow_execution_test
where workflow_id = 1070
AND status in ('NOT_STARTED','PAUSED','PENDING','RUNNING')
and deleted_millis <= 0;
                                                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_workflow_execution_active_pending_ordered_by_created_date_ on workflow_execution_test  (cost=0.43..15824.82 rows=4335 width=1309) (actual time=0.040..0.095 rows=56 loops=1)
   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))
   Buffers: shared hit=33
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.321 ms
 Execution Time: 0.117 ms

 

If I hint the query to use the other index it does less work-  it does 24 logical reads vs 33 using the partial index.

 

/*+ IndexScan(workflow_execution_test test_workflow_execution_initial_ui_tabs) */
explain (analyze, buffers)
select * from workflow_execution_test
where workflow_id = 1070
AND status in ('NOT_STARTED','PAUSED','PENDING','RUNNING')
and deleted_millis <= 0;
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_workflow_execution_initial_ui_tabs on workflow_execution_test  (cost=0.56..15820.19 rows=4335 width=1309) (actual time=0.049..0.106 rows=56 loops=1)
   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))
   Filter: (deleted_millis <= 0)
   Buffers: shared hit=24
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.373 ms
 Execution Time: 0.129 ms

 

It would make sense to me if the query did the same amount of work using either of the 2 indexes, but it does not make sense to me why the partial index is doing more work.  Could anyone explain this behavior?

 

One other thing to note-  when using the partial index the cost is .43 .. 15824.82.  When using the other index the cost is .56 .. 15820.19.  So the lower end cost (I believe the cost to find the first row) is slightly lower for the partial index but the higher end cost (I believe to find the last row) is higher for the partial index.  Since there is no LIMIT clause why wouldn’t the optimizer use the lowest cost to find all rows (which in this case would be to use the non-partial index)?

 

Thanks

Because both the actual times and the expected costs are so similar to each other, I am assuming you are asking this as more of an academic question than a practical one.  If it is actually a practical matter, you should find a better example to present to us.

On Wed, May 3, 2023 at 9:17 AM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote:

 

 Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))

  Buffers: shared hit=33


For this usage, the =ANY is a "boundary condition".  It re-descends the index for each value in the array, and each of those re-descents incurs buffer accesses.  They cost very little, as the planner thinks they will mostly be cached already (and indeed, they are), but the difference still shows up in the Buffers tabulation.

   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))

   Filter: (deleted_millis <= 0)

   Buffers: shared hit=24


For this usage, the =ANY is applied as an "in-index filter".  It only descends the index once, to where workflow_id=1070, and then scans forward applying the =ANY to each index tuple until it exhausts the =1070 condition.  As long as all the =1070 entries fit into just a few buffers, the count of buffers accessed this way is lower than doing the re-descents.  (Stepping from tuple to tuple in the same index page doesn't count as a new access, while a re-descent releases and reacquires the buffer.)

This is a bit of speculation on my part, as nothing in the plan output distinguishes boundary condition usages from in-index-filter usages.  This speculation is based on the fact that I was recently investigating a similar situation and did extensive work on it with a debugger and by adding new experimental log messages.  Also, it isn't clear to me why it chooses one usage for one plan and the other usage for the other one in your case, as it seems that both would be eligible for the "boundary condition" treatment.  But presumably for some reason invisible to us it just thinks one approach is faster for one index and the other approach for the other index.
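
If you want to poke at this yourself, one rough experiment (a sketch, not something I have run against your data) is to compare the buffer counts when only a single status value is supplied, so that at most one descent per index is needed:

    explain (analyze, buffers)
    select * from workflow_execution_test
    where workflow_id = 1070
    AND status = 'NOT_STARTED'
    and deleted_millis <= 0;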
 

One other thing to note-  when using the partial index the cost is .43 .. 15824.82.  When using the other index the cost is .56 .. 15820.19.  So the lower end cost (I believe the cost to find the first row) is slightly lower for the partial index but the higher end cost (I believe to find the last row) is higher for the partial index.  Since there is no LIMIT clause why wouldn’t the optimizer use the lowest cost to find all rows (which in this case would be to use the non-partial index)?


There is some code in the planner which deems plans to be tied if their costs are within a small difference (1%, I think it is).  This allows some branches of the tree of all possible plans to be pruned off early, which can save a lot of time in planning.

Cheers,

Jeff

Thanks for the reply Jeff.  Yes-  more of an academic question.  Regarding this part:

 

   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))

   Filter: (deleted_millis <= 0)

   Buffers: shared hit=24

 

For this usage, the =ANY is applied as an "in-index filter".  It only descends the index once, to where workflow_id=1070, and then scans forward applying the =ANY to each index-tuple until it exhausts the =1070 condition.  As long as all the =1070 entries fit into just a few buffers, the count of buffers accessed by doing this is fewer than doing the re-descents.  (Stepping from tuple to tuple in the same index page doesn't count as a new access.  While a re-descent releases and reacquires the buffer)

 

There are 2,981,425 rows where workflow_id = 1070.  Does that change your theory of using an “in-index filter” for that plan?  When you say there was a bit of speculation on the “boundary condition” vs “in-index filter”, is the speculation on whether Postgres has 2 different ways of processing a =ANY filter, or is the speculation that one is being used by one plan and the other is being used by the other plan?

 

Thanks again for your reply.  It is helpful.

Steve

"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> There are 2,981,425 rows where workflow_id = 1070.  Does that change your theory of using an “in-index filter” for
> that plan?  When you say there was a bit of speculation on the “boundary condition” vs “in-index filter”, is the
> speculation on whether Postgres has 2 different ways of processing a =ANY filter, or is the speculation that one is
> being used by one plan and the other is being used by the other plan?

I don't believe the intelligence Jeff is postulating actually exists.
I see only one code path for this in nbtree.c, and what it's doing is
what he called the "boundary condition" implementation.  That is, it runs
one index search for "workflow_id = 1070 AND status = 'NOT_STARTED'",
then one for "workflow_id = 1070 AND status = 'PAUSED'", etc,
re-descending the index tree at the start of each of those four scans.

I'm not inclined to ascribe any great significance to the varying numbers
of buffer hits you observed.  I think that's likely explained by chance
layout of the two indexes' contents, so that some of these searches cross
different numbers of index pages even though they visit the same number of
index entries overall.  In particular, it doesn't look like the partial
index is buying anything for this specific test case.  The index's
constraint on "status" matters not at all, because in neither index will
we ever visit any regions of the index where other values of "status"
appear (save the one following entry where we detect that the status value
no longer matches in each scan; but it doesn't really matter what that
entry is).  The constraint on "deleted_millis" could help, but your second
EXPLAIN result shows that it didn't actually eliminate any rows:

>> Index Scan using test_workflow_execution_initial_ui_tabs on workflow_execution_test  (cost=0.56..15820.19 rows=4335 width=1309) (actual time=0.049..0.106 rows=56 loops=1)
>>    Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))
>>    Filter: (deleted_millis <= 0)

(note the lack of any "Rows Removed by Filter" line).  We can therefore
conclude that the index regions satisfying the workflow_id+status
conditions had no entries with deleted_millis <= 0 either.  So these
two test cases visited exactly the same number of index entries, and
any differences in "buffers hit" had to be chance layout effects, or
possibly the result of different index entry sizes.  How large is
that "result" column in reality?

            regards, tom lane



On Wed, May 3, 2023 at 2:00 PM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote:

Thanks for the reply Jeff.  Yes-  more of an academic question.  Regarding this part:

 

   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))

   Filter: (deleted_millis <= 0)

   Buffers: shared hit=24

 

For this usage, the =ANY is applied as an "in-index filter".  It only descends the index once, to where workflow_id=1070, and then scans forward applying the =ANY to each index-tuple until it exhausts the =1070 condition.  As long as all the =1070 entries fit into just a few buffers, the count of buffers accessed by doing this is fewer than doing the re-descents.  (Stepping from tuple to tuple in the same index page doesn't count as a new access.  While a re-descent releases and reacquires the buffer)

 

There are 2,981,425 rows where workflow_id = 1070.  Does that change your theory of using an “in-index filter” for that plan? 


Yes.  There is no way that that many index tuples will fit in just 24 index leaf pages, so I think it must be using the re-descending method for both plans.  Then I don't know why there is a difference in the number of buffer accesses.  But the difference seems trivial, so I wouldn't put much effort into investigating it.

 

When you say there was a bit of speculation on the “boundary condition” vs “in-index filter”, is the speculation on whether Postgres has 2 different ways of processing a =ANY filter, or is the speculation that one is being used by one plan and the other is being used by the other plan?


The speculation was that this applied to your query.  But going back to some of my original tests, I see that I remembered some of the details wrong on the broader topic as well.  When it voluntarily doesn't use the =ANY as a boundary condition, that shows up in the plan as the condition being evicted from the "Index Cond" line and instead showing up in a "Filter" line, at least in my one test case (which means it is no longer an in-index filter, as it jumps to the table and verifies visibility before applying the filter).  So the thing the plans really don't distinguish is when it just chooses not to use the extra index column for cost reasons from when it thinks it is unable to use it for correctness/safety reasons.

Cheers,

Jeff