Re: Restarting DB after moving to another drive - Mailing list pgsql-general

From: Daniel Begin
Subject: Re: Restarting DB after moving to another drive
Date:
Msg-id: COL129-DS14555615E14EF99A34536294C60@phx.gbl
In response to: Restarting DB after moving to another drive (Daniel Begin <jfd553@hotmail.com>)
List: pgsql-general
A follow-up...

As expected, I was able to copy the database cluster to the new drive during the night. I changed the new drive's letter
to match the original database drive's, restarted the DB, and everything is now running on the new drive. I have also
been able to recreate the tablespaces.
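For the record, recreating a tablespace and moving a table back into it boiled down to statements like these (the
tablespace, path, table and index names below are only placeholders, not the real ones):

    -- create the tablespace at its new location (run as a superuser)
    CREATE TABLESPACE my_ts LOCATION 'E:/pg_tablespaces/my_ts';

    -- move a table and one of its indexes into it
    ALTER TABLE big_table SET TABLESPACE my_ts;
    ALTER INDEX big_table_pkey SET TABLESPACE my_ts;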

However, queries on some tables now seem to take longer than expected.
- Could copying the tables and indexes have had an effect on the indexes?
- How can I verify that none of the indexes were corrupted? (see the sketch below)
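For concreteness, the sort of sanity check I have in mind looks roughly like this (table, column and index names are
placeholders):

    -- refresh planner statistics after the move
    ANALYZE big_table;

    -- check that the planner still uses the index, and how long the scan takes
    EXPLAIN ANALYZE SELECT * FROM big_table WHERE id = 12345;

    -- if an index does look damaged, rebuilding it is the safe fix
    REINDEX INDEX big_table_pkey;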

Daniel


-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Daniel Begin
Sent: May-15-15 15:17
To: 'Francisco Olarte'
Cc: rod@iol.ie; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Restarting DB after moving to another drive

Thanks for that comprehensive response!

And you are right about practicing restores; I never had to :-)

However, while I use pg_dump (custom format) on a regular basis, I did not know the difference between a database and a
database cluster (and pg_dumpall) until I had to move everything because the PGDATA drive started overheating.

Now that I better understand what is going on with the backup/restore processes, and considering...
- the time it is going to take to rebuild the whole cluster;
- that I am the only user of the database;
- that everything was just fine with the database, except the temperature of the drive;
- and that the initial concern of this thread was about Windows making a bad copy of the symbolic links,

I will make another attempt to copy everything to another drive from Windows, unless someone tells me it is not
possible.
- I will move the content of my external tablespaces back into PGDATA and drop them for the time it takes to copy the
db to the new drive (sketched below).
- Doing so, I will get rid of the symbolic links (from the tablespaces) from which the initial error message originated.
- Without symbolic links, I should be able to copy the db using standard Windows commands.
- Setting the new drive's letter to the old one before restarting the db is easy.
The whole process should take 12 hours instead of a week.
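To make the tablespace part of that plan concrete, it would look something like this in psql (my_ts and the table and
index names are placeholders; the copy itself would still be done with ordinary Windows tools while the server is
stopped):

    -- move everything out of the external tablespace, back into the default one
    ALTER TABLE big_table SET TABLESPACE pg_default;
    ALTER INDEX big_table_pkey SET TABLESPACE pg_default;
    -- (on 9.4 or later this can be done in one statement:
    --  ALTER TABLE ALL IN TABLESPACE my_ts SET TABLESPACE pg_default;)

    -- once the tablespace is empty, dropping it removes its symbolic link
    DROP TABLESPACE my_ts;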

Hoping this makes sense and that I have not missed something important (again). Thanks for your patience :-)
Daniel


-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Francisco Olarte
Sent: May-15-15 12:20
To: Daniel Begin
Cc: rod@iol.ie; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Restarting DB after moving to another drive

Hi Daniel:

On Fri, May 15, 2015 at 5:35 PM, Daniel Begin <jfd553@hotmail.com> wrote:
> Bonjour Francisco.
Buenos dias.

> Skimming the documentation sequentially is clever advice,
> especially since the doc is most of the time well done and exhaustive.
> Unfortunately, even though I actually did it about a year ago, it seems
> this specific item slipped out of my mind :-(

Bad luck, you managed to miss one of the most important commands, but it's solved now.

> About the dump/restore operation, restoring the database cluster has been running for 24 hrs now (psql -f
> pgdumpallOutputfile postgres). Since it took 13 hrs to dump the cluster, I am beginning to wonder how long it will
> take to restore it...

Much longer than that, especially if, as you say below, you have a lot of indexes. It's one of the reasons many of us
do not use pg_dumpall for anything but global objects, and instead use something like the crontab which John R. Pierce
posted (I use a similar thing, but with an intermediate script which dumps critical databases more frequently, skips
"recreatable" (if that is a word) databases, i.e. test things which can be rebuilt from scratch as they come from a
script, and keeps several numbered copies). Doing it this way ensures we can restore in order of criticality if it ever
needs to be done (I even move unmodified partitions to a 'historic' schema, which gets dumped only after a change,
which cuts my backup times to a tenth).
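Very roughly, the crontab approach is something like this (database names, paths and schedule are only an example):

    # /etc/cron.d/pg_backups -- globals once a day, each database dumped separately
    15 1 * * *  postgres  pg_dumpall --globals-only | gzip > /backups/globals.sql.gz
    30 1 * * *  postgres  pg_dump -Fc -f /backups/critical_db.dump critical_db
    30 3 * * 0  postgres  pg_dump -Fc -f /backups/big_db.dump big_db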

One thing: I do not know how you are restoring the database, but when doing these things we use a specially tuned
postgresql.conf (fsync off, minimal logging, lots of work_mem and similar things), as we do not care about durability
(you can just rerun initdb and redo the restore, and there is only one session connected, the restoring one).
This easily cuts the restore time to a tenth; after it finishes we restart the server with the normal config. It is a
must when doing this sort of thing.
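As an illustration only (the exact values depend on the machine, and fsync = off must never be left on in normal
operation), the kind of restore-time overrides I mean are:

    # temporary settings for a one-off restore -- revert them afterwards!
    fsync = off
    synchronous_commit = off
    full_page_writes = off
    wal_level = minimal
    maintenance_work_mem = 2GB      # speeds up index creation
    checkpoint_segments = 64        # pre-9.5; use max_wal_size on newer releases
    autovacuum = off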


> My main concern is about how the indexes are managed in dump/restore operations. I understand that pg_dumpall
> actually uses pg_dump, where the doc says "Post-data items include definitions of indexes, triggers..." I would not
> worry if the doc said that indexes are simply copied, but it says "includes definitions of indexes".
> Since some of the indexes took days to build... could someone confirm that indexes are rebuilt instead of copied?
> If indexes are actually rebuilt, why should it be done that way? There must be a good reason!

You are out of luck, and there is a reason for it. First, pg_dump does not copy, it dumps. Its simpler behaviour (text
output) just writes an SQL script which recreates everything and inserts all the data (normally using COPY for speed,
but it's the same as inserting). It takes care of generating a fast script, meaning it creates the tables, then inserts
the data, then creates the indexes and reactivates the constraints, which is faster than defining everything up front
and inserting with indexes and constraints active.
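In other words, a plain-text dump is just a script shaped roughly like this (table, column and index names invented
for the example):

    CREATE TABLE big_table (
        id   bigint NOT NULL,
        name text
    );

    COPY big_table (id, name) FROM stdin;
    1	first row
    2	second row
    \.

    -- indexes and constraints only after the data is in place
    CREATE INDEX big_table_name_idx ON big_table (name);
    ALTER TABLE ONLY big_table ADD CONSTRAINT big_table_pkey PRIMARY KEY (id);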

The reason to do it in text mode is that you can dump between different versions and/or architectures, and also the
dump is much smaller than the db, especially if you compress it (I always do; with a bit of testing you can always find
a compressor which leads to faster backups, as the saved disk writes easily offset the compression time, especially on
modern multi-CPU, memory-rich machines). Bear in mind that in many scenarios you back up a lot (we dump some critical
things hourly, even though we are using replication) and restore nearly never, and prefer to spend a couple of days
more on a restore rather than a couple of hours of degraded performance on every backup.
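Concretely, that can be as simple as piping the dump through an external compressor, or letting the custom format
compress internally (file and database names are placeholders):

    # plain-text dump piped through gzip
    pg_dump mydb | gzip > mydb.sql.gz

    # or let the custom format compress for you (level 0-9)
    pg_dump -Fc -Z 6 -f mydb.dump mydb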

This being said: if you have an 820G db (I still do not know what size that refers to, I suppose it's the $PGDATA
footprint) of important data (it does not seem availability-critical, as you are taking days and are still in
business), and you are having these kinds of problems dumping, restoring and moving directories in your OS, and do not
know how much time backups take, you have a problem. You should practice backup AND restore more, because your question
indicates you MAY be backing up your data, but you have never restored a backup.

Also, the text output format is really good for the global objects in pg_dumpall, but not so much for the normal
databases. For those you should use the custom format, unless it is a really small db. Its drawbacks are that it can
only do one database per file and needs pg_restore to be read (I know those are minor). The advantage is that instead
of generating a plain-text dump it builds a kind of tar file with the definitions and data for every object clearly
separated, so you can do partial restores or whatever you want (in fact, without options and without connecting to the
database, pg_restore spits out the same text file a plain dump would generate). If you had used this technique you
could have restored your tables in order, or restored only the data and then reindexed concurrently with some other
(performance-degraded) work. You can do the same thing by editing a text dump, but it gets impractical and really
complex beyond a few megabytes.
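For example, with a custom-format dump the schema, data and post-data steps can be replayed separately, or turned back
into plain SQL (database and file names are placeholders):

    # dump one database in custom format
    pg_dump -Fc -f mydb.dump mydb

    # list the archive's contents
    pg_restore -l mydb.dump

    # restore in stages: schema, then data, then indexes and constraints
    pg_restore --section=pre-data  -d mydb mydb.dump
    pg_restore --section=data      -d mydb mydb.dump
    pg_restore --section=post-data -d mydb mydb.dump

    # or just print the equivalent plain SQL without touching any database
    pg_restore -f mydb.sql mydb.dump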

regards.
   Francisco Olarte.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


