Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces? - Mailing list pgsql-general
From | Francisco Olarte
---|---
Subject | Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces?
Date |
Msg-id | CA+bJJbzrN8N7GmFzrSSwfjUrVcTC-Pf0tg2anGNCosXTimPFxg@mail.gmail.com
In response to | Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces? ("Antman, Jason (CMG-Atlanta)" <Jason.Antman@coxinc.com>)
List | pgsql-general
Hi Jason:

On Sat, Feb 15, 2014 at 10:55 PM, Antman, Jason (CMG-Atlanta) <Jason.Antman@coxinc.com> wrote:

> The short answer is... due to too much technical debt, and some perhaps
> bad decisions made in the past... yeah. We've dug ourselves into this
> hole, and there's no feasible way out.

The more I read about your problem, the more I think you've nailed it in that phrase. I'll just comment on a few things in case you find them useful, but it seems that, combining the above with the lack of appropriate resources, you are going to have a bad time.

You seem to be in a big shop: a terabyte database and 50 test environments means you must have a lot of developers. In your case I would start by estimating how much developer time you could save by buying hardware.

> Currently we have ~50 dev/test
> environments, each of which has a clone of the full production database
> (on a NetApp, using flexclones, so it's *much* smaller in terms of
> actual disk usage).

I'm not familiar with NetApp, but if your flexclones are exported as different filesystems you have a problem right there. Even if, say, 100 clones of the database use only 5 TB of real disk, once you export them the instances see 100 TB of apparently distinct data.

> Each of them is backed by its own postgres instance
> running in a dedicated VM.

There is a problem here: each VM needs its own cache. If I needed to spin up a lot of databases on limited hardware, what I would try is a single host with a snapshotting filesystem, i.e. ZFS or BTRFS or whatever, running a lot of instances. This way you share the code memory between all the instances and you have a single OS instance, which can easily save hundreds of megs per machine. But, what is more important: you cannot do anything about shared_buffers, work_mem et al., as they are per-instance, but postgres also relies on (an estimate of) the OS cache. If your instances run on the same machine, with say 32 GB of cache, and the OS in that box is the one hosting the snapshots, it can reuse cached blocks from one instance for another; that is not possible if you are net-mounting them. I do not know how much you would gain, as it will depend heavily on your concrete access patterns, but it is going to be faster.

BTW, I suppose you've already optimized things for tests, like running without fsync, minimal logging, etc. in the dev environments (there is a small sketch of what I mean further down).

> Data refreshes are done on-demand by stopping
> the postgres instance, unmounting and deleting the volume backing it,
> making a new clone of the daily snapshot, mounting it, and bringing
> postgres back up.

This would be the same as what I describe, just done in a different place. Postgres doesn't mind whether it gets a full VM or just a data directory and a port.

> But we don't have the money (or, more importantly,
> datacenter space) to dedicate a piece of hardware to each of these. So
> most dev environments can only handle ~1 person using the application at
> a time, due to trying to run a complex ORM (Django) against a ~1T
> database on a host with < 24G RAM.

Every time you quote one of these figures it seems more and more like you are in a big place with a really weird set of requirements. AFAIK Django is not that heavy; the problem may be more the kind of things you are trying to do with it. And I do not quite understand what you are calling 'a testing environment'. From only what I've read, if the NetApp thingy you mention is fast enough to serve the snapshots, I would normally think: RAM is cheap, why not put 32 GB in each developer machine and run everything there? But I suppose you have a lot of other constraints.

...
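About the fsync / minimal-logging remark above: purely as an illustration, here is a minimal sketch (Python; the clone path is made up, and the values are the usual durability-for-speed trade-offs, nothing specific to your setup) of the kind of dev-only overrides I would append to each clone's postgresql.conf before its first start:

    import os

    # Dev-only overrides: trade crash safety for speed on throwaway clones.
    # These are standard postgresql.conf parameters; values are illustrative.
    DEV_SETTINGS = {
        "fsync": "off",
        "synchronous_commit": "off",
        "full_page_writes": "off",
        "wal_level": "minimal",
        "max_wal_senders": "0",   # must be 0 when wal_level is minimal
    }

    def apply_dev_settings(datadir):
        # Postgres honours the last occurrence of a setting in postgresql.conf,
        # so appending is enough to override whatever the master volume has.
        conf = os.path.join(datadir, "postgresql.conf")
        with open(conf, "a") as f:
            f.write("\n# dev/test overrides -- NOT for production\n")
            for key, value in sorted(DEV_SETTINGS.items()):
                f.write("%s = %s\n" % (key, value))

    if __name__ == "__main__":
        apply_dev_settings("/srv/pgclones/dev42")   # hypothetical clone path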
> So, we really need ~50 copies that can be refreshed at arbitrary
> intervals. Since I've already determined that's absolutely impossible
> without stopping postgres, I calculated 100-150 assuming we do bulk
> refreshes at night, and each developer is allocated 2-3 possible
> refreshes in a given 24-hour period.

That left me still more puzzled.

> Since we've allowed the database to grow without using Django's
> facilities to split largely unrelated objects into different databases
> and handle any JOINs in the application, and we've gradually let the
> database diverge from the models actually in the ORM, every attempt at
> crafting a DB with a restricted set of recent data has failed. And yes,
> unfortunately, individual developers have had their own private database
> instance for long enough that migrating to a shared DB approach has been
> discarded as an option.

As you say in your first paragraph, it seems you have dug a really deep hole. From what I've read so far, a lot more knowledge about your problem would be needed to design a solution, and I'm not going to try to do it. The only thing I can do is repeat what I said above: virtualization is nice, but running several instances on the same machine (be it virtual or real) will lower your requirements. NetApps are nice, but they virtualize the disks and hide the potential sharing from the OS under the db server, so moving the snapshots nearer will lower them too.

.....

> Unfortunately, no, we're on a project that's far from having an
> unlimited budget.

That I supposed. The thing is, as you stated, you may already be over budget. Given the numbers you quote I have a mental picture of your project, and from that picture the cheapest way out seems to be throwing hardware at it: you appear to be trading a few k EUR/k$/whatever for lost productivity across a whole lot of developers, and, time being money especially in these cases, the hardware looks cheaper. Although, as I said, I do not know your other restrictions.

> So you're suggesting running multiple instances of postgres in the same
> operating system, and they'll be able to share cache, instead of trying
> to get multiple DBs in the same instance? Is there any accepted method
> of managing N postgres instances on one host? Any tools or documentation
> for that, or best practices? Seems scary to me, but I suppose it's the
> best option we have...

You are scared of N servers? Someone who was looking into copying tablespaces around? Do not worry, it is easy. If you want tools, Debian and Ubuntu installations ship with the ability to run multiple server instances; even if that is not adequate for you, you could grab some ideas from there. We have done this kind of thing for version upgrades (running two versions in parallel, using pgbouncer to be able to migrate databases one by one), and it was easy.

Basically, all a postgres instance needs is a directory for the data and a port number. If you configure a master volume which has everything inside (even the conf files), you can then spin up an instance easily by mounting a COW copy of it elsewhere, changing the port in the conf and firing up a postmaster. IIRC you do not even need to edit the conf: you can pass the port at start time if it is not set in the conf file. You may have a problem if you are using tablespaces in the snapshot; that would probably need a little fixing of symlinks before starting the instance.
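To make the "mount a COW copy, change the port, fire up a postmaster" recipe concrete, here is a minimal sketch assuming ZFS; the dataset names, mountpoint and binary path are made up, and you would run it (or at least the pg_ctl part) as the postgres OS user:

    import subprocess

    PG_BIN = "/usr/lib/postgresql/9.3/bin"   # adjust to the installed version

    def spin_up_clone(snapshot, clone_ds, mountpoint, port):
        # COW-clone the daily snapshot of the master data volume and mount it
        # where the new instance expects its data directory.
        subprocess.check_call(["zfs", "clone",
                               "-o", "mountpoint=%s" % mountpoint,
                               snapshot, clone_ds])
        # Fire up a postmaster on the clone. Command-line options take
        # precedence over postgresql.conf, so the port needs no editing.
        subprocess.check_call([PG_BIN + "/pg_ctl", "-D", mountpoint,
                               "-o", "-p %d" % port, "-w", "start"])

    if __name__ == "__main__":
        spin_up_clone("tank/pgmaster@daily", "tank/dev42",
                      "/srv/pgclones/dev42", 5442)

Tearing one down is the reverse: pg_ctl stop on the clone, then destroy the clone dataset.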
> By "A postgres instance should happily run in half a gig" I assume > you're not counting cache/buffers, etc.? Because when trying to run > automation (i.e. automated testing) on our application, we can't get > query execution time down to an acceptable level with anything less than > ~48GB memory available in the host. There it comes again. Big problem, little resources. You use more RAM for automation testing than I use to run production for our whole shop ( counting every process, not just the DB ). Postgres will run happily, but slowly in your case. If you need to run automation on every one of your 50 copies, you have the minimum requirement, 48Gb. With VM I doubt you could squeeze it in less than 2400Gb, although memory deduplication may help you. But if you run 50 copies of postgres in the same OS and the OS knows the datadirs are COW, so it automatically shares caches, you may be able to do better. As always it depends on what your 48G are used for, if they are for work mem, it wont help you. If they are for cache it might. If they are for shared buffers, you may investigate some tunning. Anyway, the numbers you quote everytime you put one always remember me of the classical film words 'He is dead, but still does not know it', sorry, but this is my impression. > Thanks for the suggestion though. I was considering this as an > alternative, but every experience I've had of running multiple (let's > say, more than 2 or 3) instances of the same application/daemon on the > same host has inevitably led to all sorts of strange issues and > problems... I suppose it sounds like it's going to be the only option > though... Running multiple copies of the same program requires a little planning, but is a great resource saver. You only have one OS, one cache, one cpu / disk scheduler running ( and one init daemon, one cron daemon, etc... ). It's a kind of lightweight virtualization. And postgres is normally very well behaved in this scenarios. Francisco Olarte.