Designed to Fail: Apple Time Machine and MacOS X Server
I have a 1TB Apple Time Capsule, which I mention by way of introduction and not as a means to lord it over you, dear reader, that I’ve got something you might not have. I use this Time Capsule to back up my MacOS X Server.
According to the product page, here’s what Time Capsule can do:
Time Capsule is a revolutionary backup device that works wirelessly with Time Machine in Mac OS X Leopard. It automatically backs up everything, so you no longer have to worry about losing your digital life.
and
Backing up is something we all know we should do, but often don’t. And while disaster is a great motivator, now it doesn’t have to be. Because with Time Capsule, the nagging need to back up has been replaced by automatic, constant protection. And even better, it all happens wirelessly, saving everything important, including your sanity.
The only requirement for your Mac is that it is running MacOS X 10.5.2 or later.
That’s because MacOS X 10.5 introduces Time Machine like this:
For the initial backup, Time Machine copies the entire contents of the computer to your backup drive. It copies every file exactly (without compression), skipping caches and other files that aren’t required to restore your Mac to its original state.
and
By default, Time Machine backs up everything on your Mac.
(Emphasis mine.)
So when I hooked it up to my MacOS X Server 10.5.4 server and turned on Time Machine (one click!… and some typing to set the password, but we’ll overlook that), I was pleased to see that, indeed, my server was backed up. I could browse the backup and see it was good. Cool!
But then the other day, disaster struck only hours before we were to leave for the airport: my server had crashed in such a way as to be unrecoverable. Things were grim, and the only thing I could do was to restart from the Server installation DVD and restore the machine from the Time Capsule. Well, it did that in about 9 hours and with some remote system administration from my beautiful wife, the machine was back up and running.
Except it wasn’t.
E-mail didn’t work. Web services wouldn’t start. All hell was breaking loose in the system logs as various services which were trying to start up just plain wouldn’t start up. Things which previously had a quiet existence on this machine were suddenly vociferously complaining about a plethora of problems. While each had its own gripe, most were unhappy about the nonexistence of a log directory, /var/log. “Huh?” I thought to myself, “I thought I read that Time Machine backed up my entire machine to get it back to its original state. What’d I miss? And why do I feel like it’s my fault all of a sudden?”
A little searching on the web reveals that there’s a list of stuff Time Machine doesn’t back up, which on a normal MacOS client machine might be OK but on a server is disastrous. The list, stored at
/System/Library/CoreServices/backupd.bundle/Contents/Resources/StdExclusions.plist
has, among other things, these excluded items:
/private/var/log
and
/private/var/spool
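If you want to see what the list on your own machine contains, a few lines of Python will do it. This is only a sketch: it assumes nothing about the key names inside the plist and simply collects every string value that looks like an absolute path. The helper names (collect_paths, excluded_paths, is_excluded) are made up for illustration.

```python
import os
import plistlib

# Path quoted in the article; the plist's internal key names are not assumed here.
STD_EXCLUSIONS = ("/System/Library/CoreServices/backupd.bundle"
                  "/Contents/Resources/StdExclusions.plist")

def collect_paths(node):
    """Recursively gather every string value that looks like an absolute path."""
    if isinstance(node, str):
        return [node] if node.startswith("/") else []
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, (list, tuple)):
        found = []
        for item in node:
            found.extend(collect_paths(item))
        return found
    return []

def excluded_paths(plist_bytes):
    """Parse plist data (XML or binary) and return its path-like strings, sorted."""
    return sorted(set(collect_paths(plistlib.loads(plist_bytes))))

def is_excluded(path, exclusions):
    """True if path equals an excluded path or lies underneath one."""
    return any(path == ex or path.startswith(ex.rstrip("/") + "/")
               for ex in exclusions)

if __name__ == "__main__":
    # Only prints something on a machine that actually has the file.
    if os.path.exists(STD_EXCLUSIONS):
        with open(STD_EXCLUSIONS, "rb") as fh:
            for p in excluded_paths(fh.read()):
                print(p)
```

With the two entries above in the list, is_excluded("/private/var/spool/imap", ...) comes back true, which is exactly the problem.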
OK, log files don’t necessarily need to be backed up, though keeping the most recent one, so you can see what happened before the crash, would be nice. Nonetheless, if various services need log files or log directories to exist in order to run, then something, somewhere, must recreate them or the system never gets up and running, and the backup has, in fact, failed. Apache is quite content to gripe that it can’t make logfiles. Amavis, too, can’t do anything unless the directory is there. Sorting out which logs need to be where, who owns them, and what their permissions should be took me the better part of two hours, and I know I don’t have them all right. (And that’s only for the few services I’m running. God only knows what I’m missing for the others.)
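A check along these lines would have saved me some of the log-file spelunking. It is a minimal sketch that only reports which directories are absent after a restore; it does not recreate them, because the correct owners and permissions differ per service. The REQUIRED_DIRS list is a hypothetical example, not a complete inventory of what MacOS X Server needs.

```python
import os

# Hypothetical examples only: the real set depends on which services you run.
REQUIRED_DIRS = ["/private/var/log", "/private/var/spool/imap"]

def missing_dirs(required, root="/"):
    """Return the entries of `required` that are not directories under `root`."""
    missing = []
    for path in required:
        candidate = os.path.join(root, path.lstrip("/"))
        if not os.path.isdir(candidate):
            missing.append(path)
    return missing

if __name__ == "__main__":
    # Run on the restored machine; prints one missing directory per line.
    for path in missing_dirs(REQUIRED_DIRS):
        print("missing:", path)
```

The root parameter is there so the same check can be pointed at a mounted backup or a restored volume instead of the live system.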
But (and this one is inexcusable) not backing up /var/spool, which includes /var/spool/imap, where my IMAP users’ E-mail is stored, is insane and has my blood boiling. (On MacOS X, /var is a symbolic link to /private/var, so this is the /private/var/spool entry in the exclusion list.) This is an oversight which is completely uncharacteristic of Apple but for which there is no excuse.
The next four hours I spent trying to recover my IMAP users and getting Postfix to run were maddening. I had lost nearly a gigabyte of E-mail. Multiple IMAP directories had to be “reconstructed,” sometimes a success and sometimes not, according to the many webpages out there. Even Apple’s own webpage describing this process failed. (Something about partition “/var/spool/imap/user” not existing.) My users were similarly peeved. “You mean we bought that expensive Time Capsule instead of a simple external hard drive for mirroring and it didn’t work?!”
Never mind the fact that it’s not the Time Capsule at fault, it’s Time Machine. They didn’t see it that way. They saw an expenditure that was unjustified because it simply didn’t work. I saw a maddening amount of work on a weekend, on my vacation, from 1000 miles away, because it simply didn’t work. And that’s just wrong, wrong, wrong.
Note to Steve Jobs: You would be incensed, too.
My users and I are angry, and rightfully so. You, Apple, make a promise about a very important function and you don’t keep it. This is backup, for goodness’ sake! This is the kind of thing that has to work. If you say “backup,” it implies that it will, it shall, it must work.
And it didn’t.
Another note to Steve Jobs: If this had happened to you, you would have seen to it that somebody got yelled at and it would have been fixed immediately. Really. This is the kind of thing you hate. (I’ve read enough of Fake Steve Jobs to know how you think, man.)
Coming “real soon now”: an article about how to recover from the various messes left behind by a Time Machine restore of MacOS X Server. (Just as soon as someone answers the question I pose here, that is.)
Oh man, this sux. Tell me you've filed a bug report with Apple about this. I mean, I laughed it off when I restored my OS X client from a TM backup and noticed those log folders were not created, but the same thing is happening on the SERVER?! Jesus... I wonder sometimes about Apple.
Adi
I have found another flaw with Time Machine. I have struggled for days to understand why my development directory, which happens to be in a directory called dev on a separate disk partition (called 'Data'), was not being backed up.
It seems that the standard exclusions list that you kindly pointed me to in this article is GLOBAL: it applies to all disk partitions and is not tied to any particular partition.
So the standard exclusion of /dev, as you may well expect, excludes the /dev directory, since its entries are not real files and should not be backed up. HOWEVER, it also causes my development directory /dev on my Data partition not to be backed up. (Its real path is actually '/volumes/Data/dev', but Time Machine seems to see the Data partition as a root partition.)
OK, slapped wrists to me for having the cheek to create a directory called dev, but hell, Time Machine SHOULD be able to differentiate between /dev and /volumes/Data/dev, especially when there is no OS on that partition!!
I will still continue with Time Machine as it is useful for basic file backups (once you work out what it is NOT copying!), but you could never trust it for a full restore. For that, only SuperDuper or Carbon Copy Cloner will really do the job. Before Time Machine I used iBackup, which is free and very good but lacks the whizzy zooming effects of Time Machine, and it does seem to back up whatever you ask it to back up!
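The /dev collision described in this comment can be modeled in a few lines. To be clear, this is not Time Machine's actual code; it is a sketch of the behavior as observed, next to a stricter alternative, and all three function names are invented for the illustration.

```python
import posixpath

def volume_relative(path, mount_point):
    """Express `path` relative to the root of the volume mounted at `mount_point`."""
    if mount_point == "/":
        return path
    rel = posixpath.relpath(path, mount_point)
    return "/" + rel if rel != "." else "/"

def excluded_globally(path, mount_point, exclusions):
    """Observed behavior: each volume is treated as if it were a root volume."""
    return volume_relative(path, mount_point) in exclusions

def excluded_boot_only(path, mount_point, exclusions):
    """Stricter alternative: the standard list applies to the boot volume only."""
    return mount_point == "/" and path in exclusions

if __name__ == "__main__":
    rules = {"/dev"}
    # A development directory named dev on a data volume.
    print(excluded_globally("/Volumes/Data/dev", "/Volumes/Data", rules))
    print(excluded_boot_only("/Volumes/Data/dev", "/Volumes/Data", rules))
```

Under the first rule the data volume's dev directory is skipped along with the real /dev; under the second, only the boot volume's /dev is.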
Hey, that's an excellent find, and certainly worthy of reporting to Apple with their feedback mechanism. You can certainly include the URL of this post in your feedback, and hopefully they'll get my message... again.
Heads up, folks--Dave found an interesting one!
(This post, by the way, is the most popular post on my website. This is a hot topic!)
I followed the link you left in the comments on /dev/why. I didn't know there were other things exempted from the backup like that! I also run a little mail server on OSX Server and would be very unhappy to recover my server only to discover that it wasn't recovered... Did you experiment with removing those exemptions from the plist file?
Actually, I did, but forgot to mention it. I did eliminate the exclusion with
which did it. After doing this, I rebooted (because I wanted to make sure it really did take effect), watched a backup, and verified that this directory was there and populated with real information.
So that works for me... YMMV.
Just thought I'd check to see if things are still OK after you changed the plist file, I'm thinking of doing this myself... Thanks!
Yes, things are still OK. I think you can modify the plist file, reboot, and watch a backup to make sure things do get backed up.
YMMV, as always.
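For anyone who would rather script the change than hand-edit the file, here is one possible approach. It is not the command referred to in the reply above, just a hedged sketch: it removes the named strings from every list in the plist, whatever the keys are called, and writes the result to a separate file so the original stays untouched until you decide to swap it in (which needs root). The function names are made up for illustration.

```python
import plistlib

def drop_exclusions(node, unwanted):
    """Return a copy of a parsed plist with the given strings removed from every list."""
    if isinstance(node, dict):
        return {key: drop_exclusions(value, unwanted) for key, value in node.items()}
    if isinstance(node, list):
        return [drop_exclusions(item, unwanted)
                for item in node
                if not (isinstance(item, str) and item in unwanted)]
    return node

def rewrite_copy(src, dst, unwanted):
    """Read the plist at `src`, drop the unwanted entries, write the result to `dst`."""
    with open(src, "rb") as fh:
        data = plistlib.load(fh)
    with open(dst, "wb") as fh:
        plistlib.dump(drop_exclusions(data, set(unwanted)), fh)
```

Something like rewrite_copy(path_to_StdExclusions, "/tmp/StdExclusions.new.plist", ["/private/var/spool"]) gives you a file to inspect before replacing anything. As in the replies above, reboot and watch a backup afterwards to confirm the directory really shows up.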
Excellent article - saved my bacon when a Mac OS X Leopard Server corrupted its file system on an upgrade. Just a comment: I was going to apply this fix on a new Mac OS X Snow Leopard Server, but it's been fixed by Apple as of 10.6 :-)
As an Administrator of a Server you have to know:
a not tested Backup is no Backup. If you test your Backup once a week, you have only one Backup per week. If you test only one part of your Backup, the rest will fail. Every "Judge" knows the "Murphy's law" about this.
TM on a ServerOS is only fore FileServer-Dokuments. RTFM and forget the rest.
a Wireless Network is no Network. (in a Server environment)
TimeCaps what is this? kind of NAS from Apple? forget it, its nice to play with it in a private household but don't forget to save you Pictures on a reliable Media.
Do not forget:
Don't blame Apple (or Steve) fore things YOU don't know.
If you ask to have a very cheap "solution" don't blame any one else if you get it.
Every thing you don't know costs his price! Ask a professional to sell, setup and care fore your server.
You can take your lesson the hard way "learning by doing" You can take classes. You can take Server as you hobbyhorse But the roules stay the same: 1. do Backups 2. test your Backups. 3. do more Backups.
Rax,
I take exception to much of what you have said, but agree with some, so, point by point:
"a not tested Backup is no Backup" Correct. But I still maintain that I should never have to question the content of the backup for a backup system that Apple claims makes an exact copy of my Mac. (Apple's words, not mine.)
"TM on a ServerOS is only fore FileServer-Dokuments. RTFM and forget the rest." No, no it's not. There is no caveat made whatsoever by Apple that Time Machine is in any way, shape or form not for OSX Server as well. Besides, define "documents." Aren't my users' e-mail files considered "documents," too? Why or why not? Because of where they're located in the file system?
Have you read the manual? I have. Here are some links you might find useful. Please feel free to help me recall which pages specifically mention that Time Machine should not be used with MacOS X Server:
This one mentions installing OSX Server 10.6 from a Time Machine backup (page 5) of an OSX Server.
Whoops! This one mentions that Time Machine users might have different needs than system admins, and introduces rsync, ditto, etc. (page 41). It doesn't say not to use Time Machine, though.
Ah ha! Finally! A manual that says "Time Machine backup of server data isn’t supported for an advanced server." So you're right... until page 53, which has this to say about Time Machine backups:
I added some emphasis to this description, which pertains to 10.5 (just so we're on the same page), to show that the things the Time Machine backup doesn't back up are the settings. It's pretty explicit. Interestingly, elsewhere in this same manual, it implies that this might not quite be the case, but doesn't go into any detail thereof. That's one hell of an omission.
"- a Wireless Network is no Network." I'm running full GigE between the server and my Time Capsule and never said anything about wireless. Although again Apple makes the specific claim that Time Capsule works just as well wirelessly as it does wired, and it does. It's just that Time Machine fails to back up the files that it says it will back up. That is the entire substance of my complaint. I have nothing against Time Capsule or wireless networks, nor do I complain about them here.
"- TimeCaps what is this? kind of NAS from Apple? forget it, its nice to play with it in a private household but don't forget to save you Pictures on a reliable Media." Yes, it is. Yes, this is a private household that uses it. But that's irrelevant because Time Machine fails to back up the files that it says it will back up, no matter where they are being backed up.
By the way, I do back up my pictures to removable drives, rotated regularly and kept offsite. I'm not willing to let one lightning strike take everything out.
"- Don't blame Apple (or Steve) fore things YOU don't know." I blame Apple for something they explicitly told me that turned out not to be true. I "knew" it because they said so. No more, no less.
"- If you ask to have a very cheap 'solution' don't blame any one else if you get it." Price is not the question. The featureset of the product is explicit and, though the product cost me $1,000 and was a relative bargain compared to Windows Server licenses, it is anything but "cheap."
"- Every thing you don't know costs his price!" How right you are.
"Ask a professional to sell, setup and care fore your server." I am that professional. The level of experience with the product, however, does not dictate whether or not I have to doubt Apple's words. Just because someone is a "pro" doesn't mean that he instantly doubts the veracity of Apple's product claims. Do pilots doubt Boeing's claims that the plane will fly just because they know more about the plane than the passengers? No. Both take Boeing's claims at face value because there's reasonable assurance that the claims are true. (Skeptics need not apply to flight school.)
"You can take your lesson the hard way 'learning by doing'" Also true.
"You can take classes." Also true.
"You can take Server as you hobbyhorse" Also true.
"But the roules stay the same: 1. do Backups 2. test your Backups. 3. do more Backups" Good, solid advice. It's the best material of what you posted.
/Bill
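Rule 2 ("test your Backups") can be partly automated. The sketch below walks a source tree and lists every directory that has no counterpart in a backup tree, which would have flagged the missing log and spool directories before any restore was needed. It assumes the backup is browsable as an ordinary directory tree mirroring the source; the function name and both arguments are illustrative, and it checks only that directories exist, not file contents.

```python
import os

def dirs_missing_from_backup(source_root, backup_root):
    """Return source directories (as relative paths) with no counterpart in the backup."""
    missing = []
    for current, subdirs, _files in os.walk(source_root):
        rel = os.path.relpath(current, source_root)
        kept = []
        for name in sorted(subdirs):
            rel_path = os.path.normpath(os.path.join(rel, name))
            if os.path.isdir(os.path.join(backup_root, rel_path)):
                kept.append(name)         # present: keep descending into it
            else:
                missing.append(rel_path)  # absent: report once, skip its children
        subdirs[:] = kept                 # prune the walk in place
    return missing
```

A missing parent is reported once rather than once per child, so an excluded /private/var/spool shows up as a single line instead of one line per mailbox.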
@Bill, how did you manage to change your StdExclusions.plist file?
Mine on 10.5.8 Server fails to save due to a permissions problem: The document "StdExclusions.plist" could not be saved. You do not have appropriate access privileges. To view or change access privileges, select the item in Finder and choose File > Get Info.
Of course I gave my account read and write permissions, and Get Info lists "You can read and write". Despite that, saving the StdExclusions.plist file from "Property List Editor" version 2.2 (701) is still not possible.
I used emacs from the command line:
$ sudo emacs /System/Library/CoreServices/backupd.bundle/Contents/Resources/StdExclusions.plist
should take care of it for you. I didn't use the Plist editor because... well, because I didn't think of it, actually!
Hope that helps!
Bill
I just had to deal with similar insanity on a brand-new Snow Leopard Server machine. We were using Time Machine, too. We aren't doing anything weird - just AFP file sharing and network logins for a handful of Macs.
The problem? Every 24 hours or so, the file server would drop all its connections. When a Time Machine backup runs. I wish I were f'ing kidding. (And yes, Apple confirmed, "Time Machine doesn't work on OS X Server". The AppleCare rep told me to use SuperDuper instead.)
--Quentin
Funny thing: I've been running 10.6.5 on a mini with an external FireWire drive. Other than unmounting the FW drive occasionally (and I have no idea why), MOSXS and Time Machine seem to be playing together very nicely. No AFP dismounts recently.
I'm still disappointed that Apple won't remove the functionality from Server or won't acknowledge the problems and fix them for good.
On the other hand, I'm happy that my installation is working fine.