PowerBlogs.Com Development

Server Problems

Something is wrong with one of the powerblogs servers, and we're currently working on figuring out what and resolving it. I'll keep you up to date as I get more information.

Update: The problem appears to be with the server's hardware. While the people at EV1's datacenter are working on fixing that, I've rented another server and will be restoring it from backups (we have a backup current as of early this morning EST). If I'm done before the other server is fixed, it will become the new primary server and the other server will become its backup when it's fixed. If the current server is fixed first, this new server will become a backup for the old server. I've made a lot of progress on having the software support the live backup system that I described earlier, so when I've got two servers, I'll push to get the rest done ASAP (the log reports were a major bottleneck, but the new off-server generation method will now only take a little bit of modification to adapt to that problem).

Update: So it appears to be a hard drive problem. I've put in to have an EV1 specialist look at the machine and replace the hard drive if necessary. In the meantime, we're working to get the new server up and loaded with the backups. My guess now is that I'll have that up before the older server is fixed. For those of you who set your DNS statically (rather than using a CNAME), we'll have to modify your DNS once the new server is up. (Once the new server is up, youraccount.powerblogs.com will work immediately, as will anyone whose DNS is a CNAME.) I'll keep you up to date on the progress with the new server we've ordered and are currently configuring.
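
For anyone wondering why a CNAME picks up the change automatically while a static entry does not, here is a toy sketch in Python. It is only an illustration, not how real DNS resolution is implemented: the record table, the example.com names, and both IP addresses are made-up placeholders.

```python
# Toy model of DNS records: name -> (record type, value).
# All names and addresses below are hypothetical placeholders.
OLD_IP = "192.0.2.10"  # stands in for the failed server
NEW_IP = "192.0.2.20"  # stands in for the replacement server


def build_records(powerblogs_ip, static_ip):
    """Build a tiny record table for one blog with a custom domain."""
    return {
        "youraccount.powerblogs.com": ("A", powerblogs_ip),
        # CNAME: an alias that points at another name, not at an address.
        "www.example.com": ("CNAME", "youraccount.powerblogs.com"),
        # Static entry: an address typed in at the blogger's own registrar.
        "example.com": ("A", static_ip),
    }


def resolve(name, records, max_hops=8):
    """Follow CNAME aliases until an A record gives an address."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: look up the target name instead
    raise RuntimeError("CNAME chain too long")


before = build_records(OLD_IP, OLD_IP)
# After the move, only the powerblogs.com record is changed on our side.
after = build_records(NEW_IP, OLD_IP)
```

In this model, resolve("www.example.com", after) follows the alias to the new address, while resolve("example.com", after) still returns the old one until that record is edited by hand.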

Update: We've got the new server, and have done the base config. I'm working on turning it into a Powerblogs server at the moment, which is coming along well. Soon, I'll be testing it out. Once I've tested it, I'll start restoring from this morning's backups.

Update: I believe that I've finished configuring the new server as a Powerblogs server. I've been testing it, and the tests are looking good now. I'm just about ready to start restoring from this morning's backups.

Update: I've done tests, and the server config definitely appears correct. I'm uploading the backup data to the server. As soon as it's up, I'll begin restoring from it.

Update: Reloading the backup data is going well. I've got the main database up and have done some initial integrity checks and it looks good. I'm still working on getting auxiliary data (stylesheets, uploaded files, etc) up and restored. Once I'm finished with that, I have to republish the blog pages, do a final check to make sure everything is ready, and modify the DNS settings, but once that's finished we'll be good to go.

Update: Oh, fyi, I still haven't heard back from the EV1 "systems support specialist" about the old server. Six hours ago, they said they were going to begin investigating shortly.

Update: I've got all of the data uploaded and am currently regenerating the blogs. After this, I need to verify that everything's correct and change the DNS, and we'll be all set.

Update: The regeneration is moving along. I've been doing some testing and things are looking good so far. It shouldn't be too long after the regeneration that we go live. I've got a few more tests to do, and then I might start changing the DNS for blogs as they finish their regeneration individually.

Update: The regeneration is going pretty well. Only volokh and whiteperil are left. If your DNS isn't working, please send an email to support@powerblogs.com and I'll investigate. I think that most people on this server are set up to automatically pick up the DNS change.

Also, I've heard from EV1 and it looks like we might be able to recover the data from the hard drive. I'll let you know more on that front as soon as they tell me.

Update: The regeneration is very nearly complete, and nearly everything is back up again. Scheduled posting is not yet enabled, however, and probably won't be until tomorrow. For various technical reasons, the restore re-created posts which had been saved for later. I need to write a quick program to cull the posts which are saved for later but were already published, and I'm too tired at the moment to trust myself to do that correctly. I should have that up some time tomorrow. I'm going to bed now, but the DNS for the remaining blogs has been set and I've done a partial regeneration on them, so their front pages work and they can be used. That means that as of now, all Powerblogs blogs can be used, except for scheduling posts (you can schedule them, they just won't be published until some time tomorrow).
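
The culling step described above can be sketched roughly as follows. This is a minimal illustration in Python and not the actual Powerblogs code: it assumes each post is a dictionary with an "id" key, and that a post keeps the same id whether it sits in the saved list or the published list. Both of those are assumptions made for the example.

```python
def cull_saved_posts(saved, published):
    """Return only the saved posts that have not already been published.

    Hypothetical data model: each post is a dict with an "id" key, and a
    post has the same id in both the saved and the published lists.
    """
    published_ids = {post["id"] for post in published}
    return [post for post in saved if post["id"] not in published_ids]


# Example: post 2 was "resurrected" by the restore even though it had
# already gone out, so it is dropped from the saved list.
saved = [{"id": 1, "title": "Draft"}, {"id": 2, "title": "Already out"}]
published = [{"id": 2, "title": "Already out"}]
remaining = cull_saved_posts(saved, published)
```

The function builds a new list rather than deleting in place, so the result can be checked before anything is actually removed.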

Update: I'm working on fixing the problem where saved posts that were published have been "resurrected". I hope to have the redundant saved posts removed within an hour or so.

EV1 has indicated that prospects for file recovery look hopeful, but that it will take some time. They said that they should have further results for me in the late afternoon (I believe that that's Texas time). I should mention, though, that even if all of the data from the drive can be recovered, it is possible that a post could have been caught at exactly the wrong time so that it was never written to disk in the first place.

I also just realized another layer of backup which we have — the mailing list archives. You can get to them at http://powerblogs.com/pipermail/yourhostname/ — any lost posts might be there, since that's hosted on a different machine from the one that went down.

Update: The mailing list archives are looking to be a real saver from this disaster. So, if you lost any posts that were made after the last daily backup was taken, check your mailing list archive. Its URL is http://powerblogs.com/pipermail/{yourhostname}/2005-December/date.html; for example, the archive for dev.powerblogs.com is http://powerblogs.com/pipermail/dev/2005-December/date.html
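
If it helps, the URL pattern above can be written as a small Python helper. This is just a sketch: the function name is made up, and it assumes that other months follow the same "year-MonthName" pattern as the December 2005 example, which is not confirmed here.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def archive_url(hostname, year, month):
    """Build the by-date mailing list archive URL for a blog.

    hostname is the blog's powerblogs.com name, e.g. "dev.powerblogs.com";
    only the part before the first dot is used in the URL.
    Assumption: every month uses the same pattern as 2005-December.
    """
    account = hostname.split(".")[0]
    return (
        "http://powerblogs.com/pipermail/"
        f"{account}/{year}-{MONTH_NAMES[month - 1]}/date.html"
    )


dev_archive = archive_url("dev.powerblogs.com", 2005, 12)
```

For dev.powerblogs.com in December 2005 this gives the same address as the example above.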

Update: I'm nearly done culling the saved posts, and will very shortly re-enable the scheduled poster.

Update: I'm done culling the saved posts, and have re-enabled the scheduled publisher. Scheduled posts should be good from here on out.

Update: For some blogs, the trackbacks to their posts didn't restore correctly. The trackback data is not lost, and I'm working on restoring that.

I'd also like to take this opportunity, now that things are calming down, to give my heartfelt apologies that it was not handled better and more swiftly, and that it caused so much grief, frustration, and worry. I'm very sorry for that, and we're working on ways to improve our reliability.

Update: For some people, not all of the comment pages were written properly. I've triggered a republishing of everyone's comment pages. This is slowing things down a little bit, but that should be over with in an hour or two. Also, I've been tuning the system to improve performance.

Posted by Chris on 12.28.2005.
Gaijin Biker (mail) (www):
Where are the backup servers that were supposed to stop service outages like this from happening again?
12.28.2005 10:28am
Gaijin Biker (mail) (www):
You know, the thing you posted about here?
12.28.2005 10:50am
Gaijin Biker (mail) (www):
"As I said, it's ambitious, but it will result in a system with no single point of failure that should be able to incorporate new servers quickly and seamlessly. Getting this done is going to be top priority here at Powerblogs. The transition to the two redundant servers should be completed within two weeks."
You posted that on September 6. It is now almost four months later and the redundant servers aren't up yet? Color me disappointed.
12.28.2005 11:28am
Chris (www):
GB,

I'm not ignoring you, I'm just working feverishly to get the new server up with the backup data. I'll respond once I've got that up and working.
12.28.2005 2:46pm
Gaijin Biker (mail) (www):
I don't think you're ignoring me, I think you're lying to me. You said you had installed redundant servers months ago, and you have not. If you had, you wouldn't need to be "working feverishly" now.

This kind of unreliability is not what I wanted when I decided to move my blog to a paid service, and is completely unacceptable.
12.28.2005 5:10pm
Chris (www):
I never claimed to have installed the servers. I was wildly over-optimistic in how long it would take to move to the redundant servers: there were a number of theoretical problems that I didn't appreciate in implementing true redundancy in web-accessible software. I am certainly guilty of not actively announcing the status of that transition, and making it clear that my estimate was incorrect, which I should have done. I am very sorry for that, and I apologize.
12.28.2005 5:29pm
Dean Esmay (www):
Any idea how much longer it will be?
12.28.2005 8:32pm
Chris (www):
Dean,

It's hard to say for sure, but my sense is in about an hour.
12.28.2005 9:23pm
Chris (www):
Dean,

Then again, your blog is taking forever to regenerate (it's most likely all of those comments). The Volokh Conspiracy is likely to take a long time too. So, it might take a bit longer. But I've got a blog up that I can play with to make sure that everything is working, so I can do the testing before everyone's regeneration is complete. Once the testing is good, I can start moving the DNS over.

By the way, you're one of the people who have a static DNS set (in your case, for deanesmay.com; www.deanesmay.com is a CNAME which means that you don't have to do anything for it).

(The new IP will be 66.98.172.69.)
12.28.2005 9:48pm
Dean Esmay (www):
Does this mean I'm going to have to do something?
12.28.2005 9:50pm
Chris (www):
Dean,

Unfortunately, yes. Since deanesmay.com is statically set to the IP address of the old server, you're going to have to modify that DNS record. (www.deanesmay.com is a CNAME for deanesmay.powerblogs.com, so that when I change that, www.deanesmay.com will automatically be taken care of.)
12.28.2005 9:55pm
Gaijin Biker (mail) (www):
What will I have to do, if anything, for Riding Sun? It still isn't back up.

I don't know anything about modifying DNS records... if I need to do that how do I do it?
12.28.2005 10:14pm
Chris (www):
GB,

Unfortunately, you do for both www.ridingsun.com and ridingsun.com — if you email me your login for your registrar (the URL + username/password), I'll take care of it at the appropriate time.

It shouldn't be too much longer. Dean's blog is still regenerating, but it should finish pretty soon. There are a few blogs between his blog and yours, but not many, and none of them huge (==slow).
12.28.2005 10:21pm
Chris (www):
GB,

Actually, unless you've changed your Network Solutions login ID, I still have your previous email about it, and can use that, if you authorize me to.
12.28.2005 10:38pm
Chris (www):
Dean,

Ok, this is definitely taking more than an hour. Well, that's already obvious, but your blog in particular is taking an amazing amount of time to regenerate. I think that it has to do with the number of comments on the number of posts that you have, and that this is a fresh regeneration.

Anyhow, I've been working on the testing which is necessary and I'm going to start switching DNS over as soon as each blog has regenerated.
12.28.2005 10:44pm
Sean Kinsell (mail) (www):
Yeah, Chris, I'm like Gaijin Biker--anything I have to do on my end, you're going to have to let me know about.
12.28.2005 10:51pm
Chris (www):
Sean,

No, you're fine; all of your DNS entries are CNAMES, so everything will be taken care of automatically when I change your entry.
12.28.2005 10:53pm
Gaijin Biker (mail) (www):
My Network Solutions userID and password are the same ones I emailed you on August 7. Please make the DNS changes for me.

Thanks, looking forward to having my blog back online.
12.28.2005 10:57pm
Dean Esmay (www):
What's the new IP address??
12.28.2005 11:09pm
Dan Melson (mail) (www):
Chris, I think I'm cnamed but want to make certain

Dan Melson
Searchlight Crusade
12.28.2005 11:16pm
Chris (www):
Dean,

The new IP will be 66.98.172.69
12.28.2005 11:21pm
Chris (www):
Dan,

Yes, you're good.
12.28.2005 11:21pm
Dean Esmay (www):
I assume all my old non-Powerblogs files won't be working, including the old MT archives?
12.28.2005 11:27pm
Chris (www):
Dean,

Temporarily. Once the blogs are all working, for those who have old MT archives, I'll upload them as the next thing that I do.
12.28.2005 11:28pm
Dean Esmay (www):
Maybe it's time to write an MT conversion utility?
12.28.2005 11:29pm
Chris (www):
Dean,

I've already got an MT conversion utility. Unless I'm mistaken, I imported your old posts. The thing is that people want the old permalinks to work, which is why I have to copy around the old MT files (for the filenames).
12.28.2005 11:31pm
Dean Esmay (www):
I wouldn't mind importing all my old posts and then just losing the MT archives. That way I could edit the old posts and update them as needed.

Anyway: might I suggest that in future you do full backups that don't require you to regenerate everything like this?
12.28.2005 11:48pm
Dean Esmay (www):
RAID wouldn't be a bad idea either.
12.28.2005 11:48pm
Chris (www):
Ok, dean's regeneration is finished and now things are moving along. I'm switching the DNS as each blog gets regenerated.
12.28.2005 11:54pm
Chris (www):
Dean,

Actually, I do full backups. Regenerating from data is actually faster, since that goes at about 1.5 megabytes per second, and the off-site backup doesn't have that kind of bandwidth.

You're right about RAID.
12.28.2005 11:56pm
Dean Esmay (www):
I'm glad you do off-site backups.

This is the kind of thing that can destroy a business's reputation. I don't think you can work too hard to ensure a failure this bad doesn't happen again. Even if you have to raise prices.

Bloggers--at least those who love what they do--go bonkers at the thought that they could suddenly lose their entire published output due to a hard drive failure.
12.29.2005 12:03am
Gaijin Biker (mail) (www):
Do these new DNS's mean that after you finish fixing everything, we still have to wait a day or two for the changes to "propagate" throughout the Internet?
12.29.2005 12:05am
Chris (www):
GB,

Technically it could take a while, though in your case I've made the changes and your registrar's nameservers already reflect the changes. I've looked at the SOA record for your domain, and Network Solutions has the refresh set at 3 hours for your domain, so that's about the longest that anyone should take to pick up the new domain. Anyone who hasn't tried recently will pick it up immediately.
12.29.2005 12:16am
Dean Esmay (www):
Well according to Joker my DNS server settings have gone through, but God knows how many hours it'll take before that makes it down to local DNS servers. I still can't get to my site from here.
12.29.2005 12:21am
Gaijin Biker (mail) (www):
Are you saying the problem is now supposedly fixed? I still can't access my blog.
12.29.2005 12:22am
Dan Melson (mail) (www):
Okay, this is weird. Just on a lark I tried my powerblogs url and it worked, but the main site URL (www.searchlightcrusade.net) did not.
12.29.2005 12:29am
Chris (www):
Dean,

Looking at the SOA for deanesmay.com, it's got a 24 hour refresh. However www.deanesmay.com is a CNAME, and powerblogs uses a pretty brief refresh, so that should be working for you.
12.29.2005 12:33am
Chris (www):
GB,

I can get at your blog via all of its URLs. If you've been trying it in your web browser, most likely the first one you'll get to is ridingsun.powerblogs.com — I can get to your blog via www.ridingsun.com and ridingsun.com, though, so you should be able to pretty soon.
12.29.2005 12:35am
Chris (www):
Dan,

I've got that fixed now.
12.29.2005 12:36am
Gaijin Biker (mail) (www):
I can now access Dean's blog, but not mine.
12.29.2005 12:36am
Gaijin Biker (mail) (www):
I notice Dean says he lost everything posted yesterday. You said earlier that things were backed up as of early in the morning Dec. 28th. I hope I haven't lost the several posts I wrote yesterday. I don't know, since I can't access my blog yet. Could you please tell me the title of the most recent post you see on my blog? Thanks.
12.29.2005 12:40am
Dan Melson (mail) (www):
Sorry, Chris, but I just tried again. No luck. I'm getting "connection refused." On the other hand, I'm logged in on the Admin section and that's working OK. Just the main site URL seems to be the last issue, at least that I can make out.
12.29.2005 12:42am
Chris (www):
GB,

The most recent post on your blog is titled "Tuesday caption contest #18".

Any posts since the most recent backup aren't necessarily lost, depending on whether we can recover the old server. (I still haven't heard from EV1)
12.29.2005 12:43am
Gaijin Biker (mail) (www):
There are about three more posts after that one. I need you to restore those posts. Even if the server is toast, there may be some data-recovery company that can dig through it and retrieve the lost post data.

You really have no clue how angry I am about this whole scenario. Losing posts is just the final straw. Please get them back ASAP and keep me updated on your progress.
12.29.2005 12:48am
Chris (www):
Dean,

Are any of the posts that you lost still in saved posts, scheduled to go up?

I can't run the scheduled publisher until all of the blogs are restored (volokh is being restored at the moment, and after volokh, there's just Sean's blog (whiteperil)).
12.29.2005 12:57am
Gaijin Biker (mail) (www):
You didn't ask me, but none of my missing posts are in Saved Posts. They are gone. I want them back.

Also, I can access and post to ridingsun.powerblogs.com, but still no luck at www.ridingsun.com.
12.29.2005 1:09am
Dan Melson (mail) (www):
Main site URL problem seems fixed. Thanks Chris, you've done good work with a bad situation. Good night.
12.29.2005 1:18am
Chris (www):
GB,

I didn't ask because I had already checked and saw that you didn't have anything scheduled. I'll be working with the EV1 people to recover all of the data from the old server, and they've indicated that there's reason to hope that we can get it.
12.29.2005 1:21am
Chris (www):
Dan,

Thanks for letting me know.
12.29.2005 1:22am
Gaijin Biker (mail) (www):
Well, since you had to ask Dean if he had anything scheduled, it stood to reason that it wasn't something you could tell by checking on your own.

At any rate, I am very hopeful you will get my lost posts back. Any rough timeframe for this -- are we talking hours, days, or weeks?

Also, still no luck with www.ridingsun.com. I will be staying up until it is working for me.
12.29.2005 1:25am
Chris (www):
GB,

I didn't mean to imply that your mentioning it was in any way unreasonable; I meant no offense. I'm a bit tired at the moment.

As far as getting lost posts back, if what EV1 servers told me works out, then I should be able to get the data in a day or a few days.

I'm rather surprised about www.ridingsun.com — it's been working for me for a while now. Please let me know.
12.29.2005 1:33am
Gaijin Biker (mail) (www):
www.ridingsun.com is back up for me now, too.

I appreciate that you are tired. I'm tired, too. I'll be going to bed now, and I look forward to hearing about the recovery of lost posts off the server.
12.29.2005 1:47am
Chris (www):
GB,

Good to hear that it's up.

EV1 should have the server in a bootable configuration and turn it over to me tomorrow, when I can begin the data recovery process.
12.29.2005 1:52am
Dean Esmay (www):
I myself am not worried about the lost posts. These things happen. If I'd lost much more than that I'd be incensed, but a handful of losses doesn't make me all that upset.

Crashes happen. They're part of computers. Backups are there to restore most of what you lose, not necessarily everything.

If this had happened to the free BlogSpot blogs, or the old Userland free blogs, or LiveJournal, you'd find that they wouldn't even be trying to restore old data, most likely.

This sort of thing is also, by the way, why smart people make their own backups of their blogs, just in case. I back mine up periodically; this reminds me that I need to do it again.
12.29.2005 4:32am
Gaijin Biker (mail) (www):
"Crashes happen. They're part of computers. Backups are there to restore most of what you lose, not necessarily everything."
Yeah, and that's why before I signed on to Powerblogs, I asked Chris about downtime and backups. Backing up is important, and I wanted a service that would do it for me. Paying for goods and services, instead of doing everything yourself, is the cornerstone of a modern economy. Do you grow your own vegetables and slaughter your own meat?
"If this had happened to the free BlogSpot blogs, or the old Userland free blogs, or LiveJournal, you'd find that they wouldn't even be trying to restore old data, most likely."
That's why I'm not on a free blogging service.
"This sort of thing is also, by the way, why smart people make their own backups of their blogs, just in case. I back mine up periodically; this reminds me that I need to do it again."
Good to know you think I'm not smart. But even if I saved copies of my posts, what would I do about the user comments? And if I reconstructed my blog from saved posts, all the permalinks would change, breaking old links. Not a good solution. That's why I want my blog service provider to handle it for me.
12.29.2005 8:19am
Chris (www):
GB,

I've found your posts in the mailing list archive.
12.29.2005 11:19am
Gaijin Biker (mail) (www):
Thanks, Chris. I had to fix all the HTML, and replace some links and a photo, but it was definitely better than starting from scratch. Problem solved.

I am now looking forward to hearing about how you will be improving the Powerblogs backup system.
12.29.2005 1:36pm
Gaijin Biker (mail) (www):
It looks like all my trackbacks from other blogs from before the crash are gone.

Every single one.

Is there any way you can recover those?
12.29.2005 4:07pm
Chris (www):
GB,

That shouldn't be the case. Trackbacks are just part of the blog data which was backed up. I'm looking into it, and will keep you updated.
12.29.2005 4:13pm
Gaijin Biker (mail) (www):
I am also missing the COMMENTS on many older posts. It seems like all the comments from all posts in July or earlier are GONE. In August, some posts kept their comments, while on others, they are GONE. From September to the present, it looks like all the posts kept their comments, but I am not sure.

I really hope you can find the comments and the trackbacks for my posts.
12.29.2005 4:47pm
Sean Kinsell (mail) (www):
Chris, I have the same thing as Gaijin Biker: a lot of older posts show non-zero numbers of comments, but you can't view them.
12.30.2005 3:04am
Martin L Shoemaker (www):
Chris,

I tried
12.30.2005 4:39pm
Martin L Shoemaker (www):
Chris,

I tried this address to try to find my mailing list archive, and got a 404. Does that mean I don't have a mailing list archive? Or did I guess the wrong URL? I only lost one post; but it was a long post full of photos, and it took me five hours to create it. I'm hoping it can be salvaged.

Oh, and I'd like to quote from a comment I made at Dean's World:


In the six months since you convinced me to use PowerBlogs, this is the first time I've had any service problem.

In that same time frame, not counting set-up costs, my total hosting costs have been $30. (My traffic's a little lower than yours, of course.) The total time I've spent on site administration, in hours, can probably be counted on the fingers of one hand. And I'm including in that the time I spent nagging them to remind them to bill me for some services. (Chris, head honcho at PowerBlogs, was going on his honeymoon at the time, and thought that was a higher priority.)

Frankly, I don't know how they can provide such good service at such a low cost with so little effort from me. I wouldn't have guessed it was a sustainable business model. But as long as they keep doing what they're doing, I can forgive the very rare hiccup.
12.30.2005 4:43pm
Gaijin Biker (mail) (www):
Since I haven't been shy about posting complaints, I should also note that I am impressed with Chris's efforts to fix the problem. (I do hope that he won't need to demonstrate such efforts again, though.)

Frankly (as I note here), Chris's dedicated customer service is the only reason why I'm not planning to switch away from Powerblogs after this crash. The only thing worse than having a problem is having a problem and not being sure if anyone is working to fix it.

I feel even better knowing that Chris's efforts will now be going into an ounce of prevention rather than a pound of cure.
12.31.2005 6:00am