PowerBlogs.Com Development

Maintenance (News)

There's going to be some minor downtime this weekend on all servers as we upgrade our webservers. The downtime should be minimal. I'll post more here as the times firm up.

Posted by Chris on 07.07.2006. (0 Comments)
One of the powerblogs servers is down (News)

We're very sorry for the interruption of service to those who are affected, and I'll be posting more as I know it here.

Update: We've got the server back up. Things are looking reasonably normal so far, but I'm going to do more investigation. We will probably need to bring the server offline for maintenance some time late tonight. I'll post more as I know it.

I'm very sorry for the trouble that this has caused people. We're investigating the cause to make sure that it doesn't happen again. There's no reason, so far, to suspect any data loss from the server, but would everyone please just take a quick look to make sure that everything appears to be intact? We've got all of the live backups (plus the last-ditch mailing list backup), so if anything is missing we should be able to restore it promptly. Thank you, and once again we apologize for the trouble.

Posted by Chris on 06.01.2006. (0 Comments)
Server Downtime (News)

We may need to bring one of the powerblogs servers down for maintenance tonight. I'll update this post with when we start and when we finish.

Update: Ok, we're about to take the server down for maintenance. Hopefully, the downtime should only take about an hour.

Update: We're deferring the downtime until tomorrow, to give us some time to analyze what's going on further and minimize the downtime we need to perform.

Posted by Chris on 05.25.2006. (0 Comments)
Email Issues (News)

Just a quick note, at the end of last week we were having some problems with email. Everything should have been resolved and eventually received/sent, but if you emailed us or used the contact form and didn't receive a response, we would be very grateful if you would re-send your email or just drop us a note that you haven't heard back from us. Thanks!

Posted by Chris on 05.08.2006. (0 Comments)
Scheduled downtime (News)

We're going to need to bring one of the Powerblogs servers down tonight for a few minutes for some maintenance. This will probably happen at around 10pm tonight. The interruption should not last long, but we apologize for the inconvenience. I'll update this post when we start and finish.

Update: We're pushing the maintenance to Sunday. I'll update with more details tomorrow.

Update: I'll be bringing the server down for maintenance at 9pm tonight. Hopefully it shouldn't be down long. We apologize for the inconvenience.

Update: Ok, things look good now, and the server is back up. We apologize for the inconvenience.

Posted by Chris on 04.22.2006. (0 Comments)
Possible downtime tonight (News)

We might need to take one of the Powerblogs servers down temporarily for some emergency maintenance tonight at around 10pm EST. (It's the original Powerblogs server, not the newest server that experienced problems two months ago.) The downtime shouldn't last more than about an hour, and hopefully less than that. I'll update this post with more information as it becomes available.

Update: We're definitely going to do it tonight. We'll probably take it down between 10pm and 10:30pm. With luck, it won't be down for more than an hour. We apologize for the short notice, but we need to fix the problem which has just surfaced before it gets worse and forces us to deal with it. Thank you for your understanding, and we apologize for the inconvenience.

Update: Ok, we've finished with the down time for now. We'll be keeping an eye on the issue, and it may require additional downtime, but for the moment, things look good. Thank you for your patience, and again we apologize for the inconvenience.

Posted by Chris on 02.13.2006. (0 Comments)
Reports coming back online (News)

I've generated the reports again today, finally, and things are looking good. I've just put them on an automatic schedule for every 8 hours, so they should be generated regularly again from now on. I apologize to everyone for the time without them.

Posted by Chris on 01.17.2006. (0 Comments)
Improved backups (News)

Since there is still a lot of work to be done (as well as some theoretical issues to overcome) with redundant servers, there obviously needs to be significantly improved backups immediately. Here's what I've come up with so far:

  1. I've already set the off-site backups to take place every 4 hours. Unfortunately, the problem with the server that went down happened about 23 hours into the 24 hour backup cycle (i.e. at the worst possible time). This will immediately drastically cut down on the risk period. (This applies to all Powerblogs servers.)

  2. The new server is more powerful than the server which went down, and in particular has two identically sized hard drives (larger than on the previous server, too). I'm going to set up a program to mirror the entire server onto its second hard drive, so if something goes wrong with the primary hard drive, we can immediately reboot to the secondary hard drive, minimizing downtime. I'm thinking that to start we'll sync it every hour. Tom, the Powerblogs Idea Rat (his semi-official title), is doing research into whether we can use realtime filesystem change information to bring the syncing to something like every minute. (This will only apply to the new server.)

  3. I had forgotten that I purposely enabled the mailing list archives with a last-ditch backup in mind. Unfortunately, we ended up having to use the last-ditch safety net (not something that should ever happen), so I want to improve this concept. I'm going to set up an off-site mail server and modify the Powerblogs software to email the full post information in machine-readable form (suitable for use in automatic restoring) to the off-site backup before it does anything else. This will make the off-site backups for the posts genuinely real-time, or at least very, very, very close to it. (This will apply to all servers.)

  4. When we get the old server back (and test it thoroughly), until it's a redundant server sibling with the newest server, I'll keep it around with a full install and configuration of the powerblogs software, ready to take on the role of another server if anything goes wrong.
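Item 3 above could be sketched roughly as follows. This is only an illustration, assuming Python and a JSON payload (the plan doesn't say what language or format the Powerblogs software uses); the addresses, the subject format, and the post fields are all made up for the example.

```python
import json
from email.message import EmailMessage

# Hypothetical addresses; the real off-site mailbox is not named in the plan.
BACKUP_ADDRESS = "backup@offsite.example"
SENDER_ADDRESS = "server@powerblogs.example"


def build_backup_email(post: dict) -> EmailMessage:
    """Wrap one post as a machine-readable email for the off-site backup."""
    msg = EmailMessage()
    msg["From"] = SENDER_ADDRESS
    msg["To"] = BACKUP_ADDRESS
    msg["Subject"] = f"post-backup {post['blog']} {post['id']}"
    # JSON with sorted keys, so a restore script can parse it without guessing.
    msg.set_content(json.dumps(post, sort_keys=True))
    return msg


def restore_post(msg: EmailMessage) -> dict:
    """Recover the post dict from a backup email."""
    return json.loads(msg.get_content())
```

Actually sending the message (for instance with `smtplib.SMTP.send_message`) is left out here. The point is just that the email body alone is enough to rebuild the post.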

#1 is done already. #2 will not take long to set up, though the downside is that it will require scheduled downtime in order to test. #3 will take a little longer to implement, but it won't take very long. I'm guessing that it will take about a day or two to get the first version of the full-system syncing to the second hard drive. I should have the email code and email receptacle up in about a week. (Please note, since I screwed up with this before, my estimates are not guarantees, and please assume that anything that I talk about is not implemented until I explicitly say that it's finished and live.)

The long-term plan is for redundant peer-to-peer servers which will truly have no single points of failure and can operate both together with load balancing and realtime syncing, and independently with resyncing. There are still a few practical problems with this that need addressing, and a few theoretical ones, but I think that it's doable and will only be a few months away.

Comments and suggestions about these backup plans would be appreciated. One of the problems that this has exposed is that Powerblogs operates on pretty thin margins (given the cost of bandwidth, the bandwidth that accounts come with, the cost of disk space, the size of the chunks that we have to buy these in to get good prices, and the infrastructure to do development), which makes reliability enhancements like spare servers and RAID for primary storage somewhere between difficult and not doable. Now, Powerblogs has been especially unlucky (not counting the very brief outage a few weeks ago when an errant program filled up the hard drive; ironically, that would have taken out even redundant servers, since the 100GB file would have been replicated to them both), but bad luck can be overcome with money. I'd be especially interested to know how users would feel about increased prices in order to pay for higher-end hardware (with RAID to guard against disk failure, more RAM and CPU for better performance, etc.), faster connections to the off-site backup, etc. For example, if Powerblogs doubled prices, we should be able to afford to rent a dual 3.2GHz Xeon with 4 GB of RAM and four 250GB SATA drives in RAID 0+1 for 500GB of usable storage. (RAID 0+1 means fast reads and the data is always on 2 drives at any time, so that a single drive failure won't impact the system's uptime at all.) It would be blazingly fast, handle high loads very well, and be quite reliable (there's also the effect in computers that the more the computer costs, the higher the quality of its parts typically is).
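For anyone wondering where the 500GB figure comes from: in RAID 0+1 every block lives on two drives, so usable space is half of the raw total. A tiny sketch of that arithmetic (the function name is made up for illustration):

```python
def raid01_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    """Usable capacity of a mirrored pair of stripe sets: half the raw capacity."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 0+1 needs an even number of drives, at least 4")
    return drive_count * drive_size_gb // 2
```

With four 250GB drives, `raid01_usable_gb(4, 250)` gives 500.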

I would greatly appreciate it if subscribers could leave comments on whether you'd want to pay more for a better system and higher reliability, or whether you'd prefer going the less expensive route and doing the best that we can with what we have. What do you guys want? Where do you think that we should go?

Update: I've made the initial copy of the data onto the second hard drive in the server. I'll be working on setting up the scheduled syncing to the second hard drive tomorrow. Within a few days, I hope to have the post-email-backups going, and within two weeks, we might have two off-site locations that will be getting the posts emailed to them.

Update: I've been working on the syncing, and unfortunately it's not as fast as I want it yet. I'm periodically syncing it manually, and before too long I should have it scheduled to do the syncing. I'm also working on some changes to the Powerblogs code that will let the syncs go faster. (The reason for the concern over speed is that the syncing places some stress on the server, and while reader page loads should still be pretty quick, the Powerblogs interface itself will be slowed down a bit. I don't want improved reliability to come at the expense of increased frustration.)

Posted by Chris on 12.29.2005. (7 Comments)
Server Problems (News)

Something is wrong with one of the powerblogs servers, and we're currently working on figuring out what and resolving it. I'll keep you up to date as I get more information.

Update: The problem appears to be with the server's hardware. While the people at EV1's datacenter are working on fixing that, I've rented another server and will be restoring it from backups (we have a backup current as of early this morning EST). If I'm done before the other server is fixed, it will become the new primary server and the other server will become its backup when it's fixed. If the current server is fixed first, this new server will become a backup for the old server. I've gotten a lot of progress done on having the software support the live backup system that I described earlier, so when I've got two servers, I'll push to get the rest done ASAP (the log reports were a major bottleneck, but the new off-server generation method will now only take a little bit of modification to adapt to that problem).

Update: So it appears to be a hard drive problem. I've put in to have an EV1 specialist look at the machine and replace the hard drive if necessary. In the meantime, we're working to get the new server up and loaded with the backups. My guess now is that I'll have that up before the older server is fixed. For some people who set their DNS statically (rather than using a CNAME), once the new server is up we'll have to modify your DNS. (Once the new server is up, youraccount.powerblogs.com will work immediately, as will anyone whose DNS is a CNAME.) I'll keep you up to date on the progress with the new server we've ordered and are currently configuring.

Update: We've got the new server, and have done the base config. I'm working on turning it into a Powerblogs server at the moment, which is coming along well. Soon, I'll be testing it out. Once I've tested it, I'll start restoring from this morning's backups.

Update: I believe that I've finished configuring the new server as a Powerblogs server. I've been testing it, and the tests are looking good now. I'm just about ready to start restoring from this morning's backups.

Update: I've done tests, and the server config definitely appears correct. I'm uploading the backup data to the server. As soon as it's up, I'll begin restoring from it.

Update: Reloading the backup data is going well. I've got the main database up and have done some initial integrity checks and it looks good. I'm still working on getting auxiliary data (stylesheets, uploaded files, etc) up and restored. Once I'm finished with that, I have to republish the blog pages, do a final check to make sure everything is ready, and modify the DNS settings, but once that's finished we'll be good to go.

Update: Oh, fyi, I still haven't heard back from the EV1 "systems support specialist" about the old server. They were going to begin investigating shortly 6 hours ago.

Update: I've got all of the data uploaded and am currently regenerating the blogs. After this, I need to verify that everything's correct and change the DNS, and we'll be all set.

Update: The regeneration is moving along. I've been doing some testing and things are looking good so far. It shouldn't be too long after the regeneration that we go live. I've got a few more tests to do, and then I might start changing the DNS for blogs as they finish their regeneration individually.

Update: The regeneration is going pretty well. Only volokh and whiteperil are left. If your DNS isn't working, please send an email to support@powerblogs.com and I'll investigate. I think that most people on this server are set up to automatically pick up the DNS change.

Also, I've heard from EV1 and it looks like we might be able to recover the data from the hard drive. I'll let you know more on that front as soon as they tell me.

Update: The regeneration is very nearly complete, and nearly everything is back up again. Scheduled posting is not yet enabled, however, and probably won't be until tomorrow. For various technical reasons, when I restored it re-created posts which were saved for later. I need to write a quick program to cull the posts which are saved for later but were already published, and I'm too tired at the moment to trust myself to do that correctly. I should have that up some time tomorrow. I'm going to bed now, but the DNS for the remaining blogs has been set and I've done a partial regeneration on them, so their front page works and they can be used. That means that as of now, all Powerblogs blogs can be used, except for scheduling posts (you can schedule them, they just won't be published until some time tomorrow).
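The culling step described above might look something like the following. This is a sketch only, assuming Python and assuming that a saved post and its published copy share the same id (the actual Powerblogs data model isn't described here); the function and field names are hypothetical.

```python
def cull_resurrected(saved_posts, published_ids):
    """Split saved-for-later posts into those to keep and those already published."""
    keep, cull = [], []
    for post in saved_posts:
        # A saved post whose id is already published was "resurrected" by the restore.
        (cull if post["id"] in published_ids else keep).append(post)
    return keep, cull
```

Returning the culled posts rather than deleting them outright makes it possible to look over the list before anything is actually removed.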

Update: I'm working on fixing the problem where saved posts that were published have been "resurrected". I hope to have the redundant saved posts removed within an hour or so.

EV1 has indicated that prospects for file recovery look hopeful, but that it will take some time. They said that they should have further results for me in the late afternoon (I believe that that's Texas time). I should mention, though, that even if all of the data from the drive can be recovered, it is possible that a post could have been caught at exactly the wrong time so that it was never written to disk in the first place.

I also just realized another layer of backup which we have — the mailing list archives. You can get to them at http://powerblogs.com/pipermail/yourhostname/ — any lost posts might be there, since that's hosted on a different machine from the one that went down.

Update: The mailing list archives are looking to be a real saver from this disaster. So, if you may have lost any posts that were made after the last daily backup was taken, check your mailing list archive. Its URL follows the pattern http://powerblogs.com/pipermail/{yourhostname}/2005-December/date.html so, for example, the archive for dev.powerblogs.com is http://powerblogs.com/pipermail/dev/2005-December/date.html
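If you have several blogs to check, the URL pattern above can be filled in mechanically. A small sketch (Python 3.9 or later; the function name is made up, and it assumes the hostname part is whatever comes before ".powerblogs.com", as in the dev.powerblogs.com example):

```python
def archive_url(blog_hostname: str, year: int, month_name: str) -> str:
    """Build the mailing list archive URL for one blog and month."""
    # "dev.powerblogs.com" -> "dev", matching the example in the post above.
    name = blog_hostname.removesuffix(".powerblogs.com")
    return f"http://powerblogs.com/pipermail/{name}/{year}-{month_name}/date.html"
```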

Update: I'm nearly done culling the saved posts, and will very shortly re-enable the scheduled poster.

Update: I'm done culling the saved posts, and have re-enabled the scheduled publisher. Scheduled posts should be good from here on out.

Update: For some blogs, the trackbacks to their posts didn't restore correctly. The trackback data is not lost, and I'm working on restoring that.

I'd also like to take this opportunity, now that things are calming down, to give my heartfelt apologies that it was not handled better and more swiftly, and that it caused so much grief, frustration, and worry. I'm very sorry for that, and we're working on ways to improve our reliability.

Update: For some people, not all of the comment pages were written properly. I've triggered a republishing of everyone's comment pages. This is slowing things down a little bit, but that should be over with in an hour or two. Also, I've been tuning the system to improve performance.

Posted by Chris on 12.28.2005. (64 Comments)
Downtime (News)

Some Powerblogs users experienced a few minutes of downtime recently. The short version is that it's over and won't happen again.

The longer version is that there was a very subtle bug in the new remote report software which somehow made one user's log report consume all available disk space (the bug was subtle; the effects certainly weren't). Servers require some disk space for temporary files and such, or they can't do things like serve pages or let people log in. I'm modifying the report software to ensure that it will never write large html files, so this problem will never happen again.
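One way to guarantee that a report can never grow without bound is to cap the number of bytes written. The following is a sketch of that idea in Python, not the actual report software; `write_report`, `ReportTooLarge`, and the size limit are all hypothetical.

```python
class ReportTooLarge(Exception):
    """Raised when a report would grow past its size limit."""


def write_report(path, chunks, max_bytes):
    """Write HTML chunks to path, stopping before max_bytes would be exceeded."""
    written = 0
    # newline="" keeps the bytes on disk identical to what is counted below.
    with open(path, "w", encoding="utf-8", newline="") as out:
        for chunk in chunks:
            size = len(chunk.encode("utf-8"))
            if written + size > max_bytes:
                raise ReportTooLarge(f"{path} would exceed {max_bytes} bytes")
            out.write(chunk)
            written += size
    return written
```

If the limit is hit, a partial file is left behind, but it can never be larger than `max_bytes`, so one runaway report can't fill the disk.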

Posted by Chris on 12.20.2005. (0 Comments)
Reports coming back on line (News)

I've finally got the offsite log generation system to the point where it's downloading the logs, generating the reports, and uploading them. It will still be a few days until the logs are generated regularly — for the next few days I'm going to manually run the report generation so that I can watch it and make sure that everything is going well. I will run the report script at least once a day for now. Once I'm confident in the new system, I'll have it back to generating the reports every four hours.

Posted by Chris on 12.14.2005. (0 Comments)
reports status (News)

Just a heads up on the reports: I'm almost ready to go live now. I expect to have the reports live any day now, probably starting Saturday night. (Because log data is valuable, and the off-site solution involves moving it and deleting it on the webserver, extensive testing needs to be done to ensure that no readership data is ever lost. Testing and debugging always take more time than programming does, unfortunately.)
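The "move it, then delete it" concern above comes down to never deleting the original until the copy is known to be good. Here is a rough sketch of that rule, assuming Python and using a local file copy as a stand-in for the real transfer to the off-site machine (the function names are made up):

```python
import hashlib
import os
import shutil


def sha256_of(path):
    """Hash a file in blocks so large logs don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def move_log_safely(source, destination):
    """Copy a log file, and delete the source only if the copy verifies."""
    shutil.copyfile(source, destination)
    if sha256_of(source) != sha256_of(destination):
        os.remove(destination)
        raise IOError(f"copy of {source} failed verification; source kept")
    os.remove(source)
    return destination
```

If verification fails, the bad copy is thrown away and the original log stays where it was, so the worst case is a retry rather than lost readership data.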

I am very grateful for the patience everyone has shown on this issue.

Posted by Chris on 12.08.2005. (0 Comments)
Reader statistics (News)

The new setup for readership statistics generation is going to be very robust, but it's not quite ready. It's the top priority project, though, and should be done very soon now. I'm hopeful that I can get it working before Wednesday starts.

Posted by Chris on 12.06.2005. (0 Comments)
Reports coming soon (News)

The logfile reports will resume generation within the next few days. When they resume they'll be generated off-site, so they will never again impact the performance of the powerblogs servers. Thanks for your patience.

And happy Thanksgiving!

Posted by Chris on 11.24.2005. (1 Comments)
The Redundancy Plan (News)

Ok, here's the plan. It's ambitious but achievable, and worth it. (And it shouldn't drive prices up, either.)

Servers will work in pairs, though the implementation will allow for trios, etc. They will both serve web pages as well as take new posts, comments, etc. The servers will automatically sync with each other in real-time on a peer-to-peer basis, so they'll be capable of operating independently if one of them goes down or they get separated, and they'll automatically sync up as soon as they can talk to each other again. (Each server will have a sync daemon which stores all updates with a revision number; when a server reconnects it will just ask for all updates since the last revision it knew about, which will ensure smooth resyncing.)
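The revision-number idea can be sketched in a few lines. This is an illustration in Python under simplifying assumptions, not the planned daemon: it keeps the log in memory, covers only one direction of catch-up, and says nothing about how two servers would reconcile conflicting updates. The class and method names are made up.

```python
class SyncLog:
    """Append-only update log where each update gets the next revision number."""

    def __init__(self):
        self._updates = []  # index i holds the update with revision i + 1

    @property
    def latest_revision(self):
        return len(self._updates)

    def record(self, update):
        """Store an update and return the revision number assigned to it."""
        self._updates.append(update)
        return self.latest_revision

    def updates_since(self, revision):
        """Return (revision, update) pairs newer than the given revision."""
        return list(enumerate(self._updates[revision:], start=revision + 1))
```

A server that was cut off after revision 1 would call `updates_since(1)` on its peer when it reconnects, apply what comes back in order, and then remember the highest revision it received.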

Reports will be generated by an off-site server, taking the load off of the web servers to ensure that report generation will never bog them down. (I'll have to get a feel for how quickly I'll be able to process the reports, but it might be reasonable to get the processing up to every 2 hours, which would be really nice.)

All of the servers will be DNS slaves (listed in the master record) so that there won't be any single point of failure. DNS will hit the two servers in round-robin fashion.
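To picture what round-robin means here: each DNS answer lists the same servers, but rotated by one, so successive visitors tend to start with different machines. A toy simulation in Python (the real rotation is done by the DNS server itself; the addresses below are reserved documentation addresses, not real Powerblogs IPs):

```python
from collections import deque


def round_robin_answers(addresses, queries):
    """Yield the address list for each query, rotated by one each time."""
    ring = deque(addresses)
    for _ in range(queries):
        yield list(ring)
        ring.rotate(-1)  # move the first address to the back for the next query
```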

As I said, it's ambitious, but it will result in a system with no single point of failure that should be able to incorporate new servers quickly and seamlessly. Getting this done is going to be top priority here at Powerblogs. The transition to the two redundant servers should be completed within two weeks.

I'll keep you updated.

Posted by Chris on 09.06.2005. (0 Comments)
downtime (News)

One of the Powerblogs servers is unable to reach the network. We're currently working with our upstream provider to get the issue fixed (they're not sure yet, but it looks to be either the physical infrastructure or the switch). This is our new, more powerful server.

Obviously no one's happy about this. We were lulled into a false sense of security by our previous good experiences with them.

Obviously, we can't let things continue this way. We're working now on a redundancy plan to make our servers fully robust against upstream provider issues. I'll try to get the details out tonight once we're reasonably settled on what the plan and new design is going to be.

Update: we're back up. Apparently it was a network cable (now replaced) that had us down for so long.

I'll be updating with the plans for a redundant server as they firm up.

Posted by Chris on 09.06.2005. (0 Comments)
Back to blogging (Bug Fixes, News)

Ok, I'm finally back from my honeymoon and settled in enough to start blogging again, which I will do faithfully, from now on. :)

Today's news is a few bug fixes: unclosed HTML comments in the sidebar special content will no longer kill the save button on the edit page, and quotes in co-blogger author accounts won't interfere with the "Set Contact Info for an Author" drop-down any more.

I'm also getting pretty close on getting report generation back up and running. The improvements that I've been making will dramatically reduce memory usage, so things should be good for the long-term, then.

Posted by Chris on 07.28.2005. (2 Comments)
New low volume plans (News)

Powerblogs now offers two new plans: a $15/year plan and a $25/year plan. Designed for low-volume blogs, these plans should work well for people who want to blog only for friends and family, or for people who just want to try out blogging to see if they like it. (There will very shortly be an easy upgrade path from these plans to one of the high volume plans.)

So, if you know anyone who wants to get into blogging with great software but doesn't want to spend $5 a month, let them know! ;-)

Posted by Chris on 01.13.2005. (0 Comments)
PHP gone (News)

I finally got a chance to do some work on the main website. There's no longer any PHP in use, which should free up some RAM. I also took the opportunity to fix a typo, add an entry to the FAQ, and add an email field to the contact form.

I also updated the features page to make it more current, though I'm pretty sure that I missed a few features that were added since the last time that page was updated. :-)

Posted by Chris on 09.29.2004. (0 Comments)