LJ's problems
Jul. 24th, 2007 11:57 pm
It's difficult for me to understand just what this problem is that LJ and SixApart have with power. The technology exists and is well understood, yet power outages that last only minutes leave them fumbling for hours. It seems to have always been this way. Every time, their backup power system fails them. Every time they say they will find out what the problem was and fix it. And the next time it's just the same.
I can understand how an uncontrolled shutdown could make restarting such a large network of machines into a tricky task, but that too needs better planning and should not take so long. The real issue is avoiding the uncontrolled shutdown. It would seem that they can't manage to sustain power for five minutes, even though that really should be all it takes to achieve a controlled and orderly close that would allow a quick and efficient restart.
no subject
Date: 2007-07-25 11:32 am (UTC)
Although, I suppose one should expect that.
It's funny when I'm working in one of the factories I frequent, and the power goes out. Big automated machines don't always recover well from shutting down like that either. Although all it would take is a few more battery backups that could detect when they've been engaged. But I guess they just don't want to spend the money.
no subject
Date: 2007-07-25 11:34 am (UTC)
no subject
Date: 2007-07-25 11:34 am (UTC)
no subject
Date: 2007-07-25 11:44 am (UTC)
no subject
Date: 2007-07-25 11:47 am (UTC)
Sure, that kind of outage is going to take you down. I don't dispute that. But it isn't that hard to ensure that you go down quickly and smoothly, so that you can recover quickly rather than needing hours of fiddling and intervention.
no subject
Date: 2007-07-25 11:50 am (UTC)
Capitalism...
Date: 2007-07-25 12:03 pm (UTC)
no subject
Date: 2007-07-25 01:27 pm (UTC)
no subject
Date: 2007-07-25 01:31 pm (UTC)
Re: Capitalism...
Date: 2007-07-25 01:34 pm (UTC)
I also notice that Brad himself has been more and more absent where he used to be an active presence. I don't know if he is now a millionaire, but I doubt it. Whatever happened, SixApart has managed to successfully separate him from LJ to the extent that he had no influence on the witch hunt in May and June and isn't even trying to make peace with users now. So in essence, there is no one left there who much cares what happens to LJ. They'll just keep milking it for whatever money they can suck out of it until it runs dry.
The question is, what do we do about it? What can we do about it?
no subject
Date: 2007-07-25 01:37 pm (UTC)
The fact that LJ users received NO priority in the recovery, and only SixApart products were targeted for fast reactivation shows us where the corporate loyalties are based. They've demonstrated this repeatedly since the time 6A took over.
no subject
Date: 2007-07-25 01:40 pm (UTC)
Indeed they have. Repeatedly. And that's my point. Once I can chalk up to inexperience. Twice I can allow because it does take time to resolve all the issues in something this large. But we're way past four or five times. This is now inexcusable, and SixApart's apparent low priority for LJ recovery is evidence that it won't get better. I contend that LJ is now just a cash cow for them.
Re: Capitalism...
Date: 2007-07-25 02:54 pm (UTC)
no subject
Date: 2007-07-25 03:06 pm (UTC)
no subject
Date: 2007-07-25 03:25 pm (UTC)
Probably it's a larger scale version of what happens with small networks that rely just on battery backup. The UPS is the very last thing considered when upgrades or tests are performed. Consequently, it is often inadequate to provide the necessary power when a need actually arises. Backup is a cost, not a source of income. Increasing your rack capacity and network bandwidth produces a direct increase in income. So... they always put off adding more power capacity until it is too late. It's the basic defect of capitalist economics: short term thinking.
Re: Capitalism...
Date: 2007-07-25 03:34 pm (UTC)
I'm going to do pretty much the only thing I'm able to do. I'm going to let my paid account lapse to free when it comes due in September.
Getting the lowest priority on recovery for this repeating problem shows it's not worth paying for.
There is also a habit of letting things like comment notification emails break down completely before doing a system upgrade. The 'fast' servers that a paid account supposedly allows still have timeouts. They can't be unaware of load issues.
Powers that B
Date: 2007-07-25 04:04 pm (UTC)
no subject
Date: 2007-07-25 04:13 pm (UTC)
Re: Capitalism...
Date: 2007-07-25 05:48 pm (UTC)
no subject
Date: 2007-07-25 06:02 pm (UTC)
My advice to LJ and other non-essential services that rely on large colocations like this would be to design and implement a rapid shutdown that triggers cleanly at the FIRST sign of trouble. Better to have an orderly down time and a clean restart than rely on crossed fingers and witchcraft to keep you going until the power returns or stabilizes. Even a 15 minute delay in shutdown is usually too long. Because LJ needs some sort of consistent synchronization between its many data clusters, this is critical. If a five minute power outage means a two hour downtime, that's fine. At least a simple restart is likely. Their problem seems to be that they have to manually resynchronize everything before they can allow a full restart and access to the database. I'd rather have a two hour outage followed by a quick restart than an eight hour outage with a lot of people running around firefighting to get it to restart.
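To make the "shut down at the FIRST sign of trouble" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the event names (`mains_ok`, `on_battery`), the `ShutdownController` class, and the three placeholder steps are illustrations only, and say nothing about how LJ's systems actually work.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ShutdownController:
    """Hypothetical controller: starts an orderly shutdown on the first power warning."""

    steps: List[Callable[[], None]]
    shutting_down: bool = False
    log: List[str] = field(default_factory=list)

    def on_power_event(self, event: str) -> bool:
        """Return True if this event triggered the shutdown sequence."""
        # Anything other than "mains_ok" counts as the first sign of trouble.
        # Once a shutdown has started, later events are ignored.
        if event == "mains_ok" or self.shutting_down:
            return False
        self.shutting_down = True
        self.log.append(f"shutdown triggered by {event}")
        # Run the steps in a fixed order, without waiting to see
        # whether the backup power holds.
        for step in self.steps:
            step()
            self.log.append(f"done: {step.__name__}")
        return True


# Placeholder steps (assumptions); a real system would do actual work here.
def stop_accepting_writes() -> None:
    pass


def flush_pending_writes() -> None:
    pass


def halt_hosts() -> None:
    pass


controller = ShutdownController(
    steps=[stop_accepting_writes, flush_pending_writes, halt_hosts]
)
controller.on_power_event("mains_ok")    # no action
controller.on_power_event("on_battery")  # runs all three steps once
```

The point of the design is that the decision is made once, immediately, and in a fixed order, rather than being left to someone hoping the generators catch.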
Re: Powers that B
Date: 2007-07-25 06:03 pm (UTC)
no subject
Date: 2007-07-25 07:57 pm (UTC)
no subject
Date: 2007-07-25 08:04 pm (UTC)
no subject
Date: 2007-07-25 08:12 pm (UTC)
These days, generators of that sort could also be powered with natural or LP gas, and those I've been around (only a couple) have always started up very promptly and without intervention.
No, flywheels aren't a new idea. However, thinking of them as a green alternative does seem to be relatively new.
The real culprit in this scenario, though, is SixApart/LJ. This problem has recurred enough times that you'd think they'd make a faster shutdown and faster recovery a top priority instead of developing new graphical gimmicks to stick into other people's profiles.
Never trust the datacenter to provide uninterrupted power. Always have a controlled and effective recovery plan.
no subject
Date: 2007-07-25 08:14 pm (UTC)
no subject
Date: 2007-07-25 08:43 pm (UTC)
no subject
Date: 2007-07-25 08:59 pm (UTC)
The "going down unexpectedly" part is also preventable. Evidently they continue to trust the power backup hardware to work for more than a few seconds, and it continues to fail. Therefore, they should be shutting down immediately on the first warning, and not crossing fingers and waiting. Even those flywheel generators must give them a minute or two warning. If it takes longer than that to stop all further updates and commit all writes, the code needs redesigning.
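As a rough sketch of "stop all further updates and commit all writes" inside a warning window, something like the following would do. The `WriteGate` class, its method names, and the 60 second budget are all assumptions made up for illustration; they are not taken from LJ's code.

```python
import time
from collections import deque
from typing import Callable, Deque


class WriteGate:
    """Hypothetical gate that queues writes and can be drained under a time budget."""

    def __init__(self, commit: Callable[[str], None]) -> None:
        self._commit = commit
        self._pending: Deque[str] = deque()
        self._open = True

    def submit(self, record: str) -> bool:
        """Queue a write; refuse it once the gate has been closed."""
        if not self._open:
            return False
        self._pending.append(record)
        return True

    def drain(self, budget_seconds: float) -> bool:
        """Close the gate and commit queued writes; True if all fit in the budget."""
        self._open = False
        deadline = time.monotonic() + budget_seconds
        while self._pending:
            if time.monotonic() > deadline:
                # Out of time: whatever is left stays uncommitted.
                return False
            self._commit(self._pending.popleft())
        return True


committed = []
gate = WriteGate(committed.append)
gate.submit("entry 1")
gate.submit("entry 2")
finished_in_time = gate.drain(budget_seconds=60.0)  # assumed one-minute warning
```

If `drain` regularly comes back `False` in a drill, that is the signal that the write path needs redesigning before the next real outage.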
no subject
Date: 2007-07-25 09:11 pm (UTC)
no subject
Date: 2007-07-25 09:22 pm (UTC)
no subject
Date: 2007-07-25 11:56 pm (UTC)
As a free account holder I can't complain about any drop in service, but the people on paid accounts must be chewing their tails.
no subject
Date: 2007-07-26 02:18 am (UTC)
And I'm just a regular paid member. The people who have bought permanent accounts have a right to be really angry. It's not as if this were the first or second time. We've seen the same problem repeatedly, and every time it takes them hours, even most of a day, to recover from it.
Re: Powers that B
Date: 2007-07-26 01:15 pm (UTC)
Re: Powers that B
Date: 2007-07-26 02:42 pm (UTC)
I wish I could be more helpful than just offering advice. I sense that if there was an actual lesson to be learned from your encounter with law, it has been learned all right. Unfortunately, there seem to be many who, rather than learn from such things, resist the learning and just try to "get even." That is certainly a disastrous choice and I'm glad you aren't going that way. Hold onto your resolve and integrity, and you'll find a path.
Re: Powers that B
Date: 2007-07-26 03:42 pm (UTC)
Yeah, I noticed that just a few days ago. I surprised one of the officers I was being instructed by because I was paying attention and asked her a specific question about one of my probation conditions. I threw her for such a loop, she asked me, "where did you see that?" Uh, fourth line down.
I have no desire to cop an attitude... I know that now is the time circumstances have caused everyone to look in my direction with a dubious glare, and I am choosing to show them all the things in me they've casually overlooked all these years. Showing them they misjudged me is far more satisfying than dwelling on hateful spite.
I think the initial lesson I learned during my 35 day incarceration after my initial arrest... I had plenty of time to think. I think if people knew who I was, they'd see putting me in with a bunch of people who need direction wouldn't suit their purposes very well ;)
Re: Powers that B
Date: 2007-07-26 04:05 pm (UTC)
Without knowing or needing to know anything more than I know (which is little, admittedly) I can tell that you are worth a lot more than you've been given credit for.