Recovery Monkey: Musings on backups, tuning and more


Thu
8
May '08

Retarded storage and thin-skinned people

So this is kind of a long but funny story and a rant against oversensitive people at the same time.

About a year ago, this sales guy and I go to this architecture firm since they told us they are in dire need of a better storage solution.

We meet with their admin, real nice young guy, let’s call him Mike. He explains to me how they have this old <insert few-letter-company name> clustered NAS with some JBOD behind it. They’re having performance issues, it’s not scalable, they don’t replicate it or do snaps, the list goes on about how much he hates that box. It’s just not working out.

He then mentions he wasn’t part of the decision to buy the box and he just wants to get rid of it and get something much better.

So I start explaining to him the higher-end NAS solutions, I talk about the EMC Celerra, all the things it does etc. The whole explanation takes like 2 hours since he really was unfamiliar with a lot of the basics so I started from the ground up, explained the entire concept and architecture etc.

By the end of this we’re bonding with the guy, he’s throwing some F bombs in casual conversation, all in all we’re comfortable. He tells me he finally gets it, he realizes it took him a while to see the big picture but now he totally understands the value prop. He’s excited.

I feel stoked since I like the guy and it’s not often that you get to educate someone and make them that happy. Very rewarding. So we’re joking some more and I mention how the old box is pretty much retarded when compared to the EMC box, since the EMC box does so much more it’s ridiculous.

He laughs about that and agrees, we joke some more, I promise him I’ll send him a config to look over and we leave.

On the way out he tells me how great it all was, and cautions me jokingly that it’s probably not a good idea to mention to more conservative customers that their existing storage is retarded. We laugh and part ways in a very friendly fashion. Of course I don’t normally say something like that, I only did because we were joking around and bonding and, most importantly, he told me it wasn’t his baby and that he hates it. Usually the coast is clear after something like that :)

So I send him his config, he’s getting a great deal, all very well architected. No response. I call him, no response. Eventually the rep calls him, and Mike tells the rep how he was offended that I called his storage retarded and he doesn’t want to do business with us. I thought this was the weirdest thing ever. My initial reaction is that maybe someone close to him is mentally retarded – but if that were the case, he should have shown some kind of reaction when I first mentioned the dim-wittedness of his existing storage.

But wait, there’s more.

About a year later… different gig, different rep. I get the invite to go to this place and talk about storage. They’ve had problems for years and have a really old and bad system in place and really need a replacement. I walk in, and of course it’s the same exact architecture firm! I tell the rep that this is probably a bad idea and that I should leave. I don’t have time though because Mike comes to greet us.

The moment he sees me, he’s like “sorry guys, this is not gonna happen, you just leave now so we don’t waste each other’s time”. He says that he really respects my expertise but he won’t do business with a company I work at. He doesn’t want to speak to another engineer and pretty much kicks us out. I can’t shut up any more and I tell Mike that he has really, really thin skin.

Needless to say the new sales guy is dumbfounded.

The sales guy calls Mike a day or so later and gets an explanation out of him. Mike claims he doesn’t want to deal with engineers that belittle his equipment since how do I know in what financial dire straits they were? Maybe they were forced to buy the retarded storage.

Which is fine but shows that either Mike lied throughout our entire first meeting or has an amazingly bad short-term memory.

I wish Mike all the best in his future endeavors and still stand by my original assertion: get off your retarded storage if it’s causing you problems. Even if you don’t have money there are other appliance-type solutions to be had on the cheap (or free)!

Here are some easy-to-use appliances that are quite good:

You could try all of them as virtual machines if you don’t want to dedicate hardware to them to begin with. That way you can test all of them easily. You can also roll your own with Solaris 10 or Linux, of course it requires one to know what they’re doing but it’s amazing what can be accomplished for next to zero dollars nowadays.

And Mike, if you ever read this:

Get some thicker skin. And maybe some Ginkgo biloba. Moreover, if the real reason I offended you was that someone close to you is retarded – get over it, it’s just an expression!

People are just too damn sensitive these days. Just get the job done.

D

Tue
25
Mar '08

Windows Server 2008 RTM 64-bit performance versus Vista SP1 64-bit, and using 2008 as a workstation

I’ve been using Vista x64 for a while now, just so I can make use of all the memory on my machine (an über-thinkpad), and because I like shiny new things and 64-bitness and don’t want to be one-upped by smug Mac users with their feline-named OSes, mock turtlenecks and their newfound 64-bit capabilities. Of course, with the good comes some bad – Vista, while in my opinion a step forward in many ways, does take a step backward when it comes to some areas of performance and sheer resource requirements. A lot of it can be attributed to poorly-written drivers, especially any Aero GUI slowdowns with nVidia cards.

Since space was running out I bought a new hard drive (200GB Seagate 7200 RPM) and decided to install the RTM 2008 bits. If something went wrong I figured I could always either go back to my old drive or just move Vista to the new drive with some imaging utility or other, no biggie. If 2008 worked out, I’d keep it.

The reason this comparison is worthwhile is that 2008 and Vista SP1 have the same exact kernel – I checked, NTOSKRNL.EXE is the same in both OSes. One would think that the differences wouldn’t be huge and that therefore there’s no point going to 2008. Of course, there are a lot of other pieces aside from the kernel, and I think that Microsoft checks to see what OS you’re running and maybe disables certain features in the kernel accordingly – I couldn’t get the LargeSystemCache registry parameter to have any effect on Vista, for example.

Let’s compare CPU- and Graphics-benchmarks first, since those shouldn’t really be different. I used Cinebench 64-bit.

 

Vista:

Rendering (Single   CPU): 3040 CB-CPU
Rendering (Multiple CPU): 5367 CB-CPU
Multiprocessor Speedup: 1.77
Shading (OpenGL Standard)          : 4256 CB-GFX

 

2008:

Rendering (Single   CPU): 3053 CB-CPU
Rendering (Multiple CPU): 5379 CB-CPU
Multiprocessor Speedup: 1.86
Shading (OpenGL Standard)          : 4478 CB-GFX
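
For anyone who wants the deltas as percentages, here is a minimal Python sketch that works only from the scores quoted above (the variable and function names are mine, nothing here comes from Cinebench itself). It also recomputes the multiprocessor speedup as multi-CPU score divided by single-CPU score; from the quoted 2008 scores that ratio comes out around 1.76 rather than the 1.86 shown, so one of those figures may be a transcription slip on my part.

```python
# Cinebench scores as quoted in the post (higher is better).
vista = {"single": 3040, "multi": 5367, "opengl": 4256}
server2008 = {"single": 3053, "multi": 5379, "opengl": 4478}


def pct_gain(old, new):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


# Relative gain of 2008 over Vista for each test.
gains = {name: pct_gain(vista[name], server2008[name]) for name in vista}
for name, gain in gains.items():
    print(f"{name:>6}: {gain:+.1f}%")

# Multiprocessor speedup recomputed as multi-CPU score / single-CPU score.
speedup_vista = vista["multi"] / vista["single"]
speedup_2008 = server2008["multi"] / server2008["single"]
print(f"speedup: Vista {speedup_vista:.2f}, 2008 {speedup_2008:.2f}")
```

The CPU tests differ by well under 1%, which is probably within run-to-run noise; only the OpenGL score moves by a few percent.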

 

Slightly better scores for 2008 it seems, but not dramatically better. Next, postmark, since I/O should be where it shines, it being a server and all:

 

Vista:

Time:

        170 seconds total

        98 seconds of transactions (204 per second)

 

Files:

        20092 created (118 per second)

                Creation alone: 10000 files (200 per second)

                Mixed with transactions: 10092 files (102 per second)

        9935 read (101 per second)

        10064 appended (102 per second)

        20092 deleted (118 per second)

                Deletion alone: 10184 files (462 per second)

                Mixed with transactions: 9908 files (101 per second)

 

Data:

        548.25 megabytes read (3.23 megabytes per second)

        1158.00 megabytes written (6.81 megabytes per second)

 

2008:

Initially I had enabled the “advanced performance” in the device manager for disk, since everyone tells you to do so in all tuning guides…

 

Time:

        136 seconds total

        45 seconds of transactions (444 per second)



Files:

        20092 created (147 per second)

                Creation alone: 10000 files (263 per second)

                Mixed with transactions: 10092 files (224 per second)

        9935 read (220 per second)

        10064 appended (223 per second)

        20092 deleted (147 per second)

                Deletion alone: 10184 files (192 per second)

                Mixed with transactions: 9908 files (220 per second)



Data:

        548.25 megabytes read (4.03 megabytes per second)

        1158.00 megabytes written (8.51 megabytes per second)

 

Much faster than Vista. I then disabled the “enable advanced performance” to see how much slower it would become:

 

Time:

        110 seconds total

        39 seconds of transactions (512 per second)



Files:

        20092 created (182 per second)

                Creation alone: 10000 files (454 per second)

                Mixed with transactions: 10092 files (258 per second)

        9935 read (254 per second)

        10064 appended (258 per second)

        20092 deleted (182 per second)

                Deletion alone: 10184 files (207 per second)

                Mixed with transactions: 9908 files (254 per second)



Data:

        548.25 megabytes read (4.98 megabytes per second)

        1158.00 megabytes written (10.53 megabytes per second)

 

Amazingly, much faster, not slower! I did some checking and this is what the setting actually does… it re-introduces an older, somewhat undesirable behavior. A bit hard to find the proper explanation, and I hope Microsoft makes what happens behind the scenes a bit more obvious. At the moment it’s quite obscure, and every guide tells you to enable it for performance. Just leave it alone. BTW the Vista score is with the setting disabled.
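
To put the three postmark runs side by side, here is a small Python sketch that computes speedups relative to Vista from the timings quoted above. The dictionary labels are just mine for illustration; the numbers are the "seconds total" and "seconds of transactions" lines from each run.

```python
# Timings copied from the three postmark runs above (lower is better).
runs = {
    "Vista": {"total_s": 170, "txn_s": 98},
    "2008, advanced performance on": {"total_s": 136, "txn_s": 45},
    "2008, advanced performance off": {"total_s": 110, "txn_s": 39},
}

baseline = runs["Vista"]

# Speedup = baseline time / run time, so 2.0 means "twice as fast as Vista".
speedups = {}
for name, run in runs.items():
    speedups[name] = {
        "total": baseline["total_s"] / run["total_s"],
        "txn": baseline["txn_s"] / run["txn_s"],
    }
    print(f"{name:<32} total x{speedups[name]['total']:.2f}  "
          f"transactions x{speedups[name]['txn']:.2f}")
```

By these numbers the transaction phase on 2008 is roughly 2.2x to 2.5x faster than on Vista, while the total run improves less because file creation and deletion behave differently between the runs.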

 

Could I have run other benchmarks like Sandra etc? Sure, but I just wanted to keep it simple and there just wasn’t enough time.

 

The next step is to run the tests on the same hardware with XP. That’s forthcoming.

 

Conclusion:

 

Seems like Microsoft did something right. Even with the 64-bit version (that naturally takes more RAM than the 32-bit one), 2008 Server takes less memory than Vista (200-300MB less at any given time in my case), runs quicker and just feels better, kinda like an unencumbered Vista. Simple things like searching a huge index in Outlook happen much faster than before. The Server Manager app is awesome, and one can try out the Hyper-V Hypervisor (BTW that, predictably, clashes with VMware and disables your power management, so beware). A server OS is in general also more secure and, over time, probably more reliable, given the workloads it’s supposed to run.

 

Can everyone run it? Should they? No, not unless you have a license for 2008 through MSDN or somesuch, otherwise it’s expensive. Some assembly is also required, and you do need to know what you’re doing. However, if you’re so inclined, you can easily get the demo version of 2008. Apparently there are clean, documented ways to increase the evaluation period (no cracks or BIOS spoofers) that I think come from Microsoft but I’m not going to list them here just in case…

 

In addition, while almost all my apps installed fine (including games and hairy driver stuff like Daemon Tools), 2 things didn’t: Bluetooth and my Logitech mouse drivers. I don’t quite use Bluetooth but I liked some of the features of my mouse (the utterly kickass Logitech VX Revolution), now it’s just like a normal mouse. I’m still keeping 2008. I’m sure other stuff will have issues, like DRM/BluRay. For people that like the Windows Sidebar: there are hacks to get it working that involve copying stuff from Vista. I think the sidebar is largely useless.

 

FYI, there are 2 notable omissions in 2008: Readyboost and Superfetch. Superfetch exists as a service but to even get it to start you have to edit the registry. I didn’t think it helped much so I disabled it again. Readyboost isn’t even an option. And the old-style boot prefetch that worked in 2003 Server doesn’t seem to be there. So it does boot a bit slower than Vista, but not much. Once you get the box up and running it’s fast though.

 

In the end, I’m leaving 2008 on my box, and that’s all that matters.

 

D

Mon
4
Feb '08

NetApp posts SPC-1 results

NetApp posted some SPC results showing their 3040 box performing pretty well in SPC-1 relative to an EMC box.

There have been rumors that performance suffers when running multiple features in a NetApp box. Which kinda negates the whole value prop of NetApp (since that’s when people typically choose NetApp - they want one box to do everything).

A realistic test would be to have OTHER apps sharing the array (on other spindles), as is usually the case. Almost nobody dedicates an entire array of that size to a single app.

Have the box do CIFS, NFS, iSCSI AND FC.

Show performance over a significant period of time (another point NetApp detractors use – performance declines over time due to WAFL fragmentation).

THEN show the performance delta as each feature is enabled.

Obviously hard to do and maintain kosher SPC results but it would be a worthwhile addendum and, if successful, would shut up the NetApp detractors (since that’s a usual technique for selling against NetApp). I’d also show performance in degraded mode.

Anyone have any data on NetApp performing either way when used as a multi-role box?

A note on the EMC config and interpreting those benchmarks in general, be they SPC or SPEC or whatever: ALWAYS READ THE FULL DISCLOSURE regarding the test, don’t just look at the graph. If you’re not technical, get a techie to explain it to you.

For instance, looking at the way the EMC box was set up, I highly doubt it was done using EMC’s best practices. To wit:

  1. They didn’t maximize the write cache
  2. They seem to not have used separate spindles for the snapshot area (a differentiator since, unlike NetApp, EMC not only allows such a thing to happen but actually encourages it)
  3. They could have used MetaLUNs more instead of striping using Windows.

I’d be willing to bet dollars to donuts that the NetApp box was set up properly :)

Another thing: look at the response times in the graphs.

Like they say, “only believe 50% of the statistics you read”.

D

Thu
20
Dec '07

Ate at Delmonico’s in NYC

I was helping out a customer with some backup issues in the Wall street area and they happened to be literally across the street from Delmonico’s.

At the end of a particularly long day I thought I’d reward myself with a nice steak, and the proximity to the steakhouse made it hard to resist.

Delmonico’s is one of those places that have been around forever. Bit stuffy inside, I didn’t opt for the wet-aged Delmonico cut but instead went for the T-Bone (dry-aged on-premises). I also had a rather excellent salad with roasted tomatoes, herbs and mozzarella.

This is not going to be one of those inspired entries – the steak just wasn’t that good. It was undercooked, underseasoned and just lacked flavor. I probably should have gone for the house’s signature cut (the famous Delmonico cut) but any decent steakhouse should have no problems making a proper T-Bone…

Maybe I’ll give it another chance. Prolly not.

D

Sun
9
Dec '07

We need more wizards!

No, I don’t mean Gandalf, I mean the software kind. And before I’m accused of being Gates’ live-in cabana boy (it’s all baseless rumors), let me clarify.

It’s a known fact that most OSes need tuning (sometimes significant) to perform well with heavy-duty applications (I’m not talking about your home web server, I’m talking about Exchange, SAP, Oracle, IIS, Apache etc. in large deployments. I acknowledge the fact that most OSes, out of the box, will work OK for anything small).

Most frequently the application documentation will have some kind of tuning guidelines telling you approximately what to do in each OS. The installer sometimes will apply some tunings for you after asking for your permission. Often, the suggested settings are woefully inadequate for truly large implementations, as with NetBackup (the Veritas-suggested tunings work for smaller environments but I have some magical kernel tunings as posted before that make it truly fly when the ridiculous is asked of it – and the difference in the parameters between my config and what Veritas suggests is huge. Oh, and some of my parameters are way smaller than what Veritas recommends. And I won’t call them Symantec, Veritas is a way cooler name anyway, look it up in a Latin-English dictionary).

Frequently, some tunings are so common that I don’t even know why they’re not in the default configuration in certain OSes. Different conversation.

The problem is, there are experts that DO know how to set up and tune the systems properly, but said experts are rarely the admins that install and administer the thing. Usually, a fair portion of those experts do work at the companies that make the OSes and apps.

The elitist among us might say, “tough, the lowly admins need to learn all this stuff, otherwise they’re not worth what they’re paid”. To which I respond with the following points:

  • Not everyone has the time to learn the arcana of several OSes and applications, learning most of the important features is complicated enough and some shops are truly short-staffed
  • The über-experts themselves don’t know it all: They may know how to perfectly set up Exchange but wouldn’t know how to do the same thing with Oracle, how can the basic admins be expected to have such multi-discipline expertise?
  • I firmly believe in the simplicity of the appliance computing model
  • We all have more important things to do (like taking care of the big picture) than constantly worrying about minutiae
  • The people that complain that the admins should be more intelligent are typically the people that actually enjoy dealing with the arcane; their jobs are secure anyway
  • There’s money to be made in the simplification of IT – look at Microsoft, EMC/VMware and NetApp. People like simplicity and are willing to pay for it.

Of course, many larger companies will opt for professional services to do the job, but the quality of people just varies dramatically. Just because you’re getting an expensive Veritas PS guy doesn’t mean that

  1. He knows what the hell he’s doing beyond what’s in the installation manual (you know who you are!) and (less significantly)
  2. He is even a Veritas employee, despite his badge (most vendors subcontract to smaller companies).

At the moment, most OSes just apply generic formulas based on memory and/or number of CPUs, though somehow do not take into account CPU speed and load, and, indeed, the ancient formulas are a pain with today’s very large memory systems (usually you have to limit some tunables in large-memory HP-UX and Solaris boxes, otherwise some parameters get out of control).

I understand that truly self-tuning OSes are not here yet, nor will they be for a while (64-bitness has taken away some of the pain though, at least in Windows). In the interim, there are better ways to approach the problem. My suggestion: Modernize the formulas that build the tunables and use simple AI techniques like Expert Systems. At installation time, benchmark the hardware and ask the user: what will the server be running? OK, so if the answer is a web server, under what conditions? How many users? And so on. Admins are far more likely to know the answers to those questions than “how many open file handles do you think you’ll need?”

Based on the answers and the benchmark results, the system should either tell you what you want is possible, or bitch.

If the box is to be serving double-duty (or quintuple, in some cases), the wizard should check and see if the tunings will conflict and, if not, tune the whole box so that it can accommodate all the applications.
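
To make the idea a bit more concrete, here is a very rough Python sketch of what such a wizard could look like. Everything in it is hypothetical: the role names, the tunable names (`max_open_files`, `shared_memory_mb`), the formulas and the 75% memory threshold are made up for illustration and do not correspond to any real OS or vendor recommendation. The point is only the shape: ask high-level questions, turn the answers into tunables with per-role rules, merge them for a multi-role box, and complain if the result doesn't fit the hardware.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """High-level answers an admin can actually give."""
    role: str               # e.g. "web" or "database" (hypothetical roles)
    concurrent_users: int


# Hypothetical per-role rules: answers in, suggested tunables out.
def web_rule(w, ram_mb):
    return {"max_open_files": max(4096, w.concurrent_users * 4)}


def database_rule(w, ram_mb):
    return {"max_open_files": 65536, "shared_memory_mb": ram_mb // 2}


RULES = {"web": web_rule, "database": database_rule}


def run_wizard(workloads, ram_mb):
    """Return (tunables, complaints) for a box running all the workloads."""
    tunables = {}
    complaints = []
    for w in workloads:
        rule = RULES.get(w.role)
        if rule is None:
            complaints.append(f"no tuning rules for role '{w.role}'")
            continue
        # Resource-style tunables add up when the box does double duty.
        for name, value in rule(w, ram_mb).items():
            tunables[name] = tunables.get(name, 0) + value
    # Conflict check: the combined request must leave room for the OS.
    if tunables.get("shared_memory_mb", 0) > ram_mb * 3 // 4:
        complaints.append("combined shared memory exceeds 75% of RAM")
    return tunables, complaints


tunables, complaints = run_wizard(
    [Workload("web", 2000), Workload("database", 200)], ram_mb=16384
)
print(tunables, complaints or "OK")
```

A real implementation would obviously need far more rules plus the benchmark results as input, but even something this dumb would catch the case of two memory-hungry roles being stacked on a box that can't hold them.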

If you’re creating a filesystem, what will the intended use be? The defaults for almost all filesystems are wrong! One size fits only the people that have that size. The problem is that, once you’ve put in several TB on filesystems someone built with the default parameters, changing them is almost impossible: you have to take a backup, destroy the filesystems, rebuild them then restore the data. Which could have been avoided if, say, maybe not the OS but at least Oracle had the smarts to query the FS and figure out it’s using insufficient log and block sizes and that performance will suck. At which point it should puke and tell you “sorry, this is sub-optimal, either do such-and-such to fix it or continue anyway at your peril”. But of course you’re using raw disks for Oracle, right? Right?
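
The kind of check I have in mind is trivial. Here is a minimal Python sketch, assuming the usual rule of thumb that a database block should be a whole multiple of the filesystem block (the function names and messages are mine, and this is not how Oracle or any other product actually does it). The `fs_block_size` helper uses `os.statvfs`, which is Unix-only; the check itself is plain arithmetic.

```python
import os


def fs_block_size(path):
    """Block size the filesystem reports for `path` (Unix only, via statvfs)."""
    return os.statvfs(path).f_bsize


def check_block_size(fs_bs, db_bs):
    """Return a list of complaints; an empty list means the sizes look sane."""
    problems = []
    if fs_bs > db_bs:
        # Each database block write only covers part of a filesystem block.
        problems.append(
            f"filesystem block ({fs_bs}) is larger than database block ({db_bs})"
        )
    elif db_bs % fs_bs != 0:
        # Database blocks would straddle filesystem block boundaries.
        problems.append(
            f"database block ({db_bs}) is not a multiple of filesystem block ({fs_bs})"
        )
    return problems


# Example: 8 KB database blocks on a 4 KB filesystem is fine, the reverse is not.
print(check_block_size(4096, 8192) or "OK")
print(check_block_size(8192, 4096) or "OK")
```

On a Unix box you could feed it the real value with something like `check_block_size(fs_block_size("/data"), 8192)`, where `/data` is just a placeholder path.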

Or take the example of Logical Volume Managers. They are cool, yes. They can work great. They will also let you do insane things such as create multiple LVs and stripe them, even if they’re on the same physical disk! The checks that should have been performed are so ridiculously simple it boggles the mind.
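
Just to show how simple the missing check is, here is a sketch in Python. The data structure (a plain mapping from logical volume name to the physical disks it sits on) and all the names are hypothetical; a real volume manager would pull this from its own metadata.

```python
from collections import Counter


def shared_disks(stripe_members, lv_to_disks):
    """Return the physical disks used by more than one member of a stripe."""
    counts = Counter()
    for lv in stripe_members:
        # Count each disk once per logical volume, even if the LV
        # has several extents on it.
        for disk in set(lv_to_disks[lv]):
            counts[disk] += 1
    return sorted(disk for disk, n in counts.items() if n > 1)


# Hypothetical layout: lv0 and lv1 both live on disk0.
layout = {"lv0": ["disk0"], "lv1": ["disk0"], "lv2": ["disk1"]}

print(shared_disks(["lv0", "lv1"], layout))  # striping these two is pointless
print(shared_disks(["lv0", "lv2"], layout))  # these are on separate spindles
```

If the returned list is non-empty, the tool should at least warn before building the stripe, since the "parallel" I/O would land on the same spindle.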

HP kinda started doing something like this a while ago – look at the templates in SAM, you can apply 2-3 different (useless) templates based on what the box will be doing that will affect a few tunables. HP-UX is guilty of needing the most tuning of any current OS I can think of, BTW (It also pays great dividends if you know what you’re doing, I took a Superdome to 2x the I/O performance once, felt proud but it took a lot of effort and research that could have been avoided).

Seems like the intelligence that would make our lives easier is like the proverbial hot potato: always someone else’s problem.

I know it’s a tall order: the whole solution would rely on much deeper interoperability between the various components than we’re used to. But I think the end result would be worth it.

In the meantime, if you have to do it all yourself, at least use common sense and have some golden OS builds that are each good for a different use, then just replicate them as needed.

Anyway, all this is aggravating my hemorrhoids (I call them The Grapes of Wrath), better stop now.

D

 

Fri
7
Dec '07

(Very) Preliminary Windows Server 2008 impressions and Vista Multimedia Performance under battery power

Out of curiosity, I very briefly tried the new Server 2008 Release Candidate (freely available from Microsoft). I’ve been using Vista 64-bit since I need to see all the memory in my machine and, while it works mostly OK, there are some low-level scheduling issues with it – for instance, sound is really choppy on battery power, no matter what I do with the power settings, so I can’t use the thing to watch a DVD or listen to music on the plane. Many others seem to be having the same issues, despite the funky Multimedia Class Scheduler nonsense that Microsoft put in the OS that makes networking slower (great info here), even though older incarnations were not suffering from media playback issues under load. And no, if I disable the Multimedia Scheduler it does NOT work better, it actually gets worse, which means that the service is there to fix some other kludge-y issue Microsoft introduced with the scheduler or something like excessive power throttling of certain devices.

But, as usual, I digress. This is about Server 2008. What’s noteworthy is that Vista SP1 inherits the exact same kernel as Server 2008.

This will be a short entry, there are others online talking more about 2008. What I noticed:

  1. It’s light for a Windows OS. There’s no excessive bloat guys, the thing takes about 300MB of RAM with the default install, and more can be saved by trimming unnecessary services (of which there are very few).
  2. It’s fast. Under preliminary benchmarking, even the RC code (that probably has some features missing and extra debugging code) seems about as fast as 2003 after SP2 (unlike others that have been releasing benchmarks of, say, Vista SP1 in its pre-release form, I’d rather wait until the final code is out).
  3. Seems to work with most Vista drivers so, if you want to turn it into a workstation, you can. You can also install the Vista GUI if you’re so inclined with no adverse effects (aside from the ones that come with the Vista UI that is). Runs very smooth.
  4. Application compatibility is similar to that of Server 2003.
  5. The OS does NOT suffer from the same issues as Vista regarding media playback (I made sure I installed the Power Management driver and selected the same kind of PM scheme as Vista). Maybe a good omen come Vista SP1? We shall see.

The new management interfaces are nicely laid out, and selecting Roles for the server and adding or removing features as needed is very simple. It feels more like a well-integrated 2003 R3 rather than Vista.

I didn’t get to play with the new virtualization, it doesn’t seem to be in the RC code (though, reading some documentation, it seems as if it will have VMotion-like capabilities, which I will believe when I see).

UPDATE: 12/17/07

There is no more Vista multimedia performance issue on 2 separate computers. Some patches just released by Microsoft removed the issue (plus the issue of the mouse cursor stuttering). Interestingly, the patches had no mention of fixing said issues. I thought it was a fluke but having seen this fixed on 2 different boxes (one 32-bit, one 64) I don’t think it is.

For the Vista detractors: I’d advise everyone to wait until SP1 – as with most Microsoft releases. It’s no different. They’re actually getting better, NT4 was unusable until SP3 at least… given the unreal amount of code in the system, I’m surprised it runs this well. They really need to slim it down. Supposedly, Windows 7 will be slimmer (http://apcmag.com/7668/beyond_vista_windows_7_what_we_know_so_far). However, it mostly targets the kernel and it was never the Windows kernel that was the issue (it’s actually surprisingly decent), it’s all the crud around it.

D

Thu
15
Nov '07

My opinion on the Sun/NetApp altercation: Both companies should be grateful instead of resorting to lawsuits

Since opinions are like you-know-what, and since I’m decidedly anatomically complete in that respect (some, indeed, claim all of me is composed of the implied anatomical part, so maybe that’s why I’m so opinionated), I thought I’d throw my $0.02 in the pot and not stay silent. The whole issue irks me quite a bit, actually.

Like my colleague, Rich, and I think most digerati (there’s a nice word whose time came and went, it seems), I have been following the machismo display between Sun and NetApp (see some representative comments from both sides here and here). BTW, I doubt anything will really happen with the lawsuits, and highly doubt even that money will change hands out-of-court to settle this. This is more about chest-thumping than anything else. But, in a nutshell, it seems it all started due to NetApp wanting to buy some STK patents (from before the STK acquisition), Sun not wanting to sell but instead asking for $36m to license the patents, NetApp being upset and telling Sun they infringe their WAFL patent with ZFS, then Sun telling NetApp to stop selling filers. Those guys are all nuts. I may be missing some facts (NetApp is super-cagey about what STK stuff they wanted) but they are all still nuts.

It seems people will try to patent anything these days. But going after people that you think infringed your patents can be pathetic if your story is not airtight and your goals noble – remember SCO?

I do believe in protecting one’s IP in some way – whether the best way is a patent I’m not so sure, there’s always copyright. I’m not as naïve as some open source zealots that think all patents are evil and that all software should be free. I wonder where they work and how they all make their living? Do those guys all work in places that only do open source and just give away stuff? If I develop a piece of truly cool IP that can result in me making money, rest assured I’ll try to capitalize on it.

However, I do believe that the current patent system is flawed. It’s also difficult (I think impossible) to find people technically competent enough to oversee the process. For instance (and, to cut to the chase), I would have denied NetApp the WAFL patent, since

  1. It’s a simple evolution and/or modification of existing block allocation schemes to facilitate writes (more technical info later on)
  2. There were other COW (Copy On Write) filesystems prior to NetApp, such as LFS and numerous research projects. Specifically,
  3. Daniel Phillips had done most of the COW work prior to NetApp’s patent, but had to abandon work on the tux2 filesystem due to fear of patent laws (see here). He didn’t file a patent first, since nobody that does open source development is thus inclined.

     

But where do you draw the line on what’s truly new and patentable? And what if enforcing a patent is detrimental to the common good? Should Xerox have patented the mouse? It was totally new back then. What if they’d enforced the patent and told Apple and later Microsoft that they are not allowed, no matter what, to use a mouse? Or if HG Wells patented the science fiction novel? If Hoover patented the vacuum cleaner? If RCA patented the television? You get my drift. There would be zero innovation.

I think patenting obvious stuff should just not be allowed. And, if your patent is based on prior art (regardless of whether it’s been patented), it should be summarily denied. If the patent is granted but is then proven after the fact that someone else had figured out the idea first (as in the case of Mr. Phillips), the patent should automatically be invalidated. Complex, no?

Which is why many think that patenting software should not be allowed.

At the end, with some problems, there is only a finite number of solutions (often only one). Researchers may be working simultaneously on the problem. Eventually, only one will be first with a solution. I am opposed to penalizing the other guy simply because he used a similar algorithm to mine (especially when, mathematically, there may be zero other solutions, making every approach to solve the problem produce the same result).

Back to Sun and NetApp. The truth is, I think, pretty simple. While I have enormous respect for both companies (a bit more for Sun, due to their history and my extensive personal experiences), both companies’ major products are based on a tremendous amount of prior art (patented or not, nobody seems to have complained to either company). Truly, they stand on the shoulders of proverbial IT giants. Sun has the PR benefit of having contributed vast amounts of IP to the world, compared to NetApp (though some technologies like NFS and Java have been pretty painful, so it’s a mixed blessing).

NetApp code heavily borrows from Unix, Sun, IBM, Cisco, EMC and many others. For instance, since Data ONTAP (NetApp’s OS) can’t scale beyond 2 boxes, NetApp purchased Spinnaker – SpinOS creates a single namespace that can transcend many nodes (BTW other products such as IBRIX, Exanet and others can do the same thing really well). The current GX OS is bits from the older ONTAP on top of FreeBSD with some SpinOS bits. However, both the older 7G and the newer GX OSes are offered, since 7G does a lot more (SpinOS can be just large-scale NAS – no iSCSI or FC block device targets, even if those targets on a 7G box are just files, but I digress). Of course NetApp wants to move everyone to SpinOS, which explains NetApp’s current craze with NFS everywhere. It’s infectious, now all of a sudden once again everyone wants to use NFS – VMWare, Oracle, senile grannies running compute clusters all over the world. We get it, it’s a shared-namespace, network-based FS, and sure, you can run pretty much anything on it. People have been for decades. How quickly we forget that it really isn’t the best network-based filesystem, and that there was a reason people developed cool alternative technologies such as AFS, Coda, PVFS, the native IBRIX mode, and many others. The new CIFS that’s part of Windows Server 2008 is actually a really decent implementation, but I’ll probably get flamed by the NFS fanbois for saying so.

And how quickly people forget that it was Sun that gave us NFS, warts and all (well, v4.1 ain’t too bad but that’s a collective effort – the wonders of open source). The rather execrable CIFS, BTW, (the other main NetApp “technology”) was not invented by Microsoft but rather by IBM in 1983. IBM and Cisco invented iSCSI. Legato (now owned by EMC) played a fundamental role in developing NDMP. And I can’t even remember who first created versioning filesystems but I fondly remember my VAXes and they used to do that stuff ages before NetApp even existed (not to mention proper manly-man single-system-image clustering, but that’s a story for another day). I’m pretty sure NetApp didn’t develop Fibre Channel, either.

Cut to today: Now everyone can do snapshots, it’s almost de rigueur, and the truly cool do application-aware snaps.

Volume management is standard, too.

Filesystem expansion is everywhere.

Thin provisioning (not a fan but anyway) is becoming more and more prevalent.

iSCSI is everywhere.

So, the real ZFS issues NetApp is complaining about seem to be the “Write Anywhere” and COW parts, since those are really the only true similarities with WAFL. Seriously, like that’s what’s the most important aspect of Sun’s ZFS. Indeed, while very quick for initial writes, a write-anywhere algorithm can lead to horrific fragmentation and continuously-declining performance over time (which is why you have to defrag NetApp filers). It’s just a safe, easy and computationally cheap method for allocating blocks to minimize write time for write-heavy applications such as NFS. Possibly one of the reasons NetApp did it was because in their boxes there are no RAID controllers, there’s just a CPU or two (486’s I believe in the original boxes) that has to do EVERYTHING – RAID calcs, rebuilds, snaps, caching, etc (the back end of all NetApp gear is JBOD). Using WAFL a lot of the inefficiencies in RAID are bypassed, since it will schedule multiple writes in order to fill a RAID stripe. A more elegant approach such as extent-based allocation (like VxFS) would have been too computationally-intensive, especially for writes. Dave and his pals have a good paper on WAFL here, BTW.
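To make the "fill a RAID stripe" point concrete, here is a minimal sketch of the general idea: queue up pending block writes, emit them in full-stripe batches, and compute parity straight from the new data so no old blocks need to be read back. This is not NetApp's code; the function names, the integer "blocks" and the 4-data-disk layout are all assumptions made purely for illustration.

```python
from functools import reduce

def batch_into_stripes(pending_blocks, data_disks):
    """Split pending writes into full stripes plus a leftover partial batch."""
    full_count = len(pending_blocks) // data_disks
    full = [pending_blocks[i * data_disks:(i + 1) * data_disks]
            for i in range(full_count)]
    leftover = pending_blocks[full_count * data_disks:]
    return full, leftover

def stripe_parity(stripe):
    """XOR parity over one full stripe; only the new data is needed."""
    return reduce(lambda a, b: a ^ b, stripe)

# Hypothetical example: 10 queued blocks, 4 data disks per stripe.
full, leftover = batch_into_stripes(list(range(10)), 4)
for stripe in full:
    print(stripe, "parity:", stripe_parity(stripe))
print("held back until the stripe fills:", leftover)
```

A partial-stripe write would normally force a read of old data or old parity before the new parity can be computed, which is the RAID inefficiency being sidestepped here.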

Here’s what ZFS is: It was not meant to be a NetApp killer, it’s just a truly modern FS, with few limits, and an amalgam of all the current “cool” technologies and ideas. Snaps, thin provisioning, expansion, volume management, pools, quotas, self-healing, all in a single technology, that’s surprisingly well thought out, and easy to use even from the command line. ZFS is not the raison d’être of the Solaris OS, but merely a feature of it. Plus it does data checksumming with every write, which other filesystems don’t. Your data is exceptionally safe in ZFS. Some test results here. More features here, and it’s easy to see NetApp getting annoyed after reading that page (though they just think COW is a good idea, the other tremendous features are not in NetApp’s WAFL). Not sure if they fixed the read performance issues NetApp has with their implementation, I need to do some testing of my own.
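For those wondering what checksumming plus self-healing buys you, here is a toy sketch of the concept: store a checksum apart from the data on every write, verify it on every read, and repair a bad mirror copy from a good one. It is not how ZFS is implemented; the `ChecksummedStore` class, the in-memory mirror copies and the choice of SHA-256 (picked only because it ships with Python) are assumptions for illustration.

```python
import hashlib

class ChecksummedStore:
    """Toy mirrored block store that verifies a checksum on every read."""

    def __init__(self):
        self.blocks = {}     # block id -> list of mirror copies (bytearray)
        self.checksums = {}  # block id -> hex digest, kept apart from the data

    def write(self, block_id, data, copies=2):
        self.blocks[block_id] = [bytearray(data) for _ in range(copies)]
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        expected = self.checksums[block_id]
        copies = self.blocks[block_id]
        for copy in copies:
            if hashlib.sha256(copy).hexdigest() == expected:
                good = bytes(copy)
                # Self-heal: overwrite every mirror with the verified copy.
                for i in range(len(copies)):
                    copies[i] = bytearray(good)
                return good
        raise IOError("all copies failed checksum verification")

store = ChecksummedStore()
store.write("blk0", b"important data")
store.blocks["blk0"][0][0] ^= 0xFF   # simulate silent corruption on one mirror
print(store.read("blk0"))            # still returns the original bytes
```

The point of keeping the checksum away from the block it protects is that a misdirected or torn write can't corrupt both at once.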

In my opinion, the only reason NetApp became popular is that it trivialized the whole NAS aspect. Made it easy to build decent, clustered NFS/CIFS boxes without the need to know UNIX. If Sun had put a wizard-driven GUI to perform such actions in their boxes 10 years ago, NetApp might not exist today. To date, I think Sun’s management tools are pathetic, no matter how amazingly solid the underlying tech might be. There’s a GUI for ZFS but, again, that’s beside the point. Aside from initial write performance, a NetApp filer is not about WAFL, extending disk pools and whatnot, it’s about all-around ease-of-use and the sheer number of cool features.

If NetApp wants to sue someone so badly, maybe they need to sue the Openfiler or FreeNAS developers? Or, if they want to go after someone that’s not open source, how about Open-E? That stuff sure looks much more similar to NetApp than anything made by Sun. Really cool, too. Or maybe they need to sue EMC. Those guys sure make some nice, full-featured NAS gear. Among a myriad other solutions…

Suing someone over a filesystem that’s newer and better in almost every single way than yours but uses one common (and unavoidable in the case of COW) design methodology is just plain silly… and, BTW, how did this escape the patent trolls? Another COW implementation?

And if more developers like Daniel Phillips get scared because of patent laws, then innovation will truly be stifled. The whole point of research is that you can reference other people’s ideas so you don’t always have to re-invent the wheel.

NetApp needs to innovate a bit more themselves. They developed a cool technology and have milked it to death, and even made it do things it shouldn’t (like iSCSI and FC targets, the NetApp approach is really unclean but they are trying to force their OS to do everything, whereas companies like EMC go for the more modular approach and are criticized for being “complex”).

I think I’ll stop writing now since it’s getting late. Never was one to save posts for editing later.

D

Fri
26
Oct '07

Ate at the Staghorn steakhouse in NYC

At the insistence of my colleagues (that seem to enjoy the steak posts more than the high-falutin’ technology ones) I decided to visit another NYC steakhouse.

It was raining, I didn’t feel like going further so I went to a place near the office at 2 Penn Plaza (Madison Sq. Garden).

It’s a newer place called the Staghorn on 36th, just west of 8th Ave. Really nice and modern inside, unlike most other NYC steakhouses. Almost totally empty.

The prices are a bit below other joints, probably because the cuts are not quite as colossal.

I opted for a T-bone this time and a house salad. All the cuts had the same price, BTW.

The salad had an excellent vinaigrette with a touch of oregano. I fortified it with a tiny bit of blue cheese.

The steak was truly excellent, dry-aged, with a wonderful nuttiness and caramelization, exhibiting slight undertones of hazelnut.

Not perfect though - had the cut been a bit thicker it would have been juicier, another 4-5 oz wouldn’t be too much to add. Nonetheless, a wonderful piece of beef. In the thicker parts it was amazing in tenderness, texture and flavor.

I finished with a rather good tiramisu that was a touch on the oversoaked side but very tasty.

Recommended. This place shouldn’t be as obscure.

D

Mon
15
Oct '07

Uptempo cache can get paged out! (EDIT: After all, it does NOT).

I normally don’t do retractions unless proven wrong. So, ignore the text below and read Nick’s comment.

—————————-

A warning to those who use Datacore’s Uptempo:

While it works wonderfully as long as the server doesn’t suffer a low memory condition, the memory it reserves for cache will get paged out in low-memory situations.

I found out the hard way (as usual), while running some very demanding VMs (I only have 2GB and not the best laptop, a new machine is forthcoming). The way Uptempo reserves memory is by using a specific process, Dscaddmemory or something like that (I’ve now removed it from my system so I can’t remember the exact name). If you look at Task Manager, that process has as much memory allocated to it as you’ve allocated Uptempo.

When I was running out of RAM, I noticed that the process started shrinking in size, until it was 16MB (out of 280MB). Windows, since it looks like a normal process, decided to page it out in order to reclaim RAM.

Of course, this kinda defeats the purpose. I’d rather page out everything BUT my fancy dedicated cache, the way HP-UX does it if you tell it to (story for another day but HP-UX cache tends to work better if you specify the min and max sizes as the same and not let it auto-allocate).

My real beef with Uptempo is that it didn’t try to reclaim the memory when there most obviously was enough memory for it (after it paged itself out needlessly, I had over 350MB free and plenty in the Windows cache).

It didn’t even try to reclaim the RAM after I quit VMWare and had 1.5GB free.

Obviously, either I’m missing something fundamental or some work needs to be done. Granted, any time you are forced to swap heavily cache won’t help much but they should be at least giving the memory back to the process afterwards.

Supercache never shows up as a process, it grabs the memory when the system boots (it’s one of the first things that happen) and nothing can swap it out. It’s also configurable on-the-fly, Uptempo needs a reboot for any size changes.

With 64-bit all these helper caching programs will probably become obsolete since cache is not limited to 1GB any longer. Though I’m not sure I subscribe to Vista’s Superfetch, since it does make the HD work like crazy when you first start the box and is more suited for boxes that are not shut down it seems. Once it settles down it works OK.

D

Wed
26
Sep '07

Ate at The Old Homestead in NYC

I’ve been hopped up on uppers all day (relax, just a huge amount of chocolate-covered high-test espresso beans, though the amount of caffeine was surely enough to get me disqualified from competing in any sport - every time I pee it smells like freshly-brewed coffee). Needing something to relax me, and since my bowel movements have been altogether too easy lately, I thought I’d go for steak. Two birds with one stone.

It’s been a while since my last red meat extravaganza, and, at the behest of my buddies, I tried The Old Homestead, on 14th and 9th.

The place is a bit old-fashioned, as befits most NYC steakhouses. There’s this weird old sign, stating this place is “the king of beef”.

I bumped into Odin on the way in, he was ordering takeout for the lads. We exchanged knowing nods, told him to say hi.

I was served by a decrepit waiter with a handlebar moustache, he probably was almost too old to fight when he was drafted in WWI. He had an accent so I asked him where his pith helmet was. He, in turn, recommended the 36oz ribeye, priced no more than lighter fare on the menu. Once again, I asked for an internal temperature between 145F and 150F, once again I got a blank stare. So far, only the people at Emeril’s Delmonico in Vegas have been able to respond to this request without batting an eyelid. But that is a story for another day.

I also ordered a chopped salad since I’ve been told I need some roughage. The salad was amazing, and enough for two. I ate the whole thing, not one to ignore roughage consumption guidelines.

Then the steak came.

Oh dear.

The bone wasn’t even that big. The rest was all meat and a bit of fat. This is, to date, the largest single steak I’ve had (though not, alarmingly, the largest amount of meat I’ve consumed in one sitting). And was it good! It was served with a roasted head of garlic, French style. Not quite the consistency of the steak in Flames (that was almost like good Ahi) but still awesome.

I almost couldn’t eat the whole thing. But I did, it was that good. By the end I felt like Mr. Creosote in Monty Python’s The Meaning of Life. And I did not have the “waffer thin mint”.

On the way back to the train, it was hot and, after all this food, I started sweating profusely. I passed by a funeral parlor on 14th and the proprietor eyed me appreciatively. This is not hot-weather food!

Highly recommended.

D

Thu
20
Sep '07

WAN acceleration for remote workers

The deluge of WAN accelerators from Cisco, Riverbed, Juniper, Expand, Packeteer, Bluecoat, Silverpeak etc. etc. is proving good for datacenters. Not sure how many vendors will remain viable in a year or two, but the selection at the moment is decent.

However, most of the vendors don’t address remote desktop acceleration, say for people using 3G cards on their laptops or even cable modems - sometimes the routing to corporate networks can be arcane enough that the ms of latency add up, plus most home connections are asymmetrical anyway.

So, it would be pretty cool to have a WAN accelerator in your laptop, right? Well, so far only two companies have stepped forward:

The far more established product, even if you’ve never heard of it, is AcceleNet Enterprise from ICT (Intelligent Compression Technologies, www.ictcompress.com - they were recently bought by ViaSat). ICT has been doing just this for years, with a veritable who is who of clients (no they haven’t paid me to say this, I just think the stuff is cool). Lots of service providers use it.

ICT deploys a server that acts as a proxy, then you install an agent on your laptop. Transfers are compressed both ways.

The other vendor is known to us all - it’s Riverbed. They have now what’s called Steelhead Mobile. Effectively, it puts a Riverbed box inside your laptop. A normal Steelhead is needed to communicate with, as well as a Steelhead Mobile Controller for management. I saw pricing for the controller and it was a bit dear…

You can even adjust how much cache to give your mini-Riverbed, so if you have the space, go nuts.

Of course, you can also use this technology for servers and save money on appliance costs - I wonder if they have something that checks if you’ve installed it on a server OS, and how much CPU it takes to do its thing.

I heard somewhere Cisco is also working on something similar, unsurprisingly.

D

Fri
17
Aug '07

Processor scheduling and quanta in Windows (and a bit about Unix/Linux)

One of the more exotic and exciting IT subjects is the one of processor scheduling (if you’re not excited, read on, practical stuff to be seen later in the text). Multi-tasking OSes just give the illusion that they’re doing things in parallel - in reality, the CPUs rapidly skip from task to task using various algorithms and heuristics, making one think the processes truly are running simultaneously. The choice of scheduling algorithm can be immensely important.

Wikipedia has a nice article on schedulers in general: en.wikipedia.org/wiki/Scheduling_%28computing%29, good primer.

To cut a long story short: the processors are allowed to spend finite chunks of time (quanta) per process. Note that the quantum has nothing to do with task priority, it’s simply the amount of time the CPU will spend on the task. Every time the CPU switches to a new process, there’s what’s called a context switch (en.wikipedia.org/wiki/Context_switch), which is computationally expensive. Obviously, we need to avoid excessive context switching but still maintain the illusion of concurrency.
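A quick way to see the trade-off is a toy round-robin simulation that counts context switches for different quantum lengths. This is only a sketch: it ignores priorities, I/O waits and the cost of each switch, and the four 240ms tasks are made-up numbers.

```python
def count_context_switches(burst_ms, quantum_ms):
    """Round-robin over CPU-bound tasks; return how many context switches occur.

    burst_ms: CPU time each task needs, in ms (hypothetical workload).
    quantum_ms: fixed time slice handed to a task on each turn.
    """
    remaining = list(burst_ms)
    queue = [i for i, need in enumerate(remaining) if need > 0]
    switches = 0
    last = None
    while queue:
        task = queue.pop(0)
        if last is not None and task != last:
            switches += 1            # the CPU moved to a different task
        remaining[task] -= min(quantum_ms, remaining[task])
        if remaining[task] > 0:
            queue.append(task)       # not finished, go to the back of the line
        last = task
    return switches

# Four tasks needing 240ms of CPU each, at three quantum lengths.
for quantum in (20, 60, 120):
    print(quantum, "ms quantum:", count_context_switches([240] * 4, quantum), "switches")
# 20 ms quantum: 47 switches
# 60 ms quantum: 15 switches
# 120 ms quantum: 7 switches
```

Same total work, but the short quantum pays for more than six times as many switches. That is the throughput cost of making things feel snappy.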

In Windows Server (that uses a multi-level feedback queue algorithm, FYI), the default quantum is a fixed 120ms, close to many UNIX variants (100ms) and generally accepted as a reasonably short length of time that can fool humans into believing concurrency. Compare this to the workstation-level products (Windows Vista/XP/2000 Pro) that have a variable quantum that’s much shorter and also provide a quantum (not priority) boost to the foreground process (the process in the currently active window). In the workstation products, the quantum ranges from 20-60ms typically, with the background processes always relegated to the smallest possible quantum, ensuring that the application one is currently using “feels” responsive and that no background task hampers perceived performance too much. Typically, in a box that’s used as a busy terminal server this will be the better setting to use since it will ensure that the numerous “in-focus” user processes will all get a quantum sooner rather than later.

The longer, fixed quantum of Windows Server means that fewer system resources are wasted on context switching, and that all processes have the same quantum. More total system throughput can be realized with such a scheme, and it’s more of a fair scheduler. It also explains the higher benchmark numbers when running the scheduler in “background services” mode. It’s obviously best for systems that are running a few intensive processes that can benefit from the longer quantum (and, believe it or not, games and pro audio apps run better like this).

Note that I/O-bound threads (processes waiting on disk, mouse, screen and keyboard I/O) are given priority over CPU-bound threads anyway, which explains why the longer quantum doesn’t harm interactivity much. Try it - have 4 winzip/winrar/7zip sessions running concurrently. You CAN still move your mouse :) Here’s a great primer on internal windows architecture: elqui.dcsc.utfsm.cl/apuntes/guias-free/Windows.pdf. Another, deeper dive: download.microsoft.com/download/5/b/3/5b38800c-ba6e-4023-9078-6e9ce2383e65/C06X1116607.pdf.

Of course, there are ways to tune the timeslice in a more fine-grained fashion. In the registry, check out HKLM\SYSTEM\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation . Here are some explanations about how it works: www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/regentry/29623.mspx?mfr=true and www.microsoft.com/mspress/books/sampchap/4354c.aspx are great.

For instance - what if you don’t care to increase the quantum on the foreground window but, instead, just want short, fixed quanta (effectively around 60ms) for all processes to improve response time on a system with a lot of processes? Setting Win32PrioritySeparation to 0x28 will take care of that.

Here’s a useful Win32PrioritySeparation chart from forums.guru3d.com/showthread.php?p=1451631#post1451631:

2A Hex = Short, Fixed , High foreground boost.
29 Hex = Short, Fixed , Medium foreground boost.
28 Hex = Short, Fixed , No foreground boost.

26 Hex = Short, Variable , High foreground boost.
25 Hex = Short, Variable , Medium foreground boost.
24 Hex = Short, Variable , No foreground boost.

1A Hex = Long, Fixed, High foreground boost.
19 Hex = Long, Fixed, Medium foreground boost.
18 Hex = Long, Fixed, No foreground boost.

16 Hex = Long, Variable, High foreground boost.
15 Hex = Long, Variable, Medium foreground boost.
14 Hex = Long, Variable, No foreground boost.
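The chart follows a simple bit layout, which is easier to remember than twelve magic numbers: the top two bits pick the quantum length, the middle two pick fixed vs. variable, and the low two pick the foreground boost. Here is a small decoder consistent with every row above. The function name is mine, and labelling bit patterns that don't appear in the chart as "Default" is an assumption.

```python
def decode_priority_separation(value):
    """Decode a Win32PrioritySeparation value into (length, type, boost)."""
    length_bits = (value >> 4) & 0b11   # bits 5-4
    type_bits = (value >> 2) & 0b11     # bits 3-2
    boost_bits = value & 0b11           # bits 1-0

    lengths = {0b01: "Long", 0b10: "Short"}
    types = {0b01: "Variable", 0b10: "Fixed"}
    boosts = {0b00: "No", 0b01: "Medium", 0b10: "High"}

    # Patterns not covered by the chart are reported as "Default" (assumption).
    return (lengths.get(length_bits, "Default"),
            types.get(type_bits, "Default"),
            boosts.get(boost_bits, "Default"))

for value in (0x2A, 0x28, 0x26, 0x18, 0x14):
    length, kind, boost = decode_priority_separation(value)
    print(f"{value:02X} Hex = {length}, {kind}, {boost} foreground boost")
```

Going the other way is just `(length << 4) | (type << 2) | boost`, so short/fixed/no boost is `(2 << 4) | (2 << 2) | 0`, which gives the 0x28 mentioned earlier.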

Here are some other pages where others have figured out the effective quanta (and remember the numbers are not in ms): blogs.msdn.com/embedded/archive/2006/03/04/543141.aspx (for embedded Windows, I have doubts about the accuracy of his calculations regarding the effective quantum but still interesting), www.microsoft.com/technet/sysinternals/information/windows2000quantums.mspx (for Windows 2000, probably still valid).

Here’s a really nice article on the effects of schedulers and I/O-bound processes on virtualization: regions.cmg.org/regions/mcmg/m102006_files/6187_Mark_Friedman_Virtualization.doc

Linux, on the other hand, has not one but several totally different CPU schedulers and I/O elevators available. Just see this page, comparing 2.6.22 with Vista’s kernel, and note how many non-standard features are available as patches: widefox.pbwiki.com/Scheduler . You can get schedulers with cool names such as genetic, anticipatory, etc. Linux used to suffer on the desktop, but with recent patches interactivity has improved tremendously, and it is now far more viable as a desktop OS. Here’s some cool info on anticipatory schedulers: www.cs.rice.edu/~ssiyer/r/antsched/. Anticipatory schedulers can help systems with slower I/O (laptops and desktops, especially) feel more interactive, and the anticipatory elevator was the default I/O elevator for a while (CFQ is the current default for I/O, though it can cause issues for desktop users, see ubuntuforums.org/showthread.php?t=456692). A list of all the I/O elevators in the kernel: ebergen.net/wordpress/2006/01/26/io-scheduling/. Whitepapers: www.cs.ccu.edu.tw/%7Elhr89/linux-kernel/Linux%20IO%20Schedulers.pdf, www.linuxinsight.com/files/ols2004/pratt-reprint.pdf, www.linuxinsight.com/files/ols2005/seelam-reprint.pdf .

Recently, Linux moved to the Completely Fair Scheduler model (www.osnews.com/story.php/18240/Linux-Switches-to-CFS-Scheduler-in-2.6.23), sparking a lot of controversy (www.osnews.com/story.php/18350/Linus-On-CFS-vs.-SD) since it’s not quite done yet (kerneltrap.org/node/14055). More info on CFS: immike.net/blog/2007/08/01/what-is-the-completely-fair-scheduler/.

Interesting benchmarks showing the effects of scheduling on Linux performance: developer.osdl.org/craiger/hackbench/, math.nmu.edu/~randy/Research/Speaches/Disk%20Scheduling%20In%20Linux.ppt.

For anyone wishing to test the various Linux schedulers’ impact on interactivity, Con Kolivas has something: members.optusnet.com.au/ckolivas/interbench/. Con’s Staircase/Deadline (SD) scheduler (lwn.net/Articles/224865/) didn’t make it to the mainline kernel, unfortunately, and a miffed Con announced he’s dropping out of kernel development. Pity, since I think he single-handedly contributed more to the advancement of Linux interactivity on the desktop than anyone else. It’s great to have the choice of schedulers depending on how you’re planning to use your system - it’s already done with the I/O elevator, let it be done with the CPU scheduler. Instead, Linus invoked his Papal-like powers and made what I consider to be an unsound decision.

The real issue with Linux though is the userland. Here’s a great paper showing issues with the userland and how it robs us of speed: ols2006.108.redhat.com/reprints/jones-reprint.pdf . A lot of the CPU and I/O scheduler design is workarounds for those issues. Unless one deliberately chooses a stripped-down Linux distribution, the amount of bloat in the current code is incredible.

Finally, Solaris 10 also comes with a bunch of different schedulers, which you can assign globally or on a per-process/project basis. Tons more info: www.princeton.edu/~unix/Solaris/troubleshoot/schedule.html, blogs.sun.com/andrei/date/20050131, wiki.its.queensu.ca/display/JES/Solaris+10+Containers+and+Fair+Share+Scheduling, docs.sun.com/app/docs/doc/816-0222/6m6nmlsug?l=en&a=view.

Heady reading, no?

D

Thu
9
Aug '07

Ate at AJ Maxwell’s in Manhattan

Once more, dear reader, I place my colon’s health at peril for your reading pleasure and culinary edification.

I could have gone to Via Brazil for a proper feijoada by walking a few yards from my hotel but, instead, I sacrificed variety on the altar of dedication and had another bone-in ribeye. It is my mission to eat at all the decent NYC steakhouses.

For those who don’t know me (and many who do): I don’t eat steak all the time… indeed, I consider myself a veritable gourmand (and I do know the difference between gourmand and gourmet, as do my belts).

Anyway: ordered a medium-rare ribeye. They chargrill their steaks at AJ Maxwell’s so if you don’t like them that way don’t go. If you do, the steaks are good. The meat was tender and flavorful. It looks colossal but it is (they say) just 22oz. It looked huge and was over 2in thick. Probably 22oz after cooking.

I read some reviews and typically the people that complain asked for medium or medium well. If the piece is that thick and they chargrill it, rest assured the exterior will be pretty crispy if you want medium. By the same token, getting medium rare could mean some parts are pretty rare indeed. Not the place to be if you like medium and above.

I actually thought it was better than Bobby Van’s though still not as good as Flames. However, eating once someplace is not enough of a statistical sample. It’s beef after all, not purified water. Not the easiest thing in the world to be consistent with. Hence the incredulity of most people when I tell them that I had the best steak of my life at Wollensky’s. Maybe I got lucky. Hey, at least I said Wollensky’s, not Applebee’s… it’s a legitimate steakhouse.

After a few months I’ll definitely need colonics to get rid of the barnacles.

BTW, if you just want to read about technology you can select the topics at the top of the screen so you don’t have to read about my steak-eating adventures. Or vice versa.

D

Wed
8
Aug '07

Ate at Bobby Van’s in Manhattan

After the glowing reviews of a colleague I ate at Bobby Van’s on 230 Park. It’s considered to be one of the better NYC steakhouses (there are 4 in the chain, most in NYC).

I got a bone-in ribeye and some mushrooms.

I asked for a 145°F internal temperature and the decrepit waiter looked at me like I had three heads. “What does that mean?” I said medium rare…

The steak was pretty good, slightly overcooked but not as flavorful as what I had at Flames. It was also a bit dry for a ribeye and totally unseasoned. Still, not a bad cut.

The mushrooms provided some lubrication.

Not a religious experience, I’ll try the Old Homestead tomorrow hopefully.

D

Mon
30
Jul '07

Just how much is your antivirus harming your I/O?

I just got a new corporate laptop, a nice, shiny T60 (OK, it’s IBM black and therefore thoroughly incapable of reflecting on any part of the spectrum).

I noticed that doing disk-intensive work was much slower than I’ve been used to. I configured it as a server (see previous posts) and that helped a bit but not as much as I’d like to.

It seems the antivirus software is checking each and every file, and takes 100% of a CPU to do so. Were this not a dual-core box it would be begging for mercy.

Taking an entire CPU is unacceptable IMO. So I ran some benchmarks - the trusty postmark once more to the rescue:

 

After tweaking as a server, antivirus running, 100% CPU utilization while bench running:

Time:
344 seconds total
230 seconds of transactions (86 per second)

Files:
20092 created (58 per second)
Creation alone: 10000 files (95 per second)
Mixed with transactions: 10092 files (43 per second)
9935 read (43 per second)
10064 appended (43 per second)
20092 deleted (58 per second)
Deletion alone: 10184 files (1131 per second)
Mixed with transactions: 9908 files (43 per second)

Data:
548.25 megabytes read (1.59 megabytes per second)
1158.00 megabytes written (3.37 megabytes per second)

 

With a more efficient antivirus program instead, variable CPU utilization (from 10%-100%):

Time:
276 seconds total
174 seconds of transactions (114 per second)

Files:
20092 created (72 per second)
Creation alone: 10000 files (123 per second)
Mixed with transactions: 10092 files (58 per second)
9935 read (57 per second)
10064 appended (57 per second)
20092 deleted (72 per second)
Deletion alone: 10184 files (484 per second)
Mixed with transactions: 9908 files (56 per second)

Data:
548.25 megabytes read (1.99 megabytes per second)
1158.00 megabytes written (4.20 megabytes per second)

 

Disabling the antivirus makes it way faster for transactions:

Time:
174 seconds total
91 seconds of transactions (219 per second)

Files:
20092 created (115 per second)
Creation alone: 10000 files (222 per second)
Mixed with transactions: 10092 files (110 per second)
9935 read (109 per second)
10064 appended (110 per second)
20092 deleted (115 per second)
Deletion alone: 10184 files (268 per second)
Mixed with transactions: 9908 files (108 per second)

Data:
548.25 megabytes read (3.15 megabytes per second)
1158.00 megabytes written (6.66 megabytes per second)

Caching with UpTempo for a nice 50% boost in performance:

Time:
121 seconds total
65 seconds of transactions (307 per second)

Files:
20092 created (166 per second)
Creation alone: 10000 files (277 per second)
Mixed with transactions: 10092 files (155 per second)
9935 read (152 per second)
10064 appended (154 per second)
20092 deleted (166 per second)
Deletion alone: 10184 files (509 per second)
Mixed with transactions: 9908 files (152 per second)

Data:
548.25 megabytes read (4.53 megabytes per second)
1158.00 megabytes written (9.57 megabytes per second)

Not tweaking the laptop as a server resulted in > 400s runtimes in the default config (sometimes 500s). FYI, the drive is a smaller, 5400 RPM jobbie, not the 200GB 7200 RPM SATA I have my eye on.
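To put the four runs side by side, here is a tiny script that works out the speedup of each configuration relative to the first one. The runtimes and transaction rates are copied from the postmark output above; the short labels are my own shorthand.

```python
# (label, total seconds, transactions per second) from the runs above.
runs = [
    ("original antivirus", 344, 86),
    ("lighter antivirus", 276, 114),
    ("antivirus disabled", 174, 219),
    ("UpTempo caching", 121, 307),
]

def speedup(baseline_seconds, seconds):
    """How many times faster a run is than the baseline, by total runtime."""
    return baseline_seconds / seconds

baseline = runs[0][1]
for label, seconds, tps in runs:
    print(f"{label:20s} {seconds:4d}s  {speedup(baseline, seconds):.2f}x  {tps} tps")
```

By total runtime that is roughly 1.25x for the lighter antivirus, 1.98x with antivirus off, and 2.84x with caching on top. Caching alone takes 174s down to 121s, about 1.44x on runtime and about 1.40x on transaction rate (219 to 307 per second).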

One could extrapolate these results. On a bigger box the end results will differ but everything will remain relatively similar.

Obviously, antivirus is sorely needed in this day and age, but if you’re planning on doing heavy I/O be careful what antivirus program you pick and how it’s configured. Depending on the server, I’d gladly trade some protection in exchange for a bunch more performance. Or you can go Unix/Linux and not really have to bother.

I’d say setting up an antivirus program to only scan extensions that can be infected and only scan on creates/modifies and not reads, can boost performance significantly.

Interestingly, caching didn’t help much with antivirus enabled - most of the bottleneck was the antivirus since everything had to go through it first. What if this was a database/email/fileserver with heavy activity?

D

Wed
13
Jun '07

Ate at Murphy’s Style Grill, in Red Bank, NJ

Will be demonstrating Cisco’s WAAS tomorrow in NYC, so today we spent some time going through a testing protocol so we can show people different things.

After we finished we had dinner at Murphy’s in NJ. Strange place. It’s not a classy steakhouse or anything - nor does it have aspirations to be one.

The menu is, to quote Kipling, as immutable as the hills. Apparently any substitutions or deviations are swiftly and sternly stamped out, as though they signify an impending revolution that threatens all that we hold holy. Dressing on the side? Heresy! Burn!

I got the 24oz Delmonico. I was urged not to ask anything about it, lest they bring out someone to take me to the back. He also suggested generous amounts of A1.

At least it was inexpensive (about $17) and properly cooked. If you’re looking for flavor and marbling, look elsewhere. Much of it looked like solid marble, though. Had to surgically remove a good amount of gristle.

Better than the steak at Bowling Green, I have to admit.

D

Fri
8
Jun '07

This has been one of the worst trips ever - because of one of the silliest DR exercises ever

Well, aside from visiting Flames and helping fix a severe customer problem. Those were rewarding. I still haven’t pooped that steak, BTW.

I was supposed to only stay for 1 day in Manhattan, fix the issue, ba da bing. I ended up staying an extra day - had no extra clothes and no time to get anything. Washed my undies on my own and used the hair dryer over a period of hours to dry them. I learned my lesson now and will always have extra stuff with me.

So I try to go back home today and guess what - Air Traffic Control computers had a major glitch (abcnews.go.com/Business/wireStory?id=3259992) that messed up the whole country’s air travel. Thousands of flights delayed and canceled. Mine was canceled, after I spent about 10 hours in the airport. Another 2 hours in the line to simply rebook the flight since they had 3 people trying to serve hordes. And all because, at least according to the report, a system failed and the failover system didn’t have the capacity to sustain the whole load.

So, while I wait in the airport to catch a stand-by flight tomorrow morning, unbathed and frankly looking a bit menacing, I decided to vent a bit. No hotels, no cars.

Maybe this is too much conjecture and if I’m wrong please enlighten me, but let’s enumerate some of the things wrong with this picture:

  1. First things first: While it’s cool to fail over to a completely separate location, typically you want a robust local cluster first so you can fail over to another system in the original location.
  2. If the original location is SO screwed up (meaning that a local cluster has failed, which typically means something really ominous for most places) ONLY THEN do you fail over to another facility altogether.
  3. Last but not least: Whatever facility you fail over to has to have enough capacity (demonstrated during tests) to sustain enough load to let operations proceed. Ideally, for critical systems, the loss of any one site should hardly be noticeable.
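The three rules above boil down to a very small decision procedure. Here is a sketch of it; the function, its parameters and the return strings are all hypothetical, and real failover logic obviously involves far more than one capacity number.

```python
def choose_failover_target(local_standby_ok, remote_capacity, required_load):
    """Pick where to fail over, following the three rules above.

    Capacity and load only need to share a unit (say, requests per second);
    remote_capacity should be a figure demonstrated in testing, not a guess.
    """
    if local_standby_ok:
        return "local standby"                  # rules 1 and 2: stay local if you can
    if remote_capacity >= required_load:
        return "remote site"                    # rule 3: it can carry the load
    return "refuse: remote site undersized"     # failing over would just cascade

print(choose_failover_target(True, 500, 1000))
print(choose_failover_target(False, 1200, 1000))
print(choose_failover_target(False, 500, 1000))
```

The last case is the one that matters: if the remote site can't carry the load, failing over to it anyway just moves the outage and makes it bigger.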

According to the report none of the aforementioned simple rules were followed. Someone made the decision to fail over to another facility, which promptly caved under the load. A cascade effect ensued.

I mean, seriously: One of the most important computer systems in the country does not have a well-thought-out and -tested DR implementation. Guys, those are rookie mistakes. Like some airports having 1 link to the outside world, or 2 links but with the same provider. Use some common sense!

So, I guess I’ll put that in the list together with using what’s tantamount to unskilled labor securing our airports instead of highly trained and well-paid personnel that’s been screened extremely intensely and actually takes pride in the job. Maybe some of those unskilled people are running the computers, it might be like the Clone Army in Star Wars. A mass of cheap, expendable labor that collectively has the IQ of my left nut (I’m not being overly harsh - my left nut is quite formidable). The armed forces heading the same way isn’t the most reassuring thought, either.

Yes, I’m upset!!!

D

Thu
7
Jun '07

ZFS in OSX

Not amazing news but an official announcement nonetheless: Saw this (www.macnn.com/articles/07/06/06/zfs.in.leopard/) and I couldn’t resist posting. This means a few things:

  1. Sun figured out how to make ZFS bootable (at least on OSX)
  2. Someone figured out how to deal with ZFS and resource forks (I can’t believe they are willing to break compatibility with so much software otherwise).

Now I just need a Mac so I can run some benchmarks before and after. I have some buddies that might oblige… finally the Macs get a decent FS.

Now if only Apple could lose the silly Mach legacy. It’s a common misconception that the kernel in OSX is FreeBSD - it ain’t. Run lmbench (www.bitmover.com/lmbench/) on different platforms and compare results such as context switching, thread creation and whatnot. Then you’ll see why OSX can’t always make a decent server OS.
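
If you don't have lmbench handy, a very rough stand-in for the kind of numbers I mean can be sketched in Python. To be clear, this is not lmbench and the function names are my own invention. It also measures interpreter overhead on top of the kernel, so the results are only worth comparing between platforms running the same Python version on similar hardware.

```python
import os
import threading
import time


def thread_creation_us(iterations: int = 200) -> float:
    """Average microseconds to create, start and join one do-nothing thread."""
    start = time.perf_counter()
    for _ in range(iterations):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / iterations * 1e6


def pipe_round_trip_us(iterations: int = 2000) -> float:
    """Average microseconds for one byte to bounce between two threads over pipes.

    Each round trip forces the two threads to hand control back and forth,
    so it is a crude proxy for a context-switch latency test.
    """
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()

    def echo() -> None:
        # Read one byte from the parent and send it straight back.
        for _ in range(iterations):
            os.write(to_parent_w, os.read(to_child_r, 1))

    worker = threading.Thread(target=echo)
    worker.start()
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(to_child_w, b"x")
        os.read(to_parent_r, 1)
    elapsed = time.perf_counter() - start
    worker.join()
    for fd in (to_child_r, to_child_w, to_parent_r, to_parent_w):
        os.close(fd)
    return elapsed / iterations * 1e6


if __name__ == "__main__":
    print(f"thread create+join: {thread_creation_us():.1f} us")
    print(f"pipe round trip:    {pipe_round_trip_us():.1f} us")
```

Run it a few times on each box and compare the medians. For anything you plan to quote, use the real lmbench.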

D

Ate at Flames in Manhattan

I was helping a client in the Wall Street District today with some rather obscure CIFS performance issues (Opportunistic Locks anyone? Berserk BDCs causing issues? Multi-user Access DBs over WAN?)

Had to stay overnight (unplanned) so after putting in some solid hours I decided to get some steak, and NYC is the place to get decent steak.

Did some research and found out that Flames was walking distance from my hotel, so I went.

Got a T-Bone this time (usually go for strip or ribeye but the waiter insisted, even though they had far more expensive cuts on offer). Some creamed spinach and a small salad and I was set.

Flames is one of those fancy places where they cut your steak for you. At least they don’t feed you or, indeed, help you masticate.

Not that they would need to - the dry-aged steak had fantastic flavor and was reasonably tender (not the most tender but good). I wish it had been a tad less cooked but it was still great, and I devoured it in atavistic glory, almost beating the man-pelt on my chest in ecstasy. It’s been a while since I’ve had proper dry-aged beef.

The creamed spinach wasn’t too creamy or salty. The salad was just OK, I typically use salads for intestinal lubrication anyway and it served the purpose.

I did overhear some patrons asking for well done steaks. Sadly, this is one of those places where they won’t try to talk you out of it. I think steakhouses should make you actually sign a waiver if you want to commit such a culinary atrocity.

I also overheard a waiter trying to sell some $100 “Kobe” steak to some ladies, telling them how they massage the cows 4 times a day. I discreetly shook my head at them and they got the message.

Anyway - long story short, strongly recommended, and don’t dare order anything beyond medium-rare.

Now back to washing and drying my Superman underoos - I had no change of clothes and I’m writing this naked. It kinda is an appropriate image for this review though…

D

Sat
2
Jun '07

IBRIX at EMC World

I’ve known about IBRIX for a while, but it was refreshing to talk to a decent techie that knew the product. They have improved it a lot over the past year.

For the uninitiated, IBRIX can be any of the following:

  1. A network-based filesystem using the IBRIX client and protocol
  2. A network-based filesystem accessible using NFS or CIFS
  3. A SAN-based parallel filesystem

The product’s claim to fame is its scalability and performance (realized by adding extra nodes “hot”). Their most famous client is probably Pixar, which replaced a ton of NetApp boxes with an IBRIX cluster and realized huge performance benefits and vastly reduced costs. I’ve always liked cool filesystem technologies and this definitely falls under the realm of “cool”. Some highlights based on notes I took on my Blackberry during the session and questions I asked:

  • No limits on filesystem size (they have deployed single namespace filesystems several PB in size).
  • 300MB/s read, 200MB/s write per node on a small box. Bigger boxes can do 1.2GB/s per node; of course your storage needs to be able to keep up.
  • No limit on the number of nodes.
  • Automatic rebalancing of data over time. When you add new disk you rebalance to keep things humming.
  • Dedicated IBRIX backup node, works with 3rd party backup SW, can have many backup servers for backup speed.
  • Has snaps now (global). The lack of snapshots was a failing of the product before.
  • No real limit on the number of files per FS.
  • Biggest file size they have tested on production is an 8TB file, no software limit.
  • Nodes use FC to access storage, clients use Ethernet.
  • Client on Windows or Linux, otherwise general NFS and CIFS. Client is fastest.
  • Your prod servers can double as the IBRIX nodes, but that is very compute-intensive. They recommend using the client (IP-based, bonded) or getting an 8-core box.
  • There is no single lock manager - this is the coolest thing. There is global metadata and global locking, all nodes participate equally.
  • How are node failures handled? All nodes interchangeable. All see same storage. Storage allocated to remaining servers if you lose a node.
    Can lose all but 1 server.
  • Back-end storage size per node? Unlimited.
  • Multipathing per node? Powerpath works. Can do bonded GigE up to 8 ports per.
  • How are files allocated? The file inode contains the info concerning which node it needs to go to. Round-robin allocation or preferred servers per file type. Also if server over 50% full then it’s skipped.
  • All volumes accessible by all nodes.
  • Can stripe huge files across many nodes.
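
The file allocation note above (round-robin or preferred servers per file type, skipping any server over 50% full) is easy to picture with a toy model. The sketch below is my own guess at the policy based purely on those notes. The Node and Allocator classes, the extension-based preferences and the behavior when every node is too full are all assumptions, not anything IBRIX showed me.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One storage server in the toy cluster (hypothetical model)."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def too_full(self, threshold: float) -> bool:
        return self.used_gb / self.capacity_gb > threshold


class Allocator:
    """Round-robin placement with per-file-type preferences and a fullness cutoff."""

    def __init__(self, nodes: List[Node],
                 preferred: Optional[Dict[str, List[str]]] = None,
                 threshold: float = 0.5) -> None:
        self.nodes = nodes
        self.preferred = preferred or {}  # e.g. {".mov": ["n3"]}
        self.threshold = threshold        # nodes fuller than this are skipped
        self._cursor = 0                  # where the next round-robin scan starts

    def place(self, filename: str, size_gb: float) -> Optional[str]:
        """Return the name of the node that receives the file, or None."""
        ext = os.path.splitext(filename)[1].lower()
        by_name = {n.name: n for n in self.nodes}
        # Preferred servers for this file type get first refusal.
        for name in self.preferred.get(ext, []):
            node = by_name.get(name)
            if node is not None and not node.too_full(self.threshold):
                node.used_gb += size_gb
                return node.name
        # Otherwise plain round-robin, skipping nodes over the threshold.
        for step in range(len(self.nodes)):
            index = (self._cursor + step) % len(self.nodes)
            node = self.nodes[index]
            if not node.too_full(self.threshold):
                self._cursor = (index + 1) % len(self.nodes)
                node.used_gb += size_gb
                return node.name
        # Every node is past the threshold; what the real product does here
        # is unknown to me, so the sketch just gives up.
        return None


if __name__ == "__main__":
    nodes = [Node("n1", 100), Node("n2", 100, used_gb=60), Node("n3", 100)]
    alloc = Allocator(nodes, preferred={".mov": ["n3"]})
    for f in ["a.txt", "b.txt", "c.mov", "d.txt"]:
        print(f, "->", alloc.place(f, 10))
```

In this run n2 never gets a file because it starts out 60% full, which is the kind of skew the automatic rebalancing is presumably there to fix over time.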

I’m stoked! I can think of so many uses for this product:

  1. Data mining
  2. Digital media
  3. Oil and gas
  4. Backups

D