
Clone Detectors

November 26, 2012 by Steve Modica

Back when cell phones were new, a number of vendors had “clone” problems.  People were cloning phone serial numbers so they could get free cell service.

To combat this problem, the cellular companies built up “Clone Detector” systems. These were massive database servers that had to be extremely fast. They would monitor all calls in progress, looking for two that had the same serial number. If they found a match, that phone had been cloned, and both handsets were taken out of service.
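The matching rule described above can be sketched in a few lines of Python. This is only a toy, in-memory illustration and not how the real systems were built (those ran on large database servers); the class name, method names, and identifiers such as `serial` and `call_id` are all made up for the example.

```python
from typing import Dict, Set


class CloneDetector:
    """Toy sketch: flag a serial number seen on two calls at the same time."""

    def __init__(self) -> None:
        # serial number -> id of the call currently in progress (hypothetical schema)
        self.active: Dict[str, str] = {}
        # serial numbers that have been taken out of service
        self.blocked: Set[str] = set()

    def call_started(self, serial: str, call_id: str) -> bool:
        """Record a new call. Return True if the serial is (now) known to be cloned."""
        if serial in self.blocked:
            return True
        if serial in self.active and self.active[serial] != call_id:
            # Two simultaneous calls share one serial: block the serial,
            # which takes both handsets out of service.
            self.blocked.add(serial)
            del self.active[serial]
            return True
        self.active[serial] = call_id
        return False

    def call_ended(self, serial: str, call_id: str) -> None:
        """Forget a call once it finishes so later calls are not false matches."""
        if self.active.get(serial) == call_id:
            del self.active[serial]
```

Every call start is one lookup keyed by serial number, which is why the speed of the index matters so much once millions of calls are in flight.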

SGI’s systems were uniquely qualified to handle this work.  The company had some stellar Oracle and Sybase numbers and offered these vendors a 10x speedup in clone detection.

The phone call came in from Florida during the test phase of the new system.  The sysadmin called me up and told me that when she dumped a 25% load on the system, it slowed down very quickly.  If she put a full load on the system, it stopped.

This was puzzling.  I’m not a database expert, so I spent time looking at the normal performance metrics.  How busy are the CPUs? Not very.  How busy are the (massive) RAID arrays? Not very.  How much memory is in use? Not much. Nothing was adding up.

I started watching the machine’s disk activity during the 25% load. I noticed one disk was very busy, but it was not the RAID and shouldn’t have been slowing the machine down.  I asked the sysadmin about it.  She said it was the disk with her home directory on it and it shouldn’t be interfering with the machine’s database performance. That answer nagged at me, but she was right.  If the database wasn’t touching the disk, why should it matter?  But how come it was so busy?  There was a queue of pending IOs for Pete’s sake!  Was she downloading files or something?

I asked her if I could take a look at the index files.  Index files are used by a database to keep track of where stuff is.  Imagine a large address book.  I wanted to see if the index files were corrupted or “strange” in any way.  I thought maybe I could audit accesses to the index files and spot some rogue process or a corrupt file.

What I found were soft links instead of “real” files. For you Windows people, they are like shortcuts.  On a Mac you might call them aliases.  She had the index files “elsewhere” and had these links in place to point to them.  She told me “Yeah. I do this on purpose so I can keep a close eye on the index files.  I keep them…. in… my… home… oh!”
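If you want to check for this kind of thing yourself, here is a rough Python sketch that walks a directory and reports every soft link whose real target lives somewhere outside it. It assumes a POSIX-style filesystem; the function name `links_leaving` and the idea of a single database directory are illustrative assumptions, not anything from the actual system.

```python
import os
from typing import List, Tuple


def links_leaving(db_dir: str) -> List[Tuple[str, str]]:
    """Return (link_path, real_target) pairs for symlinks that point outside db_dir."""
    root = os.path.realpath(db_dir)
    found = []
    # os.walk does not follow symlinked directories by default,
    # so links show up as entries without being descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                # A target outside the tree shares no common path equal to root.
                if os.path.commonpath([root, target]) != root:
                    found.append((path, target))
    return sorted(found)
```

Calling `links_leaving` on the directory that is supposed to hold the index files would list any file that actually lives on some other disk, such as somebody's home directory.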

So her ginormous SGI system with hundreds of CPUs and monstrous RAIDs was twiddling its thumbs waiting for her poor home directory disk to respond to the millions and millions of index lookups it could generate a second. It fought heroically, but alas, could not keep up.

Some quick copies to put the index files where they belonged and we had one smoking clone detector system.

