Stress testing a system by llungster
WhiningDog.NET
12/02/2002
Introduction
Suppose you just designed the hottest graphics card
imaginable. You want to ship it and make a buck. How do you know if it
will work right? How will potential customers know that you've tested
your product for stability? Product testing is tough. It's one aspect
of the computer industry that consumers often underappreciate. Most people
buy their systems from integrators - companies that bundle various parts
into a PC and perform the integration testing. Integrators don't build
the actual parts, though they will sometimes provide specifications for
custom parts. When we buy a system, we assume it has been thoroughly tested
and we rely on the warranty if we encounter problems.
How does the integrator cover his behind? The integrator
relies on systems testing to feel good about the interoperability of the
various components they purchase and assemble into systems. The vendors
of each card, disk drive, case, power supply, etc., also perform tests
to make sure things work.
With a universe of parts as large as the PC industry,
it is impossible to test all combinations of hardware and software for
stability. Nonetheless, there is a need to provide an accepted baseline;
a set of tests that all parties agree defines a properly functioning
device. Computer-related web sites tend to run combinations of well-known
benchmarks such as WinBench, games such as Quake and CPU-intensive
programs such as Prime95.
In R&D and QA test labs, integrators also rely on tools from Microsoft.
It is in Microsoft's best interest that hardware and software work seamlessly,
and thus Microsoft has taken the lead role in this endeavour through
its support of the Hardware Compatibility Tests or HCT.
Some of you may be familiar with the Hardware Compatibility
List (http://www.microsoft.com/windows/compatible/default.asp).
This is a list of devices that have passed HCT testing. When you install
a 3rd party driver and it bears a digital signature, that signature says
that the Windows Hardware Quality Labs (WHQL) is satisfied the driver
will perform safely on a PC that has received the Windows logo.
Do it yourself HCT testing
Wouldn't it be nice if you, as the integrator of
a home-built PC, could run some of the same tests that a hardware manufacturer
ran? Well, you can! Some of the tests provided under the Microsoft Developer
Network (MSDN)
are only available for a fee, but others are readily available for download
if you know what to look for.
-1- The first thing to do is to download
a copy of the HCT. The latest version always supports the latest operating
system and since Microsoft has officially withdrawn support for older
operating systems, don't expect to find HCT kits for older OSes (they
may be available; I haven't bothered to look for them.) For this article,
I ran HCT 10.0 under Windows XP.
Unfortunately, Microsoft's web site tends to change frequently and links
don't often stay static for very long. I downloaded HCT 10.0 from the
"Systems and Servers Testing" page at (http://www.microsoft.com/hwdq/hwtest/devices/systems.asp?area=SystemsServers).
Give it a shot and see if this link is still valid. If you don't find
it there, go to either of these two locations below and do some browsing:
- The Device Driver Kit site at (http://www.microsoft.com/ddk/hct)
- The Windows Hardware Quality Labs at (http://www.microsoft.com/hwdq/hwtest)
The HCT download is rather large - several hundred
megabytes. If you use a modem, drop a movie in the DVD player!
What is Stress Testing ?
Before we go to step 2, let's take a detour and
define what stress testing is. I firmly believe that stress testing should
be done by any DIY system builder. If you're an overclocker or system
tweaker, stress testing is even more important. But what is stress testing?
Ask the typical systems builder who has surfed the
web and he'll tell you stress testing means you run Prime95,
Sandra, Doom
and some benchmark tools like WinBench.
Throw in a few variations on the above; heat up the CPU, peg the load
meter at 100% for a few hours and you've got yourself a stable system
if nothing bad happens, right?
Not so fast! There's nothing wrong with doing all
of the above listed tasks and each does in fact stress one or more parts
of the system. But what about stressing the system as a whole, hitting
all parts at the same time? How about reading and writing to disk while
using both the integer and the floating point unit, causing lots of page
faults, abusing the system cache, causing all those random edge conditions
that can only be achieved by lots of simultaneous tasks running for a
very long time? That's what system stress is about.
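To make the idea concrete, here is a minimal sketch of a combined workload: one thread hammers integer math, one hammers floating point, and one writes and reads back a temporary file, all at the same time. This is only an illustration and not how the HCT does it. The function names (`int_worker`, `float_worker`, `disk_worker`, `run_stress`) are made up for this example, the 64 KB block will usually be served from the OS cache rather than the physical disk, and Python's interpreter lock keeps the CPU threads from truly running in parallel.

```python
import math
import os
import tempfile
import threading
import time


def int_worker(stop, counts):
    """Integer unit: multiply-and-mask loop until told to stop."""
    n, x = 0, 1
    while True:
        x = (x * 6364136223846793005 + 1442695040888963407) & 0xFFFFFFFFFFFFFFFF
        n += 1
        if stop.is_set():
            break
    counts["int"] = n


def float_worker(stop, counts):
    """Floating point unit: sine and square root in a tight loop."""
    n, acc = 0, 0.0
    while True:
        acc += math.sin(n) * math.sqrt(n + 1.0)
        n += 1
        if stop.is_set():
            break
    counts["float"] = n


def disk_worker(stop, counts):
    """File I/O: write a random block, read it back, compare."""
    n, bad = 0, 0
    block = os.urandom(64 * 1024)
    with tempfile.TemporaryFile() as f:
        while True:
            f.seek(0)
            f.write(block)
            f.flush()
            f.seek(0)
            if f.read() != block:
                bad += 1  # data came back different: a real problem
            n += 1
            if stop.is_set():
                break
    counts["disk"] = n
    counts["disk_errors"] = bad


def run_stress(seconds):
    """Run all three workers at once for the given number of seconds."""
    stop = threading.Event()
    counts = {}
    workers = [threading.Thread(target=w, args=(stop, counts))
               for w in (int_worker, float_worker, disk_worker)]
    for t in workers:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in workers:
        t.join()
    return counts
```

Calling `run_stress(2.0)` returns a dictionary of iteration counts per worker plus a `disk_errors` count, which should stay at zero on a healthy system. A real stress run would go on for hours, not seconds.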
Now on to the next step.
-2- After you download the HCT, run the
installer. This takes a bit of time due to the large number of files
to unpack and the large total byte count. The installation script may
ask you some questions. Since we're not going to be submitting any test
reports to the test lab, don't worry about what you enter. Just be sure
to install everything. I used the default installation directory; it
should be rather uneventful.
-3- Run the HCT. There may be an HCT icon
on your desktop. If not, look in your program menu for the HCT folder.
Depending on the version you are running, this may be called simply
"HCT" or "Test Manager". For version 10.0, this
is called "HCT 10.0".
What is the HCT test kit ?
Time for another break. Since we're installing the
HCT test kit, let's take a quick look at what this kit is all about.
I said earlier that manufacturers and integrators
use the HCT to test their hardware and software. Test results may be submitted
to show compliance with the test kit. To accomplish this with a suitable
level of detail, the HCT must log all devices on the test system. Each
time a test is run on the same system, the overall configuration of the
system is checked. The slightest difference from the previous test run
(such as the addition of a new card) will invalidate previous tests. For
our purposes, we're not concerned with logs or even test results. We want
to know if our test system is capable of running the stress test without
crashing or locking up. Therefore, if you encounter dialog boxes asking
you for very specific system information such as BIOS revision numbers,
enter anything.
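The configuration check can be pictured with a small sketch. I don't know how the HCT actually records or compares configurations, so treat this purely as an assumption for illustration: the device list is reduced to a fingerprint, and any change in the list (such as a new card) produces a different fingerprint, which marks earlier results as stale. The names `config_fingerprint` and `results_still_valid`, and the device strings, are hypothetical.

```python
import hashlib


def config_fingerprint(devices):
    """Hash the sorted device list so ordering doesn't matter."""
    canonical = "\n".join(sorted(devices))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def results_still_valid(previous_fingerprint, devices):
    """Earlier test runs count only if the configuration is unchanged."""
    return previous_fingerprint == config_fingerprint(devices)


# Hypothetical device list recorded at the first test run.
baseline = ["video: ExampleCard 1", "disk: ExampleDrive 40GB", "net: ExampleNIC"]
recorded = config_fingerprint(baseline)

# Same devices in a different order: earlier results still count.
same_config = results_still_valid(recorded, list(reversed(baseline)))

# One new card added: earlier results are invalidated.
new_config = results_still_valid(recorded, baseline + ["sound: ExampleAudio"])
```

Here `same_config` comes out true and `new_config` comes out false, which matches the behaviour described above: the slightest change to the hardware sends you back to square one.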
-4- When you start the HCT tests, you may
be asked to reset the test run and delete all test runs. Respond "yes".
-5- If you are asked to select a test,
select "System Under Test".
-6- If you are asked for a Product Name,
enter anything. It doesn't matter since we're not submitting any test
results.
-7- When you reach the test selection window,
you will have a choice of selecting Manual or Automated tests. Expand
the Automated tests tree. In this tree, look for "System Stress".
This is the one test we want. Select it and use "Add Selection"
to add it to the list of tests to execute. Note that you can come back
at any time and explore the other tests in the HCT kit to see what they
do. For now, let's concentrate on System Stress.
-8- Click on "Start Tests" and
you're off and running ! The display will look something like this.
There are two status windows. If you want to stop the test early, click
"exit" on the window titled WmStress. The test timer indicates
a runtime of 72 hours but it will exit after 8 hours. See if your system
can hack it !
Why run Stress ?
When you run System Stress, some of you may not
be convinced that it's all that tough on the system. Believe me when I
say that many home-tweaked systems won't last long under this type of stress
testing! You can have a perfectly stable system that's used every day
to play games, read email and surf the web. Maybe you power it up for
a few hours a night. Or maybe it stays up full time. But until you stress
it like this, your system hasn't seen stress! If you have a CPU load meter
up you'll note that it's not pegged to the max. Don't let this fool you.
Let me reassure you that what is happening in the background is much worse
than a full load. There are a very large number of threads being created,
executed, and deleted. Tons of kernel objects are coming and going. The
video card is being tortured, not by fast paced graphics as with games,
but with complicated rendering tasks that involve all kinds of bitmaps
loading and caching and numerous clipping rectangles. OpenGL exercises
the FPU as well as the display interface; the memory manager is working
overtime; Windows messages are being pumped all over the place. This is
real stress.
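The thread churn part is easy to picture with a rough sketch. This is not the HCT's code, just an illustration of the pattern: create a batch of short-lived threads, let each do a tiny bit of work, wait for them to finish, and repeat. The function name `churn_threads` and its parameters are made up for this example.

```python
import threading


def churn_threads(rounds, threads_per_round):
    """Create, run and retire many short-lived threads; return how many ran."""
    completed = 0
    lock = threading.Lock()

    def task():
        nonlocal completed
        buf = bytearray(4096)  # small allocation, touched once
        buf[0] = 1
        with lock:
            completed += 1

    for _ in range(rounds):
        batch = [threading.Thread(target=task) for _ in range(threads_per_round)]
        for t in batch:
            t.start()
        for t in batch:
            t.join()  # every thread in the batch is gone before the next round
    return completed
```

For example, `churn_threads(1000, 50)` creates and retires 50,000 threads. The CPU meter may barely move, yet the scheduler and memory manager are kept busy the whole time, which is the point made above.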
System Stress is just a script that runs a number
of other tests in the HCT collection. Back in the Test Selection window,
you can see the other tests that are available for execution. For example,
you can test just the serial port or just the display adapter. Needless
to say, the HCT is a very useful tool for isolating problems. You can
also use it to verify accuracy. For instance, the display adapter test
verifies that the correct pixels are rendered in hardware as they would
be in software - no cheating is allowed! Take some time and tinker with the
tests. I think you'll find them very useful, especially if you build and
tweak systems regularly.
Summary
In my years of working with Windows systems, I have
yet to find a system that passed HCT stress yet failed regularly in the
real world. I have, however, seen systems pass other "standard"
tests of the WinBench variety that later failed miserably in the real
world. This is a good indication that the HCT provides sufficient baseline
coverage for system stability. This makes the HCT a much tougher, more
accurate measure of system stability. Users of MP systems are even more
susceptible to stability problems and owe it to themselves to run the
HCT System Stress test. Naturally, no test can guarantee absolute stability;
but it's worth using the toughest tools we can get our hands on.
Those of you with access to MSDN may have come across
another System Stress offering from Microsoft. It is a "fancier"
test, but does essentially the same thing.
Enjoy and happy testing!
If you have any comments or questions, please
post in our forums