Robustness testing
Definition: Robustness is defined as the degree to which a system
operates correctly in the presence of exceptional inputs or
stressful environmental conditions. [IEEE Std 24765:2010]
Goal: The goal of robustness testing is to develop test cases
and test environments where a system's robustness can be assessed.
Robustness testing approaches
Many research projects have worked on this topic; the two major approaches are:
I. Interface robustness testing: bombarding
the public interface of the application/system/API with valid and exceptional
inputs. The success criterion is in most cases "if it does not crash or hang,
then it is robust"; hence no oracle is needed for the testing (a minimal code
sketch follows the examples below). Examples:
- Fuzz: Fuzz used a simple method (randomly generated strings) to test the
robustness of Unix console applications. They repeated their original 1990
experiment in 1995 and also applied the method to X Window applications. The
results were distressing: originally approximately 40% of the tested
applications could be crashed with this method, and many of the reported
robustness errors remained even after five years. In 2000 they conducted a
third experiment with Windows 2000 applications; the method was similar,
supplying randomly generated mouse and keyboard events to the programs.
The source code of the testing tools can be downloaded from Fuzz's homepage.
- Ballista:
in the Ballista project the robustness of POSIX API implementations was
tested. They conducted a great number of experiments and compared 15 Unix
versions; later the test suite was also implemented for Windows systems.
Part of the POSIX test suite can be downloaded from their website. Over the
years many publications have appeared about Ballista. A good introduction is
the brochure and the "Software Robustness Evaluation" slides.
Ballista suggested quite a few good ideas and techniques, and they carried
out many well-documented experiments, so it is worth a look.
- JCrasher:
JCrasher is a tool that generates robustness tests from Java bytecode in the
form of JUnit tests. It implements novel approaches in a handy tool, which
can even be downloaded as an Eclipse plug-in.
- PROTOS (Security Testing of Protocol Implementations):
the PROTOS project analyzed the robustness and security aspects of protocols.
Besides their papers, they published a test suite for WAP. The project was
later split into a research project (PROTOS Genome) and a commercial tool
called Codenomicon DEFENSICS.
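To make the "does not crash or hang" criterion concrete, here is a minimal
sketch of an interface robustness test in the spirit of Fuzz: it feeds random
byte strings to a console program and only observes whether the process is
killed by a signal or exceeds a timeout. The target command ("bc"), the
limits, and the run count are arbitrary placeholders, not part of any of the
tools above; the signal-based crash detection assumes a POSIX system.

```python
import random, subprocess

def fuzz_once(cmd, max_len=10_000, timeout=5):
    """Feed one random byte string to a console program's stdin.

    The only oracle is the "does not crash or hang" criterion:
    'crash' means the process was killed by a signal,
    'hang' means it exceeded the timeout.
    """
    data = bytes(random.randrange(256) for _ in range(random.randrange(1, max_len)))
    try:
        proc = subprocess.run(cmd, input=data, timeout=timeout,
                              stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return "hang"
    # On POSIX a negative return code means the process died from a signal.
    return "crash" if proc.returncode < 0 else "ok"

if __name__ == "__main__":
    # "bc" is only a placeholder target; substitute any console utility.
    results = [fuzz_once(["bc"]) for _ in range(100)]
    print({outcome: results.count(outcome) for outcome in set(results)})
```

Ballista-style testing differs mainly in how the inputs are generated:
instead of random strings, each parameter of the tested call is drawn from a
pool of valid and exceptional values chosen according to its data type.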
More information
- Z. Micskei, H. Madeira, A. Avritzer, I. Majzik, M. Vieira, N. Antunes: Robustness Testing Techniques and Tools. In: K. Wolter et al. (eds.). Resilience Assessment and Evaluation of Computing Systems, pp. 323-339, Springer-Verlag, 2012. DOI: 10.1007/978-3-642-29032-9_16
- Philip Koopman, Kobey DeVale, John DeVale: Chapter 11. Interface Robustness Testing: Experience and Lessons Learned from the Ballista Project. In: Karama Kanoun, Lisa Spainhower (eds.). Dependability Benchmarking for Computer Systems. ISBN: 978-0-470-23055-8. August 2008. (Online draft)
II. Dependability benchmarking: the aim
is to develop a public benchmark specification that focuses on evaluating the
dependability of the system. It is a much broader field than robustness, as
it covers the other attributes of dependability, such as availability and
maintainability. The common method is to create a workload that resembles
the normal operation of the system under benchmark, then to define a
faultload, which contains typical faults (hardware, software, operator, etc.)
and the exact points in time when they should be injected. The specification
also defines which dependability measures should be collected.
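As an illustration, the following sketch shows the skeleton of such a
benchmark run: a workload thread issues requests, faults from a faultload
are injected at predefined instants, and simple measures (throughput,
failure ratio) are computed at the end. All names, times, and the toy
workload/injector are invented for this sketch; real benchmarks such as
those below define these elements precisely in the specification.

```python
import random, threading, time

# --- Workload: resembles the normal operation of the system under benchmark ---
completed, failed = 0, 0

def workload(stop, do_request):
    global completed, failed
    while not stop.is_set():
        try:
            do_request()              # e.g. one OLTP transaction or one HTTP request
            completed += 1
        except Exception:
            failed += 1

# --- Faultload: typical faults and the exact instants (seconds) to inject them ---
FAULTLOAD = [(2.0, "kill_backend_process"),
             (5.0, "drop_network_link")]

def run_benchmark(do_request, inject, duration=8.0):
    stop = threading.Event()
    worker = threading.Thread(target=workload, args=(stop, do_request))
    start = time.time()
    worker.start()
    for at, fault in FAULTLOAD:
        time.sleep(max(0.0, start + at - time.time()))
        inject(fault)                 # system-specific fault injector
    time.sleep(max(0.0, start + duration - time.time()))
    stop.set()
    worker.join()
    # --- Measures: the dependability measures collected during the run ---
    print("throughput: %.1f req/s" % (completed / duration))
    print("failure ratio: %.4f" % (failed / max(1, completed + failed)))

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs as-is; replace with the real system.
    run_benchmark(
        do_request=lambda: time.sleep(0.01) if random.random() > 0.001 else 1 / 0,
        inject=lambda fault: print("injecting fault:", fault))
```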
- DBench: the
goal of the EU project DBench was to produce guidelines for
developing dependability benchmarks.
Besides the general guidelines and background research, they also developed
concrete benchmarks:
- OLTP benchmarks: four configurations of the Oracle DBMS were compared in the paper "Benchmarking the Dependability of Different OLTP Systems" (DOI: 10.1109/DSN.2003.1209940)
- Webserver benchmarks:
compared the Abyss and Apache web servers; the faultload simulated typical
programmer errors ("Dependability Benchmarking of Web-Servers", DOI: 10.1007/978-3-540-30138-7_25)
- OS benchmarks: benchmarked 6 Windows and 4 Linux versions using the
PostMark file benchmark as a workload ("Benchmarking the dependability of Windows and Linux using PostMark™ workloads", DOI: 10.1109/ISSRE.2005.13)
- IBM Autonomic Computing Benchmark: similar to DBench-OLTP, but it uses
SPECjAppServer2004 as a workload and focuses on the resiliency of the system
to various disturbances.
More information
- Karama Kanoun, Lisa Spainhower (eds.). Dependability Benchmarking for Computer Systems. Wiley-IEEE Computer Society Press, 2008. ISBN: 9780470230558. DOI: 10.1002/9780470370506
- Ali Shahrokni, Robert Feldt. A systematic review of software robustness, Information and Software Technology, Volume 55, Issue 1, January 2013, pp. 1-17, DOI: 10.1016/j.infsof.2012.06.002.
Our robustness testing experiments on HA middleware
Together with Nokia Research Center, we worked on the robustness testing of
HA (high availability) middleware systems. The general idea is illustrated
in the following figure: the left side shows the typical fault sources of an
HA middleware, while the right side shows the means to test against these
fault sources.
Publications:
Test suite and results:
Last modified: 2013. 10. 02.