NFTAPE is a configurable tool for injecting faults, triggering injections, producing workloads, detecting errors, and logging results. It provides a framework in which lightweight fault injectors, triggers, and target applications (workloads) can be specified. The system is configuration-driven, meaning that users can specify a wide variety of fault injection scenarios without recompiling the system.
To this end, NFTAPE facilitates specifying each component in the experiment (fault injectors, triggers, workloads); directs communication among the components; logs the activities of the components; and directs the execution of the processes that make up those components.
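To illustrate what "configuration-driven" can mean in practice, here is a minimal Python sketch in which an experiment is described purely as data and loaded at run time. It does not show NFTAPE's actual configuration format, which is not described here. The names ComponentSpec, Experiment, load_experiment, and EXAMPLE_CONFIG, along with every field, node name, and command string, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Component kinds named in the text above; the string labels are assumptions.
VALID_KINDS = {"injector", "trigger", "workload"}


@dataclass
class ComponentSpec:
    """One experiment component, described as data (hypothetical schema)."""
    name: str
    kind: str      # one of VALID_KINDS
    command: str   # hypothetical command line used to start the process
    node: str      # hypothetical name of the node the process runs on


@dataclass
class Experiment:
    """A fault-injection scenario assembled from component specifications."""
    components: list = field(default_factory=list)

    def by_kind(self, kind):
        """Return all components of the given kind, in configuration order."""
        return [c for c in self.components if c.kind == kind]


def load_experiment(config):
    """Build an Experiment from a plain dictionary, rejecting unknown kinds."""
    experiment = Experiment()
    for entry in config["components"]:
        if entry["kind"] not in VALID_KINDS:
            raise ValueError(f"unknown component kind: {entry['kind']!r}")
        experiment.components.append(ComponentSpec(**entry))
    return experiment


# Changing the scenario means editing this data, not the code above.
EXAMPLE_CONFIG = {
    "components": [
        {"name": "app", "kind": "workload", "command": "./app", "node": "node-a"},
        {"name": "timer", "kind": "trigger", "command": "./timer 5", "node": "node-a"},
        {"name": "bitflip", "kind": "injector", "command": "./bitflip", "node": "node-a"},
    ]
}

if __name__ == "__main__":
    experiment = load_experiment(EXAMPLE_CONFIG)
    for component in experiment.components:
        print(component.kind, component.name, component.node)
```

The point of the sketch is only that a new scenario (a different injector, trigger, or workload) is a change to the configuration data, while the loading code stays the same.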
The following diagram depicts the architecture of NFTAPE:
Figure: NFTAPE High-level Architecture
Here, there are two network nodes depicted. In reality, NFTAPE can run on any number of nodes.
On one node, a Control Host is running. The Control Host launches, logs, and facilitates the orderly execution of a fault-injection experiment. Typically, the Control Host runs on a node that is external to the nodes directly involved in a given fault injection scenario.
A Process Manager daemon process runs on each of the remaining nodes. Its job is to manage the experimental activity of the processes on a single node, and to provide communication between the processes on that node and the Control Host.
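The division of labor between the Control Host and the Process Managers can be sketched as a small in-process Python model. This is an assumption-laden illustration, not NFTAPE's implementation: the class names, method names, and log format are hypothetical, and direct method calls stand in for the network communication between nodes.

```python
class ProcessManager:
    """Stands in for the per-node daemon (hypothetical interface).

    It tracks the processes on one node and relays their events
    to the Control Host.
    """

    def __init__(self, node, control_host):
        self.node = node
        self.control_host = control_host
        self.processes = []

    def launch(self, process_name):
        # A real daemon would start an operating-system process here;
        # this sketch only records the name and reports the event.
        self.processes.append(process_name)
        self.report(process_name, "started")

    def report(self, process_name, event):
        # Relay an event from a local process to the Control Host.
        self.control_host.log(self.node, process_name, event)


class ControlHost:
    """Stands in for the Control Host: launches, logs, and orders execution."""

    def __init__(self):
        self.managers = {}      # node name -> ProcessManager
        self.log_entries = []   # (node, process, event) tuples, in arrival order

    def register_node(self, node):
        manager = ProcessManager(node, self)
        self.managers[node] = manager
        return manager

    def log(self, node, process_name, event):
        self.log_entries.append((node, process_name, event))

    def run(self, plan):
        # The plan is an ordered list of (node, process) pairs; the
        # Control Host asks each node's manager to launch in that order.
        for node, process_name in plan:
            self.managers[node].launch(process_name)


if __name__ == "__main__":
    host = ControlHost()
    host.register_node("node-a")
    host.register_node("node-b")
    host.run([("node-a", "workload"), ("node-b", "injector")])
    for entry in host.log_entries:
        print(entry)
```

In this model the Control Host never touches a process directly: every launch and every logged event passes through the Process Manager of the node concerned, mirroring the roles described above.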