Netgauge - Operating System Noise Measurement

Netgauge OS Noise Measurement Description:

The noise pattern in Netgauge allows the precise measurement of OS Noise. The current version supports three different benchmark methods:
  • Fixed Work Quantum
  • Fixed Time Quantum
  • Selfish Detour
Netgauge (including all OS Noise benchmarks) can be downloaded from the main Netgauge page.

General

The benchmarks use Netgauge's high-performance timers, which are available for different architectures. Users should make sure that the configure script detected the timer correctly and that the timer works reliably (for example, that CPU frequency scaling is not active). Netgauge should be run on all cores (CPUs) of a system to obtain realistic results. For example, if the machine has two sockets with four cores each, then Netgauge should be run with 8 processes on this machine. General help:
mpirun -n 1 ./netgauge -x noise --help
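A quick sanity check of a timer can be sketched as follows. This is not Netgauge's timer code; it is a minimal Python illustration that assumes time.perf_counter_ns as the clock, and the helper name check_timer is hypothetical. It only verifies that readings never go backwards and reports the smallest observed step; it does not detect frequency scaling.

```python
import time


def check_timer(clock=time.perf_counter_ns, samples=100_000):
    """Read `clock` back to back and summarise how it behaves.

    Returns (monotonic, min_step): whether no reading went backwards, and
    the smallest positive difference between consecutive readings
    (None if the clock never advanced during the test).
    """
    monotonic = True
    min_step = None
    prev = clock()
    for _ in range(samples):
        now = clock()
        delta = now - prev
        if delta < 0:
            monotonic = False          # timer jumped backwards
        elif delta > 0 and (min_step is None or delta < min_step):
            min_step = delta           # finest step seen so far
        prev = now
    return monotonic, min_step


if __name__ == "__main__":
    ok, step = check_timer()
    print(f"monotonic: {ok}, smallest observed step: {step} ns")
```

The clock is passed in as a parameter so that the same check can be applied to any timer source that returns an increasing integer.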

Selfish Detour (selfish)

This is the default benchmark. It is a modified version of the selfish detour benchmark proposed in [5]. The benchmark runs in a tight loop and measures the time of each iteration. If an iteration takes longer than the minimum iteration time multiplied by a given threshold (900% of the minimal cycle length in the example below), then its timestamp is recorded as a detour. The benchmark runs until it has recorded a predefined number of detours, so it will never halt on a noise-free BG/P system! Example run (on a very noisy laptop):
mpirun -n 2 ./netgauge -x noise 
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Selfish benchmark
# min clock cycles per rank: 91 91 
# Minimal cycle length [ns]: 42.055092 
# Number of iterations (recorded+unrecorded): 606344901 
# Threshold: [% minimal cycle length]: 900 
# CPU overhead due to noise: 7.80%
# Measurement period: 41.26 s
The file "ng.out" contains the time of each detour. The data can be plotted with the gnuplot command: plot "ng.out"
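The detour-recording loop described above can be sketched in a few lines. This is an illustrative Python version, not the Netgauge implementation: the function name selfish_detour, the max_iterations cap, the use of a running minimum, and the overhead formula (time spent in detours divided by the measurement period) are all assumptions made for this sketch. The threshold of 9.0 corresponds to the 900% shown in the example output.

```python
import time


def selfish_detour(clock=time.perf_counter_ns, threshold=9.0,
                   max_detours=100, max_iterations=1_000_000):
    """Spin in a tight loop and record iterations that took unusually long.

    An iteration counts as a detour when its duration exceeds
    threshold * (shortest iteration seen so far). Returns
    (detours, min_cycle, overhead), where detours is a list of
    (timestamp, duration) pairs relative to the start of the run and
    overhead is the fraction of the measured period spent in detours.
    """
    detours = []
    min_cycle = None
    start = prev = clock()
    for _ in range(max_iterations):      # cap so a noise-free run still ends
        now = clock()
        cycle = now - prev
        prev = now
        if cycle <= 0:
            continue                     # clock did not advance; skip
        if min_cycle is None or cycle < min_cycle:
            min_cycle = cycle            # new shortest iteration
        elif cycle > threshold * min_cycle:
            detours.append((now - start, cycle))
            if len(detours) >= max_detours:
                break
    period = prev - start
    lost = sum(duration for _, duration in detours)
    overhead = lost / period if period > 0 else 0.0
    return detours, min_cycle, overhead


if __name__ == "__main__":
    found, shortest, share = selfish_detour()
    print(f"minimal cycle: {shortest} ns, detours: {len(found)}, "
          f"overhead: {100 * share:.2f}%")
```

Because the interpreter itself adds jitter, the numbers from this sketch are not comparable to Netgauge's output; it only shows the structure of the measurement.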

Fixed Work Quantum (FWQ)

The fixed work quantum benchmark performs a fixed amount of work multiple times and records the time it takes for each run. Example run:
mpirun -n 2 ./netgauge -x noise -e fwq
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise -e fwq )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Fixed Work Quantum benchmark
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): -931598522
# random output (to prevent compiler optimizations): -931598522
The file "ng.out" contains the detailed time of each work quantum. The data can be plotted with the gnuplot command: plot "ng.out"
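The fixed work quantum idea can be illustrated with the following Python sketch. It is not the Netgauge code: the workload in work_quantum, the function names, and the default parameters are assumptions chosen for illustration. The checksum plays the same role as the "random output" lines above, namely to make sure the work is actually used.

```python
import time


def work_quantum(n):
    """A fixed amount of integer work; the result is returned so it is used."""
    acc = 0
    for i in range(n):
        acc = (acc * 31 + i) & 0xFFFFFFFF
    return acc


def fixed_work_quantum(samples=200, work=10_000, clock=time.perf_counter_ns):
    """Time the same work quantum `samples` times.

    Returns (durations, checksum). Without noise, all durations would be
    close to min(durations).
    """
    durations = []
    checksum = 0
    for _ in range(samples):
        t0 = clock()
        checksum ^= work_quantum(work)
        durations.append(clock() - t0)
    return durations, checksum


def noise_excess(durations):
    """Per-sample time above the fastest sample, as a simple noise estimate."""
    base = min(durations)
    return [d - base for d in durations]


if __name__ == "__main__":
    times, check = fixed_work_quantum()
    print(f"fastest: {min(times)} ns, slowest: {max(times)} ns, "
          f"checksum: {check}")
```

Taking the fastest sample as the noise-free baseline is a design choice of this sketch; any sample slower than that is attributed to interference.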

Fixed Time Quantum (FTQ)

Netgauge supports the Fixed Time Quantum (FTQ) benchmark described in [2]. A very small work quantum is executed repeatedly until a fixed time quantum has elapsed, and for each time quantum (sample) the number of completed workload iterations is recorded. In the absence of noise, this number should be equal for every sample; when there is noise, it varies. Because every sample takes an equal amount of time, the start and end time of every sample are defined, so periodicity in the occurrence of noise can be analyzed with this method. The workload should be portable (written in C) and must not be modified by compiler optimizations. For this purpose we use the workload described in [4]. A good starting point for the length of the time quantum is one millisecond, as suggested by [3]. Example run:
mpirun -n 2 ./netgauge -x noise -e ftq
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise -e ftq )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Fixed Time Quantum benchmark
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): -931598522
# random output (to prevent compiler optimizations): -931598522
The file "ng.out" contains the number of workload iterations for each time slice. The data can be plotted with the gnuplot command: plot "ng.out"
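The fixed time quantum scheme can be sketched in Python as shown here. This is an illustration under assumptions, not the Netgauge implementation: the function names, the tiny integer workload (which is not the workload from [4]), and the lost_fraction helper are hypothetical. The default quantum of 1,000,000 ns follows the one-millisecond starting point mentioned above.

```python
import time


def fixed_time_quantum(samples=50, quantum_ns=1_000_000,
                       clock=time.perf_counter_ns):
    """Count tiny work units completed in each fixed time slice.

    Slice k covers [start + k*quantum_ns, start + (k+1)*quantum_ns), so
    every sample has a defined start and end time. Returns the counts.
    """
    counts = []
    acc = 0
    start = clock()
    for k in range(samples):
        end = start + (k + 1) * quantum_ns
        count = 0
        while clock() < end:
            acc = (acc * 31 + count) & 0xFFFFFFFF   # very small work quantum
            count += 1
        counts.append(count)
    return counts


def lost_fraction(counts):
    """Share of each slice lost to noise, relative to the best slice."""
    best = max(counts)
    return [1 - c / best for c in counts]


if __name__ == "__main__":
    per_slice = fixed_time_quantum()
    worst = max(lost_fraction(per_slice))
    print(f"best slice: {max(per_slice)} iterations, "
          f"worst slice lost {100 * worst:.1f}%")
```

Slice boundaries are computed from the start time rather than from the end of the previous slice, so a long interruption does not shift later slices; that alignment is what makes periodicity analysis possible.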

Acknowledgments

The noise pattern was funded by the FastOS II (LAB 07-23) project.

References

[1] T. Hoefler, T. Mehlan, A. Lumsdaine and W. Rehm:
  Netgauge: A Network Performance Measurement Framework. In Proceedings of High Performance Computing and Communications, HPCC'07, presented in Houston, USA, Vol. 4782, pages 659-671, Springer, ISBN: 978-3-540-75443-5, Sep. 2007.
[2] Matthew Sottile and Ronald Minnich:
  Analysis of microbenchmarks for performance tuning of clusters. IEEE International Conference on Cluster Computing, 2004, pages 371-377, ISSN: 1552-5244, ISBN: 0-7803-8694-9.
[3] Fabrizio Petrini, Darren J. Kerbyson and Scott Pakin:
  The Case of the Missing Supercomputer Performance. SC '03: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, ISBN: 1-58113-695-1.
[4] Carl Staelin and Larry McVoy:
  mhz: Anatomy of a micro-benchmark. USENIX Annual Technical Conference (NO 98), 1998.
[5] P. Beckman, K. Iskra, K. Yoshii, S. Coghlan and A. Nataraj:
  Benchmarking the Effects of Operating System Interference on Extreme-Scale Parallel Machines. Cluster Computing, vol. 11, no. 1, pp. 3-16, 2008.

© Torsten Hoefler