PP-PME load balancing improvements

Add a minimum number of nstlist tuning intervals and a minimum time
delay at the beginning of the run before load balancing starts. This
allows hardware clocks to ramp up and avoids overestimated early
measurements, which would make subsequent measurements with different
grid setups appear faster only because of hardware warmup.
Also use global variables to adjust the number of measurements to be
skipped after switching configs.

Refs #3208
Fixes #2203
Change-Id: If835d2482e127caa51d50f45f25c19144d35efaa