java - Is it a sensible optimization to check whether a variable holds a specific value before writing that value? -
if (var != x) var = x;
is sensible or not? compiler optimize-out if statement? there use cases benefit if statement?
what if var
volatile variable?
i'm interested in both c++ , java answers volatile variables have different semantics in both of languages. java's jit-compiling can make difference.
the if statement introduces branching , additional read wouldn't happen if overwrote var x, it's bad. on other hand, if var == x
using optimization perform read , not perform write, have effects on cache. clearly, there trade-offs here. i'd know how looks in practice. has done testing on this?
edit:
i'm interested how looks in multi-processor environment. in trivial situation there doesn't seem sense in checking variable first. when cache coherency has kept between processors/cores check might beneficial. wonder how big impact can have? shouldn't processor such optimization itself? if var == x
assigning once more value x
should not 'dirt-up' cache. can rely on this?
yes, there cases sensible, , suggest, volatile variables 1 of cases - single threaded access!
volatile writes expensive, both hardware , compiler/jit perspective. @ hardware level, these writes might 10x-100x more expensive normal write, since write buffers have flushed (on x86, details vary platform). @ compiler/jit level, volatile writes inhibit many common optimizations.
speculation, however, can far - proof in benchmarking. here's microbenchmark tries 2 strategies. basic idea copy values 1 array (pretty system.arraycopy), 2 variants - 1 copies unconditionally, , 1 checks see if values different first.
here copy routines simple, non-volatile case (full source here):
// no check (int i=0; < array_length; i++) { target[i] = source[i]; } // check, set if unequal (int i=0; < array_length; i++) { int x = source[i]; if (target[i] != x) { target[i] = x; } }
the results using above code copy array length of 1000, using caliper microbenchmark harness, are:
benchmark arraytype ns linear runtime copynocheck same 470 = copynocheck different 460 = copycheck same 1378 === copycheck different 1856 ====
this includes 150ns of overhead per run reset target array each time. skipping check faster - 0.47 ns per element (or around 0.32 ns per element after remove setup overhead, pretty 1 cycle on box).
checking 3x slower when arrays same, , 4x slower different. i'm surprised @ how bad check is, given predicted. suspect culprit largely jit - more complex loop body, may unrolled fewer times, , other optimizations may not apply.
let's switch volatile case. here, i've used atomicintegerarray
arrays of volatile elements, since java doesn't have native array types volatile elements. internally, class writing straight through array using sun.misc.unsafe
, allows volatile writes. assembly generated substantially similar normal array access, other volatile aspect (and possibly range check elimination, may not effective in aia case).
here's code:
// no check (int i=0; < array_length; i++) { target.set(i, source[i]); } // check, set if unequal (int i=0; < array_length; i++) { int x = source[i]; if (target.get(i) != x) { target.set(i, x); } }
and here results:
arraytype benchmark linear runtime same copycheckai 2.85 ======= same copynocheckai 10.21 =========================== different copycheckai 11.33 ============================== different copynocheckai 11.19 =============================
the tables have turned. checking first ~3.5x faster usual method. slower overall - in check case, paying ~3 ns per loop, , in worst cases ~10 ns (the times above in us, , cover copy of whole 1000 element array). volatile writes more expensive. there 1 ns of overheaded included in different case reset array on each iteration (which why simple slower different). suspect lot of overhead in "check" case bounds checking.
this single threaded. if actual had cross-core contention on volatile, results much, worse simple method, , above check case (the cache line sit in shared state - no coherency traffic needed).
i've tested extremes of "every element equal" vs "every element different". means branch in "check" algorithm predicted. if had mix of equal , different, wouldn't weighted combination of times same , different cases - worse, due misprediction (both @ hardware level, , perhaps @ jit level, can no longer optimize always-taken branch).
so whether sensible, volatile, depends on specific context - mix of equal , unequal values, surrounding code , on. i'd not volatile alone in single-threaded scenario, unless suspected large number of sets redundant. in heavily multi-threaded structures, however, reading , doing volatile write (or other expensive operation, cas) best-practice , you'll see quality code such java.util.concurrent
structures.
Comments
Post a Comment