C# - String.Where Comparatively Poor Performance
I have two methods that take a string and remove 'invalid' characters (characters contained in a HashSet). One method uses LINQ's Where; the other uses a loop with a char array.
The LINQ method takes about twice as long (208756.9 ticks) as the loop (108688.2 ticks).
LINQ:
    string Linq(string field)
    {
        var c = field.Where(p => !hashChar.Contains(p));
        return new string(c.ToArray());
    }
Loop:
    string CharArray(string field)
    {
        char[] c = new char[field.Length];
        int count = 0;
        for (int i = 0; i < field.Length; i++)
            if (!hashChar.Contains(field[i]))
            {
                c[count] = field[i];
                count++;
            }
        // Note: this returns the original string when every character was
        // removed; count == field.Length may have been the intended check.
        if (count == 0)
            return field;
        char[] f = new char[count];
        Buffer.BlockCopy(c, 0, f, 0, count * sizeof(char));
        return new string(f);
    }
My expectation was that LINQ would beat, or at least be comparable to, the loop method. The loop method isn't even optimized. I must be missing something here.
How does LINQ's Where work under the hood, and why does it lose to my method?
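For reference, a comparison like the one described above could be timed roughly as follows. This is only a sketch: the contents of `hashChar`, the class name `FilterTiming`, the helper `AverageTicks`, and the iteration count are assumptions made for illustration, not the setup behind the numbers quoted in the question. The loop variant here is a simplified stand-in that builds the result with the `string(char[], int, int)` constructor instead of copying into a second array.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class FilterTiming
{
    // Assumed set of 'invalid' characters; the original set is not shown.
    public static readonly HashSet<char> hashChar =
        new HashSet<char> { '!', '@', '#', '$' };

    // LINQ variant, as in the question.
    public static string Linq(string field)
    {
        return new string(field.Where(p => !hashChar.Contains(p)).ToArray());
    }

    // Simplified loop variant: one buffer of maximum size, no second array.
    public static string CharArray(string field)
    {
        char[] c = new char[field.Length];
        int count = 0;
        for (int i = 0; i < field.Length; i++)
            if (!hashChar.Contains(field[i]))
                c[count++] = field[i];
        // Build the string from the filled prefix of the buffer.
        return new string(c, 0, count);
    }

    // Average elapsed Stopwatch ticks per call over the given iterations.
    public static double AverageTicks(Func<string, string> method,
                                      string input, int iterations)
    {
        method(input); // warm-up call so JIT compilation is not measured
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            method(input);
        sw.Stop();
        return (double)sw.ElapsedTicks / iterations;
    }
}
```

Calling `FilterTiming.AverageTicks(FilterTiming.Linq, input, 10000)` and the same with `FilterTiming.CharArray` gives two figures to compare; the absolute values depend on the machine, the runtime, and the input.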
If the source code of ToArray in Mono is any indication, your implementation wins because it performs fewer allocations (scroll down to line 2874 to see the method).
Like many methods of LINQ, the ToArray method contains separate code paths for collections and for other enumerables:
    TSource[] array;
    var collection = source as ICollection<TSource>;
    if (collection != null) {
        ...
        return array;
    }
In your case, this branch is not taken, so the code proceeds to this loop:
    int pos = 0;
    array = EmptyOf<TSource>.Instance;
    foreach (var element in source) {
        if (pos == array.Length) {
            if (pos == 0)
                array = new TSource [4];
            else
                // If the number of returned characters is significant,
                // this method will be called multiple times
                Array.Resize (ref array, pos * 2);
        }
        array[pos++] = element;
    }
    if (pos != array.Length)
        Array.Resize (ref array, pos);
    return array;
As you can see, LINQ's version may allocate and re-allocate the array several times. Your implementation, on the other hand, does just two allocations: the upfront one of the maximum size, and the final one, into which the data is copied. That's why your code is faster.
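To make the difference concrete, the sketch below counts how many arrays a grow-by-doubling loop of this shape would allocate for a source of `n` elements. It simulates only the capacity logic (start at 4, double when full, trim at the end) and assumes every `Array.Resize` call to a different size allocates a new array; `GrowthDemo` and `CountAllocations` are hypothetical names used for illustration, not part of Mono or LINQ.

```csharp
using System;

public static class GrowthDemo
{
    // Counts the arrays allocated by a grow-by-doubling loop like the one
    // quoted above when the source yields n elements.
    public static int CountAllocations(int n)
    {
        int allocations = 0;
        int capacity = 0; // the empty starting array has length 0
        for (int pos = 0; pos < n; pos++)
        {
            if (pos == capacity)
            {
                // First element: allocate 4 slots; afterwards: double.
                capacity = (pos == 0) ? 4 : pos * 2;
                allocations++;
            }
        }
        // Final Array.Resize down to exactly n elements, if needed.
        if (n != capacity)
            allocations++;
        return allocations;
    }
}
```

Under these assumptions, `GrowthDemo.CountAllocations(1000)` returns 10 (capacities 4, 8, 16, ..., 1024, plus the final trim to 1000), and each resize also copies the elements gathered so far. The loop version allocates two arrays regardless of the input length.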