Heh, okay, so I did a quick benchmark. Interned string comparison is faster, but it doesn't matter, and I'll explain why. Here's the code I ended up using. One method uses the "==" operator (value equality) and the other uses the "is" operator (reference identity).
class Program

    const NAMESPACE = "NAMESPACE"
    const CLASS = "CLASS"
    const DEF = "DEF"
    const PRINT = "PRINT"
    const ID = "ID"
    const DOT = "DOT"
    const STRING_SINGLE = "STRING_SINGLE"
    const INDENT = "INDENT"
    const DEDENT = "DEDENT"
    const EOL = "EOL"
    const EOF = "EOF"

    var iterations = 100_000

    def main
        tokenKinds = [
            'NAMESPACE', 'ID', 'DOT', 'ID', 'EOL', 'EOL', 'INDENT', 'CLASS', 'ID', 'EOL',
            'EOL', 'INDENT', 'DEF', 'ID', 'EOL', 'EOL', 'INDENT', 'PRINT', 'STRING_SINGLE',
            'EOL', 'DEDENT', 'DEDENT', 'DEDENT', 'EOF'
        ]
        .benchmarkEqualsOperator(tokenKinds)
        .benchmarkIsOperator(tokenKinds)

    def benchmarkEqualsOperator(tokenKinds as List<of String>)
        sw = Diagnostics.Stopwatch()
        sw.start
        for i in .iterations
            for kind in tokenKinds
                if kind == .NAMESPACE
                    pass
                else if kind == .CLASS
                    pass
                else if kind == .DEF
                    pass
                else if kind == .PRINT
                    pass
                else if kind == .ID
                    pass
                else if kind == .DOT
                    pass
                else if kind == .STRING_SINGLE
                    pass
                else if kind == .INDENT
                    pass
                else if kind == .DEDENT
                    pass
                else if kind == .EOL
                    pass
                else if kind == .EOF
                    pass
                else
                    throw FallThroughException("No clause for '[kind]'")
        sw.stop
        print "Using '==' operator: [sw.elapsedMilliseconds] ms"

    def benchmarkIsOperator(tokenKinds as List<of String>)
        sw = Diagnostics.Stopwatch()
        sw.start
        for i in .iterations
            for kind in tokenKinds
                if kind is .NAMESPACE
                    pass
                else if kind is .CLASS
                    pass
                else if kind is .DEF
                    pass
                else if kind is .PRINT
                    pass
                else if kind is .ID
                    pass
                else if kind is .DOT
                    pass
                else if kind is .STRING_SINGLE
                    pass
                else if kind is .INDENT
                    pass
                else if kind is .DEDENT
                    pass
                else if kind is .EOL
                    pass
                else if kind is .EOF
                    pass
                else
                    throw FallThroughException("No clause for '[kind]' or it is not interned")
        sw.stop
        print "Using 'is' operator: [sw.elapsedMilliseconds] ms"
Here are some results on a decently beefy .NET machine:
Using '==' operator: 72 ms
Using 'is' operator: 29 ms
However, like most benchmarks, these results are artificial. The benchmark's list is built entirely from literals, which is why 'is' worked there. When I actually try this approach on a token list obtained from a CobraTokenizer instance, I run into an issue: not all the strings are interned in that case, so the work of determining whether a string is interned (and retrieving the interned version if not) makes the equals operator approach faster. A LOT faster, in fact. Adding this code before doing any 'is' comparisons...
    if not String.isInterned(kind)
        kind = String.intern(kind) to !
...made the time shoot up from 29 ms to 414 ms. I even tried getting creative with a local dictionary mapping each string constant to itself, but the best I could do was bring it down to 119 ms, which is still slower than ==. For grins, I also tried a third approach using kind.equals(.SOME_CONST, StringComparison.Ordinal), which clocked in at around 110 ms.
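For anyone curious what the dictionary idea looks like, here is a rough sketch. It is not the exact code I timed: the class name KindCanonicalizer and the method canonicalize are made up for illustration, and I'm assuming the known kinds get passed in as a list.

```cobra
class KindCanonicalizer

    # Hypothetical helper (not part of the benchmark above).
    # Maps each known kind to the one instance we want to compare with 'is'.
    var _canonical = Dictionary<of String, String>()

    cue init(kinds as List<of String>)
        base.init
        for kind in kinds
            _canonical[kind] = kind

    def canonicalize(kind as String) as String
        # The dictionary finds the entry by hashing 'kind' and then
        # comparing by value, so this does real work on every call.
        if _canonical.containsKey(kind)
            return _canonical[kind]
        # Unknown kinds are returned unchanged.
        return kind
```

The idea would be to call canonicalize on each token kind once and then use 'is' on the result. As far as I can tell, the catch is that the lookup has to hash the whole string and then compare it by value on a hit, so it probably can't be cheaper than a single successful '==' on the same string. That would fit with the 119 ms I saw.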
So, == it is! I know I shouldn't do any premature optimization, but I just can't help myself sometimes.
Too clever for my own good!