I've been looking at the support for negative index values to indicate offset from the end of a sequence (similarly to slicing)
Unfortunately there's a performance cost for all accesses for an implementation like that for Cobra Slicing which would wrap some
index checking and adjustment around the current use of a direct map to the C# indexing-and-exception-if-out-of-range
On the basis of a reasonable but untested (leading) assumption
- That the most frequent use (of negative indexes) would be as a *literal* (to get the last or some short end offset from last values in list)
rather than than using a (negative) variable as an index for the same purpose
- I thought we maybe able to get some of the best of both.
That is; recognise and only special case (a single) negative literal index
e.g.
- Code: Select all
a = [ 1,2,3,4]
assert a[-1] == 4
assert a[-4] == 1
For that there's no additional cost for access using (existing working implementation) positive literals or variables
and you only incur any performance cost for the use of negative literals
(in return for removal of that bounds exception and provision of some convenient and simple to use and understand behaviour).
Its definitely not a 100% solution in that negative indices in an index variable would still not be recognised and handled and trigger
a Bounds exception.
A secondary assumption is that negative value indices ( end offset indexing) literal or variable are the minority of cases anyway.
I did some mods to Expr.cobra IndexExpr writeSharpDef()
so as to detect a single unary negative literal as an index on sequences and setup a call a C# runtime overloaded method (GetIndexed) that resolves Object
Type (for Strings arrays and ILists (and generics of same) ) and calls a sub method that offsets the (negative) arg from the Length/Count and
ensures index >=0 to give a final index..
Then index the sequence as per normal obj[i] syntax.
i.e
- Code: Select all
a[-n] becomes GetIndexed(a, -n)
The slow down for this
(as calculated using an internal duration timer around 100_000 iterations of a method asserting a number of indexed accesses)
varies (widely) but is at worst about an order of magnitude.
Direct indexing Lists/Strings/arrays seems relatively fast in comparison to method invocation
Direct indexing into Lists/arrays ~ 0.3 microSec/access, Strings ~1 microSec/access.
The Performance overhead/ratio of going through a couple of methods and some
comparisons rather than a direct index (for unoptimised compiles) varies
- Code: Select all
Worst Best
Lists 1:11 1:2
Strings 1:2.5 1:1.7
Arrays 1:6 1:5.7
(Optimised compiles often give smaller ratios but also optimise down
to 0 durations (:-))
Interestingly enough the timings devolve down to similar values regardless of the type of sequence
~ 0.25 s/100_000 iterations or about 2.5 microSec/per access/index
BUT take the above with a lot of salt since there was a wide variance in the timings across runs or just from a recompilation and rerun
(Variances for Lists up to 5.5x, Others slightly more stable)
Having said max slowdown is order of magnitude though
that is from 0.3 millionths sec to 3 millionths sec
and 1 millionth sec to 10 millionths sec per access.....
for a few accesses I defy anyone to notice the difference.
It makes little difference if rather than calling an overloaded GetIndexed method to do indexing you still use direct indexing but
setup to wrap the index with a C# runtime function that does a similar calculation (to ofset the negative index and pull
any resulting index to be >=0 ) to the above....
e.g similarly to
- Code: Select all
a[CobraImp.negIdxCalc(a, -n)]
There's about the same orders of slowdown (and variance).
If we (add some code complexity and) do negative offset calculations
from within cobra :
Preget sequence length and use the negative index applied against that
l = a.length
...
v1 = a[l-1]
v2 = a[l-2]
You get similar timings as per direct indexing
i.e same order of magnitude + a few percent (unsurprisingly)
If instead we pass the index through a Cobra method
def negIdx(len as int, idx as int) as int
if idx <0
idx = len + idx
if idx < 0
return 0
if idx >= len
return len-1
return idx
v1 = a[.negIdx(a.length,-1)]
v2 = a[.negIdx(a.length,-2)]
You get similar timings as per direct indexing plus ~20-30%
- but (weirdly enough) this is still much faster than mapping onto
C# internal internal handling doing the same thing
i.e If need to care about negative indexing performance you can handle negative offsetting directly in Cobra or avoid negative/end offsets altogether
SO The bias/opinion/conclusion I bring and take from all this :
I would like Cobra to support some form of negative indexing for
a) Convenience - easy indexing from end (last item in sequence)
b) Consistancy - Cobra slicing and similarity to Python syntax
and to remove (some of) an unnecessary source of a bounds exception and cos its sorta neat.
.. I don't want to pay (performance wise) for that unless I choose to use it in my code (if there's a trade off)
I think its worthwhile supporting negative literal indexes as above if only on the basis of convenience.
As implemented there is (at worst) an order of magnitude slowdown
(only to negative indexes and only if used)
BUT I dont think this is particularly relevant given the frequency of use and absolute speed anyway
And there's an escape: If your app is such where negative indexing is used and the degradation is of importance you can resurrect
direct access performance by doing your own end ofset calculations directly (in cobra) or avoiding end/negative ofsets entirely
IF YOU THINK ITS NECESSARY.
Opinions?
CounterArguments?
I'd be really interested if anyone has any proof (from python?) that my assumptions have any (or no) validity