I took the collection class Matrix in the Supplements directory and placed it in the Cobra.Lang standard library with the new name MultiList. The name "Matrix" was a poor choice because it implied a mathematics orientation when in fact this is a collection class like List, Dictionary and Set.
MultiList provides an n-dimensional List-like collection. It's similar to, but more basic than, C++'s Boost.MultiArray which serves the same purpose.
You can view the code online. It's a start and covers the basics. If anyone wants to contribute a .resize operation, slicing, etc. feel free.
MultiList
36 posts
Re: MultiList
Would there be any drawbacks to making the Cobra list support this multidimensional aspect? If they are minimal, I would love to have this promoted so I can take advantage of the nice list syntax with it.
- torial
- Posts: 229
- Location: IA
Re: MultiList
Do you mean make the List class multi-dim? Or do you mean that nested list literals should be interpreted as MultiList under certain circumstances?
If you mean enhancing the List class, I think it does what it does well, as it is now. Plus it's provided by the virtual machine's standard library and not under our control (other than adding some extension methods).
If you mean literals like:
t = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
...these are currently nested lists as I'm sure you know. I think there are some problems with trying to make these MultiLists. There is the question of what to do with this:
t = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9, 10],
]
...where the rows are not the same size.
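One way to picture the check a MultiList literal would need is the Python sketch below. It is only an illustration of the ragged-rows problem, not anything the Cobra compiler does; the function name infer_shape is made up for this example.

```python
def infer_shape(nested):
    """Return the shape of a rectangular nested list; raise ValueError if ragged."""
    if not isinstance(nested, list):
        return ()            # a scalar contributes no dimension
    if not nested:
        return (0,)
    first = infer_shape(nested[0])
    for i, item in enumerate(nested[1:], start=1):
        shape = infer_shape(item)
        if shape != first:
            raise ValueError(f"element {i} has shape {shape}, expected {first}")
    return (len(nested),) + first


square = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ragged = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]

print(infer_shape(square))
try:
    infer_shape(ragged)
except ValueError as err:
    print("rejected:", err)
```

A compiler could report the same kind of error at compile time, or fall back to a plain nested list; which is better is exactly the open question.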
Also, MultiList is not that mature right now. It supports multiple dims, indexing by integers and looping through all elements, but it does not support looping through sub-multilists. Here is what I mean:
t = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# t is a List<of List<of int>>
for row in t  # row is inferred as List<of int>
    print row
    for x in row
        print x
The above for-loops wouldn't work if t were a MultiList<of int> because MultiList doesn't have an enumerator that returns sub-multilists. I say "sub-multilists" because with the higher dims, you have to switch from the concept of "row". See Boost.MultiArray, for example.
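For illustration, here is a rough Python sketch of what such an enumerator could look like over flat storage. FlatGrid is a hypothetical class written for this example, not the MultiList API, and it copies each chunk for simplicity rather than returning a view.

```python
from math import prod


class FlatGrid:
    """Hypothetical n-dimensional container backed by one flat list."""

    def __init__(self, shape, items):
        self.shape = tuple(shape)
        self.items = list(items)
        if len(self.items) != prod(self.shape):
            raise ValueError("item count does not match shape")

    def __iter__(self):
        # A 1-D grid yields its elements; a higher-dim grid yields
        # sub-grids that have one dimension fewer.
        if len(self.shape) <= 1:
            return iter(self.items)
        return self._subgrids()

    def _subgrids(self):
        inner = self.shape[1:]
        step = prod(inner)          # number of elements in each sub-grid
        for i in range(self.shape[0]):
            start = i * step
            yield FlatGrid(inner, self.items[start:start + step])


t = FlatGrid((3, 3), range(1, 10))
for row in t:
    print(list(row))
```

With three or more dimensions the same loop yields planes, then rows, which is why "sub-multilist" is a better word than "row".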
On a related note, we've played with the idea of allowing a type specified for collection literals like:
# we start the list with a Square,
# but want the list type to be more general:
t = [Square()] to List<of Shape>
We could then entertain this:
t = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
] to MultiList
The absence of the inner type would mean that the compiler should infer it.
Btw, your future comments and inquiries will be clearer if you show some example code of what you mean.
- Charles
- Posts: 2515
- Location: Los Angeles, CA
Re: MultiList
Hi Charles,
Sorry for the ambiguous comments. I had a killer few weeks and my brain was definitely addled. Your comments revealed a few things I didn't realize were available (or hadn't thought through), like the nested list literals and returning all items sequentially.
I'm swamped right now, but if I can find the time to get a release of Naja out this summer, I'd be happy to take on a task or two on extending the MultiList.
In general, good comments on the disadvantages of what I was suggesting, and I'll try to include some examples in the future. I might as well have filed a bug report saying: "It doesn't work" :-O.
Perhaps for a MultiList literal (when it gets solidified enough), a syntax like
x = #[[1,2,3], [4,5,6],[7,8,9]]
would be ok -- I like the # symbol for visually indicating n-dimensional data, but the obvious drawback would be the contextual parsing needed to tell that it isn't a comment :/ Perhaps a literal format like +[] might be ok.
- torial
- Posts: 229
- Location: IA
Re: MultiList
The # sign is a clever idea. Unfortunately, it would be incompatible with the many syntax highlighting configs we have. Most editors don't allow for saying that '#' starts a single-line comment except in certain cases. And that raises the question of whether it would be good for humans as well.
Thanks for the follow up.
- Charles
- Posts: 2515
- Location: Los Angeles, CA
Re: MultiList
I think the x values for each dimension could be memoized
for j in i+1 : len, x *= _shape[j]
to optimize for a user making a lot of value lookups, for example:
sum = 0
for i in ...
    for j in ...
        for k in ...
            for l in ...
                sum += ml[i,j,k,l]
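As a rough sketch of the memoization idea, in Python rather than Cobra: the per-dimension multipliers are computed once at construction and reused on every lookup. StridedList and row_major_strides are hypothetical names for this example and are not taken from the MultiList source.

```python
def row_major_strides(shape):
    """Compute, once, how many flat slots one index step skips in each dimension."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


class StridedList:
    """Hypothetical stand-in for MultiList that caches its strides."""

    def __init__(self, shape, fill=0):
        self.shape = tuple(shape)
        self.strides = row_major_strides(self.shape)  # memoized here
        size = 1
        for n in self.shape:
            size *= n
        self.items = [fill] * size

    def _offset(self, index):
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indexes")
        offset = 0
        for i, n, s in zip(index, self.shape, self.strides):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            offset += i * s      # no inner product loop per lookup
        return offset

    def __getitem__(self, index):
        return self.items[self._offset(index)]

    def __setitem__(self, index, value):
        self.items[self._offset(index)] = value


ml = StridedList((2, 3, 4))
ml[1, 2, 3] = 7
print(ml.strides, ml[1, 2, 3])
```

Whether this pays off in Cobra would need benchmarks; the sketch only shows where the cached values would live.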
- jaegs
- Posts: 58
Re: MultiList
In the simple case of summing, you can just write one for loop that goes through .items. Although that doesn't eliminate your possible point.
Does Boost.MultiArray cache the indexes? If we cache them, do benchmarks show a payoff?
- Charles
- Posts: 2515
- Location: Los Angeles, CA
Re: MultiList
I have poor reading comprehension when it comes to C++, but it looks like Boost.MultiArray caches the indexes. The technical term is "strides." A stride is how far one must move in memory to get to the next element in an array along a certain dimension. In addition to accessing elements in a Boost.MultiArray directly, there are also "Views":
array_type::array_view<3>::type myview =
    myarray[ boost::indices[range(0,2)][range(1,3)][range(0,4,2)] ];
which allow you to specify a range for each dimension (aka slicing) and don't copy the underlying data. I'm assuming accessing a MultiArray directly uses a view too, but I'm not sure.
The stride array is a data member of a View. See View.hpp
const index* strides() const {
    return stride_list_.data();
}
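To make the idea of a non-copying view concrete, here is a minimal Python sketch. GridView and its slice method are invented for this example; they mimic the general approach (shared data plus an offset and per-dimension strides) and are not a port of the Boost code.

```python
class GridView:
    """Hypothetical view: shares `data` with its source and only remaps indexes."""

    def __init__(self, data, shape, strides, offset=0):
        self.data = data                # shared with the source, never copied
        self.shape = tuple(shape)
        self.strides = tuple(strides)
        self.offset = offset

    def slice(self, *ranges):
        """Return a narrower view; each argument is a range over one dimension."""
        if len(ranges) != len(self.shape):
            raise IndexError("one range per dimension is required")
        offset, shape, strides = self.offset, [], []
        for r, n, s in zip(ranges, self.shape, self.strides):
            if r.step <= 0 or r.start < 0 or r.stop > n:
                raise IndexError("range outside the view")
            offset += r.start * s       # skip to the first selected element
            shape.append(len(r))
            strides.append(s * r.step)  # a stepped range widens the stride
        return GridView(self.data, shape, strides, offset)

    def _flat(self, index):
        if len(index) != len(self.shape) or any(
            not 0 <= i < n for i, n in zip(index, self.shape)
        ):
            raise IndexError("index outside the view")
        return self.offset + sum(i * s for i, s in zip(index, self.strides))

    def __getitem__(self, index):
        return self.data[self._flat(index)]

    def __setitem__(self, index, value):
        self.data[self._flat(index)] = value


# A 3 x 4 x 4 grid in row-major order, sliced with the same ranges as above.
base = GridView(list(range(48)), (3, 4, 4), (16, 4, 1))
view = base.slice(range(0, 2), range(1, 3), range(0, 4, 2))
view[0, 0, 0] = -1                      # writes through to the shared data
print(view.shape, base[0, 1, 0])
```

Because the view only stores an offset and strides, slicing a slice is cheap, and a write through the view is visible in the source.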
On a related note, we've played with the idea of allowing a type specified for collection literals like:
# we start the list with a Square,
# but want the list type to be more general:
t = [Square()] to List<of Shape>
I'm in favor of the "to" syntax to specify the exact type of a literal. There are tons of dictionary implementations and using the same literal syntax for all of them is desirable. I'm not clear as to how the "to" keyword differs from "as."
If I get some time next week I'll look into extending the MultiArray class. Methods I'm considering implementing are .reshape, .compareTo, .size (or .numElements). strides and shape will be get properties. Also a constructor that takes in "data as T*".
One choice is to have an accompanying MultiArrayView class so that one can have multiple views of the same multi array like in Boost. A slice will return a MultiArrayView. Alternatively, there could be no MultiArrayView class, and then MultiArray would expose just a single view of itself, like in numpy.
There are lots of other, less important methods on the numpy site if you're interested.
Hopefully I understand all of this correctly.
Also, is there a particular reason for not implementing getHashCode
def getHashCode as int is override
    throw InvalidOperationException()
or is it a TODO?
BTW, what is currently the best editor to use on a Mac?
- jaegs
- Posts: 58
Re: MultiList
Interesting...
>>> a = np.array([[1,2],[3,4]])
>>> b = a[1]
>>> b
array([3, 4])
>>> a[1][1] = 5
>>> b
array([3, 5])
>>> b[1][1] = 6
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'numpy.int64' object does not support item assignment
- jaegs
- Posts: 58
Re: MultiList
I'm not clear as to how the "to" keyword differs from "as."
'to' is the keyword for a cast to an accessible type.
'as' is the keyword for a type declaration.
The thing following the keyword is a type name in both cases.
# in both of the following, a is typed as a Number
var a as Number = 999
# The default type inferred from the init-expr would be int.
# Declaring a wider, but still assignment-compatible, type lets the variable be typed
# explicitly (or typed without an assignment).
var a = 999 to Number
# Type inference makes 'a' a Number: the int 999 is cast to Number, so the type of
# the initializing expression is Number.
In most languages, 'views' onto an underlying structure (or parts thereof) are usually read-only (for performance).
- hopscc
- Posts: 632
- Location: New Plymouth, Taranaki, New Zealand