I still think there's room for experimenting with a Lisp that has the hashmap (a.k.a. dictionary, associative array, property bag) as its organizing principle rather than the list. There's nothing better than hashmaps for exploratory programming. If there is something to be gained from building a Lisp in terms of them (from the ground up; I don't mean support for object literals), I'd like to know what it is. This has little to do with syntax.
This presents a very thorny language design problem. Lists have this very nice property that they are ordered, which lets you attach implied semantics to the position of an element, e.g.:
(defun foo (a b c) ...)
We know this is a function definition because the symbol DEFUN is in the FIRST position of the list.
Associative arrays are unordered, so you have to explicitly label everything:
{ top-level-form : defun, arguments : { 1 : a, 2 : b, 3 : c }, ... }
Or something like that. You can see where that would get annoying.
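For concreteness, here's the contrast sketched in Python; the key names are the illustrative ones from the example above, not a proposal:

```python
# Hypothetical encodings of (defun foo (a b c) ...) -- names are illustrative.

# List-based: position carries meaning; DEFUN is recognized by being first.
list_form = ["defun", "foo", ["a", "b", "c"], ["body"]]

# Hashmap-based: every role must be labeled explicitly, and the argument
# order has to be recovered from the numeric keys.
map_form = {
    "top-level-form": "defun",
    "name": "foo",
    "arguments": {1: "a", 2: "b", 3: "c"},
    "body": ["body"],
}

# Reconstructing the ordered argument list requires sorting the keys.
args = [map_form["arguments"][i] for i in sorted(map_form["arguments"])]
print(args)  # ['a', 'b', 'c']
```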
Exercise for the reader: what happens if you base a Lisp-like language on vectors instead of cons cells?
Could you allow (a b c) as a synonym for { 0 a 1 b 2 c }? That's more or less how JS deals with this, though JS feels messy under the hood.
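A rough Python sketch of that synonym, assuming the map's keys are exactly the indices 0..n-1 (more or less the trick JS arrays play, minus the `length` bookkeeping):

```python
def to_map(seq):
    """Encode an ordered sequence as an index-keyed map."""
    return {i: x for i, x in enumerate(seq)}

def to_seq(m):
    """Recover the sequence; valid only if keys are exactly 0..n-1."""
    return [m[i] for i in range(len(m))]

m = to_map(["a", "b", "c"])
print(m)          # {0: 'a', 1: 'b', 2: 'c'}
print(to_seq(m))  # ['a', 'b', 'c']
```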
> what happens if you base a Lisp-like language on vectors instead of cons cells?
I want to know that too! One thing is that you lose cons and cdr (except by copying the vector), so you lose the ability to do most of the classic Lisp things efficiently. You must have something in mind other than this?
> Could you allow (a b c) as a synonym for { 0 a 1 b 2 c }?
Yes, of course, but all you've done is re-invent vectors.
> you lose cdr (except by copying the rest of the vector)
Copying the rest of the vector is not semantically equivalent to CDR unless you are being purely functional. But you can actually implement CDR using displaced vectors.
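A minimal sketch of the displaced-vector idea in Python (Common Lisp's displaced arrays are the real thing; this just models a backing vector plus an offset):

```python
class DisplacedVector:
    """Sketch of a CL-style displaced vector: a view (backing, offset),
    so CDR shares storage instead of copying. Names are illustrative."""
    def __init__(self, backing, offset=0):
        self.backing, self.offset = backing, offset
    def car(self):
        return self.backing[self.offset]
    def cdr(self):
        return DisplacedVector(self.backing, self.offset + 1)
    def __len__(self):
        return len(self.backing) - self.offset

v = DisplacedVector(["a", "b", "c"])
print(v.cdr().car())   # 'b' -- no copying happened
print(len(v.cdr()))    # 2
```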
> You must have something in mind other than this?
Could be :-)
Try writing EVAL for a vector-based Lisp and see what happens. In particular, when you're done, make a list of all the primitives you needed to employ in order to do it. There's a deep insight at the end of this process. (No, I'm not going to tell you what it is.)
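To make the exercise concrete, here is a deliberately tiny sketch of such an EVAL in Python, handling only quote, if, and application of host-level primitives; it is nothing like a full interpreter, but notice how integer indexing into each form shows up immediately:

```python
# A toy EVAL over vector-encoded forms, e.g. ["if", test, then, else].
# Illustrative subset only.

def veval(form, env):
    if isinstance(form, str):       # symbol: look it up
        return env[form]
    if not isinstance(form, list):  # anything else self-evaluates
        return form
    op = form[0]                    # integer indexing already required
    if op == "quote":
        return form[1]
    if op == "if":
        return veval(form[2], env) if veval(form[1], env) else veval(form[3], env)
    # default: apply a host-level primitive to evaluated arguments
    f = veval(op, env)
    return f(*[veval(x, env) for x in form[1:]])

env = {"t": True, "first": lambda v: v[0]}
print(veval(["if", "t", ["quote", "yes"], ["quote", "no"]], env))  # 'yes'
```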
> Yes, of course, but all you've done is re-invent vectors.
True, but only as a device to keep the playing field open for any new benefits that might spring from universal hashmaps. What are they? I don't know, that's the question. Maybe there aren't any, but that itself would be interesting as a negative result (the L in Lisp is there for a reason).
What I'd really like to know is what would complete this sequence: list:Lisp, array:APL, stack:Forth, hashmap:?. That would be really interesting, no? And no one seems to be working on it, which is surprising given the ubiquity of hashmaps in modern languages (especially JS, but also Python, Lua, etc).
when I had a similar thought about hashmaps, I noticed that a hashmap is basically a function that takes one argument and returns a constant. And then I fell into a Turing tar pit about Church encoding. ... I'd love to try again, though.
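That observation is easy to demonstrate in Python: a hashmap restricted to its keys is a unary function, and a function over a finite domain tabulates back into a hashmap:

```python
# A hashmap viewed as a function of one argument.
m = {"x": 1, "y": 2}
f = m.__getitem__        # or: lambda k: m[k]
print(f("x"))            # 1

# Conversely, a function over a finite domain can be tabulated
# back into a hashmap.
g = lambda k: k * 2
table = {k: g(k) for k in [1, 2, 3]}
print(table)             # {1: 2, 2: 4, 3: 6}
```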
> In particular, when you're done, make a list of all the primitives you needed to employ in order to do it. There's a deep insight at the end of this process. (No, I'm not going to tell you what it is.)
Having done this before (https://github.com/munificent/lark), all it meant for me was that numbers and basic integer arithmetic became primitive. The clever bit about McCarthy's original metacircular interpreter is that it didn't need any types at all beyond atoms and cons cells: no bools, no ints, no nothing.
I found it interesting, but I don't know if I reached any deep insight. Maybe that's just me, or maybe I missed something.
I suppose depth is in the eye of the beholder. That's pretty much it: to build a Lisp out of vectors you need numbers as primitives. To build a Lisp out of cons cells you don't.
Follow-up exercise: do you need numbers to build a Lisp out of associative arrays? (Hint: this is not a trivial question to answer.)
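One possible line of attack, not necessarily the intended answer: a cons cell is itself a two-entry associative array with symbolic keys, so maps can encode lists with no numeric indexing at all. Sketched in Python, with the strings "car"/"cdr" standing in for symbols:

```python
# Cons cells as tiny associative arrays keyed by symbols, not numbers.
def cons(x, y):
    return {"car": x, "cdr": y}

def car(c):
    return c["car"]

def cdr(c):
    return c["cdr"]

lst = cons("a", cons("b", cons("c", None)))
print(car(cdr(lst)))   # 'b' -- no integers in sight
```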
In Clojure an associative array auto-decomposes to a sequence of pairs when necessary.
Rereading McCarthy's paper yesterday I noticed for the first time that it didn't need numbers. But I hadn't connected all the dots yet.
I'll bite: cdr gets expensive? You could represent a list as a naked array of lisp pointers, but then cons gets expensive. And good luck garbage collecting the vectors.
> cdr gets expensive
Yes, that's one thing. But do you really need CDR?
> You could represent a list as a naked array of lisp pointers, but then cons gets expensive.
Actually, it's much worse than that. CONS actually becomes impossible. (Think about it.) But maybe there's something else that you can use instead of CONS?
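A Python sketch of the problem: with a list stored as one flat array, there is no slot "in front of" an existing array to share, so the closest substitute for CONS is a copying prepend:

```python
# CONS in the Lisp sense -- a new list whose tail IS the old list,
# sharing structure -- has no representation over flat arrays.
# The closest substitute allocates and copies in O(n):

def prepend(x, vec):
    return [x] + vec      # copies the whole vector

old = ["b", "c"]
new = prepend("a", old)
print(new)   # ['a', 'b', 'c']
print(old)   # ['b', 'c'] -- unshared: mutating new won't affect old
```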
> And good luck garbage collecting the vectors
Why would that be a problem? Extant Lisps collect vectors just fine.
"Extant Lisps collect vectors just fine."
But they don't permit pointers inside the vectors, do they? I think if you permit internal pointers so you can have copyless cdr, computing reachability would get really expensive.
The phrase "permit pointers inside the vectors" doesn't really make sense. Vectors are just ordered collections of objects. They don't have insides and outsides, except insofar as the objects contained by the vector could be (but usually aren't) considered "inside" the vector. Lisp vectors are usually implemented as arrays of pointers, so in that sense pointers are "permitted" inside vectors. But I'm guessing that's not what you meant. (I have no idea what you actually meant.)
If you implement lists as vectors with car/cdr (I'm still thinking about your other questions) and choose to have cdr return a view rather than a copy, you can end up in the (rope-like) situation above with pointer B (value (b c)) in addition to pointer A (value (a b c)). But that makes GC inefficient: for a vector to be collected, you have to know what internal pointers into it exist, and then either copy elements out if the pointers are toward the end, or leave the vector uncollected.
(ignore pointers to a, b, and c.)
Ah. What you are calling a "pointer to the inside of a vector" is actually called a displaced vector. It's a standard Common Lisp feature, and no problem for modern GCs.
Interesting! Thanks.
It does mean that you are potentially keeping around masses of data when you only care about one or two elements. Normally not an issue, but normally internal pointers ("displaced arrays") aren't a central organizational concept in a language.
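Python's `memoryview` gives a concrete analogue: a tiny copyless view pins the entire underlying buffer, which is exactly the retention concern here:

```python
# A 2-byte view into a 1 MB buffer keeps the whole buffer alive.
big = bytes(1_000_000)
tail = memoryview(big)[-2:]   # copyless "cdr-like" view near the end
del big                       # the 1 MB object stays reachable via the view
print(len(tail))              # 2
print(tail.obj is not None)   # True: the backing object is still held
```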
The point of JSON, I think, is that you'd only use associative arrays when you need them.
Nobody seems to have mentioned the advantage of table literals: free syntax for ruby-style keyword args.
could accept a map of variable names
> Associative arrays are unordered
That's not true in general, is it? It's just hash maps that are unordered. You can use a tree to represent an associative array (so the order of keys is found by a tree traversal), or a "worse" method like an association list (linked list). You can also just store a linked list of keys along with your hash map.
You are conflating the abstract data type with its implementation. A hash-map is one implementation of an associative array. A tree is another implementation of an associative array. A tree happens to depend on an ordering of the keys in order to do efficient lookup and a hash-map doesn't, but this is an implementation detail. It's not part of the definition of an associative array. And this is a good thing because not all data types have well defined orderings but you still might want to use them as keys, with symbols being the most notable example.
I think the mistake the parent post has made is conflating the kind of ordering you're referring to -- ordering in a tree implementation to allow for O(log n) resolution -- and ordering to preserve the position of each key-value pair as it was in the source structure. These are only the same thing if the source structure already happens to be sorted by some kind of ordering, something that obviously cannot be extended to the general case of preserving positionality.
AFAIK, the only two ways to preserve positionality as part of an associative array are (1) to store a positional list of key-value pairs and use linear search to find matching keys, or (2) to use an additional data structure as an index, either by storing a positional list plus an additional map from each key to its position, or by storing a map plus an additional list of keys in positional order.
Or by having the map nodes have a "next" pointer.
Duh, I should have thought of that. Thank you.
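That next-pointer idea can be sketched as a map whose nodes form a linked list through the entries (`LinkedMap` is a hypothetical name; CPython's insertion-ordered dict achieves the same effect with a different layout, a dense entry array plus a sparse index):

```python
class LinkedMap:
    """Hash lookup in O(1), iteration in insertion order via next-pointers."""
    def __init__(self):
        self._nodes = {}   # key -> [value, next_key]
        self._head = self._tail = None

    def __setitem__(self, key, value):
        if key in self._nodes:
            self._nodes[key][0] = value
            return
        self._nodes[key] = [value, None]
        if self._tail is None:
            self._head = key
        else:
            self._nodes[self._tail][1] = key   # thread the next-pointer
        self._tail = key

    def __getitem__(self, key):
        return self._nodes[key][0]

    def keys(self):
        k = self._head
        while k is not None:
            yield k
            k = self._nodes[k][1]

m = LinkedMap()
m["b"] = 2; m["a"] = 1; m["c"] = 3
print(list(m.keys()))   # ['b', 'a', 'c'] -- insertion order preserved
```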
Smalltalk requires explicit labeling of all but the first argument (though they are ordered) for keyword messages. Good naming conventions can make this enjoyable (at least subjectively).
In a hashmap based lisp, I'd expect defun to look something like
(define-function: (foo-with-a: a b: b c: c) with-body: ...)
Of course, since you're using an unordered representation, one could instead say (with-body: ... define-function: (c: c b: b foo-with-a: a)), which is significantly more confusing. This also means you probably can't do implicit progn.
progn in general will probably be unpleasant, unfortunately, unless one includes syntactic sugar for ordered sequences, which might defeat the purpose of an associative-array-based Lisp.
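Python's keyword arguments give a feel for the labeled, order-free style (`define_function` here is a hypothetical stand-in, not a real API):

```python
# Labeled arguments: order no longer matters...
def define_function(name=None, params=None, body=None):
    return {"name": name, "params": params, "body": body}

a = define_function(name="foo", params=["a", "b"], body="...")
b = define_function(body="...", name="foo", params=["a", "b"])
print(a == b)   # True

# ...but an ordered body (implicit progn) has no natural encoding in a
# pure map; a sequence has to appear somewhere.
```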
The first thing I've noticed about a vector-based Lisp is that you actually need primitive integers so that you can do indexing, which isn't the case with cons-cell based Lisps.
> Exercise for the reader: what happens if you base a Lisp-like language on vectors instead of cons cells?
I've been thinking about this in terms of a self-hosted Parenscript. What I see is a lot more pattern matching and map/reducing. Not altogether a bad thing.
Lua tables are a general-purpose associative structure that are used for both map and list functions in that language: there's syntactic sugar for both consecutive integer keys and simple string keys, and the former are used to make sequences (with internal, semantics-preserving optimizations in the Lua runtime for storing them). They strike me as one of the best options for a single core data structure if one were limited to one.
I haven't used Lua, but that sounds exactly like JavaScript objects. JavaScript does get on very well with basically just these objects.
They're similar, but the Lua form is much cleaner. In particular, JS objects only permit string keys, which makes Arrays sort of preposterous; they also carry other baggage related to the prototype system. Lua tables can have any type except nil (which is not truly first-class in Lua) as keys, and can also have "metatables", which offer similar or greater power than prototypes but are semantically somewhat simpler. To me this makes a world of difference.
You could argue that Javascript is a language with the hashmap as its central data structure. It doesn't have a macro system/quotations/other LISPy things, of course, but you can certainly frame your argument as "would JS benefit from <LISP feature>?"
Yes, JS certainly is that (or comes close), which is the thing I like the most about JS and miss the most in Lisp - so much so that I wrote macros to allow me to use JS-style objects wherever possible. (I don't mean JS-style OO, which I avoid at all costs. Hate "this" and "prototype", love "a.b" and "a[b]".) The benefit of the hashmap for exploratory programming is so great that it raises the question of whether a fully regular/metaprogrammable Lisp could be evolved from it. Not clear if it would be practical, or add anything of value, but if it did it would be interesting.