In this video "Robust-first computing: Announcing ULAM at ECAL 2015" [8], David Ackley explains why "Robust First" computing and architectures like Movable Feast Machines are so important for scaling up massively parallel hardware.
I think this is incredibly important stuff in the long term, because we've hit the wall with determinism, and the demos are so mind-blowing and visually breathtaking that I want to try programming some of my own Movable Feast Machine systems!
Absolutely, their important work foreshadowed and inspired so much great stuff. Also Douglas Engelbart's NLS pioneered many of the ideas of visual programming.
I think spreadsheets also qualify as visual programming languages, because they're two-dimensional and grid based in a way that one-dimensional textual programming languages aren't.
The grid enables them to use relative and absolute 2D addressing, so you can copy and paste formulae between cells, so they're reusable and relocatable. And you can enter addresses and operands by pointing and clicking and dragging, instead of (or as well as) typing text.
Some people mistakenly assume visual programming languages can't use text, or that they must use icons and wires, so they don't consider spreadsheets to be visual programming languages.
Spreadsheets are a wildly successful (and extremely popular) example of a visual programming language that doesn't forsake all the advantages of text based languages, but builds on top of them instead.
And their widespread use and success disproves the theory that visual programming languages are esoteric or obscure or not as powerful as text based languages.
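The relative/absolute addressing point is easy to make concrete. Here's a toy sketch of formula relocation on copy/paste (my own illustration, not any real spreadsheet's engine): plain references like `A1` shift with the copy, while `$`-anchored parts stay put.

```python
# Toy model of spreadsheet reference relocation (illustrative sketch only).
# "A1" is fully relative; "$B$1" is fully absolute.
import re

def shift_ref(ref, drow, dcol):
    """Relocate one cell reference when its formula is copied."""
    m = re.fullmatch(r"(\$?)([A-Z]+)(\$?)(\d+)", ref)
    abs_col, col, abs_row, row = m.groups()
    if not abs_col:
        col = chr(ord(col) + dcol)   # single-letter columns only, for brevity
    if not abs_row:
        row = str(int(row) + drow)
    return f"{abs_col}{col}{abs_row}{row}"

def copy_formula(formula, drow, dcol):
    """Rewrite every reference in a formula copied by (drow, dcol)."""
    return re.sub(r"\$?[A-Z]+\$?\d+",
                  lambda m: shift_ref(m.group(0), drow, dcol), formula)

# Copying "=A1+$B$1" down one row: the relative ref moves, the absolute stays.
print(copy_formula("=A1+$B$1", drow=1, dcol=0))  # =A2+$B$1
```

This relocatability is exactly what makes a pasted formula "reusable" across a whole column of cells.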
Other more esoteric, graphical, grid-based visual programming languages include cellular automata (which von Neumann explored), and more recently "robust first computing" architectures like the Moveable Feast Machine.
>A rough video demo of Trent R. Small's procedural city generation dynamics in the Movable Feast Machine simulator. See http://nm8.us/q for more information. Apologies for the poor audio!
Programming the Movable Feast Machine with λ-Codons
>λ-Codons provide a mechanism for describing arbitrary computations in the Movable Feast Machine (MFM). A collection of λ-Codon molecules describe the computation by a series of primitive functions. Evaluator particles carry along a stack of memory (which initially contains the input to the program) and visit the λ-Codons, which they interpret as functions and apply to their stacks. When the program completes, they transmute into output particles and carry the answer to the output terminals (left).
Very beautiful and artistically rendered! Those would make great fireworks and weapons in Minecraft!
From a different engineering perspective, Dave Ackley had some interesting things to say about the difficulties of going from 2D to 3D, which I quoted in an earlier discussion about visual programming:
David Ackley, who developed the two-dimensional CA-like "Moveable Feast Machine" architecture for "Robust First Computing", touched on moving from 2D to 3D in his retirement talk:
"Well 3D is the number one question. And my answer is, depending on what mood I'm in, we need to crawl before we fly."
"Or I say, I need to actually preserve one dimension to build the thing and fix it. Imagine if you had a three-dimensional computer: how could you actually fix something in the middle of it? It's going to be a bit of a challenge."
"So fundamentally, I'm just keeping the third dimension in my back pocket, to do other engineering. I think it would be relatively easy to imagine taking a 2D model like this, and having a finite number of layers of it, sort of a 2.1D model, where there would be a little local communication up and down, and then it was indefinitely scalable in two dimensions."
"And I think that might in fact be quite powerful. Beyond that you think about things like what about wrap-around torus connectivity rooowaaah, non-euclidian dwooraaah, aaah uuh, they say you can do that if you want, but you have to respect indefinite scalability. Our world is 3D, and you can make little tricks to make toruses embedded in a thing, but it has other consequences."
Here's more stuff about the Moveable Feast Machine:
Sorry, but this goes in the bucket of all-too-common backseat-programming comments I find on HN. Sounds pretty, has a lot of "ideas", but is absolute hogwash.
If you've programmed an FPGA, or done any sort of analog circuitry, you'll know that what you propose is FAR WORSE than the current status quo.
CPUs already work like that internally, to some extent. Think of the program as a serialized description of a computation graph. Instructions are not mental slavery; they are the most efficient way to encode and decode the graph. A superscalar processor tracks dependencies and will reorder or execute multiple operations at the same time if it has determined that it can do so.
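A heavily simplified sketch of that "recover the graph from the serial encoding" step (my own toy, ignoring renaming, speculation, and every other real microarchitectural detail): walk the instruction stream, track when each register's value is ready, and issue independent instructions in the same wave.

```python
# Toy out-of-order issue sketch: instructions are (dest, src1, src2) register
# triples. Register renaming is assumed to have removed WAR/WAW hazards, so
# only true (read-after-write) dependencies constrain the schedule.
def schedule(instrs):
    """Return lists of instruction indices that could issue together."""
    ready_at = {}   # register name -> wave index at which its value is ready
    waves = []
    for i, (dest, s1, s2) in enumerate(instrs):
        # This instruction can issue once both of its sources are ready.
        wave = max(ready_at.get(s1, 0), ready_at.get(s2, 0))
        while len(waves) <= wave:
            waves.append([])
        waves[wave].append(i)
        ready_at[dest] = wave + 1
    return waves

# r2=r0+r1 and r5=r3+r4 are independent, so they issue together;
# r6=r2+r5 must wait for both.
prog = [("r2", "r0", "r1"), ("r5", "r3", "r4"), ("r6", "r2", "r5")]
print(schedule(prog))  # [[0, 1], [2]]
```

The serial program order is just an encoding; the parallelism was in the graph all along.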
I wonder if anyone has noticed how people first came up with the architecture of a single bus for data and code, and how they then had endless problems due to writing data where code should be and executing data as code. And now they plug the problems by forbidding writing data and code to the same place and reading data as code (afaik).
So yeah, should maybe computers be liberated from the single pipeline?
I have a Gigatron (https://gigatron.io/) which has this kind of architecture. Only the ROM can contain machine instructions, so the only way to program it is to use the ROM interpreter to run bytecode stored in RAM, or to burn a new ROM. Frankly this aspect of it sucks (the Gigatron in itself however is a wonderful learning project and recommended).
There's really not a problem with having a "single bus" for code and data on modern computers, where you can mark data with the NX bit to ensure it doesn't get run, while at the same time allowing you to load programs as data and then mark them as executable.
> So yeah, should maybe computers be liberated from the single pipeline?
Internally, they basically are. Instruction and data caches are separate and only connect at the L2 level. Aliasing between the two pipelines is only allowed if software requests it, with the specific architecture determining the amount of automatic vs. manual synchronization required. What would you prefer, physically disjoint DRAM?
I think I've read about programs misapplying the execution bits accidentally, such that malicious data can slip by. Buffer overflows still don't seem to have gone away despite the bits.
Maybe finish the move and do exactly what was proposed before the single pipeline: don't ever mix data and code in the same memory. No need for physically separating the RAM. Just have designated areas, maybe even allocate code areas on the fly―only not by the program itself.
Even JIT can be accommodated this way, I think: since you know that code comes from the program and not errant buffers, and if the program is marked as JIT-type, let it write to a code area allocated just for this program. But probably not copy from the data memory.
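Here's a toy model of that policy (a pure simulation of my own, not real mmap/mprotect calls): pages start out writable data, and only an explicit loader/JIT hand-off step may flip a page to executable, dropping write access at the same moment.

```python
# Minimal W^X ("write xor execute") sketch as a toy memory manager:
# a page may be writable or executable, never both at once.
class Page:
    def __init__(self):
        self.data = b""
        self.writable = True
        self.executable = False

class Memory:
    def __init__(self):
        self.pages = {}

    def write(self, addr, data):
        page = self.pages.setdefault(addr, Page())
        if not page.writable:
            raise PermissionError("page is not writable")
        page.data = data

    def make_executable(self, addr):
        # The "loader"/JIT hand-off: drop write before granting execute.
        page = self.pages[addr]
        page.writable = False
        page.executable = True

    def execute(self, addr):
        page = self.pages[addr]
        if not page.executable:
            raise PermissionError("NX: page is not executable")
        return f"ran {len(page.data)} bytes"

mem = Memory()
mem.write(0x1000, b"\x90\x90\xc3")   # stage JIT output as plain data
mem.make_executable(0x1000)          # flip to executable, now read-only
print(mem.execute(0x1000))           # ran 3 bytes
```

The interesting property is the absence of any state where an errant buffer write and execution are both possible on the same page.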
If you have two pipelines you still have a von Neumann architecture. The core principle of the von Neumann architecture isn't how many buses you have and what restrictions you apply to them. The point is that you separate the processing element from the memory and connect the two over a bus which then causes you to suffer from the downsides of having a bus.
* It is not given that the "Von Neumann Bottleneck" is what generates flabby, verbose programs.
* Computers execute programs, humans reason about them. There's no reason to expect the same tools to apply. This has similarities to Linus' objections to C++ in the Linux kernel: kernel code is an implementation of high-level concepts, and there's no need to have the language structures partly support or impose these high-level concepts; instead one should understand them abstractly and then bring them to implementation.
* The operations of functional programming aren't efficient in their naive implementation. They can be made efficient with an optimizing compiler but this effectively results in the programmer thinking about more, not less, elements.
* Dijkstra takes issue with Backus' assumption that functional programs will make program proving accessible to the average programmer. I think Dijkstra wants to up the skill of the average programmer rather than expecting that functional programs will make proving easier. (This one I'm less sure of, but it seems like you end up with a situation where only "unaverage" programmers are going to be using functional languages.)
Dijkstra closes: "In short, the article is a progress report on a valid research effort but suffers badly from aggressive overselling of its significance, long before convincing results have been reached."
And I think the situation now is that FP has shown itself to be definitely useful in some domains but still suffers from publicity that implies it is a paradigm that will "eat" ordinary software engineering.
That it has huge mind-share for what it isn't might be as bad as it not being widely used for what it is useful for.
We probably will see the end of the von Neumann style, but it won't be because programmers don't like it. It will be because, in its modern incarnation, it's horribly inefficient.
This breakdown of power consumption of a 32 bit ADD by various bits of the CPU was taken from a Tesla presentation about their navigation computer, https://youtu.be/Ucp0TTmvqOE?t=5247
(sometimes I curse HN limited formatting. it doesn't do much, and yet the little it does often manages to get in the way.)
The ADD takes 0.15% of the total energy consumption. His breakdown doesn't show it, but I'm guessing the rest could be divided up into three things: getting the data to the ALU, decoding the instruction stream (which can probably be better thought of as a form of decompression), and control overhead mostly devoted to extracting parallelism from a serial set of instructions.
0.15% is woeful, particularly so when you realise the main constraint on speed now is power, or more precisely getting rid of it. We already have architectures that do better than von Neumann (e.g., GPUs, NPUs), but even those are orders of magnitude away from what the brain achieves in terms of MIPS/watt.
The main constraint here is not building it. IBM did build a computer that mimics how the brain does things. It was a neural network and so massively parallel, but just as importantly information was sent by varying the number of pulses sent in a unit of time. That's important because if no information is sent there are no pulses, so potentially bugger all power was used.
The main constraint is programming it. We can't do it, and we have no idea where to begin. This debate about FP vs stateful programming seems to be a never-ending source of entertainment in our field, but it's beside the point, as neither has much relevance to solving this problem.
If the Alonzo Church model and Turing model of computing are computationally equivalent, then does it really matter? I may not be understanding correctly.
Be careful speaking negatively about functional programming... there's an almost cult-like obsession around it that's part anti-establishment, part boredom with imperative paradigms, and part pursuit of leaving a mark/legacy.
I've used functional and imperative programming for varied tasks throughout my career. Sometimes, functional programming approaches are fantastic and intuitive for tasks, other times, they're counterintuitive and become cumbersome depending on what you're doing.
While functional programming can be academically interesting in many respects, for most development needs it can hamper development progress more than assist (humans have lots of tendencies and history that give imperative approaches an edge).
For humans, computers exist to help us perform tasks we can't do well or quickly ourselves, and to do so consistently. As long as you can do that, people in general don't care about what underlying theoretical computational abstractions you choose. Let's not lose sight of that fact. Only those in computing will really care.
Functional programming isn't new per se, there's just a trend of rediscovery and advancement going on. There's something to be said about supporting the underdog. Whether imperative paradigms will always dominate as the higher level abstraction of choice is questionable since the software field is still, relatively speaking, new. It likely isn't going to change much in any near future, however.
I don't see FP (and I mean pure FP style) as the underdog. I think the FP community tend to blame FP's repeated failure to reach mainstream on either developers' stubbornness or FP's poor implementation in many languages.
The way I see it however, is that FP has had plenty of attention and plenty of opportunities to go mainstream but it failed to do so simply because it's not as effective as imperative programming when it comes to building most real-world software.
It's not like the market doesn't know. Every developer on the planet knows what FP is and knows all the benefits. Some imperative languages like JavaScript have even adopted some aspects of FP for certain use cases (some aspects of the front end UI related to dynamic templates; e.g. VueJS, React).
But FP is ideal only for certain narrow niches. FP proponents should just leave it at that and stop trying to promote FP as if there is a fundamental law of science which says that it's superior to imperative programming because it's not.
My personal experience with FP is that it's not good for creating large modular software. You can't create effective abstractions that are accurate models of real world entities without encapsulating their state as well. State is an essential and inseparable part of the abstraction when we talk about modeling real-world entities in software.
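For what it's worth, the contrast the parent is drawing fits in a few lines (a deliberately simplified sketch, not a claim about any particular language): the functional version threads state through every call, while the object encapsulates it behind an interface.

```python
# Two ways to model the same real-world entity.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AccountState:
    """Pure-FP style: immutable state, every update returns a new copy."""
    balance: int

def deposit(state, amount):
    return replace(state, balance=state.balance + amount)

class Account:
    """Imperative style: state lives inside the abstraction."""
    def __init__(self, balance=0):
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

    @property
    def balance(self):
        return self._balance

# Same behavior, different ownership of state.
s = deposit(deposit(AccountState(0), 50), 25)
a = Account(0)
a.deposit(50)
a.deposit(25)
print(s.balance, a.balance)  # 75 75
```

Whether the explicit state-threading is a feature (easy to test, easy to replay) or a burden (state leaks into every signature) is essentially the disagreement in this thread.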
Here's the thing: Backus was not coming from the Lisp and embryonic ML communities that were doing functional programming research at the time. This was a career-retrospective speech by a newcomer to the field. I think the existing community was surprised and bemused.
Certainly your regular functional programmer would have chuckled at all the "point-free" style FP. This was a decade before Haskell, mind you, and at least a decade before lots of function reuse became popular (being blocked on better strictness analysis and a lack of type classes and instances). Remember, in then-cutting-edge Scheme, (define ...) has an implicit (begin ...) rather than a single-expression body, and don't get me started on the other imperative Lisps. APL before FP just made the existing community laugh harder.
So fast-forward a while, and even today implicit parallelism is not happening much in production Haskell. Applicative is much more popular than Arrow, and thus everyone is still writing von Neumann code even if it's dressed up functionally.
Well, let me tell you, that thesis puts a lot of the pieces together:
The real genius of point-free isn't getting rid of the names. We have de Bruijn indices for that. Nor is it allowing other models of computation. Computer scientists found good ol' 1970s category theory (in principle, Backus could have done this too!) and now we know about (symmetric) monoidal categories and their internal languages and all the stuff we need to point-free-ify the programs in lambda-calculus-influenced calculi.
The real genius is dealing with dynamism / higher-order programs. With the raw lambda calculus, you are in a fun house of higher-orderism and it's hard to jump out of the substitution / term-rewriting mindset. CPS, which occupied much of the functional research community for at least a man-decade, only further serializes and higher-order-ifies things. Monadic programs bind promiscuously; every step is potentially dynamic af, once again serializing and higher-order-ifying things.
With arrow-like things, however (and Control.Category.* and Profunctor are really much better at distilling the concepts than the legacy Arrow class, thanks Ed), suddenly we have a decently expressive form of computation with no lambdas in sight. We've lifted the "fog of war", freeing the first-order "90%" of the work from the higher-order "10%".
Arrow notation as implemented in Haskell today, however, abuses `arr` for the structural rules of the guest language (weakening, contraction, and exchange, per Wikipedia [1]). This reshrouds everything and ruins all the promise of Arrows. Yuck.
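For readers who don't speak Haskell, the "first-order computation with no lambdas in sight" idea can be loosely sketched in Python (my own toy rendering of compose/fanout-style combinators, emphatically not the actual Control.Category or Arrow API):

```python
# Wiring combinators: programs built by plugging boxes together,
# with no lambdas anywhere in the pipeline description.
def compose(f, g):
    """Sequential wiring: run f, feed its output to g."""
    def h(x):
        return g(f(x))
    return h

def fanout(f, g):
    """Feed one input to two computations, pair the results."""
    def h(x):
        return (f(x), g(x))
    return h

def add_pair(pair):
    return pair[0] + pair[1]

def double(x):
    return 2 * x

def square(x):
    return x * x

# A purely first-order dataflow graph: (double x, square x), then sum.
pipeline = compose(fanout(double, square), add_pair)
print(pipeline(3))  # 15
```

The whole computation is a static wiring diagram you could inspect, optimize, or compile to hardware; nothing higher-order happens at "run time" of the pipeline itself.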
Back to the thesis, there are 2 big contributions:
1) A practical version of Arrow which doesn't need arr for structural rules. (Control.Category.* has since done this better and with more faithfulness to the math, but one had to start somewhere.)
2) A theory of heterogeneous metaprogramming, so the first-order and higher-order bits don't even have to be in the same language. We're talking things like an FPGA that computes its own re-flashing.
There is also a prototype implementation of various lambdas so humans can still use names, but I hear it's bitrotted, so I can't vouch for it.
So, back to John Backus. The thing is, programmers are lazy. They'd rather write in the language they know, or learn the language their teachers know. Also, humans need at least the option of some names (even if we have to name too many things today), so mandatory combinators will cause problems as long as we're stuck programming in shitty, 1-dimensional text. Finally, without at least the taste of actually compiling to other programming models, along with scaling programs with the von Neumann model, the economic and curiosity incentives just aren't there. And in 1978 the von Neumann model was still fucking killing it. Wires were fast, and components, be they spinning disks, ALUs, or DRAM, were all kinda slow but in similar and predictable ways. Programming DOS was a shit-show, of course, but so were non-trivial compilers. Nothing was ready.
Now we know how to do program better, have the power to run the compilers, and the legacy machines have a cost model I like to think of as "molasses in capillaries with huge bottlenecks". We're ready.
I think it's worth it. Everything is lambda-in-a-straitjacket-like. The question is how you can mix and match straitjackets, and how convoluted the programming gets.
Categorical logic reframes the lambda calculus as being "simply" the internal language of cartesian-closed categories. Doing the same for other categories leads to other languages that are not so lambda-like. Sometimes categories are best characterized by graphical languages, as seen in string diagrams and generalizations thereof. Studying these things involves a rather thriving subfield at the intersection of PLT, category theory and foundational mathematics/logic.
He designed a 29-state cellular automaton architecture to implement a universal constructor that could reproduce itself (which he worked out on paper, amazingly):
He actually philosophized about three different kinds of universal constructors at different levels of reality:
First, the purely deterministic and relatively harmless mathematical kind referenced above: an idealized, abstract 29-state cellular automaton, which could reproduce itself with a Universal Constructor but was quite brittle, synchronous, and intolerant of errors. These have been digitally implemented in the real world on modern computing machinery, and they make great virtual pets, kind of like digital tribbles, but not as cute and fuzzy.
Second, the physical mechanical and potentially dangerous kind, which is robust and error tolerant enough to work in the real world (given enough resources), and is now a popular theme in sci-fi: the self reproducing robot swarms called "Von Neumann Probes" on the astronomical scale, or "Gray Goo" on the nanotech scale.
>The von Neumann probe, nicknamed the Goo, was a self-replicating nanomass capable of traversing through keyholes, which are wormholes in space. The probe was named after Hungarian-American scientist John von Neumann, who popularized the idea of self-replicating machines.
Third, the probabilistic quantum mechanical kind, which could mutate and model evolutionary processes, and rip holes in the space-time continuum, which he unfortunately (or fortunately, for the sake of humanity) didn't have time to fully explore before his tragic death.
p. 99 of "Theory of Self-Reproducing Automata":
>Von Neumann had been interested in the applications of probability theory throughout his career; his work on the foundations of quantum mechanics and his theory of games are examples. When he became interested in automata, it was natural for him to apply probability theory here also. The Third Lecture of Part I of the present work is devoted to this subject. His "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components" is the first work on probabilistic automata, that is, automata in which the transitions between states are probabilistic rather than deterministic. Whenever he discussed self-reproduction, he mentioned mutations, which are random changes of elements (cf. p. 86 above and Sec. 1.7.4.2 below). In Section 1.1.2.1 above and Section 1.8 below he posed the problems of modeling evolutionary processes in the framework of automata theory, of quantizing natural selection, and of explaining how highly efficient, complex, powerful automata can evolve from inefficient, simple, weak automata. A complete solution to these problems would give us a probabilistic model of self-reproduction and evolution. [9]
[9] For some related work, see J. H. Holland, "Outline for a Logical Theory of Adaptive Systems", and "Concerning Efficient Adaptive Systems".
> Although I refer to conventional languages as "von Neumann languages" to take note of their origin and style, I do not, of course, blame the great mathematician for their complexity. In fact, some might say that I bear some responsibility for that problem.
FWIW, I think the idea he presents of performing algebra (or other formal syntactic or semantic operations) to derive and transform programs deserves more attention than it seems to get.
I was doing some experiments with Joy ( https://joypy.osdn.io/ ) and some meta-programming that would have been pretty gnarly in most other notations was very straightforward:
https://joypy.osdn.io/notebooks/Quadratic.html#derive-a-defi...
https://joypy.osdn.io/notebooks/Recursion_Combinators.html#d...
https://joypy.osdn.io/notebooks/Generator_Programs.html#maki...
https://joypy.osdn.io/notebooks/Newton-Raphson.html#finding-...
Joy is similar to Backus' FP and to the point-free Haskell of Conal Elliot's "Compiling to categories" http://conal.net/papers/compiling-to-categories/ https://joypy.osdn.io/notebooks/Categorical.html
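To give a flavor of why concatenative notations make program derivation so direct, here is a tiny Joy-flavored evaluator (my own toy, not the real Joy implementation): a program is just a sequence of words, and composing programs is literally concatenating their texts.

```python
# A minimal concatenative ("Joy-like") evaluator over a value stack.
def run(program, stack=None):
    """Evaluate a whitespace-separated program; return the final stack."""
    stack = list(stack or [])
    words = {
        "dup":  lambda s: s.append(s[-1]),
        "add":  lambda s: s.append(s.pop() + s.pop()),
        "mul":  lambda s: s.append(s.pop() * s.pop()),
        "swap": lambda s: s.__setitem__(slice(-2, None), [s[-1], s[-2]]),
    }
    for word in program.split():
        if word in words:
            words[word](stack)   # built-in word mutates the stack
        else:
            stack.append(int(word))  # anything else is a numeric literal
    return stack

# "dup mul" squares the top of the stack. Composition is concatenation:
print(run("3 dup mul"))        # [9]
print(run("3 dup mul 1 add"))  # [10]
```

Because `run(p + " " + q)` behaves like running `p` and then `q`, equations between programs are equations between strings, which is what makes the algebraic derivations in those notebooks so mechanical.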
The actual mental slavery is the notion of instructions. You have operand A, operand B, and a hidden operand I, which operates on the given operands, encoded in an endless chain of drudgery.
Now imagine a system which is deterministic but purely physics-based. Data flows continuously at the maximum possible speed, by being copied from A to *, and operations are executed purely as an attribute of the data. Meaning: this river carrying sand carries certain sand, again purely as a property, to a stone in the river, where it is ground, split, etc.
Then again, the new data, having different "physical" properties, is sped along to other streams.
The art of programming such a machine would be to understand the physics of data in the system: to place a stone upriver, and to know where downriver to reach to collect the gem, liberated and polished.
Such dataphysics also has interesting inherent properties. Interaction can never reach farther than the number of processors in parallel times the maximum length of data. So dataphysics has a light cone, within which it is fully parallel. One could take data out and insert it into these deterministic simulations, simply by staying out of sight and keeping causality. But I digress.
Do you mean like Sutherland and Sproull's counterflow pipeline processor developed at Sun in the 90s?
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.4...
Shameless plug: try flowgrid.org. Unfortunately it tends to be more cumbersome than "traditional" programming, perhaps because one just gets used to traditional programming too much? I hope the tutorial is some fun though. Data flow programming seems to be successful for Node-RED though...
Sounds like a systolic array.
https://en.m.wikipedia.org/wiki/Systolic_array
Or an FPGA.
Or a spreadsheet.
Or a fragment shader.
I've posted about this before, but it's worth repeating here:
https://news.ycombinator.com/item?id=15560845
DonHopkins on Oct 26, 2017 | parent | favorite | on: Cryptography with Cellular Automata (1985) [pdf]
A "Moveable Feast Machine" is a "Robust First" asynchronous distributed fault tolerant cellular-automata-like computer architecture. It's similar to a cellular automaton, but it differs in several important ways, for the sake of "Robust First Computing". These differences give some insight into what CA really are, and what their limitations are.
Cellular Automata are synchronous and deterministic: all cells are evaluated at once (so the evaluation order doesn't matter), which makes it necessary to double-buffer the "before" and "after" cells, and a rule can only change the value of the current (center) cell. Moveable Feast Machines are like asynchronous, non-deterministic cellular automata with large windows that can modify adjacent cells.
Here's a great example with an amazing demo and explanation, and some stuff I posted about it earlier:
https://news.ycombinator.com/item?id=14236973
Robust-first Computing: Distributed City Generation:
https://www.youtube.com/watch?v=XkSXERxucPc
A Movable Feast Machine [1] is a "Robust First" asynchronous distributed fault tolerant cellular-automata-like computer architecture.
The video "Distributed City Generation" [2] demonstrates how you can program a set of Movable Feast Machine rules that build a self-healing city that fills all available space with urban sprawl, and even repairs itself after disasters!
The paper "Local Routing in a new Indefinitely Scalable Architecture" [3] by Trent Small explains how those rules work, how the city streets adaptively learn to route the cars to nearby buildings they desire to find, and illustrates the advantages of "Robust First" computing:
Abstract: Local routing is a problem which most of us face on a daily basis as we move around the cities we live in. This study proposes several routing methods based on road signs in a procedurally generated city which does not assume knowledge of global city structure and shows its overall efficiency in a variety of dense city environments. We show that techniques such as Intersection-Canalization allow for this method to be feasible for routing information arbitrarily on an architecture with limited resources.
This talk "Robust-first computing: Demon Horde Sort" [4] by Dave Ackley describes an inherently robust sorting machine, like "sorting gas", implemented with the open source Movable Feast Machine simulator, available on github [5].
A Movable Feast Machine is similar in many ways to traditional cellular automata, except for a few important differences that are necessary for infinitely scalable, robust first computing.
First, the rules are applied to cells in random order, instead of all at once (which is what requires double buffering). Many rule application events may execute in parallel, as long as their "light cones" (the cells visible to the executing rules) do not overlap.
Second, the "light cone" of a rule, aka the "neighborhood" in cellular automata terms, is larger than in typical cellular automata, so the rule can see other cells several steps away.
Third, the rules have write access to all of the cells in the light cone, not just the one in the center like cellular automata rules. So they can swap cells around to enable mobile machines, which is quite difficult in cellular automata rules like John von Neumann's classic 29 state CA. [6] [7]
Fourth, diffusion is built in. A rule may move a particle to another empty cell, or swap it with another particle in a different cell. And most rules automatically move the particle into a randomly chosen adjacent cell by default. So the particles behave like gas moving with Brownian motion, unless biased by "smart" rules like Maxwell's Demon, such as the "sorting gas" described in the Demon Horde Sort video.
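To get a feel for those four differences, here's a toy Python sketch (not the real MFM, which is a C++ simulator and far richer): events fire at random sites with no global clock, each rule sees a larger window, writes to neighboring cells, and defaults to diffusion.

```python
import random

SIZE = 16
EMPTY, PARTICLE = 0, 1
grid = [[EMPTY] * SIZE for _ in range(SIZE)]
grid[8][8] = PARTICLE

def neighbors(x, y, radius=2):
    """Cells within the event window (the rule's 'light cone')."""
    cells = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if (dx, dy) != (0, 0):
                cells.append(((x + dx) % SIZE, (y + dy) % SIZE))
    return cells

def event(x, y):
    """One asynchronous event: the rule may write any cell in its
    window; by default the particle diffuses into a random empty
    neighbor, like Brownian motion."""
    if grid[x][y] != PARTICLE:
        return
    empties = [(nx, ny) for nx, ny in neighbors(x, y)
               if grid[nx][ny] == EMPTY]
    if empties:
        nx, ny = random.choice(empties)
        # Swap rather than copy: matter is conserved.
        grid[x][y], grid[nx][ny] = grid[nx][ny], grid[x][y]

# Events fire at random sites in random order -- no global clock,
# and no double buffering of "before" and "after" grids.
for _ in range(1000):
    event(random.randrange(SIZE), random.randrange(SIZE))

assert sum(row.count(PARTICLE) for row in grid) == 1
```

The swap at the end is the point: because the rule can write neighbors, a particle can actually move, which a classic single-cell-write CA rule can't express directly.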
In this video "Robust-first computing: Announcing ULAM at ECAL 2015" [8], David Ackley explains why "Robust First" computing and computing architectures like Movable Feast Machines are so incredibly important for scaling up incredibly parallel hardware.
I think this is incredibly important stuff in the long term, because we've hit the wall with determinism, and the demos are so mind blowing and visually breathtaking, that I want to try programming some of my own Movable Feast Machine systems!
[1] http://movablefeastmachine.org/
[2] https://www.youtube.com/watch?v=XkSXERxucPc
[3] http://www.cs.unm.edu/~ackley/papers/paper_tsmall1_11_24.pdf
[4] https://www.youtube.com/watch?v=helScS3coAE
[5] https://github.com/DaveAckley/MFM
[6] https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton
[7] https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...
[8] https://www.youtube.com/watch?v=aR7o8GPgSLk
https://news.ycombinator.com/item?id=18497241
DonHopkins on Nov 20, 2018 [-]
Absolutely, their important work foreshadowed and inspired so much great stuff. Also Douglas Engelbart's NLS pioneered many of the ideas of visual programming.
I think spreadsheets also qualify as visual programming languages, because they're two-dimensional and grid based in a way that one-dimensional textual programming languages aren't.
The grid enables them to use relative and absolute 2D addressing, so you can copy and paste formulae between cells, so they're reusable and relocatable. And you can enter addresses and operands by pointing and clicking and dragging, instead of (or as well as) typing text.
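As a toy illustration of why relative addressing makes formulae relocatable, here's a sketch of the copy-and-paste translation (single-letter columns only; the helper name is made up, not any spreadsheet's API):

```python
import re

def translate(formula, d_row, d_col):
    """Shift relative references (e.g. B2) in a copied formula;
    absolute ones (e.g. $B$2) stay fixed, as in a spreadsheet."""
    def shift(m):
        col_abs, col, row_abs, row = m.groups()
        if not col_abs:
            col = chr(ord(col) + d_col)   # relative column moves
        if not row_abs:
            row = str(int(row) + d_row)   # relative row moves
        return col_abs + col + row_abs + row
    return re.sub(r"(\$?)([A-Z])(\$?)(\d+)", shift, formula)

# Copying '=A1+$B$1' down one row adjusts A1 but not $B$1:
assert translate("=A1+$B$1", 1, 0) == "=A2+$B$1"
```

The relative/absolute split is what makes a single formula reusable across a whole column without editing it.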
Some people mistakenly assume visual programming languages necessarily don't use text, or that they must use icons and wires, so they don't consider spreadsheets to be visual programming languages.
Spreadsheets are a wildly successful (and extremely popular) example of a visual programming language that doesn't forsake all the advantages of text based languages, but builds on top of them instead.
And their widespread use and success disproves the theory that visual programming languages are esoteric or obscure or not as powerful as text based languages.
Other more esoteric, graphical, grid-based visual programming languages include cellular automata (which von Neumann explored), and more recently "robust first computing" architectures like the Moveable Feast Machine.
https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton
https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...
Robust-first Computing: Distributed City Generation (Moveable Feast)
https://www.youtube.com/watch?v=XkSXERxucPc
>A rough video demo of Trent R. Small's procedural city generation dynamics in the Movable Feast Machine simulator. See http://nm8.us/q for more information. Apologies for the poor audio!
Programming the Movable Feast Machine with λ-Codons
https://www.youtube.com/watch?v=DauJ51CTIq8
>λ-Codons provide a mechanism for describing arbitrary computations in the Movable Feast Machine (MFM). A collection of λ-Codon molecules describe the computation by a series of primitive functions. Evaluator particles carry along a stack of memory (which initially contains the input to the program) and visit the λ-Codons, which they interpret as functions and apply to their stacks. When the program completes, they transmute into output particles and carry the answer to the output terminals (left).
https://news.ycombinator.com/item?id=21131468
DonHopkins 81 days ago [-]
Very beautiful and artistically rendered! Those would make great fireworks and weapons in Minecraft! From a different engineering perspective, Dave Ackley had some interesting things to say about the difficulties of going from 2D to 3D, which I quoted in an earlier discussion about visual programming:
https://news.ycombinator.com/item?id=18497585
David Ackley, who developed the two-dimensional CA-like "Moveable Feast Machine" architecture for "Robust First Computing", touched on moving from 2D to 3D in his retirement talk:
https://youtu.be/YtzKgTxtVH8?t=3780
"Well 3D is the number one question. And my answer is, depending on what mood I'm in, we need to crawl before we fly."
"Or I say, I need to actually preserve one dimension to build the thing and fix it. Imagine if you had a three-dimensional computer, how you can actually fix something in the middle of it? It's going to be a bit of a challenge."
"So fundamentally, I'm just keeping the third dimension in my back pocket, to do other engineering. I think it would be relatively easy to imagine taking a 2D model like this, and having a finite number of layers of it, sort of a 2.1D model, where there would be a little local communication up and down, and then it was indefinitely scalable in two dimensions."
"And I think that might in fact be quite powerful. Beyond that you think about things like what about wrap-around torus connectivity rooowaaah, non-euclidian dwooraaah, aaah uuh, they say you can do that if you want, but you have to respect indefinite scalability. Our world is 3D, and you can make little tricks to make toruses embedded in a thing, but it has other consequences."
Here's more stuff about the Moveable Feast Machine:
https://news.ycombinator.com/item?id=15560845
https://news.ycombinator.com/item?id=14236973
The most amazing mind blowing demo is Robust-first Computing: Distributed City Generation:
https://www.youtube.com/watch?v=XkSXERxucPc
And a paper about how that works:
https://www.cs.unm.edu/~ackley/papers/paper_tsmall1_11_24.pd....
Plus there's a lot more here:
https://movablefeastmachine.org/
Now he's working on a hardware implementation of indefinitely scalable robust first computing:
https://www.youtube.com/channel/UC1M91QuLZfCzHjBMEKvIc-A
Sorry, but this goes in the bucket of all-too-common backseat programming comments I find on HN. Sounds pretty, has a lot of "ideas", but is absolute hogwash.
If you've programmed an FPGA, or done any sort of analog circuitry, you'll know that what you propose is FAR WORSE than the current status quo.
CPUs already work like that internally to some extent. Think of the program as a serialized description of a computation graph. Instructions are not mental slavery. They are the most efficient way to encode and decode the graph. A superscalar processor tracks dependencies and will reorder or execute multiple operations at the same time if it has determined that it can do so.
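A toy sketch of that claim: treat the serial program as one linearization of a dataflow graph, and recover the parallelism by scheduling instructions as their sources become ready. (This is a crude stand-in for what a superscalar scheduler does in hardware, not a model of any real core.)

```python
# Each instruction: (dest, op, sources). The serial order below is
# just one linearization of the underlying dataflow graph.
program = [
    ("a", "load", []),
    ("b", "load", []),
    ("c", "add", ["a", "b"]),
    ("d", "load", []),
    ("e", "add", ["c", "d"]),
]

def schedule(program):
    """Group instructions into waves that could issue in parallel:
    an instruction is ready once all its sources are computed."""
    done, waves = set(), []
    remaining = list(program)
    while remaining:
        wave = [i for i in remaining if all(s in done for s in i[2])]
        waves.append([i[0] for i in wave])
        done.update(i[0] for i in wave)
        remaining = [i for i in remaining if i[0] not in done]
    return waves

# 'a', 'b' and 'd' have no dependencies, so they can issue together,
# even though the serial encoding spreads them out:
assert schedule(program) == [["a", "b", "d"], ["c"], ["e"]]
```

The dependency graph was there all along; the serial instruction stream is just its encoding.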
2016: https://news.ycombinator.com/item?id=13210988
https://news.ycombinator.com/item?id=12159792
2015: https://news.ycombinator.com/item?id=10182712
2014: https://news.ycombinator.com/item?id=7671379
2009: https://news.ycombinator.com/item?id=768057
(Links for the curious.)
I wonder if anyone noticed how people first came up with the architecture of a single bus for data and code, and how they then had endless problems from writing data where code should be and executing data as code. And they plug the problems by forbidding writing data and code in the same place and executing data as code (afaik).
So yeah, should maybe computers be liberated from the single pipeline?
I have a Gigatron (https://gigatron.io/) which has this kind of architecture. Only the ROM can contain machine instructions, so the only way to program it is to use the ROM interpreter to run bytecode stored in RAM, or to burn a new ROM. Frankly this aspect of it sucks (the Gigatron in itself however is a wonderful learning project and recommended).
There's really not a problem with having a "single bus" for code and data on modern computers, where you can mark data with the NX bit to ensure it doesn't get run, while still being able to load programs as data and then mark them executable.
> So yeah, should maybe computers be liberated from the single pipeline?
Internally, they basically are. Instruction and data caches are separate and only connect at the L2 level. Aliasing between the two pipelines is only allowed if software requests it, with the specific architecture determining the amount of automatic vs. manual synchronization required. What would you prefer, physically disjoint DRAM?
I think I've read about programs misapplying the execution bits accidentally, such that malicious data can slip by. Buffer overflows still don't seem to be shrugged away despite the bits.
Maybe finish the move and do exactly what was proposed before the single pipeline: don't ever mix data and code in the same memory. No need for physically separating the RAM. Just have designated areas, maybe even allocate code areas on the fly, only not by the program itself.
Even JIT can be accommodated this way, I think: since you know that code comes from the program and not errant buffers, and if the program is marked as JIT-type, let it write to a code area allocated just for this program. But probably not copy from the data memory.
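A software sketch of that discipline (a made-up toy VM, not any real architecture): data memory is program-writable, instruction memory is writable only through a privileged loader, so the program can never turn its own data into code.

```python
class HarvardVM:
    """Toy machine with disjoint code and data spaces. Code can only
    be installed through a privileged loader, never by the running
    program itself -- a software sketch of W^X taken to its limit."""
    def __init__(self):
        self.code = []    # instruction memory (loader-writable only)
        self.data = {}    # data memory (program-writable)

    def load_program(self, instructions):
        self.code = list(instructions)   # the one sanctioned path

    def run(self):
        for op, *args in self.code:
            if op == "store":
                key, value = args
                self.data[key] = value
            elif op == "store_code":
                # A program writing instruction memory is exactly the
                # mixing this design forbids.
                raise PermissionError("data pipeline cannot write code")

vm = HarvardVM()
vm.load_program([("store", "x", 42)])
vm.run()
assert vm.data["x"] == 42

vm.load_program([("store_code", None)])
try:
    vm.run()
except PermissionError:
    pass
else:
    raise AssertionError("write to code memory should be rejected")
```

A JIT-typed program, as suggested above, would simply be granted a second sanctioned path into a code area of its own.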
If you have two pipelines you still have a von Neumann architecture. The core principle of the von Neumann architecture isn't how many buses you have and what restrictions you apply to them. The point is that you separate the processing element from the memory and connect the two over a bus which then causes you to suffer from the downsides of having a bus.
A rebuttal by Dijkstra to Backus' paper is here (actually, a deconstruction): http://www.cs.utexas.edu/users/EWD/ewd06xx/EWD692.PDF
A rough summary of his points:
* It is not given that the "Von Neumann Bottleneck" is what generates flabby, verbose programs.
* Computers execute programs, humans reason about them. There's no reason to expect the same tools to apply. This has similarities to Linus' objections to C++ in the Linux kernel: kernel code is an implementation of high level concepts, and there's no need to have the language structures partly support or impose those concepts; instead one should understand them abstractly and then bring them to implementation.
* The operations of functional programming aren't efficient in their naive implementation. They can be made efficient with an optimizing compiler, but this effectively results in the programmer thinking about more, not fewer, elements.
* Dijkstra takes issue with Backus' assumption that functional programs will make program proving accessible to the average programmer. I think Dijkstra wants to raise the skill of the average programmer rather than expecting functional programs to make proving easier. (This one I'm less sure of, but it seems like you end up with a situation where only "unaverage" programmers are going to be using functional languages.)
Dijkstra closes: "In short, the article is a progress report on a valid research effort but suffers badly from aggressive overselling of its significance, long before convincing results have been reached."
And I think the situation now is that FP has shown itself to be definitely useful in some domains but still suffers from publicity that implies it is a paradigm that will "eat" ordinary software engineering.
That it has huge mind-share for what it isn't might be as bad as it not being widely used for what it is useful for.
"John told me he considered FP a failure because, to paraphrase him, FP made it simple to do hard things but almost impossible to do simple things." -- Grady Booch, https://twitter.com/Grady_Booch/status/1153348388827480065
We probably will see the end of the von Neumann style, but it won't be because programmers don't like it. It will be because in its modern incarnation it's horribly inefficient.
This breakdown of power consumption of a 32 bit ADD by various bits of the CPU was taken from a Tesla presentation about their navigation computer, https://youtu.be/Ucp0TTmvqOE?t=5247
icache: 25.0 pJ, register file: 6.0 pJ, control: 39 pJ, ALU add: 0.1 pJ
(sometimes I curse HN limited formatting. it doesn't do much, and yet the little it does often manages to get in the way.)
The ADD takes 0.15% of the total energy consumption. His breakdown doesn't show it, but I'm guessing the rest could be divided into three things: getting the data to the ALU, decoding the instruction stream (which can probably be better thought of as a form of decompression), and control overhead mostly devoted to extracting parallelism from a serial set of instructions.
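The arithmetic behind that figure, from the four numbers above (it works out to roughly 0.14%, which the 0.15% quoted here rounds up):

```python
# Energy per 32-bit ADD event, from the Tesla presentation figures:
picojoules = {"icache": 25.0, "register file": 6.0,
              "control": 39.0, "ALU add": 0.1}

total = sum(picojoules.values())              # 70.1 pJ per ADD
useful = picojoules["ALU add"] / total * 100  # ~0.14% does the add
overhead = 100 - useful                       # >99.8% is plumbing

assert abs(total - 70.1) < 1e-9
assert round(useful, 2) == 0.14
```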
0.15% is woeful, particularly when you realise the main constraint on speed now is power, or more precisely getting rid of it. We already have architectures that do better than von Neumann (e.g. GPUs, NPUs), but even those are orders of magnitude away from what the brain achieves in terms of MIPS/watt.
The main constraint here is not building it. IBM did build a computer that mimics how the brain does things. It was a neural network and so massively parallel, but just as importantly information was sent by varying the number of pulses sent in a unit of time. That's important because if no information is sent there are no pulses, so potentially bugger all power was used.
The main constraint is programming it. We can't do it, and we have no idea where to begin. This debate about FP vs stateful programming seems to be a never ending source of entertainment in our field, but it's beside the point, as neither has much relevance to solving this problem.
If the Alonzo Church model and Turing model of computing are computationally equivalent, then does it really matter? I may not be understanding correctly.
Programming will never be 'liberated' from the von Neumann style because the most successful programs will always be written in that style.
That's like asking if dentistry can be liberated from the toothbrush because we have dental floss.
Be careful speaking negatively about functional programming... there's an almost cult-like obsession around it, driven partly by anti-establishment sentiment, partly by boredom with imperative paradigms, and partly by the pursuit of leaving a mark/legacy.
I've used functional and imperative programming for varied tasks throughout my career. Sometimes, functional programming approaches are fantastic and intuitive for tasks, other times, they're counterintuitive and become cumbersome depending on what you're doing.
While academically, functional programming can be interesting in many respects, for most development needs it can hamper development progress more than assist (humans have lots of tendencies and history that gives imperative approaches an edge).
For humans, computers exist to help us perform tasks we can't do well or quickly ourselves, and to do them consistently. As long as you can do that, people in general don't care what underlying theoretical computational abstractions you choose; let's not lose sight of that fact. Only those in computing will really care.
Functional programming isn't new per se, there's just a trend of rediscovery and advancement going on. There's something to be said about supporting the underdog. Whether imperative paradigms will always dominate as the higher level abstraction of choice is questionable since the software field is still, relatively speaking, new. It likely isn't going to change much in any near future, however.
I don't see FP (and I mean pure FP style) as the underdog. I think the FP community tend to blame FP's repeated failure to reach mainstream on either developers' stubbornness or FP's poor implementation in many languages.
The way I see it however, is that FP has had plenty of attention and plenty of opportunities to go mainstream but it failed to do so simply because it's not as effective as imperative programming when it comes to building most real-world software.
It's not like the market doesn't know. Every developer on the planet knows what FP is and knows all the benefits. Some imperative languages like JavaScript have even adopted some aspects of FP for certain use cases (some aspects of the front end UI related to dynamic templates; e.g. VueJS, React).
But FP is ideal only for certain narrow niches. FP proponents should just leave it at that and stop trying to promote FP as if there is a fundamental law of science which says that it's superior to imperative programming because it's not.
My personal experience with FP is that it's not good for creating large modular software. You can't create effective abstractions that are accurate models of real world entities without encapsulating their state as well. State is an essential and inseparable part of the abstraction when we talk about modeling real-world entities in software.
HNers, let me tell you: what Backus finally had in mind has come to fruition; I only figured this out recently: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-...
Here's the thing: Backus was not coming from the Lisp and embryonic ML communities that were doing functional programming research at the time. This was a career-retrospective speech by a newcomer to the field. I think the existing community was surprised and bemused.
Certainly your regular functional programmer would have chuckled at all the "point-free" style FP. This was a decade before Haskell, mind you, and at least a decade before heavy function reuse became popular (being blocked on better strictness analysis and the lack of type classes and instances). Remember that in then-cutting-edge Scheme, (define ...) has an implicit (begin ...) rather than a single-expression body, and don't get me started on the other imperative Lisps. APL before FP just made the existing community laugh harder.
So fast forward a while, and even today implicit parallelism is not happening much in production Haskell. Applicative is much more popular than Arrow, and thus everyone is still writing von Neumann code even if it's dressed up functionally.
Well, let me tell, that thesis puts a lot of the pieces together:
The real genius of point-free isn't getting rid of the names; we have de Bruijn indices for that. Nor is it allowing other models of computation. Computer scientists found good ole' 1970s category theory (in principle, Backus could have done this too!), and now we know about (symmetric) monoidal categories and their internal languages and all the stuff we need to point-free-ify the programs in lambda-calculus-influenced calculi.
The real genius is dealing with dynamism / higher-order programs. With the raw lambda calculus, you are in a fun house of higher-orderism and it's hard to jump out of the substitution / term-rewriting mindset. CPS, which occupied much of the functional research community for at least a man-decade, only further serializes and higher-order-ifies things. Monadic programs bind promiscuously; every step is potentially dynamic af, once again serializing and higher-order-ifying things.
With arrow-like things, however (and Control.Category.* and Profunctor are really much better at distilling the concepts than the legacy Arrow class, thanks Ed), suddenly we have a decently expressive form of computation with no lambdas in sight. We've lifted the "fog of war", freeing the first-order "90%" of the work from the higher-order "10%".
Arrow notation as implemented in Haskell today, however, abuses `arr` for the structural rules of the guest language (weakening, contraction, and exchange, per Wikipedia [1]). This reshrouds everything and ruins all the promise of Arrows. Yuck.
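To make the Arrow idea concrete without Haskell, here's a minimal Python sketch of a few of the combinators (the class and operator choices are mine, not any library's): the lambdas live inside the combinators, so the wiring at the use site stays point-free.

```python
class Arrow:
    """Minimal Arrow-style combinators: first-order dataflow wiring
    where the use sites compose boxes instead of binding names."""
    def __init__(self, f):
        self.f = f

    def __rshift__(self, other):     # self >> other : composition
        return Arrow(lambda x: other.f(self.f(x)))

    def first(self):                 # act on the 1st of a pair
        return Arrow(lambda p: (self.f(p[0]), p[1]))

    def fanout(self, other):         # (&&&): feed input to both
        return Arrow(lambda x: (self.f(x), other.f(x)))

inc = Arrow(lambda x: x + 1)
dbl = Arrow(lambda x: x * 2)

pipeline = inc >> dbl                # point-free: no bound names
assert pipeline.f(3) == 8
assert inc.first().f((3, "tag")) == (4, "tag")
assert inc.fanout(dbl).f(3) == (4, 6)
```

Note how `pipeline` is a static, first-order wiring diagram: you can inspect or compile it without running it, which is exactly what promiscuous monadic binding obscures.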
Back to the thesis, there are 2 big contributions:
1) A practical version of Arrow which doesn't need arr for structural rules. (Control.Category.* has since done this better and with more faithfulness to the math, but it had to start somewhere.)
2) A theory of heterogeneous metaprogramming, so the first-order and higher-order bits don't even have to be in the same language. We're talking things like an FPGA that computes its own re-flashing.
There is also a prototype implementation of various lambdas so humans can still use names, but I hear it's bitrotted, so I can't vouch for it.
So, back to John Backus. The thing is, programmers are lazy. They'd rather write in the language they know, or learn the language their teachers know. Also, humans need at least the option of some names (even if we have to name too many things today), so mandatory combinators will cause problems as long as we're stuck programming in shitty, 1-dimensional text. Finally, without at least a taste of actually compiling to other programming models, along with scaling programs with the von Neumann model, the economic and curiosity incentives just aren't there. And in 1978 the von Neumann model was still fucking killing it. Wires were fast, and components, be they spinning disks, ALUs, or DRAM, were all kinda slow but in similar and predictable ways. Programming DOS was a shit-show, of course, but so were non-trivial compilers. Nothing was ready.
Now we know how to program better, we have the power to run the compilers, and the legacy machines have a cost model I like to think of as "molasses in capillaries with huge bottlenecks". We're ready.
[1]: https://en.wikipedia.org/wiki/Structural_rule
Are the arrows presented in the paper something you think I should really familiarize myself with and use?
I did a deep dive into lambda calculus over the past year and I noticed similar limitations you mention. So this seems interesting.
I have seen arrows before but they just seem like a re-hash of the other lambda-like calculi. But maybe this paper's version of arrows is different?
I think it's worth it. Everything is lambda-in-a-straight-jacket-like. The question is how can you mix and match straight jackets, and how convoluted is programming.
Also look at https://github.com/conal/concat
Categorical logic reframes the lambda calculus as being "simply" the internal language of cartesian-closed categories. Doing the same for other categories leads to other languages that are not so lambda-like. Sometimes categories are best characterized by graphical languages, as seen in string diagrams and generalizations thereof. Studying these things involves a rather thriving subfield at the intersection of PLT, category theory and foundational mathematics/logic.
I love this paper. I believe it will be something like APL meets Prolog with a little side of Lisp.
John von Neumann's 29 state cellular automata machine is (ironically) a classic, decidedly "non-von Neumann" architecture.
https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton
He wrote the book on "Theory of Self-Reproducing Automata":
https://archive.org/details/theoryofselfrepr00vonn_0
He designed a 29 state cellular automata architecture to implement a universal constructor that could reproduce itself (which he worked out on paper, amazingly):
https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...
He actually philosophized about three different kinds of universal constructors at different levels of reality:
First, the purely deterministic and relatively harmless mathematical kind referenced above, an idealized abstract 29 state cellular automata, which could reproduce itself with a Universal Constructor, but was quite brittle, synchronous, and intolerant of errors. These have been digitally implemented in the real world on modern computing machinery, and they make great virtual pets, kind of like digital tribbles, but not as cute and fuzzy.
https://github.com/SimHacker/CAM6/blob/master/javascript/CAM...
Second, the physical mechanical and potentially dangerous kind, which is robust and error tolerant enough to work in the real world (given enough resources), and is now a popular theme in sci-fi: the self reproducing robot swarms called "Von Neumann Probes" on the astronomical scale, or "Gray Goo" on the nanotech scale.
https://en.wikipedia.org/wiki/Self-replicating_spacecraft#Vo...
https://grey-goo.fandom.com/wiki/Von_Neumann_probe
>The von Neumann probe, nicknamed the Goo, was a self-replicating nanomass capable of traversing through keyholes, which are wormholes in space. The probe was named after Hungarian-American scientist John von Neumann, who popularized the idea of self-replicating machines.
Third, the probabilistic quantum mechanical kind, which could mutate and model evolutionary processes, and rip holes in the space-time continuum, which he unfortunately (or fortunately, for the sake of humanity) didn't have time to fully explore before his tragic death.
p. 99 of "Theory of Self-Reproducing Automata":
>Von Neumann had been interested in the applications of probability theory throughout his career; his work on the foundations of quantum mechanics and his theory of games are examples. When he became interested in automata, it was natural for him to apply probability theory here also. The Third Lecture of Part I of the present work is devoted to this subject. His "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components" is the first work on probabilistic automata, that is, automata in which the transitions between states are probabilistic rather than deterministic. Whenever he discussed self-reproduction, he mentioned mutations, which are random changes of elements (cf. p. 86 above and Sec. 1.7.4.2 below). In Section 1.1.2.1 above and Section 1.8 below he posed the problems of modeling evolutionary processes in the framework of automata theory, of quantizing natural selection, and of explaining how highly efficient, complex, powerful automata can evolve from inefficient, simple, weak automata. A complete solution to these problems would give us a probabilistic model of self-reproduction and evolution. [9]
[9] For some related work, see J. H. Holland, "Outline for a Logical Theory of Adaptive Systems", and "Concerning Efficient Adaptive Systems".
https://www.deepdyve.com/lp/association-for-computing-machin...
https://deepblue.lib.umich.edu/bitstream/handle/2027.42/5578...
https://www.worldscientific.com/worldscibooks/10.1142/10841
> Although I refer to conventional languages as "von Neumann languages" to take note of their origin and style, I do not, of course, blame the great mathematician for their complexity. In fact, some might say that I bear some responsibility for that problem.
From the paper. Whew.