The description of the BCE reminds me a lot of the PIO in the RP2040 and 2350 microcontrollers. From the article, BCE instructions include "Transmit Data, Receive Data, Load Timeout Register, Store Status, and Wait."
To me, these correspond more or less 1:1 with PIO instructions OUT, IN, SET, INT, and WAIT. These plus PUSH, PULL (which can be considered auxiliaries of IN and OUT), MOV, and JMP are all the PIO instructions. Like the BCE, it runs with completely deterministic clocking, one instruction per clock, and like the BCE there are a bunch of them (a total of 12 state machines on the 2350), though they now run totally in parallel rather than being time-multiplexed.
As a hobby project, I've lately been implementing USB (aiming for higher performance than Pico-PIO-USB, which proves that it's possible), and that's been quite fun.
I wonder to what extent they were explicitly inspired, and to what extent you just get convergent solutions when there are similar goals and constraints.
Author here if you have questions...
Did Shuttle fly with these bodge-wired boards or were these used for development on the ground?
I don't know if these boards were flown. They were coated with conformal coating (which I hate for reverse-engineering), which is usually omitted from prototypes. I believe that bodge wires are okay for flight if they are done properly.
That's amazing. My bodge-wired work won't last 10 miles in the trunk of my car.
As always - great article.
FYI - The link for Peter Kogge is broken and should probably link to https://en.wikipedia.org/wiki/Peter_Kogge
Amazing article, thanks! However, I think when you said "...heat was transmitted by convection through the metal plate inside each page...", you probably meant conduction, right?
Convective heat transfer in metal would be a worrying event on the Space Shuttle!
Oops, I've fixed that now.
Glass capacitors!!!! I didn't even know these existed! And made by Corning!
Corning makes some shit for the military that most civilians never even heard of.
Interestingly the earliest capacitors were glass jars (called Leyden Jars) [0]. I was taught that early inventors thought that charge was accumulated within the jar, and it wasn't until much later that it was realised that the shape was irrelevant, only the area and the distance between conductors.
[0] https://en.wikipedia.org/wiki/Leyden_jar
oh that is absolutely fascinating to see in detail
I wonder if the very low density (relative to today) makes them more robust against gamma-rays and other radiation problems once outside the atmosphere?
if I remember correctly, and it's been decades of course
four of the computers ran in parallel with the exact same instructions in case one failed or came up with a wrong answer
and the fifth computer was the "decider"
is that understanding correct?
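ah I see now you mention:
> Eight networks were assigned to flight-critical systems, with each CRT display and engine controller connected to four networks for redundancy.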
> very low density (relatively speaking to today) make them more robust against gamma-rays and other radiation problems once outside the atmosphere?
Yes. Large transistors (and other IC components) are less affected by the radiation problems that exist outside the relative security of the atmosphere. Most radiation-hardened IC circuitry is many process sizes larger than whatever the state-of-the-art process size happens to be at any given time.
But note I said "less impacted". Given sufficient radiation, things will have issues, which is why items like the Shuttle carried the redundant computers, to cover for the possible lucky-strike impacts.
Yes, the low density and TTL chips (instead of MOS) helped against radiation. When the Shuttle computers moved to semiconductor RAM, they needed extensive error correction, as well as a process that constantly fixed bit errors, as the memory would get multiple errors per flight due to cosmic rays.
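To make the "constantly fixed bit errors" idea concrete, here is a minimal Python sketch of memory scrubbing with a single-error-correcting code. It uses a textbook Hamming(7,4) code purely for illustration; the thread doesn't say which code the Shuttle memory used, and the function names (`encode`, `correct`, `decode`, `scrub`) are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword (list of bits)."""
    code = [0] * 8  # index 0 unused so indices match bit positions 1..7
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4,5,6,7
    return code[1:]


def correct(word):
    """Return (corrected word, True if a single-bit error was fixed)."""
    code = [0] + list(word)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set-bit positions is 0 for a valid word
    if syndrome:
        code[syndrome] ^= 1  # the syndrome is the position of the flipped bit
    return code[1:], syndrome != 0


def decode(word):
    """Extract the 4 data bits from a (corrected) codeword."""
    code = [0] + list(word)
    bits = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(bits))


def scrub(memory):
    """Walk all of memory, rewrite corrected words, return how many were fixed."""
    fixed = 0
    for addr, word in enumerate(memory):
        memory[addr], was_bad = correct(word)
        fixed += was_bad
    return fixed


memory = [encode(n) for n in range(16)]
memory[5][2] ^= 1           # simulate a cosmic-ray bit flip
print(scrub(memory))        # number of words repaired in this pass
print([decode(w) for w in memory] == list(range(16)))
```

The point of scrubbing is to repair single-bit errors before a second hit lands in the same word, since a code like this can only correct one flipped bit per word.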
As far as redundancy, it's complicated. During critical flight phases, four computers would run the main software (PASS, Primary Avionics Software System), while the fifth computer was ready with the Backup Flight Software (BFS). The backup software was written by a completely different team to ensure that a software bug couldn't crash all the computers at once. In orbit, they used fewer redundant computers to free up computers for payload operations and stuff.
The four computers constantly checked the results from each other and would vote out a faulty system. Voting ensured that a bad computer couldn't vote out the good ones (Byzantine failure). Moreover, the actuators hydraulically voted on the results from the computers: if one computer tried to push a valve in a different direction, the three good computers would physically overpower the bad computer's action at the level of the hydraulic pistons.
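The hydraulic voting can be modeled very roughly as a force sum. This is a toy Python sketch, not the real actuator design: it assumes each channel pushes with equal force in one of two directions, and the names `net_direction` and `outvoted_channels` are hypothetical.

```python
def net_direction(commands):
    """Net actuator direction from per-channel commands (+1 extend, -1 retract).

    Returns +1, -1, or 0 when the forces cancel.
    """
    net = sum(commands)
    return (net > 0) - (net < 0)


def outvoted_channels(commands):
    """Indices of channels pushing against the net direction."""
    direction = net_direction(commands)
    return [i for i, c in enumerate(commands) if direction and c != direction]


commands = [+1, +1, -1, +1]          # channel 2 is the bad computer
print(net_direction(commands))       # the three good channels win
print(outvoted_channels(commands))   # the channel that was overpowered
```

In this simplified model a single bad channel can never move the actuator the wrong way, while a 2-2 split cancels to zero.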
what a treasure-trove of historical technical info you have
do you know anything about the military's secret space-shuttle still in operation?
I'm sure it's either been very modernized or runs on completely different design since it's supposedly remote-control
Sorry, I don't know anything secret :-)
How could it be possible to conceal the movement of such a huge thing?
Space is big, dark, and the X-37 has significant delta-v capabilities so its position is not a consistent stable orbit like a normal satellite.
It's pretty tiny in the grand scheme of things. Wikipedia says it's 30ft long, 14ft wide and 9ft tall (https://en.wikipedia.org/wiki/Boeing_X-37#Specifications_(X-...). The space shuttle was 122ft long, so around 4x the length. When you don't need to put space station modules inside the payload bay, or humans in the cockpit, you can make it pretty small and hard to find against the vastness of space.
Hi kens,
Thanks so much for the information. I am familiar with the voting logic (I've worked on systems that implemented the same thing: an odd number of processor cores, and the majority wins).
One question, were any "misbehaving" processor or actuation requests ever logged? As in, were there examples where one actuator or CPU didn't agree in the Shuttle flights?
There have been a fair number of GPC failures [1], and computers have been voted out. I haven't looked closely enough to see how many were "disagreements" versus hard failures or self-check failures.
[1] Search for "GPC" in the Mission Summary report: https://newspaceeconomy.ca/wp-content/uploads/2023/05/space-...
How did the voting work? The first thing I thought was, what happens in a tie (2-2) vote?
It's unlikely that you'd get a simultaneous tie; you'd expect one computer to go bad before the other. But I think in that case, the astronauts switch to the Backup Flight System, the fifth computer.
Mission STS-9 had two computer failures, causing landing to be delayed by 7 3/4 hours. They carried a sixth computer as a backup for following missions.
As far as how the voting works, each computer has a signal indicating what it thinks the status is of each computer, including itself. (Computers can detect many failures from self-checking, such as parity errors.) Each IOP uses these votes to determine the "redundant set", calculating the votes in hardware. The status is also displayed to the astronauts in a 5×5 grid. Astronauts can power down a computer or reboot it.
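As a rough illustration of the status votes described above, here is a Python sketch that computes a "redundant set" from a 5×5 grid of votes. The simple-majority rule is an assumption for the example; the actual IOP hardware logic isn't spelled out in this thread, and `redundant_set` is a hypothetical name.

```python
def redundant_set(votes):
    """votes[i][j] is True if computer i considers computer j healthy.

    A computer stays in the set if a strict majority of all computers
    (including itself) vote it healthy. The majority rule is an assumption.
    """
    n = len(votes)
    members = []
    for j in range(n):
        healthy_votes = sum(1 for i in range(n) if votes[i][j])
        if 2 * healthy_votes > n:
            members.append(j)
    return members


n = 5
bad = 2
# Good computers vote everyone healthy except the bad one.
# The bad computer votes itself healthy and everyone else failed.
votes = [[(j != bad) if i != bad else (j == bad) for j in range(n)]
         for i in range(n)]

for row in votes:                      # the 5x5 grid, one row per voter
    print("".join("+" if v else "x" for v in row))
print(redundant_set(votes))
```

With this rule the bad computer's votes against the good ones are outnumbered, so it can't vote them out, while its own vote for itself isn't enough to keep it in.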
Favorite story I heard about voting was an anecdote relating to the flight computers on one of the Boeing 7xx jets (probably the 757, but I don't know).
The story was that they were planning to fly with 3 computers, and that they would "vote" on important decisions.
The real trick was that they intended to build those computers with 3 separate teams, using clean-room implementations (no coordinating with the other teams), and that they were going to use 3 separate CPU architectures, and even 3 different implementation languages.
As I understand it, they conceded on the language choice and were all going to use the same language, but I don't know about the rest.
The goal was to avoid some catastrophic "unknown unknown" that might have crept into the implementation if they simply rolled out 3 copies of the same system.
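The idea in this anecdote (independently written versions plus a voter) can be sketched in a few lines of Python. Everything here is a toy: the three integer-square-root functions stand in for the three teams' implementations, and `vote` is a hypothetical majority voter, not anything from a real avionics system.

```python
import math
from collections import Counter


def isqrt_builtin(n):
    """Team A: use the standard library."""
    return math.isqrt(n)


def isqrt_newton(n):
    """Team B: integer Newton iteration."""
    if n < 2:
        return n
    x = n
    y = (x + n // x) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x


def isqrt_bisect(n):
    """Team C: binary search for the largest r with r*r <= n."""
    lo, hi = 0, n + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo


def vote(results):
    """Return the strict-majority result, or raise if there is none."""
    value, count = Counter(results).most_common(1)[0]
    if 2 * count <= len(results):
        raise RuntimeError("no majority among versions")
    return value


versions = [isqrt_builtin, isqrt_newton, isqrt_bisect]
print(vote([f(10) for f in versions]))

# Simulate a bug that only one team made: the other two outvote it.
buggy = lambda n: n // 2
print(vote([isqrt_builtin(10), isqrt_newton(10), buggy(10)]))
```

The scheme only helps if the versions fail independently; a misunderstanding shared by all teams (for example, from the same ambiguous spec) would still get a unanimous wrong answer.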