It'd make quite a few things so much easier if Apple just switched to ELF. At the cost of being a giant break :/
There are other UNIXes that aren't ELF, e.g. AIX.
Sure, but COFF and its variations are much less practically common in Unixes (i.e., ignoring PE).
In any case, it's highly doubtful that being different from ELF offers real value to anyone at this point. It's just that nobody wants to spend the time, money, etc. to migrate when incremental changes to their own formats get them most of what they want.
Though it's probably accurate to say Apple has it the worst here: Mach-O tooling is almost certainly the least available out of ELF, COFF, and Mach-O.
I honestly think Mach-O is more elegant than ELF, with its structure of load commands.
Of course, ELF still wins by being more mainstream, and not being de facto under the control of a single vendor.
> I honestly think Mach-O is more elegant than ELF, with its structure of load commands.
Can you elaborate on that, especially contrasting with ELF's PT_LOAD program headers?
If Apple wants to add some brand new feature to Mach-O, they generally just define a new load command. There's generally just one way to do it. The main downside is that (in practice) only Apple can do it. [0]
Whereas, with ELF – if you want to add a new feature, do you add it as a new program table entry type (PT_), or a new note type (NT_), or a separate note section (SHT_NOTE)?
[0] Well, historically OSF/1 also used Mach-O, although my understanding is it was an incompatible implementation of the same basic ideas. The main vendor to ship an OSF/1-based Unix was DEC (later Compaq then HP Tru64 Unix), although HP also (briefly) shipped their own OSF/1-based Unix (prior to buying DEC/Compaq), as did IBM (the short-lived AIX/ESA for IBM mainframes, which despite its name, had very little in common with the AIX everyone knows), and also various supercomputer vendors (Intel Paragon and ASCI Red, Hitachi, Kendall Square Research). But all those systems are defunct, so Apple's Mach-O is the only member of the "Mach-O family" still in production use.
> Whereas, with ELF – if you want to add a new feature, do you add it as a new program table entry type (PT_), or a new note type (NT_), or a separate note section (SHT_NOTE)?
It won't be a note section since ELF binaries don't need to have any sections. You can remove all section headers from an existing ELF binary and it'll still work (sstrip [https://github.com/aunali1/super-strip - not sure the one in ELFkickers is the same thing] does this).
Program headers define how & where to load things, notes are for small metadata.
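To illustrate that program headers alone are enough to find what gets loaded, here's a minimal Python sketch, assuming a 64-bit little-endian ELF. The names `program_headers` and `build_sectionless_elf` are made up for this example, and the synthetic file is only an ELF header plus one PT_LOAD entry (with e_shnum = 0), not a runnable binary.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian (56 bytes)
PT_NAMES = {1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP", 4: "PT_NOTE"}


def program_headers(data):
    """Return (section_header_count, [(type_name, file_offset, file_size), ...])."""
    (ident, _type, _machine, _version, _entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, shnum, _shstrndx) = EHDR.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    headers = []
    for i in range(phnum):
        # Each entry is phentsize bytes apart; no section headers are consulted.
        (p_type, _pflags, p_offset, _vaddr, _paddr,
         p_filesz, _memsz, _align) = PHDR.unpack_from(data, phoff + i * phentsize)
        headers.append((PT_NAMES.get(p_type, hex(p_type)), p_offset, p_filesz))
    return shnum, headers


def build_sectionless_elf():
    """Hypothetical test input: an ELF header and one PT_LOAD, zero section headers."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # ELFCLASS64, little-endian, version 1
    ehdr = EHDR.pack(ident, 2, 62, 1, 0x400078, 64, 0, 0, 64, 56, 1, 0, 0, 0)
    phdr = PHDR.pack(1, 5, 0, 0x400000, 0x400000, 120, 120, 0x1000)
    return ehdr + phdr


if __name__ == "__main__":
    print(program_headers(build_sectionless_elf()))
```

The parser never touches e_shoff, which is the same reason a section-stripped binary still loads.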
Is your point that section headers do generally exist in ELF files, even though they're not necessary? I do agree that this "double indexing" between PHs and SHs is pretty weird.
Well, you'll have a SHT_NOTE section and then PT_NOTE pointing to its contents, right? So that way the notes are accessible both via section headers and program headers.
Mach-O does similarly have a distinction between sections as a link-time view, and segments as an execution-time view – a Mach-O contains one or more segments (LC_SEGMENT for 32-bit, LC_SEGMENT_64 for 64-bit), each of which contains a segment header followed by zero or more section headers, so the sections are subdivisions of the segment. Mach-O has names for both segments and sections; ELF only has names for sections.
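The segment/section nesting described above can be sketched in Python like this. It assumes a thin (non-fat) 64-bit little-endian Mach-O; `walk` and `build_sample` are hypothetical helper names, and the sample input is a hand-built header with one `__TEXT` segment and an empty LC_CODE_SIGNATURE command, not a real binary.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_64 = 0x19
LC_CODE_SIGNATURE = 0x1D
HEADER = struct.Struct("<IiiIIIII")           # mach_header_64 (32 bytes)
LOAD_CMD = struct.Struct("<II")               # cmd, cmdsize: common prefix of every load command
SEGMENT = struct.Struct("<II16sQQQQiiII")     # segment_command_64 (72 bytes)
SECTION = struct.Struct("<16s16sQQIIIIIIII")  # section_64 (80 bytes)


def walk(data):
    """Return [(segment_name_or_cmd_hex, [section names]), ...] in load-command order."""
    magic, _cpu, _sub, _ftype, ncmds, _sizeofcmds, _flags, _rsv = HEADER.unpack_from(data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("this sketch only handles thin 64-bit little-endian Mach-O")
    out = []
    offset = HEADER.size
    for _ in range(ncmds):
        cmd, cmdsize = LOAD_CMD.unpack_from(data, offset)
        if cmd == LC_SEGMENT_64:
            fields = SEGMENT.unpack_from(data, offset)
            segname = fields[2].rstrip(b"\0").decode()
            nsects = fields[9]
            # Section headers sit directly after the segment header, inside the same command.
            sections = []
            for i in range(nsects):
                sect = SECTION.unpack_from(data, offset + SEGMENT.size + i * SECTION.size)
                sections.append(sect[0].rstrip(b"\0").decode())
            out.append((segname, sections))
        else:
            out.append((hex(cmd), []))
        offset += cmdsize  # unknown commands are skipped by size alone
    return out


def build_sample():
    """Hypothetical input: one __TEXT segment with one __text section, plus a code-signature command."""
    section = SECTION.pack(b"__text", b"__TEXT", 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    segment = SEGMENT.pack(LC_SEGMENT_64, SEGMENT.size + SECTION.size,
                           b"__TEXT", 0, 0, 0, 0, 5, 5, 1, 0)
    codesig = struct.pack("<IIII", LC_CODE_SIGNATURE, 16, 0, 0)
    cmds = segment + section + codesig
    header = HEADER.pack(MH_MAGIC_64, 0x01000007, 3, 2, 2, len(cmds), 0, 0)
    return header + cmds


if __name__ == "__main__":
    print(walk(build_sample()))
```

Because every command starts with `cmd` and `cmdsize`, a reader can step over commands it doesn't understand, which is what makes "just define a new load command" workable.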
Let me put it this way – if you wanted to add embedded digital signatures to ELF, so the kernel could verify them at runtime, similar to Mach-O LC_CODE_SIGNATURE – what is the extension point you'd use? The Linux kernel supports digital signatures for ELF executables in conjunction with IMA, but instead of storing the signature in the executable like Mach-O does, it stores it in a filesystem xattr.
> Well, you'll have a SHT_NOTE section and then PT_NOTE pointing to its contents, right?
No, the PT_NOTE points directly at parts of the file. The SHT_NOTE is still there because tools don't bother removing it, but it's essentially wrong to use any section information on a finished binary. The loader and dynamic linker don't.
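As a rough sketch of what reading notes straight from a PT_NOTE span looks like: given the segment's p_offset and p_filesz, you can walk the note records without any section headers. This assumes little-endian data and 4-byte note alignment (some notes in 64-bit files use 8-byte alignment, which this doesn't handle); `parse_notes` and `align4` are names invented for the example, and the second note in the demo is made up.

```python
import struct

NOTE_HDR = struct.Struct("<III")  # namesz, descsz, type
NT_GNU_BUILD_ID = 3


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_notes(data, offset, size):
    """Walk note records in data[offset:offset+size]; return [(name, type, desc), ...]."""
    notes = []
    end = offset + size
    while offset + NOTE_HDR.size <= end:
        namesz, descsz, ntype = NOTE_HDR.unpack_from(data, offset)
        offset += NOTE_HDR.size
        name = data[offset:offset + namesz].rstrip(b"\0").decode("ascii")
        offset += align4(namesz)  # name is padded to the alignment
        desc = data[offset:offset + descsz]
        offset += align4(descsz)  # so is the descriptor
        notes.append((name, ntype, desc))
    return notes


if __name__ == "__main__":
    build_id = bytes(range(20))
    # 8 bytes of unrelated data, then two notes back to back.
    blob = (b"\0" * 8
            + NOTE_HDR.pack(4, 20, NT_GNU_BUILD_ID) + b"GNU\0" + build_id
            + NOTE_HDR.pack(5, 3, 1) + b"Demo\0\0\0\0" + b"abc\0")
    for name, ntype, desc in parse_notes(blob, 8, len(blob) - 8):
        print(name, ntype, desc.hex())
```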
> […] so the sections are subdivisions of the segment. […]
Ok, I'll say this is probably cleaner than the ELF n:m relationship between PT_LOAD blocks and sections. I guess the ELF way is a tiny bit more efficient (since there are fewer instructions for the loader to follow), but that really doesn't matter in the face of the cost of doing relocations.
> Let me put it this way – if you wanted to add embedded digital signatures to ELF, […]
Hm. I'm not a libc/kernel developer, so take this with multiple grains of salt, but my train of thought would be this:
There's already PT_GNU_RELRO, which makes a statement about "spans" in the file. I'm not sure if it even makes sense to sign a binary partially, but let's say it does. In that case, I'd start with a new PT_SIGNED, of which there could be multiple, defining chunks of the file to be signed. Unfortunately, the signature itself can't be put into that same header, since a program header has no payload. (Actually... I guess e_phentsize could be bumped up. Hmm. Not sure how much breakage that would cause. But also, then you'd need to skip the signatures in the middle of the program headers…) That leaves having to put the signature(s) somewhere else (one? multiple, if multiple chunks are getting signed? not sure). Hmm, I'm torn between notes and a second new program header. I'm slightly leaning towards notes, but don't really know.
[Ed.:] oh, wait. The hypothetical PT_SIGNED program header would probably only use the PhysAddr / FileSize fields. The VirtAddr could point at the actual signature. (Or other way around?) A little hacky, but could maybe be executed well enough to not turn extremely ugly…
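Purely to make the idea tangible, here's a Python sketch of what such an entry could look like as an Elf64_Phdr. Everything specific here is an assumption: PT_SIGNED doesn't exist, the value 0x60005167 is an arbitrary pick from the OS-specific range (PT_LOOS..PT_HIOS), and the field reuse (p_paddr/p_filesz for the signed span, p_vaddr for the signature location) just follows the comment above literally. `make_pt_signed` and `read_pt_signed` are made-up names.

```python
import struct

PHDR = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr, little-endian (56 bytes)
PT_LOOS, PT_HIOS = 0x60000000, 0x6FFFFFFF
PT_SIGNED = 0x60005167  # HYPOTHETICAL type value; no such program header exists


def make_pt_signed(span_start, span_len, sig_offset):
    """Pack a hypothetical PT_SIGNED entry reusing existing Elf64_Phdr fields."""
    # Field order: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
    return PHDR.pack(PT_SIGNED, 0, 0, sig_offset, span_start, span_len, 0, 1)


def read_pt_signed(raw):
    """Unpack the hypothetical entry back into the signed span and signature location."""
    p_type, _flags, _offset, p_vaddr, p_paddr, p_filesz, _memsz, _align = PHDR.unpack(raw)
    if p_type != PT_SIGNED:
        raise ValueError("not a PT_SIGNED entry")
    return {"span": (p_paddr, p_filesz), "signature_at": p_vaddr}


if __name__ == "__main__":
    entry = make_pt_signed(span_start=0, span_len=4096, sig_offset=8192)
    print(len(entry), read_pt_signed(entry))
```

The entry stays at the normal 56 bytes, so e_phentsize wouldn't need to change. Whether existing tools would tolerate the unknown type or the repurposed address fields is an open question.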
That's a true statement, but what's the argument?