The BCD instructions, despite being originally conceived for BCD arithmetic, actually have some other niche uses, such as AAM/AAD for a small multiply/divide:
https://news.ycombinator.com/item?id=8477254
In this comment from the same discussion, I also further refute the common belief that they are significantly slower, first by reasoning and then with an actual benchmark: https://news.ycombinator.com/item?id=8477585
And this is where I think a gap between compilers and humans exists --- would a compiler recognise, for example, that if your code, for whatever reason, happened to have this pattern of operations in it...
http://x86.renejeschke.de/html/file_module_x86_id_3.html
...it should emit a single AAS instruction? Or, as a perhaps simpler example, would it know to emit an AAM for an 8-bit division or modulus with a constant divisor? Maybe it could --- see the other discussion here about LEA --- but the developers just didn't bother. I wish they would though, because however slow these instructions may be (and they might not be slow at all), they are still valuable for size optimisation (-Os), and after all, it is a gap.
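To make the AAM/AAD point concrete, here is a rough C model of what the two instructions compute when given an arbitrary immediate. It is only a sketch: the `ax_t` struct and the function names `aam` and `aad` are made up for illustration, and the flag updates (SF, ZF, PF) are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the AX register as its two byte halves. */
typedef struct {
    uint8_t ah;
    uint8_t al;
} ax_t;

/* AAM imm8: AH = AL / imm8, AL = AL % imm8.
 * One two-byte instruction yields both quotient and remainder. */
ax_t aam(uint8_t al, uint8_t imm8)
{
    assert(imm8 != 0); /* the real instruction raises #DE on a zero immediate */
    ax_t r;
    r.ah = (uint8_t)(al / imm8);
    r.al = (uint8_t)(al % imm8);
    return r;
}

/* AAD imm8: AL = (AL + AH * imm8) & 0xFF, AH = 0.
 * In effect a small multiply-and-add. */
ax_t aad(ax_t ax, uint8_t imm8)
{
    ax_t r;
    r.ah = 0;
    r.al = (uint8_t)(ax.al + ax.ah * imm8); /* cast truncates to 8 bits */
    return r;
}
```

So `aam(123, 10)` leaves 12 in AH and 3 in AL, and feeding that result to `aad(..., 10)` puts 123 back in AL. An 8-bit `x / c` and `x % c` pair with a constant `c` matches the first function exactly, which is the pattern I would like a compiler to spot when optimising for size.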