That is sort of useful, though my findings are not quite the same.
1) My code works in init().
2) I do not have reversed bit order but reversed nybble order in each byte.
3) Offset is not in words.
If we assume that the endianness seen when reading in the application is the opposite of that seen when reading from the loader, then the byte order also needs to be reversed to correct for that.
Combined with the reversed nybble order within each byte, this results in a complete nybble reversal.
i.e. 12345678 becomes 87654321.
If Larry was using single-line SPI, and so reading one bit at a time, while I am using quad flash (four bits, i.e. one nybble, per clock), then this would make some sense.
I think I can live with this. I will write some code to reverse the nybble order and I am good.
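Something along these lines is what I have in mind. It is only a sketch, assuming the data is handled as 32-bit words; the function names are my own and not from any library. It does the reversal in two steps (swap the nybbles inside each byte, then reverse the byte order) so that either step can be dropped if only one of them turns out to be needed.

```c
#include <stdint.h>

/* Swap the two nybbles inside every byte: 0x12345678 -> 0x21436587. */
static uint32_t swap_nybbles_in_bytes(uint32_t w)
{
    return ((w & 0x0F0F0F0Fu) << 4) | ((w & 0xF0F0F0F0u) >> 4);
}

/* Reverse the byte order of the word: 0x21436587 -> 0x87654321. */
static uint32_t swap_bytes(uint32_t w)
{
    return (w >> 24)
         | ((w >> 8) & 0x0000FF00u)
         | ((w << 8) & 0x00FF0000u)
         | (w << 24);
}

/* Complete nybble reversal of a 32-bit word: 0x12345678 -> 0x87654321. */
static uint32_t reverse_nybbles32(uint32_t w)
{
    return swap_bytes(swap_nybbles_in_bytes(w));
}
```

Applying reverse_nybbles32() twice gives back the original word, so the same routine works in both directions.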
This is all still confusing, since reversed bit or nybble order should not be a thing. The code that reads the SPI flash should sort this out.