Not really a knack, just a lot of hard work and a bit of trial and error sometimes. The single most important thing is knowing where the rom and ram are mapped, for obvious reasons. If it's a multiple processor board that you're targeting (and a lot of boards after '85 are) you need to know where the shared ram area is so your CPUs can communicate with each other.
You also need to know where the watchdog is mapped as you'll be kicking it often (at least once per vbl). Other things like hardware irq enables are important too: many boards can enable/disable specific interrupts in hardware, separately from the CPU's own interrupt enable/disable, and if you don't enable the hardware ints then you won't get any interrupts on the CPU. If the board is Z80 based and you plan on using IM2 (interrupt mode 2) then you'll need to load the I register and set up at least one entry in the interrupt vector table, holding the address of your interrupt routine. You also need to remember to set up a stack at the top of ram.
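To make the IM2 part a bit more concrete, here's a rough Python model of how the Z80 finds the interrupt routine in mode 2: the I register supplies the high byte of a table address, the interrupting device supplies the low byte, and the CPU reads a 16-bit little-endian routine address from that location. The I value ($80), the device byte ($10) and the routine address ($1234) are made up for illustration and don't come from any particular board.

```python
def im2_handler_address(memory, i_reg, bus_byte):
    """Return the address a Z80 in IM2 would jump to.

    memory   -- bytes-like view of the 64K address space
    i_reg    -- value loaded into the I register (high byte of table address)
    bus_byte -- byte the interrupting device puts on the data bus (low byte)
    """
    table_addr = ((i_reg & 0xFF) << 8) | (bus_byte & 0xFF)
    lo = memory[table_addr]
    hi = memory[(table_addr + 1) & 0xFFFF]
    return (hi << 8) | lo


# Hypothetical setup: vector table page at $8000, device supplies $10,
# interrupt routine lives at $1234.
memory = bytearray(0x10000)
memory[0x8010] = 0x34  # low byte of routine address
memory[0x8011] = 0x12  # high byte of routine address

handler = im2_handler_address(memory, i_reg=0x80, bus_byte=0x10)
print(f"${handler:04x}")
```

If you don't know what byte the hardware puts on the bus, one workaround is to fill a whole 257-byte table with the same value so that every possible byte ends up at the same routine.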
You need to remember that Z80s boot from address 0, while 6809 CPUs boot from the address held in the reset vector, which is located at address $fffe in rom. The reset vector for the 6502 is at $fffc. There are also other vectors you need to set up for 6502/6809 (nmi, irq, etc; the vbl interrupt usually arrives on one of these). If you're targeting a 68000 based board then the first 4 bytes of your code need to be the initial supervisor stack pointer address and the next 4 bytes need to be the initial program counter address (where code execution will begin).
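As a sketch of what those boot vectors look like in an actual rom image, the Python below builds the 8-byte 68000 header and drops a reset vector into a 6809 and a 6502 image. The rom size (16K mapped at $c000-$ffff), the entry point and the stack address are assumptions picked for the example; the real values depend on the board's memory map. Note the 6809 and 68000 store addresses big-endian while the 6502 is little-endian.

```python
import struct


def m68k_header(initial_ssp, initial_pc):
    """First 8 bytes of a 68000 image: SSP then PC, both 32-bit big-endian."""
    return struct.pack(">II", initial_ssp, initial_pc)


def place_vector(rom, rom_base, vector_addr, target, little_endian):
    """Write a 16-bit vector into a rom image that is mapped at rom_base."""
    offset = vector_addr - rom_base
    fmt = "<H" if little_endian else ">H"
    rom[offset:offset + 2] = struct.pack(fmt, target)


# Hypothetical 68000 board: stack at $ffff00, code starts at $000400.
header = m68k_header(0x00FFFF00, 0x00000400)
print(header.hex())

# Hypothetical 8-bit boards: 16K rom mapped at $c000-$ffff, entry at $c000.
rom_6809 = bytearray(0x4000)
place_vector(rom_6809, 0xC000, 0xFFFE, 0xC000, little_endian=False)

rom_6502 = bytearray(0x4000)
place_vector(rom_6502, 0xC000, 0xFFFC, 0xC000, little_endian=True)

print(rom_6809[0x3FFE:0x4000].hex(), rom_6502[0x3FFC:0x3FFE].hex())
```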
If you're targeting a board that supports paged memory (either rom, ram or both) then you need to spend some time familiarising yourself with the paging mechanism (the paging registers will be mapped somewhere in memory).
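The arithmetic behind a simple paging scheme can be sketched like this in Python. Everything here is hypothetical: an 8K window at $a000 and a bank number that picks which 8K slice of a bigger rom shows up in that window. Real boards differ in window size, number of windows and how the paging register is written, so treat it as a model for working out where your data ends up, nothing more.

```python
WINDOW_BASE = 0xA000   # hypothetical: start of the paged window in CPU space
WINDOW_SIZE = 0x2000   # hypothetical: 8K per page


def paged_rom_offset(cpu_addr, bank):
    """Translate a CPU address inside the window to an offset in the paged rom."""
    if not WINDOW_BASE <= cpu_addr < WINDOW_BASE + WINDOW_SIZE:
        raise ValueError("address is outside the paged window")
    return bank * WINDOW_SIZE + (cpu_addr - WINDOW_BASE)


def locate(rom_offset):
    """Inverse: which bank to select and which CPU address to read."""
    bank, within = divmod(rom_offset, WINDOW_SIZE)
    return bank, WINDOW_BASE + within


print(hex(paged_rom_offset(0xA123, bank=3)))
print(locate(0x6123))
```

The inverse function is the one I'd expect to use most when building roms: given where a chunk of data sits in the image, it tells you which page to select before reading it.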
If you're targeting a board with a custom memory mapper chip (like the Sega 315-5195 used on Out Run/Super Hang-On/etc) then you need to init the memory mapper by feeding it a bunch of data. This is the very first thing your code must do on those types of boards.
If you're coding a completely non-interactive tech demo then you don't need to know where the sticks/buttons are mapped, but pretty much anything more than HELLO WORLD! will need some kind of user interaction, so you'll need to know how to read the sticks/buttons.
When I was learning Pac-Land and L System I hard-crashed MAME a few times until I got things working, but that's the joy of dev'ing in MAME! It's much better to crash MAME a few times than to let untested code run wild on hardware. I don't deploy code to hardware unless I'm 100% happy that it's working as well as it can in MAME first. Even then it occasionally doesn't work on hardware, and then the fun begins!
Pac-Land is quite complicated to bring up as it does quite a bit of communication between the main cpu and sub-cpu. As it's not an easy task to change the sub-cpu boot code, you need to make sure you follow the correct protocol for booting. The sub-cpu handles all input from the player, so it needs to be woken up correctly or else you won't have any input for your game/demo. Quite a few Namco boards boot in this way.
Then there are system-specific things which vary from one board to the next. For example Tron writes a value to output port $e8 on boot, but it's undocumented/unknown what that does. It may or may not be important for the boot process. Best bet, in situations like that, is to simply follow the original boot code and do the same in your code, to be on the safe side.
I know what you mean about the gfx macros in the drivers! They're fun to follow, that's for sure! When I was learning the tile format of Out Run I just played with the roms until I figured it out the hard way! For Pac-Land I started to understand them a little better and by the time I got to L System I finally managed to understand them, but I had to make lots of scribbles on paper in doing so!
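For anyone else scribbling on paper, here's a small Python sketch of decoding a tile from a layout table in the general style of the ones in the drivers. The conventions are my assumptions about how such tables are read: every offset is a bit offset from the start of the tile, bits are counted from the most significant bit of each byte, and the first plane listed is the most significant bit of the pixel. The 8x8, 2 bitplane layout and the tile data are made up and aren't any real board's format.

```python
def read_bit(data, bit_offset):
    """Return the bit at bit_offset, counting from the MSB of byte 0."""
    return (data[bit_offset // 8] >> (7 - bit_offset % 8)) & 1


def decode_tile(data, plane_offsets, x_offsets, y_offsets):
    """Decode one tile into rows of pixel values.

    All offsets are in bits from the start of the tile. The first plane
    listed is treated as the most significant bit of each pixel.
    """
    rows = []
    for y_off in y_offsets:
        row = []
        for x_off in x_offsets:
            pixel = 0
            for p_off in plane_offsets:
                pixel = (pixel << 1) | read_bit(data, p_off + y_off + x_off)
            row.append(pixel)
        rows.append(row)
    return rows


# Made-up layout: 8x8 pixels, 2 bitplanes, the high plane stored 64 bits
# after the low plane, one byte per row within each plane.
PLANES = [64, 0]
X_OFFS = [0, 1, 2, 3, 4, 5, 6, 7]
Y_OFFS = [0, 8, 16, 24, 32, 40, 48, 56]

tile = bytes([0xFF, 0, 0, 0, 0, 0, 0, 0,    # low plane: top row all set
              0xF0, 0, 0, 0, 0, 0, 0, 0])   # high plane: left half of top row

pixels = decode_tile(tile, PLANES, X_OFFS, Y_OFFS)
print(pixels[0])
```

Feeding it a tile where only one bit is set and seeing which pixel lights up is a quick way to check your reading of a layout before touching the real roms.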