Z80 nibbles

* The least significant byte is stored first when referencing a 16-bit number in memory.
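
To make the byte order concrete, here is a small sketch. The address 8000h is only an example location in RAM, not something taken from a real program:

LD HL, 1234h     ; assembles to 21 34 12: the low byte (34h) comes first
LD (8000h), HL   ; stores L (34h) at 8000h and H (12h) at 8001h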

* There’s no LD instruction to copy one register pair into another (apart from loading SP from HL, IX or IY). When needed, you can use two single-register load instructions, like LD H, B and LD L, C.

* You can use ADD and ADC instructions to add constants to the accumulator, but there’s no equivalent instruction for the HL pair. Whenever you want to add a constant to HL, you have to load it into another pair first and then add that pair to HL:

LD DE, 10EFh
ADD HL, DE

Another possibility would be to go through the accumulator instead and do the addition byte by byte:

LD A, L
ADD A, 0EFh   ; low byte first, sets the carry on overflow
LD L, A
LD A, H
ADC A, 10h    ; high byte plus the carry from the low byte
LD H, A

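* The instructions INC and DEC don’t affect the carry flag. The 8-bit forms still update the other flags (Z, S and so on), whereas the 16-bit forms (like INC HL or DEC BC) don’t change any flag at all.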
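
This comes in handy for multi-byte arithmetic, where the carry has to survive the pointer updates between one byte and the next. A minimal sketch, assuming HL and DE point to two 4-byte numbers stored least significant byte first (the label add_loop is made up for the example):

LD B, 4          ; number of bytes to add
OR A             ; start with the carry flag cleared
add_loop:
LD A, (DE)
ADC A, (HL)      ; add one byte plus the carry from the previous byte
LD (DE), A       ; the sum replaces the number pointed to by DE
INC HL           ; 16-bit INC: no flags change, so the carry survives
INC DE
DJNZ add_loop    ; DJNZ doesn’t touch the flags either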

* Whereas you can add directly to register pair HL (ADD HL, BC; ADD HL, DE; ADD HL, HL), it’s not possible to subtract directly from it: there are no instructions like SUB HL, BC. As a side effect, all subtraction mnemonics without carry are written as SUB with a single operand (SUB B, not SUB A, B), since register A is the only thing they can subtract from.

* However, you can subtract from HL with the instruction SBC (SBC HL, BC; SBC HL, DE and so on), which takes the carry flag into account. For a plain subtraction, the carry flag has to be reset first.
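
A minimal sketch of a plain 16-bit subtraction built on SBC, using OR A to get rid of any leftover carry:

OR A             ; resets the carry flag, A is unchanged
SBC HL, DE       ; HL = HL - DE - carry, which is now just HL - DE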

* Since the 16-bit ADC HL and SBC HL instructions are both extended ones (ED XX), it seems there could have been SUB HL instructions too, unless some decoding issue prevented adding them to the Z80 (remember that the Z80 took its base instruction set from the 8080).

* Although you have instructions to set the carry flag (SCF) or to complement it (CCF), there’s no instruction to reset it. None is really needed, since you can issue SCF followed by CCF in order to reset the carry flag. Another possibility is to use a dummy logical operation like AND A or OR A, which always resets the carry flag and leaves A unchanged (although it does update the other flags).

* PUSH and POP instructions only work with register pairs. The accumulator pairs with the F register (flags) making the AF pair.
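
Since F only travels as half of the AF pair, the stack is also a way to get the flags into an ordinary register. A small sketch (it overwrites BC):

PUSH AF          ; A goes to the stack as the high byte, F as the low byte
POP BC           ; now B holds the old A and C holds the flags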

* If you want to exchange values between two pairs (without overwriting any other register), you can use the stack:

PUSH BC
PUSH DE
POP BC
POP DE

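* Bear in mind that using memory is slower than using registers. Thus, resort to registers as much as you can. The earlier example could go through the accumulator instead, at the cost of overwriting A (the six loads take 24 T-cycles, against 42 T-cycles for the PUSH and POP version):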

LD A, C
LD C, E
LD E, A
LD A, B
LD B, D
LD D, A

* Within a subroutine (CALL), you can use a small trick to make sure you return to the correct address. Save the SP contents to a memory location upon entering the subroutine (LD (nn), SP) and, just before returning, load the saved value back into SP (LD SP, (nn)). That way, there’s no need to worry about unbalanced PUSHs and POPs inside the subroutine.
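
A minimal sketch of the idea. The labels routine and saved_sp are made up for the example, and DEFW is the word-definition directive of many assemblers (yours may call it DW). Note that saved_sp has to live in RAM, and that a single storage location means the routine can’t call itself:

routine:
LD (saved_sp), SP    ; remember where the return address is
PUSH BC              ; from here on, anything can be pushed...
PUSH DE
POP HL               ; ...without matching POPs (one value is left behind)
LD SP, (saved_sp)    ; SP points at the return address again
RET

saved_sp:
DEFW 0               ; 2 bytes to hold the saved SP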

* Another useful trick when dealing with subroutines is to change the return address before returning.

POP HL    ; removes the original return address
LD HL, new_return_address
PUSH HL   ; inserts the new return address
RET

or, with a single exchange that also leaves the original return address in HL:

LD HL, new_return_address
EX (SP), HL
RET

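* XOR 0FFh leaves the same result in A as the CPL instruction, but CPL is shorter (1 byte and 4 T-cycles, against 2 bytes and 7 T-cycles) and treats the flags differently: CPL only sets H and N, whereas XOR updates S, Z and P/V and resets the carry flag.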

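* This one sounds weird. The instructions RL and RR rotate the specified register’s bits through the carry flag, which takes part in the rotation as a ninth bit. The instructions RLC and RRC rotate only the eight bits of the register among themselves; the carry flag just receives a copy of the bit that wrapped around. So the C in these mnemonics stands for circular, not for carry.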
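
The difference shows up when the same value is rotated both ways. A small sketch, starting from A = 81h (10000001b) with the carry flag reset:

LD A, 81h
OR A             ; carry = 0, A unchanged
RLC A            ; bit 7 wraps around to bit 0: A = 03h, carry = 1

LD A, 81h
OR A             ; carry = 0 again
RL A             ; the old carry (0) enters bit 0: A = 02h, carry = 1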

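* The instruction RLA performs the same rotation as RL A, but it’s a single byte that takes only 4 T-cycles, against 2 bytes and 8 T-cycles for RL A. The flags differ too: RLA changes the carry flag (and resets H and N) but leaves S, Z and P/V alone, whereas RL A updates them from the result. RRA, RLCA and RRCA work in a similar manner.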

* On the ZX Spectrum, when calling a machine code routine from BASIC, like using PRINT USR address, the address value is stored into the BC register pair before executing the machine code. Likewise, when the routine returns control to BASIC (RET instruction), the current contents of BC are passed back as the result of the USR expression.
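
A minimal sketch of a routine that hands a value back to BASIC. The load address 32768 is an arbitrary choice for the example, and ORG is the usual origin directive, which may be spelled differently in your assembler:

ORG 32768
LD BC, 42        ; the value to hand back to BASIC
RET              ; USR evaluates to the contents of BC

With this code in place, PRINT USR 32768 should print 42. If the routine were a lone RET, it would print 32768 instead, since BC would still hold the address it was called with.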