Notes about 14.13 and 16.8 (Stallings 9th ed) {12.13 and 14.8 in the 8th ed}.

By Nicholas Duchon.

I will be following page and figure numbering from the 9th edition, so those using the 8th will have to make adjustments.

Both of these problems are about branch prediction algorithms, and the algorithms presented in 16.8 are a superset of those in 14.13, so we can actually just focus on the ones in 16.8.

First, Stallings describes the finite state machine (FSM) description of these algorithms on pp 506-510 (figs 14.18 and 14.19), and on pg 587 (near fig 16.6).

The context in which these algorithms operate is a pipeline - such as the one pictured in fig 14.10, pg 497. The concept is that executing an instruction has been broken down into a series of steps (FI, DI, CO, FO, EI, WO in that diagram), and that the steps can overlap because each step uses a different part of the CPU. If the program manages to keep the pipeline full, the speedup over sequential processing of instructions can be as high as the number of stages in the pipeline (6 in that figure). The problem is that if the pipeline picks the next instruction incorrectly, and only becomes aware of that problem in a later part of the cycle (say at EI), then the instruction processing the following instruction must be abandoned and the next instruction can only start after the correct instruction has been determined. This is the situation shown in fig 14.11, where after EI of instruction 3 it is determined that that next instruction should be 15, not 4. Thus the work done for instructions 4, 5, 6 and 7 is wasted.

All of this is because instruction 3 was a branch instruction (rather like an if/else instruction in a high-level language), the processor guessed taking one path from the branch, but the guess was wrong. In this example, the guess was NEXT (3 --> 4). The other possibility would be TAKEN (3 --> 15), which would have been the right guess in the case shown on fig 14.11.

Let's take a look at the finite state machines from problem 16.8 and start by comparing them to the machines in chapter 14. The T's mean a branch was (or is predicted to be) Taken, the N's mean that the branch was (or is predicted to be) Not taken. The numbers in the circles are simply labels for those states, to make it easier to refer to the circles, representing the states, by name. The T's and N's on the arrows are the actual paths, and the N's and T's in the circles are the predictions. Thus, in (b), state 2, the prediction is T, and if the branch is taken (T), the next state will be 1, and if the branch is not taken, the next state will be 0. • Fig 16.17 (d) is the same as Fig 14.19, pg 509 - you should confirm that you agree with this statement!
• (b) is the same as the left FSM (finite state machine) in Fig 14.28, pg 529 (problem 14.13) - again, check this out yourself.
• (c) is the same as the right FSM on pg 529. For example, if we are currently in state 3:

• The FSM predicts that the next branch will be T (taken)
• If the next branch really is T, the new state will also be 3, and the prediction will have been correct.
• If the next branch in not taken, N, the new state will be 2, and the prediction will have been wrong.

These algorithms are characterized by which sequences of T's and N's are mostly correct for that FSM, and which sequences of T's and N's are particularly bad for each FSM.

For the moment, let's work with example (a). A good sequence E:  T T T T T T S: 0 1 1 1 1 1 P:  N T T T T T C?  X + + + + + E = event S = state P = prediction C? was that correct? X = wrong + = correct Another good sequence E:  N N N N N N S: 0 0 0 0 0 0 P:  N N N N N N C?  + + + + + + A bad sequence E:  T N T N T N S: 0 1 0 1 0 1 P:  N T N T N T C?  X X X X X X

Thus, this FSM works well for many T's in a row, and for many N's in a row, but if T's and N's alternate, this FSM works very badly, ie, ALL its predictions are wrong.

Each of the machines in problem 16.8 will have some sequences that work well, and other sequences that work badly, and some sequences which are best with each of the FSM. For example, TNTNTNTNT is correctly predicted about half the time in FSM (b).

This is the kind of answer you should be presenting when you are asked to compare the relative merits of these FSM's.

ND.