That sounds like the next buffer register gets loaded on every load pulse.
This is my quick mockup so far...
- shifter.png
I have assumed here that both banks of registers are clocked at the same speed. In reality the MMU could load the data faster than the shifter needs it, but I do not know either way at the moment...
So data initially goes into bank zero, but then there would have to be a specific wait of 16 clock cycles before the MMU loads more data. So this looks like it may not actually be correct...
The basic idea was a 7-bit (count-to-128) counter for bank select, whose most significant bit stays high or low for 64 clock cycles at a time. So on the 65th clock cycle the selection flips between bank zero and bank one.
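To make the counter idea concrete, here is a minimal sketch in Python. It assumes the counter is 7 bits wide (counts 0 to 127) and that cycles are numbered from zero; the name `bank_select` is just made up for illustration, and nothing here is confirmed behaviour of the real chip.

```python
def bank_select(cycle: int) -> int:
    """Return the most significant bit of an assumed 7-bit (modulo-128) counter."""
    count = cycle % 128   # free-running counter, wraps after 128 clocks
    return count >> 6     # bit 6 is the MSB: 0 for counts 0..63, 1 for 64..127


# Cycle 64 (counting from zero) is the 65th clock, where the select line flips.
for cycle in (0, 63, 64, 127, 128):
    print(cycle, bank_select(cycle))
```

So the select line only changes every 64 clocks, which is what makes the fixed wait time between loads look suspicious.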
So there is probably another small counter to select between bank zero's 16-bit registers: basically a one-of-four selector clocked by the MMU load signal.
Of course I do not actually know the load sequence of the MMU, but with load clocking a one-of-four selector for the registers, the speed at which the MMU loads data would be irrelevant, provided of course that it is faster than the shifter outputs the data...
EDIT:
So it would be like this..
- shifter2.png
So each 16-bit register in bank zero is connected directly to the MMU databus, and which one is selected is based on a one-of-four counter clocked by the load line.
Of course it is probably a bit more complicated, as bank one would also need the databus as an input, but that would not really be a problem anyway; it is just getting a bit messy to draw now.
The problem I see is that there would have to be some sort of disable on bank zero's clock line, since in this case we would not want to clock the shift registers at the same time as the output register...
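Here is a rough Python sketch of that one-of-four arrangement, just to check the idea hangs together. The class name `BufferBank` and the method `load` are hypothetical, and I am assuming the selector is a 2-bit counter that advances once per load pulse; the real load sequence of the MMU is unknown to me.

```python
class BufferBank:
    """Four 16-bit registers on a shared databus, written one at a time."""

    def __init__(self) -> None:
        self.regs = [0, 0, 0, 0]
        self.select = 0  # assumed 2-bit one-of-four counter

    def load(self, databus: int) -> None:
        """One load pulse: latch the bus into the selected register, then advance."""
        self.regs[self.select] = databus & 0xFFFF
        self.select = (self.select + 1) % 4


bank = BufferBank()
for word in (0x1111, 0x2222, 0x3333, 0x4444):
    bank.load(word)  # only the number of pulses matters, not their timing
print([hex(r) for r in bank.regs])
```

Nothing in the model depends on when the pulses arrive, which is the point: as long as four words turn up before the shifter needs them, the MMU speed does not matter.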
This probably explains why a copy is done rather than a bank swap...
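A small Python sketch of the copy approach, as I picture it. Everything here is an assumption for illustration: the class `Shifter`, four 16-bit buffer registers, a copy into four shift registers every 16 pixel clocks, and one bit shifted out of the top of each register per clock.

```python
class Shifter:
    """Buffer registers are copied into shift registers; only the latter are shifted."""

    def __init__(self) -> None:
        self.buffer = [0, 0, 0, 0]  # filled by load pulses, never shift-clocked
        self.shift = [0, 0, 0, 0]   # clocked every pixel, never written by the bus
        self.count = 0              # pixel clocks since the last copy

    def pixel_clock(self) -> tuple:
        """One pixel clock: copy if the shift registers are spent, then shift one bit out."""
        if self.count == 0:
            self.shift = list(self.buffer)  # the copy step (assumed every 16 clocks)
        bits = tuple((r >> 15) & 1 for r in self.shift)
        self.shift = [(r << 1) & 0xFFFF for r in self.shift]
        self.count = (self.count + 1) % 16
        return bits


s = Shifter()
s.buffer = [0x8000, 0x0000, 0xFFFF, 0x0001]
out = [s.pixel_clock() for _ in range(16)]
print(out[0], out[15])
```

With a copy, each bank only ever sees one kind of clock, so no disable is needed on the buffer side. With a swap, both banks would need both the load and the shift clocks, plus gating to pick between them.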