A complicated data processing system like the one above can be simplified and abstracted as shown below. Data is processed by Algorithm 1, Algorithm 2, and Algorithm 3 in sequence. One option is to implement all three algorithms in hardware. This approach normally supports high throughput at low power, but its major downside is lack of flexibility. A hardware implementation may have bugs, and finding a workaround on silicon can be tough. A fix may mean a new tapeout, with the added cost and delayed time to market, plus a great deal of design, verification, and emulation effort. In addition, Algorithm 2 may not be final at design time, for example because the standard it implements is not yet finalized. A firmware approach is preferable in both cases.
Assuming Algorithm 2 needs to be implemented on a DSP processor, how can it be done?
A DSP normally follows the Harvard architecture, in which the instruction bus and the data bus are separate. A straightforward approach is shown below. Alg 1 dumps its output data into a local memory called rMem, and an interrupt is raised to the DSP. The DSP uses its data bus to read the data out of rMem, processes it, and writes the result to another local memory called tMem. tMem can trigger Alg 3 when its data reaches a certain fill level. Alternatively, Alg 3 can retrieve data from tMem when instructed by the DSP or by a timer signal.
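The data flow described above can be sketched in C as a small host-side simulation. This is only an illustration under stated assumptions: the memory sizes, the trigger level, the function names (`alg1_dump`, `dsp_rmem_isr`, `alg2_process`), and the doubling operation standing in for Algorithm 2 are all hypothetical, and plain arrays stand in for what would be memory-mapped rMem and tMem on a real chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RMEM_WORDS 8  /* hypothetical rMem size in samples */
#define TMEM_WORDS 8  /* hypothetical tMem size in samples */
#define TMEM_LEVEL 4  /* hypothetical fill level that triggers Alg 3 */

/* Plain arrays stand in for the memory-mapped local memories. */
static volatile int16_t rmem[RMEM_WORDS];
static volatile int16_t tmem[TMEM_WORDS];
static volatile size_t rmem_count;    /* samples waiting in rMem */
static volatile size_t tmem_count;    /* samples written to tMem */
static volatile int alg3_triggered;   /* set when tMem reaches TMEM_LEVEL */

/* Placeholder for Algorithm 2 (assumption: a simple gain of 2). */
static int16_t alg2_process(int16_t x)
{
    return (int16_t)(2 * x);
}

/* Stand-in for the Alg 1 hardware block: dump n samples into rMem.
 * Real hardware would raise the DSP interrupt after this step. */
void alg1_dump(const int16_t *src, size_t n)
{
    assert(n <= RMEM_WORDS);
    for (size_t i = 0; i < n; i++) {
        rmem[i] = src[i];
    }
    rmem_count = n;
}

/* DSP interrupt handler: read rMem over the data bus, run Algorithm 2,
 * write the result to tMem, and flag Alg 3 once the level is reached.
 * Samples that do not fit in tMem are dropped in this sketch. */
void dsp_rmem_isr(void)
{
    for (size_t i = 0; i < rmem_count && tmem_count < TMEM_WORDS; i++) {
        tmem[tmem_count++] = alg2_process(rmem[i]);
        if (tmem_count >= TMEM_LEVEL) {
            alg3_triggered = 1;
        }
    }
    rmem_count = 0;  /* rMem has been consumed */
}
```

Calling `alg1_dump()` followed by `dsp_rmem_isr()` mimics one pass through the pipeline. In a real design the handler would be registered with the DSP's interrupt controller, and the trigger would be a hardware signal rather than a flag.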
Note that at power-up the DSP instruction memory is empty. The main CPU must first write the DSP instructions into it. The instruction memory can therefore be connected to the main CPU bus, with the DSP instruction bus acting as another bus master on that bus.
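The boot step can be sketched from the main CPU's side as follows. This is a hedged illustration, not a description of any particular chip: the instruction memory size, the `dsp_reset_released` flag, the read-back check, and the function name `cpu_load_dsp_firmware` are assumptions introduced here, and an array stands in for the memory-mapped DSP instruction memory.

```c
#include <stddef.h>
#include <stdint.h>

#define DSP_IMEM_WORDS 16  /* hypothetical instruction memory size */

/* Array standing in for the DSP instruction memory on the CPU bus. */
static volatile uint32_t dsp_imem[DSP_IMEM_WORDS];
/* Hypothetical control flag: nonzero lets the DSP start fetching. */
static volatile uint32_t dsp_reset_released;

/* Main CPU copies the DSP program image into instruction memory,
 * reads it back to verify, and only then releases the DSP.
 * Returns 0 on success, -1 on bad arguments, -2 on verify failure. */
int cpu_load_dsp_firmware(const uint32_t *image, size_t words)
{
    if (image == NULL || words == 0 || words > DSP_IMEM_WORDS) {
        return -1;
    }
    for (size_t i = 0; i < words; i++) {
        dsp_imem[i] = image[i];
    }
    for (size_t i = 0; i < words; i++) {
        if (dsp_imem[i] != image[i]) {
            return -2;
        }
    }
    dsp_reset_released = 1;
    return 0;
}
```

Holding the DSP in reset until the image is fully written is a common precaution, assumed here so that the DSP never fetches from partially loaded memory.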
In the design above, the DSP data bus and the main CPU bus are separate. The implementation below uses a combined solution instead. It also shows that the DSP can be used to implement both Alg 2 and Alg b.
Here is a real example of how a CEVA DSP can be used in an LTE baseband design.