DMA is widely used to offload resource-consuming data movement from the processor. There are two typical usage scenarios: transfers between two memory spaces, and transfers between memory and a peripheral, i.e. a hardware module that receives data from or transmits data to an external device. The two scenarios are related. The second can be designed so that the peripheral first puts data into a local memory; the transfer then reduces to the first scenario, since it runs between two memories.
Let’s say we want to design a system for the second scenario, where DMA transfers data between a peripheral and memory. Both directions, memory to peripheral and peripheral to memory, need to be supported. The peripheral implements a FIFO instead of a full memory to save space. Before implementing the design, let’s think through how it works. In the peripheral-to-memory direction, the peripheral receives data and pushes it into the FIFO. Once the FIFO reaches a certain level, the peripheral asserts a request to the DMA, which fetches the data from the peripheral FIFO and writes it to memory. Sounds good. But at the end of a transfer we likely don’t have enough data to fill the FIFO to the threshold level, so no request is sent and the DMA never fetches the remaining data. The processor will not see the complete data of the transfer. How do we resolve this?
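To make the problem concrete, here is a minimal behavioral sketch in Python. It is not a model of any real controller: the function name `words_left_behind` and the numbers (39 words, threshold 16) are made up for illustration, and it assumes the DMA drains exactly one threshold's worth of words per request.

```python
def words_left_behind(total_words, threshold):
    """Model a peripheral that raises a DMA request only at the FIFO threshold.

    Returns (words moved to memory, words still stuck in the FIFO).
    """
    fifo_level = 0
    moved_to_memory = 0
    for _ in range(total_words):
        fifo_level += 1                  # peripheral receives one word
        if fifo_level == threshold:      # threshold reached: request one burst
            moved_to_memory += threshold
            fifo_level = 0               # DMA drains the whole burst
    return moved_to_memory, fifo_level


moved, stuck = words_left_behind(total_words=39, threshold=16)
print(moved, stuck)  # 32 7
```

With a threshold-only request, the last 7 words never leave the FIFO, which is exactly the situation described above.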
Let’s take a look at an existing DMA controller IP; maybe the issue is already resolved there. ARM processors and the AMBA bus are widely used in industry. For low-end AHB-based systems, ARM offers the PrimeCell SMDMAC, a single master DMA controller. See the PL081 technical reference manual for details. After checking this IP, our question above is indeed resolved.
The SMDMAC system-level interconnection is shown below. The AHB slave interface lets the processor program and configure the SMDMAC internal registers. The AHB master interface is used by the DMAC to read and write data once it becomes active. The interrupt signals notify the processor that a DMA transfer is done. The DMA request and response signals run between the DMAC and the peripheral, and it is these signals that resolve the question above.
Once again, the DMA request and response signals run between the DMAC and the peripheral. Notice that there is NOT just one request signal; there are several. Here is how the issue above is resolved. On the peripheral side, when the data in the FIFO reaches a certain level, the peripheral asserts BREQ (burst request) to the DMAC. This “certain level” is normally programmable and must match the DMA’s burst transaction size. For example, if the FIFO threshold is programmed to 16, the peripheral asserts BREQ to tell the DMAC that 16 words are ready. The DMAC sees BREQ and performs a burst read. Since the processor has also programmed the DMAC burst size to 16, the DMAC reads exactly 16 words out of the FIFO.
This process can repeat for many bursts. Now back to our question: at the end of a transfer, there is not enough data to fill the FIFO up to the level that asserts BREQ. The SMDMAC provides two extra request signals, SREQ and LSREQ, to resolve this. SREQ is single request and LSREQ is last single request. Let’s say 7 words are sitting in the FIFO at the end of a transfer. The peripheral asserts SREQ 6 times and then LSREQ once at the end. When the DMAC sees SREQ or LSREQ, it reads only one word. When it sees LSREQ, it also knows this is the end of the transfer, so it notifies the processor that the transfer is done and the data is ready in memory. By the way, the DMAC also offers LBREQ, last burst request. It is used when the transfer size is an exact multiple of the burst size. In that case LBREQ must be asserted for the final burst to tell the DMAC the transfer has ended; otherwise the DMAC won’t send an interrupt to the processor.
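The request sequence described above can be sketched as a small Python function. This is only my reading of the scheme, written from the peripheral's point of view: `plan_requests` is a hypothetical name, and the strings stand in for the request signals rather than modelling their timing.

```python
def plan_requests(transfer_words, burst_words):
    """Return the request sequence a peripheral would raise for one transfer."""
    if transfer_words <= 0 or burst_words <= 0:
        raise ValueError("sizes must be positive")
    full_bursts, remainder = divmod(transfer_words, burst_words)
    if remainder == 0:
        # Exact multiple of the burst size: the final burst is flagged as last.
        return ["BREQ"] * (full_bursts - 1) + ["LBREQ"]
    # Otherwise: all full bursts, then single requests, the final one flagged as last.
    return ["BREQ"] * full_bursts + ["SREQ"] * (remainder - 1) + ["LSREQ"]


print(plan_requests(39, 16))
# ['BREQ', 'BREQ', 'SREQ', 'SREQ', 'SREQ', 'SREQ', 'SREQ', 'SREQ', 'LSREQ']
print(plan_requests(32, 16))
# ['BREQ', 'LBREQ']
```

The 39-word case leaves 7 words after two bursts, giving 6 SREQs and one LSREQ. The 32-word case ends on a burst boundary, so the second burst is raised as LBREQ.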
But how does the peripheral know the size of a transfer? Well, the processor programs the transfer size into the peripheral. But how does the processor know it? A typical solution is to start with a fixed-size transfer that carries the size of the upcoming transfer. Since the first transfer has a fixed size, the processor can program that size into the peripheral in advance. The peripheral and the DMA work together to complete this transfer. The processor is notified, processes the first transfer, and learns the size of the upcoming one. It then programs the size of the second transfer into the peripheral, and the second transfer proceeds the same way.
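Here is a rough Python sketch of the two-stage idea from the processor's side. The header layout is purely my assumption for illustration (a 4-byte little-endian length field); the actual format depends on the protocol in use, and `receive_message` is a hypothetical helper.

```python
import struct

HEADER_BYTES = 4  # assumed fixed header: one 32-bit little-endian payload length


def receive_message(stream):
    """Two-stage receive: fixed-size header first, then the payload it announces."""
    header = stream[:HEADER_BYTES]                # transfer 1: size known in advance
    (payload_len,) = struct.unpack("<I", header)  # processor learns the next size
    payload = stream[HEADER_BYTES:HEADER_BYTES + payload_len]  # transfer 2
    if len(payload) != payload_len:
        raise ValueError("stream ended before the announced payload length")
    return payload_len, payload


stream = struct.pack("<I", 7) + b"ABCDEFG"
print(receive_message(stream))  # (7, b'ABCDEFG')
```

In a real system the slicing would be replaced by two DMA transfers, with the processor reprogramming the peripheral between them.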
Here is the DMAC configuration and process flow. A few notes on it:
- It says the SMDMAC can be used as the flow controller. First, this still requires the peripheral to send DMA requests. Second, I am not sure how it resolves the issue above. Remember, the peripheral switches to SREQ because it knows the transfer size and the burst size, and therefore knows that at the end of the transfer there is not enough data for a burst. If the DMAC is the flow controller and the peripheral doesn’t know the transfer size, I don’t think our issue can be resolved.
- It mentions DMACCxLLI. LLI stands for linked list item, which is used for scatter-gather DMA operation. It is a very useful feature, since data can be written to non-contiguous memory sections.
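To illustrate the scatter-gather idea, here is a generic Python sketch of a descriptor chain. It is not the PL081 register layout: the `Descriptor` fields, `build_chain`, `run_chain` and the addresses are all hypothetical, and memory is modelled as a plain dictionary of word addresses.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Descriptor:
    src: int                    # source address (fixed peripheral FIFO address here)
    dst: int                    # start address of one memory section
    words: int                  # number of 32-bit words to move for this item
    next_index: Optional[int]   # index of the next item, None ends the chain


def build_chain(fifo_addr: int, sections: List[Tuple[int, int]]) -> List[Descriptor]:
    """Create one descriptor per (start_address, word_count) memory section."""
    chain = []
    for i, (dst, words) in enumerate(sections):
        next_index = i + 1 if i + 1 < len(sections) else None
        chain.append(Descriptor(fifo_addr, dst, words, next_index))
    return chain


def run_chain(chain: List[Descriptor], data: List[int]) -> Dict[int, int]:
    """Follow the links and scatter `data` into a dict modelling memory."""
    memory: Dict[int, int] = {}
    position = 0
    index: Optional[int] = 0 if chain else None
    while index is not None:
        item = chain[index]
        for k in range(item.words):
            memory[item.dst + 4 * k] = data[position]  # word addresses step by 4
            position += 1
        index = item.next_index
    return memory


chain = build_chain(0x4000_0000, [(0x1000, 2), (0x8000, 3)])
memory = run_chain(chain, [10, 11, 12, 13, 14])
print({hex(addr): word for addr, word in memory.items()})
# {'0x1000': 10, '0x1004': 11, '0x8000': 12, '0x8004': 13, '0x8008': 14}
```

Five words from one source end up in two separate memory sections without the processor stepping in between them, which is the point of chaining descriptors.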
Let’s take a look at real timing. The waveform below shows the peripheral asserting one BREQ. The DMAC sees it and fetches data from the FIFO 16 times. But when the DMAC writes the data to memory, there are only 4 transactions. Is something wrong? In fact, each of the 16 FIFO reads is only one byte wide. 16 bytes make 4 words, which matches the 4 write transactions to memory. By the way, the SMDMAC has a 4-word internal FIFO to buffer the data.
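The 16-to-4 arithmetic can be checked with a few lines of Python. This is only a sketch of the width conversion, not of the DMAC itself: `pack_bytes_to_words` is a hypothetical helper, and little-endian byte order is my assumption.

```python
def pack_bytes_to_words(fifo_bytes, bytes_per_word=4):
    """Pack byte-wide FIFO reads into word-wide memory writes (little-endian assumed)."""
    if len(fifo_bytes) % bytes_per_word:
        raise ValueError("byte count must be a multiple of the word size")
    return [
        int.from_bytes(fifo_bytes[i:i + bytes_per_word], "little")
        for i in range(0, len(fifo_bytes), bytes_per_word)
    ]


words = pack_bytes_to_words(bytes(range(16)))  # 16 one-byte reads: 0x00 .. 0x0f
print(len(words))     # 4
print(hex(words[0]))  # 0x3020100
```

Sixteen byte-wide reads on the peripheral side become four word-wide writes on the memory side, matching the transaction counts in the waveform.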
Below is another example. If you look closely, you can see that the peripheral FIFO read transactions and the memory write transactions are interleaved: one read, one write, one read, one write, and so on. I am not sure why. I would expect at least the memory side to be burst based. This needs a closer look.
Below is the case I mentioned above: to resolve the issue, after BREQ the peripheral needs to assert several SREQs and then one LSREQ at the end. For each BREQ/LBREQ/SREQ/LSREQ, the DMAC clears the request by asserting CLR1, which tells the peripheral that the request has been served.