First of all, there are a couple of good resources online about PCIe MSI and MSI-X in-band interrupts.
PCIe supports three interrupt types: the legacy PCI out-of-band interrupt (INTx), MSI (Message Signaled Interrupt), and MSI-X.
MSI generates an in-band interrupt instead of using dedicated out-of-band hardware wires. MSI is nothing but a PCIe memory write transaction from device to host. Below is what an MSI TLP looks like. As can be seen, there is nothing special compared to a normal memory write. It consists of an address to write to and the “Message_data” to be written. What really separates MSI from a normal memory write is the address and the message data, which are both system specific. The address is a special address on the host side: a write to this address generates an interrupt to the host processor. Message_data is also specific. It is 16 bits wide. The lower 5 bits [4:0] can be used by the device to distinguish up to 32 MSI vectors (fewer bits vary when fewer vectors are enabled), and the upper 11 bits [15:5] are fixed and configured by the host driver during PCIe enumeration.
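To make the Message_data layout concrete, here is a minimal sketch of how a device could form the 16-bit value for a given vector. The function name `msi_message_data` and the sample values are made up for illustration. It assumes the host has enabled a power-of-two number of vectors and that the device only replaces the low-order bits needed to number them.

```python
def msi_message_data(base_data: int, enabled_vectors: int, vector: int) -> int:
    """Return the 16-bit Message Data a device would send for `vector`.

    base_data       -- value the host wrote to the Message Data register
    enabled_vectors -- number of vectors the host enabled (1, 2, 4, ..., 32)
    vector          -- which of those vectors is being signaled
    """
    if enabled_vectors not in (1, 2, 4, 8, 16, 32):
        raise ValueError("MSI vectors are allocated in powers of two, up to 32")
    if not 0 <= vector < enabled_vectors:
        raise ValueError("vector is outside the enabled range")
    low_mask = enabled_vectors - 1  # the low-order bits the device may modify
    # Keep the host-configured upper bits and put the vector number in the low bits.
    return (base_data & 0xFFFF & ~low_mask) | vector


if __name__ == "__main__":
    # Illustrative values only: 32 vectors enabled, device signals vector 5.
    print(hex(msi_message_data(0x4120, 32, 5)))
```

With only one vector enabled, the mask is zero and the device sends the host-written value unchanged.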
On the device side, the MSI capability needs to be implemented in the PCIe configuration space. If you are not sure what the configuration space does, you may want to check out our other post first, PCIE Configuration Space and Example to Enable L1SS ASPM with Config Space Access.
The MSI capability structure is shown below.
“Message Control” field is defined as:
During system enumeration, the host scans and discovers this PCIe device as well as the above MSI capability structure in its configuration space. The host driver reads the “Multiple Message Capable” field in the “Message Control Register” to determine the number of interrupts this device supports. MSI supports up to 32 interrupts, also called vectors. So why are there only 3 bits in the “Multiple Message Capable” field? Because the field encodes the vector count as a power of two: 000b means 1 vector, 001b means 2, and so on up to 101b for 32. The encodings 110b and 111b are reserved.
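The decoding can be sketched in a few lines. The function and dictionary key names below are my own. The bit positions are the ones commonly documented for the MSI Message Control register (bit 0 MSI Enable, bits 3:1 Multiple Message Capable, bits 6:4 Multiple Message Enable, bit 7 64-bit address capable, bit 8 per-vector masking capable), so check them against the spec revision you are working with.

```python
def _vector_count(field: int) -> int:
    """Convert a 3-bit log2-encoded field into a vector count."""
    if field > 5:
        raise ValueError("encodings 110b and 111b are reserved")
    return 1 << field


def decode_msi_message_control(reg: int) -> dict:
    """Split a 16-bit MSI Message Control value into its fields."""
    return {
        "msi_enable": bool(reg & 0x1),
        "vectors_capable": _vector_count((reg >> 1) & 0x7),
        "vectors_enabled": _vector_count((reg >> 4) & 0x7),
        "addr64_capable": bool(reg & (1 << 7)),
        "per_vector_masking": bool(reg & (1 << 8)),
    }


if __name__ == "__main__":
    # 0x008A is an illustrative value, not taken from a real device.
    print(decode_msi_message_control(0x008A))
```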
Depending on the value read back, the host driver writes a value that is no more than the read-back value into the “Multiple Message Enable” field. This written value is the number of interrupts the host assigns to the device. Next, the host writes the “Message Address”, the “Message Upper Address” if a 64-bit host address is supported, and the “Message Data” registers. These register values are used in the MSI packet as mentioned above. Finally, the host driver enables MSI by setting the “MSI Enable” bit in the “Message Control” field. That’s all. The device can then use MSI to send interrupts to the host.
The MSI capability structure also defines “Mask Bits” and “Pending Bits” registers. They are optional for MSI, and most PCIe devices that support MSI do not implement them. The equivalent per-vector mask and pending bits are mandatory for MSI-X, however. We will talk about their function in the MSI-X part.
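The enable sequence just described can be sketched against a fake configuration space held in a `bytearray`. This is only an illustration, not how any particular OS does it. `program_msi` and its parameters are hypothetical names. It assumes the usual capability layout: Message Control at offset 2, Message Address at 4, then either Message Data at 8 (32-bit) or Upper Address at 8 and Message Data at 12 (64-bit).

```python
import struct

MSI_CAP_ID = 0x05  # capability ID that identifies an MSI capability


def program_msi(cfg: bytearray, cap: int, address: int, data: int,
                wanted_vectors: int) -> int:
    """Program the MSI capability at byte offset `cap`; return vectors enabled."""
    if cfg[cap] != MSI_CAP_ID:
        raise ValueError("no MSI capability at this offset")
    (ctrl,) = struct.unpack_from("<H", cfg, cap + 2)

    # Step 1: read Multiple Message Capable (bits 3:1), a log2 vector count.
    capable_log2 = (ctrl >> 1) & 0x7
    # Step 2: choose a count no larger than the device supports,
    # rounding the request down to a power of two.
    wanted_log2 = max(wanted_vectors, 1).bit_length() - 1
    enable_log2 = min(capable_log2, wanted_log2)

    # Step 3: write Message Address, Upper Address (64-bit only), Message Data.
    addr64 = bool(ctrl & (1 << 7))
    if not addr64 and address >> 32:
        raise ValueError("device only supports 32-bit message addresses")
    struct.pack_into("<I", cfg, cap + 4, address & 0xFFFFFFFF)
    if addr64:
        struct.pack_into("<I", cfg, cap + 8, address >> 32)
        struct.pack_into("<H", cfg, cap + 12, data & 0xFFFF)
    else:
        struct.pack_into("<H", cfg, cap + 8, data & 0xFFFF)

    # Step 4: write Multiple Message Enable (bits 6:4) and set MSI Enable (bit 0).
    ctrl = (ctrl & ~(0x7 << 4)) | (enable_log2 << 4) | 0x1
    struct.pack_into("<H", cfg, cap + 2, ctrl)
    return 1 << enable_log2
```

A real driver performs these accesses through the platform's configuration space read and write routines rather than on a local buffer, and the address and data values come from the interrupt controller setup, not from the driver author.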
The following MSI flow diagram is from Handling PCIe Interrupts. It introduces an interesting concept called “MSI status”. MSI status is not defined by the PCIe specification and is “normally” not in the device configuration space. It is custom logic on the device side. The MSI status register has multiple bits, each corresponding to one of the supported interrupts. The device sets a bit to 1 to indicate that the corresponding MSI interrupt is active, that is, it has been sent from device to host. When the host sees and services the interrupt, the host driver needs to clear this MSI status bit. So the diagram below is WRONG in saying “endpoint clears MSI status bit” when the CPU services the interrupt. The reason is obvious: how would the device know that the host has serviced the interrupt?
This MSI status on the device side is not part of the PCIe spec, but it is commonly implemented in PCIe designs. It tells the device side whether an interrupt has been serviced. Without checking that, the device may generate another interrupt right behind one that is still outstanding, and the new one is effectively “masked” by the existing one.
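Since MSI status is custom logic, there is no standard register layout to show, but the handshake can be modeled roughly as below. Everything here (the class name `MsiStatus`, its methods, one bit per vector) is a hypothetical sketch of the idea, not taken from any real design.

```python
class MsiStatus:
    """Toy model of a device-side MSI status register, one bit per vector."""

    def __init__(self, num_vectors: int):
        self.num_vectors = num_vectors
        self.status = 0  # bit n set: vector n was sent and is not yet serviced

    def device_raise(self, vector: int) -> bool:
        """Device wants to signal `vector`. Returns True if an MSI is sent."""
        if not 0 <= vector < self.num_vectors:
            raise ValueError("vector out of range")
        bit = 1 << vector
        if self.status & bit:
            # Previous interrupt on this vector is still outstanding,
            # so the device holds off instead of sending a second message.
            return False
        self.status |= bit
        return True

    def host_clear(self, vector: int) -> None:
        """Host driver clears the bit after servicing the interrupt."""
        self.status &= ~(1 << vector)
```

The point of the model is that only the host side calls `host_clear`, which matches the correction to the diagram above.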
MSI-X provides additional capabilities, including:
- a larger maximum number of vectors per function
- the ability for software to control aliasing, when fewer vectors are allocated than requested
- the ability for each vector to use an independent address and data value, specified by a table that resides in Memory Space.
Below is the MSI-X capability structure.
MSI-X puts the message address and message data of multiple interrupts in device memory, in what is called the MSI-X table. The “Table Offset” and “Table BIR” fields above let the host know where to find this MSI-X table. “Table BIR” is the table BAR indicator register, and it says which BAR should be used to access the table. The table starts at the address of the BAR specified in the BIR plus “Table Offset”.
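As a small sketch of that address calculation: the Table Offset and Table BIR share one 32-bit register, with the BIR in bits 2:0 and the 8-byte-aligned offset in the remaining bits. The helper names are my own, and `bars` is assumed to be a list of already-decoded BAR base addresses. The table size helper relies on the Table Size field in bits 10:0 of the MSI-X Message Control register being encoded as N-1.

```python
def msix_table_address(table_offset_bir: int, bars: list) -> int:
    """Return the address of the MSI-X table from the Table Offset/BIR register."""
    bir = table_offset_bir & 0x7                    # which BAR holds the table
    offset = table_offset_bir & 0xFFFFFFFF & ~0x7   # offset inside that BAR
    if bir > 5:
        raise ValueError("BIR values 6 and 7 are reserved")
    return bars[bir] + offset


def msix_table_size(message_control: int) -> int:
    """Return the number of table entries (the field stores N - 1)."""
    return (message_control & 0x7FF) + 1


if __name__ == "__main__":
    # Illustrative values: table lives 0x2000 bytes into BAR 3.
    bars = [0, 0, 0, 0xF0000000, 0, 0]
    print(hex(msix_table_address(0x00002003, bars)))
```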
As in the MSI case, the host driver needs to fill in the MSI-X table during device enumeration. The device can then use the table to send MSI-X packets to the host.
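Filling in one table entry could look like the sketch below, which treats the table as a plain `bytearray` standing in for the device memory behind the BAR. It assumes the usual 16-byte entry layout of Message Address, Message Upper Address, Message Data and Vector Control, with the per-vector Mask bit in bit 0 of Vector Control. `write_msix_entry` is a name made up for this example.

```python
import struct

MSIX_ENTRY_SIZE = 16  # four 32-bit words per table entry


def write_msix_entry(table: bytearray, index: int, address: int, data: int,
                     masked: bool = False) -> None:
    """Write one MSI-X table entry into `table` (little-endian)."""
    if address & 0x3:
        raise ValueError("message address must be DWORD aligned")
    if address >> 64:
        raise ValueError("message address must fit in 64 bits")
    struct.pack_into(
        "<IIII", table, index * MSIX_ENTRY_SIZE,
        address & 0xFFFFFFFF,   # Message Address (low 32 bits)
        address >> 32,          # Message Upper Address
        data & 0xFFFFFFFF,      # Message Data
        1 if masked else 0,     # Vector Control, bit 0 is the Mask bit
    )
```

Because every entry carries its own address and data, each vector can target a different destination, which is the third capability in the list above.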
The MSI-X capability structure also defines the PBA, the pending bit array. The PBA is also implemented as a table in device memory. It is located through “PBA BIR” and “PBA Offset”, just like the MSI-X table.
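Finding the pending bit for a given vector is simple index arithmetic. The sketch below assumes the PBA is an array of 64-bit words with one bit per vector, so vector n lives in word n / 64 at bit n mod 64. The PBA contents are passed in as raw little-endian bytes, and the helper names are illustrative.

```python
def pba_bit_position(vector: int) -> tuple:
    """Return (qword index, bit index) of the pending bit for `vector`."""
    return divmod(vector, 64)


def is_pending(pba: bytes, vector: int) -> bool:
    """Check the pending bit of `vector` in a raw copy of the PBA."""
    qword, bit = pba_bit_position(vector)
    value = int.from_bytes(pba[qword * 8:(qword + 1) * 8], "little")
    return bool((value >> bit) & 1)
```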
But what is the function of the PBA, or of a pending bit? Here is an extract from the PCIe spec.
“Per-vector masking is managed through a Mask and Pending bit pair
per MSI vector or MSI-X Table entry. An MSI vector is masked when its associated Mask bit is set. An MSI-X vector is masked when its
associated MSI-X Table entry Mask bit or the MSI-X Function Mask
bit is set. While a vector is masked, the function is prohibited from
sending the associated message, and the function must set the
associated Pending bit whenever the function would otherwise send
the message. When software unmasks a vector whose associated
Pending bit is set, the function must schedule sending the
associated message, and clear the Pending bit as soon as the message
has been sent.”
As mentioned, if the device wants to send a specific interrupt but that interrupt is masked by the host, the device sets the corresponding pending bit. When the host unmasks the interrupt, the device sends the interrupt to the host and then clears the pending bit.
One PCIe device can have multiple functions. Each function is permitted to implement both MSI and MSI-X, but at most one of the two can be enabled at a time. Per the PCIe spec, one function may use MSI while another function uses MSI-X. However, some host software only allows one interrupt type per PCIe device, in which case the functions either all use MSI or all use MSI-X.
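The mask and pending interplay for a single vector can be modeled in a few lines. This is a toy state machine written for this post, not device firmware. The class `MsixVectorModel` and its methods are hypothetical, the vector starts masked purely for the sake of the example, and `sent` simply counts the messages that would go out on the link.

```python
class MsixVectorModel:
    """Toy model of one MSI-X vector with its Mask and Pending bits."""

    def __init__(self):
        self.masked = True    # starts masked in this example
        self.pending = False
        self.sent = 0         # number of MSI-X messages sent so far

    def device_event(self) -> None:
        """Device has a reason to interrupt on this vector."""
        if self.masked:
            # Not allowed to send while masked: remember it in the Pending bit.
            self.pending = True
        else:
            self.sent += 1

    def host_set_mask(self, masked: bool) -> None:
        """Host writes the Mask bit of this vector."""
        self.masked = masked
        if not masked and self.pending:
            # Unmasking with Pending set: send the message, then clear Pending.
            self.sent += 1
            self.pending = False
```

Note that several events arriving while the vector is masked collapse into a single pending bit, so the host sees one message after unmasking, not one per event.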