Re: memory barriers in bus_dmamap_sync() ?
On Jan 11, 2012, at 10:10 AM, Ian Lepore wrote:
> On Wed, 2012-01-11 at 09:59 -0700, Scott Long wrote:
>>
>> Where barriers _are_ needed is in interrupt handlers, and I can
>> discuss that if you're interested.
>>
>> Scott
>>
>
> I'd be interested in hearing about that (and in general I'm loving the
> details coming out in your explanations -- thanks!).
>
> -- Ian
>
>
Well, I unfortunately wasn't as clear as I should have been. Interrupt
handlers need bus barriers, not CPU cache/instruction barriers. This is
because the interrupt signal can arrive at the CPU before the data and
control words are finished being DMA'd up from the controller. Also,
many controllers require an acknowledgement write to be performed before
leaving the interrupt handler, so the driver needs to do a bus barrier
to ensure that the write flushes. But these are two different topics,
so let me start with the interrupt handler.
Legacy interrupts in PCI are carried on discrete pins and are level
triggered. When the device wants to signal an interrupt, it asserts the
pin. That assertion is seen at the IOAPIC on the host bridge and
converted to an interrupt message, which is then sent immediately to the
CPU's LAPIC. This all happens very, very quickly. Meanwhile, the
interrupt condition could have been predicated on the device DMA'ing
bytes up to host memory, and those DMA writes could have gotten stalled
and buffered on the way up the PCI topology. The end result is often
that the driver interrupt handler runs before those writes have hit host
memory. To fix this, drivers do a read of a card register as the first
step in the interrupt handler, even if the read is just a dummy and the
result is thrown away. Thanks to PCI ordering rules, the read will
ensure that any pending writes from the card have flushed all the way
up, and everything will be coherent by the time the read completes.
MSI and MSI-X interrupts on modern PCI and PCIe fix this. These
interrupts are sent as byte messages that are DMA'd to the host bridge.
Since they are in-band data, they are subject to the same ordering rules
as all other data on the bus, and thus ordering for them is implicit.
When the MSI message reaches the host bridge, it's converted into an
LAPIC message just like before. However, the driver doesn't need to do
a flushing read, because it knows that the MSI message was the last
write on the bus; therefore everything prior to it has arrived and
everything is coherent. Since reads are expensive on PCI, this saves a
considerable amount of time in the driver. Unfortunately, it adds
non-deterministic latency to the interrupt, since the MSI message is
in-band and has no way to force priority flushing on a busy bus. So
while MSI/MSI-X save some time in the interrupt handler, they actually
make the overall latency situation potentially worse (thanks Intel!).
The acknowledgement write issue is a little more straightforward. If
the card requires an acknowledgement write from the driver to know that
the interrupt has been serviced (so that it'll then know to de-assert
the interrupt line), that write has to be flushed to the hardware before
the interrupt handler completes. Otherwise, the write could get
stalled, the interrupt could remain asserted, and the interrupt could
erroneously re-trigger on the host CPU. I've seen cases where this
devolves into the card getting out of sync with the driver to the point
that interrupts get missed. Also, this gets a little weird sometimes
with buggy MSI hacks in both device and PCI bridge hardware.
Scott
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"