Re: using ConnectX card as Ethernet (mlxen)
Hi,
On 2014-1-20, at 21:59, John Baldwin <jhb@freebsd.org> wrote:
> I believe this should work, yes. Getting a crashdump or the panic messages
> would be really helpful in figuring out why it isn't. Thanks.
I rebuilt the kernel, and see no crashes anymore. So that's good.
But there are a bunch of other issues that maybe someone has some ideas about:
(1) Late attach
The ConnectX-3 attaches very late during the boot process, after the system is already in single-user mode. See the attached dmesg; pci17 and pci18 (there are two identical cards in this system) first show as "no driver attached" during the PCI bus enumeration. Only after the system is in single-user mode does mlx4_core attach to the cards.
That means that setting sysctls for these cards in /etc/sysctl.conf, or configuring their IP addresses via rc.conf, is not possible. At the moment, I work around this by sleeping in rc.local and then doing the assignments there, but that's a hack.
Any clues why these cards attach so late?
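As a possible refinement of the rc.local workaround, a fixed sleep could be replaced by polling until the driver's sysctl node exists. This is only a minimal sketch: the helper name wait_for and the 60-second limit are made up for illustration, and the sysctl node name is the one used in the commands further down in this message.

```shell
#!/bin/sh
# wait_for TIMEOUT CMD [ARGS...]   (hypothetical helper, not part of FreeBSD)
# Re-run CMD about once per second until it succeeds or TIMEOUT seconds
# have passed. Returns 0 on success, 1 on timeout.
wait_for() {
    timeout=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        [ "$timeout" -gt 0 ] || return 1
        timeout=$((timeout - 1))
        sleep 1
    done
    return 0
}

# Possible use in /etc/rc.local (the 60-second limit is an arbitrary choice):
#
#   if wait_for 60 sysctl -n sys.device.mlx4_core0.mlx4_port1; then
#       sysctl sys.device.mlx4_core0.mlx4_port1=eth
#   fi
```

This is still a workaround for the late attach, not a fix; it only avoids guessing how long to sleep.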
(2) Device numbers change
After booting, these cards show up in InfiniBand mode:
ib0: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.21
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib1: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Then I force one into Ethernet mode:
# sysctl sys.device.mlx4_core0.mlx4_port1=eth
sys.device.mlx4_core0.mlx4_port1: auto (ib) -> eth
and the device numbers on the ib devices change: ib1 is now ib4, and I have a new mlxen0 device.
ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:21
        inet6 fe80::f652:14ff:fe10:d121%mlxen0 prefixlen 64 scopeid 0xe
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
ib4: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.4a.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
When I change another port into Ethernet mode
# sysctl sys.device.mlx4_core0.mlx4_port2=eth
sys.device.mlx4_core0.mlx4_port2: auto (ib) -> eth
device numbers change again. Now mlxen0 disappears and becomes mlxen1, and I have a new mlxen2 device:
ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:21
        inet6 fe80::f652:14ff:fe10:d121%mlxen1 prefixlen 64 scopeid 0xe
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
mlxen2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:22
        inet6 fe80::f652:14ff:fe10:d122%mlxen2 prefixlen 64 scopeid 0xf
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
Changing the other two ports (on the second card) to Ethernet mode
# sysctl sys.device.mlx4_core1.mlx4_port1=eth
sys.device.mlx4_core1.mlx4_port1: auto (ib) -> eth
# sysctl sys.device.mlx4_core1.mlx4_port2=eth
sys.device.mlx4_core1.mlx4_port2: auto (ib) -> eth
leaves me with mlxen1, mlxen2, mlxen4 and mlxen5:
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:21
        inet6 fe80::f652:14ff:fe10:d121%mlxen1 prefixlen 64 scopeid 0xe
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
mlxen2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:22
        inet6 fe80::f652:14ff:fe10:d122%mlxen2 prefixlen 64 scopeid 0xf
        inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (40Gbase-CR4 <full-duplex,rxpause,txpause>)
        status: active
mlxen4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d0:d1
        inet6 fe80::f652:14ff:fe10:d0d1%mlxen4 prefixlen 64 scopeid 0x10
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
mlxen5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d0:d2
        inet6 fe80::f652:14ff:fe10:d0d2%mlxen5 prefixlen 64 scopeid 0x11
        inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
Needless to say, having devices change numbers is problematic.
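One way scripts could cope with the renumbering is to key on the MAC address instead of the unit number, since the ether lines above stay the same while the names change (f4:52:14:10:d1:21 is first mlxen0 and later mlxen1). The following is only a sketch under that assumption; the function name ifname_by_mac is hypothetical, and it expects ifconfig-style text on standard input.

```shell
#!/bin/sh
# ifname_by_mac MAC   (hypothetical helper)
# Read ifconfig-style output on stdin and print the name of the first
# interface whose "ether" line equals MAC. Prints nothing if none matches.
ifname_by_mac() {
    awk -v mac="$1" '
        # Interface header line, e.g. "mlxen1: flags=8843<...> metric 0 mtu 1500"
        $1 ~ /:$/ && $2 ~ /^flags=/ { name = $1; sub(/:$/, "", name) }
        # Link-layer address line belonging to the most recent header
        $1 == "ether" && $2 == mac  { print name; exit }
    '
}

# Possible use:
#
#   ifn=$(ifconfig | ifname_by_mac f4:52:14:10:d1:22)
#   [ -n "$ifn" ] && ifconfig "$ifn" inet 10.0.0.1/24   # address is made up
```

This does not stop the numbers from changing; it only lets later configuration steps find the right port regardless of what it is currently called.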
(3) 40G TCP performance
I barely get over 10G with netperf over the 40G interfaces:
root@one:~ # netperf -H two-mlxen2 -- -s512k -S512K
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to two-mlxen2.muclab () port 0 AF_INET : histogram : interval : dirty data : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
524288 512000  512000   10.07    10268.01
Any clues as to what could be limiting performance here?
Thanks,
Lars