Re: NFS 75 second stall

Board: FB_stable · Posted: 15 years ago (2010/09/02 02:01) · Votes: 0 (0/0/0)
Comments: 0, participants: 0, thread position: 12/13
On 07/01/10 15:23, Garrett Cooper wrote:
> On Thu, Jul 1, 2010 at 11:51 AM, alan bryan<alan.bryan@yahoo.com> wrote:
>>
>> --- On Thu, 7/1/10, Garrett Cooper<yanefbsd@gmail.com> wrote:
>>
>>> From: Garrett Cooper<yanefbsd@gmail.com>
>>> Subject: Re: NFS 75 second stall
>>> To: "alan bryan"<alan.bryan@yahoo.com>
>>> Cc: freebsd-stable@freebsd.org
>>> Date: Thursday, July 1, 2010, 11:13 AM
>>> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan<alan.bryan@yahoo.com> wrote:
>>>> Setup:
>>>>
>>>> server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
>>>> client - FreeBSD 8.0-Release. Running a test php script that
>>>> copies around various files to/from 2 separate NFS mounts.
>>>>
>>>> Situation:
>>>>
>>>> script is started (forked to do 20 simultaneous runs) and 20 1GB
>>>> files are copied to the NFS dir which works fine. When it then
>>>> switches to reading those files back and simultaneously writing to
>>>> the other NFS mount I see a hang of 75 seconds. If I do an "ls -l"
>>>> on the NFS mount it hangs too. After 75 seconds the client has
>>>> reported:
>>>>
>>>> nfs server 192.168.10.133:/usr/local/export1: not responding
>>>> nfs server 192.168.10.133:/usr/local/export1: is alive again
>>>> nfs server 192.168.10.133:/usr/local/export1: not responding
>>>> nfs server 192.168.10.133:/usr/local/export1: is alive again
>>>>
>>>> and then things start working again. The server was originally
>>>> FreeBSD 8.0-Release also but was upgraded to the latest stable to
>>>> see if this issue could be avoided.
>>>>
>>>> # nfsstat -s -W -w 1
>>>>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>>>>       0      0      0    222    257      0      0      0
>>>>       0      0      0    178    135      0      0      0
>>>>       0      0      0     85    127      0      0      0
>>>>       0      0      0      0      0      0      0      0
>>>>       0      0      0      0      0      0      0      0
>>>>       0      0      0      0      0      0      0      0
>>>>       0      0      0      0      0      0      0      0
>>>>       0      0      0      0      0      0      0      0
>>>> ... for 75 rows of all zeros
>>>>
>>>>       0      0      0    272    266      0      0      0
>>>>       0      0      0    167    165      0      0      0
>>>>
>>>> I also tried runs with 15 simultaneous processes and 25. 15
>>>> processes gave only about a 5 second stall but 25 gave again the
>>>> same 75 second stall.
>>>>
>>>> Further, I tested with 2 mounts to the same server but from ZFS
>>>> filesystems with the exact same stall/timeout periods. So, it
>>>> doesn't appear to matter what the underlying filesystem is - it's
>>>> something in NFS or networking code.
>>>>
>>>> Any ideas on what's going on here? What's causing the complete
>>>> stall period of zero NFS activity? Any flaws with my testing
>>>> methods?
>>>>
>>>> Thanks for any and all help/ideas.
>>>
>>> What network driver are you using? Have you tried tcpdumping the
>>> packets?
>>> -Garrett
>>>
>> I'm using igb currently but have also used em. I have not tried
>> tcpdumping the packets yet on this test. Any suggestions on things
>> to look out for (I'm not that familiar with that whole process).
>>
>> Which brings up another point - I'm using TCP connections for NFS,
>> not UDP.
>
> Is the net.inet.tcp.tso sysctl enabled or not? What about rxcsum and
> txcsum?
> Thanks,
> -Garrett

We're occasionally seeing these same types of stalls (+ repeated "is
not responding" "is alive again" messages in quick succession).

We're seeing it only on our 8.1-RELEASE systems against a variety of
NFS servers (6.3-RELEASE, 7.2-RELEASE, and 8-STABLE from before the
release of 8.1). We also see it happen with a variety of client
hardware and network adapters (em, bce, bge); the only common
denominator is 8.1-RELEASE on the clients.

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
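
For anyone comparing stall lengths across runs (75 seconds with 20 or 25 processes, about 5 with 15), here is a minimal sketch that counts the longest run of all-zero rows in captured `nfsstat -s -W -w 1` output. It assumes one sample row per second and that the column-header lines contain non-numeric text; the function name `longest_zero_run` is hypothetical and not part of any FreeBSD tool.

```python
def longest_zero_run(lines):
    """Return the longest run of consecutive all-zero sample rows."""
    longest = current = 0
    for line in lines:
        fields = line.split()
        # Skip blank lines and the periodically repeated column headers;
        # they do not interrupt a run of zero samples.
        if not fields or not all(f.isdigit() for f in fields):
            continue
        if all(int(f) == 0 for f in fields):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Shortened, illustrative sample in the same column layout as the post.
sample = """\
GtAttr Lookup Rdlink Read Write Rename Access Rddir
0 0 0 222 257 0 0 0
0 0 0 85 127 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 272 266 0 0 0
""".splitlines()

print(longest_zero_run(sample))
```

With a one-second sampling interval, the returned count approximates the stall duration in seconds as seen by the server.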
Article ID (AID): #1CVfJepG (FB_stable)