Re: cvs commit: src/sys/sys tls.h src/lib/libc/gen tls.c src/lib

Board: DFBSD_commit, posted 2005/03/29 02:02
:...
:> prefer NOT to do).  I did a quick timing test on sys_set_tls_area()
:> and it costs around 339ns on my AMD64 test cube.  But this is still
:> going to be far higher performing than having to call __tls_get_addr
:> all the time.  The procedure setup cost for figuring out the GOT offset
:> alone is 17ns on the same box.
:
:It's not about calling __tls_get_addr, but
:	mov %gs:0, %eax
:	mov a@NTPOFF(%eax), %eax
:vs.
:	mov %gs:a@NTPOFF, %eax
:
:The difference is one load instruction with a possible pipeline stall
:involved here. The difference should be zero once the base register is
:loaded.
:
:Joerg

    There's no pipeline stall there.  %gs:0 is likely to ALWAYS be in the
    L1 cache.  The %gs prefix itself can cost time versus a non-prefixed
    relative load instruction, so my guess is that it turns out to be a wash.

    Also keep in mind that GCC will cache the data loaded from %gs:0, which
    makes it even less of an issue (and potentially faster than %gs:OFFSET).

    I did a quick test with both the direct and indirect %gs models and
    couldn't see any difference in timing.

					Matthew Dillon
					<dillon@backplane.com>
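For anyone who wants to repeat the direct vs. indirect comparison, here is a minimal sketch of a timing harness. It is not the test that was actually run. The variable `tls_counter` and the functions `bump_tls()` and `time_bump_tls()` are made-up names for illustration, and it assumes a GCC-compatible compiler (for `__thread`) and a POSIX `clock_gettime()`.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical thread-local counter; the name is illustrative only.
 * volatile forces a real TLS load and store on every iteration, so the
 * compiler cannot collapse the loop into a single assignment.
 */
static __thread volatile uint32_t tls_counter;

/* Increment the thread-local n times and return the final value. */
uint32_t bump_tls(uint32_t n)
{
    tls_counter = 0;
    for (uint32_t i = 0; i < n; i++)
        tls_counter++;          /* one TLS load and one TLS store per pass */
    return tls_counter;
}

/* Average nanoseconds per iteration of bump_tls(n), using a monotonic clock. */
double time_bump_tls(uint32_t n)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    uint32_t got = bump_tls(n);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    assert(got == n);           /* sanity check: every increment happened */

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return n ? ns / n : 0.0;
}
```

On x86, GCC's `-mtls-direct-seg-refs` and `-mno-tls-direct-seg-refs` options select between the segment-relative form and the form that first loads the thread base pointer, so building the harness once with each and comparing the `time_bump_tls()` results (and the `objdump -d` output) is one way to check the "wash" guess on a given CPU. With `-m32` the segment register is %gs; a 64-bit build uses %fs instead. In the indirect build the compiler may hoist the thread-pointer load out of the loop, which is the caching effect mentioned above.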
Article ID (AID): #12I4QU00 (DFBSD_commit)