Re: [Question] Running SEM with PLS: problems, pros and cons?

Board: Statistics  Author: chenyutn (人生要死，何為苦心。)  Time: 2009/06/02 13:12


I ended up emailing Dr. Goodhue directly to ask.

  Dr. Goodhue,

  sorry for my poor English, I am a graduate student in Taiwan, having a
  question about your article "Statistical Power in Analyzing Interaction
  Effects: Questioning the Advantage of PLS With Product Indicators".

  I wonder how many bootstrapping resamples would be enough for estimating
  the parameter? Is that "the more, the better" making sense? And, why the
  five hundred resamples is the usual recommendation?

  An article of Stata (http://www.stata.com/support/faqs/stat/reps.html)
  suggests that "the right answer is that you should choose an infinite
  number of replications because, at a formal level, that is what the
  bootstrap requires", and recommend three steps for identifying the
  reasonable number of resamples:

  "1. Choose a large but tolerable number of replications. Obtain the
      bootstrap estimates.
   2. Change the random-number seed. Obtain the bootstrap estimates again,
      using the same number of replications.
   3. Do the results change meaningfully? If so, the first number you chose
      was too small. Try a larger number. If results are similar enough, you
      probably have a large enough number. To be sure, you should probably
      perform step 2 a few more times, but I seldom do."

  Is that true?

  I hope I've explained my questions well, and hope they make sense.
  Thanks for your help!

  Chen, Y-T

I just received Dr. Goodhue's reply:

  Chen,

  Attached is the material from the paper on the number of resamples. As you
  can see, we did carry out a version of what is recommended by Stata.

Part of that paper is indeed based on comparing different numbers of resamples; see Appendix E: Comparing Bootstrapping With 100 and 500 Resamples. So this agrees with the point I made earlier (I think it is fair to draw that conclusion).

※ Quoting chenyutn (人生要死，何為苦心。):
: ※ Quoting danny789 (這其中一定有什麼誤會):
: : To me, PLS is just a tool.
: : I only need to know how to use it, understand its assumptions and limitations, and be able to produce and interpret the output.
: : You can operate a computer, but do you know how semiconductors are manufactured? After all, a computer is just a tool.
: : Perhaps you are looking at this from a purely mathematical standpoint and think the more resamples the better.
: : But that amounts to over-manipulating the statistical tool. Would the resulting statistics really reflect the facts?
: : If you can cite literature showing that more resamples is better, I am willing to revise my view.
: : If, as you say, more resamples is always better, I have a reasonable doubt:
: : surely some of the many researchers doing this work would have mentioned it, but none have ...
: : At least none of the papers I have read mention it.
: : And I believe these researchers' computers are not that slow; setting resamples to 1,000,000 should not be a problem.
: : So I do not think this is a matter of computing speed.
: : I eventually found the PDF of Goodhue et al. (2007) (ISR ranks among the top five MIS journals).
: : The excerpt below may answer your question, so my recommendation is still that 500 is the appropriate setting,
: : because that is the value most researchers use.
: : It might be suggested that we should use bootstrapping
: : with 500 resamples (rather than 100). Five hundred
: : resamples is the usual recommendation when
: : using bootstrapping to estimate a parameter using a
: : single sample (Chin 1998). However, we draw 500
: : samples (500 researchers) from the same population
: : for each cell in our analysis, and use bootstrapping
: : with 100 resamples on each of those. This amounts to
: : 50,000 resamples for each cell, and hence we expect
: : that moving from 50,000 to 250,000 resamples in each
: : cell would not affect the outcome.
: The whole purpose of bootstrapping is to
: Estimate parameters that we don't know how to estimate analytically
: (Howell, 2002, http://tinyurl.com/q6v3c2).
: The following is taken from Stata's guidelines (http://www.stata.com/support/faqs/stat/reps.html).
: I did not bother translating it and will only highlight the key points.
: The passage tells us one thing:
: there is no fixed number to use, but a larger number will always give a more precise CI estimate.
: The only question is whether we need that much precision.
: I think danny789 was actually trying to say the same thing; I just fixated too much on the number 500 when I replied,
: because I feel that more precision is of course better. :P
: So the advice bmka gave in comments on earlier posts is very practical: try running 500 or 1,000 resamples,
: then compare against 2,000 to see whether the results differ much. If they do not, go ahead and report them.
: How large should the bootstrapped samples be relative to the total number
: of cases in the dataset?
: In terms of the number of replications, there is no fixed answer such as
: “250” or “1,000” to the question. The right answer is that you should
: choose an infinite number of replications because, at a formal level, that
: is what the bootstrap requires.
: The key to the usefulness of the bootstrap is that it converges in terms of
: numbers of replications reasonably quickly, and so running a finite number
: of replications is good enough—assuming the number of replications chosen
: is large enough.
: The above statement contains the key to choosing the right number of
: replications. Here is the recipe:
: 1. Choose a large but tolerable number of replications. Obtain the
:    bootstrap estimates.
: 2. Change the random-number seed. Obtain the bootstrap estimates
:    again, using the same number of replications.
: 3. Do the results change meaningfully? If so, the first number you chose
:    was too small. Try a larger number. If results are similar enough, you
:    probably have a large enough number. To be sure, you should probably
:    perform step 2 a few more times, but I seldom do.
: Whether results change meaningfully is a matter of judgment and has to be
: interpreted given the problem at hand. How accurately do you need the
: standard errors, confidence intervals, etc.? Often, a few digits of precision
: is good enough because, even if you had the standard error calculated
: perfectly, you have to ask yourself how much you believe your model in terms
: of all the other assumptions that went into it. For instance, in a Becker
: earnings model of the return to schooling, you might tell me that return is
: 6% with a standard error of 1, and I might believe you. If you told me the
: return is 6.10394884% and the standard error is .9899394, you have more
: precision but have not provided any additional useful information.

--
※ Posted from: 批踢踢實業坊 (ptt.cc)
◆ From: 114.42.90.187
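
The three-step recipe quoted from the Stata FAQ can be tried out in a few lines. What follows is only a rough sketch under stated assumptions: the statistic is the sample mean (not a PLS path coefficient as in the thread), the data are made up, and the function names `bootstrap_se` and `seed_spread` are hypothetical helpers, not part of Stata or any PLS package. It reruns the bootstrap under several seeds for a given number of replications and reports how far the standard-error estimates drift apart.

```python
import random
import statistics


def bootstrap_se(data, n_reps, seed):
    """Bootstrap standard error of the sample mean from n_reps resamples."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_reps):
        # Draw n observations with replacement from the original sample.
        resample = rng.choices(data, k=n)
        means.append(statistics.fmean(resample))
    # The spread of the resampled means estimates the standard error.
    return statistics.stdev(means)


def seed_spread(data, n_reps, seeds):
    """Relative range (max - min) / mean of the bootstrap SE across seeds."""
    estimates = [bootstrap_se(data, n_reps, seed) for seed in seeds]
    return (max(estimates) - min(estimates)) / statistics.fmean(estimates)


if __name__ == "__main__":
    # Hypothetical data: 200 draws from a normal distribution.
    data_rng = random.Random(0)
    data = [data_rng.gauss(50, 10) for _ in range(200)]
    seeds = [1, 2, 3]
    # Step 1: pick a number of replications; step 2: rerun with new seeds;
    # step 3: judge whether the estimates move by more than you care about.
    for n_reps in (500, 1000, 2000):
        print(n_reps, round(seed_spread(data, n_reps, seeds), 4))
```

If the printed spread at a given number of replications is already smaller than the precision you intend to report, that number is probably large enough in the sense of step 3; what counts as "small enough" remains a judgment call, as the FAQ says.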

Push  bmka  06/02 21:32
Push  chengjaylee  06/03 08:41


Article ID (AID): #1A9BJ167 (Statistics)