[Exam] 109-2 (Spring 2021) 陳信希 Natural Language Processing Final Exam

Board: NTU-Exam  Author: (River1Z)  Posted: 2021/07/09 15:32
Course name: Natural Language Processing
Course type: Department elective
Instructor: 陳信希
College: College of Electrical Engineering and Computer Science
Department: Department of Computer Science and Information Engineering
Exam date (Y/M/D): 2021/6/24
Time limit (minutes): 180
*Held online because of the pandemic; online resources could be consulted.

Questions:

1. Given the sentence “在 夫子廟 入口 遍布 我 喜歡 的 小吃店”, please show the results after (a) constituency parser, (b) noun phrase chunker, and (c) dependency parser. (15 points)

2. Assume an arc-standard dependency parser is adopted. Please show the actions to parse the sentence “在 夫子廟 入口 遍布 我 喜歡 的 小吃店”. (10 points)

3. Assume we have a set of four discourse relations, say temporal, contingency, comparison, and expansion, as defined in PDTB. Please judge whether “而” in each of the following sentences is a discourse connective. If yes, please specify the relation based on the connective. (20 points)
   (a) 1997 年發達國家經濟形勢的特點是[美國增長強勁]而[日本經濟疲弱]。
   (b) 開放起了[積極]而[關鍵]的作用。
   (c) [這當然不是歷史的巧合],而[是歷史的累積和轉接]。
   (d) [水東開發區是適應乙烯工程的需要]而[建立的一個後繼加工基地]。

4. In recent years, there have been important advances in the quality of state-of-the-art models, but those models are often less interpretable. Nowadays “explainable NLP” is an emerging research topic when we develop a model. The attention mechanism is a widely used operation to enable explanations. Please explain how it achieves “explanation.” (10 points)

5. Nowadays newspapers have become more partisan. Some research proposes a slant index that measures how often a media outlet uses phrases that sway readers to the left or the right. Some research investigates the demographic characteristics and political attitudes of newspaper readers in Taiwan from 1992 to 2004. These studies conclude that media are biased, i.e., left-wing vs. right-wing in the US and pan-green vs. pan-blue in Taiwan. Now you are asked to design an NN model to transform pan-green content into pan-blue content. Please show your idea. (10 points)

6. There are several ways to achieve semantic analysis. One possibility is a sequence-to-sequence model that transforms an NL sentence into a semantic form. Another possibility is to extract the most important parts from an NL sentence, such as Arg0, Arg1, and so on. Please explain the ideas behind these two possible solutions. (15 points)

7. One major disadvantage of skip-gram and CBOW is that they use the same representation for different senses of a word. Do you have any idea how to capture a suitable sense of a word based on its context? (10 points)

8. To automatically interpret the semantics of written language, the analysis and understanding of causal relationships between facts is a key point. The following shows three examples. The 2nd column shows a passage. The cause and the effect extracted from the passage are shown in the 3rd and 4th columns, respectively.
   https://imgur.com/a/GAyJN9h
   Assume you are given a cause-effect corpus consisting of passages with annotated cause and effect segments. You are asked to design a system to identify the cause and effect segments in a given passage. (10 points)

9. Because of privacy and security issues, electronic medical records (EMRs) have to be de-identified before being released for potential applications. According to HIPAA, 18 types of identifiable data must be removed, including names, telephone numbers, email addresses, IP addresses, social security numbers, medical record numbers, and so on. Do you have any ideas for dealing with this problem? (10 points)

--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 36.230.109.173 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/NTU-Exam/M.1625815940.A.751.html
※ Edited: eayaps1788 (36.230.109.173 Taiwan), 07/09/2021 15:39:39
※ Edited: eayaps1788 (36.230.109.173 Taiwan), 07/09/2021 15:40:11
※ Edited: eayaps1788 (36.230.109.173 Taiwan), 07/09/2021 15:51:11

07/09 18:44, 1F
Archived in the CSIE department digest!
Article ID (AID): #1Wv_k4TH (NTU-Exam)