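[Question] Is there a more efficient way to write this?
I have a CSV file with a very large number of rows,
and I want to remove the duplicate rows from it.
This is the only approach I could come up with: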
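import csv

rows = []
a = 0  # not used below; left over from the omitted counter
o = open("output.csv", "w")
f = open("input.csv", "r")
for row in csv.reader(f):
    # Rebuild each line from its first 11 columns and keep it in memory
    rows.append(row[0]+","+row[1]+","+row[2]+","+row[3]+","+row[4]+","+row[5]+","+row[6]+","+row[7]+","+row[8]+","+row[9]+","+row[10])
# set() drops duplicate lines; the original row order is not preserved
for i in set(rows):
    o.write(i+"\n")
f.close()
o.close()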
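However, the file has a great many rows and a lot of data (the CSV file is about 400 MB),
so a full run may take about five days (I wrote a counter to estimate this roughly; it is omitted here to save space).
Is there a more efficient way to write this?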
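
One possible direction, offered only as a sketch and not measured against the 400 MB file: read the input once, remember each row already seen in a set, and write a row the first time it appears. The function name dedupe_csv and the ncols parameter are hypothetical; the 11-column limit and the file names are taken from the code above. Unlike the original, this keeps the first occurrence of each row in input order, does not fail on rows shorter than 11 columns, and writes through csv.writer so fields containing commas are quoted.

```python
import csv


def dedupe_csv(src_path, dst_path, ncols=11):
    """Copy src_path to dst_path, keeping only the first copy of each row.

    Only the first `ncols` columns are compared and written, mirroring the
    11 columns used in the question. Returns the number of rows written.
    """
    seen = set()  # one tuple per distinct row; must fit in memory
    kept = 0
    # newline="" is what the csv module documentation recommends for files
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            key = tuple(row[:ncols])  # tuples are hashable, lists are not
            if key not in seen:
                seen.add(key)
                writer.writerow(key)
                kept += 1
    return kept
```

It could be called as dedupe_csv("input.csv", "output.csv"). The set still has to hold every distinct row, so memory use grows with the number of unique rows rather than with the total row count.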
--
※ Posted from: PTT (批踢踢實業坊, ptt.cc), from: 182.234.196.206
※ Article URL: http://www.ptt.cc/bbs/Python/M.1408860343.A.6B1.html