8. File Operations

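Python's built-in open function returns a file object, which is opened in text mode by default. All file operations go through the methods of this file object; the basic ones are listed in the table below.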

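Operation	Description
`file = open('app.log', 'w')`	Open a file object for writing, truncating any existing content
`file = open('app.log', 'a')`	Open a file object for appending; writes start at the end of the file
`file.write(aString)`	Write a string to the file
`file.writelines(aList)`	Write the strings in a list to the file (no newlines are added)
`file = open(r'C:\data.csv', 'r')`	Open a text file object for reading
`aString = file.read()`	Read the entire file into a string
`aString = file.read(N)`	Read up to the next N characters into a string (N bytes in binary mode)
`aString = file.readline()`	Read the next line (including the '\n' character) into a string
`aList = file.readlines()`	Read the whole file into a list of lines (each including its '\n' character)
`file = open('music.bin', 'w+b')`	Open a binary file object for reading and writing, truncated to 0 bytes
`file = open('music.bin', 'r+b')`	Open an existing binary file object for reading and writing, without truncating
`file.close()`	Close the file object manually
`file.flush()`	Push buffered data to the operating system (also call `os.fsync(file.fileno())` to force it onto the physical disk)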
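The write and read methods in the table can be tried together in a short round trip. This is only a minimal sketch: the temporary directory and the file name `app.log` inside it are assumptions chosen for illustration, so that no existing file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (illustration only).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'app.log')

# 'w' creates or truncates the file; writelines() does not add newlines itself.
with open(path, 'w', encoding='utf-8') as f:
    f.write('first line\n')
    f.writelines(['second line\n', 'third line\n'])

# 'a' keeps the existing content and appends at the end.
with open(path, 'a', encoding='utf-8') as f:
    f.write('fourth line\n')

with open(path, encoding='utf-8') as f:
    head = f.read(5)             # the next 5 characters (text mode)
    rest_of_line = f.readline()  # the remainder of the current line, including '\n'
    remaining = f.readlines()    # every remaining line as a list

print(repr(head))          # 'first'
print(repr(rest_of_line))  # ' line\n'
print(remaining)           # ['second line\n', 'third line\n', 'fourth line\n']
```

Each read continues from where the previous one stopped, which is why `readline()` returns only the rest of the first line.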

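When a file is accessed in text mode, its content is decoded (with the platform's default encoding unless one is specified) and returned as str objects. In binary mode no decoding takes place, and the content is returned as bytes objects.
A file object opened for reading can be used as an iterable, yielding one line per iteration.
In CPython, a file object that is no longer referenced is closed automatically when it is reclaimed, so a missing close() often goes unnoticed. Other implementations do not guarantee this timing, so explicitly closing every file you open is a good habit.
When accessing file resources, use a context manager or try-finally so that the file is reliably closed.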
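The difference between text mode and binary mode can be seen by reading the same file both ways. The sketch below assumes a hypothetical scratch file `sample.txt` in a temporary directory; `newline='\n'` is passed when writing only so that the byte count is identical on every platform.

```python
import os
import tempfile

# Hypothetical scratch file (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')

# newline='\n' turns off newline translation while writing.
with open(path, 'w', encoding='utf-8', newline='\n') as f:
    f.write('你好\n')

# Text mode decodes the bytes and returns str.
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# Binary mode skips decoding and returns the raw bytes.
with open(path, 'rb') as f:
    raw = f.read()

print(type(text).__name__, len(text))  # str 3
print(type(raw).__name__, len(raw))    # bytes 7
print(raw.decode('utf-8') == text)     # True
```

The two Chinese characters take three bytes each in UTF-8, so three characters of text occupy seven bytes on disk.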

§ Reading and Writing Text Files

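If no encoding argument is given when a file is opened in text mode, the platform's default encoding is used (for example cp950, Microsoft's Big5 variant, on Traditional Chinese Windows). Specifying utf-8 explicitly is recommended, as it makes exchanging files with others easier.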

In [1]:

# Write a new file
outfile = open('hello.txt', 'w', encoding='utf-8')
# write() returns the number of characters written, not the number of bytes
nchars = outfile.write('Hello Python\n')
print('Wrote', nchars, 'characters')
nchars = outfile.write('你好，拍神\n')
print('Wrote', nchars, 'characters')
outfile.close()

Wrote 13 characters
Wrote 6 characters
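To see why an explicit encoding matters, the following sketch writes a file as utf-8 and then reads it back twice, once with the matching encoding and once with a deliberately wrong one. The scratch path and the choice of `ascii` as the mismatched encoding are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'hello.txt')

with open(path, 'w', encoding='utf-8') as outfile:
    outfile.write('你好，拍神\n')

# Reading with the matching encoding round-trips the text.
with open(path, encoding='utf-8') as infile:
    good = infile.read()

# Reading with a mismatched encoding fails here; some other
# mismatched encodings would instead silently return garbled text.
try:
    with open(path, encoding='ascii') as infile:
        infile.read()
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

print(repr(good))         # '你好，拍神\n'
print(mismatch_detected)  # True
```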

In [2]:

# Open the file just written and read its content; mode defaults to 'rt' and can be omitted
hello_string = open('hello.txt', encoding='utf-8').read()
#hello_string
print(hello_string)

Hello Python
你好，拍神

In [3]:

# A for loop over a file object reads one line per iteration
for line in open('hello.txt', encoding='utf-8'):
    print(line, end='')

Hello Python
你好，拍神

§ File Operations - Using a Context Manager

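Using a file object as a context manager replaces the try/finally pattern. Once the file is opened and the with block is entered, the file is guaranteed to be closed automatically on leaving the block, whether or not the code inside raises an exception.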

with expression [as variable]:
    statements

Nested context managers can be written as

with exprA [as varA], exprB [as varB]:
    statements

In [4]:

# The same program written with try-except-finally
# fin = open('hello.txt', encoding='utf-8')
# try:
#     for line in fin:
#         print(line, end='')
# except Exception:
#     print('something is wrong')
# finally:
#     fin.close()

# The with version is more readable
with open('hello.txt', encoding='utf-8') as fin:
    for line in fin:
        print(line, end='')

Hello Python
你好，拍神
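The guarantee that the file is closed even when the block fails can be checked directly through the file object's `closed` attribute. This is a small sketch under assumptions: the scratch file and the deliberately raised `ValueError` are only there to simulate a failure inside the with block.

```python
import os
import tempfile

# Hypothetical scratch file (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'hello.txt')
with open(path, 'w', encoding='utf-8') as fout:
    fout.write('Hello Python\n')

try:
    with open(path, encoding='utf-8') as fin:
        first = fin.readline()
        # Simulate an error in the middle of processing.
        raise ValueError('simulated failure inside the with block')
except ValueError:
    pass

# The name fin is still bound after the block, but the file is closed.
print(repr(first))  # 'Hello Python\n'
print(fin.closed)   # True
```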

In [5]:

# Use nested context managers to transcode: read a utf-8 file and write a utf-16 file
with open('hello.txt', encoding='utf-8') as fin, open('uhello.txt', 'w', encoding='utf-16') as fout:
    for line in fin:
        fout.write(line)

In [6]:

# Use nested context managers for a simple file comparison
with open('hello.txt', encoding='utf-8') as fu8, open('uhello.txt', encoding='utf-16') as fu16:
    for (linenum, (u8line, u16line)) in enumerate(zip(fu8, fu16)):
        if u8line != u16line:
            print('line #{} differs\tfile1:"{}",\tfile2:"{}"'.format(linenum, u8line[:-1], u16line[:-1]))
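The comparison above prints nothing because the two files hold the same text. As an illustration of the non-matching case, the sketch below wraps the same idea in a hypothetical helper, `differing_lines`, and runs it on two scratch files whose second lines differ. The helper name, the file names, and the 1-based line numbering are assumptions, not part of the original notebook.

```python
import os
import tempfile

def differing_lines(path_a, enc_a, path_b, enc_b):
    """Return (line number, line_a, line_b) for each pair of lines that differ."""
    with open(path_a, encoding=enc_a) as fa, open(path_b, encoding=enc_b) as fb:
        # zip() stops at the end of the shorter file, so extra
        # trailing lines in the longer file are not reported.
        return [(num, a.rstrip('\n'), b.rstrip('\n'))
                for num, (a, b) in enumerate(zip(fa, fb), start=1)
                if a != b]

# Hypothetical scratch files (illustration only).
workdir = tempfile.mkdtemp()
path_u8 = os.path.join(workdir, 'a.txt')
path_u16 = os.path.join(workdir, 'b.txt')
with open(path_u8, 'w', encoding='utf-8') as f:
    f.write('Hello Python\n你好，拍神\n')
with open(path_u16, 'w', encoding='utf-16') as f:
    f.write('Hello Python\n你好，Python\n')

diffs = differing_lines(path_u8, 'utf-8', path_u16, 'utf-16')
print(diffs)  # [(2, '你好，拍神', '你好，Python')]
```

Because both files are decoded to str before comparison, the different on-disk encodings do not count as a difference; only the text does.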

§ File Operations - Using Comprehensions

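When a file is opened inside a list or dict comprehension, the temporary file object is closed by CPython's garbage collection once the expression has been evaluated; other implementations may close it later.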

In [8]:

# rstrip() removes the trailing newline (and any other trailing whitespace) before the line goes into the list
[line.rstrip() for line in open('hello.txt', encoding='utf-8')]

Out[8]:

['Hello Python', '你好，拍神']

In [9]:

# Skip lines whose first character is the comment marker '#'
[line.rstrip() for line in open('uhello.txt', encoding='utf-16') if line[0] != '#']

Out[9]:

['Hello Python', '你好，拍神']

In [10]:

# Record each line in a dict keyed by its (0-based) line number
{key: line.rstrip() for key, line in enumerate(open('hello.txt', encoding='utf-8'))}

Out[10]:

{0: 'Hello Python', 1: '你好，拍神'}
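When the moment of closing should not depend on garbage collection, a comprehension can be placed inside a with block. The sketch below is one possible way to combine the two, using a hypothetical scratch file that contains a comment line.

```python
import os
import tempfile

# Hypothetical scratch file (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'hello.txt')
with open(path, 'w', encoding='utf-8') as fout:
    fout.write('# comment\nHello Python\n你好，拍神\n')

# The comprehension is fully evaluated before the with block exits,
# and the file is closed as soon as the block is left.
with open(path, encoding='utf-8') as fin:
    lines = [line.rstrip() for line in fin if not line.startswith('#')]

print(lines)       # ['Hello Python', '你好，拍神']
print(fin.closed)  # True
```

This keeps the compact comprehension syntax while making the close explicit on any Python implementation.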