Python Course #15: Reading and Writing Files

So far, you can store data within the variables of a program; however, when the program ends, this data is gone. And when you start the program again, all the data you want to work with has to be entered again. To store data throughout multiple runs of a program (or, to be correct, persist the data), you need to write it into a file and read from it again. In this article, you will learn how to read and write files in Python.

The Python with Keyword

Before you can start reading a file, a new keyword has to be introduced: with. It is needed because a file is an unmanaged resource. You have complete control over the code you write in your program. When interacting with files, you lose some of this control because the operating system or other programs can interact with the same file, which can lead to errors. The with statement and its with-block do not make those errors disappear, but they guarantee that the file is closed properly, even when an error occurs inside the block. This reduces the risk of leaving the file in a broken state when your program fails. Usually, you would use a try/finally block in Python for such unmanaged resources; however, the with keyword provides a shortcut.

with expression [as variable]:
    with-block
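
To see what the with statement saves you from writing, here is a minimal sketch that compares it with a manual try/finally block. The file name demo.txt and all function names are made up for this illustration, and the file is created in a temporary directory so nothing else on your disk is touched. The helper uses the 'w' mode, which is explained later in this article.

```python
import os
import tempfile


def make_demo_file(directory):
    # Hypothetical helper: creates a small file to read from.
    path = os.path.join(directory, "demo.txt")
    with open(path, "w") as file:
        file.write("hello")
    return path


def read_with_try_finally(path):
    # Manual version: close() sits in the finally part so that it
    # runs even if read() raises an error.
    file = open(path)
    try:
        content = file.read()
    finally:
        file.close()
    return content, file.closed


def read_with_statement(path):
    # Same behavior, but the with statement calls close() for you.
    with open(path) as file:
        content = file.read()
    return content, file.closed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = make_demo_file(directory)
        print(read_with_try_finally(path))  # ('hello', True)
        print(read_with_statement(path))    # ('hello', True)
```

In both versions, file.closed is True afterwards, which shows that the file was handed back to the operating system.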

Reading a File in Python

As an example file, you can use the following text, which consists of the first four lines of The Zen of Python (PEP 20). Create a new file with a plain text editor, store it as zen.txt, and make sure it is in the same directory as the Python code you are going to write throughout this article:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.

The following code reads the entire content of a file into a single variable:

if __name__ == "__main__":

    file_content = ""

    with open('zen.txt') as file:
        file_content = file.read()

    print(file_content)

The file_content variable will contain the whole text inside the file after it is read. The file zen.txt is opened in the line with open('zen.txt') as file:, which uses the with statement followed by the function open(...), which takes the file path as an argument. The return value of open(...) is a file object (a TextIOWrapper) that has read access to the file passed to open(...). In the example above, the with statement assigns the return value of open(...) to the variable file, as indicated by the as keyword. With the method .read(), the entire content of the file is read.

Because a with-block is used, the file doesn't have to be closed manually. The with-block takes care of closing the file and telling the operating system that your program no longer uses it, so that other applications can open and change it.

Reading a File Line-by-Line in Python

Reading the entire content of a file at once with .read() is an easy way to store the content of a small file in a variable. However, with larger files this becomes problematic because the entire file content is held in the computer's memory, which can slow down your program and your computer when the file size exceeds the available amount of memory. Therefore, it is better to read a file line-by-line and only pick out the parts that are needed for your program:

if __name__ == "__main__":

    with open('zen.txt') as file:
        while True:
            line = file.readline()

            if not line:
                break

            print(line, end="")

In the code example above, the method .readline() is used instead of .read(). .readline() returns one line of the file every time it is called. The end of a line is determined by the newline character, which is \n on macOS and Linux and \r\n on Windows; in text mode, Python translates both to \n when reading. It starts with the first line, and after the last line has been returned, every subsequent call returns an empty str of length 0. With the if not line statement, this empty string is detected, and the while loop is stopped with the break keyword. If line is not empty, the string is printed. The newline that print(...) normally adds is suppressed with end="" because the string containing the line already has a newline at its end.
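
As an alternative to the while loop with .readline(), a file object can also be used directly in a for loop, which yields one line per iteration and stops at the end of the file. The following is a minimal sketch of that approach; the function names are made up for this illustration, and the sample text is written to a temporary directory instead of using your zen.txt.

```python
import os
import tempfile

ZEN = (
    "Beautiful is better than ugly.\n"
    "Explicit is better than implicit.\n"
    "Simple is better than complex.\n"
    "Complex is better than complicated.\n"
)


def make_zen_file(directory):
    # Hypothetical helper: writes the sample text into a temporary file.
    path = os.path.join(directory, "zen.txt")
    with open(path, "w") as file:
        file.write(ZEN)
    return path


def collect_lines(path):
    lines = []
    with open(path) as file:
        # The file object hands out one line per loop iteration,
        # so only a single line is held in memory at a time.
        for line in file:
            lines.append(line)
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        for line in collect_lines(make_zen_file(directory)):
            print(line, end="")
```

Just like with .readline(), every line still carries its newline character at the end, which is why end="" is used when printing.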

Python File Positions: tell() and seek()

The TextIOWrapper stores the current file position internally. The file position can be imagined as a cursor: every time something is read from the file, the cursor moves forward by the number of characters that were read. To get the current position, use .tell():

if __name__ == "__main__":
    with open('zen.txt') as file:
        print(file.tell())
        file.readline()
        print(file.tell())
        file.read(1)
        print(file.tell())

The output of the code example above:

0
31
32

When nothing has been read from a file, the position is 0. When you call .readline(), all characters up to and including the newline character are read. The first line of zen.txt contains 31 characters, including the newline character; because of this, the position after .readline() is 31 (assuming the file contains only ASCII characters and uses \n as line ending). When reading a single character with .read(1), the position moves forward by one. The number passed to .read() states how many characters should be read, while no number means that all characters from the current position to the end of the file are read.
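
The difference between .read() with and without a number can be sketched as follows. The function names are made up for this illustration, and the sample text is written to a temporary directory instead of using your zen.txt.

```python
import os
import tempfile

ZEN = (
    "Beautiful is better than ugly.\n"
    "Explicit is better than implicit.\n"
)


def make_zen_file(directory):
    # Hypothetical helper: writes the sample text into a temporary file.
    path = os.path.join(directory, "zen.txt")
    with open(path, "w") as file:
        file.write(ZEN)
    return path


def read_in_two_steps(path):
    with open(path) as file:
        first = file.read(9)  # exactly nine characters
        rest = file.read()    # everything from the current position to the end
    return first, rest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        first, rest = read_in_two_steps(make_zen_file(directory))
        print(first)          # Beautiful
        print(rest, end="")   # the remaining text, starting with " is better"
```

The second call continues where the first one stopped, because both calls share the same file position.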

You can also move the file position without reading a character by using the .seek() method. In text mode, which is the mode you have used so far, .seek() only accepts an absolute position measured from the beginning of the file; to move relative to the current position, the file has to be opened as a binary file, which is not part of this article. However, in combination with .tell(), a lot can be achieved already. For example, the following code reads only every second character from a file:

if __name__ == "__main__":
    with open('zen.txt') as file:
        while True:
            s = file.read(1)

            file.seek(file.tell()+1)

            if not s:
                break

            print(s, end="")

The output of the code example above:

Batfli etrta gy
xlcti etrta mlct
ipei etrta ope.Cmlxi etrta opiae

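With file.read(1), a single character is read from the file, moving the position forward by 1. With file.tell(), the current file position (the position of the next character that would be read) is retrieved. Then 1 is added to this position, and the result is set as the new file position with .seek(...); therefore, every second character in the file is skipped. Note that this arithmetic works for a plain ASCII file like zen.txt; for text files in general, the Python documentation only guarantees .seek() for positions that were returned by .tell() or for 0.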
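
A use of .seek() that stays within what text mode supports is to remember a position with .tell() and jump back to it later. Here is a minimal sketch of that idea; the function names are made up for this illustration, and the sample text is written to a temporary directory instead of using your zen.txt.

```python
import os
import tempfile

ZEN = (
    "Beautiful is better than ugly.\n"
    "Explicit is better than implicit.\n"
    "Simple is better than complex.\n"
)


def make_zen_file(directory):
    # Hypothetical helper: writes the sample text into a temporary file.
    path = os.path.join(directory, "zen.txt")
    with open(path, "w") as file:
        file.write(ZEN)
    return path


def read_second_line_twice(path):
    with open(path) as file:
        file.readline()           # skip the first line
        mark = file.tell()        # remember where the second line starts
        second = file.readline()
        file.seek(mark)           # jump back to the remembered position
        again = file.readline()
    return second, again


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        second, again = read_second_line_twice(make_zen_file(directory))
        print(second, end="")  # Explicit is better than implicit.
        print(again, end="")   # Explicit is better than implicit.
```

Because the position passed to .seek() comes straight from .tell(), no assumption about the size of individual characters is needed.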

Writing a File in Python

Before writing to a file, it is essential to look deeper into the open(...) function. open(...) does not just take the file path and name as an argument. It also takes a mode. The default mode is open for reading, which you used when reading files. The default mode prevents the program from changing the file in any form. The mode is passed as a string of one or more characters. For example, to explicitly select the default mode, open for reading, you would write the following:

open('zen.txt', 'r')

This table shows all available modes together with their characters:

Character  Meaning
r          open for reading (default)
w          open for writing, truncate the file first
x          create a new file and open it for writing
a          open for writing, append to the end of the file if it exists
b          binary mode
t          text mode (default)
+          open a disk file for updating (reading and writing)

As you can see, to write into a file, it has to be opened in w, x, or a mode, or in r mode combined with +. The most interesting mode is + because it is combined with other modes to open a file for both reading and writing, while the other character determines the initial position. The most common combinations are:

Characters  Meaning
r           read-only, initial position at the beginning
r+          reading and writing, initial position at the beginning
w           write-only, remove all existing content from the file, initial position at the beginning
w+          reading and writing, remove all existing content from the file, initial position at the beginning
a           write-only, the file is created if it doesn't exist, initial position at the end
a+          reading and writing, the file is created if it doesn't exist, the initial position is at the end
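
The write-only modes x, a, and w from the tables above can be compared with a short sketch. The file name notes.txt and the function name are made up for this illustration, and the file lives in a temporary directory.

```python
import os
import tempfile


def demo_modes(directory):
    path = os.path.join(directory, "notes.txt")

    # 'x' creates the file and fails if it already exists.
    with open(path, "x") as file:
        file.write("first\n")

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a") as file:
        file.write("second\n")
    with open(path) as file:
        after_append = file.read()

    # A second 'x' on the same path raises FileExistsError.
    try:
        with open(path, "x"):
            pass
        x_failed = False
    except FileExistsError:
        x_failed = True

    # 'w' removes the existing content before writing.
    with open(path, "w") as file:
        file.write("third\n")
    with open(path) as file:
        after_write = file.read()

    return after_append, x_failed, after_write


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(demo_modes(directory))
        # ('first\nsecond\n', True, 'third\n')
```

The x mode is a reasonable choice when overwriting an existing file by accident would be a problem, since it refuses to touch a file that is already there.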

Writing into a file with r+:

if __name__ == "__main__":
    with open('zen.txt', 'r+') as file:
        file.write("abc")

        file.seek(0)
        file_content = file.read()

        print(file_content)

The code example above opens the file zen.txt for reading and writing and sets the initial position to the beginning of the file. With file.write("abc"), three characters are written into the file starting from position 0, replacing the first three characters of the word Beautiful. Then the position is set to the beginning, and the whole file is read and printed:

abcutiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.

Running the same code with w+ first truncates the file and then writes into the now empty file, which results in the following output:

abc

And last but not least, using a+ on the original zen.txt opens the file and sets the initial position to the end of the file; therefore, abc is appended when file.write("abc") is called, resulting in the following output:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.abc
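
Since each of the three runs above changes the file, the original content has to be restored before the next mode is tried. The following sketch does this automatically and compares r+, w+, and a+ side by side. The function name write_abc is made up for this illustration, and the sample text is written to a temporary directory instead of using your zen.txt.

```python
import os
import tempfile

ZEN = (
    "Beautiful is better than ugly.\n"
    "Explicit is better than implicit.\n"
    "Simple is better than complex.\n"
    "Complex is better than complicated."
)


def write_abc(path, mode):
    # Restore the original content so every mode starts from the same file.
    with open(path, "w") as file:
        file.write(ZEN)

    # Write "abc" in the given mode, then read the whole file back.
    with open(path, mode) as file:
        file.write("abc")
        file.seek(0)
        return file.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "zen.txt")
        for mode in ("r+", "w+", "a+"):
            print(mode)
            print(write_abc(path, mode))
```

With r+ the first three characters are overwritten, with w+ only abc remains, and with a+ abc ends up after the last line.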

There are more complex and versatile ways of reading and writing files in Python, such as the binary mode mentioned above. However, this article focuses on the basics as a starting point for Python beginners. Make sure to get the free Python Cheat Sheets in my Gumroad shop. If you have any questions about this article, feel free to join our Discord community and ask them there.

This post is licensed under CC BY 4.0 by the author.