Suppose we have a very large text file in which each line contains ON, OFF, both ON and OFF, or some other text that starts with ON or OFF, and we only want to know how many lines consist of just ON and how many of just OFF.
Let's find out how we can do that.
Download the attached "on_off.txt" plain text file and put it in the same directory as the program below:
    """Parsing text files in Python"""

    # Open and read the file
    with open("on_off.txt") as f:
        lines = f.readlines()

    lines = [x.strip('\n') for x in lines]  # Optional: drop trailing newlines
    print(lines)

    # Look for patterns
    countON = 0
    countOFF = 0
    for line in lines:
        # Remove surrounding whitespace and convert to uppercase
        line = line.strip().upper()
        if line.find("ON") != -1 and len(line) == 2:
            countON += 1
        if line.find("OFF") != -1 and len(line) == 3:
            countOFF += 1

    print("ON:", countON)
    print("OFF:", countOFF)
Here is an explanation of what the code does:
- The "with open(...)" statement opens the file, stores the resulting file object in the variable "f", and closes the file automatically when the block ends
- The f.readlines() method reads all the lines in the file and returns them as a Python list
- countON and countOFF are the counters for ON and OFF, both initialized to zero
- The for loop processes each line separately: "line.strip()" removes white space before and after the text we are interested in (here ON and OFF), and the upper() method converts the text to uppercase
- We check the length of each line with "len()" to make sure that the text we are interested in has the intended length and is not embedded in anything else
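Because the file is described as very large, it may be worth avoiding f.readlines(), which loads every line into memory at once. The following is a minimal sketch, not the original program: it iterates over the file object one line at a time and uses an equality test, which accepts the same lines as the find() plus len() check described above. The function name count_on_off and the sample file contents are assumptions made purely for illustration.

```python
from collections import Counter
import os
import tempfile


def count_on_off(path):
    """Count lines that are exactly ON or OFF (case-insensitive)."""
    counts = Counter()
    with open(path) as f:
        for line in f:  # the file object yields one line at a time
            word = line.strip().upper()
            if word in ("ON", "OFF"):
                counts[word] += 1
    return counts["ON"], counts["OFF"]


# Hypothetical sample data standing in for on_off.txt
sample = "ON\noff\nONLINE\n  On  \nOFFICE\nON and OFF\nOFF\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

on_count, off_count = count_on_off(tmp.name)
os.remove(tmp.name)  # clean up the temporary sample file

print("ON:", on_count)    # ON: 2
print("OFF:", off_count)  # OFF: 2
```

Lines such as ONLINE, OFFICE and "ON and OFF" are skipped because, after stripping and uppercasing, they are not equal to ON or OFF.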
Open a file and read its contents
file = open("on_off.txt", "r")
To write to the file instead, open it in "w" mode. Be aware that this erases any existing contents:
file = open("on_off.txt", "w")
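The two open() modes can be tried together in a short round trip. This is only a sketch under stated assumptions: it uses a hypothetical file named demo_on_off.txt inside a temporary directory rather than the real on_off.txt, because opening an existing file in "w" mode erases its contents.

```python
import os
import tempfile

# Hypothetical demo file, kept in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_on_off.txt")

    # "w" creates the file, or truncates it if it already exists
    with open(path, "w") as f:
        f.write("ON\nOFF\nON\n")

    # "r" opens the file for reading and is the default mode
    with open(path, "r") as f:
        contents = f.read()

print(contents.splitlines())  # ['ON', 'OFF', 'ON']
```

Using "with" here means each file is closed automatically; a bare file = open(...) needs a matching file.close() call.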