Python Iterating Through Two Files By Line At The Same Time

I am trying to compare columns in two files to see if the values match, and if there is a match I want to merge/concatenate the data for that row together. My issue is that when re

Solution 1:

Use the zip builtin function.

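with open(file1) as f1, open(file2) as f2:
    # zip pairs the lines up and stops at the end of the shorter file
    for line1, line2 in zip(f1, f2):
        # first whitespace-separated column of each line
        motif1 = line1.split()[0]
        motif2 = line2.split()[0]
        ...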

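Note that zip behaves differently in Python 2 and Python 3. In Python 3 it returns a lazy iterator, so only one pair of lines is held in memory at a time. In Python 2 it builds the full list of pairs up front, so itertools.izip is the more efficient choice there. In both versions, iteration stops at the end of the shorter file.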
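To tie this back to the question, here is a minimal sketch of the compare-and-merge step. It assumes whitespace-separated columns with the key in the first column, and that the rows to compare sit on the same line number in both files. The helper name `merge_matching` and the sample data are made up for illustration, and `io.StringIO` objects stand in for the two open files.

```python
import io


def merge_matching(lines_a, lines_b):
    """Yield merged rows where the first column of both lines is equal.

    Hypothetical helper: assumes whitespace-separated columns and that
    the rows to compare are at the same position in both inputs.
    """
    for line_a, line_b in zip(lines_a, lines_b):
        cols_a = line_a.split()
        cols_b = line_b.split()
        # Skip blank lines, which have no first column to compare.
        if not cols_a or not cols_b:
            continue
        if cols_a[0] == cols_b[0]:
            # Keep the key once, then the remaining columns of both rows.
            yield cols_a + cols_b[1:]


# In-memory stand-ins for two open files (sample data is invented).
f1 = io.StringIO("AAA 1 2\nCCC 3 4\nGGG 5 6\n")
f2 = io.StringIO("AAA x\nTTT y\nGGG z\n")

merged = list(merge_matching(f1, f2))
for row in merged:
    print(" ".join(row))
```

Because the helper accepts any iterable of lines, the same call should work with real file objects opened as in the snippet above.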

Solution 2:

I'm assuming you're using Python 3. Here's a nice abstraction, iterlines. It hides the complexity of opening, reading, pairing, and closing n files. Note the use of zip_longest, which prevents the ends of longer files from being silently discarded. Shorter files are padded with fillvalue (None by default).

def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
            # Keep closing the remaining files even if one close() fails.
            with suppress(Exception):
                file_.close()

Usage

for line_a, line_b in iterlines('a.txt', 'b.txt'):
    print(line_a, line_b)

Complete code

from contextlib import suppress
from itertools import zip_longest


def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
            # Keep closing the remaining files even if one close() fails.
            with suppress(Exception):
                file_.close()


for lines in iterlines('a.txt', 'b.txt', 'd.txt'):
    print(lines)
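The difference between zip and zip_longest is easiest to see on in-memory lists standing in for files of unequal length. The sample data below is invented. With the default fillvalue of None, calling a string method such as .split() on a missing line would raise AttributeError, so passing fillvalue='' may be more convenient when every line gets split.

```python
from itertools import zip_longest

# Lists of lines standing in for a three-line file and a one-line file.
a = ["a1\n", "a2\n", "a3\n"]
b = ["b1\n"]

# zip stops at the shorter input: the last two lines of `a` are dropped.
short_pairs = list(zip(a, b))

# zip_longest keeps going and pads the shorter input with None.
long_pairs = list(zip_longest(a, b))

# An empty-string fill value keeps every item a str, so .split() is safe.
padded = list(zip_longest(a, b, fillvalue=""))

print(short_pairs)
print(long_pairs)
print(padded)
```

The same fillvalue argument is accepted by iterlines, which passes it straight through to zip_longest.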
