I am trying to parse a text file that looks like this using pandas:
some random text more random text may of different length junk 0 9 8 0 1 1 5 5 5 more random text interdispersed 123 321 2 junk 55 1 9 1 2 3
The file is space delimited. I only care about the lines that start with 'good', which all have the same formatting. I believe read_table() is the right command, but I don't know how to filter it.
My current method of parsing these files is to open the file, use a regex to match the lines I care about, and then split each line on spaces. This can be slow, and I am looking for a faster and cleaner way.
You don't need a regex to match lines that start with "good". Just iterate over the file and throw away all the other lines, creating a "clean" copy of the data you want:
    # Copy only the lines that start with 'good' into a new file.
    with open('irregular.txt') as infile, open('regular.txt', 'w') as outfile:
        for line in infile:
            if line.startswith('good'):
                outfile.write(line)
Then you can read "regular.txt" using read_table or read_csv with the delim_whitespace=True argument.
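If you would rather not write an intermediate file, the same filtering can be done in memory and handed straight to pandas. The following is only a sketch: `read_good_lines` is a made-up helper name, it assumes every kept line has the same number of fields and that at least one line matches, and it uses `sep=r"\s+"` because recent pandas releases deprecate `delim_whitespace`.

```python
import io

import pandas as pd


def read_good_lines(path, prefix="good"):
    """Hypothetical helper: load only the lines starting with `prefix`."""
    # Keep the wanted lines; str.startswith is enough, no regex needed.
    with open(path) as infile:
        kept = [line for line in infile if line.startswith(prefix)]

    # Feed the filtered text to pandas through an in-memory buffer.
    # sep=r"\s+" splits on runs of whitespace; header=None because the
    # file has no header row, so columns are numbered 0, 1, 2, ...
    return pd.read_csv(io.StringIO("".join(kept)), sep=r"\s+", header=None)
```

Calling `read_good_lines('irregular.txt')` would return a DataFrame whose column 0 holds the 'good' marker and whose remaining columns hold the values. For very large files, writing the filtered copy to disk as shown earlier keeps memory use lower.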