python - Parsing a non-regular text file in pandas


I am trying to parse a text file that looks like this using pandas:

    some random text
    more random text that may be of different length
    junk 0 9 8
    good 0 1 1
    good 5 5 5
    more random text interdispersed
    good 123 321 2
    junk 55 1 9
    good 1 2 3

The file is space-delimited. I only care about the lines that start with 'good', which all have the same formatting.

I believe read_table() is the right command, but I don't know how to filter the lines with it.

My current method of parsing these files is to open the file, use a regex to match the lines I care about, and then split each line on spaces. This can be slow, and I am looking for a faster and cleaner way.

You don't need a regex to match lines that start with "good". Just iterate over the file and throw away all the other lines, creating a "clean" copy of the data you want:

    # Copy only the lines that start with 'good' into a clean file.
    with open('irregular.txt') as infile, open('regular.txt', 'w') as outfile:
        for line in infile:
            if line.startswith('good'):
                outfile.write(line)

Then you can read "regular.txt" using read_table or read_csv with the delim_whitespace=True argument.
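If writing the intermediate "regular.txt" file is not wanted, the same filter can feed pandas directly from memory. The following is only a minimal sketch of that variation, not part of the original answer: the helper names `keep_prefixed_lines` and `read_good_lines` are hypothetical, it assumes the data lines have no header row, and it uses `sep=r"\s+"`, which splits on runs of whitespace in the same way as `delim_whitespace=True`.

```python
import io


def keep_prefixed_lines(lines, prefix="good"):
    """Join only the lines that start with `prefix` into one string."""
    return "".join(line for line in lines if line.startswith(prefix))


def read_good_lines(path, prefix="good"):
    """Load the matching lines of `path` into a DataFrame (hypothetical helper)."""
    # Imported lazily so the filtering helper above works without pandas.
    import pandas as pd

    with open(path) as infile:
        cleaned = keep_prefixed_lines(infile, prefix)
    # sep=r"\s+" splits on runs of whitespace; header=None because the
    # data lines are assumed to carry no column names.
    return pd.read_csv(io.StringIO(cleaned), sep=r"\s+", header=None)
```

Calling `read_good_lines('irregular.txt')` should give one row per matching line, with the 'good' marker in the first column. Note that pandas raises an error if no line matches, since there is then nothing to parse.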

