I want to extract the words that have the following label: "w=". For example, I need "that have received no" from the string below.
w="that" v="22.23092" a="19.09109" i="3"/> <r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> <r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> <r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> <r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/> <r s="1481070" d="4330" w="-s-" v="11.55029" a="19.09137" i="0"/> <r s="1485400" d="4170" w="-s-" v="11.88606" a="19.09137" i="0"/>
I have been trying to use the following regex:
matches = re.findall('(?<=[w][=])\w+',line)
However, it does not seem to work. Please help.
Something like this:
>>> import re
>>> re.findall(r'w="(\w+)"', strs, re.DOTALL)
['that', 'have', 'received', 'no']
Then use str.join to get a single string:
>>> " ".join(re.findall(r'w="(\w+)"', strs, re.DOTALL))
'that have received no'
where strs is:
>>> print(strs)
w="that" v="22.23092" a="19.09109" i="3"/> <r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> <r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> <r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> <r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/> <r s="1481070" d="4330" w="-s-" v="11.55029" a="19.09137" i="0"/> <r s="1485400" d="4170" w="-s-" v="11.88606" a="19.09137" i="0"/>
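For reference, here is a self-contained sketch of the same approach as a script. The original pattern `(?<=[w][=])\w+` finds nothing because the character right after `w=` is a double quote, which `\w` does not match. The names `WORD_ATTR`, `words` and `sentence` are only illustrative, the sample is shortened to the first five elements of the string above, and the `\b` before `w` is an extra precaution (not part of the pattern above) so that a longer attribute name ending in `w` would not match.

```python
import re

# Shortened copy of the sample input from the question.
strs = (
    'w="that" v="22.23092" a="19.09109" i="3"/> '
    '<r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> '
    '<r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> '
    '<r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> '
    '<r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/>'
)

# The question's pattern: after 'w=' comes '"', which \w cannot match.
original = re.findall(r'(?<=[w][=])\w+', strs)

# Match the quotes explicitly and capture only the word between them.
# Markers such as "-sil-" are skipped because '-' is not a \w character.
WORD_ATTR = re.compile(r'\bw="(\w+)"')
words = WORD_ATTR.findall(strs)

# Join the captured words into a single string.
sentence = " ".join(words)
print(sentence)
```

The re.DOTALL flag used above is harmless but has no effect here, since the pattern contains no `.`.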