python - Extract specific values from string


I want to extract the words that have the label "w=". For example, I need "that have received no" from the string below.

w="that" v="22.23092" a="19.09109" i="3"/> <r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> <r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> <r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> <r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/> <r s="1481070" d="4330" w="-s-" v="11.55029" a="19.09137" i="0"/> <r s="1485400" d="4170" w="-s-" v="11.88606" a="19.09137" i="0"/> 

I have been trying to use the following regex:

 matches = re.findall('(?<=[w][=])\w+',line) 

However, it does not seem to work. Please help.

Something like this:

>>> import re
>>> re.findall(r'w="(\w+)"', strs, re.DOTALL)
['that', 'have', 'received', 'no']

Then use str.join to combine them into a single string:

>>> " ".join(re.findall(r'w="(\w+)"', strs, re.DOTALL))
'that have received no'

where strs is:

>>> print(strs)
w="that" v="22.23092" a="19.09109" i="3"/> <r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> <r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> <r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> <r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/> <r s="1481070" d="4330" w="-s-" v="11.55029" a="19.09137" i="0"/> <r s="1485400" d="4170" w="-s-" v="11.88606" a="19.09137" i="0"/>
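
For reference, here is a self-contained sketch of the same idea as a script. The names `sample`, `W_ATTR`, and `extract_words` are made up for illustration, and `sample` is a shortened copy of the line from the question. It also shows why the lookbehind in the question finds nothing: the character right after `w=` is a double quote, which `\w` does not match.

```python
import re

# Shortened copy of the sample line from the question (illustrative only).
sample = (
    'w="that" v="22.23092" a="19.09109" i="3"/> '
    '<r s="1480150" d="150" w="have" v="20.66713" a="19.09183" i="3"/> '
    '<r s="1480300" d="360" w="received" v="18.70063" a="19.09165" i="2"/> '
    '<r s="1480660" d="200" w="-sil-" v="11.65527" a="19.09165" i="0"/> '
    '<r s="1480860" d="210" w="no" v="18.49828" a="19.09137" i="2"/>'
)

# Capture the value of w="..." only when it consists of word characters,
# so markers such as "-sil-" and "-s-" are skipped.
W_ATTR = re.compile(r'w="(\w+)"')


def extract_words(line):
    """Return the word-only values of every w="..." attribute in line."""
    return W_ATTR.findall(line)


# The pattern from the question: after "w=" comes a quote, not a word
# character, so \w+ can never start matching there.
original_attempt = re.findall(r'(?<=[w][=])\w+', sample)

words = extract_words(sample)
sentence = " ".join(words)

print(original_attempt)
print(words)
print(sentence)
```

Including the quote in the lookbehind, as in `(?<=w=")\w+`, should also work on this data. The `re.DOTALL` flag is left out here because it only changes how `.` matches, and the pattern contains no `.`.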
