python - REGEX to get seven numbers in a row?
Just wondering what the best regex to get 7, and only 7, numbers in a row is? Is there a way to use [0-9] 7 times succinctly? Or should I just use a few ???
The 7 numbers refer to a school district ID code and can appear anywhere on the school district's wiki page. They will be separated from other content by spaces.
Input: BeautifulSoup of these pages; the NCES D ID is on the right, in the table: https://en.wikipedia.org/wiki/anniston_city_schools Same thing here: https://en.wikipedia.org/wiki/huntsville_city_schools
Output: a 7-digit number representing the district ID, e.g. 1234567
Don't use a regular expression at all. Use an HTML parser, like BeautifulSoup:
    from urllib2 import urlopen, Request
    from bs4 import BeautifulSoup

    resp = urlopen(Request('https://en.wikipedia.org/wiki/anniston_city_schools',
                           headers={'User-Agent': 'Stack Overflow'}))
    soup = BeautifulSoup(resp.read())

    table = soup.find('table', class_='infobox')
    for row in table.find_all('tr'):
        if 'NCES' in row.th.text:
            nces = row.td.a.text
            print nces
            break
This loads the URL data, finds the "infobox" table, then the row with the NCES entry.
There are 12 exactly-7-digit numbers in the HTML source, but the above code extracts the correct number in one go.
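To illustrate the point about multiple 7-digit numbers, here is a minimal sketch of the regex route. The succinct way to repeat [0-9] seven times is the {7} quantifier, and the lookarounds keep it from matching 7 digits inside a longer number. The sample text and the name SEVEN_DIGITS are made up for this example and are not taken from the actual Wikipedia page.

```python
import re

# [0-9]{7} is the short form of [0-9] written seven times.
# The lookbehind and lookahead reject matches that are part of a
# longer run of digits (for example a 10-digit phone number).
SEVEN_DIGITS = re.compile(r'(?<![0-9])[0-9]{7}(?![0-9])')

# Hypothetical page text, invented for illustration only.
sample_text = 'NCES District ID 1234567 Phone 2565551234 Updated 7654321 Zip 36201'

matches = SEVEN_DIGITS.findall(sample_text)
print(matches)  # ['1234567', '7654321']
```

The pattern itself works, but it returns every 7-digit number in the text, so it cannot tell which one is the district ID. That is why looking up the NCES row in the infobox table is the more reliable approach.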