Unable to Access Child Node When Parsing XML with Python


I am new to the Python scripting language and am working on a parser that parses a web-based XML file.

I am able to retrieve all but one of the elements using minidom in Python with no issues, but I have one node that I am having trouble with. The last node I require from the XML file is the 'url' within the 'image' tag, which can be found in the following XML file example:

    <events>
        <event id="abcde01">
            <title> name of event </title>
            <url> url of event <- the url tag I do not need </url>
            <image>
                <url> the url I need </url>
            </image>
        </event>

Below I have copied brief sections of code that I feel may be of relevance. I would appreciate any help retrieving this last image url node. I have also included what I have tried and the error I received when I ran the code in GAE. The Python version I am using is Python 2.7, and I should point out that I am saving the values within an array (for later input to a database).

    # Imports assumed; the original excerpt does not show them.
    import urllib
    import webapp2
    import xml.dom.minidom as mdom

    class XMLParser(webapp2.RequestHandler):
        def get(self):
            base_url = 'http://api.eventful.com/rest/events/search?location=dublin&date=today'
            # downloads the data from the XML file:
            response = urllib.urlopen(base_url)
            # converts the data to a string
            data = response.read()
            unicode_data = data.decode('utf-8')
            data = unicode_data.encode('ascii', 'ignore')
            # closes the file
            response.close()
            # parses the XML that was downloaded
            dom = mdom.parseString(data)
            node = dom.documentElement  # needed for declaration of variable
            # all event elements found in the Eventful XML
            event_main = dom.getElementsByTagName('event')

            # urls list parsing - attempt -
            urls_list = []
            for im in event_main:
                # this is the line that raises the IndexError shown below
                image_url = im.getElementsByTagName("image")[0].childNodes[0]
                urls_list.append(image_url)

The error I receive is the following. Any help is appreciated, Karen

    image_url = im.getElementsByTagName("image")[0].childNodes[0]
    IndexError: list index out of range

First of all, do not re-encode the content. There is no need to do so; XML parsers are perfectly capable of handling encoded content.
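As a rough illustration of that point, the sketch below feeds undecoded UTF-8 bytes straight to minidom and compares the result with the decode/encode round trip from the question. The sample document and the event title are made up for the example; they are not taken from the Eventful feed.

```python
import xml.dom.minidom as mdom

# Hypothetical sample: UTF-8 encoded bytes, roughly as a server might return them.
raw = (u'<?xml version="1.0" encoding="UTF-8"?>'
       u'<events><event id="abcde01"><title>Caf\xe9 Ceol</title></event></events>'
       ).encode('utf-8')

# Parse the undecoded bytes directly; the parser reads the declared encoding itself.
dom = mdom.parseString(raw)
title = dom.getElementsByTagName('title')[0].firstChild.data

# For comparison: the ASCII round trip from the question silently drops characters.
lossy = raw.decode('utf-8').encode('ascii', 'ignore')
lossy_title = mdom.parseString(lossy).getElementsByTagName('title')[0].firstChild.data

print(repr(title))
print(repr(lossy_title))
```

In this sketch the directly parsed title keeps its accented character, while the round-tripped one loses it, so the extra step costs data without helping the parser.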

Next, I'd use the ElementTree API for a task like this:

    from xml.etree import ElementTree as ET

    # base_url is the same search URL used in the question
    response = urllib.urlopen(base_url)
    tree = ET.parse(response)

    urls_list = []
    for event in tree.findall('.//event[image]'):
        # find the text content of the first <image><url> tag combination:
        image_url = event.find('.//image/url')
        if image_url is not None:
            urls_list.append(image_url.text)

This only considers event elements that have a direct image child element.
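To see the filtering without hitting the network, here is a small self-contained sketch that runs the same lookup on an inline document. The sample XML, its ids and its example.com URLs are invented for illustration; it also skips image elements whose url has no text, which is an extra check beyond the snippet above.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample data modelled on the structure shown in the question.
SAMPLE = """
<events>
    <event id="abcde01">
        <title>name of event</title>
        <url>http://example.com/event</url>
        <image>
            <url>http://example.com/image.png</url>
        </image>
    </event>
    <event id="abcde02">
        <title>event without an image</title>
        <url>http://example.com/other</url>
    </event>
    <event id="abcde03">
        <title>event with an empty image element</title>
        <image/>
    </event>
</events>
"""

root = ET.fromstring(SAMPLE)

urls_list = []
# The [image] predicate keeps only events that have an <image> child.
for event in root.findall('.//event[image]'):
    image_url = event.find('.//image/url')
    # An empty <image/> has no <url> child, so find() returns None for it.
    if image_url is not None and image_url.text:
        urls_list.append(image_url.text.strip())

print(urls_list)
```

Only the first event contributes a URL here: the second is excluded by the predicate, and the third passes the predicate but has no url child, which is the kind of case that can trigger an IndexError when indexing blindly.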

