Accessing XML on the web with XPath in Python

Background / what I want to achieve

I want to use the API of a site called UniProt from Python.
I want to access https://www.uniprot.org/uniprot/P12345.xml and retrieve "Aspartate aminotransferase, mitochondrial", which is located at uniprot/entry/protein/recommendedName/fullName. In short, I want to access an element with XPath, but it is not working.

Problem / error message

[]

Process finished with exit code 0

is all that is displayed. The code itself does not appear to raise any errors.

Relevant source code

python
import urllib.request
import xml.etree.ElementTree as ET

url = 'https://www.uniprot.org/uniprot/P12345.xml'
req = urllib.request.Request(url)

# Download the raw XML document as bytes
with urllib.request.urlopen(req) as response:
    xml_string = response.read()

root = ET.fromstring(xml_string)
# Unqualified path: this is the call that returns an empty list
name = root.findall(".//protein/recommendedName/fullName")
print(name)

What I tried

Running
name = root.findall(".//protein/recommendedName/fullName")
print(name)
prints only [], but running
print(root[0][3][0][0].text)
instead prints "Aspartate aminotransferase, mitochondrial", so I have confirmed that the value can be reached by index.

Additional information (framework/tool versions, etc.)

I am using Python 3.7 and PyCharm.

t_obara

2020/04/01 11:29

It is easy to experiment, so why not try lengthening and shortening the path and see how it behaves?

ssk1

2020/04/01 22:07

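Thank you for your reply. I forgot to mention it under "What I tried", but I did try adding and removing path steps in various ways. I also tried accessing other elements, but got the same result.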


2 answers

Best answer

I forgot to mention it under "What I tried", but I did try adding and removing path steps in various ways.

For example, take the following code:

python
names = root.findall(".//*")  # every descendant element of the root
for name in names:
    print(name)

Running it produces output like the following.

bash
<Element '{http://uniprot.org/uniprot}entry' at 0x10cbcbbd8>
<Element '{http://uniprot.org/uniprot}accession' at 0x10cbd41d8>
<Element '{http://uniprot.org/uniprot}accession' at 0x10cc18c28>
<Element '{http://uniprot.org/uniprot}name' at 0x10cc18c78>
<Element '{http://uniprot.org/uniprot}protein' at 0x10cc38318>
<Element '{http://uniprot.org/uniprot}recommendedName' at 0x10cc56318>

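Another answer has already pointed this out, but the output shows that every element name is prefixed with a namespace. That is why an unqualified path such as ".//protein/recommendedName/fullName" matches nothing.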
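The effect can be reproduced offline with a small hand-written document. The sketch below is only an illustration: SAMPLE_XML is a hypothetical, heavily simplified stand-in for the UniProt response (it is not the real file), and the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the UniProt XML response.
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <accession>P12345</accession>
    <protein>
      <recommendedName>
        <fullName>Aspartate aminotransferase, mitochondrial</fullName>
      </recommendedName>
    </protein>
  </entry>
</uniprot>"""

root = ET.fromstring(SAMPLE_XML)

# The default namespace becomes part of every tag name.
print(root.tag)

# An unqualified path therefore matches nothing.
unqualified = root.findall(".//protein/recommendedName/fullName")
print(unqualified)

# Spelling out the namespace in {uri}tag form matches the element.
NS = "{http://uniprot.org/uniprot}"
qualified = root.findall(f".//{NS}protein/{NS}recommendedName/{NS}fullName")
print([elem.text for elem in qualified])
```

Writing the full {uri}tag form gets verbose quickly, which is the reason for passing a prefix-to-URI dictionary instead.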

python
ns = {'uniprot': 'http://uniprot.org/uniprot'}
protein = root.find(".//uniprot:protein", ns)
# iter() walks the matched element and all of its descendants
for name in protein.iter():
    print(name)

Searching with the namespace specified, as above, gives the following result.

bash
<Element '{http://uniprot.org/uniprot}protein' at 0x107fe1318>
<Element '{http://uniprot.org/uniprot}recommendedName' at 0x107fff318>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff368>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff4f8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff548>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff638>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff688>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff6d8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff728>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff7c8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff818>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff8b8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff908>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff958>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fff9a8>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fff9f8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fffa48>
<Element '{http://uniprot.org/uniprot}alternativeName' at 0x107fffae8>
<Element '{http://uniprot.org/uniprot}fullName' at 0x107fffb38>
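
To get from this element list to the single text value the question asks for, the same kind of prefix mapping can be combined with the full path. This is a minimal sketch rather than a definitive solution: SAMPLE_XML is a hypothetical, simplified stand-in for the real response, the alternative name in it is a placeholder, and the helper name recommended_full_name and the prefix "up" are made up for this example.

```python
import xml.etree.ElementTree as ET

# The prefix "up" is arbitrary; only the URI has to match the document.
NAMESPACES = {"up": "http://uniprot.org/uniprot"}

# Hypothetical, simplified stand-in for the UniProt XML response.
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <protein>
      <recommendedName>
        <fullName>Aspartate aminotransferase, mitochondrial</fullName>
      </recommendedName>
      <alternativeName>
        <fullName>Placeholder alternative name</fullName>
      </alternativeName>
    </protein>
  </entry>
</uniprot>"""


def recommended_full_name(xml_text):
    """Return the text of protein/recommendedName/fullName, or None if absent."""
    root = ET.fromstring(xml_text)
    elem = root.find(".//up:protein/up:recommendedName/up:fullName", NAMESPACES)
    return elem.text if elem is not None else None


print(recommended_full_name(SAMPLE_XML))
```

Because ET.fromstring accepts bytes as well as str, the xml_string read from the HTTP response in the question could be passed to the helper unchanged.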