Extracting data from an XML file (similar to my previous question)

Question

    ABSTRACT abstract 81 Neurologic complications of COVID-19 infection have been recently described and include dizziness, headache, loss of taste and smell, stroke, and encephalopathy. Brain MRI in these patients have revealed various findings including ischemia, hemorrhage, inflammation, and demyelination. In this article, we report a case of critical illness-associated cerebral microbleeds identified on MRI in a patient with severe COVID-19 infection and discuss the potential etiologies of these neuroimaging findings. 9606 Species patients

Source code

    import xml.etree.ElementTree as ET
    tree = ET.parse('result.xml')
    root = tree.getroot()

### What I tried

In Python, is it possible to extract only the text that starts with "Neurologic" from the XML file while ignoring the `text` inside `annotation`?

Accepted Answer

BioC XML can also contain `sentence` and `relation` tags, as in the following example.

**result.xml**

```xml
ABSTRACT abstract 81 Neurologic complications of COVID-19 ... 9606 Species patients original sentence 70 Active Raf-1 phosphorylates and activates the mitogen-activated protein ...
```

The following uses the [lxml - Processing XML and HTML with Python](https://lxml.de/) module.

```python
from lxml import etree as ET

tree = ET.parse('result.xml')
root = tree.getroot()

# Keep every <text> element whose parent is not <annotation>
texts = [t.text for t in root.findall('.//text')
         if t.getparent().tag != 'annotation']
print(texts)

# ['Neurologic complications of COVID-19 ...', 'Active Raf-1 phosphorylates and activates the mitogen-activated protein ...']
```
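If installing lxml is not an option, the same filter can be sketched with the standard library. `xml.etree.ElementTree` elements have no `getparent()`, so this version builds a child-to-parent map first. The `SAMPLE` string is a hypothetical BioC-style layout written for illustration (the tag nesting and attributes are assumptions, not the actual contents of `result.xml`), and `non_annotation_texts` is a helper name made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC-style sample; the real result.xml may be laid out differently.
SAMPLE = """<collection>
  <document>
    <passage>
      <infon key="type">abstract</infon>
      <offset>81</offset>
      <text>Neurologic complications of COVID-19 ...</text>
      <annotation id="1">
        <infon key="identifier">9606</infon>
        <infon key="type">Species</infon>
        <text>patients</text>
      </annotation>
      <sentence>
        <offset>70</offset>
        <text>Active Raf-1 phosphorylates ...</text>
      </sentence>
    </passage>
  </document>
</collection>"""


def non_annotation_texts(root):
    """Return the text of every <text> element not directly inside <annotation>."""
    # ElementTree has no getparent(), so map each child element to its parent.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = []
    for t in root.iter('text'):
        parent = parent_of.get(t)  # None only if <text> is the root itself
        if parent is None or parent.tag != 'annotation':
            result.append(t.text)
    return result


root = ET.fromstring(SAMPLE)
texts = non_annotation_texts(root)
print(texts)
```

For a file on disk, `ET.parse('result.xml').getroot()` would take the place of `ET.fromstring(SAMPLE)`.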

Answer

```result.xml
ABSTRACT abstract 81 Neurologic complications of COVID-19 infection have been recently described and include dizziness, headache, loss of taste and smell, stroke, and encephalopathy. Brain MRI in these patients have revealed various findings including ischemia, hemorrhage, inflammation, and demyelination. In this article, we report a case of critical illness-associated cerebral microbleeds identified on MRI in a patient with severe COVID-19 infection and discuss the potential etiologies of these neuroimaging findings. 9606 Species patients
```

```py
import xml.etree.ElementTree as ET

tree = ET.parse('result.xml')
root = tree.getroot()

# find() returns the first <text> element in document order, or None
text = root.find('.//text')
print("no text element found" if text is None else text.text)
```

```text:Output
Neurologic complications of COVID-19 infection have been recently described and include dizziness, headache, loss of taste and smell, stroke, and encephalopathy. Brain MRI in these patients have revealed various findings including ischemia, hemorrhage, inflammation, and demyelination. In this article, we report a case of critical illness-associated cerebral microbleeds identified on MRI in a patient with severe COVID-19 infection and discuss the potential etiologies of these neuroimaging findings.
```
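Note that `find('.//text')` returns only the first `text` element in document order, so it gives the passage text only when that element comes before any `annotation`. A minimal sketch of a variant that does not depend on element order is shown below: it asks each `passage` for its direct `text` child. The two-passage `SAMPLE` and the helper name `passage_texts` are made up for illustration, and the nesting is an assumed BioC-style layout rather than the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC-style sample with two passages; in the first one the
# annotation deliberately appears before the passage text.
SAMPLE = """<collection><document>
  <passage>
    <infon key="type">title</infon>
    <offset>0</offset>
    <annotation id="1">
      <infon key="type">Species</infon>
      <text>patient</text>
    </annotation>
    <text>Example title with a patient mention</text>
  </passage>
  <passage>
    <infon key="type">abstract</infon>
    <offset>81</offset>
    <text>Neurologic complications of COVID-19 ...</text>
    <annotation id="2">
      <infon key="type">Species</infon>
      <text>patients</text>
    </annotation>
  </passage>
</document></collection>"""


def passage_texts(root):
    """Return the direct <text> child of each <passage>, skipping nested ones."""
    # The path 'text' (no './/') matches direct children only, so
    # annotation/text is never selected.
    return [p.findtext('text') for p in root.iter('passage')]


root = ET.fromstring(SAMPLE)
first_match = root.find('.//text').text  # first <text> anywhere in the tree
per_passage = passage_texts(root)
print(first_match)
print(per_passage)
```

In this sample `first_match` is the annotation text, while `per_passage` contains only the passage texts.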
