I want to read and process two text files

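Background / what I want to achieve

From the text files below, I want to write code that generates a file (output.csv) in which each name following ">" in fasta.txt (Abe, Sato, Seo) is paired with the "Five =" value at the corresponding position in emboss.txt (see "Desired output").

The files shown here are reduced, simplified versions of files about 1000 lines long, so I would appreciate an example of a complete solution that finds the relevant lines automatically instead of relying on hard-coded line numbers. Thank you in advance.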

Relevant source code

python
#!/usr/bin/python
# -*- coding: utf-8 -*-
import re

input_path1 = "fasta.txt"
input_path2 = "emboss.txt"
output_path = "output.csv"

with open(input_path1) as f:
    text1 = f.read()
with open(input_path2) as g:
    text2 = g.read()

# Remove line breaks that fall in the middle of a sequence
text1 = re.sub(r"([a-zA-Z])\n([a-zA-Z])", r"\1\2", text1)

with open(output_path, "w") as f:
    pass  # unclear how to proceed from here

Input file 1

fasta.txt
>Abe |
FLNALRRERV
>Sato |
FLNALRRERV
>Seo | 
YLNTLRKERV

Input file 2

emboss.txt
One = 11136.42  		Two = 105 \
Three  = 110.484 	Four   = 2.6 \
Five = 7.4662 \
Six  = 4470 \
Seven = 0.400 \
Eight = 0.915 \

One = 11166.39  		Two = 102 \
Three  = 109.474 	Four   = 2.5 \
Five = 7.4671 \
Six  = 4470 \
Seven = 0.400 \
Eight = 0.915 \

One = 11166.54  		Two = 110 \
Three  = 108.486 	Four   = 2.4 \
Five = 7.4674 \
Six  = 4470 \
Seven = 0.400 \
Eight = 0.915 \

Desired output

output.csv
Abe, 7.4662
Sato, 7.4671
Seo, 7.4674

Additional information (framework/tool versions, etc.)

macOS 10.15.4, Python 3.7.3, Atom


1 answer

Best answer

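Something like this should work.
I don't know the detailed specification, so the regular expressions that extract the names and the Five values are fairly rough.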

python
import re

input_path1 = "fasta.txt"
input_path2 = "emboss.txt"
output_path = "output.csv"

# Extract the names that follow ">" in fasta.txt
with open(input_path1) as f:
    text_list1 = [re.sub(r">([a-zA-Z]+)\s+\|\s*", r"\1", s) for s in f.read().splitlines() if re.match('>', s)]

# Extract the numeric values from the lines starting with "Five" in emboss.txt
# (the trailing backslash shown in the sample is treated as optional)
with open(input_path2) as f:
    text_list2 = [re.sub(r"Five = ([0-9.]+)\s*\\?\s*", r"\1", s) for s in f.read().splitlines() if re.match('Five', s)]

# Pair the two lists up and format each pair as a CSV line
result_list = [f'{text1}, {text2}' for text1, text2 in zip(text_list1, text_list2)]

# Write the output file
with open(output_path, mode='w') as f:
    f.write('\n'.join(result_list))
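Since the real files are around 1000 lines, it may be worth guarding against the two lists getting out of step, because `zip` silently stops at the shorter one. The following is only a sketch under assumptions: the helper names (`extract_names`, `extract_five`, `pair_rows`, `write_output`) are made up for illustration, names are assumed to contain no spaces or `|`, and each record in emboss.txt is assumed to have exactly one line beginning with `Five`.

```python
import re

# Hypothetical patterns based only on the sample files in the question.
NAME_RE = re.compile(r"^>\s*([^|\s]+)")                  # text after ">" up to a space or "|"
FIVE_RE = re.compile(r"^\s*Five\s*=\s*([-+0-9.eE]+)")    # number after "Five ="


def extract_names(fasta_text):
    """Return the name from every header line (lines starting with '>')."""
    names = []
    for line in fasta_text.splitlines():
        m = NAME_RE.match(line)
        if m:
            names.append(m.group(1))
    return names


def extract_five(emboss_text):
    """Return the value after 'Five =' from every matching line, as strings."""
    values = []
    for line in emboss_text.splitlines():
        m = FIVE_RE.match(line)
        if m:
            values.append(m.group(1))
    return values


def pair_rows(names, values):
    """Pair names and values by position; raise if the counts differ."""
    if len(names) != len(values):
        raise ValueError(f"{len(names)} names but {len(values)} 'Five' values")
    return [f"{name}, {value}" for name, value in zip(names, values)]


def write_output(fasta_path, emboss_path, output_path):
    """Read both input files and write one 'name, value' line per record."""
    with open(fasta_path) as f:
        names = extract_names(f.read())
    with open(emboss_path) as f:
        values = extract_five(f.read())
    with open(output_path, "w") as f:
        f.write("\n".join(pair_rows(names, values)) + "\n")
```

Calling `write_output("fasta.txt", "emboss.txt", "output.csv")` would then produce the file. Keeping the values as strings avoids any change in how the numbers are printed, and the length check turns a missing or extra `Five` line into an error instead of a quietly truncated output.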