Getting <a href> with Ruby and Nokogiri

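I am scraping with Ruby and want to get the contents of the <a href> on the target pages, but it is not working.
Changing the XPath expression did not help either, so I would appreciate your advice.

I want to get the contents of the href at the location indicated in the image.
Thank you in advance.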

require 'nokogiri'
require 'open-uri'
require 'csv'

# %w splits on whitespace only, so a trailing comma would become part of each URL
urls = %w(
    https://www.kfm.or.jp/fdb/registration/categorylist/5
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:2
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:3
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:4
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:5
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:6
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:7
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:8
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:9
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:10
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:11
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:12
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:13
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:14
    https://www.kfm.or.jp/fdb/registration/categorylist/5/page:11
)


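titles = []
charset = nil
urls.each do |url|
    # URI.open is the open-uri entry point; bare open(url) no longer works on Ruby 3
    html = URI.open(url) do |f|
        charset = f.charset # get the character encoding
        f.read # read the HTML and hand it to the variable html
    end

    # parse the HTML and build a document object
    doc = Nokogiri::HTML.parse(html, nil, charset)
    doc.xpath('//h4').each do |node|
        title = node.css('a').inner_text
        titles.push(title) # push the extracted text, not the node itself
    end
end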



# write the results to a CSV file
CSV.open('titele.csv', 'w') do |csv|
    csv << titles
end


1 answer

Best answer

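I cannot find anywhere in your code that actually looks at href.
Is the real problem that you do not know how to read an attribute?
If so, please state clearly what you are unsure about.

You can read an attribute with node["href"].

Each css("a") seems to return zero or one element, so: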

Ruby
doc.xpath('//h4').each do |node|
    a     = node.css('a')[0]
    next unless a # skip this <h4> if it contains no <a>
    title = a.text
    url   = a['href']
    # ～～～～
end

Alternatively, if you only care about the a tags, you can select them directly from the start:

Ruby
doc.xpath('//h4/a').each do |a|
    title = a.text
    url   = a['href']
    # ～～～～
end

Posted 2019/11/24 06:13

otn

Total score 84559

maaatuuu22

2019/11/26 13:33

otn, thank you for your answer. I will be more careful about how I ask questions from now on.
