What I want to achieve
Build a model using wav2vec2.
Ultimately I want to use my own data (wav), but for now I am using the librispeech dataset (flac).
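Background
On Google Colaboratory, I am trying to build a model using the librispeech dataset and wav2vec2-master, but I get an error.
Ultimately I want to apply this to my own data, but I cannot resolve the error even at the stage of building a model with the dataset.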
Problem / error message
2023-02-21 15:53:50 | INFO | train | task: audio_pretraining (AudioPretrainingTask)
2023-02-21 15:53:50 | INFO | train | model: wav2vec2 (Wav2Vec2Model)
2023-02-21 15:53:50 | INFO | train | criterion: wav2vec (Wav2vecCriterion)
2023-02-21 15:53:50 | INFO | train | num. model params: 95044608 (num. trained: 95044608)
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.1.0.bias
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.2.0.bias
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.3.0.bias
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.4.0.bias
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.5.0.bias
2023-02-21 15:53:50 | INFO | trainer | detected shared parameter: feature_extractor.conv_layers.0.0.bias <- feature_extractor.conv_layers.6.0.bias
2023-02-21 15:53:50 | INFO | train | training on 64 devices (GPUs/TPUs)
2023-02-21 15:53:50 | INFO | train | max tokens per GPU = 1400000 and max sentences per GPU = None
2023-02-21 15:53:50 | INFO | trainer | no existing checkpoint found /content/drive/MyDrive/colabo/model/path/checkpoint_last.pt
2023-02-21 15:53:50 | INFO | trainer | loading train data for epoch 1
2023-02-21 15:53:50 | INFO | dataload.audio.raw_audio_dataset | loaded 2637, skipped 37 samples
Traceback (most recent call last):
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/dataload/data_utils.py", line 239, in batch_by_size
    from fairseq.data.data_utils_fast import (
ImportError: cannot import name 'batch_by_size_fast' from 'fairseq.data.data_utils_fast' (/usr/local/lib/python3.8/dist-packages/fairseq/data/data_utils_fast.cpython-38-x86_64-linux-gnu.so)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 324, in <module>
    cli_main()
  File "train.py", line 320, in cli_main
    distributed_utils.call_main(args, main)
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/tools/distributed_utils.py", line 181, in call_main
    main(args, **kwargs)
  File "train.py", line 106, in main
    extra_state, epoch_itr = checkpoint_utils.load_checkpoint(args, trainer)
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/tools/checkpoint_utils.py", line 185, in load_checkpoint
    epoch_itr = trainer.get_train_iterator(
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/trainer.py", line 305, in get_train_iterator
    return self.task.get_batch_iterator(
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/tasks/fairseq_task.py", line 213, in get_batch_iterator
    batch_sampler = dataset.batch_by_size(
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/dataload/fairseq_dataset.py", line 118, in batch_by_size
    return data_utils.batch_by_size(
  File "/content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src/dataload/data_utils.py", line 243, in batch_by_size
    raise ImportError(
ImportError: Please build Cython components with: `pip install --editable .`
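The first ImportError shows that the compiled fairseq.data.data_utils_fast module was found but does not define batch_by_size_fast. One possible reading, which is an assumption and not something the log confirms, is that the installed fairseq version differs from the one the repository's data_utils.py was written against, rather than the Cython build being missing. A minimal sketch for checking which names a module actually exposes follows; exported_names and has_name are hypothetical helper names introduced only for illustration.

```python
import importlib


def exported_names(module_name):
    """Return the sorted public names of a module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError as well (it is a subclass of ImportError).
        return None
    return sorted(name for name in dir(module) if not name.startswith("_"))


def has_name(module_name, attr):
    """True only if the module imports successfully and defines `attr`."""
    names = exported_names(module_name)
    return names is not None and attr in names


if __name__ == "__main__":
    target = "fairseq.data.data_utils_fast"
    # Lists what the installed extension really provides.
    print(exported_names(target))
    # Checks the exact name that the traceback failed to import.
    print(has_name(target, "batch_by_size_fast"))
```

If the list contains differently named batching functions, comparing the installed fairseq version with the one the repository expects would be a reasonable next step.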
Relevant source code
! python train.py --distributed-world-size 64 --distributed-port 1 /content/drive/MyDrive/colabo/manifest/path \
--save-dir /content/drive/MyDrive/colabo/model/path --fp16 --num-workers 6 --task audio_pretraining --criterion wav2vec --arch wav2vec2 \
--log-keys '["prob_perplexity","code_perplexity","temp"]' --quantize-targets --extractor-mode default \
--conv-feature-layers '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2' --final-dim 256 --latent-vars 320 \
--latent-groups 2 --latent-temp '(2,0.5,0.999995)' --infonce --optimizer adam \
--adam-betas '(0.9,0.98)' --adam-eps 1e-06 --lr-scheduler polynomial_decay --total-num-update 400000 \
--lr 0.0005 --warmup-updates 32000 --mask-length 10 --mask-prob 0.65 --mask-selection static --mask-other 0 \
--encoder-layerdrop 0.05 --dropout-input 0.1 --dropout-features 0.1 --feature-grad-mult 0.1 \
--loss-weights '[0.1, 10]' --conv-pos 128 --conv-pos-groups 16 --num-negatives 100 --cross-sample-negatives 0 \
--max-sample-size 250000 --min-sample-size 32000 --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
--max-tokens 1400000 --max-update 400000 --skip-invalid-size-inputs-valid-test --ddp-backend no_c10d
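As a side note, the command requests --distributed-world-size 64, and the log accordingly reports "training on 64 devices". A Colab runtime usually exposes far fewer GPUs than that, so the value may be worth checking, although this is separate from the ImportError. The sketch below assumes PyTorch is installed (fairseq depends on it) and falls back to 0 devices otherwise; pick_world_size and visible_gpu_count are hypothetical helper names.

```python
def pick_world_size(requested, available):
    """Clamp the requested worker count to the devices that exist (never below 1)."""
    return max(1, min(requested, available))


def visible_gpu_count():
    """Number of CUDA devices PyTorch can see; 0 if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    # A candidate value for --distributed-world-size on the current runtime.
    print(pick_world_size(64, visible_gpu_count()))
```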
What I tried
I ran !pip install --editable . as the error message suggests, which produced the following output:
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src
ERROR: file:///content/drive/MyDrive/colabo/wav2vec-master/wav2vec-master/src does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
This was the result.
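The pip error means the directory where the command was run (.../wav2vec-master/wav2vec-master/src) contains neither setup.py nor pyproject.toml. A minimal sketch for checking whether any parent directory is an installable project follows; find_project_root is a hypothetical helper, and whether this repository ships such a file at all is not known from the output above.

```python
from pathlib import Path

# Files that pip accepts as the marker of an installable project.
MARKERS = ("setup.py", "pyproject.toml")


def find_project_root(start):
    """Walk upward from `start`; return the first directory holding a marker file, else None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if any((directory / marker).is_file() for marker in MARKERS):
            return directory
    return None


if __name__ == "__main__":
    # Start from the current working directory, e.g. the src folder in Colab.
    print(find_project_root("."))
```

If this returns None, the repository has no installable package of its own. In that case the hint in the error message presumably refers to building fairseq itself, since the failing import targets fairseq.data.data_utils_fast (this is an assumption).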
Additional information (framework/tool versions, etc.)
Python 3.8.10