I want to resolve an error that occurs when training with Unity ML-Agents

Background / What I want to achieve

I was working through the ML-Agents documentation by following the video at the URL below, and the error described below occurred at 2:09:20 in the video.
Link

Problem / Error message

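[Error situation]
- I ran the command to start training in the Anaconda Prompt (the Unity logo is displayed in the prompt).
- I then pressed the Play button in Unity, but the agent does not move (in the video it moves on its own and trains).
- It does not move on its own, but it can be controlled with the arrow keys (Behavior Type is set to Default).
- After a while, the following error is displayed in the Anaconda Prompt.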

mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
	 The environment does not need user interaction to launch
	 The Agents' Behavior Parameters > Behavior Type is set to "Default"
	 The environment and the Python interface have compatible versions.

Relevant source code

C#
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;

public class RollerAgent : Agent
{
    private Rigidbody rBody;
    public Transform target;
    public float forceMult = 10f;

    // Start is called before the first frame update
    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()
    {
        // If the agent fell off the platform, reset its momentum and position.
        if (this.transform.localPosition.y < 0)
        {
            this.rBody.velocity = Vector3.zero;
            this.rBody.angularVelocity = Vector3.zero;
            this.transform.localPosition = new Vector3(0, 0.5f, 0);
        }

        // Move the target to a new random spot.
        target.localPosition = new Vector3(Random.Range(-4f, 4f), 0.5f, Random.Range(-4f, 4f));
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // Target and agent positions (3 floats each).
        sensor.AddObservation(target.localPosition);
        sensor.AddObservation(this.transform.localPosition);

        // Agent velocity on the x and z axes (1 float each).
        sensor.AddObservation(rBody.velocity.x);
        sensor.AddObservation(rBody.velocity.z);
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // Two continuous actions: force along x and z.
        float x = actions.ContinuousActions[0];
        float z = actions.ContinuousActions[1];

        rBody.AddForce(new Vector3(x, 0, z) * forceMult);

        float dist = Vector3.Distance(this.transform.localPosition, target.localPosition);

        if (dist < 1.42f)
        {
            // Reached the target.
            SetReward(1.0f);
            EndEpisode();
        }
        else if (this.transform.localPosition.y < 0)
        {
            // Fell off the platform.
            EndEpisode();
        }
    }

    public override void Heuristic(in ActionBuffers actionsOut)
    {
        // Manual control with the arrow keys.
        var continuousActionsOut = actionsOut.ContinuousActions;
        continuousActionsOut[0] = Input.GetAxis("Horizontal");
        continuousActionsOut[1] = Input.GetAxis("Vertical");
    }
}
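
As a sanity check on the script above, the values added in CollectObservations can be tallied by hand. The sketch below is plain Python and not ML-Agents API. It assumes that a Vector3 contributes three floats and a single float contributes one, and the names `OBSERVATIONS` and `observation_size` are hypothetical. The total is what I would expect the Vector Observation Space Size in Behavior Parameters to be set to, and the action count is what I would expect for Continuous Actions.

```python
# Hypothetical tally of the observations added in CollectObservations.
# Assumption: a Vector3 contributes 3 floats, a single float contributes 1.
OBSERVATIONS = [
    ("target.localPosition", 3),
    ("this.transform.localPosition", 3),
    ("rBody.velocity.x", 1),
    ("rBody.velocity.z", 1),
]

# OnActionReceived reads ContinuousActions[0] and ContinuousActions[1].
CONTINUOUS_ACTIONS_USED = 2


def observation_size(observations):
    """Sum the number of floats contributed by each observation."""
    return sum(size for _, size in observations)


if __name__ == "__main__":
    print("vector observation size:", observation_size(OBSERVATIONS))
    print("continuous actions:", CONTINUOUS_ACTIONS_USED)
```

If the Inspector values differ from these totals, that mismatch is worth fixing first, although on its own it would not necessarily explain the timeout.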

yaml
behaviors:
  RollerBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 10
      buffer_size: 100
      learning_rate: 3.0e-4
      beta: 5.0e-4
      epsilon: 0.2
      lambd: 0.99
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    time_horizon: 64
    summary_freq: 10000
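
Two things in this config can be checked mechanically. The key under `behaviors` (`RollerBall`) is matched against the Behavior Name in the agent's Behavior Parameters, and as I understand the training configuration docs, `buffer_size` is meant to be a multiple of `batch_size`. The sketch below is a minimal check under those assumptions. It copies part of the config into a plain dict so it runs without a YAML parser, and `check_config` and `inspector_behavior_name` are hypothetical names (the actual Inspector value is not shown in this post).

```python
# Part of the YAML above, copied into a dict so no YAML parser is needed.
CONFIG = {
    "behaviors": {
        "RollerBall": {
            "trainer_type": "ppo",
            "hyperparameters": {"batch_size": 10, "buffer_size": 100},
            "max_steps": 500000,
            "time_horizon": 64,
            "summary_freq": 10000,
        }
    }
}


def check_config(config, inspector_behavior_name):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    behaviors = config.get("behaviors", {})

    # The YAML key has to match the Behavior Name set in the Unity Inspector.
    if inspector_behavior_name not in behaviors:
        problems.append(
            f"no config entry for behavior '{inspector_behavior_name}'"
        )

    # Assumption: buffer_size should be a multiple of batch_size.
    for name, settings in behaviors.items():
        hyper = settings.get("hyperparameters", {})
        batch = hyper.get("batch_size")
        buffer = hyper.get("buffer_size")
        if batch and buffer and buffer % batch != 0:
            problems.append(
                f"{name}: buffer_size {buffer} is not a multiple of batch_size {batch}"
            )
    return problems


if __name__ == "__main__":
    # "RollerBall" here is an assumed Inspector value, not taken from the post.
    print(check_config(CONFIG, "RollerBall"))
```

Neither check speaks to the connection itself, so a clean result only rules out the config file as the culprit.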

What I tried

I thought it was strange that the agent could be controlled with the arrow keys during training, so I changed Behavior Type from Default to Inference Only, but that did not work.
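
For reference, this is how I understand the documented Behavior Type semantics, sketched in plain Python. It is not ML-Agents code: `resolve_policy` and its parameters are hypothetical, and the `ValueError` is only a stand-in for whatever the package reports when Inference Only is selected without a model. Under this reading, being able to steer with the arrow keys while Behavior Type is Default would suggest the Editor never connected to the trainer and fell back to Heuristic.

```python
def resolve_policy(behavior_type, trainer_connected, has_model):
    """Return which decision source an agent would use (illustrative only)."""
    if behavior_type == "Heuristic Only":
        return "heuristic"

    if behavior_type == "Inference Only":
        if not has_model:
            # Stand-in error: inference needs a trained model assigned.
            raise ValueError("Inference Only requires a model")
        return "inference"

    if behavior_type == "Default":
        # Prefer the external trainer, then a model, then the Heuristic method.
        if trainer_connected:
            return "remote trainer"
        if has_model:
            return "inference"
        return "heuristic"

    raise ValueError(f"unknown behavior type: {behavior_type}")


if __name__ == "__main__":
    # Default with no trainer connection and no model assigned.
    print(resolve_policy("Default", trainer_connected=False, has_model=False))
```

So switching to Inference Only would not be expected to help during training, since that mode does not use the trainer at all.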

Additional information (framework/tool versions, etc.)

[Environment]
- macOS Monterey
- Unity 2020.3.23f1
- ML-Agents Release 18 (the video uses Release 13, but I want to get training working with Release 18)
- ml-agents 0.27.0
- ml-agents-envs 0.27.0
- Communicator API 1.5.0
- PyTorch 1.7.1
↓ Agent Inspector information