Showing posts from February 2019

A3C in ATARI Pong-v0

ATARI Pong in versus mode: the game program plays on the left, and the A3C model being trained plays on the right. A game is played to 21 points, and a player scores one point each time the opponent misses the ball. The log shows that the A3C model starts out losing every game by the full 21 points (episode reward -21.0), but after roughly 2 hours of training it has turned things around and wins almost every game, occasionally by as many as 13 points.

Below is the log from training the A3C model:

(base) frank@viper1:~/a3c$ python main.py --env-name "Pong-v0" --num-processes 8
Time 00h 00m 10s, episode reward -21.0, episode length 1026
Time 00h 01m 18s, episode reward -21.0, episode length 1020
Time 00h 02m 26s, episode reward -21.0, episode length 1029
Time 00h 03m 35s, episode reward -21.0, episode length 1023
Time 00h 04m 42s, episode reward -21.0, episode length 1014
Time 00h 05m 50s, episode reward -21.0, episode length 1087
Time 00h 07m 00s, episode reward -21.0, episode length 1359
Time 00h 08m 14s, episode reward -16.0, episode length 1922
Time 00h 09m 30s, episode reward -14.0, episode length 2220
Time 00h 10m 48s, episode reward -15.0, episode length 2431
Time 00h 12m 04s, episode reward -15.0, episode length 2211
Time 00h 13m 23s, episode reward -8.0, episode length 2600
Time 00h 14m 38s, episod