
Reinforcement Learning

 

[Figure: paddle-ball game (迴力球遊戲) - ATARI]

[Figure: racing game DQN - ATARI]

[Figure: racing game - TORCS]

Ref:
    Prof. Hung-yi Lee (李宏毅), YouTube DRL lectures 1-3

On-policy vs. Off-policy

On-policy
    The agent being learned and the agent interacting with the environment are the same
    e.g. 阿光 learns by playing Go himself
Off-policy
    The agent being learned and the agent interacting with the environment are different
    e.g. 佐助 plays Go while 阿光 watches from the side



Add a baseline:
    It is possible that R is always positive, so every sampled action gets reinforced
    Subtract an expected value (a baseline) from R so the weight can also be negative
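
A minimal sketch of this idea, assuming plain numpy and using the batch-mean return as the baseline (one common, simple choice):

import numpy as np

def baseline_weights(episode_returns):
    # Turn raw episode returns R into (R - b) weights for the policy gradient.
    # If every R is positive, all sampled actions get pushed up; subtracting a
    # baseline b (here: the mean return of the batch) makes below-average
    # episodes receive a negative weight.
    returns = np.asarray(episode_returns, dtype=np.float64)
    baseline = returns.mean()            # b ~ E[R], estimated from the batch
    return returns - baseline

# Example: three sampled episodes, all with positive total reward.
print(baseline_weights([10.0, 12.0, 3.0]))   # -> [ 1.667  3.667 -5.333] approximately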


The "policy" in "policy gradient" is the part that outputs an action, e.g. left/right/fire
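
As a concrete (hypothetical) sketch, a policy can be as small as a linear layer plus a softmax over those three actions; the observation size and weights below are made up for illustration:

import numpy as np

ACTIONS = ["left", "right", "fire"]

def policy(observation, W, b):
    # A tiny linear policy: observation -> probability distribution over actions.
    logits = observation @ W + b
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()           # softmax

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)) * 0.1        # hypothetical 4-dimensional observation
b = np.zeros(3)
probs = policy(rng.normal(size=4), W, b)
action = rng.choice(ACTIONS, p=probs)    # sample an action from the policy
print(probs, action)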


Gamma-discounted rewards:
    Contributions that are further away in time get a lower weight
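
A short sketch of how the discounted return is usually computed, scanning the episode's rewards backwards (gamma = 0.9 here is just an example value):

def discounted_returns(rewards, gamma=0.9):
    # G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ...
    # Rewards further in the future are weighted down by powers of gamma.
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

print(discounted_returns([0, 0, 1]))     # -> [0.81, 0.9, 1.0] approximately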

The reward function and the action space are defined prior to training

MC vs. TD
MC (Monte Carlo): the critic is evaluated only after the episode ends: larger variance (because conditions differ a lot from episode to episode), but unbiased (the judgment waits until the episode ends, which is fairer)
TD (temporal difference): the critic is updated during the episode: smaller variance, but possibly biased
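
The difference shows up directly in the target used to train a value critic V; a sketch (function names are mine, gamma as above):

def mc_target(rewards_from_t, gamma=0.9):
    # Monte Carlo: wait until the episode ends and use the full observed return G_t.
    g = 0.0
    for k, r in enumerate(rewards_from_t):
        g += (gamma ** k) * r
    return g

def td_target(r_t, v_next, gamma=0.9):
    # Temporal difference: bootstrap from the critic's own estimate V(s_{t+1}).
    return r_t + gamma * v_next

# MC uses only observed rewards (unbiased, but high variance across episodes);
# TD plugs in V(s_{t+1}), which may itself be wrong (lower variance, possibly biased).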



Atari: A3C => gym
TORCS: DDPG => gym-torcs
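
For reference, the interaction loop these toolkits expose looks roughly like this (a sketch assuming the classic gym API; the environment id is illustrative and the random action stands in for A3C/DDPG):

import gym

env = gym.make("Pong-v0")                    # illustrative Atari environment id
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()       # a trained agent would choose here
    obs, reward, done, info = env.step(action)
env.close()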


PPO
    easy to code
    easy to tune
    sample efficient
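
A sketch of PPO's clipped surrogate objective, which is the part that makes it forgiving to tune (numpy; eps = 0.2 is a typical value and the variable names are mine):

import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    # Keep the probability ratio pi_new(a|s) / pi_old(a|s) within [1 - eps, 1 + eps].
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Minimize the negative of the clipped surrogate objective.
    return -np.mean(np.minimum(unclipped, clipped))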


Replay Buffer:
    Put each experience tuple (s_t, a_t, r_t, s_{t+1}) into the buffer
    Because interacting with the environment costs more time than training: playing the game usually takes longer than training the network
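
A minimal replay buffer sketch (pure Python; the capacity and batch size are arbitrary example values):

import random
from collections import deque

class ReplayBuffer:
    # Stores (s_t, a_t, r_t, s_{t+1}, done) tuples so they can be reused for training.
    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)     # oldest experience is dropped first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)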

