- Workflow:
- Player agents spawn in parallel and initialize the GPU
- Feed the previous generation's model weights to the player agents, which play the game in parallel and collect training data
- Gather the data back to the local trainer agent for training
- Feed the new model weights to the player agents, which play in parallel to measure the train/dev score
- Shut down the parallel kernels (opening the next pool with a different core count is enough to force this)
- Loop......
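The loop above can be sketched in plain Python. All function names here (`play_parallel`, `train`, `evaluate`) are hypothetical stand-ins for the player and trainer agents, with fake numeric "weights" and "reward data" so the control flow is runnable without TensorFlow:

```python
# Sketch of the generation loop described above. Every function is a
# hypothetical placeholder: play_parallel() stands in for the parallel
# player agents, train() for the local trainer agent.

def play_parallel(weights, episodes):
    # Player agents would play the game here; we fake per-episode reward data.
    return [weights + i for i in range(episodes)]

def train(weights, data):
    # Trainer agent updates the model from the gathered data.
    return weights + len(data)

def evaluate(weights, rounds):
    # Play in parallel again to measure the train/dev score.
    return weights * rounds

def generation_loop(n_generations=3, episodes=4, test_rounds=2):
    weights, scores = 0, []
    for _ in range(n_generations):
        data = play_parallel(weights, episodes)   # collect training data
        weights = train(weights, data)            # centralized training
        scores.append(evaluate(weights, test_rounds))
        # the real code would kill and restart the parallel kernels here
    return weights, scores
```

For example, `generation_loop()` runs three generations and returns the final weights plus one score per generation.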
- Parallelization method:
def work(act_weights, crit_weights):
    '''Create Actor/Critic, copy the input weights into them, play the env, and gather data.'''
    return RewardData

# episodes results, one RewardData per episode
reward_data = Parallel(n_jobs=n_core)(delayed(work)(act_weights, crit_weights) for i in range(episodes))
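The call above follows joblib's `Parallel`/`delayed` pattern. A runnable sketch of the same fan-out, using a thread pool from the standard library in place of the post's parallel kernels, and a dummy `work()` that returns fake reward data (all names here are illustrative, not the post's actual code):

```python
from multiprocessing.dummy import Pool  # thread-backed stand-in for the parallel kernels

def work(act_weights, crit_weights):
    # The real work() would build the Actor/Critic, load the weights,
    # play the environment, and return the gathered RewardData.
    return act_weights + crit_weights

def gather_rewards(act_weights, crit_weights, episodes, n_core):
    # Mirrors the key point below: episodes must divide evenly across kernels.
    assert episodes % n_core == 0, "episodes must be divisible by n_core"
    with Pool(n_core) as pool:
        # Fan out `episodes` identical work() calls, collect one result each.
        return pool.starmap(work, [(act_weights, crit_weights)] * episodes)
```

Usage: `gather_rewards(1, 2, episodes=8, n_core=4)` returns a list of 8 reward entries, one per episode.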
- Key points:
- The number of kernels used for playing and for evaluation runs must be exactly the same, and each generation's episodes / test rounds must divide evenly by it
- After each loop iteration, the parallel kernels must be killed and reopened manually
- First run a parallel script to initialize the GPU, then run a second parallel script that feeds in the weights and collects rewards
- Every @tf.function-wrapped function called inside work() (including the Actor/Critic) must live in a separate .py file and be imported from there
Parallel GPU initialization problem
When feeding in the weights, the weights themselves are TensorFlow objects, so they presumably carry the host machine's GPU state straight into the child kernels, triggering an initialization error (Physical devices cannot be modified after being initialized).
The fix is to first run an init function across the pool so the parallel workers come up and initialize the GPU themselves:
tf.config.experimental.set_memory_growth(physical_devices[0], True)
and only then run the work function in parallel, feeding in the weights to collect rewards.
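Put together, the two-phase pattern looks roughly like this. This is a sketch assuming joblib and TensorFlow are available; `n_core`, `episodes`, `work`, and the weight variables are taken from the snippets above, and `init_gpu` is a hypothetical name:

```python
from joblib import Parallel, delayed

def init_gpu(_):
    # Runs inside each child kernel BEFORE any TensorFlow object arrives,
    # so the GPU can still be configured at that point.
    import tensorflow as tf
    physical_devices = tf.config.list_physical_devices('GPU')
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

# Phase 1: bring the workers up and initialize their GPUs.
Parallel(n_jobs=n_core)(delayed(init_gpu)(i) for i in range(n_core))

# Phase 2: reuse workers with the same core count to feed weights and collect rewards.
reward_data = Parallel(n_jobs=n_core)(
    delayed(work)(act_weights, crit_weights) for i in range(episodes))
```

Using the same `n_jobs` in both phases matters: it lets the second call reuse the already-initialized workers instead of spawning fresh ones.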
AutoGraph: could not get source code warning
Because work() executes inside the parallel kernels, AutoGraph cannot reach into the currently running Jupyter Notebook to read the source code of functions wrapped in @tf.function. The fix is to save all such functions, including the Actor/Critic, as .py files so the parallel kernels can read them directly.
Functions that never run in parallel do not need to be saved separately.
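Why the .py file fixes it can be shown with plain Python: once a function lives in a real file on disk, any kernel can import it and read its source, which is exactly what AutoGraph needs and cannot do for notebook cells. Here a plain function stands in for the @tf.function-wrapped Actor/Critic code; the file and function names are made up:

```python
import importlib.util
import inspect
import os
import tempfile

# Save the function that the parallel kernels will need into a real .py file.
module_src = """
def act(state):
    # In the real code this would be wrapped in @tf.function.
    return state * 2
"""
path = os.path.join(tempfile.mkdtemp(), "actor_ops.py")
with open(path, "w") as f:
    f.write(module_src)

# A kernel can now import the module from disk and, crucially,
# inspect its source code -- something that fails for notebook cells.
spec = importlib.util.spec_from_file_location("actor_ops", path)
actor_ops = importlib.util.module_from_spec(spec)
spec.loader.exec_module(actor_ops)

print(actor_ops.act(3))                               # 6
print("def act" in inspect.getsource(actor_ops.act))  # True
```

`inspect.getsource` is the same mechanism AutoGraph relies on; it succeeds here because the source file exists on disk.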