abacus0214/NHRL

An adaptive algorithm intended to abstract temporally extended actions online, without requiring any background information beyond a Markovian description of the environment. Several reinforcement learning algorithms were embedded in a hierarchy of policies, including n-step Q-learning, Expected Sarsa, LSTM neural networks (for Q-value learning), DeepMind's Deep Q-learning architecture, and simultaneous off-policy training of all abstract actions.
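As a rough illustration of one of the learners named above, the following is a minimal tabular sketch of a one-step Expected Sarsa update under an epsilon-greedy policy. It is not taken from the repository: the function names `expected_sarsa_target` and `update`, the list-of-lists Q-table, and the default values of `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
def expected_sarsa_target(q_next, reward, gamma=0.99, epsilon=0.1, done=False):
    """One-step Expected Sarsa target (hypothetical helper, not the repo's API).

    q_next: list of Q-values for every action in the next state.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return float(reward)
    n = len(q_next)
    greedy = max(range(n), key=lambda i: q_next[i])
    # Epsilon-greedy action probabilities in the next state.
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    # Expectation of Q over the policy, instead of a single sampled action.
    expected_q = sum(p * q for p, q in zip(probs, q_next))
    return reward + gamma * expected_q


def update(q_table, s, a, reward, s_next,
           alpha=0.5, gamma=0.99, epsilon=0.1, done=False):
    """Move Q(s, a) toward the Expected Sarsa target, in place."""
    target = expected_sarsa_target(q_table[s_next], reward, gamma, epsilon, done)
    q_table[s][a] += alpha * (target - q_table[s][a])
    return q_table


if __name__ == "__main__":
    q = [[0.0, 0.0], [1.0, 3.0]]  # two states, two actions
    update(q, s=0, a=1, reward=1.0, s_next=1, gamma=0.5, epsilon=0.2)
    print(q[0][1])
```

In a hierarchy of policies, an update of this kind could in principle be applied at each level, with abstract actions taking the place of primitive ones; how the repository actually wires this together is not stated here.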

Language: Python | Stars: 6 | Forks: 0 | Watchers: 6 | Open issues: 5

Details

Repository information

Owner: abacus0214

Homepage: —

GitHub: https://github.com/abacus0214/NHRL

Last pushed: 2019-04-02

Last updated: 2025-12-15

Issues fetched at: —

Community at a glance