Hi, after a couple of experiments I still can't tell whether I succeeded in the task or not (for CartPole).
On some runs I got really good results, but on other runs I can't reach the maximum result.
That may make sense, since the chance of getting the best results depends on the starting configuration, which changes every run.
But on some runs I reach the maximum result, and then after a while the results drop gradually to zero without ever rising again. Does that make sense? Why can this happen?