TL;DR: Jointly learning when and how to act improves the sample efficiency of RL agents through better exploration and improved exploitation.