I've been fascinated by what RL can do and the successes it has had accomplishing really complicated tasks like Go or robotic hand manipulation. However, I've often wondered whether there are reasons why more practical, small-scale applications couldn't be done with RL. For instance, I often find myself copying and pasting each item in a list from one application to another. I've wondered why RL couldn't be trained to do these kinds of practical tasks. I understand training would need to happen in some kind of simulated environment, and it seemed to me OpenAI was headed in that direction with the Universe product (which has since been largely abandoned). So is there any work being done in this direction?
Looking forward to reading the book.
RL can certainly be applied to practical, small-scale applications. For example, RL can be used to replace heuristics with trained policies in resource management (e.g. Resource Management with Deep RL) or to improve the performance of algorithmic problems such as binary search, sorting, and caching (e.g. Predicted Variables in Programming). RL has also been applied to recommender systems (e.g. Top-K Off-Policy Correction for a REINFORCE Recommender System) and chip placement (e.g. Chip Placement with Deep RL). These last two examples are practical but admittedly not small scale.
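To make the "replacing heuristics with trained policies" idea concrete, here is a minimal sketch in the simplest RL setting, a multi-armed bandit: an epsilon-greedy learner discovers which of two heuristics performs better from reward feedback alone. The heuristic names and their success rates are invented purely for illustration; real systems like those in the papers above use far richer state and learned function approximators.

```python
import random

random.seed(1)

# Two hypothetical heuristics for some task. Their true success rates
# are unknown to the learner (these numbers are invented for illustration).
TRUE_RATES = {"heuristic_a": 0.4, "heuristic_b": 0.7}

counts = {h: 0 for h in TRUE_RATES}
values = {h: 0.0 for h in TRUE_RATES}  # running mean reward per heuristic

for t in range(2000):
    # Epsilon-greedy: mostly exploit the best-looking heuristic,
    # but explore a random one 10% of the time.
    if random.random() < 0.1:
        h = random.choice(list(TRUE_RATES))
    else:
        h = max(values, key=values.get)
    # Observe a stochastic 0/1 reward and update the running mean.
    reward = 1.0 if random.random() < TRUE_RATES[h] else 0.0
    counts[h] += 1
    values[h] += (reward - values[h]) / counts[h]

best = max(values, key=values.get)
```

After enough trials the learner's value estimates identify the stronger heuristic, without anyone hand-coding which one to prefer. The same loop generalizes to contextual and full sequential RL once state enters the picture.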
More generally, there are a number of different aspects to consider when thinking about whether RL is well suited to a problem.
1. Can the problem be framed as an RL problem, i.e. as an agent that takes actions that change the state of an environment?
2. How difficult is it to create a training environment? This includes the states, actions, rewards, and transition function. Part IV of the book is dedicated to environment design and discusses this in more detail.
3. How can the agent be evaluated safely in a realistic setting? How can you ensure the agent behaves safely and appropriately when deployed?
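To illustrate questions 1 and 2, here is a minimal sketch of what framing a task as an RL problem looks like in code: a toy environment defining states, actions, rewards, and a transition function, plus tabular Q-learning on top of it. The environment (a one-dimensional "line world") and all names are hypothetical, chosen only to keep the example self-contained.

```python
import random

random.seed(0)

class LineWorld:
    """Toy environment: agent starts at position 0, goal at position n-1.
    States: integer positions. Actions: 0 = left, 1 = right.
    Reward: +1 on reaching the goal, 0 otherwise."""
    def __init__(self, n=5):
        self.n = n

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Transition function: move left or right, clipped to the bounds.
        self.pos = max(0, min(self.n - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.n - 1
        return self.pos, (1.0 if done else 0.0), done

def greedy(q, s):
    # Break ties randomly so the untrained agent explores both directions.
    if q[(s, 0)] == q[(s, 1)]:
        return random.choice((0, 1))
    return 0 if q[(s, 0)] > q[(s, 1)] else 1

def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    q = {(s, a): 0.0 for s in range(env.n) for a in (0, 1)}
    for _ in range(episodes):
        s = env.reset()
        for _ in range(200):  # cap episode length
            a = random.choice((0, 1)) if random.random() < eps else greedy(q, s)
            s2, r, done = env.step(a)
            best_next = max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
            if done:
                break
    return q

q = q_learning(LineWorld())
# Greedy policy per non-terminal state: the agent learns to move right.
policy = [1 if q[(s, 1)] > q[(s, 0)] else 0 for s in range(4)]
```

Even in this trivial case, notice how much of the work is environment design: deciding what the states, actions, rewards, and transitions are. For a real task such as moving items between applications, building that environment (question 2) is usually far harder than running the learning algorithm.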
Depending on the answers to these questions, other machine learning approaches such as supervised learning may be more suitable, especially since deep RL models are challenging to train compared with supervised deep learning models.