How would you implement deep reinforcement learning on smaller processors?

 
Ranch Hand
I have been looking into machine learning for small mobile robot applications.  Do you have recommendations on the best way to apply deep reinforcement learning techniques with more modest processors?

I see that you discuss a bi-pedal walker in your Appendix B.  I'm particularly looking forward to reading about that one.

Thanks.  And good luck with your book!

Gary
 
Author
Hi Gary,

That's an interesting question.

First, let's look at how much compute you need for training and for inference.

Training: RL is known to be very compute-intensive for a number of reasons: it needs a lot of samples to train, and its loss function (or rather objective) can be expensive to compute. Fortunately, the neural networks needed are typically smaller than in typical supervised learning tasks, e.g. an MLP (multilayer perceptron, i.e. a plain feedforward network) with 2 layers of 256 units for a simple robotics task like BipedalWalker, or a small 3-layer conv net for an Atari game. There's a section in the book that talks a bit about desktop/cloud hardware for training. For non-vision tasks like BipedalWalker, you only need a CPU of at least 2GHz running a small MLP for a few hours, so a laptop suffices. However, something less powerful like an Arduino wouldn't suffice. The most sensible way to train is either to stream the sensor inputs and outputs between the robot and a computer for training, or to train in a simulation of the robot if one is available.
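To give a sense of how small such a network is, here is a rough sketch that counts the parameters of an MLP with 2 hidden layers of 256 units. The input and output sizes (24 state values, 4 actions) are an assumption taken from the Gym BipedalWalker environment; they are not stated in the post.

```python
# Rough size estimate for an MLP policy with 2 hidden layers of 256 units.
# Assumed sizes: 24 state inputs and 4 action outputs (Gym's BipedalWalker).

def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

layer_sizes = [24, 256, 256, 4]
n_params = mlp_param_count(layer_sizes)
size_kib = n_params * 4 / 1024  # 4 bytes per float32 parameter

print(f"{n_params} parameters, about {size_kib:.0f} KiB as float32")
# prints: 73220 parameters, about 286 KiB as float32
```

A model of this size is tiny by deep learning standards, which is consistent with the point that the cost of RL training comes mostly from the number of samples rather than from the network itself.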

Inference: this is much less compute-intensive. If you use an on-policy RL algorithm, inference (e.g. deploying a robot) involves just a simple forward pass through the policy network, with the state as input and the action as output. Moreover, on a real robot the "frame rate" is much lower, and you only need to do a forward pass a few times per second.
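As an illustration of what that forward pass amounts to, here is a minimal sketch in plain Python with no ML library. The weights are random placeholders standing in for a trained policy, and the layer sizes (24 inputs, two hidden layers of 256, 4 outputs) and the ReLU/tanh activations are assumptions made for the example.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random placeholder weights; a real deployment would load trained ones."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer, activation):
    """Fully connected layer: activation(W @ x + b), written out by hand."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def relu(v):
    return max(0.0, v)

def policy_forward(state, layers):
    """One forward pass: state in, action out (each action value in [-1, 1])."""
    h = state
    for layer in layers[:-1]:
        h = dense(h, layer, relu)
    return dense(h, layers[-1], math.tanh)

rng = random.Random(0)
sizes = [24, 256, 256, 4]  # assumed BipedalWalker-like dimensions
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

state = [rng.uniform(-1.0, 1.0) for _ in range(24)]  # stand-in for a sensor reading
action = policy_forward(state, layers)
print(action)
```

On a robot, `policy_forward` would be called once per control step with the latest sensor reading, which at a few steps per second is a light workload.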

All in all, if you're training a robot you should run the training on a separate computer, which doesn't need to be very high-end. A trained model can then be deployed and run on the robot itself (both TensorFlow and PyTorch support lite/mobile deployments).

Hope this answers your question, let us know if you have more!

Cheers, Keng
 
Gary W. Lucas
Ranch Hand
Thanks for your reply.  I think your advice will be useful in my investigations.  I'm thinking about a mobile platform that starts out with some basic behaviors, but learns to optimize them over time.  For example, if I were implementing a walker (and that's just an example, I'm not that good), it would start off with some basic gaits pre-programmed into the system, but would gradually improve them based on experience negotiating its environment.   So that seems to fit the pattern you suggested with the Training and Inference areas.

I look forward to reading your book and, no doubt, significantly revising some of my ideas.

Gary
 
Saloon Keeper
At the risk of not being completely on-topic, I'd like to point out that machine learning - and especially TensorFlow - is quite popular on Raspberry Pi devices. Many a person has, for example, made a "smart doorbell" that uses a camera (the Pi has its own high-performance camera connector) to "see" - and often identify - someone at the door. Making a Lego block sorter is another project that more than one person I know of has worked on.

Actually, I have in my possession something that goes one step further - an actual Edge Computing device. It ran me about $20. It's not a general-purpose processor, but it's designed for TensorFlow and has both built-in devices and connectors for additional extensions. All on a circuit board about 30mm on a side, give or take. It's useful for such things as Alexa-style voice keyword trigger detection and doubtless many other evil things, and one of these days I really need to try exploiting it.
 
Wah Loon Keng
Author
Gary,

Another interesting way to go about it is to do some hybridization, for example, using a learning-based method to handle only the harder parts of programming a robot. Like you said, you can start with some preprogrammed gaits, then start improving the areas with the most potential. By breaking the task down into smaller subproblems and focusing on them one by one, learning will be cheaper too, since each problem is now simpler.
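One possible way to picture this kind of hybrid is a hand-written base gait plus a small learned correction on top. The sketch below is only an illustration under stated assumptions: the sinusoidal gait, the 4 joints, the clipping limit, and the `untrained_correction` stand-in for a learned policy are all hypothetical and not taken from the book.

```python
import math

def base_gait(t, n_joints=4, freq_hz=1.0, amplitude=0.5):
    """Hand-written gait: each joint follows a phase-shifted sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t + j * math.pi / 2)
            for j in range(n_joints)]

def hybrid_action(t, state, correction_fn, max_correction=0.1):
    """Base gait plus a small, clipped learned correction per joint."""
    base = base_gait(t)
    correction = correction_fn(state)
    return [b + max(-max_correction, min(max_correction, c))
            for b, c in zip(base, correction)]

def untrained_correction(state):
    # Hypothetical stand-in for a learned policy network; applies no correction.
    return [0.0] * 4

action = hybrid_action(0.25, state=[0.0] * 24, correction_fn=untrained_correction)
print(action)
```

Clipping the correction keeps the robot close to the preprogrammed behavior while the learned part is still poor, and the learner only has to solve the smaller problem of adjusting the gait rather than discovering it from scratch.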

A great example of this is Caltech's work on landing a drone, by Yisong Yue. Traditional dynamics models have difficulty capturing the boundary conditions of airflow near surfaces, so drone landing is difficult to compute, but this hybrid work uses deep learning for precisely that difficult part and produces a huge improvement: https://www.youtube.com/watch?v=Ng8-JObbKU4



Tim,

Interesting. I don't have experience with smaller devices. Is it common for people to use Pi devices for training? Or just for inference?
 
Tim Holloway
Saloon Keeper
A Raspberry Pi is used basically just like a desktop PC would be. Even the earliest models had nearly complete Linux distros. And by complete, I don't mean merely the core OS and utilities. I'm talking about everything up to and including the Hercules emulator for IBM mainframe computers. I run all manner of music composition and audio synthesis/processing software, have one Pi that was running a shortwave receiver and decoding weather FAX from New Orleans last hurricane season, and another that - assisted by a 7-inch touch screen - is my emergency battery-powered digital TV and media player. It has been suggested that the Pi 4 can actually replace your desktop computer, although the 1GB model does suffer from virtual memory thrashing when I try to run a GUI Java program plus a web browser and a copy of the Emacs editor on it all at the same time. I hope to see better results when I swap the SD card over to a new 4GB model that I just bought. It was expensive - $56.

So yes, it's quite common to train on the Pi and even train using digital and audio captured by the selfsame Pi. I think my Edge Device does want to use imported training, but it's not intended to run as a general-purpose computer like the Pi is.

Coincidentally, I just ran across an article on using Arduino-like devices for Machine Learning:

https://www.edgeimpulse.com/blog/make-deep-learning-models-run-fast-on-embedded-hardware/

About 2 weeks ago I even saw something on using one of the $1.35 Arduino models. We're talking an 8-bit processor on an 8-pin chip with limited RAM, so some creativity is required for that one, but some people will try anything.
 