This week's book giveaway is in the Artificial Intelligence and Machine Learning forum.
We're giving away four copies of Zero to AI - A non-technical, hype-free guide to prospering in the AI era and have Nicolò Valigi and Gianluca Mauro on-line!
See this thread for details.

Reinforcement Learning In Action - Key Differences

 
Greenhorn
Posts: 21
Hi Alexander and Brandon.
How does reinforcement learning differ from normal "gradient descent" learning?

Thanks
Don.
 
Author
Posts: 7

Don Horrell wrote:Hi Alexander and Brandon.
How does reinforcement learning differ from normal "gradient descent" learning?

Thanks
Don.



So gradient descent is actually an optimization strategy rather than a type of learning: it is just one of many ways of tuning the parameters of a function so that they are optimal according to some objective. All parametric machine learning and statistical models need to be optimized (or "fit" to data), and gradient descent is the most popular method because it is scalable, iterative, and works well with big data. Reinforcement learning, by contrast, is a problem setting: instead of learning from a fixed, labeled dataset, the agent learns from reward signals it collects by interacting with an environment. We still use gradient descent in RL because RL generally uses neural networks or other complex machine learning models under the hood, and those models are fit with gradient descent.
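A minimal sketch of this distinction (my own illustration, not from the book): the first function fits a parameter to labeled data by descending a loss gradient, while the second has no labels at all and instead does policy-gradient (REINFORCE-style) *ascent* on reward from a two-armed bandit. Note that both updates are gradient steps; only the source of the training signal differs. The function names and the bandit setup here are made up for the example.

```python
import math
import random

# Supervised learning: fit a single parameter w to minimize
# squared error on a fixed, labeled dataset via gradient descent.
def fit_supervised(data, lr=0.1, steps=200):
    w = 0.0
    for _ in range(steps):
        # gradient of mean squared error (w*x - y)^2 w.r.t. w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # descend the loss gradient
    return w

# Reinforcement learning: no labels, only rewards from acting.
# A sigmoid policy over two arms, updated with the REINFORCE
# gradient -- gradient ascent on expected reward. Arm 1 pays
# roughly 1.0 on average, arm 0 roughly 0.0.
def fit_policy(lr=0.1, steps=2000, seed=0):
    rng = random.Random(seed)
    theta = 0.0  # preference for arm 1 over arm 0
    for _ in range(steps):
        p1 = 1.0 / (1.0 + math.exp(-theta))       # P(choose arm 1)
        action = 1 if rng.random() < p1 else 0     # sample from the policy
        reward = rng.gauss(1.0, 0.1) if action == 1 else rng.gauss(0.0, 0.1)
        # d log pi(action | theta) / d theta for the sigmoid policy
        grad_logp = (1 - p1) if action == 1 else -p1
        theta += lr * reward * grad_logp           # ascend expected reward
    return theta
```

Running `fit_supervised([(1.0, 2.0), (2.0, 4.0)])` converges to `w` near 2, while `fit_policy()` drives `theta` positive, i.e. the policy learns to prefer the higher-paying arm even though it was never told which arm is correct.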
 