Deep reinforcement learning (some thoughts about it)

 
In the last few weeks, I spent almost all my spare time playing with deep reinforcement learning (RL). I can say that I have rarely had such a frustrating experience.
As a software developer / software architect, I'm really used to studying new technologies and shifting my mental point of view. But RL is a damn evil beast to tame.
First, my impression is that the whole field is (still) really brittle, at least if you are not a real expert.
I mean, when developing software, you know you can rely on some established best practices, some recipes that, more or less, help you solve your problem and move a step further.
With RL I don't think that's the case. I was playing with OpenAI Gym, which provides you with some "environments" your "agents" can interact with.
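For readers who have not seen the environment/agent split before, here is a minimal sketch of the interaction loop. It does not use OpenAI Gym itself: `CorridorEnv` and `run_random_episode` are hypothetical names invented for this illustration, and the `reset`/`step` methods only mimic the general shape of a Gym-style interface.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment with a Gym-like reset/step interface.

    The agent starts in cell 0 of a corridor and must reach the last cell.
    This class is NOT part of OpenAI Gym; it only imitates the general idea.
    """

    def __init__(self, length=5, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (observation, reward, done)."""
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


def run_random_episode(env, rng):
    """Run one episode with an 'agent' that picks actions uniformly at random."""
    env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = rng.choice([0, 1])
        _, reward, done = env.step(action)
        total_reward += reward
    return total_reward, env.steps


if __name__ == "__main__":
    total_reward, steps = run_random_episode(CorridorEnv(), random.Random(0))
    print(f"return={total_reward}, steps={steps}")
```

A learning agent would replace the `rng.choice` call with a policy that is updated from the observed rewards; everything else in the loop stays the same.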
If you try to solve a problem with a model that is too complex, it will fail.
If your model is too simple, it will fail.
If you change some of the hyperparameters of your neural network, you risk failing.
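One way to make this kind of sensitivity visible, instead of guessing, is to repeat the same training run over several hyperparameter values and random seeds and compare the spread. The sketch below is only an illustration under strong assumptions: it uses tabular Q-learning on a made-up six-cell corridor rather than a deep network on a Gym environment, and the names `step`, `train`, `alpha`, and `epsilon` are chosen for this example.

```python
import random

N_STATES = 6       # corridor cells 0..5; the goal is the last cell
ACTIONS = (0, 1)   # 0 = left, 1 = right


def step(state, action):
    """Toy corridor dynamics: reward 1.0 only when the goal cell is reached."""
    next_state = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(alpha, epsilon, seed, episodes=200, gamma=0.95, max_steps=100):
    """Tabular Q-learning; returns the success rate over the last (up to) 50 episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    successes = []
    for _ in range(episodes):
        state, done = 0, False
        for _ in range(max_steps):
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # One-step Q-learning target; no bootstrapping from a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
        successes.append(1.0 if done else 0.0)
    recent = successes[-50:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # Sweep learning rate and exploration rate over several seeds.
    for alpha in (0.01, 0.1, 0.5):
        for epsilon in (0.0, 0.1, 0.5):
            scores = [train(alpha, epsilon, seed) for seed in range(5)]
            print(f"alpha={alpha} epsilon={epsilon} "
                  f"min={min(scores):.2f} max={max(scores):.2f}")
```

If the minimum and maximum across seeds differ a lot for one setting, that setting is fragile rather than simply good or bad, which is a more useful thing to know than the result of a single lucky or unlucky run.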
What astonishes me is that RL seems to be a big puzzle that can only be solved with a lot of experimentation, plus a huge amount of time and data. That seems pretty much equivalent to saying "it's solved mainly with the help of a good dose of luck", something I believe is irreconcilable with science in the general sense.
I have to admit I'm a beginner swimming in a sea of ignorance (my own ignorance, of course), but although I have made some progress, I'm really upset, and I wonder whether anyone else here has had the same experience.
Thanks in advance.
 