Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (IEEE Press Series on Computational Intelligence, Vol. 17)
About the Book

Reinforcement learning (RL) and adaptive dynamic programming (ADP) are among the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human-engineered systems, covering both single-player decision and control and multi-player games. Edited by pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.
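
To give a flavor of the techniques the book surveys, consider tabular Q-learning, the simplest single-player RL method that the book's chapters generalize. The sketch below is illustrative and not taken from the book; the environment interface (reset, step, actions) and all parameter values are assumptions.

    import random

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
        # Generic tabular Q-learning sketch (hypothetical env API, not from the book).
        Q = {}  # maps (state, action) -> estimated long-run return

        def q(s, a):
            return Q.get((s, a), 0.0)

        for _ in range(episodes):
            s = env.reset()                # assumed: reset() -> initial state
            done = False
            while not done:
                # Epsilon-greedy action selection over a finite action set.
                if random.random() < epsilon:
                    a = random.choice(env.actions)
                else:
                    a = max(env.actions, key=lambda act: q(s, act))
                s2, r, done = env.step(a)  # assumed: step(a) -> (next state, reward, done)
                # Temporal-difference update toward the Bellman target.
                best_next = 0.0 if done else max(q(s2, act) for act in env.actions)
                Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
                s = s2
        return Q

The ADP chapters in Part I replace the lookup table above with neural-network critics, while Parts II and III extend the same temporal-difference idea to multi-player differential games and general Markov decision processes.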

Table of Contents:
PREFACE xix
CONTRIBUTORS xxiii

PART I: FEEDBACK CONTROL USING RL AND ADP

1. Reinforcement Learning and Approximate Dynamic Programming (RLADP)—Foundations, Common Misconceptions, and the Challenges Ahead (Paul J. Werbos) 3
   1.1 Introduction 3
   1.2 What is RLADP? 4
   1.3 Some Basic Challenges in Implementing ADP 14
2. Stable Adaptive Neural Control of Partially Observable Dynamic Systems (J. Nate Knight and Charles W. Anderson) 31
   2.1 Introduction 31
   2.2 Background 32
   2.3 Stability Bias 35
   2.4 Example Application 38
3. Optimal Control of Unknown Nonlinear Discrete-Time Systems Using the Iterative Globalized Dual Heuristic Programming Algorithm (Derong Liu and Ding Wang) 52
   3.1 Background Material 53
   3.2 Neuro-Optimal Control Scheme Based on the Iterative ADP Algorithm 55
   3.3 Generalization 67
   3.4 Simulation Studies 68
   3.5 Summary 74
4. Learning and Optimization in Hierarchical Adaptive Critic Design (Haibo He, Zhen Ni, and Dongbin Zhao) 78
   4.1 Introduction 78
   4.2 Hierarchical ADP Architecture with Multiple-Goal Representation 80
   4.3 Case Study: The Ball-and-Beam System 87
   4.4 Conclusions and Future Work 94
5. Single Network Adaptive Critics Networks—Development, Analysis, and Applications (Jie Ding, Ali Heydari, and S.N. Balakrishnan) 98
   5.1 Introduction 98
   5.2 Approximate Dynamic Programming 100
   5.3 SNAC 102
   5.4 J-SNAC 104
   5.5 Finite-SNAC 108
   5.6 Conclusions 116
6. Linearly Solvable Optimal Control (K. Dvijotham and E. Todorov) 119
   6.1 Introduction 119
   6.2 Linearly Solvable Optimal Control Problems 123
   6.3 Extension to Risk-Sensitive Control and Game Theory 130
   6.4 Properties and Algorithms 134
   6.5 Conclusions and Future Work 139
7. Approximating Optimal Control with Value Gradient Learning (Michael Fairbank, Danil Prokhorov, and Eduardo Alonso) 142
   7.1 Introduction 142
   7.2 Value Gradient Learning and BPTT Algorithms 144
   7.3 A Convergence Proof for VGL(1) for Control with Function Approximation 148
   7.4 Vertical Lander Experiment 154
   7.5 Conclusions 159
8. A Constrained Backpropagation Approach to Function Approximation and Approximate Dynamic Programming (Silvia Ferrari, Keith Rudd, and Gianluca Di Muro) 162
   8.1 Background 163
   8.2 Constrained Backpropagation (CPROP) Approach 163
   8.3 Solution of Partial Differential Equations in Nonstationary Environments 170
   8.4 Preserving Prior Knowledge in Exploratory Adaptive Critic Designs 174
   8.5 Summary 179
9. Toward Design of Nonlinear ADP Learning Controllers with Performance Assurance (Jennie Si, Lei Yang, Chao Lu, Kostas S. Tsakalis, and Armando A. Rodriguez) 182
   9.1 Introduction 183
   9.2 Direct Heuristic Dynamic Programming 184
   9.3 A Control Theoretic View on the Direct HDP 186
   9.4 Direct HDP Design with Improved Performance Case 1—Design Guided by a Priori LQR Information 193
   9.5 Direct HDP Design with Improved Performance Case 2—Direct HDP for Coordinated Damping Control of Low-Frequency Oscillation 198
   9.6 Summary 201
10. Reinforcement Learning Control with Time-Dependent Agent Dynamics (Kenton Kirkpatrick and John Valasek) 203
   10.1 Introduction 203
   10.2 Q-Learning 205
   10.3 Sampled Data Q-Learning 209
   10.4 System Dynamics Approximation 213
   10.5 Closing Remarks 218
11. Online Optimal Control of Nonaffine Nonlinear Discrete-Time Systems without Using Value and Policy Iterations (Hassan Zargarzadeh, Qinmin Yang, and S. Jagannathan) 221
   11.1 Introduction 221
   11.2 Background 224
   11.3 Reinforcement Learning Based Control 225
   11.4 Time-Based Adaptive Dynamic Programming-Based Optimal Control 234
   11.5 Simulation Result 247
12. An Actor–Critic–Identifier Architecture for Adaptive Approximate Optimal Control (S. Bhasin, R. Kamalapurkar, M. Johnson, K.G. Vamvoudakis, F.L. Lewis, and W.E. Dixon) 258
   12.1 Introduction 259
   12.2 Actor–Critic–Identifier Architecture for HJB Approximation 260
   12.3 Actor–Critic Design 263
   12.4 Identifier Design 264
   12.5 Convergence and Stability Analysis 270
   12.6 Simulation 274
   12.7 Conclusion 275
13. Robust Adaptive Dynamic Programming (Yu Jiang and Zhong-Ping Jiang) 281
   13.1 Introduction 281
   13.2 Optimality Versus Robustness 283
   13.3 Robust-ADP Design for Disturbance Attenuation 288
   13.4 Robust-ADP for Partial-State Feedback Control 292
   13.5 Applications 296
   13.6 Summary 300

PART II: LEARNING AND CONTROL IN MULTIAGENT GAMES

14. Hybrid Learning in Stochastic Games and Its Application in Network Security (Quanyan Zhu, Hamidou Tembine, and Tamer Başar) 305
   14.1 Introduction 305
   14.2 Two-Person Game 308
   14.3 Learning in NZSGs 310
   14.4 Main Results 314
   14.5 Security Application 322
   14.6 Conclusions and Future Works 326
15. Integral Reinforcement Learning for Online Computation of Nash Strategies of Nonzero-Sum Differential Games (Draguna Vrabie and F.L. Lewis) 330
   15.1 Introduction 331
   15.2 Two-Player Games and Integral Reinforcement Learning 333
   15.3 Continuous-Time Value Iteration to Solve the Riccati Equation 337
   15.4 Online Algorithm to Solve Nonzero-Sum Games 339
   15.5 Analysis of the Online Learning Algorithm for NZS Games 342
   15.6 Simulation Result for the Online Game Algorithm 345
   15.7 Conclusion 347
16. Online Learning Algorithms for Optimal Control and Dynamic Games (Kyriakos G. Vamvoudakis and Frank L. Lewis) 350
   16.1 Introduction 350
   16.2 Optimal Control and the Continuous Time Hamilton–Jacobi–Bellman Equation 352
   16.3 Online Solution of Nonlinear Two-Player Zero-Sum Games and Hamilton–Jacobi–Isaacs Equation 360
   16.4 Online Solution of Nonlinear Nonzero-Sum Games and Coupled Hamilton–Jacobi Equations 366

PART III: FOUNDATIONS IN MDP AND RL

17. Lambda-Policy Iteration: A Review and a New Implementation (Dimitri P. Bertsekas) 381
   17.1 Introduction 381
   17.2 Lambda-Policy Iteration without Cost Function Approximation 386
   17.3 Approximate Policy Evaluation Using Projected Equations 388
   17.4 Lambda-Policy Iteration with Cost Function Approximation 395
   17.5 Conclusions 406
18. Optimal Learning and Approximate Dynamic Programming (Warren B. Powell and Ilya O. Ryzhov) 410
   18.1 Introduction 410
   18.2 Modeling 411
   18.3 The Four Classes of Policies 412
   18.4 Basic Learning Policies for Policy Search 416
   18.5 Optimal Learning Policies for Policy Search 421
   18.6 Learning with a Physical State 427
19. An Introduction to Event-Based Optimization: Theory and Applications (Xi-Ren Cao, Yanjia Zhao, Qing-Shan Jia, and Qianchuan Zhao) 432
   19.1 Introduction 432
   19.2 Literature Review 433
   19.3 Problem Formulation 434
   19.4 Policy Iteration for EBO 435
   19.5 Example: Material Handling Problem 441
   19.6 Conclusions 448
20. Bounds for Markov Decision Processes (Vijay V. Desai, Vivek F. Farias, and Ciamac C. Moallemi) 452
   20.1 Introduction 452
   20.2 Problem Formulation 455
   20.3 The Linear Programming Approach 456
   20.4 The Martingale Duality Approach 458
   20.5 The Pathwise Optimization Method 461
   20.6 Applications 463
   20.7 Conclusion 470
21. Approximate Dynamic Programming and Backpropagation on Timescales (John Seiffertt and Donald Wunsch) 474
   21.1 Introduction: Timescales Fundamentals 474
   21.2 Dynamic Programming 479
   21.3 Backpropagation 485
   21.4 Conclusions 492
22. A Survey of Optimistic Planning in Markov Decision Processes (Lucian Busoniu, Rémi Munos, and Robert Babuška) 494
   22.1 Introduction 494
   22.2 Optimistic Online Optimization 497
   22.3 Optimistic Planning Algorithms 500
   22.4 Related Planning Algorithms 509
   22.5 Numerical Example 510
23. Adaptive Feature Pursuit: Online Adaptation of Features in Reinforcement Learning (Shalabh Bhatnagar, Vivek S. Borkar, and L.A. Prashanth) 517
   23.1 Introduction 517
   23.2 The Framework 520
   23.3 The Feature Adaptation Scheme 522
   23.4 Convergence Analysis 525
   23.5 Application to Traffic Signal Control 527
   23.6 Conclusions 532
24. Feature Selection for Neuro-Dynamic Programming (Dayu Huang, W. Chen, P. Mehta, S. Meyn, and A. Surana) 535
   24.1 Introduction 535
   24.2 Optimality Equations 536
   24.3 Neuro-Dynamic Algorithms 542
   24.4 Fluid Models 551
   24.5 Diffusion Models 554
   24.6 Mean Field Games 556
   24.7 Conclusions 557
25. Approximate Dynamic Programming for Optimizing Oil Production (Zheng Wen, Louis J. Durlofsky, Benjamin Van Roy, and Khalid Aziz) 560
   25.1 Introduction 560
   25.2 Petroleum Reservoir Production Optimization Problem 562
   25.3 Review of Dynamic Programming and Approximate Dynamic Programming 564
   25.4 Approximate Dynamic Programming Algorithm for Reservoir Production Optimization 566
   25.5 Simulation Results 573
   25.6 Concluding Remarks 578
26. A Learning Strategy for Source Tracking in Unstructured Environments (Titus Appel, Rafael Fierro, Brandon Rohrer, Ron Lumia, and John Wood) 582
   26.1 Introduction 582
   26.2 Reinforcement Learning 583
   26.3 Light-Following Robot 589
   26.4 Simulation Results 592
   26.5 Experimental Results 595
   26.6 Conclusions and Future Work 599

REFERENCES 599
INDEX 601

About the Author:
Dr. Frank Lewis is a Professor of Electrical Engineering at The University of Texas at Arlington, where he was awarded the Moncrief-O'Donnell Endowed Chair at the Automation & Robotics Research Institute in 1990. He has served as a Visiting Professor at Democritus University of Thrace in Greece, the Hong Kong University of Science and Technology, the Chinese University of Hong Kong, the City University of Hong Kong, the National University of Singapore, and Nanyang Technological University, Singapore, and has been elected Guest Consulting Professor at Shanghai Jiao Tong University and the South China University of Technology.

Dr. Derong Liu received the B.S. degree in mechanical engineering from the East China Institute of Technology (now Nanjing University of Science and Technology), Nanjing, China, in 1982; the M.S. degree in automatic control theory and applications from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 1987; and the Ph.D. degree in electrical engineering from the University of Notre Dame, Notre Dame, IN, in 1994.



Product Details
  • ISBN-13: 9781118453964
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: Wiley-IEEE Press
  • Language: English
  • Series Title: IEEE Press Series on Computational Intelligence (Vol. 17)
  • ISBN-10: 1118453964
  • Publisher Date: 28 Jan 2013
  • Binding: Digital (delivered electronically)
  • No of Pages: 648

