Multi-Agent Reinforcement Learning: Foundations and Modern Approaches

About the Book

The first comprehensive introduction to Multi-Agent Reinforcement Learning (MARL), covering MARL's models, solution concepts, algorithmic ideas, technical challenges, and modern approaches.

Multi-Agent Reinforcement Learning (MARL), an area of machine learning in which a collective of agents learn to optimally interact in a shared environment, boasts a growing array of applications in modern life, from autonomous driving and multi-robot factories to automated trading and energy network management. This text provides a lucid and rigorous introduction to the models, solution concepts, algorithmic ideas, technical challenges, and modern approaches in MARL. The book first introduces the field's foundations, including basics of reinforcement learning theory and algorithms, interactive game models, different solution concepts for games, and the algorithmic ideas underpinning MARL research. It then details contemporary MARL algorithms which leverage deep learning techniques, covering ideas such as centralized training with decentralized execution, value decomposition, parameter sharing, and self-play. The book comes with its own MARL codebase written in Python, containing implementations of MARL algorithms that are self-contained and easy to read. Technical content is explained in easy-to-understand language and illustrated with extensive examples, illuminating MARL for newcomers while offering high-level insights for more advanced readers.

Key features:
  • First textbook to introduce the foundations and applications of MARL, written by experts in the field
  • Integrates reinforcement learning, deep learning, and game theory
  • Practical focus covers considerations for running experiments and describes environments for testing MARL algorithms
  • Explains complex concepts in clear and simple language
  • Classroom-tested, accessible approach suitable for graduate students and professionals across computer science, artificial intelligence, and robotics
  • Resources include code and slides
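To give a concrete flavour of the kind of material covered, the sketch below shows tabular independent Q-learning, one of the simplest MARL baselines treated in the book: each agent learns its own Q-table while treating all other agents as part of the environment. This is a minimal illustrative sketch, not code from the book's accompanying codebase; the environment interface (reset()/step() returning per-agent observations, rewards, and a done flag) and all names are assumptions.

    import random
    from collections import defaultdict

    # Minimal sketch of independent Q-learning: each agent keeps its own
    # tabular Q-function and updates it from its own observations and rewards.
    # Assumed (hypothetical) environment API: env.reset() -> list of per-agent
    # observations (hashable); env.step(actions) -> (observations, rewards, done).

    def independent_q_learning(env, num_agents, num_actions,
                               episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
        q_tables = [defaultdict(lambda: [0.0] * num_actions) for _ in range(num_agents)]
        for _ in range(episodes):
            obs = env.reset()
            done = False
            while not done:
                actions = []
                for i in range(num_agents):
                    if random.random() < epsilon:          # epsilon-greedy exploration
                        actions.append(random.randrange(num_actions))
                    else:
                        qs = q_tables[i][obs[i]]
                        actions.append(qs.index(max(qs)))
                next_obs, rewards, done = env.step(actions)
                for i in range(num_agents):                # independent TD update per agent
                    target = rewards[i] + (0.0 if done else gamma * max(q_tables[i][next_obs[i]]))
                    q_tables[i][obs[i]][actions[i]] += alpha * (target - q_tables[i][obs[i]][actions[i]])
                obs = next_obs
        return q_tables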

Table of Contents:
Preface xi
Summary of Notation xv
List of Figures xvii

1 Introduction 1
  1.1 Multi-Agent Systems 2
  1.2 Multi-Agent Reinforcement Learning 6
  1.3 Application Examples 8
    1.3.1 Multi-Robot Warehouse Management 8
    1.3.2 Competitive Play in Board Games and Video Games 10
    1.3.3 Autonomous Driving 11
    1.3.4 Automated Trading in Electronic Markets 11
  1.4 Challenges of MARL 12
  1.5 Agendas of MARL 13
  1.6 Book Contents and Structure 15

I FOUNDATIONS OF MULTI-AGENT REINFORCEMENT LEARNING 17

2 Reinforcement Learning 19
  2.1 General Definition 20
  2.2 Markov Decision Processes 22
  2.3 Expected Discounted Returns and Optimal Policies 24
  2.4 Value Functions and Bellman Equation 26
  2.5 Dynamic Programming 29
  2.6 Temporal-Difference Learning 32
  2.7 Evaluation with Learning Curves 36
  2.8 Equivalence of R(s, a, s') and R(s, a) 39
  2.9 Summary 40

3 Games: Models of Multi-Agent Interaction 43
  3.1 Normal-Form Games 44
  3.2 Repeated Normal-Form Games 46
  3.3 Stochastic Games 47
  3.4 Partially Observable Stochastic Games 49
    3.4.1 Belief States and Filtering 53
  3.5 Modelling Communication 55
  3.6 Knowledge Assumptions in Games 56
  3.7 Dictionary: Reinforcement Learning ↔ Game Theory 58
  3.8 Summary 58

4 Solution Concepts for Games 61
  4.1 Joint Policy and Expected Return 62
  4.2 Best Response 65
  4.3 Minimax 65
    4.3.1 Minimax Solution via Linear Programming 67
  4.4 Nash Equilibrium 68
  4.5 ε-Nash Equilibrium 70
  4.6 (Coarse) Correlated Equilibrium 71
    4.6.1 Correlated Equilibrium via Linear Programming 74
  4.7 Conceptual Limitations of Equilibrium Solutions 75
  4.8 Pareto Optimality 76
  4.9 Social Welfare and Fairness 78
  4.10 No-Regret 81
  4.11 The Complexity of Computing Equilibria 83
    4.11.1 PPAD Complexity Class 84
    4.11.2 Computing ε-Nash Equilibrium is PPAD-Complete 86
  4.12 Summary 87

5 Multi-Agent Reinforcement Learning in Games: First Steps and Challenges 89
  5.1 General Learning Process 90
  5.2 Convergence Types 92
  5.3 Single-Agent RL Reductions 95
    5.3.1 Central Learning 95
    5.3.2 Independent Learning 97
    5.3.3 Example: Level-Based Foraging 99
  5.4 Challenges of MARL 101
    5.4.1 Non-Stationarity 102
    5.4.2 Equilibrium Selection 104
    5.4.3 Multi-Agent Credit Assignment 106
    5.4.4 Scaling to Many Agents 108
  5.5 What Algorithms Do Agents Use? 109
    5.5.1 Self-Play 109
    5.5.2 Mixed-Play 111
  5.6 Summary 111

6 Multi-Agent Reinforcement Learning: Foundational Algorithms 115
  6.1 Dynamic Programming for Games: Value Iteration 116
  6.2 Temporal-Difference Learning for Games: Joint Action Learning 118
    6.2.1 Minimax Q-Learning 121
    6.2.2 Nash Q-Learning 123
    6.2.3 Correlated Q-Learning 124
    6.2.4 Limitations of Joint Action Learning 125
  6.3 Agent Modelling 127
    6.3.1 Fictitious Play 128
    6.3.2 Joint Action Learning with Agent Modelling 131
    6.3.3 Bayesian Learning and Value of Information 134
  6.4 Policy-Based Learning 140
    6.4.1 Gradient Ascent in Expected Reward 141
    6.4.2 Learning Dynamics of Infinitesimal Gradient Ascent 142
    6.4.3 Win or Learn Fast 145
    6.4.4 Win or Learn Fast with Policy Hill Climbing 147
    6.4.5 Generalised Infinitesimal Gradient Ascent 149
  6.5 No-Regret Learning 151
    6.5.1 Unconditional and Conditional Regret Matching 151
    6.5.2 Convergence of Regret Matching 153
  6.6 Summary 156

II MULTI-AGENT DEEP REINFORCEMENT LEARNING: ALGORITHMS AND PRACTICE 159

7 Deep Learning 161
  7.1 Function Approximation for Reinforcement Learning 161
  7.2 Linear Function Approximation 163
  7.3 Feedforward Neural Networks 165
    7.3.1 Neural Unit 166
    7.3.2 Activation Functions 167
    7.3.3 Composing a Network from Layers and Units 168
  7.4 Gradient-Based Optimisation 169
    7.4.1 Loss Function 170
    7.4.2 Gradient Descent 171
    7.4.3 Backpropagation 174
  7.5 Convolutional and Recurrent Neural Networks 175
    7.5.1 Learning from Images – Exploiting Spatial Relationships in Data 175
    7.5.2 Learning from Sequences with Memory 178
  7.6 Summary 180

8 Deep Reinforcement Learning 183
  8.1 Deep Value Function Approximation 184
    8.1.1 Deep Q-Learning – What Can Go Wrong? 184
    8.1.2 Moving Target Problem 187
    8.1.3 Breaking Correlations 188
    8.1.4 Putting It All Together: Deep Q-Networks 192
    8.1.5 Beyond Deep Q-Networks 193
  8.2 Policy Gradient Algorithms 195
    8.2.1 Advantages of Learning a Policy 195
    8.2.2 Policy Gradient Theorem 197
    8.2.3 REINFORCE: Monte Carlo Policy Gradient 199
    8.2.4 Actor-Critic Algorithms 202
    8.2.5 A2C: Advantage Actor-Critic 203
    8.2.6 PPO: Proximal Policy Optimisation 206
    8.2.7 Policy Gradient Algorithms in Practice 209
    8.2.8 Concurrent Training of Policies 210
  8.3 Observations, States, and Histories in Practice 215
  8.4 Summary 216

9 Multi-Agent Deep Reinforcement Learning 219
  9.1 Training and Execution Modes 220
    9.1.1 Centralised Training and Execution 220
    9.1.2 Decentralised Training and Execution 221
    9.1.3 Centralised Training with Decentralised Execution 222
  9.2 Notation for Multi-Agent Deep Reinforcement Learning 222
  9.3 Independent Learning 223
    9.3.1 Independent Value-Based Learning 224
    9.3.2 Independent Policy Gradient Methods 226
    9.3.3 Example: Deep Independent Learning in a Large Task 228
  9.4 Multi-Agent Policy Gradient Algorithms 230
    9.4.1 Multi-Agent Policy Gradient Theorem 231
    9.4.2 Centralised Critics 232
    9.4.3 Centralised Action-Value Critics 236
    9.4.4 Counterfactual Action-Value Estimation 237
    9.4.5 Equilibrium Selection with Centralised Action-Value Critics 239
  9.5 Value Decomposition in Common-Reward Games 242
    9.5.1 Individual-Global-Max Property 244
    9.5.2 Linear Value Decomposition 246
    9.5.3 Monotonic Value Decomposition 249
    9.5.4 Value Decomposition in Practice 255
    9.5.5 Beyond Monotonic Value Decomposition 261
  9.6 Agent Modelling with Neural Networks 266
    9.6.1 Joint Action Learning with Deep Agent Models 267
    9.6.2 Learning Representations of Agent Policies 270
  9.7 Environments with Homogeneous Agents 274
    9.7.1 Parameter Sharing 276
    9.7.2 Experience Sharing 278
  9.8 Policy Self-Play in Zero-Sum Games 281
    9.8.1 Monte Carlo Tree Search 283
    9.8.2 Self-Play MCTS 286
    9.8.3 Self-Play MCTS with Deep Neural Networks: AlphaZero 288
  9.9 Population-Based Training 290
    9.9.1 Policy Space Response Oracles 292
    9.9.2 Convergence of PSRO 295
    9.9.3 Grandmaster Level in StarCraft II: AlphaStar 297
  9.10 Summary 301

10 Multi-Agent Deep RL in Practice 305
  10.1 The Agent-Environment Interface 305
  10.2 MARL Neural Networks in PyTorch 307
    10.2.1 Seamless Parameter Sharing Implementation 309
    10.2.2 Defining the Models: An Example with IDQN 310
  10.3 Centralised Value Functions 312
  10.4 Value Decomposition 313
  10.5 Practical Tips for MARL Algorithms 314
    10.5.1 Stacking Time Steps vs. Recurrent Network vs. Neither 314
    10.5.2 Standardising Rewards 315
    10.5.3 Centralised Optimisation 315
  10.6 Presentation of Experimental Results 316
    10.6.1 Learning Curves 316
    10.6.2 Hyperparameter Search 318

11 Multi-Agent Environments 321
  11.1 Criteria for Choosing Environments 322
  11.2 Structurally Distinct 2×2 Matrix Games 323
    11.2.1 No-Conflict Games 323
    11.2.2 Conflict Games 324
  11.3 Complex Environments 325
    11.3.1 Level-Based Foraging 326
    11.3.2 Multi-Agent Particle Environment 328
    11.3.3 StarCraft Multi-Agent Challenge 329
    11.3.4 Multi-Robot Warehouse 330
    11.3.5 Google Research Football 331
    11.3.6 Hanabi 332
    11.3.7 Overcooked 333
  11.4 Environment Collections 334
    11.4.1 Melting Pot 335
    11.4.2 OpenSpiel 335
    11.4.3 Petting Zoo 337

A Surveys on Multi-Agent Reinforcement Learning 339
Bibliography 341
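Several of the later chapters (9.7.1 Parameter Sharing, 10.2 MARL Neural Networks in PyTorch) centre on implementing these ideas with neural networks. The following PyTorch sketch illustrates the parameter-sharing idea in a hedged way: a single Q-network serves all agents, with a one-hot agent ID appended to each observation so the shared network can distinguish agents, as is common in IDQN-style training. It is not code from the book's codebase; the class name, shapes, and architecture are assumptions.

    import torch
    import torch.nn as nn

    # Illustrative parameter-sharing sketch: one Q-network shared by all agents,
    # conditioned on a one-hot agent ID concatenated to the observation.
    class SharedQNetwork(nn.Module):
        def __init__(self, obs_dim, num_actions, num_agents, hidden_dim=64):
            super().__init__()
            self.num_agents = num_agents
            self.net = nn.Sequential(
                nn.Linear(obs_dim + num_agents, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, num_actions),
            )

        def forward(self, obs, agent_id):
            # obs: (batch, obs_dim); agent_id: (batch,) integer agent indices
            one_hot = nn.functional.one_hot(agent_id, self.num_agents).float()
            return self.net(torch.cat([obs, one_hot], dim=-1))

    # Usage: the same network computes Q-values for any agent's observations.
    q_net = SharedQNetwork(obs_dim=10, num_actions=5, num_agents=3)
    obs = torch.randn(4, 10)              # batch of 4 observations
    ids = torch.tensor([0, 1, 2, 0])      # which agent each row belongs to
    q_values = q_net(obs, ids)            # shape (4, 5)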

About the Authors:
Stefano V. Albrecht is Associate Professor in the School of Informatics at the University of Edinburgh, where he leads the Autonomous Agents Research Group. His research focuses on the development of machine learning algorithms for autonomous systems control and decision making, with an emphasis on deep reinforcement learning and multi-agent interaction. Filippos Christianos is a research scientist in multi-agent deep reinforcement learning, focusing on the efficient use of MARL algorithms, and the author of several popular MARL code libraries. Lukas Schäfer is a researcher focusing on the development of more generalizable, robust, and sample-efficient decision making using deep reinforcement learning, with a particular focus on multi-agent reinforcement learning.


Product Details
  • ISBN-13: 9780262049375
  • Publisher: MIT Press Ltd
  • Publisher Imprint: MIT Press
  • Height: 178 mm
  • No of Pages: 394
  • Sub Title: Foundations and Modern Approaches
  • ISBN-10: 0262049376
  • Publisher Date: 17 Dec 2024
  • Binding: Hardback
  • Language: English
  • Returnable: Y
  • Width: 114 mm

