Audio Source Separation and Speech Enhancement

About the Book

Learn the technology behind hearing aids, Siri, and Echo

Audio source separation and speech enhancement aim to extract one or more source signals of interest from an audio recording involving several sound sources. These technologies are among the most studied in audio signal processing today and play a critical role in the success of hearing aids, hands-free phones, voice command and other noise-robust audio analysis systems, and music post-production software. Research on this topic has followed three convergent paths, starting from sensor array processing, computational auditory scene analysis, and machine learning based approaches such as independent component analysis. This book is the first to provide a comprehensive overview by presenting the common foundations of these techniques, and the differences between them, in a unified setting.

Key features:
  • Consolidated perspective on audio source separation and speech enhancement.
  • Both historical perspective and latest advances in the field, e.g. deep neural networks.
  • Diverse disciplines: array processing, machine learning, and statistical signal processing.
  • Coverage of the most important techniques for both single-channel and multichannel processing.

This book provides both introductory and advanced material suitable for readers with basic knowledge of signal processing and machine learning. Thanks to its comprehensiveness, it will help students select a promising research track, researchers leverage the acquired cross-domain knowledge to design improved techniques, and engineers and developers choose the right technology for their target application scenario. It will also be useful for practitioners from other fields (e.g., acoustics, multimedia, phonetics, and musicology) who wish to use audio source separation or speech enhancement as pre-processing tools for their own needs.

Table of Contents:
List of Authors
Preface
Acknowledgment
Notations
Acronyms
About the Companion Website

Part I Prerequisites

1 Introduction
Emmanuel Vincent, Sharon Gannot, and Tuomas Virtanen
    1.1 Why are Source Separation and Speech Enhancement Needed?
    1.2 What are the Goals of Source Separation and Speech Enhancement?
    1.3 How can Source Separation and Speech Enhancement be Addressed?
    1.4 Outline
    Bibliography

2 Time-Frequency Processing: Spectral Properties
Tuomas Virtanen, Emmanuel Vincent, and Sharon Gannot
    2.1 Time-Frequency Analysis and Synthesis
    2.2 Source Properties in the Time-Frequency Domain
    2.3 Filtering in the Time-Frequency Domain
    2.4 Summary
    Bibliography

3 Acoustics: Spatial Properties
Emmanuel Vincent, Sharon Gannot, and Tuomas Virtanen
    3.1 Formalization of the Mixing Process
    3.2 Microphone Recordings
    3.3 Artificial Mixtures
    3.4 Impulse Response Models
    3.5 Summary
    Bibliography

4 Multichannel Source Activity Detection, Localization, and Tracking
Pasi Pertilä, Alessio Brutti, Piergiorgio Svaizer, and Maurizio Omologo
    4.1 Basic Notions in Multichannel Spatial Audio
    4.2 Multi-Microphone Source Activity Detection
    4.3 Source Localization
    4.4 Summary
    Bibliography

Part II Single-Channel Separation and Enhancement

5 Spectral Masking and Filtering
Timo Gerkmann and Emmanuel Vincent
    5.1 Time-Frequency Masking
    5.2 Mask Estimation Given the Signal Statistics
    5.3 Perceptual Improvements
    5.4 Summary
    Bibliography

6 Single-Channel Speech Presence Probability Estimation and Noise Tracking
Rainer Martin and Israel Cohen
    6.1 Speech Presence Probability and its Estimation
    6.2 Noise Power Spectrum Tracking
    6.3 Evaluation Measures
    6.4 Summary
    Bibliography

7 Single-Channel Classification and Clustering Approaches
Felix Weninger, Jun Du, Erik Marchi, and Tian Gao
    7.1 Source Separation by Computational Auditory Scene Analysis
    7.2 Source Separation by Factorial HMMs
    7.3 Separation Based Training
    7.4 Summary
    Bibliography

8 Nonnegative Matrix Factorization
Roland Badeau and Tuomas Virtanen
    8.1 NMF and Source Separation
    8.2 NMF Theory and Algorithms
    8.3 NMF Dictionary Learning Methods
    8.4 Advanced NMF Models
    8.5 Summary
    Bibliography

9 Temporal Extensions of Nonnegative Matrix Factorization
Cédric Févotte, Paris Smaragdis, Nasser Mohammadiha, and Gautham J. Mysore
    9.1 Convolutive NMF
    9.2 Overview of Dynamical Models
    9.3 Smooth NMF
    9.4 Nonnegative State-Space Models
    9.5 Discrete Dynamical Models
    9.6 The Use of Dynamic Models in Source Separation
    9.7 Which Model to Use?
    9.8 Summary
    9.9 Standard Distributions
    Bibliography

Part III Multichannel Separation and Enhancement

10 Spatial Filtering
Shmulik Markovich-Golan, Walter Kellermann, and Sharon Gannot
    10.1 Fundamentals of Array Processing
    10.2 Array Topologies
    10.3 Data-Independent Beamforming
    10.4 Data-Dependent Spatial Filters: Design Criteria
    10.5 Generalized Sidelobe Canceler Implementation
    10.6 Postfilters
    10.7 Summary
    Bibliography

11 Multichannel Parameter Estimation
Shmulik Markovich-Golan, Walter Kellermann, and Sharon Gannot
    11.1 Multichannel Speech Presence Probability Estimators
    11.2 Covariance Matrix Estimators Exploiting SPP
    11.3 Methods for Weakly Guided and Strongly Guided RTF Estimation
    11.4 Summary
    Bibliography

12 Multichannel Clustering and Classification Approaches
Michael I. Mandel, Shoko Araki, and Tomohiro Nakatani
    12.1 Two-Channel Clustering
    12.2 Multichannel Clustering
    12.3 Multichannel Classification
    12.4 Spatial Filtering Based on Masks
    12.5 Summary
    Bibliography

13 Independent Component and Vector Analysis
Hiroshi Sawada and Zbyněk Koldovský
    13.1 Convolutive Mixtures and their Time-Frequency Representations
    13.2 Frequency-Domain Independent Component Analysis
    13.3 Independent Vector Analysis
    13.4 Example
    13.5 Summary
    Bibliography

14 Gaussian Model Based Multichannel Separation
Alexey Ozerov and Hirokazu Kameoka
    14.1 Gaussian Modeling
    14.2 Library of Spectral and Spatial Models
    14.3 Parameter Estimation Criteria and Algorithms
    14.4 Detailed Presentation of Some Methods
    14.5 Summary
    Acknowledgment
    Bibliography

15 Dereverberation
Emanuël A.P. Habets and Patrick A. Naylor
    15.1 Introduction to Dereverberation
    15.2 Reverberation Cancellation Approaches
    15.3 Reverberation Suppression Approaches
    15.4 Direct Estimation
    15.5 Evaluation of Dereverberation
    15.6 Summary
    Bibliography

Part IV Application Scenarios and Perspectives

16 Applying Source Separation to Music
Bryan Pardo, Antoine Liutkus, Zhiyao Duan, and Gaël Richard
    16.1 Challenges and Opportunities
    16.2 Nonnegative Matrix Factorization in the Case of Music
    16.3 Taking Advantage of the Harmonic Structure of Music
    16.4 Nonparametric Local Models: Taking Advantage of Redundancies in Music
    16.5 Taking Advantage of Multiple Instances
    16.6 Interactive Source Separation
    16.7 Crowd-Based Evaluation
    16.8 Some Examples of Applications
    16.9 Summary
    Bibliography

17 Application of Source Separation to Robust Speech Analysis and Recognition
Shinji Watanabe, Tuomas Virtanen, and Dorothea Kolossa
    17.1 Challenges and Opportunities
    17.2 Applications
    17.3 Robust Speech Analysis and Recognition
    17.4 Integration of Front-End and Back-End
    17.5 Use of Multimodal Information with Source Separation
    17.6 Summary
    Bibliography

18 Binaural Speech Processing with Application to Hearing Devices
Simon Doclo, Sharon Gannot, Daniel Marquardt, and Elior Hadad
    18.1 Introduction to Binaural Processing
    18.2 Binaural Hearing
    18.3 Binaural Noise Reduction Paradigms
    18.4 The Binaural Noise Reduction Problem
    18.5 Extensions for Diffuse Noise
    18.6 Extensions for Interfering Sources
    18.7 Summary
    Bibliography

19 Perspectives
Emmanuel Vincent, Tuomas Virtanen, and Sharon Gannot
    19.1 Advancing Deep Learning
    19.2 Exploiting Phase Relationships
    19.3 Advancing Multichannel Processing
    19.4 Addressing Multiple-Device Scenarios
    19.5 Towards Widespread Commercial Use
    Acknowledgment
    Bibliography

Index




Product Details
  • ISBN-13: 9781119279914
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: Standards Information Network
  • Language: English
  • ISBN-10: 1119279917
  • Publisher Date: 24 Jul 2018
  • Binding: Digital (delivered electronically)
  • No of Pages: 512



