Comprehensive overview of the principles, theories, and techniques behind deep learning enabled semantic communications
Deep Learning Enabled Semantic Communications explores the synergy between deep learning and semantic communication, particularly in the context of advancing 6G networks. It provides a focused introduction to the subject, systematically covering deep learning enabled semantic communication systems and task-oriented semantic transmission paradigms in wireless communication.
The book reviews various aspects of semantic communications, including information theory, multimodal technologies, semantic noise, and semantic sensing. It explores cutting-edge semantic communication architectures, highlighting their advantages over traditional approaches and their potential to drive the future of the intelligent information industry. The book also details applications of deep learning-based semantic communication systems across various sources, including text, speech, images, and videos, comprehensively addressing system design, performance optimization, and measurement metrics.
The book is divided into eight main parts, which cover foundational knowledge, system design, multimodal and multitask-oriented semantic communication systems, joint semantic sensing and sampling, semantic noise suppression, and generative AI enabled systems.
Written by a diverse group of experts in academia and research institutions, Deep Learning Enabled Semantic Communications includes information on:
- Fundamental knowledge about deep learning and semantic communications, including the history, neural networks, and semantic information theory
- Compression of multimodal inputs, extraction of global semantic information, and the design of neural networks to boost the capability of handling lengthy speech
- Incorporation of different sources to extract semantic features and serve diverse intelligent tasks at the receiver
- Introduction of semantic impairments in communications to uncover how to design robust systems
- Joint design of data sampling, compression, and coding schemes under the guidance of semantic information
- Framework of generative semantic communications to detail the principles of incorporating generative models into semantic communications
Deep Learning Enabled Semantic Communications is an essential learning resource and reference for graduate and undergraduate students pursuing degrees in wireless communications, signal processing, or deep learning as well as engineers in the telecommunications and IT industries focusing on wireless communication techniques.
Table of Contents:
Foreword
Preface
Acknowledgments
Acronyms
Notation
1 Introduction
1.1 Conventional Communications versus Semantic Communications
1.1.1 Three-level Communications
1.1.2 History of Semantic Communications
1.2 Introducing Deep Learning to Semantic Communications
1.2.1 Deep Learning Basics
1.2.2 Deep Learning Enabled Semantic Communications
1.3 Semantic Communications for Further Networks
References
2 Semantic Information Theory
2.1 Semantic Entropy
2.1.1 Logical Probability Based
2.1.2 Synonymous Mapping Based
2.1.3 Fuzzy Theory Based
2.1.4 Task Based
2.2 Semantic Channel Capacity
2.2.1 Logical Probability Based
2.2.2 Synonymous Mapping Based
2.3 Semantic Source Coding Theorem
2.3.1 Logical Probability Based
2.3.2 Synonymous Mapping Based
2.4 Semantic Channel Coding Theorem
2.4.1 Logical Probability Based
2.4.2 Synonymous Mapping Based
2.5 Information Bottleneck
2.5.1 Classical Information Bottleneck
2.5.2 Knowledge Collision-based Information Bottleneck
References
3 Joint Semantic-channel Coding for Source Reconstruction
3.1 Semantic Communications for Text
3.1.1 Joint Semantic-channel Coding for Text
3.2 Semantic Communications for Speech
3.2.1 Joint Semantic-channel Coding for Speech
3.3 Semantic Communications for Image
3.3.1 Joint Semantic-channel Coding for Image
3.4 Performance Metrics
3.4.1 Performance Metrics for Text Accuracy
3.4.2 Performance Metrics for Speech Quality
3.4.3 Performance Metrics for Image Quality
References
4 Task-oriented Semantic Communications
4.1 Single-modal Task-oriented Semantic Communications
4.1.1 Semantic Communications for Machine Translation
4.1.2 Semantic Communications for Speech Recognition and Synthesis
4.2 Multimodal Task-oriented Semantic Communications
4.2.1 Semantic Communication Systems for Visual Question Answering
References
5 Joint Sensing and Semantic Communications
5.1 Introduction and Framework of Joint Sampling and Coding
5.1.1 Semantic Sampling
5.1.2 Semantic Reconstruction
5.2 Joint Semantic Sampling and Coding for Image
5.2.1 Semantic-aware Image Compressed Sensing
5.2.2 Adaptive Sampling and Semantic-channel Coding
5.3 Joint Semantic Sampling and Coding for Video
5.3.1 Semantic-based Video Sampling and Reconstruction
References
6 Semantic Impairments in Communications
6.1 JSCC Framework with Semantic Impairments
6.2 Source Semantic Impairments Suppression
6.2.1 Robust Semantic Communications for Text
6.2.2 Robust Semantic Communications for Speech
6.2.3 Robust Semantic Communications for Image
6.3 Knowledge Base Semantic Impairments Suppression
6.3.1 Robust SKB
References
7 Generative AI-enabled Semantic Communications
7.1 Introducing Generative Models to Semantic Communications
7.2 Framework of Generative Semantic Communications
7.2.1 Main Components of Generative Semantic Communication System
7.2.2 Key Interactions and Processes
7.3 Demonstration of Generative Semantic Communication for Video Conferencing
7.4 Applications of Semantic Communications in Other Scenarios
7.4.1 Immersive Communications
7.4.2 Autonomous Driving
7.4.3 Smart Cities
7.4.4 Satellite Networks
References
8 Conclusion and Challenges
Index
About the Authors:
Zhijin Qin is an Associate Professor with Tsinghua University, China. She is an Associate Editor for IEEE Transactions on Communications, IEEE Transactions on Cognitive Communications and Networking, and IEEE Communications Letters.
Huiqiang Xie, PhD, is an Associate Professor at Jinan University, Guangzhou, Guangdong, China.
Zhenzi Weng is a postdoctoral researcher at Imperial College London, UK.
Xiaoming Tao is a Professor with the Department of Electronic Engineering at Tsinghua University. She is also a Senior Member of the IEEE.