Vectorization: A Practical Guide to Efficient Implementations of Machine Learning Algorithms

International Edition


About the Book

Enables readers to develop foundational and advanced vectorization skills for scalable data science and machine learning, and to address real-world problems.

Offering insights across domains such as computer vision and natural language processing, Vectorization covers the fundamental topics of vectorization, including array and tensor operations, data wrangling, and batch processing. The book illustrates how these principles lead to successful outcomes in machine learning projects, with each chapter including practical case studies and code implementations using NumPy, TensorFlow, and PyTorch. Each chapter contains one or both of two types of content: an introduction to, or comparison of, specific operations in the numerical libraries (presented as tables), and case studies that apply the concepts introduced to solve a practical problem (presented as code blocks and figures). Readers can approach the material by reading the text, running the code blocks, or examining the figures.
Written by the developer of the first recommendation system on the Peacock streaming platform, Vectorization explores sample topics including:
  • Basic tensor operations and the art of tensor indexing, elucidating how to access individual elements or subsets of tensor elements
  • Vectorization in tensor multiplications and common linear algebraic routines, which form the backbone of many machine learning algorithms
  • Masking and padding, concepts which come into play when handling data of non-uniform sizes, and string processing techniques for natural language processing (NLP)
  • Sparse matrices and their data structures and integral operations, and ragged or jagged tensors and the nuances of processing them

From the essentials of vectorization to the subtleties of advanced data structures, Vectorization is an ideal one-stop resource for both beginners and experienced practitioners, including researchers, data scientists, statisticians, and other professionals in industry, who seek academic success and career advancement.

Table of Contents:
About the Author
Preface
Acknowledgment

1 Introduction to Vectorization
  1.1 What Is Vectorization
    1.1.1 A Simple Example of Vectorization in Action
    1.1.2 Python Can Still Be Faster!
    1.1.3 Memory Allocation of Vectorized Operations
  1.2 Case Study: Dense Layer of a Neural Network
  1.3 Vectorization vs. Other Parallel Computing Paradigms
    1.3.1 Multithreading
    1.3.2 Multiprocessing
    1.3.3 Multiworker Distributed Computing
  Bibliography

2 Basic Tensor Operations
  2.1 Tensor Initializers
  2.2 Data Type and Casting
    2.2.1 Tips on Specifying the dtypes During Tensor Initialization
    2.2.2 Tips on Casting
  2.3 Mathematical Operations
  2.4 Reduction Operations
  2.5 Value Comparison Operations
  2.6 Logical Operations
  2.7 Ordered Array-Adjacent Element Operations
  2.8 Array Reversing
  2.9 Concatenation, Stacking, and Splitting
  2.10 Reshaping
  2.11 Broadcasting
  2.12 Case Studies
    2.12.1 Image Normalization
    2.12.2 Pearson’s Correlation
    2.12.3 Pair-wise Difference
    2.12.4 Construction of Magic Squares
  Bibliography

3 Tensor Indexing
  3.1 Get Values at Index
    3.1.1 Integer Indexing
    3.1.2 Flat Index vs. Multi-index
    3.1.3 Boolean Indexing
  3.2 Slicing
    3.2.1 Reusing Slice Configuration
  3.3 Case Study: Get Consecutive Index
  3.4 Take and Gather
    3.4.1 Take
    3.4.2 Take Along Axis
    3.4.3 Gather
    3.4.4 N-Dimensional Gather
  3.5 Assign Values at Index
  3.6 Put and Scatter
    3.6.1 Put
    3.6.2 Put Along Axis
    3.6.3 Multi-index Scatter Replacement
    3.6.4 Additional Scatter Operations from PyTorch
  3.7 Case Study: Batchwise Scatter Values
  Bibliography

4 Linear Algebra
  4.1 Tensor Multiplications
  4.2 The matmul Operation
    4.2.1 The @ Operator
  4.3 The tensordot Operation
    4.3.1 Heuristics of tensordot Operations
  4.4 Einsum
  4.5 Case Study: Pair-wise Pearson’s Cross-Correlation
  4.6 Case Study: Hausdorff Distance
  4.7 Common Linear Algebraic Routines
  4.8 Case Study: Fitting Single Exponential Curves
  Bibliography

5 Masking and Padding
  5.1 Masking
    5.1.1 Triangular and Diagonal Masks
    5.1.2 Changing Elements Using the where Operation
    5.1.3 Use Multiplication to Apply Masks
    5.1.4 Use Arithmetic Operations as Boolean Operations to Apply and Combine Masks
    5.1.5 Select Elements Based on Masking
    5.1.6 Case Study: Top-k Masking
  5.2 Padding
    5.2.1 Case Study: Padding in Convolutional Neural Networks
    5.2.2 Case Study: Truncate or Pad Sequence to Desired Length
  5.3 Advanced Case Studies
    5.3.1 Scaled-Dot Product Attention
    5.3.2 Variable-Length Range via Masking
    5.3.3 Length Regulator Module of FastSpeech 2
  Bibliography

6 String Processing
  6.1 String Data Types
    6.1.1 NumPy String, Bytes, and Object
    6.1.2 Pandas String
    6.1.3 Tensorflow Bytes
    6.1.4 PyTorch
  6.2 String Operations
  6.3 Case Study: Parsing DateTime from String Representations
  6.4 Mapping Strings to Indices
    6.4.1 NumPy np.unique
    6.4.2 Pandas pd.Categorical
    6.4.3 Scikit-learn sklearn.preprocessing.LabelEncoder
    6.4.4 Tensorflow tf.lookup
    6.4.5 TorchText torchtext.vocab
  6.5 Case Study: Factorization Machine
    6.5.1 Factorization Machine Model
    6.5.2 More Efficient Optimization Criterion
    6.5.3 Implementation of Deep Factorization Machine in Tensorflow
    6.5.4 Training DeepFM on MovieLens 1M Dataset
  6.6 Regular Expressions (Regex)
  6.7 Data Serialization and Deserialization
  Bibliography

7 Sparse Matrix
  7.1 Scipy’s Sparse Matrix Classes
    7.1.1 Coordinate Sparse Matrix (coo_matrix)
    7.1.2 Compressed Sparse Column Matrix (csc_matrix)
    7.1.3 Compressed Sparse Row Matrix (csr_matrix)
    7.1.4 Block Sparse Row Matrix (bsr_matrix)
    7.1.5 Dictionary of Keys Sparse Matrix (dok_matrix)
    7.1.6 Row-Based List of List Sparse Matrix (lil_matrix)
    7.1.7 Diagonal Storage Sparse Matrix (dia_matrix)
    7.1.8 Comparisons Between Different Sparse Matrix Formats
  7.2 Sparse Matrix Broadcasting
    7.2.1 Scalar Broadcasting
    7.2.2 Row-wise Broadcasting
    7.2.3 Column-wise Broadcasting
    7.2.4 Multiplication on Sparse Indices
  7.3 Tensorflow’s Sparse Tensors
    7.3.1 SparseTensor Class
    7.3.2 Sparse CSR Matrix
  7.4 PyTorch’s Sparse Matrix
  7.5 Sparse Matrix in Other Python Libraries
  7.6 When (Not) to Use Sparse Matrix
  7.7 Case Study: Sparse Matrix Factorization with ALS
    7.7.1 Matrix Factorization
    7.7.2 Parameter Updates with ALS
    7.7.3 Adding Bias Terms to Matrix Factorization
    7.7.4 Adding Regularization Term
    7.7.5 Implementing ALS
    7.7.6 Training a Model with MovieLens-100k
  Bibliography

8 Jagged Tensors
  8.1 Left Align a Sparse Tensor to Represent Ragged Tensor
  8.2 Index to Binary Indicator
  8.3 Case Study: Jaccard Similarities Using Sparse Matrix
  8.4 Case Study: Batchwise Set Operations
  8.5 Case Study: Autoencoders with Sparse Inputs
    8.5.1 Embedding Lookup on Sparse Inputs
    8.5.2 Inputs with Weights
  Bibliography

9 Groupby, Apply, and Aggregate
  9.1 Pandas Groupwise Operations
  9.2 Reshaping and Windowing of Dense Tensors
  9.3 Case Study: Vision Transformer (ViT)
  9.4 Bucketizing Values
  9.5 Segment-wise Aggregation
  9.6 Case Study: EmbeddingBag
  9.7 Case Study: Vocal Duration Constrained by Note Duration
  9.8 Case Study: Filling of Missing Values in a Sequence
  Bibliography

10 Sorting and Reordering
  10.1 Sorting Operations
  10.2 Case Study: Top-k Using argsort and argpartition
  10.3 Case Study: Sort the Rows of a Matrix
  10.4 Case Study: Reverse Padded Sequence
  10.5 Case Study: Gumbel-Max Sampling with Weights
  10.6 Case Study: Sorting Articles Around Anchored Advertisements
  Bibliography

11 Building a Language Model from Scratch
  11.1 Language Modeling with Transformer
    11.1.1 Encoder and Decoder of the Transformer Architecture
    11.1.2 Training of Transformer Models
  11.2 Pre-LN vs. Post-LN Transformer
  11.3 Layer Normalization
  11.4 Positional Encoding and Embedding
    11.4.1 Sinusoidal Positional Encoding
    11.4.2 Position as Categorical Embeddings
    11.4.3 Relative Positional Encoding (RPE)
    11.4.4 Rotary Positional Encoding (RoPE)
  11.5 Activation Functions in Feedforward Layer
  11.6 Case Study: Training a Tiny LLaMA Model for Next Token Prediction
  11.7 A Word on AI Safety and Alignment
  11.8 Concluding Remarks
  Bibliography

Index



Product Details
  • ISBN-13: 9781394272945
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: Wiley-IEEE Press
  • Language: English
  • Returnable: N
  • Weight: 862 g
  • ISBN-10: 1394272944
  • Publisher Date: 10 Dec 2024
  • Binding: Hardback
  • No of Pages: 448
  • Sub Title: A Practical Guide to Efficient Implementations of Machine Learning Algorithms

