
A Beginner's Guide to Quantization in Deep Learning with PyTorch

Apple could face a $38B fine

Welcome to the learning edition of the Data Pragmatist, your dose of all things data science and AI.

📖 Estimated Reading Time: 5 minutes. Missed our previous editions?

💰 Apple could face a $38B fine LINK

  • The European Union has charged Apple with violating the Digital Markets Act over App Store policies that it says hinder competition, making Apple the first company to be charged under the new rules.

  • Apple faces fines up to 10 percent of its annual global revenue, or $38 billion, if found guilty, with potential penalties increasing to 20 percent for repeat offenses.

  • The European Commission is also investigating Apple for its support of alternative iOS app stores, focusing on the contentious Core Technology Fee and the complex process required for installing third-party marketplaces.

🀝 Apple in talks with Meta for potential AI integration LINK

  • Apple is reportedly negotiating with Meta to integrate Meta's generative AI model into Apple's new AI system, Apple Intelligence, according to The Wall Street Journal.

  • Apple is seeking partnerships with multiple AI companies, including Meta, to enhance its AI capabilities and catch up in the competitive AI race.

  • A potential collaboration between Apple and Meta would be significant due to their history of disagreements, and it could greatly impact the AI industry if successful.

🧠 A Beginner's Guide to Quantization in Deep Learning with PyTorch

Highlights

  • Understanding Quantization: Learn what quantization is and why it's necessary.

  • Mathematical Derivations: Dive into the mathematical aspects of quantization.

  • Coding in PyTorch: Perform quantization and de-quantization of LLM weight parameters in PyTorch.

What is Quantization and Why Do You Need It?

Quantization compresses large models by reducing the precision of weight parameters and activations, significantly decreasing model size. For instance, the Llama 3 8B model reduces from 32GB to 8GB with INT8 quantization, and further to 4GB with INT4 quantization. This enables model fine-tuning and inference on devices with limited memory and processing power, reducing the need for expensive cloud resources while maintaining accuracy.
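
A quick back-of-the-envelope sketch shows where those numbers come from; it counts only the weight parameters and ignores activations, the KV cache, and other serving overhead.

# Approximate weight memory of an 8-billion-parameter model at different precisions.
params = 8e9  # Llama 3 8B

for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    gigabytes = params * bits / 8 / 1e9  # bits -> bytes -> GB
    print(f"{name}: ~{gigabytes:.0f} GB")  # FP32 ~32 GB, INT8 ~8 GB, INT4 ~4 GB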

How Does Quantization Work?

Quantization maps higher precision weights (e.g., FP32) to lower precision (e.g., INT8) using linear quantization methods. There are two modes:

  • Asymmetric Quantization: Maps original tensor values to a quantized range, using a scale value (S) and a zero point (Z).

  • Symmetric Quantization: Maps the zero point of the original tensor range directly to zero in the quantized range, eliminating the need for a separate zero point.

Asymmetric Quantization: Mathematical Derivation
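
The derivation below follows the PyTorch implementation later in this post. Given original weights in the range [Wmin, Wmax] and a target integer range [Qmin, Qmax] (for INT8, -128 to 127):

  • Scale: S = (Wmax - Wmin) / (Qmax - Qmin)

  • Zero point: Z = Qmin - Wmin / S, rounded to the nearest integer and clamped to [Qmin, Qmax]

  • Quantize: q = clamp(round(w / S + Z), Qmin, Qmax)

  • De-quantize: w ≈ S × (q - Z)

The recovered value only approximates the original weight; the difference is the quantization error measured in the code below.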

Symmetric Quantization

Symmetric quantization simplifies calculations by mapping zero to zero directly, using the same principles without needing a zero point.
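
For comparison with the asymmetric code in the next section, here is a minimal sketch of the symmetric scheme, assuming signed INT8 and a scale taken from the largest absolute weight; the function names are illustrative rather than part of the original post.

import torch

def symmetric_quantization(original_weight):
    # Symmetric scheme: scale the largest absolute weight to the edge of the
    # signed INT8 range; the zero point is implicitly 0, so none is returned.
    Qmax = torch.iinfo(torch.int8).max  # 127
    S = original_weight.abs().max().item() / Qmax
    quantized_weight = torch.clamp(torch.round(original_weight / S), -Qmax, Qmax).to(torch.int8)
    return quantized_weight, S

def symmetric_dequantization(quantized_weight, scale):
    # With no zero point to subtract, de-quantization is just a rescale.
    return scale * quantized_weight.to(torch.float32)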

Coding in PyTorch

  1. Initialize Weights: Create a random tensor.

  2. Define Functions: Write functions for asymmetric quantization and de-quantization.

  3. Perform Quantization: Calculate quantized weights, scale, and zero point.

  4. De-quantize: Recover original weights from quantized values.

  5. Evaluate Accuracy: Compute quantization error to ensure accuracy.

import torch

def asymmetric_quantization(original_weight):
    # Quantize an FP32 tensor to INT8, returning the scale (S) and zero point (Z).
    quantized_data_type = torch.int8
    # Range of the original weights and of the target integer type.
    Wmax, Wmin = original_weight.max().item(), original_weight.min().item()
    Qmax, Qmin = torch.iinfo(quantized_data_type).max, torch.iinfo(quantized_data_type).min
    # Scale maps the weight range onto the integer range.
    S = (Wmax - Wmin) / (Qmax - Qmin)
    # Zero point is the integer that represents 0.0; round it and clamp it into range.
    Z = Qmin - (Wmin / S)
    Z = int(round(max(min(Z, Qmax), Qmin)))
    # Scale, shift, round, and clamp the weights, then cast to INT8.
    quantized_weight = torch.clamp(torch.round((original_weight / S) + Z), Qmin, Qmax).to(quantized_data_type)
    return quantized_weight, S, Z

def asymmetric_dequantization(quantized_weight, scale, zero_point):
    # Recover an FP32 approximation of the original weights.
    return scale * (quantized_weight.to(torch.float32) - zero_point)

# Example: quantize a random 4x4 weight tensor, de-quantize it,
# and measure the mean squared quantization error.
original_weight = torch.randn((4, 4))
quantized_weight, scale, zero_point = asymmetric_quantization(original_weight)
dequantized_weight = asymmetric_dequantization(quantized_weight, scale, zero_point)
quantization_error = (dequantized_weight - original_weight).square().mean()

Conclusion

This guide covers essential quantization concepts and methods, providing a strong foundation for implementing quantization in LLMs and other deep learning models. For advanced techniques like per-channel and per-group quantization, stay tuned for future posts.

Top 3 AI Tools for Video Editing

1. Adobe Premiere Pro

Best for: Professional video creators

Key Features:

  • Morph Cut: Smooth transitions between clips.

  • Text-Based Editing: AI-generated transcripts for faster editing.

  • Auto Color: AI-driven color correction and grading.

  • Speech to Text: Automated transcript and caption creation.

  • Remix: Syncs video and audio, adjusts soundtracks and dialogue volumes.

Pricing: $31.49/month or $239.88/year.

2. Wondershare Filmora

Best for: Bloggers, YouTubers, and social media influencers

Key Features:

  • AI Audio Stretch: Matches audio to video length.

  • AI Smart Cutout: Removes unwanted objects and backgrounds.

  • AI Audio Denoise: Cleans background noises from audio.

  • Auto Frame: Keeps the focal point in sight.

  • Silence Detection: Removes unnecessary pauses.

Pricing: $49.99/year or a perpetual license for $79.99.

3. Runway

Best for: Web-based video editing

Key Features:

  • Text to Color: Color grading via text prompts.

  • Blur Faces: Blurs faces in videos.

  • Inpainting: Removes unwanted objects.

  • Super-Slow Motion: Adds slow motion effects.

  • Scene Detection: Detects and splits scene changes.

Pricing: Free basic plan (with watermarks), Standard plan $15/month, Pro plan $35/month, Unlimited plan $95/month.

How did you like today's email?


If you are interested in contributing to the newsletter, reply to this email. We are looking for contributions from you, our readers, to keep the community alive and growing.