Welcome to the learning edition of the Data Pragmatist, your dose of all things data science and AI.
Estimated Reading Time: 5 minutes. Missed our previous editions?
Apple could face a $38B fine LINK
The European Union has charged Apple with violating the Digital Markets Act due to its App Store policies that hinder competition, marking Apple as the first company under these new regulations.
Apple faces fines up to 10 percent of its annual global revenue, or $38 billion, if found guilty, with potential penalties increasing to 20 percent for repeat offenses.
The European Commission is also investigating Apple for its support of alternative iOS app stores, focusing on the contentious Core Technology Fee and the complex process required for installing third-party marketplaces.
Apple in talks with Meta for potential AI integration LINK
Apple is reportedly negotiating with Meta to integrate Meta's generative AI model into Apple's new AI system, Apple Intelligence, according to The Wall Street Journal.
Apple is seeking partnerships with multiple AI companies, including Meta, to enhance its AI capabilities and catch up in the competitive AI race.
A potential collaboration between Apple and Meta would be significant due to their history of disagreements, and it could greatly impact the AI industry if successful.
A Beginner's Guide to Quantization in Deep Learning with PyTorch
Highlights
Understanding Quantization: Learn what quantization is and why it's necessary.
Mathematical Derivations: Dive into the mathematical aspects of quantization.
Coding in PyTorch: Perform quantization and de-quantization of LLM weight parameters in PyTorch.
What is Quantization and Why Do You Need It?
Quantization compresses large models by reducing the precision of weight parameters and activations, significantly decreasing model size. For instance, the Llama 3 8B model reduces from 32GB to 8GB with INT8 quantization, and further to 4GB with INT4 quantization. This enables model fine-tuning and inference on devices with limited memory and processing power, reducing the need for expensive cloud resources while maintaining accuracy.
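As a back-of-envelope check (counting weights only and ignoring activations and other overhead): 8 billion parameters x 4 bytes (FP32) is roughly 32 GB, x 1 byte (INT8) is roughly 8 GB, and x 0.5 bytes (INT4) is roughly 4 GB.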
How Does Quantization Work?
Quantization maps higher precision weights (e.g., FP32) to lower precision (e.g., INT8) using linear quantization methods. There are two modes:
Asymmetric Quantization: Maps original tensor values to a quantized range, using a scale value (S) and a zero point (Z).
Symmetric Quantization: Maps the zero point of the original tensor range directly to zero in the quantized range, eliminating the need for a separate zero point.
Asymmetric Quantization: Mathematical Derivation
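The following is a minimal sketch of the standard asymmetric linear-quantization formulas, written with the same symbols (Wmin, Wmax, Qmin, Qmax, S, Z) used in the PyTorch code later in this post:
S = (Wmax - Wmin) / (Qmax - Qmin)
Z = round(Qmin - Wmin / S), clamped to [Qmin, Qmax]
Q = clamp(round(W / S + Z), Qmin, Qmax)
De-quantization then recovers an approximation of the original weight: W is approximately S * (Q - Z).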
Symmetric Quantization
Symmetric quantization simplifies calculations by mapping zero to zero directly, using the same principles without needing a zero point.
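Sketched under the same conventions, the symmetric scheme sets the scale from the largest absolute weight and fixes the zero point at zero:
S = max(|W|) / Qmax
Q = clamp(round(W / S), Qmin, Qmax)
W is approximately S * Q.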
Coding in PyTorch
Initialize Weights: Create a random tensor.
Define Functions: Write functions for asymmetric quantization and de-quantization.
Perform Quantization: Calculate quantized weights, scale, and zero point.
De-quantize: Recover original weights from quantized values.
Evaluate Accuracy: Compute quantization error to ensure accuracy.
import torch

def asymmetric_quantization(original_weight):
    # Quantize an FP32 tensor to INT8 with asymmetric linear quantization.
    quantized_data_type = torch.int8
    # Range of the original weights and of the target integer type.
    Wmax, Wmin = original_weight.max().item(), original_weight.min().item()
    Qmax, Qmin = torch.iinfo(quantized_data_type).max, torch.iinfo(quantized_data_type).min
    # Scale maps the floating-point range onto the integer range.
    S = (Wmax - Wmin) / (Qmax - Qmin)
    # Zero point: the integer that represents 0.0, rounded and clamped to the integer range.
    Z = Qmin - (Wmin / S)
    Z = int(round(max(min(Z, Qmax), Qmin)))
    # Scale, shift, round, clamp, then cast to INT8.
    quantized_weight = torch.clamp(torch.round((original_weight / S) + Z), Qmin, Qmax).to(quantized_data_type)
    return quantized_weight, S, Z

def asymmetric_dequantization(quantized_weight, scale, zero_point):
    # Recover an approximation of the original FP32 weights.
    return scale * (quantized_weight.to(torch.float32) - zero_point)

# Example
original_weight = torch.randn((4, 4))
quantized_weight, scale, zero_point = asymmetric_quantization(original_weight)
dequantized_weight = asymmetric_dequantization(quantized_weight, scale, zero_point)
# Mean squared quantization error between recovered and original weights.
quantization_error = (dequantized_weight - original_weight).square().mean()
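The code above covers the asymmetric case. A symmetric variant follows the same pattern; the sketch below uses the same conventions, and the function names are illustrative rather than taken from the original post.

def symmetric_quantization(original_weight):
    # Symmetric variant: the zero point is fixed at 0 and the scale comes from the largest absolute weight.
    quantized_data_type = torch.int8
    Qmax = torch.iinfo(quantized_data_type).max
    S = original_weight.abs().max().item() / Qmax
    quantized_weight = torch.clamp(torch.round(original_weight / S), -Qmax, Qmax).to(quantized_data_type)
    return quantized_weight, S

def symmetric_dequantization(quantized_weight, scale):
    # De-quantization needs only the scale since the zero point is 0.
    return scale * quantized_weight.to(torch.float32)

# Usage mirrors the asymmetric example above.
sym_weight, sym_scale = symmetric_quantization(original_weight)
sym_error = (symmetric_dequantization(sym_weight, sym_scale) - original_weight).square().mean()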
Conclusion
This guide covers essential quantization concepts and methods, providing a strong foundation for implementing quantization in LLMs and other deep learning models. For advanced techniques like channel and group quantization, stay tuned for future posts.
Top 3 AI Tools for Video Editing
1. Adobe Premiere Pro
Best for: Professional video creators
Key Features:
Morph Cut: Smooth transitions between clips.
Text-Based Editing: AI-generated transcripts for faster editing.
Auto Color: AI-driven color correction and grading.
Speech to Text: Automated transcript and caption creation.
Remix: Syncs video and audio, adjusts soundtracks and dialogue volumes.
Pricing: $31.49/month or $239.88/year.
2. Wondershare Filmora
Best for: Bloggers, YouTubers, and social media influencers
Key Features:
AI Audio Stretch: Matches audio to video length.
AI Smart Cutout: Removes unwanted objects and backgrounds.
AI Audio Denoise: Cleans background noises from audio.
Auto Frame: Keeps the focal point in sight.
Silence Detection: Removes unnecessary pauses.
Pricing: $49.99/year or a perpetual license for $79.99.
3. Runway
Best for: Web-based video editing
Key Features:
Text to Color: Color grading via text prompts.
Blur Faces: Blurs faces in videos.
Inpainting: Removes unwanted objects.
Super-Slow Motion: Adds slow motion effects.
Scene Detection: Detects and splits scene changes.
Pricing: Free basic plan (with watermarks), Standard plan $15/month, Pro plan $35/month, Unlimited plan $95/month.
How did you like today's email?
If you are interested in contributing to the newsletter, respond to this email. We are looking for contributions from you, our readers, to keep the community alive and going.