
Digest #6: Bias-Variance Tradeoff, Diagnosing Hallucinations in LLMs & Major Upgrades in Text-to-Image

Mental Model Mondays, the podcast of the week and the latest news from the data world

Hi, this is Data Pragmatist with another free issue of the newsletter, tailored specifically for you. We are on a mission to make staying up to date with the world of data and AI easier. To get full newsletters three times a week, subscribe:

A new week is a chance to start new things, so this week let me give you a fresh set of things to read, review and register for. From today, your week begins with Mental Model Mondays, where you will learn one mental model every Monday. Then stimulate your brain with a fun quiz and some pearls of wisdom from achievers of the data realm.

Finally, delve into the latest happenings in the dataverse: who launched what, what became a success story, what hit a hurdle, and more.

Mental Model Mondays 🤓🤓

Bias-Variance Tradeoff

This is the tightrope walk of data modelling and machine learning: a model has to balance high bias against high variance to produce an optimal outcome. In data science terms, bias refers to the simplifying assumptions a model makes to fit the data. Variance, on the other hand, measures how sensitive the model is to its training data, i.e. how much its predictions change if it is trained on different data.

For example, when predicting housing prices, a high-variance model might predict drastically different prices for very similar houses because it latches onto noise in the training data, while a high-bias model might predict roughly the same price for all houses, regardless of their features. Striking the balance between the two is the essence of the bias-variance tradeoff.
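The tradeoff is easy to see with polynomial regression. As a rough sketch (on synthetic data, not a real housing dataset), fitting polynomials of increasing degree to noisy data shows how a low-degree model underfits (high bias) while a very high-degree model chases the noise (high variance), which shows up as low training error but worse test error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the target depends nonlinearly on x, plus noise.
def make_data(n=30):
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

x_train, y_train = make_data()
x_test, y_test = make_data()

def errors(degree):
    """Train a degree-d polynomial and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

for d in (1, 4, 15):
    train_mse, test_mse = errors(d)
    print(f"degree {d:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The degree-1 model (high bias) has large error on both sets; the degree-15 model (high variance) drives training error down but generalises worse; a middle degree tends to balance the two.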

QUIZ of the Week📖📖

Which algorithm is commonly used for clustering in unsupervised machine learning?


🎧Get Your Headphones On🎧

Diagnosing and Curing Hallucinations in LLMs, and the Journey from PM at Google AI to CEO of Galileo

This week, “The Data Scientist Show” features Vikram Chatterji, CEO of Galileo, an AI diagnostics and explainability platform used by data science teams building NLP, LLM and computer vision models across the Fortune 500 and high-growth startups. Before Galileo, Vikram led product management at Google, where his team built models for companies across retail, finance, healthcare and contact centers.

In this episode, he talks through his journey into machine learning, diagnosing language models and identifying model hallucinations, one of the major issues that needs to be addressed. He also shares his path from Google AI to Galileo and how he built his diagnostics and explainability platform. The podcast ends with his valuable advice for data scientists joining a startup.

You can listen to the podcast here.

Latest From the DataVerse

Life-Like AI Clones 😳😳

Behold and beware: it is about to get even harder to tell real people from clones in digital media. HeyGen has released hyper-realistic AI avatars for video creation, with enhancements in voice, accents and video quality. This could open the floodgates for AI-generated content in the media that stays undetectable and creates chaos. Here is a sample that could blow your mind.

MidJourney Announces Major Upgrade 🕺🏼💃🏼

Midjourney is rolling out a GPU cluster upgrade for its Pro and Mega users, which will make image rendering roughly 1.5x faster and cheaper. Any spare GPU capacity will occasionally give other users the same speed-up as well. The news was announced on its official Twitter page on August 11th.

Why Bother with ML Theory When Industry Seems Uninterested? 🤔 [Join the Discussion!]

When entering the realm of data science, everyone has to cross the valley of ML theory, but does it really add value to your journey, or can it be bypassed? ML theory means mastering statistics, probability and linear algebra, especially for deep learning, and that is a time-gobbling process: months and months of it. 🤯🤯 Yet in the ring, all that matters is whether you can successfully deploy a model or not. With pre-trained LLMs in the spotlight, a few lines of code may be all you need to create the magic. So, was the preparation worth it?

What’s your take on this? Join the Discussion below.👇🏼

Did you find this edition meaningful and informative?


Follow us on social media for more up-to-date content about data trends, ML and AI.

🐦 Twitter: @DataPragmatist

💼 LinkedIn: DataPragmatist

This post is public, so feel free to share and forward it.
