How Spam Filters Work, and an Open-Source GitHub Copilot: Code Llama

4 min read | Study Bayesian inference, Vicuna (a revolutionary chatbot) and curated news for you

Arun Chinnachamy
August 28, 2023

Hi, this is Data Pragmatist with another free issue of the newsletter tailored specifically for you. We are on a mission to make staying up to date with the world of data and AI easier. If you find this interesting, feel free to share it with others and tag us for a chance to win free goodies.

Another Monday, a new week of endless possibilities. A lot is happening in the world, and we are seeing accelerated innovation across all things AI. Let's look at what we are going to talk about today.

The concept of the week: Bayesian inference and how it is used in everyday life.
What if a language model communicated with you like a chatbot? Well, that's Vicuna for you. Its inception, training and assessment are discussed in this week's podcast.
Lastly, we have some of the biggest launches that the whole world (of Windows users) was waiting for. Read more on our website.

Bayesian Inference

Imagine you're trying to predict whether it will rain tomorrow. Initially, you have a prior belief that there's a 30% chance of rain based on historical data. As you check the weather forecast and see clouds gathering, your belief is updated. The Bayesian approach combines your prior belief (prior probability) with the new evidence (likelihood of clouds indicating rain) to provide an updated probability of rain (posterior probability).

In short, Bayesian inference is a statistical approach that updates beliefs, expressed as probabilities, about a hypothesis as new evidence or data becomes available. It's based on Bayes' theorem, which describes how to revise our beliefs in the presence of new information.
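To make the rain example concrete, here is a minimal Python sketch of a single Bayesian update. The 30% prior comes from the example above; the two likelihoods (how often clouds gather before a rainy day versus a dry day) are made-up numbers chosen only for illustration, and the function name `posterior` is our own.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a yes/no hypothesis and one piece of evidence."""
    # Total probability of seeing the evidence at all.
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


p_rain = 0.30               # prior belief, from the example above
p_clouds_given_rain = 0.80  # assumed: clouds gather before 80% of rainy days
p_clouds_given_dry = 0.30   # assumed: clouds gather before 30% of dry days

p_rain_given_clouds = posterior(p_rain, p_clouds_given_rain, p_clouds_given_dry)
print(f"Chance of rain after seeing clouds: {p_rain_given_clouds:.0%}")
```

Under these assumed numbers, seeing clouds raises the belief in rain from 30% to a little over 50%. With different likelihoods the update would be larger or smaller, which is the whole point of the method.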

Bayesian inference is commonly found in medical diagnosis and in recommendation systems that are constantly updated based on preferences and user history, and it is used extensively in sentiment analysis. One of the most common use cases is spam email filtering.

Bayesian Inference is utilized to classify incoming emails as either spam or not spam based on the occurrence of certain keywords or patterns. By continuously updating the probabilities of certain words being associated with spam or legitimate content, the system becomes more accurate over time in identifying and filtering out unwanted spam emails, providing users with a cleaner and more relevant email experience.
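The idea can be sketched with a tiny naive Bayes classifier in plain Python. This is an illustrative toy, not how any particular mail provider works: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the add-one smoothing and the four training messages are all assumptions made for the example.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-count spam filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Each labelled message updates the word statistics for its class.
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        # Assumes at least one training message per class has been seen.
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: how common this class is among the training messages.
            score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Likelihood of the word in this class, with add-one
                # smoothing so unseen words never give a zero probability.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            log_scores[label] = score
        # Turn the two log scores into a posterior probability of spam.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting notes for monday", "ham")
spam_filter.train("lunch on monday", "ham")

print(spam_filter.spam_probability("free money"))
print(spam_filter.spam_probability("monday meeting"))

# New evidence updates the filter: one more labelled spam message
# makes it more confident about similar wording.
before = spam_filter.spam_probability("claim your prize")
spam_filter.train("claim your prize today", "spam")
after = spam_filter.spam_probability("claim your prize")
print(before, after)
```

Every call to `train` plays the role of new evidence, so the word probabilities shift a little with each labelled email. A production filter would need a far larger corpus and better tokenization, but the updating step is the same.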

Bayesian inference applies to the future of AI too. We all expect AI to mould society in the years ahead, but to what extent, and how? This week's podcast offers fresh evidence you can use to update your own prediction.

Vicuna: A Revolutionary Open-Source ChatBot Model

This week’s episode is all about Vicuna, a revolutionary open-source chatbot from UC Berkeley. It was created by a research group headed by Joey Gonzalez, a computer science professor, co-director of the RISE and Sky Labs, and a member of the Berkeley AI Research (BAIR) group.

Jon Krohn and Joey talk about Vicuna, the latest LLM from Berkeley, and how Meta AI’s Llama model is the foundation of the Vicuna model. (Continue reading for more on Llama and its latest updates.)

Llama, though it embodies a great deal of knowledge, doesn’t work as a chatbot or imparter of knowledge. So the team came up with a method to train the Llama model to behave more like a chatbot, with ChatGPT as their guidance mechanism.

For the dataset, the team of students used the public APIs of the ShareGPT website and combined the data with the Alpaca training scripts to fine-tune Llama into the Vicuna model. Joey also described the development and testing process in detail and noted how their model performed against other models on Chatbot Arena. He also talked about Gorilla, their open-source ChatGPT alternative.

In the end, they discussed the importance of open-source models, explaining that they widen access to capabilities that can critically improve our work environment and beyond, and what’s in store for Chatbot Arena in the future. The way Joey explains the progress of his projects is really inspiring and insightful.

If you’re interested, check out the full episode here. So, what do you think the future holds for AI or, better put, what does AI hold for the future?

Drop your answers in the poll of the week.

Which is more beneficial?

Now for some curated news from the Data Science fraternity.

Code Llama: Meta’s AI Tool for Coding

Code Llama is an AI model built on top of Llama 2 that can generate and discuss code. It is a state-of-the-art LLM that is now free for research and commercial use.

It makes developers' work easier: they can generate code from a few text prompts, which improves productivity and cuts down working time.

Because the model takes natural-language instructions as input, it can produce the expected output from fewer or shorter prompts.

Windows 11 to Get an AI Upgrade?

After last week's big launch of Python integration in Excel, Microsoft is now looking to build AI capabilities into Windows 11 apps such as Paint, Snipping Tool, Camera and Photos.

Imagine effortlessly removing people and objects from your photos using the 'Photos' app, or having the Snipping Tool and Camera instantly recognize and extract text from screenshots.

Hold on, that's not all: even the trusty 'Paint' app is getting a futuristic boost thanks to AI inspired by Bing Image Creator.

Mark your calendar for September 21st, when Microsoft unveils more exciting details. Brace yourself for a Windows platform update in 2024 that promises to seamlessly integrate these AI marvels.

Microsoft's game plan? Supercharging familiar apps with AI smarts instead of starting from scratch. Taking a leaf from Google's book, this clever integration could speed up how quickly everyone hops on the AI bandwagon.

Read the full story here.

Moving on, here’s something really useful for you.

18-in-1 ChatGPT Hack and Tutorial

📖 Alvarado Cintas has compiled his most-used instructions into one prompt and added a video tutorial in this Twitter thread. It gives you a list of out-of-the-box prompts and instructions to play around with in ChatGPT or any other generative AI model.

Check it out here, and add your own if you need more.

Ending with Some Good News

Tech Giants Have Dropped Huge Sums into Hugging Face

🤗 In a blockbuster move, AI startup Hugging Face secured $235M in Series D funding, backed by tech giants such as Google, Amazon, Nvidia, Salesforce, and IBM.

🤗 Hugging Face is a hub for open-source AI, offering tools, libraries, and a vast model collection with 500,000 models and 250,000 datasets.

🤗 This funding pushes their total raised amount to $395.2M and their valuation to an impressive $4.5B. CEO Clem Delangue envisions 100M AI builders relying on Hugging Face in 5 years, as AI becomes central to software development.

🤗 This funding spree underscores the surging demand for collaborative AI innovation. Hugging Face is a dominant player in open-source, perfectly positioned for the era of exploding custom AI models.

Check out the full news here.

We've started the week on the right footing, with more to come. Have a happy week ahead.

In case you missed the last two posts, do check them out.

Most exciting release "Python in Excel" & Few great alternative BI tools

Can you run Python in Excel sheets? Weekend Data Challenge and AI Civilisations.

datapragmatist.com/p/python-in-excel

What is Kurtosis? Can we teach common sense to AI?

Neuro Symbolic AI & Data Visualisations

datapragmatist.com/p/kurtosis-can-teach-common-sense-ai

🐦 Twitter: @DataPragmatist

💼 LinkedIn DataPragmatist

This post is public, so feel free to share and forward it.