Problem Solving, Feature Engineering & Prompt Engineering
Can ChatGPT write better SQL than a data analyst?
Welcome to the new format of DataPragmatist. Based on your feedback, we are trying out a new content format. Thank you for the feedback. The email will now be shorter and more condensed.
It is Friday, and today's issue covers interesting reads on AI and data science from across the internet, picked from my reading list.
Marie Truong ran a set of experiments to see whether ChatGPT can write better SQL than she can. The results are interesting and mixed, but the outcome is clear: ChatGPT is not going to replace data analysts.
I think it’s fair to say that I “won” this SQL challenge against ChatGPT. I was still impressed by its abilities, and amazed that it is able to correct its mistake! ChatGPT definitely beats me on speed; it writes valid SQL syntax in a few seconds, whereas I need a few minutes.
So ChatGPT can write queries faster, but it makes mistakes 50% of the time. And even if it had successfully completed every challenge, we need not worry about our jobs yet. Stakeholders never come to analysts with such a well-defined request and output examples. They come with a business question, and we have to think about the best way to answer it with the available data.
Prompt engineering can be described as an art form: creating input requests for Large Language Models (LLMs) that will lead to an envisaged output. Here are twelve different techniques for crafting a single prompt or a sequence of prompts.
Prompting is becoming more useful and also more critical to how we conduct our careers. Since everyone has access to the best LLMs, how we prompt the AI makes all the difference in whether we are productive or not. This list of 12 prompting techniques will make you more productive and improve your ability to get the right answer from ChatGPT.
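To make "a sequence of prompts" concrete, here is a minimal sketch in Python. It is not taken from the linked article: the function `run_prompt_chain`, the two templates and the stand-in `fake_llm` are all hypothetical, and a real setup would swap `fake_llm` for an actual model call.

```python
from typing import Callable, List


def run_prompt_chain(question: str, steps: List[str],
                     llm: Callable[[str], str]) -> List[str]:
    """Send the templates in order; each one can see the previous answer."""
    answers = []
    previous = ""
    for template in steps:
        # Fill in the user's question and the answer from the step before.
        prompt = template.format(question=question, previous=previous)
        previous = llm(prompt)
        answers.append(previous)
    return answers


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: it only echoes the prompt."""
    return f"[reply to: {prompt}]"


# Two-step chain (made-up templates): plan first, then write the query.
steps = [
    "List the tables needed to answer: {question}",
    "Using this plan: {previous}\nWrite one SQL query for: {question}",
]
answers = run_prompt_chain("monthly active users", steps, fake_llm)
```

The point of the chain is that the second prompt is built from the first answer, so a vague business question gets broken into smaller, better-defined requests.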
What is problem solving, and what is it not, from a data analyst’s perspective? It is about delivering business value and being able to draw insights and recommendations from dry data. It is definitely not having several unrelated metrics/KPIs on a dashboard. The ideal analyst’s job is to be the bridge between the data and the business through insights and inferences. As always, knowing a framework in theory does not help, so practise it, even in your small projects. In Nancy Amandi’s words, problem solving is:
Having several charts on a dashboard that answer sub-questions that all come together to answer one big question.
Feature engineering is the process of formulating relevant features that describe the underlying data science problem as accurately as possible and make it possible for algorithms to understand and learn patterns. According to Forbes, data scientists spend about 80% of their time collecting and preparing relevant data, with data cleaning and data organizing alone taking up about 60% of the time. So what are the most used techniques in feature engineering?
Encoding
Feature Hashing
Binning
Transformer
Normalise
Feature Crossing
Read further to understand each of the techniques and when to use them.
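As a rough illustration of the techniques listed above, here is a minimal pure-Python sketch. The toy records, column names and helper functions are all made up for this example, and reading "Transformer" as a log transform is an assumption on my part; in practice you would more likely reach for a library such as pandas or scikit-learn.

```python
import hashlib
import math

# Hypothetical toy records; the column names are invented for illustration.
rows = [
    {"city": "Paris", "device": "mobile", "age": 23, "income": 1_000.0},
    {"city": "Berlin", "device": "desktop", "age": 41, "income": 10_000.0},
    {"city": "Paris", "device": "desktop", "age": 67, "income": 100_000.0},
]


def one_hot(value, categories):
    """Encoding: one 0/1 column per known category."""
    return [1 if value == c else 0 for c in categories]


def hash_bucket(value, n_buckets=8):
    """Feature hashing: map a string to one of a fixed number of buckets."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def bin_index(value, edges):
    """Binning: index of the first edge the value falls below."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def log_transform(value):
    """Transformation (assumed here to be a log): compress a skewed feature."""
    return math.log10(value)


def min_max(values):
    """Normalisation: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values match
    return [(v - lo) / span for v in values]


def cross(a, b):
    """Feature crossing: combine two categorical values into one feature."""
    return f"{a}_x_{b}"


cities = sorted({r["city"] for r in rows})
ages_scaled = min_max([r["age"] for r in rows])

features = []
for r, age_scaled in zip(rows, ages_scaled):
    features.append({
        "city_onehot": one_hot(r["city"], cities),
        "city_hashed": hash_bucket(r["city"]),
        "age_bin": bin_index(r["age"], edges=[30, 60]),
        "log_income": log_transform(r["income"]),
        "age_scaled": age_scaled,
        "city_x_device": cross(r["city"], r["device"]),
    })
```

Each helper turns one raw column into something a model can consume more easily; which of them helps depends on the data and the algorithm.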
If you are interested in contributing to the newsletter, respond to this email. We are looking for contributions from you, our readers, to keep the community alive and going.