What's Cool about dbt, Statistical Techniques and Data Science/Analysis Job Roles

Lessons learnt from a year of implementing dbt at Intercom

Welcome to the new format of DataPragmatist. We are trying out a new content format based on your feedback, so thank you for sharing it. The email will now be shorter and more condensed.

It is Wednesday, and today's issue covers interesting reads on data science and some interviews from across the internet.

dbt streamlines data transformation with its organized code base, utilizing opinionated folder structures. Noteworthy features include staging models for data consistency, a snapshot command for managing slowly changing dimensions, and efficient lineage and orchestration through "ref" and "source" methods. Macros offer extensibility, and built-in tests simplify test coverage. Automatic documentation generation further enhances the dbt experience.
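To make the lineage idea concrete, here is a minimal Python sketch of how "ref" calls can be turned into a run order. It is not dbt's actual implementation: the model names, the SQL snippets, and the helpers `find_refs` and `build_order` are all hypothetical and exist only for illustration. The sketch scans each model's SQL for `ref('...')` calls, treats them as dependencies, and sorts the models so that upstream ones come first.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models: name -> SQL text using Jinja-style ref()/source() calls.
MODELS = {
    "stg_users": "select * from {{ source('app', 'users') }}",
    "stg_events": "select * from {{ source('app', 'events') }}",
    "user_activity": (
        "select u.id, count(*) as n_events "
        "from {{ ref('stg_users') }} u "
        "join {{ ref('stg_events') }} e on e.user_id = u.id "
        "group by 1"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> set[str]:
    """Return the names of all models referenced via ref() in the SQL."""
    return set(REF_PATTERN.findall(sql))


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so every model comes after the models it references."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(MODELS)
print(order)  # staging models appear before user_activity
```

Because the dependencies are read from the SQL itself, nobody has to maintain a separate run order by hand. That is the property the "ref" and "source" approach is praised for above.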

While dbt presents barriers for new users, its benefits in consistency and documentation outweigh the initial hurdles. It has drawbacks around ease of iteration and boilerplate, but defining clear win conditions, adopting a gradual "land and expand" approach, and separating schemas can ease adoption in existing code bases and smooth the transition to dbt.

Natassha Selvaraj shares her journey of transitioning from a computer science student with limited practical skills to securing a data analyst job within six months. Recognizing the industry demand for data analysts and the lower entry barrier compared to data science, she focused on acquiring key skills. Her learning path included mastering SQL through a Udacity course, becoming proficient in BI tools like Tableau, and optionally adding Python to her skill set. Natassha emphasizes the importance of practical experience, suggesting that learning by doing, especially through personalized projects, significantly accelerates skill development. Additionally, she highlights the value of networking, personalized outreach to hiring managers, and the significance of Excel proficiency in securing entry-level data analyst positions.

In retrospect, Natassha suggests prioritizing Excel skills for quick job placement, given its prevalence in data storage among many companies. She also recommends leveraging AI tools like ChatGPT and Microsoft Copilot for enhanced data analysis efficiency. Natassha underscores the importance of maintaining a balance between mastering fundamental skills and incorporating new tools and techniques in the dynamic field of data analytics.

James Le's article discusses the evolving role of data scientists and emphasizes the importance of statistical learning in data science. He highlights why understanding these techniques matters for effective data analysis and mentions their applications in various real-world scenarios.

It covers ten key statistical techniques:

  1. Linear Regression

  2. Classification (Logistic Regression and Discriminant Analysis)

  3. Resampling Methods (Bootstrapping and Cross-Validation)

  4. Subset Selection (Best-Subset, Forward Stepwise, Backward Stepwise)

  5. Shrinkage (Ridge Regression and Lasso)

  6. Dimension Reduction (Principal Component Regression and Partial Least Squares)

  7. Nonlinear Models (Step Functions, Piecewise Functions, Splines, Generalized Additive Models)

  8. Tree-Based Methods (Bagging, Boosting, Random Forest)

  9. Support Vector Machines (SVM)

  10. Unsupervised Learning (Principal Component Analysis, k-Means Clustering, Hierarchical Clustering)
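As a taste of the first technique on the list, here is a minimal sketch of simple linear regression using ordinary least squares, written with only the Python standard library. It is not taken from the article: the function name `fit_line` and the toy data are assumptions made for illustration. The slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means.

```python
from statistics import fmean


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = fmean(xs), fmean(ys)
    # Sum of squared deviations of x (proportional to its variance).
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    # Sum of cross deviations (proportional to the covariance of x and y).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Toy data generated from y = 2x + 1 with no noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

On real data the points will not fall exactly on a line, and the fitted line is the one that minimizes the sum of squared vertical distances to the points.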

Data Science/Analysis Job Roles to Pursue Right Now

  1. Survey Programmer Expert (WFH) at Uplers

  2. Data Consultant with Banking Domain at TribolaTech Inc

  3. Quality Analyst (FinCrime) at Revolut

  4. Data Collector - WFH - No experience needed at TransPerfect

  5. Data Science - Technical Lead at One97 Communications Limited

Today, I want to recommend a newsletter: Geek AI, a 3-minute newsletter on what matters in AI.

If you are interested in contributing to the newsletter, respond to this email. We are looking for contributions from you, our readers, to keep the community alive and going.