Modern Data Engineering and the Lost Art of Data Modelling
Necessity was the mother of invention. Now, an abundance of cheap storage and compute makes for data anarchy.
Machine Learning in the Life Sciences Has a Data Problem
In a time of AI prosperity, the life sciences are at risk of being left behind
Approximating Shapley Values for Machine Learning
The how and why of Shapley value approximation, explained in code
Gnillehcs' Model of Integration
What happens to segregated communities as people increasingly seek diversity?
How Shapley Values Work
In this article, we will explore how Shapley values work - not using cryptic formulae, but by way of code and simplified explanations
Industry Perspective: Tree-Based Models vs Deep Learning for Tabular Data
Tree-based models aren't just highly performant - they offer a host of other advantages
4 Pandas Anti-Patterns to Avoid and How to Fix Them
pandas is a powerful data analysis library with a rich API that offers multiple ways to perform any given data...
Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis
Cluster analysis is a popular method for identifying subgroups within a population, but the results are often challenging to interpret
Utility vs Understanding: the State of Machine Learning Entering 2022
The empirical utility of some fields of machine learning has rapidly outpaced our understanding of the underlying theory: the models...
Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses
With interpretability becoming an increasingly important requirement for machine learning projects, there's a growing need for the complex outputs of techniques such as SHAP to be communicated to non-technical stakeholders.