Impromptu Engineer

Apr

08

A Guide to Structured Generation Using Constrained Decoding

The how, why, power, and pitfalls of constraining generative language model outputs

Apr 8, 2024

13 min read

Jul

22

Modern Data Engineering and the Lost Art of Data Modelling

Necessity was the mother of invention. Now, an abundance of cheap storage and compute makes for data anarchy.

Jul 22, 2023

5 min read

Jun

23

Machine Learning in the Life Sciences Has a Data Problem

In a time of AI prosperity, the life sciences are at risk of being left behind

Jun 23, 2023

6 min read

Jun

07

Approximating Shapley Values for Machine Learning

The how and why of Shapley value approximation, explained in code

Jun 7, 2023

6 min read

Apr

07

Homogeneous neighbourhoods in the Schelling model of segregation

Gnillehcs' Model of Integration

What happens to segregated communities as people increasingly seek diversity?

Apr 7, 2023

3 min read

Dec

31

How Shapley Values Work

In this article, we will explore how Shapley values work - not using cryptic formulae, but by way of code and simplified explanations

Dec 31, 2022

10 min read

Aug

13

Industry Perspective: Tree-Based Models vs Deep Learning for Tabular Data

Tree-based models aren't just highly performant - they offer a host of other advantages

Aug 13, 2022

3 min read

Jul

11

4 Pandas Anti-Patterns to Avoid and How to Fix Them

pandas is a powerful data analysis library with a rich API that offers multiple ways to perform any given data

Jul 11, 2022

9 min read

May

16

Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis

Cluster analysis is a popular method for identifying subgroups within a population, but the results are often challenging to interpret

May 16, 2022

9 min read

Jan

02

Utility vs Understanding: the State of Machine Learning Entering 2022

The empirical utility of some fields of machine learning has rapidly outpaced our understanding of the underlying theory: the models

Jan 2, 2022

12 min read

Featured Articles

A Guide to Structured Generation Using Constrained Decoding

Modern Data Engineering and the Lost Art of Data Modelling

Machine Learning in the Life Sciences Has a Data Problem

How Shapley Values Work

Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis

Utility vs Understanding: the State of Machine Learning Entering 2022

Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses

latest

A Guide to Structured Generation Using Constrained Decoding

Modern Data Engineering and the Lost Art of Data Modelling

Machine Learning in the Life Sciences Has a Data Problem

Approximating Shapley Values for Machine Learning

Gnillehcs' Model of Integration

How Shapley Values Work

Industry Perspective: Tree-Based Models vs Deep Learning for Tabular Data

4 Pandas Anti-Patterns to Avoid and How to Fix Them

Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis

Utility vs Understanding: the State of Machine Learning Entering 2022