Aidan Cooper

Aidan Cooper

London, UK
Tech Lead, working on developing biomedical AI using generative language models. I write about data science, ML, and engineering data products.
Apr
08
A Guide to Structured Generation Using Constrained Decoding

A Guide to Structured Generation Using Constrained Decoding

The how, why, power, and pitfalls of constraining generative language model outputs
13 min read
Jul
22
Modern Data Engineering and the Lost Art of Data Modelling

Modern Data Engineering and the Lost Art of Data Modelling

Necessity was the mother of invention. Now, an abundance of cheap storage and compute makes for data anarchy.
5 min read
Jun
23
Machine Learning in the Life Sciences Has a Data Problem

Machine Learning in the Life Sciences Has a Data Problem

In a time of AI prosperity, the life sciences are at risk of being left behind
6 min read
Jun
07
Approximating Shapley Values for Machine Learning

Approximating Shapley Values for Machine Learning

The how and why of Shapley value approximation, explained in code
6 min read
Apr
07
Homogeneous neighbourhoods in the Schelling model of segregation

Gnillehcs' Model of Integration

What happens to segregated communities as people increasingly seek diversity?
3 min read
Dec
31
A power set of feature coalitions.

How Shapley Values Work

In this article, we will explore how Shapley values work - not using cryptic formulae, but by way of code and simplified explanations
10 min read
Aug
13
Industry Perspective: Tree-Based Models vs Deep Learning for Tabular Data

Industry Perspective: Tree-Based Models vs Deep Learning for Tabular Data

Tree-based models aren't just highly performant - they offer a host of other advantages
3 min read
Jul
11
4 Pandas Anti-Patterns to Avoid and How to Fix Them

4 Pandas Anti-Patterns to Avoid and How to Fix Them

pandas is a powerful data analysis library with a rich API that offers multiple ways to perform any given data
9 min read
May
16
Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis

Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis

Cluster analysis is a popular method for identifying subgroups within a population, but the results are often challenging to interpret
9 min read
Jan
02
Utility vs Understanding: the State of Machine Learning Entering 2022

Utility vs Understanding: the State of Machine Learning Entering 2022

The empirical utility of some fields of machine learning has rapidly outpaced our understanding of the underlying theory: the models
12 min read