Explicable Modelling

Tailored Modelling

I wrote the open-source interpretable modelling library Durkon from scratch, using nothing more complicated than NumPy and Pandas. Because it’s my own work, I know exactly how to modify and extend it to fit your organisation’s unique needs.

"Hugh is the best I've ever worked with at certain invaluable things: he thinks from first principles and covers all of the many ways that ML training and deployment goes wrong. I've seen him implement unprecedented techniques by hand, all the while watching for the actual business effect behind the metrics." Gavin Leech, AI Researcher, cofounder of Arb Research

Inherent Interpretability

All Durkon models can be represented as a collection of Partial Dependency Plots and/or Relativity Tables. This means the decisions they make will always be easy to explain to your clients / customers / regulators / underwriters / superiors / subordinates / self. (Interactive example graphs are here.)
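
For readers who haven't met a Relativity Table before, here is a minimal sketch of how a model expressed that way turns inputs into a prediction. The base rate, feature names and multipliers are all made up for illustration, and this is not Durkon's actual API; it only shows why such a model is easy to explain: every prediction is a base value times one visible multiplier per feature.

```python
import pandas as pd

# Hypothetical base rate and relativity tables (illustrative numbers only).
BASE_RATE = 200.0
RELATIVITIES = {
    "vehicle_age": {"0-3": 1.20, "4-9": 1.00, "10+": 0.85},
    "region": {"urban": 1.10, "rural": 0.90},
}

def predict(df):
    """Multiply the base rate by each feature's relativity, row by row."""
    pred = pd.Series(BASE_RATE, index=df.index)
    for feature, table in RELATIVITIES.items():
        # Look up the multiplier for each row's level of this feature.
        pred = pred * df[feature].map(table)
    return pred

rows = pd.DataFrame({"vehicle_age": ["0-3", "10+"], "region": ["urban", "rural"]})
predictions = predict(rows)
print(predictions)
```

Explaining the first row's prediction is then a matter of reading off three numbers: the base rate, the multiplier for a 0-3 year old vehicle, and the multiplier for an urban region.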

"We've used multiple Durkon models in our model stack, both for legible uses internally, but also as live models - they are both quicker to build and more accurate than [handmade] GLM models. Hugh's support to build and deploy the models, as well as understanding business problems to update and improve on the model features, has been second to none." Aled Price, Head of Pricing, By Miles

Eloquent Documentation

I am that most precious of things, an engineer who knows how to communicate. I wrote everything on this website; I’ll write the documentation for my work in a similar style and to a similar standard.

"Hugh is very good at using words and images to explain technical concepts in clear and engaging ways." Dr Jessica Rumbelow (née Cooper), AI Researcher

Optional Extras

Free Demo

If you want to confirm ahead of time that I can handle whatever problem your company is facing, select or generate an appropriate dataset and I’ll use it to demonstrate what my tools can do before beginning paid work.

A dataset.
Details of the model you’d need.

A model built using that dataset.
Confidence that Explicability is the right choice for your business.

“Willis Towers Holmes” Guarantee

Every time I’ve seen a model made by Willis Towers Watson, or using their software, it’s turned out to contain multiple trivial errors. If you’re using an Emblem model, I guarantee I can provide something that does the same job better.

Whatever you gave WTW and/or your Emblem modelling team.
Whatever they gave you in return.

A better model.
A detailed explanation of everything wrong with your current model.
. . . or your money back!

Personnel Training

Most clients just want models and model-producing scripts. However, if you’re interested in developing in-house talent for using Durkon, I'd be very happy to talk through my approach in greater detail.

Someone who knows Python, NumPy, and Pandas to an acceptable level.

Someone who knows everything they need to know about Durkon modelling.

Model Adjustment

You may have a model you want to update for a new context, or to which you want to add a handful of new features, while leaving most of the structure intact. Durkon is quite capable of handling that.

Your current model’s predictions on a training dataset.
A list of features I should use for adjustments.

A list of suggested adjustments.
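
As a rough idea of what a "suggested adjustment" can look like, the sketch below compares an existing model's predictions with actual outcomes for each level of a new feature and proposes a multiplier per level. The column names and figures are hypothetical, and this simple one-way ratio ignores interactions between features; it is an illustration of the inputs and outputs, not a description of how Durkon fits adjustments.

```python
import pandas as pd

# Hypothetical training data: the current model's predictions, what actually
# happened, and one new feature the current model does not use.
data = pd.DataFrame({
    "existing_prediction": [100.0, 120.0, 80.0, 100.0, 150.0, 50.0],
    "actual": [110.0, 132.0, 88.0, 90.0, 135.0, 45.0],
    "new_feature": ["a", "a", "a", "b", "b", "b"],
})

# Total actual vs. total predicted for each level of the new feature.
totals = data.groupby("new_feature")[["actual", "existing_prediction"]].sum()

# Suggested multiplicative adjustment: how far off the current model is per level.
suggested = totals["actual"] / totals["existing_prediction"]
print(suggested)
```

A multiplier above 1 says the current model underpredicts for that level; below 1, that it overpredicts. The rest of the existing model is left untouched.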

Novel Constraints

Do you want some features to have monotonic effects on the predicted outcome? Are there columns which should never produce a >30% change to the final prediction? Is there some other limitation that you want your model optimized around? Let me know, and I’ll build something which ticks all your boxes without crossing any of your red lines.

A list of things you want the model (not) to do.

A model which does(n’t) do those things.
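
To make the two example constraints concrete, here is a minimal sketch that applies them after the fact to one ordered feature's relativities. It assumes "never more than a 30% change" means multipliers stay within 0.7 to 1.3, and it enforces monotonicity with a crude running maximum. Building constraints into the fitting process is a different (and generally better) technique; the function name and numbers are hypothetical.

```python
import numpy as np

def constrain_relativities(rels, max_change=0.30, monotonic=True):
    """Force multipliers to be non-decreasing and within 1 +/- max_change."""
    rels = np.asarray(rels, dtype=float)
    if monotonic:
        # Running maximum: each level is at least as high as the one before it.
        rels = np.maximum.accumulate(rels)
    # Cap the effect of this feature on the final prediction.
    return np.clip(rels, 1.0 - max_change, 1.0 + max_change)

# Hypothetical unconstrained multipliers for five ordered levels of a feature.
raw = [0.60, 0.95, 0.90, 1.10, 1.45]
constrained = constrain_relativities(raw)
print(constrained)
```

Here the dip from 0.95 to 0.90 is flattened, and the two extreme multipliers are pulled in to 0.70 and 1.30. Clipping after the running maximum keeps the result monotonic.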

Feature Selection

Data costs money, and using more of it complicates models while increasing regulatory burdens. My modelling approach allows me to use Lasso Penalization – combined with common sense – to make sure models only use the features they need.

Guidelines for how to make the tradeoff between performance and feature count (“use the 15 best features”, “use as few externally-sourced features as possible”, etc).

A model built using those guidelines.
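
The sketch below shows the general idea behind Lasso penalization on a plain linear model: a penalty on the size of each coefficient pushes unhelpful features to exactly zero, which removes them from the model. It uses synthetic data and a simple proximal gradient solver written in NumPy; the function name, penalty value and data are assumptions for illustration, not Durkon's implementation.

```python
import numpy as np

def lasso_ista(X, y, penalty, n_iter=2000):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 by proximal gradient descent."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the squared-error gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft-thresholding: shrink every weight, setting small ones to exactly zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * penalty, 0.0)
    return w

# Synthetic data: six candidate features, only the first two matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

weights = lasso_ista(X, y, penalty=0.2)
selected = np.flatnonzero(weights)  # indices of the features the model keeps
print(weights, selected)
```

Raising the penalty drops more features and lowering it keeps more, which is the dial behind guidelines like "use the 15 best features".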

GBT Benchmarking

In addition to being a specialist in interpretable modelling, I also know how to use standard ML libraries and paradigms. While benchmarking should ideally be handled in-house (otherwise, I’m marking my own work), if you want me to build a GBT or Neural Net to compare my Durkon model to, that’s not a problem on my end.

A frankly inexplicable level of faith in my integrity.

An XGB model which does the same thing my legible model does.
Comparisons between the Durkon and XGB models on key performance metrics.

Drift Correction

When context changes, models become systematically less accurate. Absent correction, high predictions will in general be too high, and low predictions will in general be too low.* If you expect a multi-year lag – or any other kind of change in context – between getting your data and deploying the model built on it, I can adjust for expected model decay.

Data from X years ago.
Data from 2*X years ago.
Ideally, data from 3*X and 4*X years ago.

A model which avoids systematic biases towards extreme predictions when predicting in today’s context.
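
The effect described above can be reproduced with a small simulation. In this sketch the "change in context" is modelled, as an assumption, by weakening the link between what the model learned and what actually happens. The highest predictions then overshoot and the lowest undershoot, and shrinking predictions towards the mean by the observed slope removes that bias. In practice the slope would have to be estimated from older data rather than from today's outcomes; all numbers here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
signal = rng.normal(size=n)  # what the model learned from old data
drift = rng.normal(size=n)   # hypothetical change in context the model never saw

prediction = 10 + signal
outcome = 10 + 0.6 * signal + 0.8 * drift  # the old signal now matters less

# Compare predictions with outcomes in the top and bottom deciles of prediction.
top = prediction > np.quantile(prediction, 0.9)
bottom = prediction < np.quantile(prediction, 0.1)
overshoot_top = prediction[top].mean() - outcome[top].mean()
overshoot_bottom = prediction[bottom].mean() - outcome[bottom].mean()

# Correction: shrink predictions towards their mean by the outcome-on-prediction slope.
slope = np.cov(prediction, outcome)[0, 1] / prediction.var(ddof=1)
corrected = prediction.mean() + slope * (prediction - prediction.mean())
corrected_overshoot_top = corrected[top].mean() - outcome[top].mean()

print(overshoot_top, overshoot_bottom, slope, corrected_overshoot_top)
```

The uncorrected top decile is predicted too high and the bottom decile too low, while the corrected predictions are close to unbiased in both.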

Error Modelling

Model accuracy can be represented numerically. Things which can be represented numerically can be modelled. Therefore, it’s possible to model how accurate your model will be, predict which predictions are most credible, and provide that information alongside the predictions themselves.

A dataset where the response variable is predictable enough from the explanatory variables that error modelling is feasible.

An error-predicting model which predicts how accurate the main model will be for each row.
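
A minimal sketch of the idea, using synthetic data and ordinary least squares in NumPy: fit the main model, measure how wrong it was on each row, then fit a second model that predicts the size of that error from the same explanatory variables. The data-generating process and variable names are assumptions for illustration; a real error model would usually be validated on held-out data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.uniform(0, 1, size=n)
noise_sd = 0.5 + 2.0 * x  # hypothetical: predictions get less reliable as x grows
y = 1.0 + 3.0 * x + rng.normal(size=n) * noise_sd

design = np.column_stack([np.ones(n), x])  # intercept plus one feature

# Main model: predicts the response.
main_coef, *_ = np.linalg.lstsq(design, y, rcond=None)
abs_error = np.abs(y - design @ main_coef)

# Error model: predicts how large the main model's error is expected to be.
error_coef, *_ = np.linalg.lstsq(design, abs_error, rcond=None)
expected_abs_error = design @ error_coef

print(main_coef, error_coef)
```

Each row now has two numbers attached: a prediction, and an estimate of how far off that prediction is likely to be.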

Tobit Modelling for Censored Data

Sometimes**, part or all of your response column will not be of the form “[thing you want to predict]=x”, and will instead be of the form “[thing you want to predict] is greater/less than x”. I’ve spent quite some time figuring out exactly how to best work around this kind of distortion, and I’d be very interested in sharing my expertise.

A dataset with censored response variables.
An indication of which records are censored and how.

A model which makes accurate and unbiased predictions despite that censorship.
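
To show why censoring matters, the sketch below fits a linear model to synthetic data where every response above a cap is recorded as the cap itself. A naive fit treats those capped values as real and underestimates the slope. A textbook expectation-maximisation fit for a right-censored normal response recovers it. This is a generic illustration under stated assumptions (normal errors, right-censoring only, known censoring flags), not the approach used in Durkon; all names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf, otypes=[float])

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2)))

def fit_right_censored(X, y_obs, censored, n_iter=200):
    """EM fit of y = X @ beta + Normal(0, sigma); censored rows only say y >= y_obs."""
    beta, *_ = np.linalg.lstsq(X, y_obs, rcond=None)  # naive starting point
    sigma = np.std(y_obs - X @ beta)
    for _ in range(n_iter):
        mean = X @ beta
        z = (y_obs[censored] - mean[censored]) / sigma
        mills = norm_pdf(z) / np.clip(1.0 - norm_cdf(z), 1e-12, None)
        # E-step: expected value and variance of each censored response.
        y_fill = y_obs.astype(float).copy()
        y_fill[censored] = mean[censored] + sigma * mills
        extra_var = np.zeros(len(y_obs))
        extra_var[censored] = sigma ** 2 * (1.0 + z * mills - mills ** 2)
        # M-step: least squares on the filled-in responses, then update sigma.
        beta, *_ = np.linalg.lstsq(X, y_fill, rcond=None)
        sigma = math.sqrt(np.mean((y_fill - X @ beta) ** 2 + extra_var))
    return beta, sigma

# Synthetic data: true slope 1.5, responses above the cap recorded as the cap.
rng = np.random.default_rng(3)
n = 5000
x = rng.uniform(0, 10, size=n)
y_true = 2.0 + 1.5 * x + rng.normal(scale=2.0, size=n)
cap = 12.0
censored = y_true > cap
y_obs = np.minimum(y_true, cap)
X = np.column_stack([np.ones(n), x])

naive_beta, *_ = np.linalg.lstsq(X, y_obs, rcond=None)
tobit_beta, tobit_sigma = fit_right_censored(X, y_obs, censored)
print(naive_beta, tobit_beta, tobit_sigma)
```

The same idea extends to "less than x" records by mirroring the censoring direction.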

Flexible Deployment

Durkon models are simple and intuitive enough that they can easily be converted into Emblem and other formats by hand, and low-dependency enough that they can be deployed in any backend with Python, NumPy and Pandas. However, if you want model output to automatically take a particular shape, I’m happy to oblige.

An example of the target model format.

A Durkon model expressed in that model format.
A tool for converting Durkon models to that model format.

Custom Options

Want something that’s not in this list? Tell me, and I’ll look into it. And if it’s something that seems like it’d be a good addition to Durkon, I’ll build it for free and release it as open-source.

A new challenge.

Creativity, ingenuity, and dedication.

*To see why this is true, consider the pathological case where changes in context are so extreme that your model is uncorrelated with reality: if you guess people’s heights randomly, the people you expect to be tall will be on average shorter than you expect, and the people you expect to be short will be on average taller than you expect.

**For example, when modelling market trends in sealed-bid first-price auctions, if you only have the winning bids and are already part of the market.

Other Work

Model Fisking

Somehow, not everyone who goes to the trouble of making an explicable model finds time to explicate it. If you already have a transparent model – i.e., anything built using Excel or Emblem – I can look through it for you, and produce a list of suboptimalities / potential regulatory issues / just-plain-weirdness for you to address.

D&D.Sci

I make Data Science challenges and release them for free online. This is more of a hobby than a service, but if you offered me large amounts of money to write challenges tailored to your requirements I wouldn’t say no.

Conceptual Proofreading

I like editing things, and I’m pretty good at it. If you want me to look through a document for mistakes – grammatical, stylistic, factual, logical, strategic, or moral – I’d charge very reasonable rates. (. . . by Data Science standards, at least.)