OPEN-SOURCE SCRIPT

Volume Compressor

Turns volume into a more informative representation, ready to be further analyzed
...

Rationale

Volume
Back in the "before the quant" days I was a big fan of market & volume profile. Thing is J. Steidlmayer had lotta different ideas & works aside of profiling, it's just most of them ain't got to mainstream, one of them was "Hot / Cold volume" (yes, you can't really google it). From my interpretation, the idea was that in a given asset there is a usual constant volume that stays there no matter what, and if it ever changes it changes very slow and gradually; and there's another kind of, so to say, 'active' volume that actually influences price dynamics and very volatile by its nature. So I've met concept lately, and decided to quantify & model it one day when I'll have an idea how. That day was yesterday.

Compression
When we do music we always use different kinds of filters (low-pass, high pass, etc) for equalization and filtering itself. That stuff we use in finance as well. What we also always use in music are compressors, there dynamic processors that automatically adjust volume so it will be more consistent. Almost all the cool music you hear is compressed (both individual instruments (especially vocals) and the whole track afterwards), otherwise stuff will be too quite and too weak to flex on it, and also DJing it would be a nightmare. I am a big adept of loudness war. So I was like, how can I use compression in finance, when ima get an idea? That day was yesterday as well.

Volume structure
Being inspired by Steidlmayer's idea, I decided to distinguish volume this way:
1) Passive / static volume. The ~ volume that's always there no matter what (hedges, arbitrages, spread legs, portfolio parts etc etc), doesn't affect things;
2) Active / dynamic volume. The volume that flows from one asset to another, really matters and affects things;
3) Excess volume. The last portion of number 2 volume, that doesn't represent any powerful value to affect things.

Now it's clear that we can get rid of number 1 and number 3, the components that don't really matter, and concentrate on number 2 in order to improve information gain, both for ourselves and for the models we feed this data. How?

Model
I don't wanna explain it all in statistical / DSP way for once.
First of all, I think the population of volumes is log-normally distributed, so let's take logs of volumes, now we have a ~ normally distributed data. We take linearly weighted mean, add and subtract linearly weighted standard deviation from it, these would be our thresholds, the borders between different kinds of volumes explained before.
The upper threshold is for downward compression, that will not let volume pass it higher.
The lower threshold is for upward compression, all the volumes lower than this threshold will be brought up to the threshold's level.
Then we apply multipliers to the thresholds in order to adjust em and find the sweet spots. We do it the same way as in sound engineering when we don't aim for overcompression, we adjust the thresholds until they start to touch the signal and all good.
Afterwards, we delete all the number 1 and number 3 volume, leaving us exclusively with the clear main component, ready to be processed further.
We return the volumes to dem real scale.

About the parameters, based on testing I don't recommend changing the thresholds from dem default values, first of all they make sense statistically and second they work as intended.
Window length can and should be adjusted, find your own way, or leave the default value. ML (moving location) length is up to you as well.

So yeah, you can see now we can smooth the data and make it visually appealing not only by applying a smooth filter over it.

All good TV?

إخلاء المسؤولية