I'm a blog

Remember blogs? I do. I'm a millennial. I'm much more comfortable here than on your favorite social network.

This photo was taken early in the morning while I was trying to catch good light as the sun rose. Ash and me. My dad met up with us later; I'm sure he was still asleep when this was taken.

We loved Cinque Terre even though it is more touristy than anywhere we've ever been. Ligurian pasta. Hiking. And too many limoncello drinks with another tourist we met at a bar.

I've focused on extracting as much value as I can from my upper-middle-class jobs. I can't help but think I am wasting my life doing this. In a thousand years I will be forgotten. Shit. In a hundred years I will be forgotten. If I'm lucky I'll be a reference point in a grandchild's story about something they vaguely remember doing at some point in their youth.

I want more time.

I want people in my life to know I love them. Saying it as a one-off doesn't seem to be good enough.

I want people to live like we won't see each other again after we die. I think religion spoils this for us; people act as if there is an infinite amount of time in which we exist, when nothing is guaranteed.

Am I depressed? I love a lot in my life. Can I be depressed that the people I love will die? That I will? And that there is nothing I can do about it? I guess in that sense I do feel depression. It's outside of my control and I cannot reason my way through it, because no Wikipedia article on the planet will reconcile the idea of death for me.

I love my wife. My mom. My dad. My grandparents. My dog. My future son. I want us to live forever.

On the other hand I don't want to leave life kicking and screaming, digging my nails into the ground and crying. This is how I would deal with it at this moment. Trying, in vain, to hold on to it all as the lights fade out. I can't be like this.

I want to accept it gracefully. To be at peace with death. But I can't stop rejecting or denying the notion of my consciousness being annihilated. My consciousness contains my thoughts, which attach me to the people I love: the only thing I really care about, even if I stare at video games or movies mindlessly to pass the time.

I'm afraid of nothingness. Nothingness is a void of feeling. A vacuum. Your thoughts are gone. A black hole. It's not even that you won't speak to people again. It's that you won't have the presence to know that you haven't spoken to these people again.

I don't want to leave earth. I don't want to never speak with loved ones again. I can't accept this. How can I?

I hope there is a way to find peace.

Posted in #machinelearning #datascience

Advances in AI have brought a lot of attention to a number of subdomains within this vast field. One of the most interesting is natural language processing.

What is a Hugging Face Transformer?

Why don’t we let their pretrained models answer this question?

Transformers provides general-purpose architectures for Natural Language Understanding (NLU) and Natural Language Generation (NLG) with over 32+ pretrained models in 100+ languages. The library currently contains PyTorch, Tensorflow and Flax implementations, pretrained model weights, usage scripts and conversion utilities for the following models.

Not bad, AI. Not bad at all. The above quote is what a pretrained model using a summarization pipeline produces when applied to the contents of the Hugging Face Transformers documentation.

These pipelines let pretty much anybody get started down the road of natural language processing without much insight into the back end of PyTorch or TensorFlow.

How to use Hugging Face Text Summarization

First, you have to install the transformers package for Python:

pip3 install transformers

Once you have it installed, it is a simple matter of importing the pipeline, specifying the type of task you want to run (in this case, summarization), and then passing it your content to summarize:

from transformers import pipeline

# Build a summarization pipeline backed by the default pretrained model
summarization = pipeline("summarization")

text = "Insert a wall of text here"

# The pipeline returns a list of dicts; pull out the summary string
summary_text = summarization(text)[0]['summary_text']
print(summary_text)
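
If the default summary comes out too long or too short, the pipeline also accepts generation parameters. A minimal sketch, reusing the objects above; the length bounds here are just illustrative values:

# Constrain the summary length; tune these bounds to your content
summary = summarization(text, max_length=100, min_length=20, do_sample=False)
print(summary[0]['summary_text'])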

For beginners and experts

The simplicity of these libraries means you can get started quickly. You can do a lot out of the gate, though you'll quickly notice the limitations of the vanilla models. Don't get me wrong, they are amazing, but if you want to do fine-tuning, expect to do some reading in the documentation.

I'd suggest identifying a community-contributed model that seems interesting and then reverse engineering it if you want to see how these pieces come together.

Ultimately, I believe Hugging Face democratizes NLP for developers in a sense. It makes it much easier to apply pretrained models to accomplish common tasks such as sentiment analysis, text summarization, and even question generation!

It also opens the door for NLP and AI practitioners to get involved by contributing to model building and improving the quality of the output, which enthusiasts like me can enjoy without poring over documentation and tuning parameters when that isn't my day job!

Give these transformers and pretrained models a try and let me know what you think! Have you found interesting uses for these on any projects?

A blog. Not the kind of blog with carefully curated lists to attract clicks. The kind where I put fingers to keys and just type what I feel like, when I feel like it. No pressure to build an audience. No pressure to promote. Just me... writing... whatever I want.

What, you may be asking, can you read about here if you commit some of your precious life minutes occupying my digital real estate?

  • Web Dev stuff
  • Cloud infra stuff
  • Data engineering stuff
  • Ruminations on life.

I must warn you, potential reader: I am not an expert in any of these. But I do commit life minutes to engaging with them and have decided to write about them.

Posted in #technerdery

With several weeks of Pop! OS as my daily driver, I decided to switch things up and try out KDE. I'm really digging what Reddit has started referring to as “K-Pop! OS”!

You get the customization, look, and feel of KDE with the 'just-works' nVidia setup that my MSI laptop has.

If you want to try it out and you currently have Pop! OS or another GNOME-based distro installed, you can take KDE for a spin with:

sudo apt install plasma-desktop

It is a few seconds slower to get the desktop situated on a fresh startup, but otherwise it's pretty great. Reporting back soon.

Posted in #datascience

A short while ago I published a rather technical post on the development of a Python-based attribution model that leverages a probabilistic graphical modeling concept known as a Markov chain. That post focused on the mechanics of the Markov chain attribution model I built in Python.

I realize what might serve as better content is the motivation behind doing such a thing, along with a clearer understanding of what is going on behind the scenes. So to that end, in this post I'll describe the basics of the Markov process and why we would want to use it in practice for attribution modeling.

What is a Markov Chain?

A Markov chain is a type of probabilistic model. This means that it is a system for representing different states that are connected to each other by probabilities.

The state, in the example of our attribution model, is the channel or tactic that a given user is exposed to (e.g. a nonbrand SEM ad or a Display ad). The question then becomes, given your current state, what is your next most likely state?

Well, one way to estimate this would be to get a list of all possible states branching from the state in question and create a conditional probability distribution representing the likelihood of moving from the initial state to each other possible state.
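
As a sketch, here is how you might derive those conditional probabilities from raw user paths; the paths and channel names below are hypothetical:

from collections import defaultdict

# Hypothetical user paths; each ends in Conversion or No Conversion
paths = [
    ["SEM", "SEO", "Conversion"],
    ["Display", "SEM", "Conversion"],
    ["SEM", "No Conversion"],
    ["Display", "No Conversion"],
]

# Count every observed transition, then normalize each state's counts
# so its outbound probabilities sum to one
counts = defaultdict(lambda: defaultdict(int))
for path in paths:
    for current, nxt in zip(path, path[1:]):
        counts[current][nxt] += 1

transition_probs = {
    state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
    for state, nexts in counts.items()
}

print(transition_probs["SEM"])
# {'SEO': 0.333..., 'Conversion': 0.333..., 'No Conversion': 0.333...}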

So in practice, this could look like the following:

Let our current state be SEM in a system containing the possible states of SEM, SEO, Display, Affiliate, Conversion, and No Conversion.

After we look at every user path in our dataset, we get conditional probabilities that resemble this:

P(SEM|SEM) = 0.10
P(SEO|SEM) = 0.20
P(Affiliate|SEM) = 0.05
P(Display|SEM) = 0.05
P(Conversion|SEM) = 0.50
P(No Conversion|SEM) = 0.10

This can be graphically represented.

Graphical representation of a Markov Chain

Notice how the probabilities extending from the SEM state sum to one. This is an important property of a Markov process, and one that will arise organically if you have engineered your dataset properly.

Connect all the nodes

Above we only identified the conditional probabilities for the scenario in which our current state was SEM. We now need to go through the same process for every other possible state to build a networked model that we can traverse indefinitely.

Connected nodes

Intuition

Up to this point I've written a lot about the process of defining and constructing a Markov chain, but it is helpful now to explain why I like these models better than standard heuristic-based attribution models.

Look again at the fully constructed network we have created, but pay special attention to the outbound Display vectors that I’ve highlighted in blue below.

Fully connected with highlighted path

According to the data, Display has a high likelihood of not converting, about 75%, and only a 5% chance of converting the user. However, that user has a 20% probability of proceeding to SEM as the next step. And SEM has a 50% chance of converting!

This means that when it comes time to do the “attribution” portion of this model, Display is very likely to increase its share of conversions.

Attributing the Conversions

Now that we have constructed the system that represents our user behavior, it's time to use it to re-allocate the total number of conversions that occurred over a period of time.

What I like to do is take the entire system's probability matrix and simulate thousands of runs through the system, each run ending when our simulated user arrives at either Conversion or No Conversion. This lets us generalize from a rather small sample, because we can simulate the random walk through the different stages of our system using our prior understanding of the probability of moving from one stage to the next. Since we pass a probability distribution into the mix, we allow for a bit more variation in our simulation outcomes.
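
Here is a rough sketch of that simulation loop. The matrix below is illustrative: the SEM and Display rows match the example above, but the other rows, including the Start state that picks the first channel, are made up:

import random

# Illustrative transition matrix; each row's probabilities sum to one
transition_probs = {
    "Start": {"SEM": 0.6, "Display": 0.4},
    "SEM": {"SEM": 0.1, "SEO": 0.2, "Affiliate": 0.05, "Display": 0.05,
            "Conversion": 0.5, "No Conversion": 0.1},
    "SEO": {"SEM": 0.3, "Conversion": 0.2, "No Conversion": 0.5},
    "Affiliate": {"Conversion": 0.3, "No Conversion": 0.7},
    "Display": {"SEM": 0.2, "Conversion": 0.05, "No Conversion": 0.75},
}

def simulate_cvr(probs, n_runs=10_000):
    """Random-walk the chain n_runs times and return the conversion rate."""
    conversions = 0
    for _ in range(n_runs):
        state = "Start"
        while state not in ("Conversion", "No Conversion"):
            next_states = list(probs[state])
            weights = list(probs[state].values())
            state = random.choices(next_states, weights=weights)[0]
        conversions += state == "Conversion"
    return conversions / n_runs

print(simulate_cvr(transition_probs))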

After getting the conversion rates of the system we can simulate what occurs when we remove channels from the system one by one to understand their overall contribution to the whole.

We do this by calculating the removal effect [1], which is defined as the probability of reaching a conversion when a given channel or tactic is removed from the system.

In other words, if we create one new model for each channel where that channel is set to 100% No Conversion, we will have a new model that highlights the effect that removing that channel entirely has on the overall system.

Mathematically speaking, we'd be taking the percent difference between the conversion rate of the overall system with a given channel set to NULL and the conversion rate of the intact system. We would do this for each channel. Then we would divide each channel's removal CVR by the sum of all removal CVRs to get a weighting for each channel, and finally multiply that weighting by the total number of conversions to arrive at the fractionally attributed number of conversions.
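
As a sketch, reusing the simulate_cvr helper and the illustrative matrix from above (the total conversion count here is hypothetical):

def removal_effect(probs, channel, base_cvr):
    """Percent drop in CVR when a channel is forced to 100% No Conversion."""
    modified = dict(probs)
    modified[channel] = {"No Conversion": 1.0}
    return (base_cvr - simulate_cvr(modified)) / base_cvr

base_cvr = simulate_cvr(transition_probs)
channels = ["SEM", "SEO", "Affiliate", "Display"]
effects = {c: removal_effect(transition_probs, c, base_cvr) for c in channels}

# Normalize removal effects into weights, then split the total conversions
total_effect = sum(effects.values())
total_conversions = 1000  # hypothetical conversions for the period
attributed = {c: total_conversions * e / total_effect for c, e in effects.items()}
print(attributed)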

If the above paragraph confuses you, head over to here and scroll about a third of the way down for a clear removal effect example. I went and made my example system too complicated for me to want to manually write out the removal effect CVRs.

That’s it

Well, by now you have a working attribution model that leverages a Markov process for allocating fractions of a conversion to multiple touchpoints! I have also built a proof-of-concept in Python that employs the above methodology to perform Markov model based attribution given a set of touchpoints. [2]

[1] Anderl, Eva; Becker, Ingo; Wangenheim, Florian V.; Schumann, Jan Hendrik. Mapping the Customer Journey: A Graph-Based Framework for Online Attribution Modeling (October 18, 2014). Available at SSRN: https://ssrn.com/abstract=2343077 or http://dx.doi.org/10.2139/ssrn.2343077

[2] https://github.com/jerednel/markov-chain-attribution