# Datascience

• This blog post was inspired by a conversation that I had recently with some people who were interested in hiring a data scientist. It involved a lot of communication, and defining a number of terms I'd never had to define in collaboration before.

• We have seen the importance of retention rates and how they affect expected CLV. The next question is: how do we estimate them? So far we’ve been assuming that $$r$$ is already known. Warning: this gets real math, real fast, but it has a nice, elegant result at the end. Also remember that we’re still operating under the assumptions that customers pay for a subscription, and that if they stop paying (i.e. leave) they’re never allowed back again. So a fantastic business model is what I’m saying. These are the basics, remember; we’ll build up to more general and more useful models. What kind of data do we need? […] Model Parameters […] Then equations (3) and (4) are used to derive the maximum likelihood estimate for the retention rate ($$r$$). $L(r) = \prod_{i=1}^{n_1}P(T=t_i)\prod_{i=1}^{n_0}S(c_i + 1) = \prod_{i=1}^{n_1}(1-r)r^{t_i - 1}\prod_{i=1}^{n_0}r^{c_i} \tag{10}$ This is maximized by waving your magic math wand and taking logs, differentiating with …
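
As a rough illustration of where equation (10) leads, here is a minimal Python sketch (the post's own functions are in R; Python is used here only for illustration). It assumes that $$t_i$$ is the period in which each of the $$n_1$$ cancelled customers left and that $$c_i$$ is the number of periods survived so far by each of the $$n_0$$ still-active customers. Taking logs of (10) gives $$\log L(r) = n_1 \log(1-r) + \left(\sum_i (t_i - 1) + \sum_i c_i\right)\log r$$, and setting the derivative to zero gives a ratio of periods survived to periods observed. The function names and sample data are hypothetical.

```python
import math


def log_likelihood(r, cancel_times, censor_times):
    """Log of equation (10) for 0 < r < 1.

    Each cancelled customer contributes (1 - r) * r**(t - 1);
    each still-active customer contributes r**c.
    """
    n1 = len(cancel_times)
    survived = sum(t - 1 for t in cancel_times) + sum(censor_times)
    return n1 * math.log(1 - r) + survived * math.log(r)


def retention_mle(cancel_times, censor_times):
    """Closed-form maximiser of the log-likelihood above.

    survived = total number of periods in which a customer was retained
    n1       = total number of observed cancellations
    """
    n1 = len(cancel_times)
    survived = sum(t - 1 for t in cancel_times) + sum(censor_times)
    if survived + n1 == 0:
        raise ValueError("need at least one observed customer-period")
    return survived / (survived + n1)


# Hypothetical data: four customers who cancelled, three still active.
cancel_times = [1, 3, 2, 5]
censor_times = [4, 6, 2]

r_hat = retention_mle(cancel_times, censor_times)
print(round(r_hat, 4))
```

The estimate is simply "renewals observed divided by renewal opportunities observed", which is easy to sanity-check by evaluating `log_likelihood` on a grid of values around `r_hat`.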

• Note: For this blog I’ve tried adding in R function definitions as I go. If you find them useful, let me know. If they’re ugly then also let me know and I’ll just provide links to a repo with the functions in it. […] Using the mighty power of Probability Theory we can now formulate a (relatively simple) probabilistic model for CLV. Let […] RECALL: The assumptions for an SRM are still being applied here. The retention rate ($$r$$) must be constant over time and across customers. Also, the event that a customer cancels in period $$t$$ is independent of the event that a customer cancels in any other period. Under these assumptions, $$T$$ has a geometric distribution, which has the following probability mass function (PMF): $f(t) = P(T = t) = r^{t-1}(1-r) \tag{3}$ The rest of the calculations in this section are based on the fact that, under the current assumptions, time until cancellation ($$T$$) follows a geometric distribution. “But what does this PMF mean for our model?” …
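
To make equation (3) concrete, here is a small Python sketch (not the post's R functions; all names are hypothetical). It evaluates the PMF, the probability that a customer is still active after $$t$$ periods, and the mean of the geometric distribution, $$1/(1-r)$$, which is a standard result for a geometric variable supported on $$t = 1, 2, \dots$$

```python
def geometric_pmf(t, r):
    """P(T = t) = r**(t - 1) * (1 - r): retained for t - 1 periods, then cancels in period t."""
    if t < 1:
        raise ValueError("t counts periods starting from 1")
    return r ** (t - 1) * (1 - r)


def prob_still_active_after(t, r):
    """P(T > t) = r**t: the customer was retained in each of the first t periods."""
    return r ** t


def expected_lifetime(r):
    """Mean of the geometric distribution above, in periods."""
    return 1 / (1 - r)


r = 0.8  # hypothetical retention rate
print(geometric_pmf(3, r))            # probability of cancelling in period 3
print(prob_still_active_after(3, r))  # probability of lasting beyond period 3
print(expected_lifetime(r))           # expected number of periods as a customer
```

Summing `geometric_pmf(t, r)` over a long range of `t` should come out very close to 1, which is a quick check that the PMF is coded correctly.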

• The simple retention model (SRM) is applicable in situations where a customer enters into a contract. The contract specifies that the customer must make regular payments at equal time intervals (most commonly monthly). The number of payments to be made is also set at the start of the contract (e.g. 12 monthly payments in a year-long contract). Each payment must be of the same amount. In this model, customers are not allowed to cancel the contract before it expires. Here’s a little example: A customer signs up to a 12-month mobile phone plan, paying \$60 per month for the phone and services. The customer has to pay this amount every month, regardless of whether they decide they hate the phone or whether they drop it on tiles and smash the screen (again). There is a lot of information a business may wish to know about the relationships these customers have with products of this type: […] We have to start very much with the basics though, and work towards answering these more …
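
For the phone-plan example, the total revenue from a fixed contract is just the number of payments times the payment amount. The Python sketch below does that arithmetic for the 12 payments of \$60. The optional `discount_rate` argument and the end-of-period payment timing are assumptions added here for illustration and do not appear in the excerpt; the function name is hypothetical.

```python
def contract_value(payment, n_payments, discount_rate=0.0):
    """Sum of n equal payments over a fixed, non-cancellable contract.

    Assumption (not from the post): payment k arrives at the end of
    period k and is discounted by (1 + discount_rate)**k. With the
    default rate of 0 this is simply payment * n_payments.
    """
    return sum(payment / (1 + discount_rate) ** k for k in range(1, n_payments + 1))


total = contract_value(60, 12)                   # undiscounted contract revenue
discounted = contract_value(60, 12, 0.01)        # hypothetical 1% per-month discount rate
print(total, round(discounted, 2))
```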

• In the past, companies were not particularly focused on considering their customers as human beings with emotions and the free will to make choices. Instead they were just viewed as a series of transactions: discrete-time cash flow events that either generated a profit or a loss for the company. So the focus of a business was just to decrease the cost to serve these customers and increase the profit margins of their products. The ultimate goal was to maximise the overall profitability of the company. This worked for a long time but was flip-turned upside down very rapidly. The internet was invented; people became more computer literate and discovered that it’s pretty easy to steal other people’s ideas and implementations. The strategies behind the product-centric view described above were easy to copy. This is a small aspect of the ideological shift that sparked the realisation in marketing departments that their suite of products is not necessarily their most important asset. …