A data exploration project on the Telco dataset from Kaggle to determine factors that cause customers to cancel the fictional "Telco" service.
The Telco Dataset Analysis was a part of the final project for a Data Analysis elective that I took while at university. The Telco dataset is a popular dataset from Kaggle that is used to practice data analysis and machine learning techniques. The objective of this project was to identify what could cause customer churn, or customers to cancel their fictional "Telco" service. On this project page, I've copied the insights I provided on the final PDF document and their graphs.
Before analyzing the data, the dataset first had to be cleaned and preprocessed. All column data needed to be converted to a numeric datatype and normalized to a value between 0 and 1.
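The preprocessing described above can be sketched roughly as follows. This is a minimal illustration and not the exact code from the project: the tiny sample is made up, the column names are assumed to mirror the Kaggle CSV, and filling blank Total Charges with 0 is just one possible choice.

```python
import pandas as pd

# Tiny made-up sample; column names are assumed to mirror the Kaggle CSV.
raw = pd.DataFrame({
    "tenure": [1, 34, 2, 45],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "TotalCharges": ["29.85", "1889.5", " ", "1840.75"],  # text, with one blank
    "Contract": ["Month-to-month", "One year", "Month-to-month", "Two year"],
    "Churn": ["No", "No", "Yes", "No"],
})

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Unparseable strings become NaN, then are filled with 0 (an assumption).
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce").fillna(0.0)
    # Yes/No target becomes 1/0.
    out["Churn"] = (out["Churn"] == "Yes").astype(int)
    # One 0/1 indicator column per category value.
    out = pd.get_dummies(out, columns=["Contract"], dtype=float)
    # Min-max scaling: (x - min) / (max - min) maps each column into [0, 1].
    span = (out.max() - out.min()).replace(0, 1)  # guard against constant columns
    return (out - out.min()) / span

clean = preprocess(raw)
print(clean.round(2))
```

After this step every column is numeric and lies between 0 and 1, which is what the correlation and modelling steps further down rely on.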
Once preprocessing is complete, the fun part of data analysis can begin! This is where you need to get creative and build different graphs to visualize the data and identify causes or traits that can result in customer churn. This matters because the conclusions from the data analysis can be used to drive marketing campaigns, business decisions, and product development. Underneath each of the graphs, I will post my commentary.
Heat Map - How do all the categories relate to one another?
The correlation map is pretty large, and also rather interesting. A lot of variables are examined, so of course many pairs have low correlation. It is very interesting to see the positive and negative correlations between variables. Looking specifically at churn for now, having a Monthly contract, Fiber Optic internet, a high monthly charge, and (surprisingly) paperless billing are the factors most positively correlated with churn. On the other side of the spectrum, long tenure, having a Two/One Year Contract, and being a long-paying customer (Total Charges) seem to make a customer less likely to churn, with all of those being inversely related to churn.
Some other interesting relations: an obvious one is between having a partner and having dependents, and another is higher total charges going along with a higher tenure. It is interesting how having a two-year contract and having a monthly contract have virtually opposite effects on a few things, most importantly on any additional internet packages the customer has, churn rate, tenure, and dependents. It is also worth noticing how much more expensive fiber internet is than DSL. Let's explore some more and get a more focused look at how everything relates to Churn.
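For readers who want to reproduce a correlation map like this one, here is a minimal sketch. The data is made up and the column names are illustrative assumptions; the real map covers many more columns.

```python
import pandas as pd

# Made-up, already-numeric sample; column names are illustrative assumptions.
df = pd.DataFrame({
    "tenure":         [1, 2, 5, 8, 24, 36, 60, 72],
    "MonthlyCharges": [95, 90, 85, 70, 60, 55, 45, 30],
    "Contract_Month": [1, 1, 1, 1, 0, 0, 0, 0],
    "Churn":          [1, 1, 1, 0, 0, 1, 0, 0],
})

# Pearson correlation between every pair of columns (values in [-1, 1]).
corr = df.corr()
print(corr.round(2))

# Plotting is optional and assumes seaborn and matplotlib are installed:
# import seaborn as sns
# sns.heatmap(corr, cmap="coolwarm", center=0, annot=True)
```

Each cell of the heat map is one entry of `corr`: values near 1 or -1 indicate a strong positive or negative relationship, and values near 0 indicate little linear relationship.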
Bar Chart - Relationship to Churn
Similar to what we saw on the heat map, this is a more focused view of what affects whether a user churns. It seems that the top 7 characteristics contributing to churn are: having a monthly contract, having fiber internet, having paperless billing, having a high monthly charge, not having a two-year or one-year contract (basically having a monthly contract), and having a low amount of total charges. Some of these are interesting and make sense. People with monthly contracts typically have an easier time cancelling, and don't need to wait months to cancel if they are not satisfied. By contrast, two-year and one-year contract holders are likely happy with their services (hence their confidence in committing to a long contract). Fiber is also very strongly related to churn; possibly Telco offers very poor fiber internet? Paperless billing seems a bit random, and its high correlation with churn may be due to chance.
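The ranking behind a chart like this can be reproduced by correlating every feature with the churn column and sorting the result. The sketch below uses made-up data and assumed column names.

```python
import pandas as pd

# Made-up sample; column names are illustrative assumptions.
df = pd.DataFrame({
    "Fiber":   [1, 1, 1, 1, 0, 0, 0, 0],
    "TwoYear": [0, 0, 0, 0, 0, 0, 1, 1],
    "tenure":  [1, 2, 5, 8, 24, 36, 60, 72],
    "Churn":   [1, 1, 1, 0, 0, 1, 0, 0],
})

# Correlation of each feature with Churn, strongest positive first.
churn_corr = (
    df.drop(columns="Churn")
      .corrwith(df["Churn"])
      .sort_values(ascending=False)
)
print(churn_corr.round(2))

# Optional plot, assuming matplotlib is installed:
# churn_corr.plot(kind="bar")
```

Features at the top of `churn_corr` move together with churn, and features at the bottom move against it.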
Pie Chart - What Services did Churn have?
This pie chart gives us a lot of insight into which services may be making customers unhappy and pushing them to cancel their service (as well as some services that customers may be happy with). People seem rather unhappy with the phone service, as well as with having multiple phone lines with Telco. In addition, people seem much more unhappy about fiber internet than DSL internet. However, people appeared to be relatively happy with the internet-related packages offered by Telco, such as Online Security, Online Backup, Device Protection, and Tech Support. The streaming services appear to have minimal impact on churn. For some reason, paperless billing is also a large factor in why customers churn, which seems quite odd to me. I thought this was a coincidence at first, since Paperless Billing is not something that can really have "quality". Maybe there is something else to this.
Aggregate Bar Plot - Monthly vs One Year vs Two Year
The monthly users have a significantly higher churn rate than One Year and Two Year contract holders. Why is this? A significant share of them use Fiber internet over DSL internet. However, people are more likely to buy internet-related packages if they have a longer contract. Nothing stands out as a revealing factor that may cause monthly users to cancel, besides them being more likely to use fiber internet. Overall, the most significant relations we can draw from this graph are that you are less likely to churn if you have some of the various internet packages Telco offers, and if you have DSL rather than Fiber internet. Contract length seems to be a very large player in terms of churn, as nearly 33% of Month-to-month contract holders churn, while less than 12% of one and two year contract holders churn.
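Per-contract figures like these come from a simple group-by. The sketch below uses a small made-up sample, so its percentages are not the ones quoted above, and the column names are assumptions.

```python
import pandas as pd

# Made-up sample of 14 customers; values do not come from the real dataset.
df = pd.DataFrame({
    "Contract": ["Month-to-month"] * 6 + ["One year"] * 4 + ["Two year"] * 4,
    "InternetService": ["Fiber", "Fiber", "Fiber", "Fiber", "DSL", "DSL",
                        "Fiber", "DSL", "DSL", "DSL",
                        "DSL", "DSL", "Fiber", "DSL"],
    "Churn": [1, 1, 0, 0, 0, 0,
              1, 0, 0, 0,
              0, 0, 0, 0],
})

# Churn rate per contract type: the mean of a 0/1 column is a proportion.
churn_rate = df.groupby("Contract")["Churn"].mean()
print(churn_rate.round(2))

# Share of each internet type within every contract type (rows sum to 1).
mix = pd.crosstab(df["Contract"], df["InternetService"], normalize="index")
print(mix.round(2))
```

The same `groupby(...).mean()` pattern works for any 0/1 service column, which is how the bars for each package can be compared across contract types.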
Boxplot - How much were churns spending, and how long?
This plot reveals a few interesting trends. It indicates that people who churn have a higher average monthly charge than non-churns, a lower total charge than non-churns, and a significantly lower tenure than non-churns. Now, what can we make of all of this? It appears that people who churn are most likely people with high monthly charges who have not been long-tenured customers (by tenure length and total charges). This could result from new customers being quickly dissatisfied with their service after having spent more money on average, and immediately trying to find a new provider that offers similar but 'better' services than Telco. Longer-tenured customers may also be unhappy about some of Telco's offerings, but could be reluctant to change. People tend to be loyal to service providers after being customers for a while, and may be unhappy with the quality of some services yet not feel like going through the effort of actually changing their service provider. This plot does a great job of describing what a churning customer is like.
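The comparison in the boxplot can also be read off as numbers by grouping on the churn flag. This is a minimal sketch with made-up values; the column names are assumed, and Total Charges is approximated as monthly charge times tenure for the sample only. Note that the centre line of a box plot is the median, so the sketch uses medians.

```python
import pandas as pd

# Made-up sample; 1 = churned, 0 = stayed.
df = pd.DataFrame({
    "Churn":          [1, 1, 1, 0, 0, 0],
    "MonthlyCharges": [90.0, 85.0, 80.0, 60.0, 55.0, 70.0],
    "tenure":         [2, 5, 8, 40, 60, 24],
})
# Rough approximation used only to build the sample.
df["TotalCharges"] = df["MonthlyCharges"] * df["tenure"]

# Median of each measure for churned vs. retained customers.
summary = df.groupby("Churn")[["MonthlyCharges", "tenure", "TotalCharges"]].median()
print(summary)

# Optional plot, assuming matplotlib is installed:
# df.boxplot(column="MonthlyCharges", by="Churn")
```

In this sample the churned group has the higher median monthly charge and the lower median tenure and total charges, which is the same pattern the plot shows.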
Initial Takeaways
Some important characteristics of churning customers are:
High Monthly Charge (due to Fiber internet over DSL?)
Short Tenured Customer
Has a Monthly Contract versus 1 or 2 Year Contract
Has Fiber Internet
Has Phone Service (and multiple lines)
Has paperless billing activated
Does not have many of the additional internet packages
Logistic Regression Bar Plot - Can customer churn be predicted?
Total Charges, Fiber Optic Internet, Monthly Contract, and having a streaming service are most strongly correlated with a customer churning. Meanwhile, Tenure, Monthly Charges, Having no internet service, and having a two year contract are all strongly associated with a customer staying with Telco.
These results partially align with the analysis from above. Starting with the contract lengths, Monthly Contract and Two-Year Contract still have the same effects, but a one-year contract has a more neutral effect in Section 3 than in Section 2. Total Charges has a completely different effect in Section 3 than in Section 2: in Section 3 it was the factor most likely to cause churn, while in Section 2 it was the third least likely. I feel like our model incorrectly predicted the impact of Total Charges, and the magnitude of the change could have affected the number of False Positives and False Negatives. The same goes for Monthly Charges, which has the opposite effect on churn in our machine learning model compared to our data analysis. Fiber Optic strongly contributed to churn in both sections as well. Having Movie or TV Streaming seemed to cause churn more in Section 3 than in Section 2. Other than that, everything appeared to be mostly the same or changed insignificantly.
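A coefficient plot like the one discussed above can be produced along these lines. This is a hedged sketch: it assumes scikit-learn (the source does not name the library used), the data is synthetic, and the feature names are illustrative. One general point worth keeping in mind when comparing it with the correlation analysis is that logistic regression estimates all coefficients jointly, so features that overlap heavily (such as tenure and Total Charges) can receive signs that differ from their individual correlations with churn.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Synthetic, already-scaled features; names are illustrative assumptions.
X = pd.DataFrame({
    "tenure": rng.uniform(0, 1, n),
    "Fiber": rng.integers(0, 2, n).astype(float),
    "Contract_Month": rng.integers(0, 2, n).astype(float),
})

# Synthetic target: fiber and monthly contracts raise the odds of churn,
# tenure lowers them.
logit = -0.5 - 3.0 * X["tenure"] + 1.5 * X["Fiber"] + 1.5 * X["Contract_Month"]
y = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# One coefficient per feature: positive pushes toward churn, negative away.
coefs = pd.Series(model.coef_[0], index=X.columns).sort_values()
print(coefs.round(2))

# Optional plot, assuming matplotlib is installed:
# coefs.plot(kind="barh")
```

Because the features here are on the same 0 to 1 scale, the coefficient sizes are roughly comparable to one another, which is what makes a bar plot of them readable.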
Graphs have been plotted, and insights have been seen! Now, it is time to wrap everything up and complete the data analysis. My conclusion is below:
Throughout this analysis on Telco, the tendencies of their customers, and what they purchased, there were a few factors that stood out consistently and significantly. To summarize once more, being a long-tenured customer, having a high monthly bill, having a two or one year contract, having neither of the internet provider services, and having some of the various internet-related packages such as Tech Support and Online Security are all factors that push a customer to stay with Telco. All of these factors were seen to be inversely related to churn in both our data analysis and in our machine learning model that predicts if a customer will churn. On the contrary, having fiber optic internet and having either movie streaming or television streaming contributed towards a customer churning.
Now, we can see that in our data analysis, Total Charges was inversely related to churn while Monthly Charges was directly related to churn. However, in our machine learning model they both flipped: Total Charges was now directly related to churn and Monthly Charges was inversely related to churn. Despite this change, I am going to stick with what the data analysis showed and say that Total Charges plays a significant role in preventing churn and that Monthly Charges plays a significant role in causing churn. This is because throughout our data analysis, specifically in the boxplot, we see evidence that upholds this. In addition, it makes sense that a long-tenured customer would have more total charges, and that tenure and total charges would have the same effect on churn rather than opposite effects. Based on our data analysis, I would recommend Telco do the following:
Invest money into improving fiber optic internet and DSL internet: Fiber Optic is one of the strongest factors contributing to churn, and it is clear as day that Telco customers are not happy at all with their fiber optic internet. Telco should invest money into improving fiber optic, especially as it rises in popularity among homeowners, or they risk more unhappy customers. DSL was not as bad as fiber optic, and in fact was always inversely related to churn. However, you can see how impactful having no internet was on a customer staying with Telco. I think it is safe to say that Telco can greatly improve the internet service it offers.
Give incentive for one and two year contracts: We see that customers who are on one and two year contracts are less likely to churn than customers on monthly contracts. Whether it be a temporary discounted rate, free packages for upgrading, or anything else, having more people on longer contracts will make customers more likely to stay. In addition, that increases their tenure and total charges. Looking at our bar graph comparing how various categories differ from contract to contract, customers with longer contracts also tend to spend more money on additional packages from Telco. Speaking of internet packages...
Encourage customers to purchase internet packages: Although their overall influence may be small, it is consistent throughout our study and should not be neglected. As with the contracts, try to get more customers to buy internet-related packages. It is clear that current customers with these packages enjoy them.
Improve quality of Movie and TV Streaming: Although Telco's streaming services were not drastically correlated with churn, the relationship is significant enough to be brought up. I'd recommend Telco survey their customers and see what they may be unhappy with about their streaming services. Is there a poor selection of things to watch? Maybe their app has a poor graphical interface and is hard to navigate? Or their streaming could buffer frequently while trying to watch. There are loads of things that could cause people to be unhappy with their streaming services, so finding out more information about this and trying to improve them could turn something that pushes customers away from Telco into something that draws them to Telco.
Senior Citizen Discount!: Being a senior citizen surprisingly relates to a customer churning. Maybe offering a senior citizen discount can help to somewhat resolve churn!
Family Plan: We see throughout our analysis that having a partner and/or dependents is likely to keep a customer with Telco. Maybe if a Family Plan is offered, then more people with families will enroll with Telco's services and hopefully become long-tenured customers.
Take Care of Long-Tenure Customers: Long tenure (assuming those customers also have a decent amount of total charges) is seen as the most important factor in keeping a customer at Telco. I would advise that Telco do things every once in a while that make long-tenured customers feel appreciated. It does not have to be anything big, but at least something that keeps them happy.
This concludes my data analysis on the Telco company. Thank you for reading through my report!