For this project, I’ll be working on understanding the customer service experience offered by a hypothetical airline. When support is required, a customer initiates contact with the airline, either over the phone or using an online platform. Once connected, an agent of the airline works with the customer to understand the problem. When the interaction is complete, the agent creates records about the case. This includes the overall category of the issue along with whether the problem was resolved.

The airline would like to better understand the differences between support provided over the phone and online. With different types of interactions, the agents require different kinds of training, and the costs and resources are different in each modality. With this in mind, the airline is quite interested in comparing the quality of each type of service with regard to how well the cases can be resolved. They are also generally interested in improving their customer service and better understanding the experience.

I will work with the information provided and consider how best to understand the customer service experience.


Data: CSV File airline customer service

For each case, the following characteristics were recorded:

  • category: Each case was classified according to the type of service that was requested by the customer.
  • service: This indicates whether the support was provided by phone or online.
  • waiting.time.minutes: This records how long the customer waited to begin interacting with an agent.
  • session.minutes: This records how long the customer interacted with the agent.
  • customer.demeanor: This records the agent’s perception of the customer based on the tone of the conversation.
  • resolved: This measures whether the agent was able to solve the customer’s main concern in the interaction.
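
As a rough sketch of how data with these columns could be loaded and inspected in R: the file name in the comment is hypothetical (the report only says "CSV File airline customer service"), and the four inline rows are made-up values that stand in for the real file so the example runs on its own.

```r
# Made-up sample rows using the column names listed above (not the airline's data).
sample_csv <- "category,service,waiting.time.minutes,session.minutes,customer.demeanor,resolved
Tickets,Phone,12.5,6.0,Nice,1
Technology,Online,3.0,4.5,Neutral,0
Service,Online,7.2,5.1,Nice,1
Tickets,Phone,20.1,8.3,Neutral,1"

customer <- read.csv(text = sample_csv, stringsAsFactors = TRUE)

# With the real file this might look like (hypothetical file name):
# customer <- read.csv("airline_customer_service.csv", stringsAsFactors = TRUE)

str(customer)             # column types
table(customer$service)   # number of cases per type of service
mean(customer$resolved)   # overall resolution rate, assuming resolved is coded 0/1
```

Checking the types first matters because the later tests assume `service` is a two-level grouping variable and `resolved` is a binary outcome.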

The primary research question is: which type of service, phone or online, is associated with a higher percentage of resolved cases? The airline wants to know which type of service resolves more of the cases it handles.

This is an observational study. We use historical data to study the relationship between the type of service and the outcome, the resolution rate of customer cases. No experimental data are available.

Yes, there are drawbacks to this observational study. Customers chose for themselves whether to call or go online, so there may be selection bias. There are also likely confounding factors that are not recorded, such as the customer's age and how serious the case actually is, which could affect both the choice of service and the chance of resolution. For these reasons, it is difficult to draw causal inferences about the effect of service type on quality from observational data.

If I were to devise my own experiment, I would randomly select customers who have similar kinds of issues. I would group them into blocks based on demographics (such as gender and age) and then, within each block, randomly assign customers to one of the two types of service, phone or online. With this comparative experiment, random assignment would balance many confounding factors across the two groups and give more solid evidence about the relationship between type of service and quality of service.
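
The blocked random assignment described here could be sketched as follows. The case pool, the use of issue category as the blocking variable, and the labels "Phone" and "Online" are all assumptions for illustration, not part of the airline's data.

```r
set.seed(1)  # fixed seed so the assignment is reproducible

# Hypothetical pool of incoming cases with one blocking variable.
cases <- data.frame(
  id       = 1:12,
  category = rep(c("Tickets", "Technology", "Service"), each = 4)
)

# Within each block, shuffle a balanced set of labels and assign them.
cases$service <- NA_character_
for (block in unique(cases$category)) {
  rows   <- which(cases$category == block)
  labels <- rep(c("Phone", "Online"), length.out = length(rows))
  cases$service[rows] <- sample(labels)
}

# Each block ends up with the same number of phone and online cases.
table(cases$category, cases$service)
```

Randomizing inside each block keeps the two service groups comparable on the blocking variable by construction, while chance balances the unmeasured factors on average.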

Perform a statistical test that would analyze the relationship between the independent and dependent variables. In this test, do not consider any other variables. What would you conclude from this test?

```r
prop.test(table(customer$service, customer$resolved))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  table(customer$service, customer$resolved)
## X-squared = 43.991, df = 1, p-value = 3.299e-11
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.01999422 0.03751558
## sample estimates:
##    prop 1    prop 2 
## 0.2297749 0.2010200
```

The test shows a statistically significant difference between the two types of service, because the p-value (3.299e-11) is far below the 0.05 significance level. The independent variable is the type of service provided, online or by phone, and the dependent variable is whether the case was resolved. We reject the null hypothesis that the two groups have the same rate, and conclude that the rates differ. The difference is modest in size: the 95 percent confidence interval puts it between about 2.0 and 3.8 percentage points. Because the data are observational, this is evidence of an association, not proof that the type of service causes the difference.
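
One detail worth checking when reading this kind of output: when `prop.test()` is given a two-column table, it treats the first column as the "successes". If `resolved` is coded 0/1, the reported `prop 1` and `prop 2` may therefore be shares of unresolved cases rather than resolved ones. The sketch below uses made-up counts, not the airline's data, to show the effect of column order.

```r
# Hypothetical counts for illustration only.
counts <- matrix(
  c(230, 770,    # Online: unresolved, resolved
    200, 800),   # Phone:  unresolved, resolved
  nrow = 2, byrow = TRUE,
  dimnames = list(service = c("Online", "Phone"),
                  resolved = c("0", "1"))
)

# First column ("0") is treated as successes, so these are unresolved rates.
unresolved <- prop.test(counts)
unresolved$estimate

# Putting the "1" column first gives resolution rates instead.
resolved <- prop.test(counts[, c("1", "0")])
resolved$estimate

# The p-value is the same either way; only the direction of the estimates flips.
c(unresolved$p.value, resolved$p.value)
```

The conclusion of the test does not depend on the column order, but the interpretation of which service has the higher resolution rate does.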

Apart from the independent and dependent variables, how should we think about the other variables in the data, and what would be the best way to consider them in the analysis?

The other variables are the case category, waiting time, session time, and customer demeanor. Each of them can influence the dependent variable. Waiting time is extremely important to customers and may affect the resolution rate, and session time and customer demeanor may affect it as well. If these variables also differ between phone and online cases, they confound the comparison and challenge the validity of the study. We can account for them with logistic regression.

What would be an appropriate way to incorporate these other measured variables into an analysis of the relationship between the independent and dependent variables?

We should fit a logistic regression model that includes the type of service together with the other measured variables. The model estimates the relationship between service and resolution while holding the other variables fixed, and statistics such as the p-value indicate whether each variable has a statistically significant association with the resolution rate.

Create a model that estimates the effect of the independent variable on the dependent variable while incorporating the other measured variables. Show the estimates and any measures of significance.

```r
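# Note: no `family` is given, so glm() uses its default (gaussian) and fits a
# linear probability model; a logistic regression would need family = binomial.
model1 <- glm(resolved ~ as.factor(category) + as.factor(service) +
                waiting.time.minutes + session.minutes + customer.demeanor,
              data = customer)
summary(model1)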
## Call:
## glm(formula = resolved ~ as.factor(category) + as.factor(service) + 
##     waiting.time.minutes + session.minutes + customer.demeanor, 
##     data = customer)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.9156   0.1345   0.1838   0.2254   0.3854  
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    0.6529357  0.0215941  30.237  < 2e-16 ***
## as.factor(category)Service    -0.0432294  0.0080561  -5.366 8.08e-08 ***
## as.factor(category)Technology -0.1116636  0.0066636 -16.757  < 2e-16 ***
## as.factor(category)Tickets    -0.0298330  0.0058737  -5.079 3.81e-07 ***
## as.factor(service)Phone        0.0115662  0.0050013   2.313   0.0207 *  
## waiting.time.minutes          -0.0042963  0.0003323 -12.927  < 2e-16 ***
## session.minutes                0.0167533  0.0021053   7.958 1.79e-15 ***
## customer.demeanorNeutral       0.0966451  0.0096280  10.038  < 2e-16 ***
## customer.demeanorNice          0.0752612  0.0079174   9.506  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.1621341)
## 
##     Null deviance: 8223.9  on 49999  degrees of freedom
## Residual deviance: 8105.2  on 49991  degrees of freedom
## AIC: 50938
## 
## Number of Fisher Scoring iterations: 2
```

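All of the coefficients have p-values below 0.05, including those for waiting time, session minutes, and customer demeanor, so each variable has a statistically significant association with whether a case is resolved. The output reports a gaussian family, so the coefficients read as differences in the probability of resolution. After adjusting for the other variables, phone service is associated with a resolution rate about 1.2 percentage points higher (estimate 0.0116, p = 0.0207), each additional minute of waiting with a rate about 0.4 percentage points lower, and each additional session minute with a rate about 1.7 percentage points higher.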
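
For comparison, here is a minimal sketch of a logistic regression fitted with `family = binomial`. It runs on simulated data, so every number it produces is illustrative only; the column names follow the report, while the simulated effect sizes are assumptions.

```r
set.seed(42)
n <- 2000

# Simulated stand-in for the airline data; all values are made up.
sim <- data.frame(
  service              = factor(sample(c("Online", "Phone"), n, replace = TRUE)),
  waiting.time.minutes = runif(n, 0, 30)
)

# Assumed true relationship on the log-odds scale.
log_odds <- 1.5 + 0.2 * (sim$service == "Phone") - 0.03 * sim$waiting.time.minutes
sim$resolved <- rbinom(n, size = 1, prob = plogis(log_odds))

# family = binomial requests a logistic regression (glm's default is gaussian).
fit <- glm(resolved ~ service + waiting.time.minutes,
           data = sim, family = binomial)

summary(fit)$coefficients   # estimates on the log-odds scale, with z tests
exp(coef(fit))              # the same estimates expressed as odds ratios
```

With a binomial family the fitted probabilities always stay between 0 and 1, and the coefficients are interpreted as changes in log-odds rather than in percentage points.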

Are there any other concerns with regard to this study and its design?

Customers should be grouped by demographics so that comparisons are made between similar customers. The data contain no demographic information, so many potential confounding factors remain unmeasured, and there may be selection bias because customers chose their own type of service.

Because of this selection bias, the variables in the model may not be the ones actually driving the result. Unmeasured factors could account for the observed differences, which would make the conclusion invalid.

The customer service organization is also interested in understanding the quality of its work and the overall satisfaction of the customer. At the conclusion of the call, there is an opportunity to conduct a survey. What is the best way to implement this idea?

The best way to implement this idea is a short survey with well-defined answer options, so that the data are easy to analyze. The survey should consist of straightforward questions with numerical or categorical answers (for example, rating satisfaction from one to ten, or choosing from a list what could be improved). We should also choose delivery methods that increase the response rate and keep the questions as simple as possible.

What are some possible topics that you might ask about? Select three potential areas and briefly discuss why these are important to gather information about.

  1. Overall satisfaction with the service: we need to know whether customers are satisfied and whether they prefer one type of service over the other.
  2. Waiting time: whether customers found the wait reasonable, since waiting time is strongly associated with resolution in the data.
  3. Effectiveness: whether the interaction ultimately solved their problem, since resolution is the main outcome the airline cares about.

For each of the three areas that you selected above, design a survey question. Keep in mind that the design should be appropriate for the setting. Provide the question, the possible answers, and the meaning of the answers.

Question 1: On a scale of 1 to 5, how satisfied are you with the customer service that you have received today? (1 being the lowest and 5 being the highest)

Question 2: How reasonable was the waiting time to reach us? 1-5 (1 being very unreasonable and 5 being very reasonable)

Question 3: How effective were we in resolving your issue today? 1-5 (1 being the lowest and 5 being the highest)

I believe three questions should be enough. Many customers have a short attention span once their issue is solved, and most respond out of courtesy, so going beyond three questions may be too much. It's important to keep the survey simple, straightforward, and easy to respond to.

The airline has a larger number of questions that it would like to ask. What would be your strategy for gathering all of this information? Explain your answer.

If the airline has more questions, a follow-up survey can be sent to the customer's email address directly after the call. Customers are more willing to complete a follow-up survey if there is an incentive, such as a reward upon completion. We can also put voluntary surveys and questionnaires on the homepage to collect some quick data from those who visit our site.

In this context, what are the advantages of a longer survey, and what are the benefits for a shorter survey? Explain your answers.

In this scenario, a longer survey collects more information per respondent, which gives the airline more areas it can potentially improve. The downside is that asking more questions does not always produce more data: fewer customers may complete a long survey. A shorter survey is quicker to finish, so more customers are likely to respond, and the larger sample is better for further analysis.
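
To give a sense of why the number of completed surveys matters, the sketch below computes the approximate 95 percent margin of error for a reported proportion at a few hypothetical response counts. It assumes a simple random sample and uses a proportion of 0.5, which is the worst case.

```r
# Hypothetical numbers of completed surveys.
responses <- c(100, 400, 1600)

# Normal-approximation margin of error for a proportion of 0.5.
margin <- qnorm(0.975) * sqrt(0.5 * 0.5 / responses)

data.frame(responses, margin = round(margin, 3))
```

Under these assumptions, quadrupling the number of responses only halves the margin of error, so a survey length that sharply cuts participation can cost a lot of precision.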

If the airline’s managers are adamant about asking all of the preferred questions, what are some alternatives to this automated survey?

If the airline's managers are adamant about asking all of the preferred questions, one alternative is to use incentives and rewards to increase the response rate. We can also send follow-up emails directly after the call or chat, letting customers know that completing the survey gives them a chance to win something. Our main goal is to increase feedback, and incentives are an effective way to do that.

Which customers would be more likely to participate in the automated survey after the customer service call, and which customers would be less likely?

Customers at the extreme ends of the scale (those who are extremely satisfied and those who are extremely dissatisfied or even angry) are more likely to participate in the survey after their call or chat. Customers whose cases were solved quickly may also respond, because they still have the patience to do so. Customers with a neutral experience are probably less likely to participate. Overall, strong emotions make customers more likely to respond.

How reliable would you consider the information that comes from the automated survey to be?

Assuming that the automated survey is simple, individual answers should be fairly accurate. However, because respondents are likely to come from the extreme ends of the scale, the results may be skewed. We would end up with more data from emotional customers and less from those who make up the majority of the customer population, which is a selection bias. The resulting data would be polarized, so the survey would not be very reliable as a measure of overall satisfaction.
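
The polarization described here can be illustrated with a small simulation. The population distribution of scores and the response probabilities below are assumptions chosen for illustration, not estimates from the airline's customers.

```r
set.seed(7)

# Hypothetical population: satisfaction scores 1-5, mostly in the middle.
population <- sample(1:5, 10000, replace = TRUE,
                     prob = c(0.05, 0.15, 0.40, 0.30, 0.10))

# Assumed response probabilities: extreme scores respond far more often.
p_respond <- c(0.40, 0.10, 0.05, 0.10, 0.40)[population]
responded <- rbinom(length(population), size = 1, prob = p_respond) == 1

# Share of each score in the full population vs. among survey respondents.
scores <- factor(population, levels = 1:5)
round(rbind(
  population  = prop.table(table(scores)),
  respondents = prop.table(table(scores[responded]))
), 2)
```

Under these assumptions the respondents contain a much larger share of 1s and 5s than the population does, which is why the raw survey average can misrepresent the typical customer.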

What else could you recommend to the managers of the airline’s customer service center to help them achieve the stated goal of understanding the quality of its work and the overall satisfaction of the customer? Provide a number of strategic recommendations that are actionable, measurable, and amenable to experimentation.

Assuming that it is not done already, it's vital to keep every customer's profile in a well-organized database so that it can be referenced immediately and cases can be solved more quickly. This also lets us track each customer's history, the number of times they have contacted us, and their loyalty to the airline, and set their priority accordingly. It's also important to monitor every call in order to ensure customer satisfaction and to identify the best training practices for customer support agents.