Comparing Methods Assignment

This assignment will be completed in teams of students.

Introduction

The purpose of this assignment is to demonstrate your knowledge and understanding of the analytical techniques and tools learned in the course and to show how they relate to a business scenario. This assignment is somewhat different from previous ones: I do not give you very detailed instructions on how to build your analytical process in RapidMiner. Instead, you are expected to do the modeling, validation and performance analysis on the given dataset so that you can answer the questions below and make recommendations for the business situation.

Submission Instruction

Perform the necessary tasks using RapidMiner, answer the questions below and prepare the required screenshots. Submit the following:

- a Word document file with the answers and screenshots to the lettered questions. (Make sure that the lettering of questions stays the same!) Place the team member names at the top of the document. Name your file Comparing Methods Assignment LastName1-LastName2... .docx. (Warning: for full points, make sure that you name documents correctly and keep the answers correctly numbered and lettered.)

- the RapidMiner project file, named Comparing Methods Assignment LastName1-LastName2... .rmp. (The project file can be generated from RapidMiner by going to File -> Export Process. Select the destination folder and the name for the file. It will be saved as a .rmp file.)

Instructions

Download the mobile-churn.csv file posted on Canvas. The file contains a dataset collected by a phone company about attrition, in other words, about customers who cancelled their services and possibly signed up with another company. The company is interested in what it could do to keep customers and prevent their defection. Look at the data and make some recommendations based on the findings of your analysis.

Here is the explanation of the variables in the dataset:

a. Gender_Female: female or not
b. PhoneService_Yes: whether the customer has phone service with the company
c. MultipleLines_Yes: whether the customer has multiple line service
d. InternetService_DSL: whether the customer has DSL internet
e. InternetService_Fiber optic: whether the customer has Fiber optic internet
f. StreamingTV_Yes: customer streams TV
g. StreamingMovies_Yes: customer streams movies
h. Contract_One year: type of contract for customer: 1 yr
i. Contract_Two year: type of contract for customer: 2 yr
j. PaperlessBilling_Yes: whether the customer signed up for paperless billing
k. PaymentMethod_ Automatic: payment set up to be automatic
l. Retired: 0 for not, 1 for yes
m. Tenure (months): how long the customer has been with the company
n. MonthlyCharges: $ amount of monthly payments for the subscribed services
o. Churn: Whether the customer churned (i.e. is not a customer any more)

1. As a first step, build 3 models using different classification techniques (Neural Net: use the default settings; Decision Tree: use gini_index as the criterion; and Logistic Regression: use the default settings) that are capable of classifying customers into 2 categories (churn/no churn). Use the X-validation operator right away for each technique used. Set the number of folds to 3 (it will result in shorter process runtimes). For measuring the performance of the 3 models, look at the following performance measures: Accuracy, Kappa, Lift, F-measure, AUC (NOT the optimistic or pessimistic). (Hint: use the binomial classification performance operator to obtain all of these measures.)
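To make the 3-fold setting concrete, here is a minimal Python sketch of how a dataset can be partitioned into folds, with each fold serving once as the test set. It is only an illustration: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled, and RapidMiner's own sampling inside the X-validation operator may well differ (for example by shuffling or stratifying).

```python
def k_fold_indices(n_rows, k=3):
    """Yield (train_idx, test_idx) pairs for k roughly equal, contiguous folds.

    Assumption: no shuffling or stratification is applied here.
    """
    # The first (n_rows % k) folds get one extra row so all rows are used.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, test_idx
        start += size


# Each row index lands in exactly one test fold across the 3 iterations.
for train_idx, test_idx in k_fold_indices(10, k=3):
    print(len(train_idx), len(test_idx))
```

The model is trained on the training indices and scored on the held-out indices in each iteration, and the performance measures are then combined over the 3 iterations.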

Make 3 readable screenshots of the following for all 3 models (9 screenshots; 9pts):

- Top level processes
- Parameter settings for the 3 different techniques that are inside the cross validation operator
- Appropriate model results (Network, Tree, Weights)
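As a cross-check on the numbers RapidMiner reports, four of the five measures can be recomputed by hand from a confusion matrix. The sketch below assumes "churn" is the positive class, uses the standard definition of Cohen's kappa, and takes lift to be precision divided by the share of positives in the data; that lift definition is an assumption and should be compared against RapidMiner's documentation. The function name and the counts in the usage line are hypothetical and are not results from the churn dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, Cohen's kappa, lift and F-measure from confusion-matrix counts.

    Assumes every denominator below is nonzero (both classes occur and
    at least one example is predicted positive).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # Chance agreement: predicted share times actual share, summed per class.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)

    # Assumed lift definition: precision relative to the positive-class prior.
    lift = precision / ((tp + fn) / n)

    return {"accuracy": accuracy, "kappa": kappa,
            "lift": lift, "f_measure": f_measure}


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=40, fp=10, fn=20, tn=30))
```

AUC is not included because it depends on the ranking of confidence scores and cannot be derived from a single confusion matrix.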

2.

a. Make a screenshot of the confusion matrix output for each of the 3 methods.

b. Prepare a table to report the 5 performance measures for the 3 models. Put the different models in the rows and have 5 columns for the 5 measures.

 

   | Accuracy | Kappa | Lift | F | AUC
NN |          |       |      |   |
DT |          |       |      |   |
LR |          |       |      |   |

c. Discuss the performance for each of the three models based on the performance measures. Relate the performances to the baseline model (calculate the a priori probabilities first!).
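The a priori probabilities are simply the class shares in the data, and a natural baseline is the model that always predicts the majority class. A minimal sketch, assuming the labels have been read from the Churn column into a list (the function names and the example labels are hypothetical):

```python
from collections import Counter


def class_priors(labels):
    """Return the a priori probability of each class label."""
    counts = Counter(labels)
    n = len(labels)
    return {label: count / n for label, count in counts.items()}


def baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    return max(class_priors(labels).values())


# Hypothetical labels, not taken from the churn dataset.
example = ["no"] * 7 + ["yes"] * 3
print(class_priors(example))
print(baseline_accuracy(example))
```

A model whose accuracy does not clearly exceed this baseline has learned little beyond the class distribution, which is one reason to look at Kappa and Lift alongside Accuracy.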

Prepare a visual evaluation of the 3 models by including a screenshot of the ROC comparison chart. (Hint: Use the Compare ROC operator. Have the same models with the same parameters as in the other runs above.)
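For intuition about what the ROC chart summarizes, AUC can be read as the probability that a randomly chosen churner receives a higher confidence score than a randomly chosen non-churner. The sketch below computes it directly from that definition, counting ties as one half; treating ties this way is an assumption made here in the spirit of the neutral (neither optimistic nor pessimistic) AUC. The function name, labels and scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (churn), 0 otherwise.
    scores: model confidence for the positive class.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores, for illustration only.
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

A value of 0.5 corresponds to the diagonal of the ROC chart (random ranking), and a curve that bows further toward the top-left corner corresponds to a higher AUC.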

d. Using the observed performance measures, compare the performance of the 3 models. Do they perform the same? Which one is better or worse, and why?

e. Are the 3 models giving you more or less the same suggestions regarding the important factors/variables? If there are differences, what are they?

3. Choose one of the models (possibly the best performing one) and address the following questions: How can you interpret the results of the model? Which attributes seem to matter the most? How do you know it? Discuss their importance and/or effect sizes.
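If the chosen model is the logistic regression, one common way to express effect sizes is to exponentiate each coefficient into an odds ratio: a value above 1 means the odds of churn rise as the attribute increases, and a value below 1 means they fall. The sketch below is a generic illustration; the function name, attribute names and coefficient values are placeholders, not output from RapidMiner or from the churn data, and the interpretation assumes the coefficients are reported on the attributes' original (unstandardized) scale.

```python
import math


def odds_ratios(coefficients):
    """Convert logistic regression coefficients to odds ratios via exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


# Placeholder coefficients, for illustration only.
placeholder = {"attr_a": 0.0, "attr_b": math.log(2.0), "attr_c": -0.5}
for name, ratio in odds_ratios(placeholder).items():
    print(name, round(ratio, 3))
```

For a 0/1 attribute the odds ratio compares the two groups directly, while for a numeric attribute such as Tenure (months) or MonthlyCharges it describes the change in odds per one-unit increase.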

4. How could the results of the model be useful for the telecommunications company? What business recommendations can be suggested based on the results?

Attachment:- Mobile-Churn.rar
